
"Balanced sampling decision trees"

ddr
New Altair Community Member
Updated by Jocelyn
Hi everyone,

I'm just starting to use RapidMiner and I have a problem with decision trees. I am working with a fairly large dataset (approximately 500,000 cases) and trying to use decision trees to predict whether or not users are willing to buy a product. The problem is that the buying rate is very low (0.5%). When I use stratified sampling with a ratio of 50% with the "sample" operator, as suggested in a similar thread on the forum, my tree is always biased towards the majority class, so the results are totally useless. Is there any way I can balance the outcome variable to a 50-50 rate, do the modeling, and then rebalance the samples to their original rates? I have searched the forum, but trying all the answers and looking through many operators in RapidMiner didn't give me any results.

Thanks a lot in advance!
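
To make the "balance to 50-50, model, then rebalance" idea concrete, here is a rough Python sketch of the two bookkeeping steps around the model. It is not a RapidMiner process and does not train a tree. The function names `undersample_majority` and `correct_probability`, the `buys` field, and the toy data are all made up for illustration. The second function assumes the usual prior-shift correction applies, i.e. that the model outputs a probability and that only the class rate differs between the balanced sample and the full data.

```python
import random


def undersample_majority(rows, label_of, positive, seed=0):
    """Keep every positive row plus an equal-sized random sample of the rest."""
    rng = random.Random(seed)
    pos = [r for r in rows if label_of(r) == positive]
    neg = [r for r in rows if label_of(r) != positive]
    # Draw as many negatives as there are positives (or all of them if fewer).
    neg_sample = rng.sample(neg, min(len(pos), len(neg)))
    balanced = pos + neg_sample
    rng.shuffle(balanced)
    return balanced


def correct_probability(p_balanced, original_rate, sample_rate=0.5):
    """Rescale a probability from the balanced model to the original class rate."""
    pos_weight = original_rate / sample_rate
    neg_weight = (1.0 - original_rate) / (1.0 - sample_rate)
    numerator = p_balanced * pos_weight
    return numerator / (numerator + (1.0 - p_balanced) * neg_weight)


# Hypothetical toy data: 20,000 rows with a 0.5% buying rate.
rows = [{"id": i, "buys": i % 200 == 0} for i in range(20000)]
balanced = undersample_majority(rows, lambda r: r["buys"], True)

# A score of 0.5 from the balanced model corresponds to the base rate
# (about 0.005) once the original class proportions are restored.
adjusted = correct_probability(0.5, original_rate=0.005)
```

The tree would be trained on `balanced`, and each predicted probability passed through `correct_probability` before being compared against the original 0.5% rate.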
