🎉Community Raffle - Win $25

An exclusive raffle opportunity for active members like you! Complete your profile, answer questions and get your first accepted badge to enter the raffle.
Join and Win

discretize by variance?

User: "omoratto"
New Altair Community Member
Updated by Jocelyn
Hi. I have a DB, each row represents a person. One of the columns is the income. I tried to apply a K-Means to group the data set. Originally, I normalized and applyied logs to the income column, but the either way, results are not logical, because it groups people very dissimilar in terms of income. Although income is not the only variable, it is an important one. Because income has a big coefficient of variation (1000%), I though I can construct bins with similar coefficient of variation, i.e., up to 30%. After discretizing, I should transform the bins to numerical values in order to be used by the k-means operator.

It can be done in rapid miner? Any ideas that can help me.

Find more posts tagged with