Discretize by Density

Question

In the Bayes software Genie there is a discretisation method where you give the number of bins and it places the bins around the densest areas of an attribute. For example, if the attribute contains two or three separable Gaussian distributions and you define three bins, the bins are found hierarchically, i.e. placed by density around each Gaussian.

It would be nice to have this also in RapidMiner.

It seems that entropy-based discretisation is comparable, but there the number of bins cannot be preselected.

TobiasMalbrecht · Answer

Dear Michael,

using a hierarchical clustering on a data set containing only the attribute to be discretized should yield the desired result. Simply flatten the cluster model afterwards, specifying the number of discrete values you would like to obtain. Please find attached a process that shows how it works (attribute: a4, number_of_classes: 3).

Best, Tobias
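For readers without the attached process, here is a rough sketch of the same idea outside RapidMiner, in plain Python. It is not the RapidMiner operator chain itself. It assumes single-linkage agglomerative clustering, which on a single attribute amounts to cutting the sorted values at the largest gaps; other linkage criteria can give different bins. The function name `discretize_by_hierarchical_clustering` and the sample values are made up for illustration.

```python
import bisect


def discretize_by_hierarchical_clustering(values, number_of_bins):
    """Return (labels, boundaries) for a single numeric attribute.

    Sketch only: assumes single-linkage agglomerative clustering. In one
    dimension, merging the closest neighbours until `number_of_bins`
    clusters remain is equivalent to cutting the sorted values at the
    (number_of_bins - 1) largest gaps.
    """
    distinct = sorted(set(values))
    if number_of_bins < 1 or number_of_bins > len(distinct):
        raise ValueError("number_of_bins must be between 1 and the number of distinct values")

    # Gap between each pair of neighbouring distinct values, with its position.
    gaps = [(distinct[i + 1] - distinct[i], i) for i in range(len(distinct) - 1)]

    # "Flattening" the hierarchy to k clusters: keep the k - 1 widest gaps as cuts.
    widest = sorted(gaps, reverse=True)[: number_of_bins - 1]
    cut_positions = sorted(i for _, i in widest)

    # Place each bin boundary midway across its gap.
    boundaries = [(distinct[i] + distinct[i + 1]) / 2 for i in cut_positions]

    # Label every value with the index of the bin it falls into.
    labels = [bisect.bisect_right(boundaries, v) for v in values]
    return labels, boundaries


# Hypothetical attribute with three well-separated groups of values.
a4 = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8, 9.7, 10.0, 10.2]
labels, boundaries = discretize_by_hierarchical_clustering(a4, number_of_bins=3)
print(labels)
print(boundaries)
```

With well-separated groups as in the sample, each group ends up in its own bin. With overlapping distributions, single linkage tends to split off outliers first, so a different linkage (or the clustering operator's own settings in RapidMiner) may be the better choice.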