Cluster Sampling in RapidMiner
StefanRei
New Altair Community Member
Hi,
I would like to use the cluster sampling method in RapidMiner (see, for example, the Towards Data Science article on sampling techniques).
Do you have any suggestions?
Thank you very much.
Bes
Comments
-
You'll have to incorporate this via a Python or R script, since there is no native RapidMiner operator that implements this particular algorithm.
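For reference, here is a minimal sketch of what such a script could look like in the Execute Python operator. It assumes the data arrives as a pandas DataFrame and already contains a column that identifies the natural groups. The column name `cluster_id`, the helper `cluster_sample`, and the parameter values are placeholders for illustration, not anything from the original question.

```python
import numpy as np
import pandas as pd


def cluster_sample(data, cluster_col, n_clusters, seed=None):
    """One-stage cluster sampling: pick whole groups at random, keep all their rows."""
    rng = np.random.default_rng(seed)
    cluster_ids = data[cluster_col].unique()
    if n_clusters > len(cluster_ids):
        raise ValueError("n_clusters exceeds the number of available clusters")
    # Draw group labels without replacement, then keep every row of the chosen groups.
    chosen = rng.choice(cluster_ids, size=n_clusters, replace=False)
    return data[data[cluster_col].isin(chosen)]


def rm_main(data):
    # Entry point called by RapidMiner's Execute Python operator.
    # "cluster_id" is a hypothetical column name; replace it with your grouping attribute.
    return cluster_sample(data, cluster_col="cluster_id", n_clusters=2, seed=42)
```

The sampling unit here is the whole group, so the number of returned rows depends on the sizes of the groups that happen to be drawn.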
-
Hello @StefanRei
I am not sure whether there is a dedicated operator in RM for this. If it is implemented in Python or R, you can use the scripting operators to embed it in your RM process.
One disadvantage, in my view, is that it takes all of the sampled data from only a few clusters, which can over-represent or under-represent parts of the data distribution. The consequence is high variation (low precision) in the results. The major advantage is processing time: it is fast because it does not go through all the samples in the dataset. If you would like more precise results, you can go with stratified sampling.
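To illustrate the precision point, here is a small simulation sketch. The data is synthetic and deliberately built so that groups differ strongly from each other while values within a group are similar; that is an assumption chosen to make the effect visible, and real data may show a smaller gap. Both estimators use 200 values per draw.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic population: 20 groups of 50 values; group k is centred on k.
n_groups, group_size = 20, 50
population = np.arange(n_groups)[:, None] + rng.normal(0.0, 0.5, size=(n_groups, group_size))


def cluster_estimate(n_chosen=4):
    # Cluster sampling: keep every value from a few randomly chosen groups.
    chosen = rng.choice(n_groups, size=n_chosen, replace=False)
    return population[chosen].mean()


def stratified_estimate(per_group=10):
    # Stratified sampling: draw the same number of values from every group.
    draws = [rng.choice(row, size=per_group, replace=False) for row in population]
    return np.mean(draws)


# Repeat each scheme many times and compare how much the estimated mean varies.
cluster_sd = np.std([cluster_estimate() for _ in range(500)])
stratified_sd = np.std([stratified_estimate() for _ in range(500)])
print(f"cluster sampling spread:    {cluster_sd:.3f}")
print(f"stratified sampling spread: {stratified_sd:.3f}")
```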
Based on the concept, one way to do what you need is to use a clustering algorithm to generate clusters, select a few of them, then run your process on that selection and observe how it goes. I haven't tried this myself; it is just an idea based on the concept.
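A rough sketch of that idea in Python, assuming scikit-learn is available and the data is a pandas DataFrame with numeric feature columns. The function name `sample_by_kmeans` and the defaults for `k` and `n_keep` are made up for illustration. Note that k-means groups similar rows together, so each selected cluster covers only one region of the data, which is exactly the over/under-representation risk mentioned above.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans


def sample_by_kmeans(data, feature_cols, k=5, n_keep=2, seed=0):
    """Cluster the rows with k-means, then keep all rows of a few randomly chosen clusters."""
    model = KMeans(n_clusters=k, n_init=10, random_state=seed)
    labels = model.fit_predict(data[feature_cols])
    # Pick n_keep of the k cluster labels without replacement.
    rng = np.random.default_rng(seed)
    keep = rng.choice(k, size=n_keep, replace=False)
    return data[np.isin(labels, keep)]
```

Inside RapidMiner this could be called from `rm_main(data)` in an Execute Python operator, with `feature_cols` set to your own attribute names.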
Hope this helps.