1. Conduct any data preparation that you need for your data set. This may
include handling inconsistent data, dealing with missing values, or changing
data types. Remember that in order to calculate means, each attribute in your
data set will need to be numeric. If, for example, one of your attributes
contains the values ‘yes’ and ‘no’, you may need to change these to be 1 and 0
respectively, in order for the k-Means operator to work.
2. Connect a k-Means operator to your data set, configure your parameters
(especially set your k to something meaningful for your question), and then
run your model.
3. Investigate your Centroid Table, Folder View, and the other evaluation
tools.
4. Report your findings for your clusters. Discuss what is interesting
about them and describe what iterations of modeling you went through, such as
experimentation with different parameter values, to generate the clusters.
Explain how your findings are relevant to your original question.
5. Experiment with the other k-Means operators in RapidMiner, such as
Kernel or Fast. How are they different from your original model? Did the use of
these operators change your clusters, and if so, how?
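
For readers who want to see what steps 1 through 3 amount to outside
RapidMiner, the following is a minimal Python sketch. It is not RapidMiner's
implementation. The toy data set, the column names (age, income, owns_home),
and the helper functions prepare and k_means are all hypothetical, invented
for illustration. The sketch drops rows with missing values, recodes 'yes' and
'no' as 1 and 0, runs a basic k-means with k = 2, and prints a small centroid
table.

```python
# Hypothetical toy data set; names and values are invented for illustration.
rows = [
    {"age": 25, "income": 30, "owns_home": "no"},
    {"age": 27, "income": 32, "owns_home": "no"},
    {"age": 52, "income": 80, "owns_home": "yes"},
    {"age": 55, "income": 85, "owns_home": "yes"},
    {"age": 29, "income": None, "owns_home": "no"},  # row with a missing value
]


def prepare(rows):
    """Drop rows with missing values and recode 'yes'/'no' as 1/0."""
    recode = {"yes": 1, "no": 0}
    prepared = []
    for row in rows:
        if any(value is None for value in row.values()):
            continue  # simplest possible handling: discard incomplete rows
        prepared.append([float(recode.get(v, v)) for v in row.values()])
    return prepared


def squared_distance(p, q):
    """Squared Euclidean distance between two numeric vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iter=100):
    """Basic k-means; starts from the first k points (an assumption made
    here so that the result is reproducible)."""
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the model has converged
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments


points = prepare(rows)
centroids, assignments = k_means(points, k=2)

# A small centroid table: one row per cluster, one column per attribute.
print("cluster", "age", "income", "owns_home", sep="\t")
for cluster_id, centroid in enumerate(centroids):
    print(cluster_id, *(round(value, 2) for value in centroid), sep="\t")
```

Changing k or the starting centroids and re-running is a rough analogue of the
parameter experimentation described in step 4. Starting from the first k rows
is a simplification made for this example. Note also that the attributes are
left unscaled here, so attributes with larger ranges weigh more heavily in the
distance calculation.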