How to validate k-means clustering?
It seems like a simple question. I have a dataset on which I am performing a k-means cluster analysis of consumers' bankruptcy tendency (k=2). I need to know the best way to validate my model's predictive accuracy. I have wasted about 5 hours trying and failing.
My text states that the easiest way is to generate a confusion (classification) matrix, but for the life of me I cannot figure out which setting, operator, or selection does this in RapidMiner (RM)!
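To make concrete what I am after, here is a minimal sketch in plain Python (not RapidMiner) of the kind of confusion matrix my text describes. It assumes every example has a known true label, and it maps each cluster to the most common label among its members, which is only one possible way to line clusters up with classes. The function names and the toy data are hypothetical, not taken from my process.

```python
from collections import Counter

def confusion_matrix(true_labels, cluster_ids):
    """Count how often each (true label, cluster id) pair occurs."""
    return Counter(zip(true_labels, cluster_ids))

def map_clusters_to_labels(true_labels, cluster_ids):
    """Assign each cluster the most common true label among its members."""
    members = {}
    for label, cluster in zip(true_labels, cluster_ids):
        members.setdefault(cluster, Counter())[label] += 1
    return {cluster: counts.most_common(1)[0][0]
            for cluster, counts in members.items()}

def accuracy(true_labels, cluster_ids):
    """Share of examples whose cluster maps to their true label."""
    mapping = map_clusters_to_labels(true_labels, cluster_ids)
    hits = sum(1 for label, cluster in zip(true_labels, cluster_ids)
               if mapping[cluster] == label)
    return hits / len(true_labels)

# Hypothetical toy data: six consumers, k=2 cluster assignments.
true_labels = ["bankrupt", "bankrupt", "solvent", "solvent", "solvent", "bankrupt"]
cluster_ids = [0, 0, 1, 1, 0, 0]

matrix = confusion_matrix(true_labels, cluster_ids)
for (label, cluster), count in sorted(matrix.items()):
    print(f"true={label:9s} cluster={cluster} count={count}")
print("accuracy:", accuracy(true_labels, cluster_ids))
```

With k=2 and two classes this gives a 2x2 table, which is the level of detail I am hoping to get out of the validation.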
All I get for my results is shown below. This is not enough for me to judge how well my model performs against my testing/validation set. I am using a Cross Validation operator with my cluster model in the training section, and the Apply Model and Cluster Distance Performance operators in the testing section. Why so little information?
Avg. within centroid distance
Avg. within centroid distance: -6.053 +/- 0.279 (mikro: -6.053)
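For reference, this is roughly what I understand that figure to measure, sketched in plain Python. It assumes plain Euclidean distance and an unweighted mean over all examples; RapidMiner's exact definition may differ (for example, it may use squared distances), and as far as I can tell the operator negates the value by default, which would explain the minus sign. The function names and the points are hypothetical.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(coords) / n for coords in zip(*points)]

def avg_within_centroid_distance(clusters):
    """Mean Euclidean distance from each point to its own cluster centroid.

    clusters: dict mapping cluster id -> list of points.
    """
    total = 0.0
    count = 0
    for points in clusters.values():
        center = centroid(points)
        for point in points:
            total += math.dist(point, center)
            count += 1
    return total / count

# Hypothetical toy clusters (k=2) in two dimensions.
clusters = {
    0: [(0.0, 0.0), (2.0, 0.0)],
    1: [(5.0, 5.0), (5.0, 7.0)],
}
print(avg_within_centroid_distance(clusters))
```

A number like this only says how tight the clusters are. It says nothing about whether the clusters line up with the actual bankrupt/non-bankrupt outcome, which is what I need to check.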
I have attached my dataset and the XML of my process.