So far, I've used leave-one-out cross-validation (due to the small number of examples in the learning set, which is about 400) to evaluate the accuracy (classification error), i.e. how many examples in the test set were incorrectly predicted. However, I don't think that this is sufficient for a reliable performance evaluation. What else should I measure?
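For concreteness, the evaluation amounts to something like the following sketch (written in Python with scikit-learn rather than my actual RapidMiner process; the synthetic data and the GaussianNB classifier are placeholders standing in for my real learning set and model):

```python
# Minimal sketch of a leave-one-out accuracy evaluation.
# Data and classifier below are placeholders, not the real setup.
import numpy as np
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score, confusion_matrix

# Hypothetical small data set of about 400 labelled examples.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = rng.integers(0, 2, size=400)

clf = GaussianNB()  # placeholder classifier

# Leave-one-out: each example is predicted by a model trained on the
# remaining 399 examples.
y_pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())

accuracy = accuracy_score(y, y_pred)
print("LOO accuracy:            ", accuracy)
print("LOO classification error:", 1.0 - accuracy)
print("Confusion matrix:\n", confusion_matrix(y, y_pred))
```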
Can a significance test also be used to make a performance claim about a single classifier? If so, what hypothesis should be tested? And how can this be achieved in RapidMiner, whose T-Test expects two PerformanceVectors?
One idea is to calculate the expected value of your measure when using a random classifier, i.e. a classifier that assigns random classes to all instances. You can then perform a simple one-sided test, given an appropriate distributional assumption, to see whether your classifier is significantly better than random.
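Sketched outside of RapidMiner for concreteness, such a test could look like the snippet below (Python with scipy; the counts are placeholders rather than your actual results, and the chance level assumes the random classifier picks one of K classes uniformly at random):

```python
# One-sided binomial test: is the observed number of correct LOO
# predictions significantly higher than expected under random guessing?
from scipy.stats import binomtest

n_examples = 400          # size of the learning set (as stated in the question)
n_correct = 290           # hypothetical number of correct LOO predictions
n_classes = 2             # assumed number of classes
p0 = 1.0 / n_classes      # expected accuracy of a uniform random classifier

# H0: true accuracy <= p0 (no better than random)
# H1: true accuracy  > p0 (significantly better than random)
result = binomtest(n_correct, n_examples, p0, alternative="greater")
print("observed accuracy:", n_correct / n_examples)
print("one-sided p-value:", result.pvalue)
```

Note that the binomial test treats the per-example outcomes as independent, which leave-one-out predictions only satisfy approximately, so the resulting p-value should be read as a rough sanity check rather than an exact figure.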
Does it make sense to compare a "real" model against a random classifier? Is this a common approach used in practice? Are the accuracy, precision and recall measurements not sufficient?
And why is it not possible to model your suggested approach in RapidMiner?