Hi,

I have two datasets with 4 classes each. Both are parameterised (tabular) versions of grain structure images; the parameters are grain size, etc. The second time the images were parameterised, the resulting dataset scored about 10% better than the first one. I would like to understand why, so I want to compare the two datasets. However, in the visualisations they appear to have an identical range, standard deviation, etc. I am using RapidMiner as a tool. I looked at the deviation chart for each dataset and they looked almost the same.

My question: is there a way to compare two datasets and draw reliable conclusions about why one is better than the other? What is the best way to compare them, and how would you proceed?
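
To make concrete the kind of comparison I have in mind: since the overall range and standard deviation look the same, perhaps the difference lies in how well each parameter separates the 4 classes. Below is a minimal sketch in plain Python (outside RapidMiner) that computes, per parameter, the ratio of between-class to within-class variance for each dataset. The function names, the `grain_size` and `label` column names, and the toy values are all made up for illustration; this is one possible check, not necessarily the right one.

```python
from statistics import mean, pvariance


def separability(values, labels):
    """Between-class variance divided by within-class variance for one parameter."""
    overall = mean(values)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n = len(values)
    # Weighted spread of the class means around the overall mean.
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups.values()) / n
    # Weighted average of the variance inside each class.
    within = sum(len(g) * pvariance(g) for g in groups.values()) / n
    return between / within if within > 0 else float("inf")


def compare_datasets(rows_a, rows_b, features, label="label"):
    """Return {feature: (score_in_a, score_in_b)} for two lists of dict rows."""
    result = {}
    for f in features:
        pair = []
        for rows in (rows_a, rows_b):
            values = [r[f] for r in rows]
            labels = [r[label] for r in rows]
            pair.append(separability(values, labels))
        result[f] = tuple(pair)
    return result


# Toy data (hypothetical): same values, hence same range and standard
# deviation, but the classes are interleaved in A and cleanly split in B.
dataset_a = [
    {"grain_size": 1.0, "label": "a"},
    {"grain_size": 2.0, "label": "b"},
    {"grain_size": 3.0, "label": "a"},
    {"grain_size": 4.0, "label": "b"},
]
dataset_b = [
    {"grain_size": 1.0, "label": "a"},
    {"grain_size": 2.0, "label": "a"},
    {"grain_size": 3.0, "label": "b"},
    {"grain_size": 4.0, "label": "b"},
]

scores = compare_datasets(dataset_a, dataset_b, ["grain_size"])
for feature, (score_a, score_b) in scores.items():
    print(f"{feature}: A={score_a:.2f}  B={score_b:.2f}")
```

In this toy case both datasets have the same overall statistics, yet the parameter separates the classes much better in B than in A. Would a per-class comparison like this be a sensible way to explain the 10% difference, or is there a better approach?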