hi,
I was trying out k-NN on my dataset (4500 example rows, 25 numeric attributes). The odd thing is that although one is supposed to normalize the attributes, performance drops sharply whenever I apply any of the normalization methods I have tried.
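To make it concrete, here is a rough scikit-learn sketch of the comparison I mean (not my actual process; the stand-in data, the neighbor count and the scaler choices are only assumptions):

    # Sketch only: placeholder data and parameters, not my real setup.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import MinMaxScaler, StandardScaler

    # Stand-in for my data: 4500 examples, 25 numeric attributes.
    X, y = make_classification(n_samples=4500, n_features=25,
                               n_informative=6, random_state=0)

    candidates = {
        "no normalization": KNeighborsClassifier(n_neighbors=5),
        "z-score":  make_pipeline(StandardScaler(),
                                  KNeighborsClassifier(n_neighbors=5)),
        "min-max":  make_pipeline(MinMaxScaler(),
                                  KNeighborsClassifier(n_neighbors=5)),
    }
    for name, model in candidates.items():
        # 10-fold cross-validated accuracy for each variant
        print(name, cross_val_score(model, X, y, cv=10).mean())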
Second, when I use feature selection and keep only the 6 best attributes, I get higher performance; so far, so good.
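Sketched the same way, the feature-selection step would look roughly like this (the univariate filter here is just an assumption about how the 6 attributes get picked):

    # Sketch only: X, y and the selection method are placeholders.
    from sklearn.datasets import make_classification
    from sklearn.feature_selection import SelectKBest, f_classif
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    X, y = make_classification(n_samples=4500, n_features=25,
                               n_informative=6, random_state=0)

    # Keep only the 6 highest-scoring attributes before k-NN.
    knn_top6 = make_pipeline(StandardScaler(),
                             SelectKBest(f_classif, k=6),
                             KNeighborsClassifier(n_neighbors=5))
    print("top-6 attributes:", cross_val_score(knn_top6, X, y, cv=10).mean())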
But what puzzles me now is that when I put a sampling (Bootstrapping) operator before k-NN, I get much higher performance, about 90% as opposed to 80% before... Is this because the bootstrapping puts more weight on some instances and leaves out others, so that the classification accuracy ends up much higher?
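This is roughly the order of operations I am asking about, again only as a sketch (sample size, seed and neighbor count are placeholders; my real process does this with an operator, not code):

    # Sketch only: bootstrap-resample the whole example set (with replacement)
    # BEFORE the learner is trained and evaluated.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.utils import resample

    X, y = make_classification(n_samples=4500, n_features=25,
                               n_informative=6, random_state=0)

    # Sampling with replacement: some rows appear several times, others not at all.
    X_boot, y_boot = resample(X, y, replace=True, n_samples=len(y), random_state=42)

    knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
    print("without bootstrapping:", cross_val_score(knn, X, y, cv=10).mean())
    print("with bootstrapping:   ", cross_val_score(knn, X_boot, y_boot, cv=10).mean())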