"handling ordinal data and attribute weighting"
yogafire
New Altair Community Member
Hello,
My first question is about handling ordinal data.
I have a dataset, described like this:
ID --> ID of each patient --> nominal
Age --> age of each patient --> integer
Test_i (i = 1, ..., 4) --> score/grade of each test --> ordinal [values range from 1 to 6; a higher grade is more severe]
Class --> {cancer, not cancer} --> class label of this dataset
My question is how to handle the Test_i attributes, which contain ordinal data. Can I simply treat those attributes as integers, should I treat them as weights, or is there another way?
My second question is about attribute weighting.
On that dataset I used several attribute weighting techniques, but I found that one of the 5 attributes gets a weight of 0 after normalizing the weights, or a value much smaller than the others (e.g. the weight of this attribute is 0.00xx, while the others are about >= 0.2xxx). Can I simply ignore that attribute?
Thank you very much for your reply.
best regards,
Dimas Yogatama
Answers
Hi,
"My question is how to handle the Test_i attributes, which contain ordinal data. Can I simply treat those attributes as integers, should I treat them as weights, or is there another way?"
The answer depends a bit on the learning scheme you intend to (or have to) use. For example, let's say you want to model your problem with a linear regression scheme. In that case I would suggest reading the data as nominal (or transforming it to nominal), transforming the columns to binominal (Test_1 = 1 with values "true" and "false", Test_1 = 2 with values "true" and "false", ...), and then to numerical (leading to 0 and 1). Now the learning scheme can handle the fact that the values are ordered by assigning a specific weight to each attribute, and hence to each of the ordered values. The same argument would of course also apply if the data were not ordered but simply nominal.
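To make the nominal -> binominal -> numerical idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not the output of any particular tool: the helper names `dummy_encode` and `encode_row` and the example row are made up, and only the grade range 1 to 6 comes from the dataset description.

```python
# Hypothetical illustration of the nominal -> binominal -> numerical encoding.
GRADES = [1, 2, 3, 4, 5, 6]  # grade range taken from the dataset description


def dummy_encode(name, grade, levels=GRADES):
    """Return one 0/1 indicator column per possible grade of one attribute."""
    if grade not in levels:
        raise ValueError(f"{name}: unexpected grade {grade!r}")
    return {f"{name} = {level}": int(grade == level) for level in levels}


def encode_row(row, test_names):
    """Replace each ordinal test column in a row by its indicator columns."""
    # Keep all other attributes (e.g. Age) unchanged.
    encoded = {k: v for k, v in row.items() if k not in test_names}
    for name in test_names:
        encoded.update(dummy_encode(name, row[name]))
    return encoded


# Made-up example row; the values are not from the original dataset.
row = {"Age": 54, "Test_1": 3, "Test_2": 6}
encoded = encode_row(row, ["Test_1", "Test_2"])
print(encoded)
```

Each grade of each test becomes its own 0/1 column, so a linear model can learn a separate weight per grade instead of assuming that the step from grade 1 to 2 matters as much as the step from 5 to 6.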
"Can I simply ignore that attribute?"
If the performance does not drop on an independent test set: yes.
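One way to run that check, sketched in plain Python under heavy assumptions: the tiny train and test sets are invented, the attribute name `weak_attr` stands in for the low-weight attribute, and a 1-nearest-neighbour rule is used only as a stand-in learner. With real data you would use your actual learning scheme and a properly sized hold-out set.

```python
def predict_1nn(train, x, features):
    """Predict the label of x from its nearest training example."""
    def squared_distance(example):
        return sum((example[0][f] - x[f]) ** 2 for f in features)
    return min(train, key=squared_distance)[1]


def holdout_accuracy(train, test, features):
    """Fraction of held-out examples that are classified correctly."""
    hits = sum(predict_1nn(train, x, features) == y for x, y in test)
    return hits / len(test)


# Invented toy data; "weak_attr" is a hypothetical low-weight attribute.
train = [
    ({"Test_1": 1, "Test_2": 2, "weak_attr": 5}, "not cancer"),
    ({"Test_1": 2, "Test_2": 1, "weak_attr": 1}, "not cancer"),
    ({"Test_1": 5, "Test_2": 6, "weak_attr": 4}, "cancer"),
    ({"Test_1": 6, "Test_2": 5, "weak_attr": 2}, "cancer"),
]
test = [
    ({"Test_1": 2, "Test_2": 2, "weak_attr": 3}, "not cancer"),
    ({"Test_1": 5, "Test_2": 5, "weak_attr": 3}, "cancer"),
]

all_features = ["Test_1", "Test_2", "weak_attr"]
reduced_features = ["Test_1", "Test_2"]

acc_all = holdout_accuracy(train, test, all_features)
acc_reduced = holdout_accuracy(train, test, reduced_features)

# Drop the attribute only if the held-out accuracy does not get worse.
can_drop = acc_reduced >= acc_all
print(acc_all, acc_reduced, can_drop)
```

The important part is that both accuracies are measured on examples that were not used for training or for computing the attribute weights.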
Cheers,
Ingo