Decision tree data exploration with a numerical value
b00122599
New Altair Community Member
Hey folks,
I am fairly new to data science but wish to use a decision tree to explore a dataset. The dataset has no label, so I am assigning one: an attribute with a numerical value from 1 to 20. Would it be possible to have my label target only the high scorers on that attribute, so that the class label covers only those objects scored 15-20 on the attribute I select as the label? If this makes sense, would anyone have any ideas on how to do so in RapidMiner?
Any help is much appreciated.
Neil.
Best Answer
Thanks very much for the pointers, guys. Much appreciated.
Answers
Hi @b00122599
Let me check that I understand what you want. You are adding a label column whose values range from 1 to 20 (1, 2, 3, ..., 20), but you want to predict only the labels between 15 and 20, which you treat as high scores. If you apply a decision tree for classification, it will train on all the labels unless you remove the unwanted ones from the data. You can train the model only on labels 15 to 20 by filtering examples, so that the model never trains on the examples labeled 1 to 14.
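For readers who think in code rather than operators, the filtering idea can be sketched in plain Python. This is only an illustration under assumptions: the sample rows, the `score` field, and the `keep_high_scorers` helper are made up here and come from neither the thread nor RapidMiner. In RapidMiner itself the same effect would come from filtering examples on the label attribute.

```python
# Hypothetical rows; "score" stands in for the 1-20 attribute chosen as the label.
rows = [
    {"id": 1, "score": 3},
    {"id": 2, "score": 15},
    {"id": 3, "score": 20},
    {"id": 4, "score": 14},
    {"id": 5, "score": 17},
]


def keep_high_scorers(examples, low=15, high=20):
    """Return only the examples whose score lies in the closed range [low, high]."""
    return [ex for ex in examples if low <= ex["score"] <= high]


# Only these rows would be passed on to the decision tree learner.
high_scorers = keep_high_scorers(rows)
print([ex["id"] for ex in high_scorers])
```

Keep in mind that the filtered data still carries up to six distinct label values (15 to 20), so a tree trained on it can only separate high scorers from one another, not from the low scorers.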
Or perhaps an even better solution would be to discretize your numerical label and turn it into a nominal attribute, where values of 15-20 get the class "high" and all other values get the class "low". This can be done with several operators in RapidMiner, including Discretize by User Specification or Generate Attributes.
Then you simply use that nominal attribute as your label, and you have a typical classification problem, which your Decision Tree learner should handle easily.
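The discretization step can be sketched in plain Python as well. Again, this is an assumption-laden illustration and not a RapidMiner process: the `to_class` helper, the sample scores, and the default threshold of 15 (taken from the 15-20 range in the question) are hypothetical names and values chosen for the example.

```python
from collections import Counter


def to_class(score, threshold=15):
    """Map a numerical score to a nominal class: 'high' at or above the threshold, else 'low'."""
    return "high" if score >= threshold else "low"


# Hypothetical scores on the 1-20 scale.
scores = [3, 15, 20, 14, 17, 8]

# The new nominal label, one value per example.
labels = [to_class(s) for s in scores]

# Checking the class balance is worthwhile before training a tree on the new label.
counts = Counter(labels)
print(labels)
print(counts)
```

The `labels` list plays the role of the new nominal label attribute. Every example stays in the data, so the tree can learn what separates the high scorers from everyone else.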