Why would an attribute in a data set not be included in a generated decision tree?
truetaurus
New Altair Community Member
Say I have a data set of customers with information such as bank account, age, telephone, credit history, employment, etc.
When I use RapidMiner, why are some attributes, such as telephone or age, not in the generated decision tree? What could be the various reasons for this?
Answers
-
Because they weren't considered important attributes, or the other attributes were enough to build the tree.
You can change the decision tree settings to grow a tree with more branches, which may then include other attributes.
-
Well, I'm happy some attributes were left out, but what is the deeper meaning behind it? I know the attribute is not statistically important, but why? If I had to investigate and explain why it is not in my model, what could I say?
-
Hi,
the decision tree uses a so-called criterion to choose the next attribute for a split. By default, this criterion is gain_ratio. You will find information about it if you search for, e.g., Information Gain. There should be some good articles out there that explain this measure in detail.
If the gain (with respect to the chosen criterion) that results from splitting by an attribute is less than the corresponding threshold parameter of the decision tree (minimal gain), then the tree algorithm will simply not include the attribute.
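To make this concrete, here is a minimal sketch in plain Python of how a gain ratio could be computed and compared against a threshold. It is not RapidMiner's actual implementation; the toy customer rows, the attribute names, and the `MINIMAL_GAIN` value are all made up for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Information gain of splitting on `attribute`, divided by the split information."""
    labels = [row[target] for row in rows]
    # Group the target labels by the value of the candidate attribute.
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    # Weighted entropy of the target that remains after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: credit_history separates the classes, telephone does not.
rows = [
    {"credit_history": "good", "telephone": "yes", "approved": "yes"},
    {"credit_history": "good", "telephone": "no", "approved": "yes"},
    {"credit_history": "good", "telephone": "yes", "approved": "yes"},
    {"credit_history": "good", "telephone": "no", "approved": "yes"},
    {"credit_history": "bad", "telephone": "yes", "approved": "no"},
    {"credit_history": "bad", "telephone": "no", "approved": "no"},
    {"credit_history": "bad", "telephone": "yes", "approved": "no"},
    {"credit_history": "bad", "telephone": "no", "approved": "no"},
]

MINIMAL_GAIN = 0.1  # illustrative threshold, not necessarily RapidMiner's default

scores = {a: gain_ratio(rows, a, "approved") for a in ("credit_history", "telephone")}
selected = [a for a, score in scores.items() if score >= MINIMAL_GAIN]

for attribute, score in scores.items():
    print(f"{attribute}: gain ratio = {score:.3f}")
print("attributes that pass the threshold:", selected)
```

In this toy data, telephone is spread evenly across both classes, so knowing it tells you nothing about the label. Its gain stays below the threshold and it would never be chosen for a split, while credit_history passes.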
Best, Marius
-
Btw, I doubt that you'll get better answers to RapidMiner-related questions on StackOverflow than here.
-
Haha, I know, I'm just trying to find a decent answer.
So what would you say is the reason for an attribute not being included in a decision tree, though?
-
Did you see my first post?