Why would an attribute in a data set not be included in a generated decision tree?

New Altair Community Member

Nov 8, 2012

Updated Nov 5, 2024 by Jocelyn

Say I have a data set of customers with information such as bank account, age, telephone, credit history, employment, etc.

When I use RapidMiner, why are some attributes, such as telephone or age, missing from the generated decision tree? What could be the various reasons for this?


blueearth

New Altair Community Member

Nov 8, 2012

Because they weren't considered important attributes, or the other attributes were enough to build the tree.
You can change the decision tree settings to grow a tree with more branches, which may then include other attributes.

truetaurus

New Altair Community Member

Nov 8, 2012

Well, I'm happy some attributes were left out, but what is the deeper meaning behind it? I know the attribute is not statistically important, but why? If I had to investigate and explain why it is not in my model, what could I say?

MariusHelf

New Altair Community Member

Nov 9, 2012

Hi,

The decision tree uses a so-called criterion to choose the next attribute for a split. By default, this criterion is gain_ratio. You will find information about it if you search for, e.g., Information Gain. There should be some good articles out there that explain this measure in detail.
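To make the criterion concrete, here is a minimal sketch of how information gain and gain ratio can be computed for a nominal attribute. It follows the textbook definitions, not RapidMiner's actual code, and the toy data and the function names `entropy` and `gain_ratio` are made up for illustration.

```python
from collections import Counter
from math import log2

def entropy(items):
    """Shannon entropy (in bits) of a list of nominal values."""
    total = len(items)
    return -sum(n / total * log2(n / total) for n in Counter(items).values())

def gain_ratio(values, labels):
    """Information gain of splitting on `values`, divided by the split information."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the label inside each branch of the split
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: credit history separates the classes, telephone barely does
credit_history = ["good", "good", "bad", "bad", "good", "bad"]
telephone = ["yes", "no", "yes", "no", "yes", "no"]
label = ["repaid", "repaid", "default", "default", "repaid", "default"]

print("credit_history:", gain_ratio(credit_history, label))
print("telephone:", gain_ratio(telephone, label))
```

In this toy set, credit history scores far higher than telephone, so a tree choosing by this criterion would split on credit history first.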

If the gain (with respect to the chosen criterion) that results from splitting by an attribute is less than the corresponding threshold parameter of the decision tree operator (minimal gain), then the tree algorithm will simply not split on that attribute, so it never appears in the tree.
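The effect of such a threshold can be sketched with a small greedy tree builder that only records which attributes get used. This is an illustration under stated assumptions, not RapidMiner's implementation: it uses plain information gain instead of gain ratio for brevity, and the data, the `min_gain` argument, and the function names are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum(n / total * log2(n / total) for n in Counter(labels).values())

def info_gain(rows, labels, attr):
    groups = {}
    for row, y in zip(rows, labels):
        groups.setdefault(row[attr], []).append(y)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def attributes_used(rows, labels, attrs, min_gain):
    """Return the set of attributes a greedy tree would split on."""
    if len(set(labels)) <= 1 or not attrs:
        return set()  # node is pure or nothing is left to split on
    best = max(attrs, key=lambda a: info_gain(rows, labels, a))
    if info_gain(rows, labels, best) < min_gain:
        return set()  # no attribute clears the threshold: stop growing here
    used = {best}
    rest = [a for a in attrs if a != best]
    for value in set(row[best] for row in rows):
        pairs = [(r, y) for r, y in zip(rows, labels) if r[best] == value]
        used |= attributes_used([r for r, _ in pairs], [y for _, y in pairs], rest, min_gain)
    return used

# Hypothetical customers: the label depends on credit history and employment only
raw = [
    ("good", "employed", "yes", "repaid"), ("good", "unemployed", "no", "repaid"),
    ("good", "employed", "no", "repaid"), ("good", "unemployed", "yes", "repaid"),
    ("bad", "employed", "yes", "repaid"), ("bad", "employed", "no", "repaid"),
    ("bad", "unemployed", "yes", "default"), ("bad", "unemployed", "no", "default"),
]
attrs = ["credit_history", "employment", "telephone"]
rows = [dict(zip(attrs, r[:3])) for r in raw]
labels = [r[3] for r in raw]

print(attributes_used(rows, labels, attrs, min_gain=0.1))
print(attributes_used(rows, labels, attrs, min_gain=0.5))
```

With the lower threshold, the tree uses credit history and employment and never needs telephone, because the nodes are already pure and telephone adds no gain. With the higher threshold, even the best root split falls short, so no attribute is used at all.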

Best, Marius

MariusHelf

New Altair Community Member

Nov 9, 2012

Btw, I doubt that you'll get better answers to RapidMiner-related questions on StackOverflow than here.

truetaurus

New Altair Community Member

Nov 9, 2012

Haha, I know, I'm just trying to find a decent answer.

So what would you say is the reason for an attribute not being included in a decision tree?

MariusHelf

New Altair Community Member

Nov 9, 2012

Did you see my first post?
