"[SOLVED] can't get decision tree to work"

mefisto66
mefisto66 New Altair Community Member
edited November 5 in Community Q&A
Hello,

I'm new to RapidMiner and I'm trying to build a decision tree that ultimately shows which variables lead people to churn. I got the data from here:

http://www.dataminingconsultant.com/data/churn.txt

I first saved it as a CSV file, imported it into RapidMiner, and set everything to nominal: phone# as the ID, churn as the label, and everything else as a regular attribute.

From there I connected the data to an X-Validation operator, and when I run it, the tree shows only one box (a leaf? a node?). Sorry if this is a stupid question, but this is the first process I've built in RapidMiner.


Any help is greatly appreciated.

Answers

  • earmijo
    earmijo New Altair Community Member
    What version of RapidMiner are you using?

    I use that dataset to teach data mining and was having the same problem you are having. I was going to post a comment about the dataset, but decided to give it one more try with version 5.2. It turns out that the problem has been fixed.

  • mefisto66
    mefisto66 New Altair Community Member
    I just downloaded it five days ago, so I should have the latest version. I can't check right now because I'm away from my computer.

    If I may ask, what attribute roles and types did you assign to the data when you imported it? I'm thinking that might be it.
  • earmijo
    earmijo New Altair Community Member
    I used:

    label: churn
    id: phone number

    attributes: all others

    I also set the attribute type of area code to polynominal.

    One comment: I typically use the Gini index as the criterion (I learned my trees from Breiman). If you use the default criterion, you get a tree with a single node unless you lower the minimal gain to, say, 0.01.

    Hope this helps,

    \E.
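To make the minimal gain point above concrete, here is a rough sketch in plain Python of how a gain threshold can block a split. It is not RapidMiner's actual implementation: the helper names (`gini`, `gini_gain`, `should_split`), the toy labels, and the 0.1 threshold are all made up for illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def gini_gain(parent, children):
    """Impurity reduction obtained by splitting `parent` into `children`."""
    n = len(parent)
    weighted = sum(len(child) / n * gini(child) for child in children)
    return gini(parent) - weighted


def should_split(parent, children, minimal_gain):
    """Pre-pruning rule (assumed): split only if the gain reaches the threshold."""
    return gini_gain(parent, children) >= minimal_gain


# Toy data (invented): 10 customers, 3 of whom churn.
parent = ["yes"] * 3 + ["no"] * 7
left = ["yes", "yes", "no", "no"]   # hypothetical branch A
right = ["yes"] + ["no"] * 5        # hypothetical branch B

gain = gini_gain(parent, [left, right])
blocked = should_split(parent, [left, right], minimal_gain=0.1)
accepted = should_split(parent, [left, right], minimal_gain=0.01)

print(f"gain = {gain:.4f}")
print("split with minimal_gain=0.1: ", blocked)
print("split with minimal_gain=0.01:", accepted)
```

With a threshold above the best available gain, the root never splits and the tree stays a single node; lowering the threshold lets the same split through.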
  • mefisto66
    mefisto66 New Altair Community Member
    No luck. I tried your suggestions, and it seems it all comes down to minimal leaf size. If it is set to 1, I get a huge tree; if it is set to any number > 1, I get a single node stating the number of churners and non-churners in the data set. I tried playing with the other options and nothing changed.

    For some reason it seems it is not clustering the data points; instead it analyzes each one and includes each data point in the decision tree, if that makes any sense. Even removing the attributes I know don't carry any weight doesn't make it work. I'll be posting a picture of the results soon.
  • mefisto66
    mefisto66 New Altair Community Member
    FIGURED IT OUT!!!

    I played around with the attribute types: I set the ID and label to nominal and everything else to real, and it gave me a tree. Hooray.

    Thanks for the help.
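One plausible explanation for why the type change helped, sketched in plain Python. This is an assumption, not something confirmed in the thread: if a numeric column is treated as nominal, a multiway split creates one branch per distinct value, so almost every branch holds a single row and any minimal leaf size above 1 rejects the split. Treated as a real number, the same column can be split at a threshold into two large branches. The column name `day_mins`, the rows, and the helper functions are all invented for illustration.

```python
from collections import defaultdict


def nominal_split(rows, attr):
    """Multiway split: one branch per distinct value of `attr`."""
    branches = defaultdict(list)
    for row in rows:
        branches[row[attr]].append(row)
    return dict(branches)


def numeric_split(rows, attr, threshold):
    """Binary split: values <= threshold go left, the rest go right."""
    low = [r for r in rows if float(r[attr]) <= threshold]
    high = [r for r in rows if float(r[attr]) > threshold]
    return {"<=": low, ">": high}


def allowed(branches, minimal_leaf_size):
    """A split is allowed only if every branch keeps enough rows."""
    return all(len(b) >= minimal_leaf_size for b in branches.values())


# Invented rows: every customer has a different number of day minutes.
rows = [
    {"day_mins": "120.5", "churn": "no"},
    {"day_mins": "135.2", "churn": "no"},
    {"day_mins": "150.9", "churn": "no"},
    {"day_mins": "180.3", "churn": "no"},
    {"day_mins": "240.7", "churn": "yes"},
    {"day_mins": "265.1", "churn": "yes"},
    {"day_mins": "280.4", "churn": "no"},
    {"day_mins": "310.8", "churn": "yes"},
]

as_nominal = nominal_split(rows, "day_mins")
as_real = numeric_split(rows, "day_mins", threshold=200.0)

print("nominal branches:", len(as_nominal))
print("nominal split allowed (leaf size 2):", allowed(as_nominal, 2))
print("nominal split allowed (leaf size 1):", allowed(as_nominal, 1))
print("real split allowed (leaf size 2):   ", allowed(as_real, 2))
```

This matches the symptoms described earlier in the thread (a huge tree with leaf size 1, a single node otherwise), but treat it as a sketch of the idea rather than a description of what RapidMiner does internally.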