"Increase of minimum leaf size in Decision Tree"

b00122599 New Altair Community Member
edited November 5 in Community Q&A
Hey folks,

I have increased the minimum leaf size in my decision tree. This has resulted in a smaller, more readable tree, but also a small decrease in accuracy. I'm being asked what this says about my dataset. I'm presuming I was overfitting the data, but I'm not sure. Would anyone have any idea?

Thanks in advance,

Neil. 

Best Answer

  • IngoRM
    IngoRM New Altair Community Member
    Answer ✓
    Hi,
    Not necessarily. Increasing the minimum leaf size is just a different way of pruning the tree. The goal is to find a good balance: the model should generalize beyond your training data without missing the underlying patterns.
    I am assuming that you are referring to a properly validated test accuracy measured on independent data (e.g. by using cross-validation). If that is the case, then the reduction in accuracy is not a sign that you were overfitting before you made the change. If you had been overfitting, the stronger pruning would typically have left the validated accuracy unchanged or improved it. A drop suggests instead that the tree now starts to miss some of the valid patterns in your data.
    Please also note that the change in accuracy may not be statistically significant at all. There are also other criteria for a good model (like understandability), so you may even want to go with a less accurate but more understandable model.
    Hope those thoughts help a bit,
    Ingo
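
One rough way to check whether a small accuracy change is more than noise is to compare the per-fold accuracies from the same cross-validation splits before and after changing the leaf size. The sketch below is only an illustration in plain Python: the function name `compare_fold_accuracies` is hypothetical, and the fold accuracies are made-up placeholders, not results from this thread. Paste in the values your own validation reports.

```python
from math import sqrt
from statistics import mean, stdev


def compare_fold_accuracies(before, after):
    """Paired comparison of per-fold accuracies.

    Both lists must come from the SAME cross-validation folds, so that
    fold i in `before` and fold i in `after` used identical test data.
    Returns (mean change in accuracy, standard error of that change).
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equally long lists with at least 2 folds")
    # Per-fold change: positive means the new setting did better on that fold.
    diffs = [a - b for a, b in zip(after, before)]
    mean_diff = mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    std_err = stdev(diffs) / sqrt(len(diffs))
    return mean_diff, std_err


# Placeholder numbers for illustration only (assumed, not measured).
before = [0.84, 0.81, 0.86, 0.83, 0.85]  # smaller minimum leaf size
after = [0.83, 0.82, 0.84, 0.83, 0.84]   # larger minimum leaf size

mean_diff, std_err = compare_fold_accuracies(before, after)
print(f"mean change: {mean_diff:+.3f} +/- {std_err:.3f} (standard error)")
```

If the mean change is small compared with its standard error, the drop is probably within the noise of the validation. Cross-validation folds share training data, so the standard error is only a rough guide and not a formal significance test.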

Answers

  • b00122599
    b00122599 New Altair Community Member
    Thank you very much for your reply, you're very kind.