Interpretation of Gradient Boosted Tree Model results

felix_woess New Altair Community Member
edited November 2024 in Community Q&A

Dear RapidMiner Community,

 

I have a question about how the Gradient Boosted Trees (GBT) model works.

 

I am using GBT to make a prediction, and I have tuned the model to the following settings, which give me the best results:

Number of trees: 220

Max. depth: 10

Number of bins: 20

Learning rate: 0.1

 

The output I now see in the model is 205 trees (each of which has 7 "sub-trees").

 

My question is: how do I get from this large number of trees to a final prediction? Is the last tree (Tree 205) the closest to my desired result, so that the prediction is based on this "final" tree? Or is the prediction based on an average of all 205 trees?

 

I have read Martin's article about GBT (https://community.rapidminer.com/t5/RapidMiner-Studio-Knowledge-Base/A-Practical-Guide-to-Gradient-Boosted-Trees-Part-I-Regression/ta-p/36379), but I still don't understand how to interpret the model results delivered by RapidMiner.

 

Any help in understanding how this works is greatly appreciated! :)

 

Best regards

Felix

 


Answers

  • MartinLiebig
    Altair Employee

    Hi @felix_woess,

     

    It's the sum of the individual trees. I like to think of it as a Taylor expansion: every new tree is another term in the series and approximates the function a bit better (until it overfits...).

     

    BR,

    Martin 
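
To make the "sum of the individual trees" idea concrete, here is a minimal sketch of gradient boosting for regression with squared-error loss. It is a toy written from the textbook description, not RapidMiner's implementation: the data, the depth-1 trees ("stumps"), and all function names are assumptions made up for illustration. The point it shows is that the final prediction is a starting value plus the learning-rate-scaled outputs of all trees, not the output of the last tree alone.

```python
# Toy gradient boosting for regression with squared-error loss.
# Illustrative only: names and data are hypothetical, not RapidMiner's code.

def fit_stump(xs, residuals):
    """Fit a depth-1 tree: choose the split on x with the lowest squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def stump_predict(stump, x):
    """Return the leaf value of the side of the split that x falls on."""
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbt(xs, ys, n_trees, learning_rate):
    """Each new tree is fitted to the residuals left by the trees before it."""
    base = sum(ys) / len(ys)          # starting prediction: mean of the label
    preds = [base] * len(xs)
    trees = []
    for _ in range(n_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        trees.append(stump)
        preds = [p + learning_rate * stump_predict(stump, x)
                 for x, p in zip(xs, preds)]
    return base, trees


def predict(base, trees, learning_rate, x):
    """Final prediction: starting value plus the scaled sum over the trees."""
    return base + learning_rate * sum(stump_predict(t, x) for t in trees)


LEARNING_RATE = 0.1
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.1, 3.9, 5.2, 6.1]
base, trees = fit_gbt(xs, ys, n_trees=50, learning_rate=LEARNING_RATE)


def mse(n):
    """Training error when only the first n trees are summed."""
    return sum((y - predict(base, trees[:n], LEARNING_RATE, x)) ** 2
               for x, y in zip(xs, ys)) / len(xs)


for n in (0, 10, 50):
    print(n, "trees -> training MSE", round(mse(n), 4))
```

Summing more trees lowers the training error step by step, which is the "another term in the series" picture from the reply above. On unseen data the error would eventually rise again once the model overfits.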

  • felix_woess
    New Altair Community Member

    Hi @mschmitz

     

    thank you very much for your reply! :D 

     

    By summing up the trees, do you mean summing up all the trees of a specific prediction range, e.g. (0-0,1), like this:

    Tree (1) (0-0,1)

    Tree (1) (0,1-0,2)

    Tree (2) (0-0,1)

    Tree (2) (0,1-0,2)

    Tree (3) (0-0,1)

    Tree (3) (0,1-0,2)

    ...

     

    Furthermore, when I look at the trees in RapidMiner, I can see numbers like 0,086, -0,02, 0,5 etc. at the bottom of the trees. What do these numbers mean, and how can I interpret them?

     

    Best regards

    Felix

     

  • MartinLiebig
    Altair Employee

    Hi,

    I think the numbers represent the gain in gradient. The number that is added to the prediction is the average (with MSE as the loss measure) over the examples in the leaf. I think these are not the displayed values.

     

    BR,

    Martin
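
As a small worked example of the averaging described above: in the standard textbook formulation of boosting with squared-error loss, the value a leaf contributes is the mean residual (label minus current prediction) of the training examples that fall into that leaf. The sketch below uses made-up numbers and a hypothetical helper name. Whether the numbers displayed in RapidMiner's tree view are exactly these values is not confirmed in this thread.

```python
# Leaf value under squared-error loss: mean residual of the examples in the leaf.
# Numbers and names are hypothetical and only illustrate the arithmetic.

def leaf_value(labels, current_predictions):
    """Average of (label - current prediction) over the examples in one leaf."""
    residuals = [y - p for y, p in zip(labels, current_predictions)]
    return sum(residuals) / len(residuals)


labels = [0.50, 0.70, 0.60]    # true label values of three examples in a leaf
current = [0.55, 0.60, 0.50]   # model predictions before this tree is added

value = leaf_value(labels, current)   # residuals -0.05, 0.10, 0.10 -> about 0.05

learning_rate = 0.1
# Each example in the leaf is shifted by learning_rate * value.
updated = [p + learning_rate * value for p in current]

print("leaf value:", round(value, 4))
print("updated predictions:", [round(p, 4) for p in updated])
```

Because a leaf value is a correction to the current prediction rather than a prediction itself, small and negative leaf values are expected in this formulation.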

  • felix_woess
    New Altair Community Member

    Hi, 

     

    I reread your article about GBT and now I get these numbers! 

     

    Regarding my first question about summing up the trees, is it correct to sum up the trees for the specific prediction range (highlighted in my previous post) or should I take all the trees? 

     

    Sorry to bother you again :-/ 

     

    Best regards

    Felix