Hello all,
In the description of the "Weight by Tree Importance" operator I found a remark concerning the split criterion of the trees:
This algorithm is implemented following the idea from "A comparison of random forest and its Gini importance with standard chemometric methods for the feature selection and classification of spectral data" by Menze, Bjoern H. et al. (2009). It has been extended by additional criteria for computing the benefit created from a certain split. The original paper only mentioned the Gini Index; this operator additionally supports the more reliable criteria Information Gain and Information Gain Ratio.
Is there any theoretical (or maybe empirical) background on the extent to which Gain Ratio could be superior to the Gini Index? This would be very interesting, because I did indeed see better performance with Gain Ratio than with an otherwise identical setup using the Gini Index, but I have no idea why, or whether the difference is statistically significant. Maybe whoever wrote this algorithm description/wiki entry can point me to a reference paper or to the basic theory behind this assumption?
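To make clear which quantities I am comparing, here is a minimal Python sketch of the three criteria for a single split, using the textbook definitions (Gini impurity decrease, entropy-based information gain, and gain ratio as information gain divided by the split information). The function names and the toy labels are my own, and I am only assuming that the operator computes something equivalent; I have not checked its source.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_scores(parent, children):
    """Return (gini_gain, info_gain, gain_ratio) for one candidate split.

    `parent` is the list of labels at the node and `children` is a list of
    non-empty label lists, one per branch (hypothetical helper, textbook
    definitions only).
    """
    n = len(parent)
    weights = [len(ch) / n for ch in children]
    gini_gain = gini(parent) - sum(w * gini(ch) for w, ch in zip(weights, children))
    info_gain = entropy(parent) - sum(w * entropy(ch) for w, ch in zip(weights, children))
    # Split information: entropy of the branch sizes, ignoring class labels.
    split_info = -sum(w * math.log2(w) for w in weights)
    gain_ratio = info_gain / split_info if split_info > 0 else 0.0
    return gini_gain, info_gain, gain_ratio


# Toy data: 4 examples of class "a" and 4 of class "b".
parent = ["a"] * 4 + ["b"] * 4
two_way = [["a"] * 4, ["b"] * 4]       # clean binary split
eight_way = [[label] for label in parent]  # one branch per example (ID-like attribute)

print(split_scores(parent, two_way))
print(split_scores(parent, eight_way))
```

On this toy data both splits get the same Gini gain (0.5) and the same information gain (1 bit), but the gain ratio drops from 1 for the binary split to 1/3 for the eight-way split, because the split information grows with the number of branches. So one difference I can see is that Gain Ratio penalizes attributes with many values, while the other two do not. Whether that is what "more reliable" refers to in the description is exactly what I would like to know.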
Ole