How do Random Forests in RapidMiner support missing values?

dudwell
dudwell New Altair Community Member
edited November 2024 in Community Q&A
Does a random forest predict a missing value, or does it exclude it from the final prediction?

Best Answers

  • varunm1
    varunm1 New Altair Community Member
    Answer ✓
    Generally, random forest algorithms impute missing values with a proximity-weighted average or with the mode. But if you select gain_ratio as the criterion, it uses the C4.5 algorithm developed by Quinlan. C4.5 does not impute values; instead it calculates an impurity score that takes the missing values into account, and it uses that score when it encounters missing values in the test set. So it looks like samples with missing values are not removed, or else the behaviour depends on the criterion selected.

    Please correct me if there is any misconception here.

    Thanks 
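
To make the C4.5 idea above a bit more concrete, here is a minimal Python sketch of how textbook C4.5 scores a split on an attribute with missing values: the gain is computed only on rows where the value is known, then scaled by the fraction of known rows. This follows Quinlan's published description, not RapidMiner's source code, so treat it as an assumption about what happens internally. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_with_missing(values, labels):
    """Information gain of one nominal attribute, textbook C4.5 style.

    Rows whose attribute value is None are left out of the gain
    computation, and the result is scaled by the fraction of rows
    whose value is known. Nothing is imputed.
    """
    known = [(v, y) for v, y in zip(values, labels) if v is not None]
    if not known:
        return 0.0

    # Group the class labels by attribute value (known rows only).
    groups = {}
    for v, y in known:
        groups.setdefault(v, []).append(y)

    known_labels = [y for _, y in known]
    remainder = sum(len(g) / len(known) * entropy(g) for g in groups.values())
    known_fraction = len(known) / len(values)
    return known_fraction * (entropy(known_labels) - remainder)


# Hypothetical toy data: one of five rows has a missing attribute value.
outlook = ["sunny", "sunny", "rain", "rain", None]
play = ["no", "no", "yes", "yes", "yes"]
gain = gain_with_missing(outlook, play)
print(round(gain, 3))
```

In this toy case the attribute separates the four known rows perfectly, so the unscaled gain is 1 bit, and the missing row reduces it to 4/5 of that. The row is neither dropped from the data set nor filled in; it only lowers the attribute's score.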
