I think we go both ways at the split and take the average of the two predictions, but I would need to check.
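To make the "go both ways" idea concrete, here is a minimal sketch of prediction in a single regression tree. It assumes a hypothetical dict-based tree layout and an unweighted average of the two branches; neither is confirmed to be what the tool actually does.

```python
import math

# Hypothetical tree layout (an assumption for illustration only):
#   leaf  -> {"value": float}
#   split -> {"feature": int, "threshold": float, "left": node, "right": node}

def is_missing(v):
    """Treat None and NaN as missing."""
    return v is None or (isinstance(v, float) and math.isnan(v))

def predict(node, x):
    """Predict for one sample x (a list of feature values)."""
    if "value" in node:
        return node["value"]
    v = x[node["feature"]]
    if is_missing(v):
        # Split value is missing: follow both branches and average the results.
        return 0.5 * (predict(node["left"], x) + predict(node["right"], x))
    branch = "left" if v <= node["threshold"] else "right"
    return predict(node[branch], x)

tree = {
    "feature": 0, "threshold": 5.0,
    "left": {"value": 1.0},
    "right": {
        "feature": 1, "threshold": 2.0,
        "left": {"value": 3.0},
        "right": {"value": 5.0},
    },
}

complete = predict(tree, [4.0, 0.0])        # no missing values
one_missing = predict(tree, [None, 1.0])    # feature 0 missing
both_missing = predict(tree, [None, None])  # both features missing
```

A weighted variant would replace the 0.5 factors with the share of training samples that went down each branch.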
Generally, random forest algorithms impute missing values with a proximity-based average or the mode. But if you select gain_ratio as the criterion, it uses the C4.5 algorithm developed by Quinlan. C4.5 does not impute values; instead it calculates an impurity score that takes the missing values into account, and uses it when it encounters missing values in the test set. So it looks like samples with missing values are not removed, or the behavior depends on the criterion we select.
Please correct me if I have misunderstood anything.
Thanks
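As a rough illustration of the C4.5 point above, here is a sketch of how information gain can account for missing values at training time. It follows my understanding of Quinlan's approach (compute the gain on samples with a known value, then scale it by the fraction of known values); the function names are hypothetical and this is not checked against the tool's actual implementation.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_with_missing(values, labels):
    """Information gain of a categorical attribute; None marks a missing value."""
    known = [(v, y) for v, y in zip(values, labels) if v is not None]
    if not known:
        return 0.0
    frac_known = len(known) / len(values)

    # Group the labels of known-value samples by attribute value.
    groups = {}
    for v, y in known:
        groups.setdefault(v, []).append(y)

    known_labels = [y for _, y in known]
    remainder = sum(len(g) / len(known) * entropy(g) for g in groups.values())

    # Gain is measured on the known samples only, then discounted by
    # the share of samples whose value is actually known.
    return frac_known * (entropy(known_labels) - remainder)

values = ["a", "a", "b", "b", None, None]
labels = ["y", "y", "n", "n", "y", "n"]
gain = gain_with_missing(values, labels)
```

In this toy data the attribute separates the known samples perfectly, so the gain is just the known fraction, 4 out of 6. No sample is dropped or imputed; the missing ones only reduce how much the attribute is trusted.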