-
Feature Importance for Regression Random Forest
Hello Everyone, I am looking for an operator (or any other way) to find the attribute importance of my model. I have selected an RF model and tried to use the operator "Weight by Tree Importance" to find the weights of my attributes. However, I received the following error message: Attribute Weights cannot be extracted…
-
Gradient Boosted Tree doesn't give the final prediction
Hello RapidMiner Community! I want to ask about the Gradient Boosted model that I used for my study on predicting corporate default risk. My dependent variable is default and non-default, and I use the number 1 for default and 0 for non-default. I already set up the data type as binominal. After I call the related operators such…
-
Gradient Boosted Tree doesn't show the dependent variable in the resulting trees
Hello, I want to ask about the Gradient Boosted model that I used for my study on corporate default risk. My dependent variable is default and non-default, and I use the number 1 for default and 0 for non-default. I already set up the data type as binominal for default and non-default. After I call the related operators such…
-
Convert categorical variables into dummy variables
Hi, I want to perform a regression task to predict a continuous response. I have 4 categorical variables; the others are numerical. The categorical variables are: age = (≤20, 21-35, 36-50, ≥51), gender = (Female, Male), income level = (1 = insufficient, 2 = sufficient), BMI range = (1 = <25, 2 = >25). *Income level & BMI are keyed in as numerical code in my…
-
Is there a way to tune sample size as a hyperparameter for Random Forest?
Along with these hyperparameters, I also want to tune the sample size. I tried using sampling operators in RM, but those are not random in the way you would get with bootstrapping. Is there any way to do that, or am I missing something?
-
How many trees and how much data are needed to optimize a random forest?
I'm building a model that predicts fast, medium, or slow. I used 100 trees and around 1000 training examples, but it always returns the fast prediction.
-
Optimizing Random Forest using Genetic Algorithm
Hi everyone, I would like to ask about a classifier using a Genetic Algorithm. I have a process in RapidMiner like the attached XML, but I'm not sure whether what I am doing is right or wrong. Thank you
-
Optimization Grid with Random Forest - Not Working.
RapidMiner Unicorns 🦄, I am trying to run an optimization grid with my Random Forest model and I am getting an error. It states that the gain_ratio criterion cannot be used for numeric labels (see pictures below). I checked all my parameters and I am not using gain_ratio in the optimization grid (see pictures below). So,…
-
How to create one final decision tree
Hello, I need to create one final decision tree based on 25 different decision trees, but I am not sure how to do that. They need to be constructed in a sequential manner where we update the weights of the training examples based on the prediction and the error rate of the previous decision. Right now I have the random…
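The sequential scheme described in this question (re-weighting the training examples after each tree according to its error rate) resembles the AdaBoost idea. Below is a minimal sketch of a single re-weighting round, assuming binary labels and the classic AdaBoost update. The function name `reweight` and the toy data are hypothetical, and this is not RapidMiner's implementation.

```python
import math

def reweight(weights, y_true, y_pred):
    """One AdaBoost-style round (hypothetical helper).

    Returns (alpha, new_weights), where alpha is the vote weight of the
    current tree and new_weights are the normalized example weights
    to use when training the next tree.
    """
    total = sum(weights)
    # Weighted error rate of the current tree.
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p) / total
    if not 0 < err < 1:
        raise ValueError("error rate must be strictly between 0 and 1")
    # Vote weight: larger when the tree makes fewer weighted errors.
    alpha = 0.5 * math.log((1 - err) / err)
    # Increase the weight of misclassified examples, decrease the rest.
    raw = [
        w * math.exp(alpha if t != p else -alpha)
        for w, t, p in zip(weights, y_true, y_pred)
    ]
    norm = sum(raw)
    return alpha, [w / norm for w in raw]

# Toy data (assumed): four examples, the third one is misclassified.
weights = [0.25, 0.25, 0.25, 0.25]
y_true = [1, 0, 1, 0]
y_pred = [1, 0, 0, 0]
alpha, new_weights = reweight(weights, y_true, y_pred)
print(alpha, new_weights)
```

Under this scheme the result is not literally one merged tree: the final prediction would typically be a vote of all 25 trees, each weighted by its own `alpha`.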
-
Random Forest keeps getting worse
Hi, I'm using different methods to compare the results using the design from this tutorial: https://academy.rapidminer.com/learn/video/automatic-classification-of-documents Due to the runtime I initially set Random Forest to only 10 trees and only 5-fold cross-validation (Naive Bayes and Decision Tree were run with…
-
Stacking Classification Model
Hello :) I'm working on a binary classification model. I'm trying to improve the performance using a stacking model with RF, Deep Learning and Gradient Boosted Tree. How can I choose the best algorithm for the aggregation level? Is there a best practice? (Right now I'm using RF.)
-
Improve Random forest performance
Hello! :) I'm working on a random forest predictive model to predict a binary label. The dataset is unbalanced, about 70% to 30%. The attributes are numeric and represent financial statement indices or amounts in euros such as EBITDA. The process includes data reading, selection of features with missing values <10%,…
-
In what order can I use replace missing values and normalization?
Hello! :) I'm trying to build a classification model with RF and I have a data set with many missing values. I tried to use RF without replacing missing values, but the performance is not good. So I'd like to replace missing values with the average. Which order is better for inserting the operator? Before or after…
-
What is Needed to Make this Random Forest Model Predict into the Future?
Hi, I've used the Apply Model operator as suggested to try to make this Random Forest model predict the "future", but it still doesn't predict beyond the last date of my data set (15th Sept 2020). I've attached the process to see if anyone can figure out what is needed to get it to make forward future predictions?…
-
Issue found in feature weights of RandomForest for regression
It seems that there is an issue or a bug in the feature_weights returned by the RandomForest operator, but only for regression. I found the problem on one dataset and reproduced it on the Iris dataset, for which features a3 and a4 are the most important, but according to the regression RandomForest these two features are the…
-
Cross Validation with Random Forest
Hello there, I cannot overcome a very simple problem, which I have shown in the attached figure. I want to do CrossValidation with RandomForest. When I go inside the CV, I cannot get the output of the port (Wei), which gives the weight coefficients of the RF on the left, out of the CV. When I do it with a data split, I can…
-
Need to Use Random Forest Classifier to Replace Missing Values in Spreadsheets
Hi everyone, I have been using =MODE() etc. to impute missing values in column arrays. Sick of this. Seems unsophisticated, and not much fun, either. A more advanced ML guy told me that he uses the Random Forest Classifier to rid himself of all the empty cells (which for me are all just in the form of tons of zeros). 2…
-
Where to Add Apply Forecast Operator in Random Forest Process
Hi there, I set up a Random Forest (RF) for a time series prediction and wondered how to add an Apply Forecast operator to get future predictions for my process. Please see the image. Do I add it to the top row between the RF and Apply Model operators? Cheers for any help, it'll be interesting to see how this RF compares with the…
-
Potential Problem Detected (Learner ignores example weights)
Hi All, I want to try AdaBoost with my dataset. RandomForest is available in AdaBoost, but the warning "Input example set has example weights, but the learner will ignore them" is displayed. I wonder why this occurs. Am I using AdaBoost correctly? In this case, can I use the result of the algorithm? Thanks. I attached my…
-
Prediction Error on a text-based dataset
Hello everyone, I am new to RapidMiner and I have been stuck on this problem for many hours and I need help. I am using the movie dataset from UCI: http://archive.ics.uci.edu/ml/datasets/Movie It contains different datasets of movies, actors in those movies and the directors of those movies. The main file is the movies…
-
How to compare the SVM and random forest results?
Hi, I am trying to use the prediction in Auto Model but have several questions about the results of SVM and random forest. * I wonder why the results of SVM and RF barely match? For example, attribute 1 has the highest weight based on the SVM result, but it is one of the attributes with the lowest weight in the RF…
-
Please explain guess_subset_ratio parameter in Random Forest operator
As per the operator description, if this parameter is set to true, then int(log(m) + 1) attributes are selected. What is int? Please demonstrate this with an example.
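As a rough illustration of the formula quoted in this question: `int` most plausibly means integer truncation (dropping the fractional part), so int(log(m) + 1) is a whole number of attributes for m available attributes. The base of the logarithm is not stated in the excerpt, so the sketch below computes both a natural-log and a base-2 variant; which one the operator actually uses is an assumption to check against the operator documentation. The function name `guessed_subset_size` is hypothetical.

```python
import math

def guessed_subset_size(m, log=math.log):
    """Return int(log(m) + 1) for m attributes (hypothetical helper).

    `log` defaults to the natural logarithm; pass math.log2 to try the
    base-2 variant. The base used by the operator is an assumption here.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    # int() truncates toward zero, i.e. it drops the fractional part.
    return int(log(m) + 1)

for m in (4, 10, 100):
    natural = guessed_subset_size(m)
    base2 = guessed_subset_size(m, log=math.log2)
    print(f"m={m}: natural log -> {natural}, log2 -> {base2}")
```

For example, with m = 10 attributes the natural-log variant gives int(2.30 + 1) = 3, while the base-2 variant gives int(3.32 + 1) = 4.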
-
How to select the optimal/best tree from the collection of trees generated by Random Forest?
I have a process that uses three random forest algorithms, each with different parameters. I use the operator collect and compare models to select the best of the three learners. However, I would like to know which of the collection of trees was chosen by the random forest learner, so that I can use that tree to explain…
-
Random Forest
Hey RM Family, I want to use the Random Forest here; as a result I get several trees displayed, which is understandable. But I saw in a tutorial that I can lead them to a result. Meaning, for example, I would need 80% training and 20% testing, so does the approach I used here via the Split Data operator work as 20/80? I am looking…
-
Random Forest - Attribute Importance
I have built a Random Forest model that shows very good accuracy after many test runs so I think I found a winner for my simple problem. I used "Weight by Tree Importance" operator to see which attributes are most important. Customer Income turned out to be most important. But how do I know if higher or lower income…