Regression with Random Forest?

User: "phivu"
New Altair Community Member
Updated by Jocelyn

Hi RapidMiner,

I'm doing regression with 480 input features. I tried the Deep Learning operator, but the training root mean square error (RMSE) is still quite high. Now I'd like to try Random Forest because of its random subspace approach, but I found that the Random Forest operator cannot handle a numerical label. How can I deal with this?

Thank you very much for your support.

Best regards,

phivu

    User: "earmijo"
    New Altair Community Member
    Accepted Answer

    You cannot do it in RapidMiner unless you are willing to use R scripts. However, the latest version of RapidMiner has a new operator, Gradient Boosted Trees, which is competitive with Random Forest and can handle both numerical and polynominal labels. It is worth exploring.

    User: "earmijo"
    New Altair Community Member
    Accepted Answer

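    Install the R Script Extension, verify that R is installed on your computer, and run the process below. The script loads the randomForest package, so that package must be installed in R as well. I adapted the sample code that ships with the extension so that it runs Random Forest on a regression problem.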

    <?xml version="1.0" encoding="UTF-8"?><process version="7.3.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.3.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" breakpoints="after" class="retrieve" compatibility="7.3.001" expanded="true" height="68" name="Retrieve Polynomial" width="90" x="45" y="34">
    <parameter key="repository_entry" value="//Samples/data/Polynomial"/>
    <description align="center" color="blue" colored="true" width="126">Fetch example data</description>
    </operator>
    <operator activated="true" class="split_data" compatibility="7.3.001" expanded="true" height="103" name="Split Data" width="90" x="179" y="34">
    <enumeration key="partitions">
    <parameter key="ratio" value="0.5"/>
    <parameter key="ratio" value="0.5"/>
    </enumeration>
    <description align="center" color="purple" colored="true" width="126">Split the data in a training and a test set</description>
    </operator>
    <operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Learn Model" width="90" x="380" y="34">
    <parameter key="script" value="# train a random Forest on the training data and return the learned model&#10;&#10;rm_main = function(data)&#10;{&#10; library(randomForest) &#10;&#9;Model.rf &lt;- randomForest(label~., data =data,mtry=3,importance=FALSE,na.action=na.omit)&#10; &#9;return(Model.rf)&#10;}&#10;"/>
    <description align="center" color="red" colored="true" width="126">Train a RandomForest model in R and return it as an R object</description>
    </operator>
    <operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="103" name="Apply R Model" width="90" x="514" y="238">
    <parameter key="script" value="## load the trained model and apply it on the test data&#10;&#10;rm_main = function(model, data)&#10;{&#10; library(randomForest)&#10; # apply the model and build a prediction&#10; result &lt;-predict(model, data)&#10;&#10; # add the prediction to the example set&#10; data$prediction &lt;- result&#10; &#10; # update the meta data&#10; metaData$data$prediction &lt;&lt;- list(type=&quot;real&quot;, role=&quot;prediction&quot;)&#10; &#10; return(data)&#10;}&#10;"/>
    <description align="center" color="red" colored="true" width="126">Apply the trained model on the test data</description>
    </operator>
    <connect from_op="Retrieve Polynomial" from_port="output" to_op="Split Data" to_port="example set"/>
    <connect from_op="Split Data" from_port="partition 1" to_op="Learn Model" to_port="input 1"/>
    <connect from_op="Split Data" from_port="partition 2" to_op="Apply R Model" to_port="input 2"/>
    <connect from_op="Learn Model" from_port="output 1" to_op="Apply R Model" to_port="input 1"/>
    <connect from_op="Apply R Model" from_port="output 1" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
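
For readability, here is a sketch of the two R scripts from the process above with the XML escaping removed, so they can be tried in a plain R session. Some parts are assumptions for illustration and are not in the original process: the function names `train_model` and `apply_model` (in RapidMiner each Execute R operator defines its own `rm_main`), and the synthetic data frame with columns `a1` to `a5` that stands in for the sample data. The `metaData` update from the second script is left out because that object only exists inside RapidMiner.

```r
# Readable form of the two scripts embedded in the process XML.
# Requires the randomForest package: install.packages("randomForest")
library(randomForest)

# "Learn Model" operator: fit a random forest on the training partition.
# Because `label` is numeric, randomForest() runs in regression mode.
train_model <- function(data) {
  randomForest(label ~ ., data = data, mtry = 3,
               importance = FALSE, na.action = na.omit)
}

# "Apply R Model" operator: append predictions to the test partition.
apply_model <- function(model, data) {
  data$prediction <- predict(model, data)
  data
}

# Synthetic stand-in data (hypothetical, for illustration only).
set.seed(1)
n <- 200
df <- data.frame(a1 = runif(n), a2 = runif(n), a3 = runif(n),
                 a4 = runif(n), a5 = runif(n))
df$label <- df$a1^2 + 2 * df$a2 - df$a3 + rnorm(n, sd = 0.1)

# 50/50 split, mirroring the Split Data operator.
train <- df[1:100, ]
test  <- df[101:200, ]

model  <- train_model(train)
scored <- apply_model(model, test)

# Test-set RMSE of the predictions.
rmse <- sqrt(mean((scored$label - scored$prediction)^2))
print(model$type)
print(rmse)
```

Note that `mtry = 3` suits a data set with only a handful of attributes. With 480 input features you would probably want a larger value. If `mtry` is omitted, randomForest uses one third of the number of predictors for regression.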