"R extension - how to get started"

marie
marie New Altair Community Member
edited November 2024 in Community Q&A

Hey there,

I am really new to RapidMiner and R, and I did not find anything on the internet on how to get started with the R extension in RapidMiner that really breaks it down to the basics. So I just tried out some very simple things, like the max of a column.

The script is the following:

rm_main = function(data)
{
max($Temperature)
return(data)
}

and the error message is:

The script could not be parsed. Please check your R script.

[1] "script.R:5:5: unexpected '$' (....)"

 

Do you know how to solve it?

Or do you have something where one can learn how to get started with the R extension in RapidMiner with just basic knowledge?

Thanks in advance

Marie

 

Ah, and here is the XML:

<?xml version="1.0" encoding="UTF-8"?><process version="7.2.001">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="7.2.001" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="7.2.001" expanded="true" height="68" name="Retrieve Golf" width="90" x="45" y="85">
<parameter key="repository_entry" value="//Samples/data/Golf"/>
</operator>
<operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Execute R" width="90" x="246" y="85">
<parameter key="script" value="rm_main = function(data)&#10;{&#10; max($Temperature)&#10;}&#10;"/>
</operator>
<connect from_op="Retrieve Golf" from_port="output" to_op="Execute R" to_port="input 1"/>
<connect from_op="Execute R" from_port="output 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>



Answers

  • Thomas_Ott
    Thomas_Ott New Altair Community Member

    Hi,

     

    Working with the Execute R operator is pretty straightforward once you understand how RM delivers the data to the function. See your sample script, modified, below.

     

    RM sends its data to the Execute R script and translates it via the data.table package. The raw data comes in as "data" via function(data). From there I assign it to a data frame with golf <- data and then extract the maximum of the Temperature column via output <- max(golf$Temperature)

     

    Then I return the output as an object.

     

    I added a print statement so you can see the result in your Log view.
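
As a rough sketch of the approach described above, here is the same idea written as plain R rather than escaped inside the process XML. The `golf_sample` data frame and its values are made up so the snippet can be run outside RapidMiner; inside RapidMiner the operator supplies `data` itself. It also shows why the original script failed to parse: `$` needs an object on its left-hand side.

```r
# Stand-in for the Golf sample data (hypothetical values, for illustration only).
golf_sample <- data.frame(
  Outlook = c("sunny", "overcast", "rain"),
  Temperature = c(85, 83, 70)
)

rm_main <- function(data) {
  # max($Temperature) is a syntax error: `$` must follow an object,
  # e.g. data$Temperature or golf$Temperature.
  golf <- data
  output <- max(golf$Temperature)

  # Printed values appear in the Log view when run inside RapidMiner.
  print(output)

  return(output)
}

# Outside RapidMiner, call the function directly to try it out.
result <- rm_main(golf_sample)
```

Calling `rm_main(golf_sample)` here only imitates what the Execute R operator does with the connected input port.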

     

    <?xml version="1.0" encoding="UTF-8"?><process version="7.2.002">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="7.2.002" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="7.2.002" expanded="true" height="68" name="Retrieve Golf" width="90" x="45" y="85">
    <parameter key="repository_entry" value="//Samples/data/Golf"/>
    </operator>
    <operator activated="true" class="r_scripting:execute_r" compatibility="7.2.000" expanded="true" height="82" name="Execute R" width="90" x="179" y="85">
    <parameter key="script" value="rm_main = function(data)&#10;{&#10;golf &lt;- data&#10;&#10;output &lt;- max(golf$Temperature)&#10;&#10;print(output)&#10;&#10;return(output)&#10;}&#10;"/>
    </operator>
    <connect from_op="Retrieve Golf" from_port="output" to_op="Execute R" to_port="input 1"/>
    <connect from_op="Execute R" from_port="output 1" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    </process>
    </operator>
    </process>
  • marie
    marie New Altair Community Member

    Hey Thomas_Ott,

    thank you very much for your quick reply. 

    What you write seems very logical.

    But when I copied the XML, all I get in the Results view is:

    File

    Memory buffered file

     

    What does that mean?

    With kind regards

    Marie

     

  • Thomas_Ott
    Thomas_Ott New Altair Community Member
    Answer ✓

    Hi Marie,

     

    That means that the R script returned an object that RapidMiner can't visualize in the Results tab. This is why I added the print statement, so you can see the max temperature in the Log view. Take a look at the sample tutorial processes loaded for the Execute R operator. Just right-click on the operator and click on the description. There will be a link for "Jump to Tutorial Processes."

     

    There are about 4 different R examples, which explain a bit about how you can embed your scripts inside RapidMiner. Good luck!
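
One possible way to get something displayable in the Results view is to return a data frame instead of a bare number. This is a minimal sketch, assuming the Execute R operator converts a returned data frame into an example set (worth confirming in the operator description). The column name `max_temperature` and the sample values are made up for illustration.

```r
rm_main <- function(data) {
  # Wrap the single value in a one-row data frame
  # instead of returning a plain numeric value.
  data.frame(max_temperature = max(data$Temperature))
}

# Hypothetical stand-in for the Golf data, to try the function outside RapidMiner.
golf_sample <- data.frame(Temperature = c(85, 83, 70))
result <- rm_main(golf_sample)
```

The print statement is no longer needed in this variant, since the value is carried in the returned table.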
