"non numeric argument to binary operator" when running R in RapidMiner

cpmysore
cpmysore New Altair Community Member
edited November 2024 in Community Q&A
Hi! I am trying to run a simple logistic regression between Promotion (binary) and Stats (binary). The code works in R, but when I try to replicate it in RapidMiner's R extension, it gives the following error message.

CODE (have tried multiple options)
logModel <- glm(Promotion ~ Stats,family="binomial"('logit'),data =data)
logModel <- glm(formula=Promotion ~ Stats,family=binomial(logit),data =data)

I have done the following preprocessing: (a) marking Promotion as "label" and "binomial", (b) a nominal-to-numerical transformation, and (c) selecting attributes and setting roles.

I still end up with the following error:
non numeric argument to binary operator


Any suggestions or tips would be appreciated. Thanks in advance.

Answers

  • David_A
    David_A New Altair Community Member
    Hi,

    One possible problem is that, when the data is loaded into R, the classes are not correctly identified as factors in the data frame.
    I tried to reproduce your problem (see the process below), and I also had to set the Promotion label as a factor in the R script.


    Best,
    David
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="7.0.001">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="7.0.001" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="generate_data" compatibility="7.0.001" expanded="true" height="68" name="Generate Data" width="90" x="45" y="34">
            <parameter key="target_function" value="random classification"/>
          </operator>
          <operator activated="true" class="rename" compatibility="7.0.001" expanded="true" height="82" name="Rename" width="90" x="246" y="34">
            <parameter key="old_name" value="label"/>
            <parameter key="new_name" value="Promotion"/>
            <list key="rename_additional_attributes">
              <parameter key="att1" value="Stats"/>
            </list>
          </operator>
          <operator activated="true" class="select_attributes" compatibility="7.0.001" expanded="true" height="82" name="Select Attributes" width="90" x="447" y="34">
            <parameter key="attribute_filter_type" value="subset"/>
            <parameter key="attributes" value="Promotion|Stats"/>
          </operator>
          <operator activated="true" class="r_scripting:execute_r" compatibility="6.5.000-SNAPSHOT" expanded="true" height="82" name="Execute R" width="90" x="715" y="34">
            <parameter key="script" value="# rm_main is a mandatory function, &#10;# the number of arguments has to be the number of input ports (can be none)&#10;rm_main = function(data)&#10;{&#10;    data$Promotion &lt;- as.factor(data$Promotion)&#10;    print(str(data))&#10;    &#10;    logModel &lt;- glm(Promotion ~ Stats,family=&quot;binomial&quot;('logit'),data =data)&#10;    return(logModel)&#10;    &#10;}&#10;"/>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Rename" to_port="example set input"/>
          <connect from_op="Rename" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
          <connect from_op="Select Attributes" from_port="example set output" to_op="Execute R" to_port="input 1"/>
          <connect from_op="Execute R" from_port="output 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
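
The factor conversion used in the process above can also be tried in plain R, outside RapidMiner. The following is only a minimal sketch under stated assumptions: the data frame is made up (the "no"/"yes" labels, the 0/1 values for Stats, the row count, and the seed are hypothetical, not the original poster's data), and it stands in for whatever RapidMiner passes to `rm_main`. It inspects the column types, converts the label to a factor, and then fits the same model.

```r
# Hypothetical toy data standing in for the example set that RapidMiner
# hands to rm_main(); values, row count, and seed are assumptions.
set.seed(42)
data <- data.frame(
  Promotion = sample(c("no", "yes"), 100, replace = TRUE),
  Stats = sample(c(0, 1), 100, replace = TRUE),
  stringsAsFactors = FALSE
)

# Inspect how each column arrived: here Promotion is still a character column.
str(data)

# Convert the label to a factor before fitting, as suggested in the answer.
data$Promotion <- as.factor(data$Promotion)

# Same model as in the question, written with the link named explicitly.
logModel <- glm(Promotion ~ Stats, family = binomial(link = "logit"), data = data)
print(coef(logModel))
```

If `str(data)` inside Execute R shows the label as `chr` rather than `Factor`, that would be consistent with the explanation given in the answer, although other causes are possible.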