"Problem building a regression model in RapidMiner"

User: "winecoding"
New Altair Community Member
Updated by Jocelyn
I tried linear regression on the following data set:
x y1 z1 label
0 85.2475654 245.1558442 99.69204152
-1 36.00008409 -50.37614679 95.61016949
-2 257.1300917 517.2790698 189
-2 194.4923912 10.50413223 593.6107784
1 602.6111798 410.6153846 345.1538462
1 36.2366869 608.7922078 1.076124567
-5 13.09949256 16.59633028 -4.389830508
-5 660.3381923 468.0886076 353.7486034
3 52.75862603 724.5955056 -20.92633223
-5 37.49788729 64.61607143 -2.71990172
The column "label" is the response variable, and the other three columns are predictor variables. I built the RapidMiner workflow as follows:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
    <process expanded="true" height="224" width="346">
      <operator activated="true" class="read_csv" compatibility="5.2.008" expanded="true" height="60" name="Read CSV" width="90" x="59" y="95">
        <parameter key="csv_file" value="C:\Users\Desktop\training.csv"/>
        <parameter key="column_separators" value=","/>
        <parameter key="first_row_as_names" value="false"/>
        <list key="annotations">
          <parameter key="0" value="Name"/>
        </list>
        <parameter key="encoding" value="windows-1252"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="x.true.integer.attribute"/>
          <parameter key="1" value="y1.true.real.attribute"/>
          <parameter key="2" value="z1.true.real.attribute"/>
          <parameter key="3" value="label.true.real.label"/>
        </list>
      </operator>
      <operator activated="true" class="linear_regression" compatibility="5.2.008" expanded="true" height="94" name="Linear Regression" width="90" x="246" y="75"/>
      <connect from_op="Read CSV" from_port="output" to_op="Linear Regression" to_port="training set"/>
      <connect from_op="Linear Regression" from_port="model" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
The resulting model is not correct. R, on the other hand, builds the linear regression model for this data set without any problem. I am not sure why RapidMiner has a problem with this data set. Thanks.

    User: "earmijo"
    New Altair Community Member
    You should get exactly the same model as R if you set the feature selection parameter of the Linear Regression operator to "none". By default, RapidMiner applies M5 prime feature selection, which can drop attributes from the model. From what I understand, this is roughly equivalent to selecting attributes by the AIC.
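
As a rough sketch of that change, the Linear Regression operator from the process above could carry the parameter explicitly. The key `feature_selection` and the value `none` are assumptions based on how RapidMiner 5 usually serializes this setting; it is worth confirming them against the operator's parameter panel before relying on this.

```xml
<!-- Sketch only: same operator as in the original process, with feature selection switched off. -->
<!-- Assumption: the parameter key is "feature_selection" and the value for the "None" option is "none". -->
<operator activated="true" class="linear_regression" compatibility="5.2.008" expanded="true" height="94" name="Linear Regression" width="90" x="246" y="75">
  <parameter key="feature_selection" value="none"/>
</operator>
```

With selection disabled, all three predictors (x, y1, z1) should stay in the model, which is what a plain least-squares fit in R uses. Small differences in the coefficients may still remain if other operator defaults, such as a ridge term, differ from R's.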