How to use Polynomial Regression in RapidMiner correctly

rookie
rookie New Altair Community Member
edited November 2024 in Community Q&A

Hello, everyone. This is my first forum post, and I have a question about polynomial regression in RapidMiner.

The original data is:
x: 4194.06, 3466.45, 2070.08, 874.98
y: 91540.07, 109460.36, 120338.64, 102182.19

The first process, shown below, uses the Polynomial Regression operator to obtain the first result expression.


<?xml version="1.0" encoding="UTF-8"?>
<process version="9.6.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.6.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="read_excel" compatibility="9.6.000" expanded="true" height="68" name="Read Excel" width="90" x="45" y="85">
        <parameter key="excel_file" value="C:\Users\1\Desktop\question data.xlsx"/>
        <parameter key="sheet_selection" value="sheet number"/>
        <parameter key="sheet_number" value="1"/>
        <parameter key="imported_cell_range" value="A1"/>
        <parameter key="encoding" value="SYSTEM"/>
        <parameter key="first_row_as_names" value="true"/>
        <list key="annotations"/>
        <parameter key="date_format" value=""/>
        <parameter key="time_zone" value="SYSTEM"/>
        <parameter key="locale" value="English (United States)"/>
        <parameter key="read_all_values_as_polynominal" value="false"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="x.true.real.attribute"/>
          <parameter key="1" value="y.true.real.attribute"/>
        </list>
        <parameter key="read_not_matching_values_as_missings" value="false"/>
        <parameter key="datamanagement" value="double_array"/>
        <parameter key="data_management" value="auto"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="9.6.000" expanded="true" height="82" name="Set Role" width="90" x="179" y="85">
        <parameter key="attribute_name" value="y"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles">
          <parameter key="x" value="regular"/>
        </list>
      </operator>
      <operator activated="true" class="polynomial_regression" compatibility="9.6.000" expanded="true" height="82" name="Polynomial Regression" width="90" x="313" y="85">
        <parameter key="max_iterations" value="5000"/>
        <parameter key="replication_factor" value="2"/>
        <parameter key="max_degree" value="2"/>
        <parameter key="min_coefficient" value="-100.0"/>
        <parameter key="max_coefficient" value="100.0"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
      </operator>
      <connect from_op="Read Excel" from_port="output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Polynomial Regression" to_port="training set"/>
      <connect from_op="Polynomial Regression" from_port="model" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>


The second process starts from the same original data, generates a new attribute z = x^2, and uses the Linear Regression operator to obtain the second result expression.

<?xml version="1.0" encoding="UTF-8"?>
<process version="9.6.000">
  <context>
    <input/>
    <output/>
    <macros/>
  </context>
  <operator activated="true" class="process" compatibility="9.6.000" expanded="true" name="Process">
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="notification_email" value=""/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="SYSTEM"/>
    <process expanded="true">
      <operator activated="true" class="read_excel" compatibility="9.6.000" expanded="true" height="68" name="Read Excel" width="90" x="45" y="85">
        <parameter key="excel_file" value="C:\Users\1\Desktop\question data.xlsx"/>
        <parameter key="sheet_selection" value="sheet number"/>
        <parameter key="sheet_number" value="1"/>
        <parameter key="imported_cell_range" value="A1"/>
        <parameter key="encoding" value="SYSTEM"/>
        <parameter key="first_row_as_names" value="true"/>
        <list key="annotations"/>
        <parameter key="date_format" value=""/>
        <parameter key="time_zone" value="SYSTEM"/>
        <parameter key="locale" value="English (United States)"/>
        <parameter key="read_all_values_as_polynominal" value="false"/>
        <list key="data_set_meta_data_information">
          <parameter key="0" value="x.true.real.attribute"/>
          <parameter key="1" value="y.true.real.attribute"/>
        </list>
        <parameter key="read_not_matching_values_as_missings" value="false"/>
        <parameter key="datamanagement" value="double_array"/>
        <parameter key="data_management" value="auto"/>
      </operator>
      <operator activated="true" class="generate_attributes" compatibility="9.6.000" expanded="true" height="82" name="Generate Attributes" width="90" x="179" y="85">
        <list key="function_descriptions">
          <parameter key="z" value="x*x"/>
        </list>
        <parameter key="keep_all" value="true"/>
      </operator>
      <operator activated="false" class="rename" compatibility="9.6.000" expanded="true" height="82" name="Rename" width="90" x="246" y="238">
        <parameter key="old_name" value="x"/>
        <parameter key="new_name" value="x^2"/>
        <list key="rename_additional_attributes"/>
      </operator>
      <operator activated="true" class="set_role" compatibility="9.6.000" expanded="true" height="82" name="Set Role" width="90" x="313" y="85">
        <parameter key="attribute_name" value="y"/>
        <parameter key="target_role" value="label"/>
        <list key="set_additional_roles">
          <parameter key="x" value="regular"/>
        </list>
      </operator>
      <operator activated="true" class="linear_regression" compatibility="9.6.000" expanded="true" height="103" name="Linear Regression" width="90" x="514" y="85">
        <parameter key="feature_selection" value="none"/>
        <parameter key="alpha" value="0.05"/>
        <parameter key="max_iterations" value="10"/>
        <parameter key="forward_alpha" value="0.05"/>
        <parameter key="backward_alpha" value="0.05"/>
        <parameter key="eliminate_colinear_features" value="false"/>
        <parameter key="min_tolerance" value="0.05"/>
        <parameter key="use_bias" value="true"/>
        <parameter key="ridge" value="1.0E-8"/>
      </operator>
      <operator activated="false" class="polynomial_regression" compatibility="9.6.000" expanded="true" height="82" name="Polynomial Regression" width="90" x="581" y="238">
        <parameter key="max_iterations" value="5000"/>
        <parameter key="replication_factor" value="2"/>
        <parameter key="max_degree" value="2"/>
        <parameter key="min_coefficient" value="-100.0"/>
        <parameter key="max_coefficient" value="100.0"/>
        <parameter key="use_local_random_seed" value="false"/>
        <parameter key="local_random_seed" value="1992"/>
      </operator>
      <connect from_op="Read Excel" from_port="output" to_op="Generate Attributes" to_port="example set input"/>
      <connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
      <connect from_op="Set Role" from_port="example set output" to_op="Linear Regression" to_port="training set"/>
      <connect from_op="Linear Regression" from_port="model" to_port="result 1"/>
      <portSpacing port="source_input 1" spacing="0"/>
      <portSpacing port="sink_result 1" spacing="0"/>
      <portSpacing port="sink_result 2" spacing="0"/>
    </process>
  </operator>
</process>
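
For comparison, the substitution made in the second process (z = x*x, then a linear fit on x and z) can be reproduced outside RapidMiner. The following is only a rough sketch under several assumptions: it is plain Python with no libraries, the helper names (solve_linear_system, fit_quadratic, predict) are hypothetical, x is divided by 1000 before fitting (my own choice, to keep the arithmetic well conditioned), and there is no ridge term, so the numbers may differ slightly from what the Linear Regression operator reports. It also prints whether each fitted coefficient lies inside the [-100, 100] range given by min_coefficient and max_coefficient in the first process. Whether the Polynomial Regression operator applies those bounds to every term, including the intercept, is not something this sketch can confirm.

```python
# Minimal sketch: ordinary least squares for y = a + b*x + c*z with z = x*x,
# i.e. the same substitution the second process makes with Generate Attributes.
# Standard library only; all helper names are hypothetical.

# The four samples quoted in the post.
X = [4194.06, 3466.45, 2070.08, 874.98]
Y = [91540.07, 109460.36, 120338.64, 102182.19]

SCALE = 1000.0  # assumption: rescale x so the normal equations stay well conditioned


def solve_linear_system(matrix, rhs):
    """Solve matrix * coeffs = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Swap in the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    return coeffs


def fit_quadratic(xs, ys, scale=SCALE):
    """Return (a, b, c) for y = a + b*x + c*x^2, in the original units of x."""
    rows = [[1.0, x / scale, (x / scale) ** 2] for x in xs]  # columns: 1, x, z
    # Normal equations: (A^T A) coeffs = A^T y
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    a, b_scaled, c_scaled = solve_linear_system(ata, aty)
    # Undo the rescaling of x.
    return a, b_scaled / scale, c_scaled / scale ** 2


def predict(coeffs, x):
    a, b, c = coeffs
    return a + b * x + c * x * x


if __name__ == "__main__":
    coeffs = fit_quadratic(X, Y)
    for name, value in zip(("intercept", "x", "x^2"), coeffs):
        # -100 and 100 are min_coefficient / max_coefficient from the first process.
        inside = -100.0 <= value <= 100.0
        print(f"{name}: {value:.6g}  inside [-100, 100]: {inside}")
    for x, y in zip(X, Y):
        print(f"x={x}: observed {y}, fitted {predict(coeffs, x):.2f}")
```

If any coefficient turns out to fall outside that range, widening min_coefficient and max_coefficient, or rescaling x and y before the Polynomial Regression operator, might be worth trying.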

I would like to ask why the results of the two processes are not the same. The original data shows a quadratic (nonlinear) relationship, so why can't the Polynomial Regression operator produce the quadratic expression?

Thank you very much!



Answers

  • rookie
    rookie New Altair Community Member
    Hi @yyhuang,
    Sorry in advance: I don't know how to use this forum's features, which is why it took me so long to reply.
    First of all, thank you for your answer. If I understand your description correctly, the results come out this way because there is too little data and it is not standardized? But these four samples are real data, and I need these four points to construct a univariate quadratic polynomial. Because a nonlinear equation can be converted into a linear one, I substituted z for x^2 and obtained the linear regression equation. But why does Polynomial Regression not produce the same expression? How do you explain that? Does the Polynomial Regression operator have any limitations?
