Logistic Regression - Normalization does not change Attribute Weights

cem_akyuz
cem_akyuz New Altair Community Member
edited November 2024 in Community Q&A

Hello,

I am new here, and new to statistics and data mining in general. Apologies if this is a really stupid question.

My question is about logistic regression and normalizing data. I have a data set in which some columns are skewed and the columns are on different scales, so I wanted to apply normalization (centering, scaling and a Box-Cox transformation for skewness) prior to logistic regression. First, though, I wanted to check to what extent normalization changes the results.

I see that normalization prior to logistic regression changes the coefficients; however, the attribute weights are exactly the same with and without normalization. Am I missing something here?

Attached you can find my design for the analysis. (Logistic Regression and Normalization added with default settings)

Best Answers

  • Thomas_Ott
    Thomas_Ott New Altair Community Member
    Answer ✓

    Try outputting the PRE port on the Normalization operator; that will tell you how it's normalizing the data.

  • earmijo
    earmijo New Altair Community Member
    Answer ✓

    By default, the Logistic Regression operator normalizes the data itself (it uses the word 'standardize' instead of 'normalize'). Uncheck the 'standardize' option. Whether you normalize or not does make a difference to the coefficients. Check the process below:

     

    <?xml version="1.0" encoding="UTF-8"?><process version="8.0.001">
    <context>
    <input/>
    <output/>
    <macros/>
    </context>
    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">
    <process expanded="true">
    <operator activated="true" class="retrieve" compatibility="8.0.001" expanded="true" height="68" name="Retrieve Sonar" width="90" x="246" y="187">
    <parameter key="repository_entry" value="//Samples/data/Sonar"/>
    </operator>
    <operator activated="true" class="multiply" compatibility="8.0.001" expanded="true" height="103" name="Multiply" width="90" x="447" y="187"/>
    <operator activated="true" class="normalize" compatibility="8.0.001" expanded="true" height="103" name="Normalize" width="90" x="648" y="340"/>
    <operator activated="true" class="h2o:logistic_regression" compatibility="7.6.001" expanded="true" height="124" name="Logistic Regression (2)" width="90" x="849" y="340">
    <parameter key="standardize" value="false"/>
    </operator>
    <operator activated="true" class="h2o:logistic_regression" compatibility="7.6.001" expanded="true" height="124" name="Logistic Regression" width="90" x="849" y="187">
    <parameter key="standardize" value="false"/>
    </operator>
    <connect from_op="Retrieve Sonar" from_port="output" to_op="Multiply" to_port="input"/>
    <connect from_op="Multiply" from_port="output 1" to_op="Logistic Regression" to_port="training set"/>
    <connect from_op="Multiply" from_port="output 2" to_op="Normalize" to_port="example set input"/>
    <connect from_op="Normalize" from_port="example set output" to_op="Logistic Regression (2)" to_port="training set"/>
    <connect from_op="Logistic Regression (2)" from_port="model" to_port="result 2"/>
    <connect from_op="Logistic Regression" from_port="model" to_port="result 1"/>
    <portSpacing port="source_input 1" spacing="0"/>
    <portSpacing port="sink_result 1" spacing="0"/>
    <portSpacing port="sink_result 2" spacing="0"/>
    <portSpacing port="sink_result 3" spacing="0"/>
    </process>
    </operator>
    </process>
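
Why the coefficients depend on the scale of the attributes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an unpenalized model, z-score scaling, made-up data and made-up coefficient values. The helper names (zscore_columns, to_standardized_scale, predict) are hypothetical and are not part of RapidMiner or H2O. The sketch shows that the same model has different coefficients on raw and z-scored inputs, and that z-scoring data a second time leaves it essentially unchanged.

```python
import math
from statistics import mean, pstdev

def zscore_columns(rows):
    """Return z-scored rows plus the per-column means and standard deviations."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    scaled = [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]
    return scaled, mus, sds

def to_standardized_scale(intercept, coefs, mus, sds):
    """Re-express raw-scale coefficients so they apply to z-scored inputs."""
    std_coefs = [b * s for b, s in zip(coefs, sds)]
    std_intercept = intercept + sum(b * m for b, m in zip(coefs, mus))
    return std_intercept, std_coefs

def predict(intercept, coefs, row):
    """Logistic model: probability of the positive class for one row."""
    z = intercept + sum(b * x for b, x in zip(coefs, row))
    return 1.0 / (1.0 + math.exp(-z))

# Made-up data: two attributes on very different scales.
rows = [[1.0, 200.0], [2.0, 150.0], [3.0, 400.0], [4.0, 250.0]]
# Made-up raw-scale model (assumption, not fitted to anything).
raw_intercept, raw_coefs = -1.0, [0.5, 0.004]

scaled_rows, mus, sds = zscore_columns(rows)
std_intercept, std_coefs = to_standardized_scale(raw_intercept, raw_coefs, mus, sds)

# Largest difference between predictions on the raw and the z-scored scale.
max_diff = max(
    abs(predict(raw_intercept, raw_coefs, r) - predict(std_intercept, std_coefs, z))
    for r, z in zip(rows, scaled_rows)
)

# Z-scoring already z-scored data: means stay near 0, deviations near 1.
rescaled_rows, mus_again, sds_again = zscore_columns(scaled_rows)

print("raw coefficients:         ", raw_coefs)
print("standardized coefficients:", std_coefs)
print("max prediction difference:", max_diff)
```

The coefficients differ between the two scales although the predictions agree. With regularization, which a given operator may apply by default, the two fits would generally not be exact re-expressions of each other.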

Answers

  • cem_akyuz
    cem_akyuz New Altair Community Member

    Thanks a lot. After I removed the Normalize operator (which I no longer need, since Logistic Regression has its own standardize option), I could run the process with and without the standardize option. Now I can see that the attribute weights change between the runs.

     

    Thanks a lot!

    Cem

  • wassdullull
    wassdullull New Altair Community Member

    Hi, I would like an explanation of the logistic regression results from RapidMiner. Can the p-values be used to calculate odds ratios, and how should they be interpreted?
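
For reference, an odds ratio is normally computed from a coefficient rather than from its p-value: it is the exponential of the coefficient, while the Wald p-value comes from the coefficient divided by its standard error. A minimal sketch, assuming you read one coefficient and its standard error off the model output. The function name wald_summary and the example numbers are hypothetical.

```python
import math

def wald_summary(coef, std_err):
    """Odds ratio, 95% Wald interval and two-sided p-value for one coefficient."""
    z = coef / std_err
    # Two-sided tail probability of a standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    odds_ratio = math.exp(coef)
    half_width = 1.959964 * std_err  # 97.5% quantile of the standard normal
    interval = (math.exp(coef - half_width), math.exp(coef + half_width))
    return odds_ratio, interval, p_value

# Made-up coefficient and standard error, for illustration only.
odds_ratio, interval, p_value = wald_summary(0.7, 0.25)
print("odds ratio:", odds_ratio)
print("95% interval:", interval)
print("p-value:", p_value)
```

The odds ratio is the factor by which the odds change for a one-unit increase in the attribute. If the model was trained on standardized data, one unit means one standard deviation of the original attribute.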
