Why does RapidMiner include a variable with a p-value > 0.05 in a multiple linear regression?

MariJAM New Altair Community Member
edited November 2024 in Community Q&A
Hello,

I'm running a multiple linear regression. For feature selection I chose M5 prime with a min tolerance of 0.05. The final model contains three independent variables. Two of them have a p-value below 0.05, and one is above it with a p-value of 0.135 (and a t-Stat of 1.543).
Two other independent variables were not included in the model because of their high p-values and low t-Stat values.

Can anyone help and tell me why RapidMiner includes this one variable even though its p-value is above 0.05?

Thanks a lot!

Best Answer

  • MartinLiebig
    Altair Employee
    Answer ✓
    Hey,

    You are coming from a statistics background, while RapidMiner (RM) comes more from a data science (DS) background. The p-value calculation rests on quite a few assumptions. The DS mindset is more: if I can show that one method works better than another, I use that method. So what you would do is vary the cutoff and check the results.

    Best,
    Martin
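To make "vary the cutoff and check the results" concrete, here is a minimal sketch in Python with NumPy. It is not RapidMiner's M5 prime procedure: it swaps in a plain backward elimination on p-values, uses a normal approximation for the p-values, and runs on synthetic data. The helper names (`ols_pvalues`, `backward_select`, `cv_rmse`) and the list of cutoffs are made up for illustration. The point is only that each cutoff gets judged by its cross-validated error instead of being fixed at 0.05 up front.

```python
import math
import numpy as np

def ols_pvalues(X, y):
    """Fit OLS with an intercept; return coefficients and approximate p-values."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - A.shape[1])
    cov = sigma2 * np.linalg.inv(A.T @ A)
    t = beta / np.sqrt(np.diag(cov))
    # Assumption: normal approximation to the t distribution (fine for large n).
    p = np.array([math.erfc(abs(v) / math.sqrt(2)) for v in t])
    return beta, p

def backward_select(X, y, cutoff):
    """Drop the feature with the largest p-value until all are <= cutoff."""
    cols = list(range(X.shape[1]))
    while cols:
        _, p = ols_pvalues(X[:, cols], y)
        p_feat = p[1:]                      # skip the intercept
        worst = int(np.argmax(p_feat))
        if p_feat[worst] <= cutoff:
            break
        cols.pop(worst)
    return cols

def cv_rmse(X, y, cutoff, k=5, seed=0):
    """k-fold cross-validated RMSE, with selection redone inside every fold."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    mse = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        cols = backward_select(X[train], y[train], cutoff)
        beta, _ = ols_pvalues(X[train][:, cols], y[train])
        pred = beta[0] + X[fold][:, cols] @ beta[1:]
        mse.append(np.mean((y[fold] - pred) ** 2))
    return float(np.sqrt(np.mean(mse)))

# Synthetic data (hypothetical): two strong effects, one weak, two pure noise.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + 0.15 * X[:, 2] + rng.normal(size=200)

cutoffs = [0.01, 0.05, 0.10, 0.20]
results = {c: cv_rmse(X, y, c) for c in cutoffs}
for c, rmse in results.items():
    print(f"cutoff={c:.2f}  kept={backward_select(X, y, c)}  cv_rmse={rmse:.4f}")
```

If a looser cutoff gives the same or a lower cross-validated error, keeping a variable with a p-value such as 0.135 is defensible from a predictive point of view, even though a strict significance test would drop it.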
