Why does RapidMiner include a variable with a p-value > 0.05 in a multiple linear regression?
MariJAM
New Altair Community Member
Hello,
I'm doing a multiple linear regression. For my regression I have chosen the M5 prime feature selection with a min tolerance of 0.05. The final model contains three independent variables. Two of them have a p-value below 0.05, and one is above it, with a p-value of 0.135 (and a t-Stat of 1.543).
Two other independent variables have not been included in the model due to their high p-values and low t-Stat values.
Can anyone help and tell me why RapidMiner includes this one variable even though its p-value is above 0.05?
Thanks a lot!
Best Answer
Hey,
you are coming from a stats background, while RapidMiner comes more from a data science background. The p-value calculation rests on quite a few assumptions. The data science mindset is more: if I can show that one method works better than another, I take that method. So what you would do is vary the cutoff and check the results.
Best,
Martin
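To make "vary the cutoff and check the results" concrete, here is a minimal sketch in Python with NumPy. It is not RapidMiner's implementation and not the M5 prime procedure. It uses plain backward elimination on the absolute t-statistic (a stand-in for a p-value cutoff, since a larger |t| means a smaller p-value), and the helper names `fit_ols`, `backward_select` and `cv_rmse` as well as the synthetic data are assumptions made up for illustration. The selection is repeated inside each training fold, so the held-out error is not biased by having seen the test rows.

```python
import numpy as np


def fit_ols(X, y):
    """Ordinary least squares; X must already contain an intercept column.

    Returns the coefficients and their t-statistics.
    """
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # residual variance estimate
    se = np.sqrt(sigma2 * np.diag(xtx_inv))   # standard errors
    return beta, beta / se


def backward_select(X, y, t_cutoff):
    """Drop the predictor with the smallest |t| until all |t| >= t_cutoff."""
    keep = list(range(X.shape[1]))
    while keep:
        design = np.column_stack([np.ones(len(y)), X[:, keep]])
        _, t = fit_ols(design, y)
        abs_t = np.abs(t[1:])                 # skip the intercept
        worst = int(np.argmin(abs_t))
        if abs_t[worst] >= t_cutoff:
            break
        keep.pop(worst)
    return keep


def cv_rmse(X, y, t_cutoff, folds=5):
    """Cross-validated RMSE, with selection redone on every training fold."""
    idx = np.arange(len(y))
    sq_err = 0.0
    for f in range(folds):
        test = idx[f::folds]
        train = np.setdiff1d(idx, test)
        cols = np.array(backward_select(X[train], y[train], t_cutoff), dtype=int)
        x_train = np.column_stack([np.ones(len(train)), X[train][:, cols]])
        x_test = np.column_stack([np.ones(len(test)), X[test][:, cols]])
        beta, _ = fit_ols(x_train, y[train])
        sq_err += np.sum((y[test] - x_test @ beta) ** 2)
    return float(np.sqrt(sq_err / len(y)))


# Synthetic data (an assumption, not the poster's data): two strong
# predictors, one weak predictor, two pure-noise columns.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.15 * X[:, 2] + rng.normal(size=200)

for cutoff in [0.0, 1.0, 1.5, 2.0, 3.0]:
    selected = backward_select(X, y, cutoff)
    print(f"|t| cutoff {cutoff:.1f}: columns {selected}, "
          f"CV RMSE {cv_rmse(X, y, cutoff):.3f}")
```

If the cross-validated error barely changes, or even improves, when a variable with a modest t-statistic stays in the model, then from the data science point of view there is no reason to throw it out just because its p-value is above 0.05.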