Predicting ints on Titanic dataset.
VenomSwitch
New Altair Community Member
Excuse my noob level of understanding please, I'm brand new.
I am trying to predict mortality on the Titanic dataset using cross-validation and linear regression. Since linear regression only works with numbers, I have converted selected attributes (such as Survived) using the 'Nominal to Numerical' operator. Looking at the data, I can see it is working most of the time if I round the prediction to 1 or 0. However, the predicted value comes back as a double, so it shows 0 correct predictions.
I suppose my question is: how do I make RapidMiner return an int instead of a double? I have tried using the 'Real to Integer' operator, but it doesn't like me putting it anywhere!
Open to any suggestions.
Best Answer
-
Hi, first, if you use a GLM operator, it can handle binominal data. It uses the same trick you are doing here, but without any hassle for you. Then, why exactly do you want an int over a double? Anyway, one way to do it is to use Generate Attributes with
round([prediction(Survived=Yes)])
Best, Martin
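To illustrate why the raw predictions score 0 correct and why rounding helps, here is a minimal Python sketch rather than a RapidMiner process. The prediction values and labels are made up for illustration, and the 0.5 threshold is an assumption that behaves much like `round` for values between 0 and 1 while also mapping out-of-range values such as 1.07 or -0.04 to a valid class.

```python
# Hypothetical real-valued predictions from a regression model,
# and the true 0/1 labels they should be compared against.
predictions = [0.93, 0.12, 0.58, -0.04, 1.07, 0.41]
labels = [1, 0, 1, 0, 1, 1]

def accuracy(predicted, actual):
    """Fraction of positions where predicted equals actual exactly."""
    matches = sum(1 for p, a in zip(predicted, actual) if p == a)
    return matches / len(actual)

# Raw reals almost never equal 0 or 1 exactly, so nothing matches.
raw_accuracy = accuracy(predictions, labels)

# Thresholding at 0.5 turns every real value into a 0/1 class.
rounded = [1 if p >= 0.5 else 0 for p in predictions]
rounded_accuracy = accuracy(rounded, labels)

print(raw_accuracy)
print(rounded_accuracy)
```

In this toy data only the last passenger is misclassified after thresholding, so the accuracy goes from zero to five out of six.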
Answers
-
When I say 'double' I actually mean 'real'.
-
Hi Martin, this is brilliant, GLM is exactly what I needed! I wanted the integer because it was giving me values with a decimal point, and because they didn't exactly match the '1'/'0' in the Survived column, it just told me every one was wrong with 0% accuracy (as it wasn't rounded to the '1'/'0' format in the dataset). I couldn't figure out where to place the Generate Attributes operator, but it doesn't really matter as GLM has sorted out my problem. A very handy operator. Cheers! Joel
-
Hi @VenomSwitch, you would basically put it after each and every Apply Model operator you are using. Great that the GLM worked. Best, Martin
-
I have got it working, but now it seems to have a 100% accuracy rate, which seems suspicious. I'm just going to stick with the GLM process I had before, I think! If it isn't broke, don't fix it haha. You've helped me out today though!
-
Are you sure you applied the round on the prediction and not on the label attribute? That would explain it.
-
Here is my current process using Generate Attributes with linear regression instead of GLM. My label is 'Survived = Yes'. I tried using the same operator inside the cross-validation as well, but got the same result: 100% correct prediction.
-
Hi @VenomSwitch, then the other idea: are you sure that 'Survived = No' is not part of the training data? That would also explain the good results. Cheers, Martin
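The leakage Martin describes can be sketched in a few lines of Python. This assumes the nominal-to-numerical conversion produced both a 'Survived = Yes' and a 'Survived = No' column, and that the second one stayed in as a regular attribute. The data is invented, and the single-feature least-squares fit is only a stand-in for the Linear Regression operator.

```python
# Hypothetical toy data: 1 means the passenger survived, 0 means they did not.
survived_yes = [1, 0, 0, 1, 0, 1, 0, 0]
# The leaked column is the exact complement of the label.
survived_no = [1 - y for y in survived_yes]

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (single feature)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    variance = sum((xi - mean_x) ** 2 for xi in x)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Train on the leaked column only, then predict and threshold at 0.5.
intercept, slope = fit_line(survived_no, survived_yes)
predictions = [intercept + slope * x for x in survived_no]
rounded = [1 if p >= 0.5 else 0 for p in predictions]
accuracy = sum(r == y for r, y in zip(rounded, survived_yes)) / len(survived_yes)

print(intercept, slope, accuracy)
```

The fit recovers label = 1 - leaked column, so every prediction is correct. If the real process shows the same pattern, removing 'Survived = No' from the attributes (or giving it a special role) should bring the accuracy back to a believable level.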