Logistic Regression
emann
Has anyone got a clue how to run logistic regression on the Titanic dataset? I've tried this literally all day, but I don't think I'm getting the right accuracy, so I must be missing a step. In Set Role my attribute name is Sex, in Split Data my ratios are 0.1 and 0.1 for the two partitions, and I'm getting 64.53% accuracy. The same test run in Orange gave 91.7%.
Screenshot attached.
varunm1
Hello @emann,
Just for clarification: you are trying to predict the attribute Sex from the Titanic data set. You then split the data in a 90:10 ratio (train:test), applied logistic regression, and got an accuracy of 64.53 percent on the test data. Am I correct?
I tried something similar to what you did and got 35 percent accuracy. It depends on how your data was split. I assume that you are not changing the settings in Logistic Regression in RapidMiner; my results are with default settings. I am not sure which settings you were using for logistic regression in Orange. Are the logistic regression settings the same in both tools?
Also, a random (90:10) split is not recommended for comparing performance, because the train and test data vary each time you run it (my results are an example of this). You need to use cross-validation with either 5 or 10 folds to test the performance of an algorithm. Also, the settings should be the same when you want to compare different software or algorithms.
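To make the cross-validation idea concrete, here is a minimal sketch in plain Python. It is not how RapidMiner or Orange implement it; the function names `k_fold_indices` and `cross_validated_accuracy`, and the `evaluate` callback, are hypothetical names chosen for illustration. The point is only that every row gets used for testing exactly once, and the reported accuracy is an average over k folds rather than the result of a single random split.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Split row indices 0..n_rows-1 into k (train, test) pairs."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at offset i,
    # so fold sizes differ by at most one row.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i, test in enumerate(folds):
        # Training rows are all rows that are not in the current test fold.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


def cross_validated_accuracy(n_rows, k, evaluate, seed=0):
    """Average the accuracy returned by evaluate(train, test) over k folds.

    `evaluate` is a hypothetical callback: it should train a model on the
    train indices and return its accuracy on the test indices.
    """
    splits = k_fold_indices(n_rows, k, seed)
    scores = [evaluate(train, test) for train, test in splits]
    return sum(scores) / len(scores)
```

Averaging over folds is what makes the number comparable between tools: one unlucky 10% test set no longer decides the result.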
Thanks
kypexin
Hi @emann,
I suggest that you go through all the parameters of logistic regression and understand their meaning (there are quite a few!). The Help section for the operator explains them quite well.
I reproduced exactly the same process with the following logistic regression parameters and got 80.92% accuracy 'out of the box'; see below.
Otherwise it's hard to tell without knowing your parameter settings (also, I have no idea how Orange sets up logistic regression by default).
emann
Hi @kypexin,
Thanks for your input. Initially I didn't make any changes to the regression parameters, but having replicated your parameter settings, the accuracy is now 76.76%.
Note: I'm very new to RapidMiner and data analytics in general, so I'm not too familiar with the parameters and what they should actually be; I'm currently researching this for report purposes. Attached are screenshots of my current parameters.
Screenshot 2019-03-26 at 21.54.25.png
Screenshot 2019-03-26 at 21.54.52.png
Screenshot 2019-03-26 at 21.54.32.png
Screenshot 2019-03-26 at 21.54.12.png
Telcontar120
Also in your OP you mentioned that you put in 0.1 and 0.1 for the split, but I think you actually need 0.9 and 0.1. Perhaps it was a typo, though. If not, that could definitely be affecting your model since you would only be using 10% of the data for training!
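A quick bit of arithmetic shows why the ratios matter. This sketch assumes each ratio is applied literally to the row count (I have not checked how Split Data treats ratios that do not sum to 1), and `n_rows = 1000` is a made-up figure, not the actual size of the Titanic data set. `partition_sizes` is a hypothetical helper name.

```python
def partition_sizes(n_rows, ratios):
    """Return the number of rows each ratio would select from n_rows."""
    return [round(n_rows * r) for r in ratios]


n_rows = 1000  # hypothetical row count, not the actual Titanic size

# Ratios as entered in the original post: only a tenth of the rows train the model.
print(partition_sizes(n_rows, [0.1, 0.1]))

# Ratios for a 90:10 train:test split.
print(partition_sizes(n_rows, [0.9, 0.1]))
```

Under that assumption the first setting trains on 100 rows and the second on 900, which on its own could account for a large gap in accuracy.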