Logistic Regression

User: "emann"
New Altair Community Member
Updated by Jocelyn
Has anyone got a clue how to run logistic regression on the Titanic dataset? I've tried this literally all day, but I don't think I'm getting the right accuracy, so I must be missing a step. In Set Role my attribute name is Sex, in Split Data my ratios are 0.1 and 0.1 for the two partitions, and I'm getting 64.53% accuracy. The same test run in Orange gave 91.7%.

Screenshot attached.

    User: "varunm1"
    New Altair Community Member
    Hello @emann

    Just for clarification: you are trying to predict the attribute Sex from the Titanic data set. You split the data in a 90:10 ratio (train:test), applied logistic regression, and got an accuracy of 64.53 percent on the test data. Am I correct?

    I tried something similar to what you did and got 35 percent accuracy, so the result depends on how your data was split. I assume that you are not changing the settings of Logistic Regression in RapidMiner; my results are with the default settings. I am not sure which settings you were using for logistic regression in Orange. Are the logistic regression settings the same in both tools?

    Also, a random 90:10 split is not recommended for comparing performance, because the train and test data vary each time you run it (my results are an example of this). You need to use cross-validation with either 5 or 10 folds to test the performance of an algorithm. The settings should also be the same when you want to compare different software or algorithms.
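
To illustrate the point about random splits versus cross-validation, here is a minimal Python sketch using only the standard library. Everything in it is an assumption for illustration: the data is synthetic (not the Titanic data), the "model" is a simple threshold rule standing in for logistic regression, and the helper names (`make_toy_data`, `fit_threshold`, `holdout_accuracy`, `k_fold_accuracy`) are hypothetical. It is not how RapidMiner or Orange implement either step.

```python
import random
import statistics


def make_toy_data(n=200, seed=0):
    """Synthetic one-feature binary data (NOT Titanic): label is 1 when x > 0.5, with 20% label noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < 0.2:  # flip some labels so the task is not trivial
            label = 1 - label
        data.append((x, label))
    return data


def fit_threshold(train):
    """Stand-in model: the midpoint between the two class means."""
    mean0 = statistics.mean(x for x, y in train if y == 0)
    mean1 = statistics.mean(x for x, y in train if y == 1)
    return (mean0 + mean1) / 2


def accuracy(threshold, test):
    """Fraction of test rows whose predicted label matches the true label."""
    return sum(int(x > threshold) == y for x, y in test) / len(test)


def holdout_accuracy(data, test_ratio, seed):
    """One random train/test split; the score depends on the seed."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_ratio)
    test, train = rows[:n_test], rows[n_test:]
    return accuracy(fit_threshold(train), test)


def k_fold_accuracy(data, k, seed):
    """k-fold cross-validation: every row is used for testing exactly once."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    folds = [rows[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        scores.append(accuracy(fit_threshold(train), folds[i]))
    return statistics.mean(scores), scores


data = make_toy_data()
holdout_scores = [holdout_accuracy(data, 0.1, seed) for seed in range(5)]
cv_mean, cv_scores = k_fold_accuracy(data, k=10, seed=0)

print("single 90:10 splits:", [round(s, 2) for s in holdout_scores])
print("10-fold CV mean:", round(cv_mean, 3))
```

Each single split is scored on only 10% of the rows, so its accuracy moves with the seed; the cross-validated figure averages over ten test folds that together cover the whole data set.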

    Thanks
    User: "kypexin"
    New Altair Community Member
    Hi @emann

    I suggest that you go through all the parameters of logistic regression and understand their meaning (there are quite a few!). The Help section for the operator explains them quite well.

    I have reproduced exactly the same process with the following parameters of logistic regression and got 80.92% accuracy 'out of the box', see below.

    Otherwise it's hard to tell without knowing your parameter settings (also, I have no idea how Orange sets up logistic regression by default).
    User: "emann"
    New Altair Community Member
    OP
    Updated by emann
    Hi @kypexin

    Thanks for your input. Initially I didn't make any changes to the regression parameters, but having replicated your parameter settings, the accuracy is now 76.76%.

    Note: I'm very new to RapidMiner and data analytics in general, so I'm not too familiar with the parameters and what they should actually be. I'm currently researching this for report purposes. Attached is a screenshot of my current parameters.
    User: "Telcontar120"
    New Altair Community Member
    Also, in your OP you mentioned that you put in 0.1 and 0.1 for the split, but I think you actually need 0.9 and 0.1. Perhaps it was a typo, though. If not, that could definitely be affecting your model, since you would only be using 10% of the data for training!
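
As a rough illustration of why the split ratios matter, the Python sketch below turns a list of partition ratios into row counts and rejects ratios that do not sum to 1. The helper `partition_sizes` and the row count of 1000 are hypothetical, chosen only for the example; this is not a description of how RapidMiner's Split Data operator treats ratios such as 0.1 and 0.1.

```python
import math


def partition_sizes(n_rows, ratios):
    """Return the number of rows per partition; reject ratios that do not sum to 1."""
    total = sum(ratios)
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"ratios sum to {total:.2f}, expected 1.0")
    sizes = [int(n_rows * r) for r in ratios]
    # Rows lost to rounding down go to the first (training) partition.
    sizes[0] += n_rows - sum(sizes)
    return sizes


# Hypothetical data set of 1000 rows with the suggested 90:10 split.
train_rows, test_rows = partition_sizes(1000, [0.9, 0.1])
print(train_rows, test_rows)

# The ratios from the original post leave 80% of the rows unaccounted for.
try:
    partition_sizes(1000, [0.1, 0.1])
except ValueError as err:
    print("rejected:", err)
```

A training ratio of 0.1 on the same hypothetical 1000 rows would leave only 100 rows to fit the model, which is the concern raised above.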