Logistic regression threshold

bernardo_pagnon
New Altair Community Member
Updated by Jocelyn

Hello all,

I am doing a simple logistic regression exercise (no SVM, just plain logistic regression) and I cannot understand how RapidMiner defines the threshold for classifying instances as "yes". Similar posts mention that it automatically uses 0.5, but that is not the case here. I downloaded all the "yes" predictions and sorted them in ascending order: the lowest confidence among them is 0.3108, so that appears to be the threshold. Why?

 

I am using the "Default" data set from the ISLR package (https://cran.r-project.org/web/packages/ISLR/index.html).

 

Thanks in advance,

Bernardo

 

 

    phellinger
    New Altair Community Member
    Accepted Answer

    Hi Bernardo,

     

    Starting from version 7.6, Logistic Regression also uses 0.5 as the threshold value; see https://docs.rapidminer.com/7.6/studio/releases/7.6/changes-7.6.0.html ("Logistic Regression and Generalized Linear Model learners now use 0.5 as the threshold as other binominal learners").

    The old behaviour is kept for backward compatibility reasons. You can easily alter the operator's behaviour by increasing its compatibility level. (For whatever reason, it is set to 7.5.000 in your process.)

     

    logreg_threshold.png

     

    The reason for the old behaviour was that one can optimize for maximal F-measure by choosing a different threshold, but this can be confusing. That's why this alternative threshold is now only provided on a "threshold" output port, and 0.5 is used otherwise.

     

    Best,

    Peter
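
To make the difference between the fixed 0.5 cutoff and an F-measure-maximizing threshold concrete, here is a minimal sketch in plain Python. It is not RapidMiner's implementation: the function names `f_measure` and `best_f_threshold` are hypothetical, the data is synthetic and made up for illustration, and the search simply tries every distinct confidence value as a cutoff.

```python
import random


def f_measure(labels, scores, threshold):
    """F-measure (F1) of the positive class when score >= threshold means "yes"."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_f_threshold(labels, scores):
    """Try every distinct score as a cutoff; return the one with the highest F-measure."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: f_measure(labels, scores, t))


# Synthetic, imbalanced data (an assumption for illustration only):
# roughly 10% positives, whose confidences tend to be higher than the negatives'.
rng = random.Random(0)
labels = [1 if rng.random() < 0.1 else 0 for _ in range(500)]
scores = [min(1.0, max(0.0, rng.gauss(0.45 if y else 0.10, 0.15))) for y in labels]

t_star = best_f_threshold(labels, scores)
print("F-measure-optimal threshold:", round(t_star, 4))
print("F-measure at that threshold:", round(f_measure(labels, scores, t_star), 4))
print("F-measure at 0.5:           ", round(f_measure(labels, scores, 0.5), 4))
```

Because every "yes" prediction has a confidence at or above the cutoff that was applied, the smallest confidence among the "yes" predictions gives an upper bound on that cutoff, which is consistent with how the value 0.3108 was found in the question.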