ROC Bias

jdu New Altair Community Member
edited November 5 in Community Q&A
I am trying to use ROC (AUC) to evaluate the predictive performance of some models, and I noticed the ROC bias option. The pessimistic, neutral and optimistic AUC values are identical in some cases but differ in others. I searched online but could not find any further explanation of ROC bias. Does anyone know of a reference for this?

Also, it occurs to me that Weka has references integrated into the software that point to more detailed information about the algorithms. That seems like a very nice feature to me, and I wonder whether RapidMiner could offer something similar in the future. Thanks!
Answers

  • Nils_Woehler New Altair Community Member
    Hi,

    ROC curves are calculated by first ordering the classified examples by confidence.
    The examples are then processed in order of decreasing confidence, plotting the false positive rate on the x-axis and the true positive rate on the y-axis.
    Optimistic, neutral and pessimistic are three ways of calculating the ROC curve. They differ only in how examples with the same confidence are handled.
    If more than one example has the same confidence, the optimistic calculation takes the correctly classified examples into account first, before the incorrectly classified ones.
    The pessimistic calculation works the other way round: incorrectly classified examples are taken into account first, before the correctly classified ones.
    The neutral calculation is a mix of the two methods described above: correctly and incorrectly classified examples are taken into account alternately.

    If no two examples have equal confidence, or if all examples with equal confidence are assigned to the same class, the optimistic, neutral and pessimistic ROC curves will be the same.
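
    The tie handling described above can be sketched in Python. This is only an illustrative sketch under stated assumptions, not RapidMiner's actual implementation: the function names `order_ties` and `roc_auc` are hypothetical, a "correct" example is taken to mean a truly positive one, and the neutral variant is assumed to start with a correct example when alternating.

```python
from itertools import groupby, zip_longest


def order_ties(labels, bias):
    """Order the labels of examples that share one confidence value.

    labels: list of booleans (True = positive example).
    bias:   "optimistic", "neutral" or "pessimistic".
    """
    pos = [lab for lab in labels if lab]
    neg = [lab for lab in labels if not lab]
    if bias == "optimistic":
        return pos + neg  # correct examples first
    if bias == "pessimistic":
        return neg + pos  # wrong examples first
    if bias == "neutral":
        # Assumption: alternate, starting with a correct example.
        mixed = []
        for p, n in zip_longest(pos, neg):
            if p is not None:
                mixed.append(p)
            if n is not None:
                mixed.append(n)
        return mixed
    raise ValueError(f"unknown bias: {bias!r}")


def roc_auc(examples, bias="neutral"):
    """Area under a step-wise ROC curve.

    examples: list of (confidence, is_positive) pairs.
    """
    # Step 1: order by decreasing confidence.
    ranked = sorted(examples, key=lambda e: e[0], reverse=True)

    # Step 2: resolve ties according to the chosen bias.
    ordered = []
    for _, group in groupby(ranked, key=lambda e: e[0]):
        ordered.extend(order_ties([bool(lab) for _, lab in group], bias))

    n_pos = sum(ordered)
    n_neg = len(ordered) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Step 3: walk the curve. A positive moves one step up (TPR), a negative
    # moves one step right (FPR) and adds a column whose height is the
    # number of positives seen so far.
    true_positives = 0
    columns = 0
    for is_positive in ordered:
        if is_positive:
            true_positives += 1
        else:
            columns += true_positives
    return columns / (n_pos * n_neg)


# Three positives and three negatives; four examples share confidence 0.5.
tied = [(0.9, True), (0.5, True), (0.5, True),
        (0.5, False), (0.5, False), (0.1, False)]

for bias in ("optimistic", "neutral", "pessimistic"):
    print(bias, round(roc_auc(tied, bias), 4))
```

    With the tied data above, the three settings give different areas, with optimistic the largest and pessimistic the smallest. If the confidences are all distinct, the tie-ordering step has nothing to reorder and all three settings return the same value.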

    Some operators, such as SVM, ID3 and many more, already include this kind of information in their descriptions.
    Nevertheless, we will try to enhance our operator descriptions in the future with more detailed information about the algorithms being used.

    Best,
    Nils