AUC > 1?

wessel
wessel New Altair Community Member
edited November 5 in Community Q&A
Dear All,

How come the ROC can get above 1?
http://img.ctrlv.in/img/51d099898f5d9.jpg

Best regards,

Wessel



Answers

  • dan_agape
    dan_agape New Altair Community Member
    Hi,

    Indeed, the AUC (the area under the red ROC curve) cannot be more than 1. In fact, the curve itself cannot go above the horizontal line y=1, and neither should the reddish area, which may indicate confidence intervals or some other measure of variation.

    By the way, I have just checked again whether another error in the calculation of AUC, which I reported a couple of years ago at http://rapid-i.com/rapidforum/index.php/topic,2237.0.html, was corrected, and it seems it was not. Perhaps the reported error was not well understood by the guys at RapidI or by other participants in that thread. The image below shows that the area under the (red) ROC curve, which is clearly 1, is still wrongly calculated by RM as AUC=0.5.

    see image: http://postimg.org/image/9upjmo2ev/

    People can try the following simple process, which builds a perfect classifier (that is, one with accuracy=1) and illustrates the bug. The AUC (here 0.5!!) should always be a value between the pessimistic AUC (here 1) and the optimistic AUC (here 1), because the ROC curve always lies between the pessimistic and the optimistic ROC curves. In the particular case of the classifier built below, all 3 ROC curves are identical (check the process's result), so the 3 areas under the curves should be equal, and they are not.

    Dan
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.008">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.008" expanded="true" name="Process">
        <process expanded="true">
          <operator activated="true" class="generate_churn_data" compatibility="5.3.008" expanded="true" height="60" name="Generate Churn Data" width="90" x="45" y="75">
            <parameter key="number_examples" value="1000"/>
            <parameter key="use_local_random_seed" value="true"/>
          </operator>
          <operator activated="true" class="nominal_to_binominal" compatibility="5.3.008" expanded="true" height="94" name="Nominal to Binominal" width="90" x="179" y="75"/>
          <operator activated="true" class="split_validation" compatibility="5.3.008" expanded="true" height="112" name="Validation" width="90" x="313" y="75">
            <parameter key="sampling_type" value="stratified sampling"/>
            <parameter key="use_local_random_seed" value="true"/>
            <process expanded="true">
              <operator activated="true" class="decision_tree" compatibility="5.3.008" expanded="true" height="76" name="Decision Tree" width="90" x="45" y="30">
                <parameter key="minimal_gain" value="0.04"/>
              </operator>
              <connect from_port="training" to_op="Decision Tree" to_port="training set"/>
              <connect from_op="Decision Tree" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true">
              <operator activated="true" class="apply_model" compatibility="5.3.008" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                <list key="application_parameters"/>
              </operator>
              <operator activated="true" class="performance" compatibility="5.3.008" expanded="true" height="76" name="Performance" width="90" x="155" y="30"/>
              <connect from_port="model" to_op="Apply Model" to_port="model"/>
              <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
              <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Churn Data" from_port="output" to_op="Nominal to Binominal" to_port="example set input"/>
          <connect from_op="Nominal to Binominal" from_port="example set output" to_op="Validation" to_port="training"/>
          <connect from_op="Validation" from_port="model" to_port="result 1"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="36"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="162"/>
        </process>
      </operator>
    </process>
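
The claim that the AUC must lie between the pessimistic and the optimistic AUC can be checked by hand with a small sketch. This is not RapidMiner's implementation: `auc_variants` is a hypothetical helper that counts positive/negative pairs, and it assumes the three variants differ only in how tied confidences are credited (0, 0.5 or 1 per tied pair).

```python
def auc_variants(labels, scores):
    """Return (pessimistic, neutral, optimistic) AUC by pairwise comparison.

    labels: list of bools, True for the positive class
    scores: confidence for the positive class, same length as labels
    Assumption: a tied positive/negative pair counts 0, 0.5 or 1.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1 for p in pos for n in neg if p > n)
    ties = sum(1 for p in pos for n in neg if p == n)
    total = len(pos) * len(neg)
    return wins / total, (wins + 0.5 * ties) / total, (wins + ties) / total


labels = [True, True, True, False, False]

# Perfect classifier with only two confidence levels (made-up data).
perfect = auc_variants(labels, [1.0, 1.0, 1.0, 0.0, 0.0])

# Classifier that gives every example the same confidence.
constant = auc_variants(labels, [0.5] * 5)
```

Under these assumptions the perfect classifier gets 1 for all three variants, while the constant classifier spans the full range from 0 to 1 with the neutral value at 0.5.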


  • MariusHelf
    MariusHelf New Altair Community Member
    The actual ROC curve is never above 1, and neither is the AUC.

    The light red area is not a confidence band, but the standard deviation of each data point based on the 10 iterations of the X-Validation. Of course, the actual value plus the standard deviation can exceed 1, and the actual value minus the standard deviation can fall below 0.

    Dan, as Ingo already posted in the old thread, the calculation of the AUC is not wrong. In the standard implementation (neither optimistic nor pessimistic), we smooth the line by interpolating between the steps of the function. If you have more than 2 confidence levels this works quite well. In this border case the result is admittedly a bit strange, but nevertheless correct. If more discussion is needed, let's continue in the respective thread at http://rapid-i.com/rapidforum/index.php/topic,2237.0.html

    Best regards,
    Marius
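
A small numeric sketch of why a mean plus/minus standard deviation band can leave the [0, 1] range even though no single fold does. The ten true-positive rates are made up, and the use of the population standard deviation is an assumption; the thread does not say which estimator the plot uses.

```python
from statistics import mean, pstdev

# Hypothetical true-positive rates of 10 validation folds,
# all read off at the same false-positive rate.
fold_tpr = [1.0, 1.0, 0.6, 1.0, 0.7, 1.0, 1.0, 0.8, 1.0, 1.0]

m = mean(fold_tpr)      # average curve value at this point
s = pstdev(fold_tpr)    # spread across the folds (population form, assumed)

upper = m + s           # top edge of the band
lower = m - s           # bottom edge of the band
```

Every fold value is at most 1, yet `upper` comes out slightly above 1 because most folds sit at the ceiling while a few lie well below it.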
  • wessel
    wessel New Altair Community Member
    Hey Marius,

    Thanks a lot for your information.

    Now I understand why it shows a red spike above 1.
    It's simply because the first part of the ROC has a large variation.
    Therefore mean + standard deviation is almost always above 1.

    As a possible alternative, you could plot all 10 ROC iterations, and plot a fat line in the middle for average(ROC).
    This may be a more faithful display of the ROC distribution.

    Best regards,

    Wessel
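
The average(ROC) idea could be sketched as follows, assuming each fold's curve is a step function and that curves are averaged vertically on a shared false-positive-rate grid. `tpr_at`, `average_roc` and the two example curves are hypothetical and only illustrate the idea; they are not taken from RapidMiner.

```python
def tpr_at(curve, fpr):
    """Highest TPR a step-wise ROC curve reaches at or before the given FPR.

    curve: list of (fpr, tpr) points sorted by fpr, starting at (0, 0).
    """
    return max(t for f, t in curve if f <= fpr)


def average_roc(curves, grid):
    """Average several ROC curves vertically at each FPR value in grid."""
    return [(x, sum(tpr_at(c, x) for c in curves) / len(curves)) for x in grid]


# Two made-up fold curves; a real run would have one per iteration.
curves = [
    [(0.0, 0.0), (0.0, 0.8), (0.5, 1.0), (1.0, 1.0)],
    [(0.0, 0.0), (0.0, 0.4), (0.25, 0.9), (1.0, 1.0)],
]
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
avg = average_roc(curves, grid)
```

Because every averaged value is a mean of values in [0, 1], the fat line itself can never leave that range; only a band added around it can.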
  • MariusHelf
    MariusHelf New Altair Community Member
    Well, that may be a good representation, but I fear that the priorities for changing the plot won't be very high.

    Btw, with your next post you will enter the honorable circle of Hero Members. Congratulations!

    ~Marius
  • dan_agape
    dan_agape New Altair Community Member
    MariusHelf wrote: "the calculation of the AUC is not wrong. In the standard implementation (neither optimistic nor pessimistic), we smooth the line by interpolating between the steps of the function"
    Marius, the ROC's first step in the process I provided above happens at x=0, so the result of "smoothing" the line is not the main diagonal (which leads to an area of 0.5) but the horizontal line y=1 (which leads to an area of 1). The point (0,1) is part of the pessimistic, neutral and optimistic ROCs, so your interpolation should take this into account. I am afraid RM's calculation is indeed wrong.

    If this does not convince you, here is a second, intuitive rationale. The AUC is one of the indicators of a model's performance. A model that randomly guesses the class has an AUC of about 0.5. In contrast, a model that always predicts the correct class should achieve a much better performance (in this case, a higher AUC) than a random guesser, shouldn't it? Such a perfect model is built by the process above, yet according to RM it is only as good as a random guesser if performance is measured by AUC. This is an anomaly, and it is due to RM's wrong calculation of AUC. Consult (***) below for a reference.

    Finally, look at the ROC your software draws in the process I provided: the area under that curve is indeed 1x1=1, since what you have there is a rectangle and not a triangle! The drawing is correct, and it is inconsistent with the calculation, which is clearly wrong.

    Dan

    (***) Reference: Tan, Steinbach, Kumar, Introduction to Data Mining, Addison Wesley, 2005

    Subsection 5.7.2 on ROC: "The area under the ROC curve (AUC) provides another approach for evaluating which model is better on average. If the model is perfect, then its area under the ROC curve would equal 1. If the model simply performs random guessing, then its area under the ROC curve would equal 0.5"
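
The two areas contrasted in this post (rectangle versus triangle) can be reproduced with the trapezoidal rule. `area_under` is a hypothetical helper, and the thread does not establish whether RapidMiner's neutral AUC really drops the (0,1) point; the sketch only shows what each polyline yields.

```python
def area_under(points):
    """Trapezoidal area under a polyline of (fpr, tpr) points sorted by fpr."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# ROC of a perfect classifier: straight up to (0, 1), then across to (1, 1).
with_corner = area_under([(0, 0), (0, 1), (1, 1)])

# Same end points, but interpolated directly without the (0, 1) corner.
diagonal = area_under([(0, 0), (1, 1)])
```

Keeping the corner gives an area of 1, while connecting (0,0) straight to (1,1) gives 0.5, the value of a random guesser.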
  • MariusHelf
    MariusHelf New Altair Community Member
    Dan, OK, I get your point. Of course I know what the AUC is and how to interpret it. However, I am not deep enough in the guts of RapidMiner to tell you exactly how the neutral algorithm works (or is supposed to work), so I will have to create a ticket for this so that a developer can have a look at it.

    Best regards,
    Marius
  • haddock
    haddock New Altair Community Member