AUC setup

p3dr0
p3dr0 New Altair Community Member
edited November 5 in Community Q&A
Hi all,

I am new to RapidMiner and have an easy and probably stupid question.

I have 3 columns: a label (0 or 1), a prediction (0 or 1) and a confidence (between 0 and 1). When I run Performance for AUC, I only get a horizontal line y=0 and a vertical line x=1. I can't make the curve behave as it should.

I set the roles of my label and prediction and don't know what else I should do.

Please help!!!

Answers

  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    edited June 2020
    Hi @p3dr0,

    "Absolutely no question is stupid!"
    Could you share your process and data so that we can reproduce and understand what is going on?

    In the meantime, have you tried submitting your data to Auto Model?
    Do you observe the same phenomenon on the results screen?

    EDIT:

    "Whoever asks a question risks five minutes of looking stupid; whoever does not ask questions will remain stupid all his life."
    Chinese proverb

    Regards,

    Lionel
  • lionelderkrikor
    lionelderkrikor New Altair Community Member

    I am new to RapidMiner and have an easy and probably stupid question

    "There are no stupid questions, only a stupid answer."
    Albert Einstein

    Regards,

    Lionel
  • p3dr0
    p3dr0 New Altair Community Member
    edited June 2020
    Hi Lionel,

    Thank you for the pep talk :smiley:

    I started with 2 columns: Data (Y/N) and Prediction (a result between 0 and 1; I don't know how it was calculated).

    I need to find the AUC. I converted both Data and Prediction to 0/1 values (using a prediction threshold of 0.5) and built my process from the following operators:
    Read CSV
    Set Role: (I set the converted Data as label and the converted Prediction as prediction; in another attempt I also gave the initial Prediction the role of weight.)
    Performance: asked for the AUC to be calculated.

    All this gave a correct pivot table, a correct precision (95%) and recall (47%), but an AUC of 0.000, with only the lines x=0 and y=1 on the graph, when the positive class is 0 (and the lines x=1 and y=0 with an AUC of 1.000 when the positive class is 1).

    I am not sure what else to say, to be honest. I tried Auto Model, but it wanted to build a model, which I do not want or need.

    Thank you in advance for your help.
  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    @p3dr0

    From the description of your process, I see that it contains no model and produces no confidence values (confidences are calculated when a model is applied).
    Without confidence values, it is impossible to calculate the ROC curve and thus the AUC, so the curve you observe (and AUC = 0) is what is expected in such a case:

    [Screenshot of the resulting ROC curve; image not preserved]

    To be honest, it is difficult to help you without your data. Can you share your data and explain exactly what you want to predict?
    That way, we could help you efficiently.
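
    To illustrate why the confidence values matter, here is a minimal sketch in plain Python (not RapidMiner, and not how the Performance operator is implemented). It assumes made-up toy data and a hypothetical helper named roc_auc that computes the AUC as the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counting one half. It compares the AUC obtained from continuous confidences with the AUC obtained after the same values are cut to 0/1 at a threshold of 0.5.

```python
def roc_auc(labels, scores, positive=1):
    """Rank-based AUC: share of (positive, negative) pairs where the
    positive example has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data, for illustration only (not the data from this thread).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
confidence = [0.9, 0.8, 0.45, 0.3, 0.6, 0.4, 0.2, 0.1]  # score for class 1

# Same scores after thresholding at 0.5, as described in the question.
hard = [1 if c >= 0.5 else 0 for c in confidence]

auc_conf = roc_auc(labels, confidence)               # uses the full ranking
auc_hard = roc_auc(labels, hard)                     # only two distinct scores
auc_flipped = roc_auc(labels, confidence, positive=0)  # class-1 score, class 0 as positive

print(auc_conf, auc_hard, auc_flipped)
```

    On this toy data the continuous confidences give an AUC of 0.8125, while the thresholded 0/1 values give 0.625: with only two distinct scores the ROC curve has a single interior point, so most of the ranking information is lost. Scoring class 0 as positive with the class-1 confidences gives 0.1875, which is 1 minus the first value; a mismatch of that kind between the confidence column and the positive class is one possible reason for seeing an AUC near 0 in one setting and near 1 in the other.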

    Thank you for your understanding,

    Regards,

    Lionel

    PS: If you don't want to share your data publicly here in the community, you can send it to me via a private message.

  • lionelderkrikor
    lionelderkrikor New Altair Community Member
    @p3dr0,

    If you are new to data science and/or to RapidMiner, I suggest watching the educational videos of the RapidMiner Academy:
    https://academy.rapidminer.com/

    In particular, you can watch the video called "Finding the right model". This video focuses on:
     - how to build and validate a model (using Cross-Validation)
          AND
     - how to determine the best model using ROC curves (and the associated AUC), which are explained in detail.
    https://academy.rapidminer.com/learn/video/finding-the-right-model

    Regards,

    Lionel
  • p3dr0
    p3dr0 New Altair Community Member
    edited June 2020
    Hi Lionel,

    Thank you so much for your help.
    I got a graph very similar to yours. Exactly the same issue.

    I am happy to send you my data. Please find it attached.
    In the first 2 columns, "data" is the true outcome and "predicted model" is the prediction.

    I will have a look at the videos. Thank you.

    Thank you again...
    Hope you and your family are safe.