Interpretation of ROC Analysis

Muhammed_Fatih_
Muhammed_Fatih_ New Altair Community Member
edited November 5 in Community Q&A
Hello Community, 

I have derived the following ROC curves by considering four classification models: 

[Figure: ROC curves of the four classification models (SVM, k-NN, DT, NB)]

As you can see, SVM and k-NN each produce a curve surrounded by a shaded band.

Would it be correct to conclude from the graph that only k-NN and SVM were able to learn from the given dataset and the remaining two (DT and NB) were not?

What exactly does the shading mean? I would interpret it as the deviation across the learning runs, with the curve itself being the mean running through the shaded band.

I thank you in advance for your help! 

Best regards, 

Fatih 
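
On the shading question: a common convention, and the one assumed here, is that a ROC chart produced by cross-validation draws the mean curve over the folds as the line and the spread around it (for example one standard deviation) as the band. Whether the plotting tool used above does exactly this is an assumption to check in its documentation. A minimal pure-Python sketch of that idea, with hypothetical helper names `roc_points`, `tpr_at` and `roc_band`:

```python
from statistics import mean, pstdev


def roc_points(labels, scores):
    """Return ROC points (fpr, tpr) from sweeping a threshold over the scores."""
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative example")
    # Sort by descending score so that each step lowers the threshold.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Examples with the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def tpr_at(points, fpr):
    """Highest TPR reached at a false-positive rate not exceeding `fpr`."""
    return max(t for f, t in points if f <= fpr)


def roc_band(folds, grid):
    """folds: list of (labels, scores), one per cross-validation fold.

    Returns a list of (fpr, mean_tpr, std_tpr) evaluated on the FPR grid.
    """
    curves = [roc_points(labels, scores) for labels, scores in folds]
    band = []
    for fpr in grid:
        tprs = [tpr_at(curve, fpr) for curve in curves]
        band.append((fpr, mean(tprs), pstdev(tprs)))
    return band


# Illustrative data only: two folds, one perfectly ranked, one with a mistake.
folds = [
    ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),
    ([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]),
]
band = roc_band(folds, [0.0, 0.5, 1.0])
```

Under this reading, a curve with no visible band simply means the folds produced nearly identical curves; on its own it says nothing about whether the model learned.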

Answers

  • [Deleted User]
    [Deleted User] New Altair Community Member
    Hello,

    you can watch this video; I hope it helps you:
    https://academy.rapidminer.com/learn/video/finding-the-right-model

    All the best
    mbs
  • varunm1
    varunm1 New Altair Community Member
    edited February 2020
    Hello @Muhammed_Fatih_

    Are you sure the decision tree and NB are not learning? Judging from the ROC curves, their AUC values are 1 or close to 1. If I am reading the plot correctly, DT and NB are separating the classes far better than SVM and k-NN.
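
To make the AUC reading concrete: AUC can be understood as the probability that a randomly chosen positive example is scored higher than a randomly chosen negative one, so a curve hugging the top-left corner corresponds to an AUC of 1. A small sketch of that definition in plain Python (the function name `auc` and the toy data are illustrative, not taken from the thread):

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


perfect = auc([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1])      # every positive outranks every negative
one_swap = auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])     # one negative outranks one positive
uninformative = auc([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5])  # constant score, all ties
```

A value of 1 means perfect ranking on the evaluated data; whether that generalizes is a separate question.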
  • Muhammed_Fatih_
    Muhammed_Fatih_ New Altair Community Member
    Hello @mbs,

    thank you for the link! 

    Hello @varunm1,

    I am not sure whether they learn or not. But such high values compared to SVM and k-NN look to me like an indicator of overfitting. How do you see that? Would you also interpret DT and NB as appropriate solutions here? If yes, why? 
  • varunm1
    varunm1 New Altair Community Member
    Hello @Muhammed_Fatih_

    I can only comment on that with knowledge of the data and the type of analysis you are doing. If it is a split validation, there is a chance of getting high performance like this by chance from one particular split. There are also other factors, like temporal characteristics in the data, and many other checks that you need to do when you get results this good. 

  • Muhammed_Fatih_
    Muhammed_Fatih_ New Altair Community Member
    Hello @varunm1

    thank you for your answer! I have used cross-validation because studies have shown that it gives more reliable performance estimates than split validation.  
  • varunm1
    varunm1 New Altair Community Member
    Hello @Muhammed_Fatih_

    Cross-validation is a good validation method, but if your data has temporal (time-dependent) characteristics or confounding relationships, it can sometimes overestimate performance. If you think there are none, then the models might simply be doing well. Different models work well for different types of data. 
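
For the temporal case, one option is a forward-chaining split, where every test block lies strictly after its training block so the model never sees the future. This is a rough sketch assuming the rows are already sorted by time; `forward_chaining_splits` is a made-up helper name, not a function of any particular tool:

```python
def forward_chaining_splits(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs for time-ordered data.

    Assumes row 0 is the oldest observation. Each training set is a prefix
    of the data and each test set is the block that follows it.
    """
    if n_splits < 1 or n_samples < n_splits + 1:
        raise ValueError("need at least n_splits + 1 samples")
    block = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = block * k
        # The last test block absorbs any leftover rows.
        test_end = n_samples if k == n_splits else block * (k + 1)
        yield list(range(train_end)), list(range(train_end, test_end))


splits = list(forward_chaining_splits(n_samples=10, n_splits=3))
```

If performance drops noticeably under such a split compared to shuffled cross-validation, that would hint at temporal leakage.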

    You can also split your original data 70:30 or 80:20, depending on its size, then cross-validate on the larger portion and test on the smaller portion to see how the model is doing.
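
The hold-out-plus-cross-validation idea above could be sketched roughly as follows. This only produces index sets; the helper name `holdout_then_kfold` and its defaults (80:20 split, 5 folds, fixed seed) are illustrative assumptions:

```python
import random


def holdout_then_kfold(n_samples, test_fraction=0.2, k=5, seed=0):
    """Set aside a test set, then build k-fold splits on the remainder.

    Returns (test_idx, cv_splits), where cv_splits is a list of
    (train_idx, validation_idx) pairs drawn only from the non-test rows.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffling assumes no time ordering
    n_test = int(round(n_samples * test_fraction))
    test_idx, dev_idx = indices[:n_test], indices[n_test:]

    # Deal the remaining rows into k roughly equal folds.
    folds = [dev_idx[i::k] for i in range(k)]
    cv_splits = []
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        cv_splits.append((train, validation))
    return test_idx, cv_splits


test_idx, cv_splits = holdout_then_kfold(100, test_fraction=0.2, k=5)
```

The test rows are touched only once, after model selection on the cross-validated portion is finished.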
  • Muhammed_Fatih_
    Muhammed_Fatih_ New Altair Community Member
    edited February 2020
    Hello @varunm1
    hello @mbs

    thank you for your answers!

    To come back to and refine my initial question: do you think the marked ROC curve is common, that is, a ROC curve that coincides with the optimum? Is this possible in general?   


  • varunm1
    varunm1 New Altair Community Member
    Answer ✓
    Hello @Muhammed_Fatih_

    Do you think the marked ROC curve is common, that is, a ROC curve that coincides with the optimum?
    Yes, you got an optimal curve.
    Is it common? Not very common in my work, but I have obtained some ideal results in my studies. Most of the time it comes down to a strong hypothesis and to what we are looking for in the data.

    Can you get an optimal ROC?
    Yes, if the data is very well suited for the model to train and predict on. 

    I understand why you are skeptical, and it is good to be cautious when results look this good. You should analyze them in depth. There are many reasons why a model can give very good results, so carefully check your data, your model building, and your hypothesis to make sure there is no conceptual error in how the model was built.

    You should consider some common pitfalls in the analysis, such as ignoring temporal relations in the data or including a predictor that is a replica of the target variable. There are many others that you can look up online.
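
As one way to check for the "replica of the target" pitfall, the sketch below flags columns whose values map one-to-one onto the target. It is a simple heuristic with a hypothetical helper name (`find_target_replicas`) and made-up column names; on very small datasets a column can match by coincidence, so treat hits as candidates to inspect, not as proof of leakage:

```python
def find_target_replicas(rows, target):
    """Return columns that are in one-to-one correspondence with the target.

    rows: list of dicts sharing the same keys; target: name of the label key.
    """
    suspects = []
    for col in (c for c in rows[0] if c != target):
        forward, backward = {}, {}
        one_to_one = True
        for row in rows:
            x, y = row[col], row[target]
            # Each predictor value must map to one label and vice versa.
            if forward.setdefault(x, y) != y or backward.setdefault(y, x) != x:
                one_to_one = False
                break
        if one_to_one:
            suspects.append(col)
    return suspects


# Illustrative data only: "churn_flag" is just the label under another name.
rows = [
    {"age": 30, "churn_flag": "Y", "churned": 1},
    {"age": 45, "churn_flag": "N", "churned": 0},
    {"age": 30, "churn_flag": "N", "churned": 0},
    {"age": 52, "churn_flag": "Y", "churned": 1},
]
suspects = find_target_replicas(rows, target="churned")
```

Removing such a column and re-running the validation shows quickly whether the near-perfect ROC depended on it.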