Interpretation of ROC Analysis

Hello Community,

I have derived the following ROC curves by considering four classification models:

Image: https://us.v-cdn.net/6030995/uploads/editor/o0/ix82t0e7bhdr.png

As you can see, the SVM and k-NN curves are each surrounded by a shaded band.

Would it be correct to conclude from the graph that only k-NN and SVM were able to learn from the given dataset and that the remaining two (DT and NB) were not?

What does the shading mean in detail? I would interpret it as the deviation across the learning runs, with the curve itself being the mean that runs through the shaded band.

I thank you in advance for your help!

Best regards,

Fatih

Accepted answers

varunm1

Hello @Muhammed_Fatih_

Do you think that the marked ROC course is common if the ROC curve goes hand in hand with the optimum?

Yep, you got an optimal curve.
Is it common? Not very common in my work, but I have obtained some ideal results in my studies. Most of the time it comes down to a strong hypothesis and to what we are looking for in the data.

Can you get an optimal ROC?
Yes, if the data is very good for the model to train and predict.

I understand why you are skeptical about good results, and it is right to be cautious when you get them. You should analyze results like these in depth. There are many reasons why a model can give very good results, so carefully check your data, your model building, and your hypothesis to make sure there is no conceptual error in how the model was built.

You should consider some pitfalls in the analysis, such as ignoring temporal relations in the data or including a predictor that is a replica of the target variable. There are many others that you can find with a web search.
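One way to screen for a predictor that is a replica of the target is to score every column on its own against the label. The following is only a minimal sketch in Python with scikit-learn, not something from the original process: the data is synthetic, and the helper name `suspicious_features` and its `threshold` parameter are hypothetical.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Synthetic stand-in data (assumption): 300 rows, binary label, 3 numeric columns.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=300)
X = rng.normal(size=(300, 3))
X[:, 2] = y  # hypothetical leaked column: an exact copy of the label


def suspicious_features(X, y, threshold=0.99):
    """Return indices of columns that separate the classes almost perfectly alone."""
    flagged = []
    for j in range(X.shape[1]):
        auc = roc_auc_score(y, X[:, j])
        # A column that ranks the classes in reverse is just as suspicious.
        auc = max(auc, 1.0 - auc)
        if auc >= threshold:
            flagged.append(j)
    return flagged


flagged = suspicious_features(X, y)
print("Columns to inspect for target leakage:", flagged)
```

A flagged column is not proof of leakage, only a prompt to check where that attribute comes from.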

All comments

[Deleted User]

Hello

You can watch this video; I hope it helps you.
https://academy.rapidminer.com/learn/video/finding-the-right-model

All the best
mbs

varunm1

Hello @Muhammed_Fatih_

Are you sure the decision tree and NB are not learning? Based on the ROC curves, their AUC values are 1 or close to 1. If I am reading this correctly, DT and NB are discriminating the classes with very high accuracy compared to SVM and k-NN.
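To compare the curves by number instead of by eye, the AUC of each model can be computed directly. This is a rough sketch in Python with scikit-learn and not the process from the thread: the dataset is a synthetic stand-in, and the model settings are library defaults chosen as assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption); the original dataset is not available.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)

models = {
    "DT": DecisionTreeClassifier(random_state=0),
    "NB": GaussianNB(),
    "SVM": SVC(random_state=0),
    "k-NN": KNeighborsClassifier(),
}

auc_by_model = {}
for name, model in models.items():
    # 5-fold cross-validated AUC; the spread across folds shows how stable it is.
    scores = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    auc_by_model[name] = (scores.mean(), scores.std())
    print(f"{name}: AUC = {scores.mean():.3f} +/- {scores.std():.3f}")
```

An AUC near 1 means the model ranks almost every positive example above every negative one, and an AUC near 0.5 means it does no better than chance.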

Muhammed_Fatih_

Hello @mbs,

thank you for the link!

Hello @varunm1,

I am not sure whether they learn or not. But when I see such high values compared to SVM and k-NN, it looks like an indicator of overfitting. How do you see that? Would you also interpret DT and NB as appropriate solutions here? If yes, why?

varunm1

Hello @Muhammed_Fatih_

I can only comment on that based on the data and the type of analysis you were doing. If it is a split validation, there is a chance that you got high performance like this by chance. There are also other factors, such as temporal characteristics in the data, and many other checks that you need to run when you get very good results like these.

Muhammed_Fatih_

Hello @varunm1,

thank you for your answer! I have used Cross Validation because studies have shown that it generates more accurate predictions in comparison to Split validation.

varunm1

Hello @Muhammed_Fatih_

Cross-validation is a good validation method, but if your data has temporal (time-dependent) characteristics or confounding relationships, it might sometimes overestimate performance. If you think there are none, then the models might simply be doing well. Different models work well for different types of data.

You can also split your original data 70:30 or 80:20, depending on the size of your data, then cross-validate on the larger portion and test on the smaller portion to see how the model is doing.
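The temporal point can be made concrete by checking whether any fold trains on rows that come after the rows it is tested on. This is a small sketch in Python with scikit-learn, assuming the rows are sorted by time; the helper `trains_on_future` is hypothetical and only used for illustration.

```python
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit

n_rows = 12  # assumption: rows are sorted by time, oldest first
X = np.arange(n_rows).reshape(-1, 1)

shuffled = KFold(n_splits=3, shuffle=True, random_state=0)
ordered = TimeSeriesSplit(n_splits=3)


def trains_on_future(splitter, X):
    """True if any fold has a training row that is later than a test row."""
    return any(train.max() > test.min() for train, test in splitter.split(X))


leaky_kfold = trains_on_future(shuffled, X)
leaky_tss = trains_on_future(ordered, X)
print("Shuffled k-fold trains on future rows:", leaky_kfold)
print("Time-ordered split trains on future rows:", leaky_tss)
```

If the data has no time dependence, ordinary shuffled folds are fine; the ordered split only matters when later rows carry information about earlier ones.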
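The holdout-plus-cross-validation setup described above could look roughly like this in Python with scikit-learn. It is a sketch under assumptions: the data is synthetic, and the decision tree and its `max_depth` are arbitrary choices rather than settings from the thread.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# 80:20 split; the 20% portion is not touched until the final check.
X_dev, X_test, y_dev, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = DecisionTreeClassifier(max_depth=5, random_state=0)

# Cross-validate on the larger portion only.
cv_auc = cross_val_score(model, X_dev, y_dev, cv=5, scoring="roc_auc").mean()

# Refit on the whole larger portion and score once on the held-out portion.
model.fit(X_dev, y_dev)
test_auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

print(f"Cross-validated AUC: {cv_auc:.3f}")
print(f"Held-out test AUC:   {test_auc:.3f}")
```

A held-out AUC far below the cross-validated one would be a reason to look for overfitting or leakage.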

[Deleted User]

@Muhammed_Fatih_

Hello

For more information, here is a good article from @sgenzer:
https://community.rapidminer.com/discussion/54621/cross-validation-and-its-outputs-in-rm-studio

Good luck

Muhammed_Fatih_

Hello @varunm1,
hello @mbs,

thank you for your answers!

To come back and to refine the initial question: Do you think that the marked ROC course is common if the ROC curve goes hand in hand with the optimum? Is this possible in general?

Image: https://us.v-cdn.net/6030995/uploads/editor/e5/o73j993mg8ye.jpg

varunm1

Hello @Muhammed_Fatih_

Do you think that the marked ROC course is common if the ROC curve goes hand in hand with the optimum?