Which Confusion matrix is better?
NatalySimth
New Altair Community Member
I got the following results for two models:
| Metric | Model 1 | Model 2 |
|---|---|---|
| Accuracy | 92.22% | 96.95% |
| Recall | 94.20% | 94.20% |
| Precision | 64.33% | 84.74% |
| F1-Score | 76.45% | 89.21% |
I also got the lift charts (attached). How should I interpret my results based on the lift charts and these metrics, given that the task is identifying spam messages?
Best Answer
Ok, a couple of things:
- Without knowing anything else, Model 2 is more likely to produce better predictions (with "better" meaning having more impact). This is based on a) the higher accuracy and b) the higher precision at the same recall, which c) also results in a higher F1-score.
- However, the impact of both models may be the same, or Model 1 may even have the bigger business impact. What is the cost of missing an important email because it was falsely classified as spam? What is the cost of spam mails that make it through the filter? Based on those values, you could (and should) determine the most important thing: what is the impact of the model? Which one has more? And is it any better than not doing anything and treating everything as "no spam"? You would be surprised how often models, even ones with low error rates, do not actually perform better than not using a model at all...
- Your lift charts look strange TBH. You could try Lift Chart (Simple), which was introduced a while ago, and see if they look any better. Otherwise, those charts look like pretty much all confidence values are either 0 or 1, which often happens for text classification and models like Naive Bayes (NB).
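To make the cost question concrete, here is a minimal Python sketch (not RapidMiner code). It takes the accuracy, precision and recall from the question, derives the implied share of spam and the false positive / false negative rates per message, and compares the expected cost of each model against the "treat everything as no spam" baseline. The two cost values and all function names are hypothetical placeholders; you would need to plug in your own business costs.

```python
def error_rates(accuracy, precision, recall):
    """Derive spam prevalence and per-message FP/FN rates from the metrics.

    With spam prevalence p (spam = positive class):
      FN = (1 - recall) * p
      FP = recall * (1 - precision) / precision * p
      1 - accuracy = FP + FN
    """
    fn_per_spam = 1.0 - recall
    fp_per_spam = recall * (1.0 - precision) / precision
    prevalence = (1.0 - accuracy) / (fn_per_spam + fp_per_spam)
    return prevalence, fp_per_spam * prevalence, fn_per_spam * prevalence


def expected_cost(fp_rate, fn_rate, cost_fp, cost_fn):
    """Average cost per message for the given error rates."""
    return fp_rate * cost_fp + fn_rate * cost_fn


# ASSUMPTION: placeholder costs, not taken from the thread.
COST_FP = 10.0  # a legitimate mail lost to the spam folder
COST_FN = 1.0   # a spam mail that reaches the inbox

# (accuracy, precision, recall) as reported in the question
models = {
    "Model 1": (0.9222, 0.6433, 0.9420),
    "Model 2": (0.9695, 0.8474, 0.9420),
}

results = {}
for name, (acc, prec, rec) in models.items():
    p, fp, fn = error_rates(acc, prec, rec)
    cost = expected_cost(fp, fn, COST_FP, COST_FN)
    results[name] = {"prevalence": p, "fp_rate": fp, "fn_rate": fn, "cost": cost}
    print(f"{name}: spam share={p:.3f} FP={fp:.4f} FN={fn:.4f} cost={cost:.4f}")

# Baseline: label everything "no spam" -> no false positives, every spam is missed.
baseline_prevalence = results["Model 2"]["prevalence"]
baseline_cost = expected_cost(0.0, baseline_prevalence, COST_FP, COST_FN)
print(f"Baseline: accuracy={1 - baseline_prevalence:.3f} cost={baseline_cost:.4f}")
```

Both sets of metrics imply roughly 13.4% spam, so the do-nothing baseline already has about 86.6% accuracy. With the placeholder 10:1 costs above, the baseline comes out cheaper than both models; with equal costs for both error types, both models beat it. Which case applies depends entirely on your real costs.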
Hope those pointers help...
Cheers,
Ingo1