Auto Model Criteria

Newbie_01 New Altair Community Member
edited November 2024 in Community Q&A
I was wondering what "Best Gains" means for the models used for prediction in the Auto Model section. Isn't the accuracy percentage already determining the "best model"? How can the two differ in some cases? Can somebody explain the exact differences?

Thank you very much


Answers

  • YYH
    Altair Employee
    Answer ✓
    Hi @Newbie_01,

    Welcome! 

    This is a great question. Before we go deeper, I hope you can benefit from the previous discussions, especially the insights from Ingo & Lionel:
    https://community.rapidminer.com/discussion/56170/rm-9-4-feedback-official-release-costs-benefits-calculation
    https://community.rapidminer.com/discussion/comment/62164#Comment_62164

    Accuracy is one of the performance measures derived from the confusion matrix. Sometimes you also have to consider the potential gain from a correct prediction and the potential cost of an incorrect one. You can define these business values in your 'cost matrix' so that model performance is aligned with business decision making.
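
    To make the difference concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the two confusion matrices and the value matrix are made-up numbers, and the helper names are hypothetical. It is not a description of how Auto Model computes gains internally; it only shows that accuracy and a cost-weighted total can rank two models differently.

```python
# Hypothetical confusion matrices for two models on the same 100 test rows.
# Layout: rows = true class, columns = predicted class -> [[TN, FP], [FN, TP]]
model_a = [[85, 5], [5, 5]]
model_b = [[78, 12], [1, 9]]

# Hypothetical business value per outcome, same layout as above:
# a missed positive (FN) is assumed to be far more costly than a false alarm (FP).
value_matrix = [[0, -1], [-10, 10]]


def accuracy(cm):
    """Share of rows on the diagonal (correct predictions)."""
    correct = cm[0][0] + cm[1][1]
    total = sum(sum(row) for row in cm)
    return correct / total


def total_gain(cm, values):
    """Sum of (count in each cell) * (value assigned to that cell)."""
    return sum(cm[i][j] * values[i][j] for i in range(2) for j in range(2))


for name, cm in (("A", model_a), ("B", model_b)):
    print(name, "accuracy:", accuracy(cm), "gain:", total_gain(cm, value_matrix))
```

    With these invented numbers, model A has the higher accuracy but model B has the higher total value, because B makes fewer of the expensive mistakes.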

    HTH!

    YY

  • Newbie_01
    New Altair Community Member
    I read the post you provided. Just to check that I got it right, I want to sum up the situation.

    I got a result for the logistic regression with an accuracy of 93% and a gain of 10.
    I also got a result for the gradient boosted tree with an accuracy of 97% and a gain of 6.

    The cost matrix is +1 and -1 as given.

    1) The accuracy is the percentage of predictions this model gets right, correct? Or does it say that logistic regression is, with 93% certainty, the right model for this task in general given the provided data?

    2) The gain for both approaches is given by the difference between "Profits from Model = 26/24" and "Profits for Best Option = 20/14". What does that tell me? What is the difference?
    I still don't understand how the confusion matrix leads to this outcome, as I actually had 100 rows of data and the confusion matrix seems to have analysed only 38 rows.

    Thanks for the fast response before! :)