RM 9.4 feedback (official release): Costs/Benefits calculation
lionelderkrikor
New Altair Community Member
Dear all,
First, thank you for implementing the costs/benefits calculation in this new release - I think a lot of users (including me) have been waiting for this feature.
Two months ago I asked several questions about the Costs/Benefits calculation in this thread, and thanks to @IngoRM's answers everything was clear:
https://community.rapidminer.com/discussion/55904/questions-on-rapidminer-9-4-beta-new-releases
But in this official release, I see that "Total Cost/Benefit (expected)" and the associated average were dropped. My first question is: why?
The "Total Cost/Benefit (expected)" and the associated average are replaced by:
- "Total for best option"
- "Gain"
My second question is: can you explain how these 2 numbers are calculated (despite my efforts I was not able to reproduce them), and why are these 2 new numbers more relevant than the "Total Cost/Benefit (expected)"?
Here is my attempt to reproduce these 2 numbers with the Titanic dataset, with all options left at their defaults in Auto Model and a Naive Bayes model:
Third question: in the new column called "cost", why is the cost not counted as negative when the prediction is wrong (I assume the cost matrix is the following):
Thank you for listening,
Regards,
Lionel
Answers
Hi Lionel,
Sure, let's get into this. BTW, I will spend a good amount of my October producing some new white papers / blog posts on the stuff I have been working on in the past years. This will include the new validation scheme we are using in AM, our new LIME variant, our new model-specific & model-agnostic weights, and the new Profit-Sensitive Scoring and cost calculations. Stay tuned for this please - the answers below are already half of the work.

> I'm seeing that "Total Cost/Benefit (expected)" and the associated average were abandoned. My first question is why?

Because we collected feedback from users, observed them in usability testing, and decided that the new way of presenting the costs / profits is a) more easily understood and b) shows the ACTUAL business impact instead of expected values, which are probabilistic. At the end of the day, that's exactly why we do those tests and beta releases.

> Why are these 2 new numbers more relevant than the "Total Cost/Benefit (expected)"?

Given that those numbers are supposed to show true business impact and to facilitate communication with business stakeholders, we decided that it is important to favor both a) and b) above over a more statistical approach here. I hope you will agree after reading the explanation below... The next two questions I will answer in reverse order, since it makes more sense to me that way...

> In the new column called "cost", why is the cost not counted as negative when the prediction is wrong?

That's exactly the problem we saw with business stakeholders as well :-) The thing is that the costs in this column are the expected costs from the point of view of the model. The model - when making the prediction - does not know if the prediction is going to be right or wrong. It just knows its prediction and the confidence for it (as well as for the other possible outcomes) and calculates the expected costs based on those confidences. This is what I explained in the "expected" part of the answer in this post: https://community.rapidminer.com/discussion/55904/questions-on-rapidminer-9-4-beta-new-releases

So from the model's perspective the expected cost simply is not negative in these cases - otherwise it would have gone with a different prediction if possible.

The problem now is that this expected cost is of course not what the business will actually see. In most situations, the business cannot make an 80% decision just because the confidence is 80% (there are some use cases where this is actually possible, but they are the minority). For example, let's use churn as our use case and let's say you predict "churn" with 80% confidence. You will still offer the full discount (or whatever it is you are doing in this case) for the account the prediction was made for. Same for 75% confidence. Or 90%. The confidence typically does not matter that much for the specific action. And the same is true for the outcome: it is a binary outcome, not a probabilistic one. Either your customer churns or not. They don't churn 72% (and again - there are cases like that, but they are not typical).

So while the table shows the expected cost to explain why the model made the decision, from a business perspective it just does not make a lot of sense. Hence we wanted to put less of a spotlight on the expected costs from the model's perspective and instead show the actual business impact. See below for that now :-)
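To make the "expected cost from the model's point of view" concrete, here is a minimal sketch. The classes, confidences and cost values are made up, and the sign convention (positive = cost, negative = benefit) is only an assumption for the illustration:

```python
# Hypothetical cost matrix: cost[(predicted, actual)]
# (assumed convention: positive = cost, negative = benefit)
cost = {
    ("Yes", "Yes"): -5.0, ("Yes", "No"): 2.0,
    ("No", "Yes"): 10.0, ("No", "No"): 0.0,
}

confidences = {"Yes": 0.8, "No": 0.2}  # the model's confidences for one row
prediction = "Yes"                     # the class the model actually predicted

# Expected cost of this prediction, as the model sees it: each possible actual
# outcome is weighted by the model's confidence in that outcome.
expected_cost = sum(confidences[actual] * cost[(prediction, actual)]
                    for actual in confidences)
print(expected_cost)  # 0.8 * (-5.0) + 0.2 * 2.0 = -3.6
```

Even if this particular row later turns out to be a misclassification, the value shown in the "cost" column stays at this confidence-weighted number - the model never "knew" it was wrong.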
> Can you explain how these 2 numbers are calculated?

Here we go. First of all: forget the expected cost and the corresponding column in the data for this. This column is - as explained above - only showing why the model made a specific decision and has not much to do with the actual business outcomes.

Both numbers can be calculated from the confusion matrix and the cost matrix alone. The attributes or specific predictions don't actually matter. Neither do the confidences.

All you do is multiply the values in the fields of the confusion matrix with the corresponding cost matrix entries and sum this all up. That is the Total Cost / Benefit delivered by the model.

Now you assume a default model for each of the possible classes, i.e. a model which always predicts that class. Since you already know the distribution of the actual outcomes from the confusion matrix, you can calculate the costs / benefits for always predicting each of the possible classes. The class with the lowest cost / highest profit is the best default option. Its total cost is shown as Total for Best (class). This is typically what is already implemented in your organization, since otherwise the business would typically be gone at that point :-)

The difference between what the model produced and the best default option is the Gain. That's it.

And since this may be hard to follow from a textual description: check out the little example in the attached Excel file. It shows all the calculations and should make it clear.

Hope this helps,
Ingo
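To make the recipe above concrete, here is a minimal sketch with made-up numbers (neither the Titanic results nor the values from the attached Excel file). The sign convention (positive = cost, negative = benefit) is an assumption for the example:

```python
classes = ["Yes", "No"]

# Hypothetical confusion matrix: counts[(predicted, actual)]
counts = {
    ("Yes", "Yes"): 120, ("Yes", "No"): 30,
    ("No", "Yes"): 25, ("No", "No"): 200,
}

# Hypothetical cost matrix: cost[(predicted, actual)]
# (assumed convention: positive = cost, negative = benefit)
cost = {
    ("Yes", "Yes"): -5.0, ("Yes", "No"): 2.0,
    ("No", "Yes"): 10.0, ("No", "No"): 0.0,
}

# Total Cost/Benefit of the model: multiply every confusion-matrix cell by the
# matching cost-matrix cell and sum it all up.
total_model = sum(counts[cell] * cost[cell] for cell in counts)

# Default models: one per class, each always predicting that class.
# The distribution of actual outcomes is read off the confusion matrix.
actual_totals = {a: sum(counts[(p, a)] for p in classes) for a in classes}
default_totals = {p: sum(actual_totals[a] * cost[(p, a)] for a in classes)
                  for p in classes}

# "Total for Best Option": the cheapest / most profitable default model.
best_class = min(default_totals, key=default_totals.get)
total_best_default = default_totals[best_class]

# "Gain": how much better the model is than the best default option
# (positive = the model saves money under the positive-cost convention).
gain = total_best_default - total_model

print(total_model)                     # -290.0
print(best_class, total_best_default)  # Yes -265.0
print(gain)                            # 25.0
```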
Hi Ingo,
Yes, your long and detailed explanation helped me a lot in understanding these new Benefits/Costs concepts. #noblackboxes
Thank you for spending your time answering my questions.
Now you'll think I'm picky about details, but I will quote the German philosopher Friedrich Nietzsche: "The devil is in the details."
Let me begin:
The 3 money indicators (Total Cost/Benefits, Total for Best Option, Gain) are calculated on the whole validation set (i.e., for the Titanic dataset, on 524 examples [1309 examples x 40%]):
But the displayed confusion matrix is NOT built on the whole validation set:
Here we can see that the number of examples used to build this confusion matrix (again for the Titanic) is
219 + 135 + 7 + 14 = 375 examples, presumably due to the factor 5/7 introduced by the Performance Average (Robust) operator (524 x 5/7 ≈ 374).
My question is: for consistency of the results, shouldn't the 3 money indicators be calculated from this displayed confusion matrix? In other words, as it stands, the displayed money indicators do not correspond directly to the displayed confusion matrix...
Thank you for your patience,
Regards,
Lionel
You got me there!

You are right: the ACTUAL calculations are NOT coming from the confusion matrix but from the predictions table. I just realized yesterday, while I was writing the answer, that it would be easier to explain by starting from the confusion matrix rather than from the table with the predictions...

There is an inconsistency one way or the other: either the $$$ are consistent with the predictions or with the confusion matrix - but never with both... I would agree, however, that it may be better to have them consistent with the confusion matrix, because they are both presented in the same place.

I will put the change on the backlog for the next release then.

Thanks for pointing this out,
Ingo
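As a small complement to the sketch above: if, per this correction, the $$$ are actually computed from the predictions table, the total is just a row-by-row sum over the scored examples (values hypothetical). This also shows why it can differ from a confusion matrix that was built on only a subset of those rows:

```python
# Hypothetical predictions: (predicted, actual) class for each scored row.
rows = [("Yes", "Yes"), ("No", "No"), ("Yes", "No"), ("No", "Yes"), ("Yes", "Yes")]

# Same assumed cost-matrix convention as above: positive = cost, negative = benefit.
cost = {("Yes", "Yes"): -5.0, ("Yes", "No"): 2.0,
        ("No", "Yes"): 10.0, ("No", "No"): 0.0}

# Total Cost/Benefit straight from the predictions table: sum the cost-matrix
# entry of each (predicted, actual) pair over all scored rows.
total_from_predictions = sum(cost[pair] for pair in rows)
print(total_from_predictions)  # -5.0 + 0.0 + 2.0 + 10.0 - 5.0 = 2.0
```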
> You got me there

So Friedrich Nietzsche was right...
More seriously, I agree with your point of view, Ingo, and once again, thanks for taking the time to answer me.
Regards,
Lionel
This is a very interesting discussion. I haven't had a chance to dive into this new operator yet, but I had a couple of questions.
@IngoRM how is the new operator different from the existing Performance (Costs) operator? Or is it?
It appears that they require the same inputs (a class order and then a misclassification cost matrix). In this framework, are you still allowed to enter benefits as negative costs?
Hey Brian,

The performance calculation we are discussing in this thread is actually not (that much) different from the one done by Performance (Cost).

But the more exciting thing is actually the operator Cost-Sensitive Scoring. This is a new approach which actually changes the way the model creates its prediction in order to optimize the expected cost. In that regard, this new CSS operator is similar to MetaCost. Also like MetaCost, this new operator can work with more than two classes. Unlike MetaCost, however, it has two huge advantages: a) it does not increase training times and b) it does not require training an ensemble, which would make the model more complex.

And, yes, if you use the CSS operator you can add benefits as negative numbers. If you define costs in Auto Model you need to put in costs as negative and benefits as positive numbers though. AM will do the conversion for you.

Cheers,
Ingo
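For what it's worth, the general idea behind cost-sensitive scoring can be sketched in a few lines (a sketch of the concept, not necessarily the exact internals of the CSS operator): instead of predicting the class with the highest confidence, pick the class with the lowest expected cost given the confidences and the cost matrix. Class names and cost values below are hypothetical:

```python
def cost_sensitive_prediction(confidences, cost):
    """Pick the class with the lowest confidence-weighted expected cost.

    confidences: {class: confidence} for one scored row
    cost: {(predicted, actual): value}, assumed convention: positive = cost
    """
    expected = {
        predicted: sum(confidences[actual] * cost[(predicted, actual)]
                       for actual in confidences)
        for predicted in confidences
    }
    return min(expected, key=expected.get)

# Example: plain scoring would predict "No" (confidence 0.6 > 0.4), but the high
# cost of missing a "Yes" flips the cost-sensitive prediction to "Yes".
cost = {("Yes", "Yes"): -5.0, ("Yes", "No"): 2.0,
        ("No", "Yes"): 10.0, ("No", "No"): 0.0}
print(cost_sensitive_prediction({"Yes": 0.4, "No": 0.6}, cost))  # -> Yes
```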