Please help me understand MetaCost
ilaria_gori
New Altair Community Member
Dear all,
I have some problems understanding how MetaCost works. Could you help me, please?
I will try to describe what is not clear to me:
Here is what I understood: I read that MetaCost is "bagging with costs". First step: N models are built with bagging and are used, together with the cost matrix, to assign to each training instance a "prediction" that minimizes the expected cost. Second step: these predictions are used as labels to train another, single classifier, which is the final model.
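A minimal sketch of how I picture these two steps (plain Python with scikit-learn; the function and parameter names are my own invention, not RapidMiner's implementation):

import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

def metacost(X, y, cost_matrix, base_learner=DecisionTreeClassifier(), n_bags=10, seed=None):
    # Step 1: bagging -- train n_bags models on bootstrap resamples.
    # (Assumes every class appears in every bootstrap sample.)
    rng = np.random.default_rng(seed)
    n = len(X)
    models = []
    for _ in range(n_bags):
        idx = rng.integers(0, n, size=n)  # sample with replacement
        models.append(clone(base_learner).fit(X[idx], y[idx]))

    # Average the class-probability estimates of the bagged models.
    probs = np.mean([m.predict_proba(X) for m in models], axis=0)

    # Relabel each training instance with the class that minimizes the
    # expected cost: argmin_j sum_i P(i|x) * cost_matrix[i][j].
    relabeled = np.argmin(probs @ np.asarray(cost_matrix), axis=1)

    # Step 2: train one final classifier on the relabeled training set.
    return clone(base_learner).fit(X, relabeled)

Is that roughly the procedure?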
Here is my experience: if I train a classifier with cost matrix [0 1; 1 0] or with [0 2; 1 0] on the same training set, I obtain two models which do not differ, i.e. when I apply them to the same set I get the same ROC curve and the same output for each example. The only thing that changes is the operating point on the ROC curve at which sensitivity and specificity are calculated.
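A quick check of why I expect this for a fixed probability model (my own arithmetic, assuming C[i][j] is the cost of predicting class j when the true class is i, and p = P(second class | x)): predicting the second class is cheaper exactly when (1 - p) * C[0][1] < p * C[1][0], i.e. when p > C[0][1] / (C[0][1] + C[1][0]). So the cost matrix only moves the cut-off on p, not the ranking of the examples:

def threshold(C):
    # Predict the second class iff p > C[0][1] / (C[0][1] + C[1][0]).
    return C[0][1] / (C[0][1] + C[1][0])

print(threshold([[0, 1], [1, 0]]))  # 0.5
print(threshold([[0, 2], [1, 0]]))  # 0.667: same ROC curve, different operating point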
This would make sense for the first step of MetaCost (bagging plus the construction of "predictions" that minimize the expected cost), but how can it remain true after the second step? That is to say, how can the two final models be identical if each "final model" is learnt using the predictions obtained in the first step as labels?
I would be very grateful if you could explain what I have misunderstood about this procedure.
Thanks a lot
ilaria
Answers
Hi,
are you sure that the inner learner is able to handle example weights? This is necessary because otherwise the inner model will never change. You can check this in the learner capabilities of all learning schemes.
Cheers,
Ingo
Hello Ilaria,
I just posted an example which makes use of MetaCost under http://rapid-i.com/rapidforum/index.php/topic,790.0.html. Perhaps it helps your understanding of the process.
When I change my cost matrix, the results of the model do indeed differ.
Perhaps you need to change your cost matrix more drastically, e.g. from [0 1; 1 0] to [-30 1; 1 0], to see an effect (this should lead the model to always predict the first class, shouldn't it?).
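A back-of-the-envelope check of why (my own arithmetic, again with the convention C[i][j] = cost of predicting class j when the true class is i, and p = P(second class | x)): with [-30 1; 1 0], predicting the first class is rewarded when correct, so it wins whenever -30 * (1 - p) + p < 1 - p, i.e. whenever p < 31/32:

def expected_costs(C, p):
    # Expected cost of predicting class 0 resp. class 1, given p = P(class 1 | x).
    cost_pred_0 = (1 - p) * C[0][0] + p * C[1][0]
    cost_pred_1 = (1 - p) * C[0][1] + p * C[1][1]
    return cost_pred_0, cost_pred_1

for p in (0.1, 0.5, 0.9, 0.97):
    print(p, expected_costs([[-30, 1], [1, 0]], p))
# Predicting the first class wins for every p below 31/32 = 0.969,
# so MetaCost relabels almost every instance as the first class.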
@Ingo:
I did not look into the Java code, but as far as I understood Domingos' work, MetaCost does not necessarily need models with example weights; in principle it can work with any model that predicts classes, can't it?
Regards
Wolfgang
Hi,
Ah, you are right. I had a quick scan of the source code, and the example weights only come into play for sampling with replacement. I read the paper some years ago and mixed things up with the sampling. Thanks for pointing that out.
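(For the curious, a sketch of what that resampling looks like, in numpy rather than the actual RapidMiner Java code:)

import numpy as np

def weighted_bootstrap(X, y, weights, seed=None):
    # Draw n examples with replacement, with probability proportional
    # to each example's weight -- the only place the weights are used.
    rng = np.random.default_rng(seed)
    w = np.asarray(weights, dtype=float)
    idx = rng.choice(len(X), size=len(X), replace=True, p=w / w.sum())
    return X[idx], y[idx]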
Cheers,
Ingo