unclear matrix for MetaCost operator
dan_agape
New Altair Community Member
Any answer/comment to the question below would be appreciated. Many thanks,
Dan
When defining the cost matrix of the MetaCost operator, the names class 1 and class 2 appear (suppose you have 2 classes only). How do you know which is which (for instance, that class 1 is "Yes" and class 2 is "No")? Perhaps there is an obvious/user-friendly solution, but I do not see it.
Answers
-
Hi Dan,
Sadly there isn't really a better answer than the source code, which I think passes the matrix from the parameters to the model without changing the order.
In the operator we have this before the learning... (MetaCost.java)
    // get cost matrix
    double[][] costMatrix = getParameterAsMatrix(PARAMETER_COST_MATRIX);
and this to produce the actual model...
    return new MetaCostModel(inputSet, models, costMatrix);
and this to show that the same thing gets stored in the new model...
    public MetaCostModel(ExampleSet exampleSet, Model[] models, double[][] costMatrix) {
        super(exampleSet);
        this.models = models;
        this.costMatrix = costMatrix;
    }
Actually the same answer probably applies to your other questions this evening: you need to check the code out for yourself; and believe me, if I can manage it, anyone can!
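To see why the class order matters once the matrix is passed through unchanged, here is a minimal, self-contained sketch. It is not RapidMiner code: the class name CostMatrixOrderSketch, both method names, and the convention that cost[p][a] means "cost of predicting class p when the actual class is a" are assumptions made for illustration only. The sketch picks the class with the lowest expected cost, which is the general idea behind cost-sensitive prediction.

```java
import java.util.List;

final class CostMatrixOrderSketch {

    /**
     * Returns the class index with the lowest expected cost.
     * Assumption of this sketch: cost[p][a] is the cost of predicting class p
     * when the actual class is a, with both indices in the label's internal order.
     */
    static int minExpectedCostClass(double[] classProbabilities, double[][] cost) {
        int best = 0;
        double bestCost = Double.POSITIVE_INFINITY;
        for (int predicted = 0; predicted < cost.length; predicted++) {
            double expected = 0.0;
            for (int actual = 0; actual < classProbabilities.length; actual++) {
                expected += classProbabilities[actual] * cost[predicted][actual];
            }
            if (expected < bestCost) {
                bestCost = expected;
                best = predicted;
            }
        }
        return best;
    }

    /** Convenience wrapper: maps the winning index back to its label. */
    static String decide(List<String> classOrder, double[] probs, double[][] cost) {
        return classOrder.get(minExpectedCostClass(probs, cost));
    }
}
```

With the matrix {{0, 5}, {1, 0}} and an example whose probability of "Yes" is 0.3, the decision is "Yes" if the internal order is (No, Yes) but "No" if the order is (Yes, No). The same numbers in the parameter dialog can therefore mean opposite things, depending on which class is "class 1".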
So I leave you with the pleasures of Dark Vega 8)
-
Hi Haddock,
Thanks - that's been useful.
RM is an excellent and impressive DM suite in many respects; however, user-friendliness is essential for software to become significant in this competitive market. I wonder, however, whether in the commercial versions the meaning of the columns/rows in the confusion matrix is obvious (otherwise one can enter a particular higher cost without knowing which class it applies to).
By the way I have tried to use the Weka MetaCost operator instead - just to stick to the process of modeling via the GUI, but there the inner operator that builds the model cannot be linked to the outer operator to get the dataset and return the model.
Best,
Dan
-
Correction: where I wrote "confusion matrix" above, please read "cost matrix".
-
Hi there,
The RM Freebie does not have different functionality from the commercial version as far as I know. I'm with you on the need for handy help; equally we could make it ourselves... after all this is open source software 8)
-
Hi all,
At first sight your problem seems easy to solve: read the data, fetch the possible class labels, and show them to the user. But it's not that easy. Before the data reaches the operator whose parameters you are going to set, it passes through many other operators, so RM would have to execute all of them to be sure which labels are actually present. This might take an arbitrary amount of time (as usual for data mining processes), which makes things much more complicated. We have taken first steps with the so-called MetaData transformation, where only data about the data is handled; it already solves many problems such as attribute selection. A priceless feature if you have ever tried software without it...
But you cannot rely on this for such an important feature, because many transformations cannot be simulated without taking the real data into account.
As a way out, you can explicitly remap your label attribute, so that you know the order of the classes.
Anyway, we are working hard to further improve the user-friendliness and ease of use of our software. You might add a feature request to our bug tracker so that we can't forget this. And if you become an enterprise customer, I promise you we will immediately attach any information in the meta data to this matrix. (Just to show you why it might be worth becoming an enterprise customer. It makes us jump if you call...)
Greetings,
Sebastian
-
Hi Sebastian,
Thanks very much for your answer. I am quite new to RM (I am exploring some mature DM suites to consider for my business in the future). Can you please tell me how to explicitly remap the label attribute such that the order of its values is known? Thanks.
Best
Dan
-
Hi,
If you have binominal labels, you can simply use the Remap Binominals operator to define which class is negative (= first) and which is positive (= second).
Anyway you can take a look at the meta data view of your example set. In the column Range a list of all possible nominal values is given. This list is in order of internal mapping.
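As a rough illustration of what "order of internal mapping" can mean, here is a small standalone sketch. It is a hypothetical model, not RapidMiner's actual mapping class: the name LabelMappingSketch and its methods are made up, and the rule that a value receives the next free index the first time it is seen is an assumption of the sketch. It shows how pinning the order up front (as an explicit remap does) makes the class order independent of the order in which values happen to occur in the data.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LabelMappingSketch {

    // Insertion-ordered, so iterating the keys yields values in index order.
    private final Map<String, Integer> indexOf = new LinkedHashMap<>();

    /** Pins the order up front, like an explicit remap: the first value gets index 0. */
    static LabelMappingSketch withFixedOrder(String... values) {
        LabelMappingSketch mapping = new LabelMappingSketch();
        for (String value : values) {
            mapping.indexFor(value);
        }
        return mapping;
    }

    /** Returns the index of a value, assigning the next free index on first sight. */
    int indexFor(String value) {
        Integer index = indexOf.get(value);
        if (index == null) {
            index = indexOf.size();
            indexOf.put(value, index);
        }
        return index;
    }

    /** Values in index order, i.e. the order a cost matrix's rows and columns would follow. */
    List<String> valuesInOrder() {
        return new ArrayList<>(indexOf.keySet());
    }
}
```

Under these assumptions, feeding the values "Yes", "No", "Yes" into an empty mapping gives the order (Yes, No), while LabelMappingSketch.withFixedOrder("No", "Yes") keeps (No, Yes) regardless of what the data looks like afterwards.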
Greetings,
Sebastian
-
Sebastian, I have tested your first suggestion and it worked.
However, I am still not quite convinced about your second suggestion. What I observed using several datasets is that when you evaluate a model built with the MetaCost operator, the confusion matrix and the cost matrix always use the same order of classes for their columns. However, this order is not always the same as the order of the values listed for the label attribute in the meta data, as you suggest. I used no remapping in this case.
Best,
Dan