"Performance Operator presents wrong value in the table view and description"

Alexey
New Altair Community Member
Somehow I ran into the following situation: after applying a model trained with a decision tree in a cross-validation, the performance result looks very "strange". Suddenly, in the precision and recall view, the rows and columns are swapped, causing all the precision and recall values to become (100% - x).
I attach pictures of the views. This happened with the latest RapidMiner version downloaded from the website, on my MacBook Pro with Mac OS X 10.10.1.
Accuracy View

Recall View

Description View

Answers
Hi Alexey,
Could you try the following:
1. Reorder Attributes right in front of the Apply Model and the training of the model
2. Use Remap Binominal before the cross-validation
Do you do anything special inside the X-Validation which could change the meta data (Append, Union, ...)?
By the way: are you working on an IACT like HESS, MAGIC or VERITAS?
Best,
Martin
Hey,
I've tried both, reordering the attributes and remapping binominal before the cross-validation. In neither case did the output change.
In the cross-validation I just learn the decision tree, apply the model, select the recall, apply the threshold and then calculate the performance. I suppose what is causing the problem is Sample (bootstrapping). As we have fewer examples from one class, I was trying to use bootstrapping to get a roughly similar amount of both classes and then train the model. This was just an attempt and it didn't really work as well as expected, but never mind. The approach was as follows: get all examples of one class, use Sample (bootstrapping), use Union with the other, unmatched data, and then sample data for training from that unified set. Only when bootstrapping is involved do I get this strange result.
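The bootstrap-and-union workflow described above can be sketched with Python's standard library. This is only an illustrative toy, not the real FACT example sets; the labels and class sizes are assumptions:

```python
import random

random.seed(42)

# Toy stand-in for the real example set; the imbalance mirrors the
# gamma-heavy / proton-light situation at a tiny scale.
gamma = [("gamma", i) for i in range(10)]
proton = [("proton", i) for i in range(4)]

# Bootstrap: sample the minority class with replacement
# up to the size of the majority class.
boot = random.choices(proton, k=len(gamma))

# Union with the untouched majority examples, then this pool
# would be sampled again for training.
balanced = gamma + boot

counts = {}
for label, _ in balanced:
    counts[label] = counts.get(label, 0) + 1
print(counts)  # {'gamma': 10, 'proton': 10}
```

Note that the bootstrap step duplicates minority examples, so any later train/test split on `balanced` can leak copies of the same event into both sides.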
Yes, I'm working with the FACT data, somewhat based on the work of Marius Helf. I'm asking in English, as everything here is in English and maybe someone else will run into the same configuration.
Hi,
The problem is not the sample but the Union. The Union changes the meta data, and then it might be that the labels are switched in their internal representation. You might put a Remap Binominal after the Union and before the X-Validation and map them by hand to the internal positive/negative values.
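The effect of a switched positive/negative mapping can be checked outside RapidMiner. A minimal pure-Python sketch (toy labels, not the real FACT data) shows that precision and recall change when the class treated as "positive" flips, which is the kind of row/column swap seen in the screenshots:

```python
def precision_recall(y_true, y_pred, pos_label):
    """Compute precision and recall treating pos_label as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == pos_label and p == pos_label for t, p in pairs)
    fp = sum(t != pos_label and p == pos_label for t, p in pairs)
    fn = sum(t == pos_label and p != pos_label for t, p in pairs)
    return tp / (tp + fp), tp / (tp + fn)

y_true = ["gamma", "gamma", "gamma", "proton", "proton", "gamma", "proton", "gamma"]
y_pred = ["gamma", "proton", "gamma", "proton", "gamma", "gamma", "proton", "gamma"]

# If "gamma" is the internal positive class:
print(precision_recall(y_true, y_pred, "gamma"))   # (0.8, 0.8)
# If the mapping is switched and "proton" is positive:
print(precision_recall(y_true, y_pred, "proton"))  # (2/3, 2/3)
```

Same predictions, different numbers, so pinning the internal mapping down with Remap Binominal is what makes the reported values comparable.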
If that does not work, try to use the simple Sample operator. There you can use "balance classes" and define the ratios for gamma and proton separately. This should do the trick.
Did you try to use weights? Should be fine for a decision tree.
Ohh, it's FACT :-). I love the project. As you might know, I did my PhD on IceCube but was "a bit" involved in the data analysis of FACT (because of the coffee machine).
Are you at Wolfgang's or Katharina's chair? In the physics department it is always useful to talk with Tim about the problems.
Best,
Martin
Hi Alexey, nice to see that my work is finally being reused. So investing in your scholarship finally pays off.
Good luck with your thesis and happy mining!
~Marius
Martin Schmitz wrote:
> The problem is not the sample but the Union. The Union changes the meta data, and then it might be that the labels are switched in their internal representation. You might put a Remap Binominal after the Union and before the X-Validation and map them by hand to the internal positive/negative values.

I've tried this trick, but it doesn't solve the problem.

Martin Schmitz wrote:
> If that does not work, try to use the simple Sample operator. There you can use "balance classes" and define the ratios for gamma and proton separately. This should do the trick.
> Did you try to use weights? Should be fine for a decision tree.

I've already used the "normal" sampling. Bootstrapping was just an idea for how to get a similar amount of proton data. There are about 100k gamma examples and 40k proton examples. I was thinking of using more examples while still keeping gamma and proton at the same level (50/50), but this seems to lead to some problems, and I'm still not sure why.

Martin Schmitz wrote:
> Ohh, it's FACT :-). I love the project. As you might know, I did my PhD on IceCube but was "a bit" involved in the data analysis of FACT (because of the coffee machine).
> Are you at Wolfgang's or Katharina's chair? In the physics department it is always useful to talk with Tim about the problems.

I'm new to this project, but I've heard of it before and find it pretty amazing.

Marius wrote:
> Hi Alexey, nice to see that my work is finally being reused. So investing in your scholarship finally pays off.

I'm not the first reusing your work! The scholarship was definitely a really good help, though I wasn't able to take on an internship or anything similar. This is still not my thesis, just work. But who knows how it all ends up!
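For the 50/50 balance mentioned above, plain downsampling of the majority class (without replacement) is the usual alternative to bootstrapping. A standard-library sketch using the stated 100k gamma / 40k proton class sizes (toy integer "events", not the real data):

```python
import random

random.seed(0)

n_gamma, n_proton = 100_000, 40_000
gamma = list(range(n_gamma))
proton = list(range(n_proton))

# Downsample gamma without replacement to match the proton count,
# so both classes contribute equally to the training pool and no
# event is duplicated.
gamma_sampled = random.sample(gamma, k=n_proton)
train_pool = [("gamma", x) for x in gamma_sampled] + [("proton", x) for x in proton]

print(len(train_pool))  # 80000
```

This throws away gamma statistics, but avoids the duplicated-example pitfalls of sampling with replacement.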