Different Results using DecisionTree in RM4.4 and RM4.5
thorbenkeller
New Altair Community Member
Hi everybody,
I have a problem with a very simple decision tree that does not return the correct result. I suspect a bug(?) in the new version 4.5, since in the older version I always got the correct result.
What I have is 50,000 examples with two classes (A and B) and a single numerical attribute (ranging from 0 to 10.5). Of those 50,000 examples, around 35,000 have class A and 15,000 have class B. All I want to do is find the best threshold to separate those two classes, i.e. train a binary decision tree with depth 2.
The earlier version found the threshold to be around 3.5, which results in an overall classification rate of around 90%. The new RapidMiner version finds the threshold to be around 9.4, resulting in an overall classification rate of around 69%!
I used the exact same process file and did not make any changes. Using GridParameterOptimization I checked around 27,000 parameter combinations, but none of them resulted in a classification rate greater than 69%.
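As a sanity check outside RapidMiner, the best single threshold on one numerical attribute can be found by brute force. Note that always predicting class A would already score about 70% (35,000 of 50,000), so a 69% result is no better than having no split at all. The following is only a minimal Python sketch, not RapidMiner's DecisionTree implementation; the function name `best_threshold` and the synthetic data in the demo are assumptions made up for illustration and are not the real data.

```python
import random


def best_threshold(values, labels):
    """Scan all candidate splits on one numeric attribute.

    Returns (threshold, accuracy, class_predicted_below_threshold) for the
    split with the highest training accuracy. Hypothetical helper, written
    for two-class data only.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("expected exactly two classes")
    first, second = classes

    pairs = sorted(zip(values, labels))
    n = len(pairs)
    total_first = sum(1 for _, y in pairs if y == first)

    best = (None, 0.0, None)
    first_left = 0  # examples of class `first` among the i smallest values
    for i in range(1, n):
        if pairs[i - 1][1] == first:
            first_left += 1
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # cannot split between identical values
        second_left = i - first_left
        first_right = total_first - first_left
        second_right = (n - i) - first_right
        # Try both labelings of the two sides of the split.
        for low_class, correct in (
            (first, first_left + second_right),
            (second, second_left + first_right),
        ):
            acc = correct / n
            if acc > best[1]:
                best = ((lo + hi) / 2, acc, low_class)
    return best


if __name__ == "__main__":
    # Synthetic stand-in data (assumed distributions, NOT the original data).
    random.seed(0)
    values = [random.uniform(0.0, 4.0) for _ in range(35000)]
    values += [random.uniform(3.0, 10.5) for _ in range(15000)]
    labels = ["A"] * 35000 + ["B"] * 15000
    threshold, accuracy, low_class = best_threshold(values, labels)
    print(f"threshold={threshold:.2f} accuracy={accuracy:.3f} below={low_class}")
```

If a scan like this finds a threshold with a clearly higher classification rate than the tree does on the same data, that would point to the learner (or its parameters) rather than the data.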
Does anybody have a similar problem or could give any help?
Thanks for any feedback.
PS: Thank you very much for the development of this great tool, it helped me a lot in my diploma thesis :-)
Answers
Hi Thorben,
I will check that. We did not change anything in the decision tree, but perhaps there is a cross-effect somewhere...
It would help us a lot if you could send a process where this occurs. Since you probably cannot send us the data, it would be great if you could use an ExampleSetGenerator instead.
Greetings,
Sebastian