Hello everybody,
I am working on my master's thesis, which deals with data mining, so I have started learning RapidMiner 5.0.
I am running into a lot of problems, though, so I hope to get some help in this forum.
My task is to use decision trees to predict quantitative values, so I need trees that can handle numerical labels, i.e. regression trees.
Although RapidMiner 5.0 provides many decision tree operators that are described as regression trees, they cannot handle numerical labels, so I am a little confused.
Here is an excerpt of the data to be analyzed:
input1  input2  input3  input4  input5  input6  label
0,0050  0,0413  0,0610  0,01    0,01    0,01    0,120
0,0050  0,0413  0,0610  0,01    0,01    0,01    0,121
0,0050  0,0413  0,0610  0,01    0,01    0,01    0,127
0,0037  0,0467  0,0913  0,01    0,01    0,01    0,099
0,0037  0,0467  0,0913  0,01    0,01    0,01    0,094
0,0037  0,0467  0,0913  0,01    0,01    0,01    0,127
0,0030  0,0363  0,0600  0,01    0,01    0,01    0,097
0,0030  0,0363  0,0600  0,01    0,01    0,01    0,101
0,0030  0,0363  0,0600  0,01    0,01    0,01    0,087
0,0030  0,0370  0,0593  0,01    0,01    0,01    0,038
0,0030  0,0370  0,0593  0,01    0,01    0,01    0,058
0,0030  0,0370  0,0593  0,01    0,01    0,01    0,038
0,0197  0,3550  0,8407  0,03    0,14    0,056   0,100
0,0197  0,3550  0,8407  0,03    0,14    0,056   0,096
Sorry for the bad layout.
The description of the decision tree operator I intended to use says the following:
"This operator learns decision trees from both nominal and numerical data. Decision trees are powerful classification methods which often can also easily be understood. This decision tree learner works similar to Quinlan's C4.5 or CART.
The actual type of the tree is determined by the criterion, e.g. using gain_ratio or Gini for CART / C4.5."
So this decision tree is supposed to work similarly to CART (Classification and Regression Trees), yet it cannot handle a numerical label.
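To make clear what I mean by a regression tree, here is a minimal Python sketch of the least-squares split criterion that, as far as I understand, CART uses for numerical labels instead of gain_ratio or Gini. It is my own illustration and not RapidMiner code; the function names sse and best_split are made up, and it only looks at input 1 and the label from the excerpt above (decimal commas written as points).

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the binary split x <= threshold
    that minimizes the summed squared error of the two sides."""
    best = None
    distinct = sorted(set(xs))
    # Candidate thresholds: midpoints between neighboring distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        total = sse(left) + sse(right)
        if best is None or total < best[1]:
            best = (threshold, total)
    return best


# input 1 and label columns from the excerpt
input1 = [0.0050, 0.0050, 0.0050, 0.0037, 0.0037, 0.0037, 0.0030,
          0.0030, 0.0030, 0.0030, 0.0030, 0.0030, 0.0197, 0.0197]
label = [0.120, 0.121, 0.127, 0.099, 0.094, 0.127, 0.097,
         0.101, 0.087, 0.038, 0.058, 0.038, 0.100, 0.096]

threshold, error = best_split(input1, label)
print("best threshold on input 1:", threshold)
print("summed squared error after the split:", error)
```

On this excerpt the best split on input 1 falls between 0,0030 and 0,0037. A full regression tree would repeat this search over all inputs and recurse into both sides, predicting the mean label in each leaf. This is the behavior I am looking for in RapidMiner.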
I hope you can help me.
Thank you.