Got 100% accuracy, precision, and recall
fatimidveil
Hi everyone. My data set consists of 1150 entities, and I have one attribute that is highly correlated with my class attribute. I got 100% accuracy, precision, and recall from my algorithm.
Also, my decision tree uses only that one attribute that is highly correlated with my class attribute.
What should I do now?
I have applied three algorithms to my data set: ID3, CART, and C4.5.
How do I determine which one performs better than the others?
Find more posts tagged with: AI Studio, Performance
All comments
lionelderkrikor
Hi @fatimidveil,
Have you tried submitting your data to Auto Model? It can be a good starting point...
Regards,
Lionel
fatimidveil
No, I'm working on my thesis and I collected the data myself.
varunm1
Hello
@fatimidveil
You can try using the Auto Model option in RapidMiner, as mentioned by
@lionelderkrikor
. If you want to build the model yourself, use cross-validation with 5 folds and see how the model's performance varies. Within cross-validation, you can use feature selection and an optimal hyperparameter search for better model performance.
There is nothing wrong with having a single attribute in the tree. One reason for this is pruning, which removes attributes that don't provide much information.
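For readers who want to reproduce the cross-validation comparison outside RapidMiner, here is a minimal sketch using scikit-learn as a stand-in. RapidMiner's ID3 and C4.5 operators have no exact scikit-learn equivalent, so the `entropy` criterion approximates ID3/C4.5 splitting and `gini` approximates CART; the data set here is synthetic, not the thesis data.

```python
# 5-fold cross-validation comparing decision-tree split criteria.
# scikit-learn analogues: entropy ~ ID3/C4.5, gini ~ CART (approximation).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for the 1150-row thesis data set.
X, y = make_classification(n_samples=1150, n_features=10, random_state=42)

for criterion in ("entropy", "gini"):
    clf = DecisionTreeClassifier(criterion=criterion, random_state=42)
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    print(f"{criterion}: mean={scores.mean():.3f} +/- {scores.std():.3f}")
```

Comparing the mean and spread of the fold scores, rather than a single train/test split, is what lets you say one algorithm performs better than another.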
fatimidveil
Yes, I performed cross-validation with 10 folds, and I also applied feature selection techniques such as Weight by Information Gain, Gini Index, Chi-Squared Statistic, and Weight by Information Gain Ratio.
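For those following along in Python, a rough analogue of the attribute-weighting step is sketched below. The scikit-learn functions `mutual_info_classif` and `chi2` are stand-ins for RapidMiner's Weight by Information Gain and Chi-Squared Statistic operators, not the exact implementations, and the data is synthetic.

```python
# Attribute weighting: information-gain-style and chi-squared scores
# per feature (scikit-learn analogues of the RapidMiner operators).
from sklearn.datasets import make_classification
from sklearn.feature_selection import chi2, mutual_info_classif

X, y = make_classification(n_samples=1150, n_features=5, random_state=0)
X_pos = X - X.min(axis=0)  # chi2 requires non-negative feature values

info_gain = mutual_info_classif(X, y, random_state=0)
chi2_scores, _ = chi2(X_pos, y)

for i, (ig, c2) in enumerate(zip(info_gain, chi2_scores)):
    print(f"feature {i}: info_gain={ig:.3f}, chi2={c2:.1f}")
```

A single attribute whose weight dwarfs all the others is exactly the one to inspect for label leakage.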
fatimidveil
I ran Auto Model as well and got the same tree as before, with the one attribute.
varunm1
Great. You can also look at the general relationship between the highly correlated attribute and the outcome variable. If that relationship is acceptable in your domain, then you are fine. What do I mean by relationship? For example, suppose you have a data set of sports results and you are trying to predict a win or loss for a team. In this data set there is a predictor column named "Winning percentage" with values ranging from 0 to 100. Let's assume the output "Outcome" attribute is labeled based on this winning-percentage column (if winning percentage >= 50 then Win, and if winning percentage < 50 then Loss). In that case the algorithm can predict with very high accuracy, because there is a direct relationship between "Winning percentage" and "Outcome". These sorts of general checks should be performed whenever you have very high accuracy together with a highly correlated attribute.
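The check described above can be sketched in code: a synthetic data set where one column directly defines the label reproduces the near-perfect accuracy, and dropping that column brings accuracy back toward chance. All names and data here are illustrative, not from the thread's data set.

```python
# Label-leakage check: compare cross-validated accuracy with and
# without the attribute suspected of defining the label.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
win_pct = rng.uniform(0, 100, size=1150)   # "Winning percentage" column
noise = rng.normal(size=(1150, 3))         # unrelated attributes
y = (win_pct >= 50).astype(int)            # label defined by that column

X_all = np.column_stack([win_pct, noise])  # includes the leaky attribute
X_clean = noise                            # suspect attribute removed

leaky = cross_val_score(DecisionTreeClassifier(random_state=0), X_all, y, cv=5).mean()
clean = cross_val_score(DecisionTreeClassifier(random_state=0), X_clean, y, cv=5).mean()
print(f"with leaky attribute: {leaky:.3f}, without: {clean:.3f}")
```

A large accuracy drop when the suspect attribute is removed is strong evidence that the 100% result comes from leakage rather than genuine predictive power.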
fatimidveil
Thank you so much for your support. My variable is not that highly correlated; if I discard that variable, my tree seems fine to me.