"TreeModel prediction confidence failure (when generated by Decision Tree)"

mtgreen · January 2010

Hi,

I've noticed something funny when I'm using a tree model generated by the DecisionTree operator. In addition to using the predicted class value, I'm using the actual underlying confidences. I'm applying a generated TreeModel to a test ExampleSet of about 100,000 examples. Occasionally, a test example reaches a node in the tree model that is not a leaf yet has no matching edge for the example. According to the comments, I believe the intention is to simply classify the example according to the majority class represented by that node. However, only leaf nodes maintain an accurate Map Counter, even though the method expects all nodes to have one. Currently, by default, the TreeModel classifies the example as "0" and returns an unknown value ("?") for the confidence of each possible class. I'm assuming that this is either a well-known issue or my mistaken interpretation of the code.
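
To make the symptom concrete, here is a toy sketch of deriving per-class confidences from a node's class counts. It is not the actual TreeModel code: the class and method names are hypothetical, and I'm assuming a confidence is simply the class count divided by the total count at the node. With an empty map there is nothing to normalize, so no confidence can be reported.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of RapidMiner.
class ConfidenceSketch {

    // Returns count / total for each class, or an empty map when the node has no counts.
    static Map<String, Double> confidences(Map<String, Integer> counterMap) {
        int total = 0;
        for (int count : counterMap.values()) {
            total += count;
        }

        Map<String, Double> result = new HashMap<>();
        if (total == 0) {
            // This is the situation at an inner node whose counts were never filled in.
            return result;
        }
        for (Map.Entry<String, Integer> entry : counterMap.entrySet()) {
            result.put(entry.getKey(), entry.getValue() / (double) total);
        }
        return result;
    }
}
```

So an inner node whose counts were never populated has no basis for either a majority class or a confidence.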

If it's a real issue, I'm happy to submit an updated TreeModel which can recursively generate an accurate Map Counter (if one does not already exist) in these types of cases. Can someone comment to let me know if I'm missing something? Thanks!! I love this product.

~Michael

mtgreen · January 2010

I think I was able to correct the problem by making a slight modification to the Tree class. I changed the getCounterMap function to fetch counter statistics for child nodes (if it has not done so already). Here is the code that I added:

    public Map<String, Integer> getCounterMap() {
        if (!isLeaf()) {
            // Fill counterMap if it is currently empty
            if (counterMap.isEmpty()) {
                for (Edge edge : children) {
                    Map<String, Integer> childCounterMap = edge.getChild().getCounterMap();
                    Iterator<String> s = childCounterMap.keySet().iterator();
                    while (s.hasNext()) {
                        String className = s.next();
                        Integer parentCount = counterMap.get(className);
                        int newCount = childCounterMap.get(className);
                        if (parentCount != null) {
                            newCount += parentCount;
                        }
                        counterMap.put(className, newCount);
                    }
                }
            }
        }

        return counterMap;
    }

Elsewhere in the Tree class, I changed the functions which read the counter map (not the modifiers, just the accessors) to call getCounterMap instead of reading counterMap directly.
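
For anyone who wants to try the idea outside RapidMiner, here is a minimal, self-contained sketch of the same lazy aggregation. The Node class and its methods are hypothetical stand-ins, not the actual Tree and Edge classes, and the sketch assumes a node's counts are not modified after they are first read.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the Tree class; not the RapidMiner implementation.
class Node {
    private final Map<String, Integer> counterMap = new HashMap<>();
    private final List<Node> children = new ArrayList<>();

    boolean isLeaf() {
        return children.isEmpty();
    }

    void addChild(Node child) {
        children.add(child);
    }

    void addCount(String className, int count) {
        counterMap.merge(className, count, Integer::sum);
    }

    // An inner node with an empty map fills it once by summing its children's counts, recursively.
    Map<String, Integer> getCounterMap() {
        if (!isLeaf() && counterMap.isEmpty()) {
            for (Node child : children) {
                for (Map.Entry<String, Integer> entry : child.getCounterMap().entrySet()) {
                    counterMap.merge(entry.getKey(), entry.getValue(), Integer::sum);
                }
            }
        }
        return counterMap;
    }

    // Majority class at this node, or null if no counts are available.
    String majorityClass() {
        String best = null;
        int bestCount = -1;
        for (Map.Entry<String, Integer> entry : getCounterMap().entrySet()) {
            if (entry.getValue() > bestCount) {
                best = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return best;
    }
}
```

With this, an example that gets stuck at an inner node can still be assigned that node's majority class, since the inner node's counts are the sum of the counts in the leaves below it.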

I'm interested to know if this modification would be helpful to include in future releases. Let me know. Thanks!

~Michael

land · February 2010

Hi Michael,
Every improvement is welcome.

It would be very kind if you could send us a patch file created from the repository. That would simplify my life a lot, and I will then include it.

Greetings,
Sebastian
