"Final prediction in bagging algorithm"

User: "adrian_crouch"
New Altair Community Member
Updated by Jocelyn
Hello RM community,

I'm not sure whether I'm mistaken, but I always thought that the bagging meta-algorithm should select the final prediction by majority vote (in classification). If the numeric confidences generated by the individual models for each label value are averaged at the same time, the final confidence may not map directly to the final prediction.

Let's say three models are aggregated and, for a given example in a binominal classification, they predict confidences of 0.4, 0.4 and 0.9 for class 'A' and 0.6, 0.6 and 0.1, respectively, for class 'B'. Averaging these confidences gives class 'A' a confidence of 0.567 and class 'B' 0.433. With majority voting, however, I would expect 'B' as the final predicted class, since two of the three models predicted it while only one predicted 'A'.

This does not match the implementation in BaggingModel (version 5.3.008). There, the label value with the highest averaged confidence is chosen as the final prediction, which for the example above is 'A' because of its higher confidence of 0.567.
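To make the discrepancy concrete, here is a minimal Python sketch of the two aggregation rules applied to the three models above. It is not RapidMiner code: the function names `majority_vote` and `average_confidences` are made up for illustration, and the only data used are the confidences from the example.

```python
from collections import Counter

# Per-model confidences from the example above (one dict per model).
confidences = [
    {"A": 0.4, "B": 0.6},
    {"A": 0.4, "B": 0.6},
    {"A": 0.9, "B": 0.1},
]

def majority_vote(confs):
    """Each model votes for its most confident label; the most frequent label wins."""
    votes = Counter(max(c, key=c.get) for c in confs)
    return votes.most_common(1)[0][0]

def average_confidences(confs):
    """Unweighted mean confidence per label across all models."""
    labels = confs[0].keys()
    return {label: sum(c[label] for c in confs) / len(confs) for label in labels}

avg = average_confidences(confidences)
by_average = max(avg, key=avg.get)   # label with the highest averaged confidence
by_vote = majority_vote(confidences)  # label predicted by most models

print("averaged confidences:", {k: round(v, 3) for k, v in avg.items()})
print("prediction by highest average:", by_average)
print("prediction by majority vote:", by_vote)
```

The two rules disagree here: the highest averaged confidence selects 'A', while the majority vote selects 'B'.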

Could someone tell me whether I've made a mistake in my thinking here?
Many thanks,
Adrian

    Hi Adrian,

    It simply comes down to a weighted or an unweighted average. I think both are useful. Breiman's original RF implementation used the unweighted one.

    ~Martin
    User: "adrian_crouch"
    New Altair Community Member
    OP
    You may be right: if the confidences were averaged and then multiplied by a weight derived from the number of times a label value was predicted, my assumption would hold. But looking into BaggingModel's implementation, I can't find anything that deals with weights in this context (so it's no wonder the result does not match my expectation).
    So I don't quite get the point. Am I misinterpreting something, or is this indeed a bug in the bagging implementation?
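    One possible reading of the weighting described above, as a rough Python sketch: each label's averaged confidence is multiplied by the number of models that predicted that label. This is only an assumption about what "weighted" could mean here. It is not taken from BaggingModel, and the function name `vote_weighted_scores` is hypothetical.

```python
from collections import Counter

# Per-model confidences from the original example (one dict per model).
confidences = [
    {"A": 0.4, "B": 0.6},
    {"A": 0.4, "B": 0.6},
    {"A": 0.9, "B": 0.1},
]

def vote_weighted_scores(confs):
    """Averaged confidence per label, scaled by how many models predicted that label.

    Hypothetical weighting scheme for illustration only.
    """
    votes = Counter(max(c, key=c.get) for c in confs)
    labels = confs[0].keys()
    return {
        label: (sum(c[label] for c in confs) / len(confs)) * votes[label]
        for label in labels
    }

scores = vote_weighted_scores(confidences)
weighted_prediction = max(scores, key=scores.get)

print("vote-weighted scores:", {k: round(v, 3) for k, v in scores.items()})
print("prediction:", weighted_prediction)
```

    Under this assumed weighting, 'B' scores higher than 'A' (about 0.867 versus 0.567), so the prediction agrees with the majority vote. Note that the scores are no longer confidences that sum to 1.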