Unevenly distributed binominal data

User: "MuehliMan"
New Altair Community Member
Updated by Jocelyn
Dear RM community,

I have a problem handling my dataset. I am trying to build a random forest model with a binominal label. The only problem is that the dataset contains 50 positives and 200 negatives. If all examples are predicted as false, the accuracy is still quite OK (80%). And this is exactly what happens: most of the models I get predict most examples as false.

So my question is how to handle unevenly distributed datasets. Is there, for example, a way to weight correct positives more than correct negatives? A correctly predicted positive should then be 200/50 (i.e. 4) times more valuable.

Cheers,
Markus
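
To make the arithmetic in the question concrete, here is a minimal sketch in plain Python (not RapidMiner). It assumes a trivial model that predicts everything as false, and the helper name `weighted_accuracy` is hypothetical, made up for illustration only. It shows how the 80% accuracy arises from the 50/200 split, and how a 200/50 weight on the positives would change the score for that same model.

```python
# Label counts taken from the post: 50 positives, 200 negatives.
n_pos, n_neg = 50, 200
labels = [True] * n_pos + [False] * n_neg

# Assumption: a trivial model that predicts every example as negative.
predictions = [False] * len(labels)


def weighted_accuracy(labels, predictions, pos_weight=1.0, neg_weight=1.0):
    """Share of the total example weight that was predicted correctly.

    Hypothetical helper for illustration; with both weights at 1.0
    this is ordinary accuracy.
    """
    correct = 0.0
    total = 0.0
    for y, p in zip(labels, predictions):
        w = pos_weight if y else neg_weight
        total += w
        if y == p:
            correct += w
    return correct / total


# Ordinary accuracy: 200 correct out of 250 examples.
plain = weighted_accuracy(labels, predictions)

# Each positive counts 200/50 = 4 times as much as a negative,
# so both classes carry the same total weight (200 each).
balanced = weighted_accuracy(labels, predictions, pos_weight=n_neg / n_pos)

print(plain, balanced)
```

With these weights the all-false model drops from 0.8 to 0.5, so it no longer looks better than guessing. Whether the weights are applied in the performance measure, in the learner, or by resampling the training data is a separate design choice that this sketch does not settle.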
