"[SOLVED] Probability of relevance in text classification"

User: "wmarella"
New Altair Community Member
Updated by Jocelyn
Hello, does anyone know if there is an operator or setting that will allow me to generate a vector or table of probabilities where each document in the corpus is rated for the probability that it's relevant?

I've trained a Naive Bayes operator on a set of about 1000 short documents to classify them as relevant or not relevant. It works well enough that the AUC is 0.853. I'm wondering if there's a way to have not just two classes but three: definitely relevant, definitely not relevant, and not able to be classified. A human would definitely be able to classify the ones that the machine would put in the third group, so I'm thinking that if I could generate the probability of relevance for each document, I could pull out the ones in the midrange and improve the accuracy on those remaining.

Thanks in advance for any advice

    User: "MariusHelf"
    New Altair Community Member
    Hi, you can use the operator "Drop Uncertain Predictions" for exactly that.

    Best, Marius
    User: "wmarella"
    New Altair Community Member
    OP
    Thanks, Marius, helpful as always!
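
For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of the approach discussed above: take each document's probability of relevance and set aside the midrange cases for a human to review. This is not how "Drop Uncertain Predictions" is implemented. The function name `split_by_confidence`, the thresholds 0.3 and 0.7, and the example scores are all hypothetical and chosen only for illustration.

```python
def split_by_confidence(p_relevant, low=0.3, high=0.7):
    """Split documents into confident predictions and uncertain ones.

    p_relevant: one probability of relevance per document.
    low, high: hypothetical cutoffs; anything strictly between them
    is treated as too uncertain to classify automatically.
    """
    confident = []  # (document index, predicted label)
    uncertain = []  # document indices left for manual review
    for i, p in enumerate(p_relevant):
        if low < p < high:
            uncertain.append(i)
        else:
            label = "relevant" if p >= high else "not relevant"
            confident.append((i, label))
    return confident, uncertain


# Made-up scores, purely to show the call.
scores = [0.95, 0.55, 0.10, 0.42, 0.81]
confident, uncertain = split_by_confidence(scores)
print(confident)
print(uncertain)
```

Widening the band between `low` and `high` should leave fewer but more reliable automatic predictions, at the cost of more documents to label by hand, so the cutoffs are probably best tuned on held-out data.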