Deep Learning - Test results Confidence values

Predictions for test data come back with a prediction (0 or 1 in my case) and a confidence value (float) for both:
confidence(0), confidence(1).

To get an overall confidence, in the past I have created a new attribute and set it to abs(conf0 - conf1).
When I do this, I'm noticing very small numbers for the "1" predictions: the values are always below .27.
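
For reference, the margin described above can be sketched in a few lines of Python. This is only an illustration: it assumes the two confidences sum to 1 (an assumption, not something stated in the post), and the function names are made up for the example.

```python
def margin(conf0: float, conf1: float) -> float:
    """Absolute difference between the two class confidences."""
    return abs(conf0 - conf1)


def conf1_from_margin(m: float) -> float:
    """Recover confidence(1) for a row predicted as 1.

    Assumes conf0 + conf1 == 1, so that margin == 2 * conf1 - 1.
    """
    return (1.0 + m) / 2.0


# Under that assumption, a margin of .27 on a "1" prediction
# corresponds to confidence(1) of about .635.
print(round(conf1_from_margin(0.27), 3))
```

Under that assumption, a margin just below .27 means the model never gives label 1 much more than about 63% confidence, which matches the roughly 60%/40% split mentioned further down.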

These seem very low given the predictions are coming back as expected.
The confidence for the 0 label can be very high.

The only thing I can think of is that the dataset is highly imbalanced and has far more 0 labels than 1 labels.
Is this the reason the confidence values are coming back so low for the minority class? Would more data provide better confidence?

My ultimate goal was to "act" on all predictions above a certain confidence, but this is perhaps showing me that I cannot use a single value for both predictions (0 and 1) and that I might need to use 2 different confidence values as "trusted".

I trust 0 above .8
I trust 1 above .25 <-- this just seems very low to me even though the results look good.

(Or I artificially bump up the confidence of the minority class so the values seem more normal.)

As it is, in the best case, I'd be trusting something near a 60%/40% confidence split, which isn't much better than flipping a coin, i.e., 50%/50%.

So I'm wondering how the confidence values are generated and how I should be interpreting them in terms of what minimum values can be "trusted" and would be considered "actionable".

Thanks.

Accepted answers

lionelderkrikor

Hi @fstarsinic,

It seems that the Rescale Confidences operator is used in the training part of the process.
To see how and where this operator is used, run an Auto-Model classification process (for example with the "Titanic" dataset).
When the results are displayed in the final screen (the "results" screen), click on "OPEN PROCESS" and you will see the process.
Then go to Train Model --> Optimize ? --> inside this subprocess operator you will see the Rescale Confidences operator just after the modelling step:

Image: https://us.v-cdn.net/6030995/uploads/editor/18/jgq614945ry9.png

Hope this helps,

Regards,

Lionel

All comments

jacobcybulski

When the model favours one class over another, it is a sign of bias. There can be lots of reasons for a biased model, e.g. (1) your data may be heavily unbalanced, so during training the model sees one class much more than the other, (2) your system is too simple for the data, so that your deep model does not have enough redundancy to accommodate it, (3) a similar issue is that your model underfits the data, so it needs more training, (4) finally, it is possible that your training sample and validation sample are very different.

The confidence values are not necessarily 50-50; they reflect the nature of your data. "Bumping" your confidence up may be valid as long as the model is unbiased.

It is best to get the best model performance first. Watch the training performance vs. the validation performance: the training performance to see that the model is still learning, and the validation performance to see at what point the model overfits your training data. If your data or the model are massive, it may be too much to ask for cross-validation, but at least you could check whether the distributions of the two partitions indicate that they come from the same population. After the model is all good, you could adjust the classification threshold to improve some performance indicators.
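
A minimal sketch of that last step, adjusting the classification threshold, might look like the following in plain Python. The confidence values and labels are toy numbers invented purely for illustration, and `precision_recall` is a hypothetical helper, not a RapidMiner function.

```python
def precision_recall(conf1, labels, threshold):
    """Precision and recall for class 1 when predicting 1 if conf1 >= threshold."""
    predicted = [c >= threshold for c in conf1]
    tp = sum(p and y == 1 for p, y in zip(predicted, labels))
    fp = sum(p and y == 0 for p, y in zip(predicted, labels))
    fn = sum((not p) and y == 1 for p, y in zip(predicted, labels))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy, made-up confidence(1) values and true labels (illustration only).
conf1 = [0.05, 0.10, 0.20, 0.35, 0.45, 0.55, 0.60, 0.62]
labels = [0, 0, 0, 0, 1, 1, 0, 1]

# Sweep a few candidate thresholds and compare the indicators.
for t in (0.3, 0.4, 0.5, 0.6):
    p, r = precision_recall(conf1, labels, t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")
```

On an imbalanced dataset, the threshold that gives the best trade-off for the minority class may sit well below 0.5, which is one reason the "1" confidences can look low while the predictions are still useful.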

Jacob

fstarsinic

I've decided not to adjust anything in the model, but to use a separate threshold for each label to decide what is flagged as an "actionable" prediction. That way I can use 2 different target values for the 2 different labels.
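
As a rough sketch of this per-label approach, assuming the thresholds apply to the abs(conf0 - conf1) value and using the .8 and .25 figures from the original question as example values (the names `MIN_MARGIN` and `is_actionable` are made up for illustration):

```python
# Per-label minimum margins; example values taken from the question above.
MIN_MARGIN = {0: 0.8, 1: 0.25}


def is_actionable(conf0: float, conf1: float, thresholds=MIN_MARGIN) -> bool:
    """Return True when the predicted label's margin clears that label's threshold."""
    predicted = 1 if conf1 > conf0 else 0
    return abs(conf0 - conf1) > thresholds[predicted]


# A confident "0" is actionable; a lukewarm "0" is not.
print(is_actionable(0.95, 0.05), is_actionable(0.85, 0.15))
# A "1" only needs to clear the lower threshold.
print(is_actionable(0.37, 0.63), is_actionable(0.40, 0.60))
```

Keeping the thresholds in one small mapping makes it easy to revisit them later without touching the model itself.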

Telcontar120

I understand you said that you don't necessarily want to make any adjustments, but for this type of problem in the future, you might want to check out the Rescale Confidences operator as well as the Drop Uncertain Predictions operator.

fstarsinic

Interesting, thank you. That looks very promising. I'll try those now.

fstarsinic

I'm looking at the Rescale Confidences operator. How and where does this operator fit into a process? Before the model is created, or after? And does it need to be used in training only, in testing only, or in both?

lionelderkrikor

Hope this helps,

Regards,

Lionel

sgenzer

Ah, thank you @lionelderkrikor.

lionelderkrikor

You're welcome, Scott !

Regards,

Lionel