Finding an incorrect grading pattern

New Altair Community Member

Oct 22, 2016

Updated Nov 5, 2024 by Jocelyn

I was given a labelled data set and told that a few of the labels are wrongly assigned, i.e. some of the data were graded inaccurately. I'm supposed to find which ones. Which tool in RapidMiner should I use?

I tried the operator Find Outliers (Density), but somehow I feel that is not the one I'm looking for.

Thank you very much for any advice. Markéta


IngoRM

New Altair Community Member

Oct 24, 2016

Here is an idea: you could train a model on the data set that generalizes well (no overfitting, no k-NN with only 1 neighbor, you get the idea...) and then apply this model to the training data set again. Whenever the prediction differs from the label, that example is a good candidate for being wrongly labeled.

Just my 2c,

Ingo
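Outside of RapidMiner, the idea above could be sketched roughly as follows. This is only an illustration under assumptions: the toy data, the choice of k-NN with k=5 as the "generalizing" model, and the helper names knn_predict and suspicious_labels are all made up for the example and are not from the original post.

```python
from collections import Counter
import math

def knn_predict(train, query, k):
    """Majority vote among the k training rows closest to `query`."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def suspicious_labels(rows, k=5):
    """Indices of rows whose given label disagrees with the model's prediction.

    The model is applied to the same rows it was built from. With k > 1 a
    single wrong label is outvoted by its neighbors instead of being memorized.
    """
    return [i for i, (features, label) in enumerate(rows)
            if knn_predict(rows, features, k) != label]

# Toy data (hypothetical): two well-separated groups, row 3 is labeled
# against the group it sits in.
rows = [
    ((1.0, 1.1), "A"), ((0.9, 1.0), "A"), ((1.2, 0.8), "A"),
    ((1.1, 0.9), "B"),  # lies among the "A" rows
    ((1.0, 0.9), "A"),
    ((5.0, 5.1), "B"), ((5.2, 4.9), "B"), ((4.8, 5.0), "B"),
    ((5.1, 5.2), "B"), ((4.9, 4.8), "B"),
]

print(suspicious_labels(rows))
```

The flagged rows are only candidates. A mismatch can also mean the model is wrong, so each one still needs a manual check.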

Telcontar120

New Altair Community Member

Oct 25, 2016

Another potential approach would be to run a clustering analysis on each labeled class separately and then look for individual outliers within each class.
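A very reduced sketch of the per-class idea, for illustration only: instead of a full clustering algorithm it uses a single robust center per class (the component-wise median) and flags rows that lie unusually far from it. The function name class_outliers, the factor of 3 and the toy data are assumptions made for this example; a real analysis would likely use several clusters per class.

```python
import math
from collections import defaultdict
from statistics import median

def class_outliers(rows, factor=3.0):
    """Flag rows that lie far from the center of their own labeled class.

    Simplification: one center per class (component-wise median) stands in
    for a real clustering. A row is flagged when its distance to that center
    exceeds `factor` times the median distance within the class.
    """
    by_class = defaultdict(list)
    for i, (features, label) in enumerate(rows):
        by_class[label].append((i, features))

    flagged = []
    for members in by_class.values():
        dims = len(members[0][1])
        center = [median(f[d] for _, f in members) for d in range(dims)]
        dists = [(i, math.dist(f, center)) for i, f in members]
        typical = median(d for _, d in dists)
        flagged += [i for i, d in dists if d > factor * typical]
    return sorted(flagged)

# Toy data (hypothetical): row 3 carries label "B" but sits with the "A" rows.
rows = [
    ((1.0, 1.1), "A"), ((0.9, 1.0), "A"), ((1.2, 0.8), "A"),
    ((1.1, 0.9), "B"),
    ((1.0, 0.9), "A"),
    ((5.0, 5.1), "B"), ((5.2, 4.9), "B"), ((4.8, 5.0), "B"),
    ((5.1, 5.2), "B"), ((4.9, 4.8), "B"),
]

print(class_outliers(rows))
```

Using medians rather than means keeps a single badly placed row from dragging the class center toward itself.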
