
Binary text classification - Help in process needed.

User: "thiemo"
New Altair Community Member
Updated by Jocelyn

Hey guys,

 

We want to do binary classification on a text data set with a class distribution of 80% negative and 20% positive. To get statistically reliable performance estimates, we want to use 10-fold cross-validation.

 

When we model this in RapidMiner, it does not work: the process doesn't output any performance metrics (precision, recall, etc.):

 

[Screenshots: Bildschirmfoto 2016-12-01 um 12.14.37.png, Bildschirmfoto 2016-12-01 um 12.15.34.png]

 

 

We found a workaround that works, but it doesn't make sense from an ML perspective: if we first split the data into training and test sets and then use 10-fold cross-validation, it works. But the training/test split should be part of the cross-validation itself (9 training folds, 1 test fold, 10 iterations). So right now the only way to get this working is to FIRST split into test and training sets and THEN use X-Validation. Did we model it the right way, or did we miss anything?
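To make clear what we expect from cross-validation, here is a minimal sketch in plain Python (not RapidMiner). The toy keyword classifier, the example texts, and all function names are hypothetical and only for illustration. The point is that every iteration trains on 9 folds and tests on the remaining one, so no separate training/test split is needed beforehand, and precision and recall are pooled over all 10 test folds.

```python
import random
from collections import Counter


def stratified_folds(labels, k=10, seed=0):
    """Assign every example index to one of k folds, keeping the class ratio per fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal indices round-robin into the folds
    return folds


def train(texts, labels):
    """Toy model (hypothetical): words found in more positive than negative texts."""
    pos, neg = Counter(), Counter()
    for text, y in zip(texts, labels):
        (pos if y == 1 else neg).update(set(text.lower().split()))
    return {w for w in pos if pos[w] > neg[w]}


def predict(model, text):
    """Predict positive (1) if the text contains any learned positive word."""
    return 1 if any(w in model for w in text.lower().split()) else 0


def cross_validate(texts, labels, k=10, seed=0):
    """Run k-fold CV: each fold is the test set once, the other k-1 are training."""
    folds = stratified_folds(labels, k, seed)
    tp = fp = fn = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(texts)) if i not in held_out]
        model = train([texts[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        for i in test_idx:
            pred = predict(model, texts[i])
            tp += int(pred == 1 and labels[i] == 1)
            fp += int(pred == 1 and labels[i] == 0)
            fn += int(pred == 0 and labels[i] == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, folds


# Made-up data with the 80% negative / 20% positive distribution from our case.
texts = ["regular status update"] * 80 + ["urgent refund request"] * 20
labels = [0] * 80 + [1] * 20

precision, recall, folds = cross_validate(texts, labels, k=10)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

With stratified folds, each test fold keeps roughly the same 80/20 class ratio, which matters for an imbalanced data set like ours.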

 

[Screenshots: Bildschirmfoto 2016-12-01 um 12.14.37.png, Bildschirmfoto 2016-12-01 um 12.15.01.png, Bildschirmfoto 2016-12-01 um 12.15.34.png]

 

 

If you need any more information to help us, just comment.

Thank you very much in advance.

 

Best regards!
