"Calculate performance only on TRUE cases??"

User: "bobdobbs"
Updated by Jocelyn
Hello,

First off, I want to say thank you for this great software.  I LOVE RapidMiner!!!

On to my question...

We are looking at creating an SVM for detecting positive indications of a medical condition.  
We have training data that is labeled "true" and "false" along with all the features.  (True examples are those where the person has the medical condition.  They represent about 20% of the training data.)

When attempting a grid parameter function or a feature selection function we are seeing a problem with finding an ideal result.

WE DON'T CARE ABOUT THE NEGATIVE OR "FALSE" CASES.  We only care about the accuracy of the "true" cases.  

The problem is that the accuracy performance measure is computed over BOTH classes (true and false.)  For example, if we just predict everything as false, since 80% of our examples are false, then we automatically have 80% accuracy, but ZERO correct predictions for the class we care about.
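
To make the problem concrete, here is a small sketch in plain Python (not a RapidMiner process). The data is made up: 100 examples with 20 labeled true, roughly matching our class balance, and a trivial "model" that predicts false for everything. It assumes the standard definition of accuracy (correct predictions over all examples) and treats "true" as the positive class; the helper names are only for illustration.

```python
# Hypothetical data: 100 examples, 20 labeled True (about 20%, as in our training set).
labels = [True] * 20 + [False] * 80
# A trivial "model" that predicts False for every example.
predictions = [False] * len(labels)


def confusion_counts(labels, predictions):
    """Return (tp, fp, tn, fn), treating True as the positive class."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y and p)
    fp = sum(1 for y, p in pairs if not y and p)
    tn = sum(1 for y, p in pairs if not y and not p)
    fn = sum(1 for y, p in pairs if y and not p)
    return tp, fp, tn, fn


def safe_div(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


tp, fp, tn, fn = confusion_counts(labels, predictions)

accuracy = safe_div(tp + tn, len(labels))   # share of all examples predicted correctly
recall_true = safe_div(tp, tp + fn)         # share of actual True cases that were found
precision_true = safe_div(tp, tp + fp)      # share of True predictions that were right

print("accuracy:", accuracy)
print("recall (true class):", recall_true)
print("precision (true class):", precision_true)
```

The overall accuracy looks respectable while recall and precision for the true class are both zero, which is exactly the kind of score we do not want the optimization to reward.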

*** I guess what we ultimately want to do is train a SINGLE CLASS SVM that is focused on predicting the true class as accurately as possible. ****

So we don't need a performance score based on the aggregate accuracy of the model, but ONLY ON THE ACCURACY OF THE "TRUE" PREDICTIONS.

One thought was to use class weighting in either the SVM or classification performance steps, but how much? and which to use?
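
As a possible starting point for "how much?", one common heuristic (an assumption on my part, not something taken from the RapidMiner documentation) is to weight each class inversely to its frequency, so that the rare class counts as much in total as the common one. A minimal sketch in plain Python, again with made-up data and a hypothetical helper name:

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    This is only a heuristic starting point; the weights would still
    need tuning against whatever performance measure we settle on.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical data: 20 True and 80 False examples.
labels = [True] * 20 + [False] * 80
weights = balanced_class_weights(labels)
ratio = weights[True] / weights[False]

print("weights:", weights)
print("true-to-false weight ratio:", ratio)
```

With a 20/80 split this gives the true class four times the weight of the false class. Whether that ratio is the right one for our data is still an open question for us.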

Another thought was to use some creative application of the meta-cost function, but how would we incorporate that with the libsvm function??

Is this possible in RM?

Any and all ideas would be appreciated.  :)
