Hi again!
I'm trying to use LibSVMLearner with a one-class setup, as I've heard this is a great way to do anomaly detection. However, whenever I use it (such as in the XML below), I get a null pointer exception.
The only change I'm making is setting the learner's svm_type to one-class, and that alone is enough to trigger the NullPointerException. If I leave it as C-SVC, it works. I presume there are some magic numbers I'm not giving it when I turn on one-class, but I don't know what they are. I'm feeding it a nominal true/false label, and all the other attributes are numeric.
The exception is:
java.lang.NullPointerException
at com.rapidminer.operator.learner.functions.kernel.LibSVMModel.performPrediction(LibSVMModel.java:139)
at com.rapidminer.operator.learner.PredictionModel.apply(PredictionModel.java:77)
at com.rapidminer.operator.ModelApplier.apply(ModelApplier.java:84)
at com.rapidminer.operator.Operator.apply(Operator.java:666)
at com.rapidminer.operator.OperatorChain.apply(OperatorChain.java:416)
at com.rapidminer.operator.Operator.apply(Operator.java:666)
at com.rapidminer.Process.run(Process.java:695)
at com.rapidminer.Process.run(Process.java:665)
at com.rapidminer.Process.run(Process.java:655)
at com.rapidminer.gui.ProcessThread.run(ProcessThread.java:61)
and the XML is:
<operator name="Root" class="Process" breakpoints="after" expanded="yes">
    <parameter key="logverbosity" value="status"/>
    <parameter key="logfile" value="/Users/cflewis/Documents/Computing/0809/Spring/Data mining/Final/output.log"/>
    <operator name="Load Training Set" class="CSVExampleSource">
        <parameter key="filename" value="/Users/cflewis/Documents/Computing/0809/Spring/Data mining/Final/data.csv"/>
    </operator>
    <operator name="AttributeSubsetPreprocessing" class="AttributeSubsetPreprocessing" expanded="yes">
        <parameter key="condition_class" value="attribute_name_filter"/>
        <parameter key="attribute_name_regex" value="label"/>
        <operator name="Numerical2Binominal" class="Numerical2Binominal">
        </operator>
    </operator>
    <operator name="ChangeAttributeRole" class="ChangeAttributeRole">
        <parameter key="name" value="label"/>
        <parameter key="target_role" value="label"/>
    </operator>
    <operator name="AttributeFilter" class="AttributeFilter">
        <parameter key="condition_class" value="is_numerical"/>
    </operator>
    <operator name="AbsoluteStratifiedSampling" class="AbsoluteStratifiedSampling">
        <parameter key="sample_size" value="1000"/>
    </operator>
    <operator name="LibSVMLearner" class="LibSVMLearner">
        <parameter key="svm_type" value="one-class"/>
        <list key="class_weights">
        </list>
    </operator>
    <operator name="CSVExampleSource" class="CSVExampleSource">
        <parameter key="filename" value="/Users/cflewis/Documents/Computing/0809/Spring/Data mining/Final/DataminingContest2009.Task1.Test.Inputs"/>
    </operator>
    <operator name="AttributeFilter (2)" class="AttributeFilter">
        <parameter key="condition_class" value="is_numerical"/>
    </operator>
    <operator name="ModelApplier" class="ModelApplier">
        <list key="application_parameters">
        </list>
    </operator>
    <operator name="ExcelExampleSetWriter" class="ExcelExampleSetWriter">
        <parameter key="excel_file" value="/Users/cflewis/Documents/Computing/0809/Spring/Data mining/Final/decisiontree_results.xls"/>
    </operator>
</operator>
The XML may look slightly odd because it loads a training set, trains on it, and then loads a separate test set to apply the model to. Applying the model to the test set is the step that fails (the ModelApplier in the stack trace).
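For reference, this is roughly how I'd expect the learner to look with the one-class parameters spelled out instead of left at their defaults. It's only a sketch: svm_type is the key from my process above, while the kernel_type, gamma and nu keys are my assumption about what LibSVMLearner exposes (they mirror LibSVM's own -t, -g and -n options), the values are placeholders, and I don't know whether setting any of them has a bearing on the exception.
<operator name="LibSVMLearner" class="LibSVMLearner">
    <parameter key="svm_type" value="one-class"/>
    <!-- assumed key: kernel choice, as in LibSVM's -t option -->
    <parameter key="kernel_type" value="rbf"/>
    <!-- assumed key, placeholder value: RBF kernel width, as in LibSVM's -g option -->
    <parameter key="gamma" value="0.1"/>
    <!-- assumed key, placeholder value: upper bound on the fraction of training examples treated as outliers, as in LibSVM's -n option -->
    <parameter key="nu" value="0.1"/>
    <list key="class_weights">
    </list>
</operator>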
I've no idea what I'm doing wrong! Any help is appreciated!