"SVM Models - Target Variable with more than two levels"
HamsterDR
New Altair Community Member
I have a data mining problem in which the target variable has four levels. I have used an SVM model in Statistica that works very well for my data and supports the four-level target variable. I am just starting out with RapidMiner, and it looks like all the SVM models in RapidMiner only support binary target variables. Is that the case? I think the libSVM implementation supports more than two levels (that is what Statistica uses), but the description of this SVM implementation in RapidMiner still seems to say that it only supports binary target variables. If this capability is not available now, is it planned for the future?
David
Answers
Hello,
LibSVM works fine with labels that have more than two nominal values.
Here's an example using the Iris data set:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.3.007">
<context>
<input/>
<output/>
<macros/>
</context>
<operator activated="true" class="process" compatibility="5.3.007" expanded="true" name="Process">
<process expanded="true">
<operator activated="true" class="retrieve" compatibility="5.3.007" expanded="true" height="60" name="Retrieve Iris" width="90" x="112" y="75">
<parameter key="repository_entry" value="//Samples/data/Iris"/>
</operator>
<operator activated="true" class="x_validation" compatibility="5.3.007" expanded="true" height="112" name="Validation" width="90" x="313" y="75">
<process expanded="true">
<operator activated="true" class="support_vector_machine_libsvm" compatibility="5.3.007" expanded="true" height="76" name="SVM" width="90" x="179" y="30">
<list key="class_weights"/>
</operator>
<connect from_port="training" to_op="SVM" to_port="training set"/>
<connect from_op="SVM" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true">
<operator activated="true" class="apply_model" compatibility="5.3.007" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="performance" compatibility="5.3.007" expanded="true" height="76" name="Performance" width="90" x="179" y="30"/>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<connect from_op="Retrieve Iris" from_port="output" to_op="Validation" to_port="training"/>
<connect from_op="Validation" from_port="averagable 1" to_port="result 1"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
</process>
</operator>
</process>
regards
Andrew
I don't think so. This is what I got when I tried to run a dataset with a four-level target variable, using the SVM (LibSVM) option:
Apr 16, 2013 8:11:21 PM SEVERE: Process failed: The operator SVM does not have sufficient capabilities for the given data set: polynominal attributes not supported
David
You've probably got nominal attributes among the non-target variables. What does the meta data of the input example set look like before the SVM?
Andrew
I got that message on my home PC with 16GB of RAM (the process was using 12GB of RAM). On my work laptop (with 4GB) I can't even read in the data without running out of memory. It looks to me like the system is trying to keep everything in memory. This is not a big dataset (9100 observations and 423 variables), so that is surprising. The original data is in SAS, but the SAS import step fails (I have reported the bug), so I had to save it as an Excel file to get Rapid-I to read it.
I think I am getting way ahead of myself here. I am new to Rapid-I and I need to start with some simpler examples. I just got the book "Data Mining for the Masses" (Matthew North), and will work through its examples to get started.
David
Select the SVM operator, right-click and choose Breakpoint Before (Shift+F7).
Run the process.
Go to the meta data view.
What are the roles and types of each of the attributes?
Exactly one attribute should have the label role, and it should be of type nominal.
All the remaining regular attributes must be numerical (integer or real).
If this checks out, LibSVM will work.
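If it does not check out because some of the regular attributes are nominal (which is what the "polynominal attributes not supported" message suggests), one possible fix is to convert them before the validation. Here is a minimal sketch, assuming a Nominal to Numerical operator is placed between the retrieve step and the Validation operator of the Iris example. The operator names, coordinates and connections are borrowed from that example and are only placeholders for your own process:

```xml
<!-- Sketch only: goes inside the top-level <process expanded="true"> of the Iris example. -->
<!-- "Retrieve Iris" stands in for whatever operator loads your own data. -->
<operator activated="true" class="nominal_to_numerical" compatibility="5.3.007" expanded="true" height="94" name="Nominal to Numerical" width="90" x="212" y="75">
<!-- Dummy coding creates one 0/1 attribute per nominal value. -->
<parameter key="coding_type" value="dummy coding"/>
<list key="comparison_groups"/>
</operator>
<!-- These two connections replace the direct Retrieve Iris to Validation connection. -->
<connect from_op="Retrieve Iris" from_port="output" to_op="Nominal to Numerical" to_port="example set input"/>
<connect from_op="Nominal to Numerical" from_port="example set output" to_op="Validation" to_port="training"/>
```

As far as I know, with its default settings the operator only converts regular attributes, so the label stays nominal. Keep in mind that dummy coding adds one column per nominal value, which will increase memory use on a data set that already has 423 variables.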
As for the SAS import issue, how big is the raw data file?
regards
Andrew