What type of label is supported for LibSVM one-class learning?

Dear all,
I tried SVM (LibSVM) one-class learning, but I got this error message:
"The learning scheme SVM does not have sufficient capabilities for the given data set: binominal label not supported".
I also tried a polynominal label and several other type conversions, but could not get it to work.
I searched Google and this forum but failed to find the right answer.
Please advise.
Many thanks,
Danny.
Answers
-
If you use one-class, the label can only have one class. If the binominal label you are trying to classify has two values (true/false), the operator will complain. If you want plain classification, change the SVM type from "one-class" to C-SVC.
-
Hi, thanks for your response.
However, I am not new to SVM classification.
The training data has only one label value, "true", for one-class learning.
Even after removing the label attribute, I could not get it to work.
BR.
-
Hi,
the capability check is broken for this case. For now, please go to the preferences and check "rapidminer.general.capabilities.warn". This will bypass the check and trigger only a warning (which you can ignore). I will fix this problem.
Cheers,
Simon
-
Dear Simon,
I very much appreciate your support.
It will be helpful.
Kind regards,
Danny.
-
This will also be very useful to me. LibSVM's one-class model should be able to take a binominal label and predict whether an example falls inside or outside the one class.
Thanks,
-Gagi
-
Hi,
We preferred to force the user to recognize that the one-class option really can't distinguish between two classes in a training set. That's why it only works without a warning if exactly one label value is present. If you have more than one label value, please create a new attribute with only one value and use it as the label.
Greetings,
Sebastian
-
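The advice above (train only on examples of one class, carrying a label with a single value) can be sketched outside RapidMiner in plain Python. This is only an illustration under assumptions: the row layout, the attribute name `att1`, the value `cluster0`, and the helper `one_class_training_set` are hypothetical; only the value `cluster1` appears later in this thread.

```python
def one_class_training_set(rows, keep_value, label_key="label"):
    """Keep only rows labelled keep_value and rewrite their label to that single value."""
    return [dict(row, **{label_key: keep_value})
            for row in rows if row.get(label_key) == keep_value]

# Hypothetical example set with two label values.
rows = [
    {"att1": 0.21, "label": "cluster0"},
    {"att1": 0.83, "label": "cluster1"},
    {"att1": 0.79, "label": "cluster1"},
]

training = one_class_training_set(rows, "cluster1")
# Every remaining row now carries the same, single label value.
print({row["label"] for row in training})
```

The learner then only ever sees one label value, which is the condition Sebastian describes for the one-class option.
-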
I understand that for ease of use it makes sense to take one-class labels as training. For prediction one should be able to compare the performance between 2 classes. How else is performance measured? For example if I make a one-class model how can I check if some of my specific samples fall outside the one class boundary as expected?
The power of one-class learning is that it is unsupervised. It would be nice to be able to use labeled data to check how well an unsupervised approach can separate two classes of data: the one class (normal samples) and the other class (outliers).
-
Hi,
dragoljub wrote:
"The power of one-class learning is that it is unsupervised, it would be nice to have the ability to use labeled data to check how well an unsupervised approach can separate 2 classes of data the one-class (normal samples) and the other-class (outliers)."
Well, the thing is that one-class learning does not separate multiple classes; it only builds a model of one class and of the extent to which data points are believed to belong to that single class. This says nothing about a second class at all. You may of course define outliers yourself by applying a threshold to the class confidence after you have applied the one-class model. It should be obvious that such a threshold cannot be defined by the learning approach (on what basis should such a threshold be chosen?) but must be defined by the user, as it may strongly depend on your data and the class distribution.
Kind regards,
Tobias
-
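A minimal sketch of the user-defined threshold Tobias describes, in plain Python rather than RapidMiner. The confidence values, the cut-off of 0.5, the names "inside" and "outlier", and the helper `apply_threshold` are all assumptions made up for illustration; they do not come from any RapidMiner output.

```python
# Hypothetical confidences a one-class model might assign to five examples
# (higher means "more likely to belong to the learned class").
confidences = [0.91, 0.15, 0.67, 0.42, 0.88]

def apply_threshold(scores, threshold):
    """Mark a score 'inside' if it reaches the user-chosen threshold, else 'outlier'."""
    return ["inside" if s >= threshold else "outlier" for s in scores]

# The threshold is the user's decision; 0.5 is an arbitrary choice here.
labels = apply_threshold(confidences, 0.5)
print(labels)  # ['inside', 'outlier', 'inside', 'outlier', 'inside']
```

Moving the cut-off changes which examples count as outliers, which is why it has to be chosen with the data and class distribution in mind.
-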
Tobias Malbrecht wrote:
"It should be obvious that such a threshold cannot be defined by the learning approach (on what basis should such a threshold be chosen?) but must be defined by the user, as it may strongly depend on your data and the class distribution."
The thing is, for one-class SVM, nu sets that threshold. According to Learning with Kernels (by Bernhard Schölkopf and Alex Smola), nu is an upper bound on the fraction of outliers and a lower bound on the fraction of support vectors. So in reality one-class SVM does predict between two classes of data, the in-class and the out-class. All I am saying is that the LibSVM operator would be more useful if, for one-class learning, it allowed you to pass in two label classes, the 'in-class' and the 'out-class', and see whether the learning algorithm can distinguish between them in an unsupervised way according to your kernel and nu parameter. This is exactly how the C implementation of libSVM works.
I'm confused about how a one-class model created in RapidMiner can be used to predict outliers versus normal samples.
Thanks,
-Gagi
-
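The role of nu as an upper bound on the outlier fraction, and the wish to check in/out predictions against known labels, can be illustrated with a simplified stand-in. This is not the one-class SVM optimisation: it only picks a cut-off on hypothetical training scores so that at most a fraction nu of them falls below it, then scores labeled test examples. All numbers and the helpers `threshold_for_nu` and `accuracy` are assumptions for illustration.

```python
import math

def threshold_for_nu(train_scores, nu):
    """Return a cut-off with at most a fraction nu of train_scores strictly below it."""
    ordered = sorted(train_scores)
    k = math.floor(nu * len(ordered))  # number of training examples allowed to be outliers
    return ordered[k] if k < len(ordered) else float("inf")

def accuracy(scores, true_labels, threshold):
    """Predict 'in' for scores at or above the threshold, 'out' otherwise, and compare."""
    predicted = ["in" if s >= threshold else "out" for s in scores]
    hits = sum(p == t for p, t in zip(predicted, true_labels))
    return hits / len(true_labels)

# Hypothetical decision scores for ten training examples of the one class.
train_scores = [0.9, 0.8, 0.85, 0.2, 0.7, 0.95, 0.1, 0.75, 0.3, 0.6]
cutoff = threshold_for_nu(train_scores, nu=0.2)

# Hypothetical labeled test examples: 'in' = normal sample, 'out' = outlier.
test_scores = [0.88, 0.15, 0.65, 0.25]
test_labels = ["in", "out", "in", "in"]
print(cutoff, accuracy(test_scores, test_labels, cutoff))  # 0.3 0.75
```

With nu = 0.2, two of the ten training scores fall below the cut-off, and the labeled test set shows how well that cut-off separates the two groups.
-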
Hello,
I am working with one-class SVMs too, and I also miss the libsvm behaviour. Nevertheless, based on the comments, I implemented a little example of how you can classify data with one-class models based on thresholds. Hope that helps.
Greetings,
Harald
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
<context>
<input>
<location/>
</input>
<output>
<location/>
<location/>
<location/>
<location/>
</output>
<macros/>
</context>
<operator activated="true" class="process" expanded="true" name="Process">
<process expanded="true" height="557" width="1090">
<operator activated="true" class="generate_data" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30">
<parameter key="target_function" value="two gaussians classification"/>
<parameter key="number_examples" value="2000"/>
<parameter key="number_of_attributes" value="8"/>
<parameter key="attributes_lower_bound" value="0.0"/>
<parameter key="attributes_upper_bound" value="1.0"/>
<parameter key="use_local_random_seed" value="true"/>
</operator>
<operator activated="true" class="multiply" expanded="true" height="94" name="Multiply" width="90" x="182" y="165"/>
<operator activated="true" class="x_validation" expanded="true" height="112" name="Validation" width="90" x="313" y="30">
<process expanded="true" height="673" width="433">
<operator activated="true" class="filter_examples" expanded="true" height="76" name="Filter Examples" width="90" x="45" y="30">
<parameter key="condition_class" value="attribute_value_filter"/>
<parameter key="parameter_string" value="label=cluster1"/>
</operator>
<operator activated="true" class="select_attributes" expanded="true" height="76" name="Select Attributes" width="90" x="180" y="30">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="label"/>
<parameter key="invert_selection" value="true"/>
<parameter key="include_special_attributes" value="true"/>
</operator>
<operator activated="true" class="generate_attributes" expanded="true" height="76" name="Generate Attributes" width="90" x="45" y="120">
<list key="function_descriptions">
<parameter key="label" value="&quot;cluster1&quot;"/>
</list>
</operator>
<operator activated="true" class="set_role" expanded="true" height="76" name="Set Role" width="90" x="180" y="120">
<parameter key="name" value="label"/>
<parameter key="target_role" value="label"/>
</operator>
<operator activated="true" class="support_vector_machine_libsvm" expanded="true" height="76" name="SVM" width="90" x="313" y="120">
<parameter key="svm_type" value="one-class"/>
<parameter key="gamma" value="5.0"/>
<parameter key="coef0" value="3.0"/>
<parameter key="nu" value="0.4"/>
<list key="class_weights"/>
</operator>
<connect from_port="training" to_op="Filter Examples" to_port="example set input"/>
<connect from_op="Filter Examples" from_port="example set output" to_op="Select Attributes" to_port="example set input"/>
<connect from_op="Select Attributes" from_port="example set output" to_op="Generate Attributes" to_port="example set input"/>
<connect from_op="Generate Attributes" from_port="example set output" to_op="Set Role" to_port="example set input"/>
<connect from_op="Set Role" from_port="example set output" to_op="SVM" to_port="training set"/>
<connect from_op="SVM" from_port="model" to_port="model"/>
<portSpacing port="source_training" spacing="0"/>
<portSpacing port="sink_model" spacing="0"/>
<portSpacing port="sink_through 1" spacing="0"/>
</process>
<process expanded="true" height="673" width="547">
<operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="find_threshold" expanded="true" height="76" name="Find Threshold" width="90" x="45" y="165"/>
<operator activated="true" class="apply_threshold" expanded="true" height="76" name="Apply Threshold" width="90" x="179" y="165"/>
<operator activated="true" class="nominal_to_binominal" expanded="true" height="94" name="Nominal to Binominal" width="90" x="313" y="30">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="prediction(label)"/>
<parameter key="include_special_attributes" value="true"/>
</operator>
<operator activated="true" class="performance" expanded="true" height="76" name="Performance" width="90" x="447" y="30">
<parameter key="use_example_weights" value="false"/>
</operator>
<connect from_port="model" to_op="Apply Model" to_port="model"/>
<connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
<connect from_op="Apply Model" from_port="labelled data" to_op="Find Threshold" to_port="example set"/>
<connect from_op="Find Threshold" from_port="example set" to_op="Apply Threshold" to_port="example set"/>
<connect from_op="Find Threshold" from_port="threshold" to_op="Apply Threshold" to_port="threshold"/>
<connect from_op="Apply Threshold" from_port="example set" to_op="Nominal to Binominal" to_port="example set input"/>
<connect from_op="Nominal to Binominal" from_port="example set output" to_op="Performance" to_port="labelled data"/>
<connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
<portSpacing port="source_model" spacing="0"/>
<portSpacing port="source_test set" spacing="0"/>
<portSpacing port="source_through 1" spacing="0"/>
<portSpacing port="sink_averagable 1" spacing="0"/>
<portSpacing port="sink_averagable 2" spacing="0"/>
</process>
</operator>
<operator activated="true" class="multiply" expanded="true" height="94" name="Multiply (2)" width="90" x="447" y="30"/>
<operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model (2)" width="90" x="581" y="165">
<list key="application_parameters"/>
</operator>
<operator activated="true" class="find_threshold" expanded="true" height="76" name="Find Threshold (2)" width="90" x="718" y="165"/>
<operator activated="true" class="apply_threshold" expanded="true" height="76" name="Apply Threshold (2)" width="90" x="852" y="165"/>
<operator activated="true" class="nominal_to_binominal" expanded="true" height="94" name="Nominal to Binominal (2)" width="90" x="986" y="165">
<parameter key="attribute_filter_type" value="single"/>
<parameter key="attribute" value="prediction(label)"/>
<parameter key="include_special_attributes" value="true"/>
</operator>
<connect from_op="Generate Data" from_port="output" to_op="Multiply" to_port="input"/>
<connect from_op="Multiply" from_port="output 1" to_op="Validation" to_port="training"/>
<connect from_op="Multiply" from_port="output 2" to_op="Apply Model (2)" to_port="unlabelled data"/>
<connect from_op="Validation" from_port="model" to_op="Multiply (2)" to_port="input"/>
<connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
<connect from_op="Multiply (2)" from_port="output 1" to_port="result 1"/>
<connect from_op="Multiply (2)" from_port="output 2" to_op="Apply Model (2)" to_port="model"/>
<connect from_op="Apply Model (2)" from_port="labelled data" to_op="Find Threshold (2)" to_port="example set"/>
<connect from_op="Find Threshold (2)" from_port="example set" to_op="Apply Threshold (2)" to_port="example set"/>
<connect from_op="Find Threshold (2)" from_port="threshold" to_op="Apply Threshold (2)" to_port="threshold"/>
<connect from_op="Apply Threshold (2)" from_port="example set" to_op="Nominal to Binominal (2)" to_port="example set input"/>
<connect from_op="Nominal to Binominal (2)" from_port="example set output" to_port="result 3"/>
<portSpacing port="source_input 1" spacing="0"/>
<portSpacing port="sink_result 1" spacing="0"/>
<portSpacing port="sink_result 2" spacing="0"/>
<portSpacing port="sink_result 3" spacing="0"/>
<portSpacing port="sink_result 4" spacing="0"/>
</process>
</operator>
</process>
-
Thanks for this example!
I am however getting this error in RM5:
"The learning scheme SVM does not have sufficient capabilities for the given data set: polynominal label not supported"
How are you getting around this?
Thanks,
-Gagi
-
Hi,
if I remember correctly, I have already solved that issue in the current developer version. Since we will publish the final version today, you simply could update afterwards.
Please tell me, if this issue remains.
Greetings,
Sebastian
-
I solved it by unselecting "..capabilities.warn" in the preferences, but as Sebastian said, the new version should fix this.
Greetings,
Harald
-
Hi,
first, thanks for the example by harri678. It seems to run in 5.0.3 without having to change the default preferences.
But:
- Harry has to use three operators in the training part of the validation (Select Attributes / Generate Attributes / Set Role), which really shouldn't be needed if the fix Sebastian promised worked correctly: after Filter Examples, all that is left is one class...
- I can't see how the example could serve any useful purpose, as the threshold search on the top level requires the label to be present.
I was hoping that the SVM checkbox 'calculate confidences' would do something useful. Instead, it shows the message "one-class SVM probability output not supported yet". I am not sure whether this is a problem with libSVM or RM.
Stefan
-
Hi Stefan,
Because one-class learning can only learn exactly one nominal class label, the three operators in the training part are necessary by design. Changing this behavior would require some changes to the LibSVMLearner and Model.
Please have a look at http://rapid-i.com/rapidforum/index.php/topic,1746.0.html
This patch adds the classic libsvm one-class classification behavior, which predicts 1 or -1 for a sample. You still need to postprocess the labels, but at least you get some kind of binary prediction out of the model. I'd gladly accept feedback on this patch, and maybe one of the devs can have a look at it.
Greetings, Harald
-
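The postprocessing step Harald mentions (turning the 1 / -1 predictions into labels) might look roughly like this in plain Python. The raw predictions and true labels below are invented for illustration, the label name "outlier" and the helper `to_label` are hypothetical, and "cluster1" is borrowed from the example process in this thread.

```python
from collections import Counter

def to_label(prediction, in_class="cluster1", out_class="outlier"):
    """Map a libsvm-style one-class prediction (1 or -1) to a nominal label."""
    return in_class if prediction == 1 else out_class

# Hypothetical raw predictions and the known labels of the same five examples.
raw_predictions = [1, -1, 1, 1, -1]
true_labels = ["cluster1", "outlier", "cluster1", "outlier", "outlier"]

predicted_labels = [to_label(p) for p in raw_predictions]

# Count (true, predicted) pairs, i.e. a small confusion matrix.
confusion = Counter(zip(true_labels, predicted_labels))
for (true, predicted), count in sorted(confusion.items()):
    print(f"true={true} predicted={predicted}: {count}")
```

Once the predictions are nominal labels, they can be compared with a known label in the usual way, for example by counting agreements per class.
-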
This looks very promising. I am not using one-class SVM for my current project but expect to get back into it soon. I will give this a shot and see how it goes.
I highly encourage the RM dev team to consider this patch, since LibSVM's one-class algorithm is one of the most useful unsupervised learning methods in practice (no labeled classes).
-Gagi
-
Indeed, this looks promising, so I have to invest in getting a build environment for RM up and running :-\ ...
Am I then rightly interpreting your patch that libSVM (C version) only gives an in/out classification, but doesn't attribute a continuous confidence level to the result?
harri678 wrote:
"Due to the fact that one-class learning can only learn exactly one nominal class label, the three operators in the training are necessary by concept."
I disagree here... RM knows the label types 'polynominal' and 'binominal'; there is no label type 'uninominal'. Hence, if there is a check for multiple label values, the answer has to be inferred from the data. But if a filter leaves only one value, the implication for the remaining data is clear.
(I'm insisting on this because RM's selling point is to support rapid development. Detours such as those needed in the example make the environment very heavy to use and waste user time on training RM rather than training the learner ... I sense that this is a consequence of creating a view on the data rather than copying the data, which seems to be what the Multiply operator on the top level is doing, at least 'sometimes' ...)
Stefan
-
Hi Stefan,
Stefan wrote:
"Am I then rightly interpreting your patch that libSVM (C version) only gives an in/out classification, but doesn't attribute a continuous confidence level to the result?"
Yes, you can get either the classic confidence value or the classification behavior from the patch. At first I tried to deliver both confidence and prediction, but it wasn't that easy: the svm_predict function didn't return the confidence values (a Java libsvm problem?). So to get both confidence and prediction for each example, two svm_* calls are needed, and I didn't like the overhead, so I left it out of the patch.
Stefan wrote:
"I disagree here... RM knows labels 'polynominal' and 'binominal' - there is no label class attribute 'uninominal'."
Same here. As far as I have seen in the code, many checks and decisions in the LibSVMLearner are based on the label attribute type and the number of different label values. Changing this would need quite a big change and lots of testing. What would be the most logical GUI variant for one-class? I think one filter and one learner?
You can find very good documentation on the development environment (Eclipse, Subclipse) on the RM website.
Greetings,
Harald
-
... this is the second time in a week that I have to bump a thread from 2010.
I have a process which eventually filters a data set down to one class, around which I want a one-class SVM to build a model for use on the full data. I want to see whether such an SVM is then able to isolate the samples correctly.
I set a breakpoint just before the SVM operator and find that I have a binominal label with mode = 1 (3872), least = 0 (0). OK, good.
Then I get into SVM: The operator SVM does not have sufficient capabilities ...
Then I follow Simon's advice to check rapidminer.general.capabilities.warn (now in 5.1.017). This has the simple effect of changing the error message to The attribute Label has 2 different values...
So, what now?
Thanks for any help! Stefan
-
Hi Stefan,
This is because the metadata in RapidMiner needs to be refreshed. I find that if I have filtered the data, I need to save it and re-import it into the process before I send it into a one-class SVM.
One way of doing this is Write CSV followed by Read CSV (reading the CSV from the file that was written).
That clears the metadata for the binominal label.