virex wrote: Hi, I'm using LibSVM in RapidMiner. I would like to obtain the actual data point values that are used to plot the ROC curve once the SVM model has been determined. Is there any way to obtain these values via the GUI itself? I searched the RapidMiner API and found that there is a container, ROCData, that holds all the ROC data points for a single ROC curve (http://rapid-i.com/api/rapidminer-5.1/com/rapidminer/tools/math/package-use.html). Is there any way to retrieve these data points?
Ingo Mierswa wrote: Hi Virex,

Well, there is no simple GUI option like "Export ROC Points" or something similar. But of course you could create a process which generates the data points itself: apply the model, create the confidences, sort them, count the true and false positive sums, compute the rates, etc. Advanced process design, but certainly possible.

Hi AMT,

Well, your request actually does not have anything to do with the original topic; please start a new thread in future. Anyway, here are two comments:

1- If I do this, everything looks fine (compare the process below; the bias is close to 0). By the way, in high-dimensional space the intercept becomes completely irrelevant, since it is only a single degree of freedom and can more or less safely be ignored. This is also reflected by most SVM implementations, so there might indeed be a notable difference for low-dimensional data sets.

2- No. For any non-linear kernel function you cannot interpret those values as easily as in the linear case, sorry.

Cheers,
Ingo

<?xml version="1.0" encoding="UTF-8" standalone="no"?><process version="5.1.003"> <context> <input/> <output/> <macros/> </context> <operator activated="true" class="process" compatibility="5.1.003" expanded="true" name="Process"> <process expanded="true" height="161" width="413"> <operator activated="true" class="generate_data" compatibility="5.1.003" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"> <parameter key="target_function" value="sum classification"/> <parameter key="number_examples" value="500"/> <parameter key="number_of_attributes" value="2"/> </operator> <operator activated="true" class="support_vector_machine_libsvm" compatibility="5.1.003" expanded="true" height="76" name="SVM" width="90" x="179" y="30"> <parameter key="kernel_type" value="linear"/> <list key="class_weights"/> </operator> <connect from_op="Generate Data" from_port="output" to_op="SVM" to_port="training set"/> <connect from_op="SVM" from_port="model" to_port="result 1"/> <connect from_op="SVM" from_port="exampleSet" to_port="result 2"/> <portSpacing port="source_input 1" spacing="0"/> <portSpacing port="sink_result 1" spacing="0"/> <portSpacing port="sink_result 2" spacing="0"/> <portSpacing port="sink_result 3" spacing="0"/> </process> </operator></process>
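The manual ROC construction suggested to Virex above (apply the model, sort by confidence, accumulate true and false positive counts, convert to rates) can be sketched outside RapidMiner roughly as follows. This is only an illustration in plain Python, not the RapidMiner ROCData implementation; the function name `roc_points` and the input layout (one confidence for the positive class and one boolean label per example) are assumptions made for the example.

```python
from typing import List, Tuple


def roc_points(confidences: List[float], labels: List[bool]) -> List[Tuple[float, float]]:
    """Return (false positive rate, true positive rate) pairs.

    Hypothetical helper: `confidences` are the model's confidences for the
    positive class, `labels` are True for positive examples.
    """
    total_pos = sum(labels)
    total_neg = len(labels) - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    # Sort examples by confidence, most confident positives first.
    ranked = sorted(zip(confidences, labels), key=lambda pair: pair[0], reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (conf, is_pos) in enumerate(ranked):
        # Running sums of true and false positives as the threshold is lowered.
        if is_pos:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all examples tied at this confidence are counted.
        is_last = i == len(ranked) - 1
        if is_last or ranked[i + 1][0] != conf:
            points.append((fp / total_neg, tp / total_pos))
    return points


if __name__ == "__main__":
    for fpr, tpr in roc_points([0.9, 0.8, 0.7, 0.6], [True, False, True, False]):
        print(fpr, tpr)
```

Within RapidMiner the same steps would be built from operators (Apply Model, sorting, cumulative counts); the sketch only shows the arithmetic behind each point on the curve.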
Regarding the first question: I do not agree with you. The bias determines where the hyperplane passes in the multidimensional space, so it has a very concrete meaning.
So my question is: what information does w provide in this case?
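For the linear kernel only, the geometric roles of w and the bias mentioned above can be illustrated with a small sketch. It assumes the common convention f(x) = w·x + b for the decision function; the sign convention of a particular SVM implementation may differ, and the function names are hypothetical. Under that assumption, w is the normal vector that fixes the orientation of the hyperplane, and b shifts the hyperplane along that normal.

```python
import math
from typing import Sequence


def decision_value(w: Sequence[float], x: Sequence[float], b: float) -> float:
    """Linear decision function f(x) = w.x + b (assumed convention)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def hyperplane_offset(w: Sequence[float], b: float) -> float:
    """Signed distance from the origin to the hyperplane w.x + b = 0, measured along w."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return -b / norm


if __name__ == "__main__":
    w, b = (3.0, 4.0), -25.0
    print(decision_value(w, (3.0, 4.0), b))  # a point lying on the hyperplane
    print(hyperplane_offset(w, b))           # how far the bias pushes it from the origin
```

With b = 0 the hyperplane passes through the origin; changing b while keeping w fixed moves it parallel to itself. This interpretation does not carry over directly to non-linear kernels.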