"SVM Flow Question"

Ghostrider
Ghostrider New Altair Community Member
edited November 5 in Community Q&A
I want to generate an SVM model from some training data and then simply try to predict the training data using the same model. I set up an optimization operator to search for optimal parameters, but every time it executes quickly and then returns the same high error rate. Can someone help me modify my flow below so that it generates a model that can successfully predict the training data? I really don't know what search limits to use for the parameter optimization.

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.1.001">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.1.001" expanded="true" name="Process">
   <parameter key="parallelize_main_process" value="true"/>
   <process expanded="true" height="615" width="964">
     <operator activated="true" class="retrieve" compatibility="5.1.001" expanded="true" height="60" name="Retrieve (2)" width="90" x="45" y="30">
       <parameter key="repository_entry" value="//MLData/FirstData"/>
     </operator>
     <operator activated="true" class="set_role" compatibility="5.1.001" expanded="true" height="76" name="Set Role" width="90" x="179" y="30">
       <parameter key="name" value="Date"/>
       <parameter key="target_role" value="cluster"/>
       <list key="set_additional_roles"/>
     </operator>
     <operator activated="true" class="set_role" compatibility="5.1.001" expanded="true" height="76" name="Set Role (2)" width="90" x="313" y="30">
       <parameter key="name" value="RRVALC"/>
       <parameter key="target_role" value="prediction"/>
       <list key="set_additional_roles"/>
     </operator>
     <operator activated="true" class="normalize" compatibility="5.1.001" expanded="true" height="94" name="Normalize" width="90" x="447" y="30"/>
     <operator activated="true" class="parallel:optimize_parameters_evolutionary_parallel" compatibility="5.0.001" expanded="true" height="94" name="Optimize Parameters (Evolutionary)" width="90" x="648" y="30">
       <list key="parameters">
         <parameter key="SVM.gamma" value="[0.0;10]"/>
         <parameter key="SVM.epsilon" value="[10;100000]"/>
         <parameter key="SVM.nu" value="[0.0;0.5]"/>
         <parameter key="SVM.C" value="[0.1;30000]"/>
       </list>
       <parameter key="max_generations" value="150"/>
       <parameter key="use_early_stopping" value="true"/>
       <parameter key="population_size" value="50"/>
       <parameter key="show_convergence_plot" value="true"/>
       <parameter key="number_of_threads" value="4"/>
       <process expanded="true" height="633" width="982">
         <operator activated="true" class="support_vector_machine_libsvm" compatibility="5.1.001" expanded="true" height="76" name="SVM" width="90" x="313" y="30">
           <parameter key="svm_type" value="epsilon-SVR"/>
           <parameter key="gamma" value="2.195031521888635"/>
           <parameter key="C" value="29054.787802158513"/>
           <parameter key="nu" value="0.46481950308466047"/>
           <parameter key="cache_size" value="800"/>
           <parameter key="epsilon" value="48471.18778624996"/>
           <list key="class_weights"/>
         </operator>
         <operator activated="true" class="apply_model" compatibility="5.1.001" expanded="true" height="76" name="Apply Model" width="90" x="447" y="30">
           <list key="application_parameters"/>
         </operator>
         <operator activated="true" class="performance" compatibility="5.1.001" expanded="true" height="76" name="Performance" width="90" x="581" y="30"/>
         <connect from_port="input 1" to_op="SVM" to_port="training set"/>
         <connect from_op="SVM" from_port="model" to_op="Apply Model" to_port="model"/>
         <connect from_op="SVM" from_port="exampleSet" to_op="Apply Model" to_port="unlabelled data"/>
         <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
         <connect from_op="Performance" from_port="performance" to_port="performance"/>
         <portSpacing port="source_input 1" spacing="0"/>
         <portSpacing port="source_input 2" spacing="0"/>
         <portSpacing port="sink_performance" spacing="0"/>
         <portSpacing port="sink_result 1" spacing="0"/>
       </process>
     </operator>
     <connect from_op="Retrieve (2)" from_port="output" to_op="Set Role" to_port="example set input"/>
     <connect from_op="Set Role" from_port="example set output" to_op="Set Role (2)" to_port="example set input"/>
     <connect from_op="Set Role (2)" from_port="example set output" to_op="Normalize" to_port="example set input"/>
     <connect from_op="Normalize" from_port="example set output" to_op="Optimize Parameters (Evolutionary)" to_port="input 1"/>
     <connect from_op="Optimize Parameters (Evolutionary)" from_port="performance" to_port="result 1"/>
     <connect from_op="Optimize Parameters (Evolutionary)" from_port="parameter" to_port="result 2"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
     <portSpacing port="sink_result 3" spacing="0"/>
   </process>
 </operator>
</process>

Answers

  • haddock
    haddock New Altair Community Member
    Are there not examples of exactly this in the samples ( Meta section )?
  • SebastianLoh
    SebastianLoh New Altair Community Member
    Hi Ghostrider,

    The Set Parameters operator is your friend:

    http://rapid-i.com/wiki/index.php?title=Set_Parameters

    "This operator is useful, e.g. in the following scenario. If one wants to find the best parameters for a certain learning scheme, one usually is also interested in the model generated with this parameters. While the first is easily possible using a ParameterOptimizationOperator, the latter is not possible because the ParameterOptimizationOperator does not return the IOObjects produced within, but only a parameter set."

    I hope this helps.

    Ciao Sebastian
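
    As a rough illustration of the scenario the quoted wiki text describes, the fragment below sketches how the parameter set found by the optimizer could be handed to a second learner outside it. This is only a sketch under assumptions: "Final SVM" is a hypothetical operator name that does not appear in the original flow, and the operator class "set_parameters", the list key "name_map", and the input port name "parameter set" should be checked against your RapidMiner version.

    ```xml
    <!-- Fragment for the outer <process> of the flow in the question; not a complete process file. -->
    <!-- Assumption: "Final SVM" is a hypothetical second LibSVM operator placed outside the optimizer. -->
    <operator activated="true" class="set_parameters" compatibility="5.1.001" expanded="true" name="Set Parameters">
      <list key="name_map">
        <!-- key: operator name used inside the optimizer; value: operator that receives the tuned values -->
        <parameter key="SVM" value="Final SVM"/>
      </list>
    </operator>
    <operator activated="true" class="support_vector_machine_libsvm" compatibility="5.1.001" expanded="true" name="Final SVM">
      <parameter key="svm_type" value="epsilon-SVR"/>
      <list key="class_weights"/>
    </operator>
    <!-- Route the optimizer's parameter set into Set Parameters instead of sending it to "result 2". -->
    <connect from_op="Optimize Parameters (Evolutionary)" from_port="parameter" to_op="Set Parameters" to_port="parameter set"/>
    ```

    The fragment leaves out the data wiring: the normalized example set would also have to reach "Final SVM" (for instance through a Multiply operator), and Set Parameters has to execute before that learner so the tuned values are in place when the model is trained.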
  • Ghostrider
    Ghostrider New Altair Community Member
    haddock wrote:

    Are there not examples of exactly this in the samples ( Meta section )?
    Hmm... yes, you're right: the Meta section under Samples in RapidMiner. Example 01 looks like the setup I am after. Sorry, I should have checked the examples before asking.