"How do you apply Platt Scaling in X-Validation?"

Charles54 New Altair Community Member
edited November 5 in Community Q&A
Hello all,

I am having trouble using the Rescale Confidences operator. I looked over the sample file, and it seems straightforward, but I can't figure out how to apply it in an X-Validation.

This is the best I can come up with. You can see that the labeled data output by the Apply Model operator does not contain confidence values, so the Performance operator fails. I am new to data mining, so perhaps my thinking is way off the mark. (I used RapidMiner 5.)
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.0">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" expanded="true" name="Root">
   <process expanded="true" height="395" width="748">
     <operator activated="true" class="generate_data" expanded="true" height="60" name="ExampleSetGenerator" width="90" x="45" y="30">
       <parameter key="target_function" value="checkerboard classification"/>
       <parameter key="number_examples" value="500"/>
       <parameter key="number_of_attributes" value="2"/>
     </operator>
     <operator activated="true" class="x_validation" expanded="true" height="112" name="Validation" width="90" x="246" y="30">
       <process expanded="true" height="414" width="433">
         <operator activated="true" class="support_vector_machine" expanded="true" height="112" name="JMySVMLearner" width="90" x="45" y="30">
           <parameter key="kernel_type" value="radial"/>
         </operator>
         <operator activated="true" class="rescale_confidences" expanded="true" height="76" name="PlattScaling" width="90" x="179" y="30"/>
         <connect from_port="training" to_op="JMySVMLearner" to_port="training set"/>
         <connect from_op="JMySVMLearner" from_port="model" to_op="PlattScaling" to_port="prediction model"/>
         <connect from_op="JMySVMLearner" from_port="exampleSet" to_op="PlattScaling" to_port="example set"/>
         <connect from_op="PlattScaling" from_port="model" to_port="model"/>
         <portSpacing port="source_training" spacing="0"/>
         <portSpacing port="sink_model" spacing="0"/>
         <portSpacing port="sink_through 1" spacing="0"/>
       </process>
       <process expanded="true" height="414" width="435">
         <operator activated="true" class="apply_model" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
           <list key="application_parameters"/>
         </operator>
         <operator activated="true" class="rescale_confidences" expanded="true" height="76" name="Rescale Confidences" width="90" x="179" y="30"/>
         <operator activated="true" class="performance" expanded="true" height="76" name="Performance" width="90" x="313" y="30"/>
         <connect from_port="model" to_op="Apply Model" to_port="model"/>
         <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
         <connect from_op="Apply Model" from_port="labelled data" to_op="Rescale Confidences" to_port="example set"/>
         <connect from_op="Apply Model" from_port="model" to_op="Rescale Confidences" to_port="prediction model"/>
         <connect from_op="Rescale Confidences" from_port="example set" to_op="Performance" to_port="labelled data"/>
         <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
         <portSpacing port="source_model" spacing="0"/>
         <portSpacing port="source_test set" spacing="0"/>
         <portSpacing port="source_through 1" spacing="0"/>
         <portSpacing port="sink_averagable 1" spacing="0"/>
         <portSpacing port="sink_averagable 2" spacing="0"/>
         <portSpacing port="sink_averagable 3" spacing="0"/>
       </process>
     </operator>
     <connect from_op="ExampleSetGenerator" from_port="output" to_op="Validation" to_port="training"/>
     <connect from_op="Validation" from_port="model" to_port="result 1"/>
     <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
     <portSpacing port="sink_result 3" spacing="0"/>
   </process>
 </operator>
</process>

I have read over Steffen's posts on this subject, but I am afraid I still can't figure it out. Any help would be much appreciated.

Regards, Charles

Answers

  • land New Altair Community Member
    Hi Charles,
    Just delete the PlattScaling operator in the training subprocess. The second one will do the trick. PlattScaling internally applies the model, but does not alter it. Here's how it works:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input>
          <location/>
        </input>
        <output>
          <location/>
          <location/>
          <location/>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Root">
        <process expanded="true" height="395" width="748">
          <operator activated="true" class="generate_data" expanded="true" height="60" name="ExampleSetGenerator" width="90" x="45" y="30">
            <parameter key="target_function" value="checkerboard classification"/>
            <parameter key="number_examples" value="500"/>
            <parameter key="number_of_attributes" value="2"/>
          </operator>
          <operator activated="true" class="x_validation" expanded="true" height="112" name="Validation" width="90" x="246" y="30">
            <process expanded="true" height="414" width="533">
              <operator activated="true" class="support_vector_machine" expanded="true" height="112" name="JMySVMLearner" width="90" x="45" y="30">
                <parameter key="kernel_type" value="radial"/>
              </operator>
              <connect from_port="training" to_op="JMySVMLearner" to_port="training set"/>
              <connect from_op="JMySVMLearner" from_port="model" to_port="model"/>
              <portSpacing port="source_training" spacing="0"/>
              <portSpacing port="sink_model" spacing="0"/>
              <portSpacing port="sink_through 1" spacing="0"/>
            </process>
            <process expanded="true" height="414" width="435">
              <operator activated="true" class="rescale_confidences" expanded="true" height="76" name="PlattScaling" width="90" x="45" y="30"/>
              <operator activated="true" class="performance" expanded="true" height="76" name="Performance" width="90" x="313" y="30"/>
              <connect from_port="model" to_op="PlattScaling" to_port="prediction model"/>
              <connect from_port="test set" to_op="PlattScaling" to_port="example set"/>
              <connect from_op="PlattScaling" from_port="example set" to_op="Performance" to_port="labelled data"/>
              <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
              <portSpacing port="source_model" spacing="0"/>
              <portSpacing port="source_test set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_averagable 1" spacing="0"/>
              <portSpacing port="sink_averagable 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="ExampleSetGenerator" from_port="output" to_op="Validation" to_port="training"/>
          <connect from_op="Validation" from_port="model" to_port="result 1"/>
          <connect from_op="Validation" from_port="averagable 1" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>
    Greetings,
      Sebastian
  • Charles54 New Altair Community Member
    Hello Sebastian,

    Thanks so much for the clear - and very quick - reply. I had not realized that the Platt scaling operator replaced the need for the model applier. As usual, I understand the complex eventually... the obvious takes me a little longer.

    Unfortunately, the configuration you offered gives me the same wonky output as my original process. However, at least I know that the problem lies somewhere other than with the scaling. I will try running it again with a larger sample. Thanks again for sharing your expertise. It probably saved me numerous hours of futile experimentation. Have a great day.

    Regards, Charles
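
For anyone who lands on this thread wanting to see what Platt scaling does numerically, here is a minimal pure-Python sketch of the idea inside a k-fold cross-validation loop. It is not RapidMiner's implementation and makes several assumptions: the function names (`platt_probability`, `fit_platt`, `cross_validate`) are hypothetical, the "SVM scores" are simulated toy data, the sigmoid is fitted with plain gradient descent rather than Platt's original optimizer, and it is fitted on the training folds, which may differ from how the operator is wired in the processes above.

```python
import math
import random


def platt_probability(score, a, b):
    """Map a raw decision score to P(y=1) with the sigmoid 1 / (1 + exp(a*score + b))."""
    z = a * score + b
    # Clamp z so math.exp cannot overflow on extreme scores.
    z = max(-500.0, min(500.0, z))
    return 1.0 / (1.0 + math.exp(z))


def fit_platt(scores, labels, lr=0.5, epochs=500):
    """Fit (a, b) by gradient descent on the log-loss. Labels must be 0 or 1."""
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = platt_probability(s, a, b)
            # With this parameterisation, d(loss)/dz = y - p, where z = a*s + b.
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def cross_validate(scores, labels, k=5):
    """Fit the sigmoid on k-1 folds, then measure accuracy on the held-out fold."""
    n = len(scores)
    accuracies = []
    for fold in range(k):
        test_idx = set(range(fold, n, k))
        train_s = [scores[i] for i in range(n) if i not in test_idx]
        train_y = [labels[i] for i in range(n) if i not in test_idx]
        a, b = fit_platt(train_s, train_y)
        hits = sum(
            (platt_probability(scores[i], a, b) >= 0.5) == (labels[i] == 1)
            for i in test_idx
        )
        accuracies.append(hits / len(test_idx))
    return accuracies


# Toy data (assumption): simulated decision scores, positives centred at +1, negatives at -1.
random.seed(0)
labels = [i % 2 for i in range(200)]
scores = [random.gauss(1.0 if y else -1.0, 1.0) for y in labels]

a, b = fit_platt(scores, labels)
accs = cross_validate(scores, labels)
print("a = %.3f, b = %.3f, mean accuracy = %.3f" % (a, b, sum(accs) / len(accs)))
```

The point of the sketch is that the scaling step only remaps scores to confidences through a fitted sigmoid. It does not change which side of the decision boundary an example falls on, so it cannot by itself repair a classifier that separates the data poorly.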