landscape image classification system

Arenzky
Arenzky New Altair Community Member
edited November 5 in Community Q&A
I am new to RapidMiner.
I have two image datasets:
1. Training dataset: 10 folders, one for each of 10 landscape classes (beach, grassland, forest, etc.).
2. Test dataset: random images (I know each image belongs to one of the landscape classes, but I do not yet know which one).

AIM: I want to classify the images in the test dataset, using the training dataset as the reference, in order to answer a general question:
How many images are classified as beach, how many as grassland, how many as forest, etc.?
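
Once every test image has a predicted class, this question reduces to a simple tally. A minimal Python sketch, assuming the classifier's output is available as a list of predicted class names (the `predictions` list below is made-up illustrative data, not real results):

```python
from collections import Counter

# Hypothetical predicted labels, one per test image (illustrative only).
predictions = ["beach", "forest", "beach", "grassland", "forest", "beach"]

# Tally how many images were assigned to each landscape class.
counts = Counter(predictions)

# Print the classes from most to least frequent.
for label, n in counts.most_common():
    print(f"{label}: {n}")
```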

The general method, I guess, is the following:
1. extract features
2. apply classifier
3. validate
4. retrain if necessary
5. repeat the process from the start
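
The steps above can be sketched in plain Python. This is only an illustration under stated assumptions: feature extraction (step 1) is taken as already done, the two-value feature vectors are invented, and a simple nearest-centroid classifier stands in for whatever learner is eventually chosen. The function names (`train`, `predict`, `accuracy`) are hypothetical helpers, not part of RapidMiner or the image processing plugin.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train(samples):
    """samples: list of (feature_vector, label). Returns {label: centroid}."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(vs) for label, vs in by_label.items()}

def predict(model, features):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(model, key=lambda label: math.dist(model[label], features))

def accuracy(model, samples):
    """Fraction of labelled samples the model classifies correctly."""
    hits = sum(predict(model, f) == y for f, y in samples)
    return hits / len(samples)

# Step 1 (assumed already done): made-up feature vectors, one per image,
# e.g. (mean blue, mean green). Real features would come from the images.
training = [([0.9, 0.3], "beach"), ([0.8, 0.4], "beach"),
            ([0.2, 0.9], "forest"), ([0.3, 0.8], "forest")]
held_out = [([0.85, 0.35], "beach"), ([0.25, 0.85], "forest")]

model = train(training)            # step 2: fit the classifier
print(accuracy(model, held_out))   # step 3: validate on held-out labelled data
```

If the validation score is too low, steps 4 and 5 amount to changing the features or the learner and running the same loop again before labelling the real test images with `predict`.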

All images are JPEGs (I have several hundred), and I was thinking of using the image processing plugin provided by http://splab.cz/en/research/data-mining/articles, which you know better than I do.

Can somebody give me some hints on getting started with this plugin or a similar workflow?

Thanks a lot,
Daniel

Answers

  • StaryVena
    StaryVena New Altair Community Member
    Hello Daniel,
    an example process for classifying scenes using global features can look like this:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.3.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.3.000" expanded="true" name="Process">
        <process expanded="true" height="539" width="1150">
          <operator activated="true" class="imageprocessing:multiple_color_image_opener" compatibility="1.3.003" expanded="true" height="60" name="MCIO" width="90" x="45" y="30">
            <list key="images">
              <parameter key="nature" value="C:\Users\xuherv00\RMrepository\tutorial\categorization\nature"/>
              <parameter key="urban" value="C:\Users\xuherv00\RMrepository\tutorial\categorization\urban"/>
            </list>
            <parameter key="assign_label" value="true"/>
            <process expanded="true" height="536" width="942">
              <operator activated="true" class="imageprocessing:global_feature_extraction" compatibility="1.3.003" expanded="true" height="60" name="Global Feature Extractor from a Single Image" width="90" x="313" y="30">
                <process expanded="true" height="623" width="299">
                  <operator activated="true" class="imageprocessing:statistics" compatibility="1.3.003" expanded="true" height="60" name="Global statistics" width="90" x="45" y="30">
                    <parameter key="Median" value="false"/>
                    <parameter key=" Normalized Center of Mass" value="false"/>
                    <parameter key="Thickness" value="196"/>
                  </operator>
                  <operator activated="true" class="imageprocessing:BIC" compatibility="1.3.003" expanded="true" height="76" name="BIC" width="90" x="45" y="120"/>
                  <operator activated="true" class="imageprocessing:histogram" compatibility="1.3.003" expanded="true" height="60" name="histogram" width="90" x="45" y="210">
                    <parameter key="Bins" value="232"/>
                  </operator>
                  <operator activated="true" class="imageprocessing:dLog_distance" compatibility="1.3.003" expanded="true" height="60" name="dLog" width="90" x="179" y="120"/>
                  <operator activated="true" class="imageprocessing:color_to_grayscale" compatibility="1.3.003" expanded="true" height="60" name="Color to grayscale" width="90" x="45" y="300"/>
                  <operator activated="true" class="imageprocessing:obcf" compatibility="1.3.003" expanded="true" height="60" name="OBCF" width="90" x="179" y="300"/>
                  <connect from_port="color image plus 1" to_op="Global statistics" to_port="color image plus"/>
                  <connect from_port="color image plus 2" to_op="BIC" to_port="color image plus"/>
                  <connect from_port="color image plus 3" to_op="histogram" to_port="color image plus"/>
                  <connect from_port="color image plus 4" to_op="Color to grayscale" to_port="color image plus"/>
                  <connect from_op="Global statistics" from_port="features" to_port="feature 1"/>
                  <connect from_op="BIC" from_port="grayscale image plus" to_op="dLog" to_port="grayscale image plus Hist"/>
                  <connect from_op="histogram" from_port="features" to_port="feature 3"/>
                  <connect from_op="dLog" from_port="features" to_port="feature 2"/>
                  <connect from_op="Color to grayscale" from_port="grayscale image" to_op="OBCF" to_port="grayscale image plus"/>
                  <connect from_op="OBCF" from_port="features" to_port="feature 4"/>
                  <portSpacing port="source_color image plus 1" spacing="0"/>
                  <portSpacing port="source_color image plus 2" spacing="72"/>
                  <portSpacing port="source_color image plus 3" spacing="72"/>
                  <portSpacing port="source_color image plus 4" spacing="72"/>
                  <portSpacing port="source_color image plus 5" spacing="0"/>
                  <portSpacing port="sink_feature 1" spacing="0"/>
                  <portSpacing port="sink_feature 2" spacing="72"/>
                  <portSpacing port="sink_feature 3" spacing="72"/>
                  <portSpacing port="sink_feature 4" spacing="72"/>
                  <portSpacing port="sink_feature 5" spacing="0"/>
                </process>
              </operator>
              <connect from_port="color image plus" to_op="Global Feature Extractor from a Single Image" to_port="color image plus"/>
              <connect from_op="Global Feature Extractor from a Single Image" from_port="example set" to_port="Example set"/>
              <portSpacing port="source_color image plus" spacing="0"/>
              <portSpacing port="sink_Example set" spacing="0"/>
            </process>
          </operator>
          <operator activated="true" class="normalize" compatibility="5.3.000" expanded="true" height="94" name="Normalize" width="90" x="179" y="30"/>
          <operator activated="true" class="split_data" compatibility="5.3.000" expanded="true" height="94" name="Split Data" width="90" x="313" y="75">
            <enumeration key="partitions">
              <parameter key="ratio" value="0.8"/>
              <parameter key="ratio" value="0.2"/>
            </enumeration>
          </operator>
          <operator activated="true" class="optimize_selection_evolutionary" compatibility="5.3.000" expanded="true" height="94" name="Optimize Selection (Evolutionary)" width="90" x="447" y="75">
            <parameter key="population_size" value="10"/>
            <process expanded="true" height="623" width="165">
              <operator activated="true" class="x_validation" compatibility="5.3.000" expanded="true" height="112" name="Validation" width="90" x="45" y="30">
                <description>A cross-validation evaluating an SVM model.</description>
                <process expanded="true" height="536" width="446">
                  <operator activated="true" class="support_vector_machine_libsvm" compatibility="5.3.000" expanded="true" height="76" name="SVM" width="90" x="178" y="30">
                    <list key="class_weights"/>
                  </operator>
                  <connect from_port="training" to_op="SVM" to_port="training set"/>
                  <connect from_op="SVM" from_port="model" to_port="model"/>
                  <portSpacing port="source_training" spacing="0"/>
                  <portSpacing port="sink_model" spacing="0"/>
                  <portSpacing port="sink_through 1" spacing="0"/>
                </process>
                <process expanded="true" height="536" width="446">
                  <operator activated="true" class="apply_model" compatibility="5.3.000" expanded="true" height="76" name="Apply Model" width="90" x="45" y="30">
                    <list key="application_parameters"/>
                  </operator>
                  <operator activated="true" class="performance" compatibility="5.3.000" expanded="true" height="76" name="Performance" width="90" x="245" y="30"/>
                  <connect from_port="model" to_op="Apply Model" to_port="model"/>
                  <connect from_port="test set" to_op="Apply Model" to_port="unlabelled data"/>
                  <connect from_op="Apply Model" from_port="labelled data" to_op="Performance" to_port="labelled data"/>
                  <connect from_op="Performance" from_port="performance" to_port="averagable 1"/>
                  <portSpacing port="source_model" spacing="0"/>
                  <portSpacing port="source_test set" spacing="0"/>
                  <portSpacing port="source_through 1" spacing="0"/>
                  <portSpacing port="sink_averagable 1" spacing="0"/>
                  <portSpacing port="sink_averagable 2" spacing="0"/>
                </process>
              </operator>
              <connect from_port="example set" to_op="Validation" to_port="training"/>
              <connect from_op="Validation" from_port="averagable 1" to_port="performance"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="source_through 1" spacing="0"/>
              <portSpacing port="sink_performance" spacing="36"/>
            </process>
          </operator>
          <operator activated="true" class="select_by_weights" compatibility="5.3.000" expanded="true" height="94" name="Select by Weights" width="90" x="581" y="30"/>
          <operator activated="true" class="select_by_weights" compatibility="5.3.000" expanded="true" height="94" name="Select by Weights (2)" width="90" x="715" y="210"/>
          <operator activated="true" class="support_vector_machine_libsvm" compatibility="5.3.000" expanded="true" height="76" name="SVM (2)" width="90" x="715" y="30">
            <list key="class_weights"/>
          </operator>
          <operator activated="true" class="apply_model" compatibility="5.3.000" expanded="true" height="76" name="Apply Model (2)" width="90" x="849" y="30">
            <list key="application_parameters"/>
          </operator>
          <operator activated="true" class="performance" compatibility="5.3.000" expanded="true" height="76" name="Performance (2)" width="90" x="983" y="30"/>
          <connect from_op="MCIO" from_port="example set" to_op="Normalize" to_port="example set input"/>
          <connect from_op="Normalize" from_port="example set output" to_op="Split Data" to_port="example set"/>
          <connect from_op="Split Data" from_port="partition 1" to_op="Optimize Selection (Evolutionary)" to_port="example set in"/>
          <connect from_op="Split Data" from_port="partition 2" to_op="Select by Weights (2)" to_port="example set input"/>
          <connect from_op="Optimize Selection (Evolutionary)" from_port="example set out" to_op="Select by Weights" to_port="example set input"/>
          <connect from_op="Optimize Selection (Evolutionary)" from_port="weights" to_op="Select by Weights" to_port="weights"/>
          <connect from_op="Optimize Selection (Evolutionary)" from_port="performance" to_port="result 3"/>
          <connect from_op="Select by Weights" from_port="example set output" to_op="SVM (2)" to_port="training set"/>
          <connect from_op="Select by Weights" from_port="weights" to_op="Select by Weights (2)" to_port="weights"/>
          <connect from_op="Select by Weights (2)" from_port="example set output" to_op="Apply Model (2)" to_port="unlabelled data"/>
          <connect from_op="Select by Weights (2)" from_port="weights" to_port="result 4"/>
          <connect from_op="SVM (2)" from_port="model" to_op="Apply Model (2)" to_port="model"/>
          <connect from_op="Apply Model (2)" from_port="labelled data" to_op="Performance (2)" to_port="labelled data"/>
          <connect from_op="Apply Model (2)" from_port="model" to_port="result 2"/>
          <connect from_op="Performance (2)" from_port="performance" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="36"/>
          <portSpacing port="sink_result 4" spacing="126"/>
          <portSpacing port="sink_result 5" spacing="198"/>
        </process>
      </operator>
    </process>
    Best,
    Václav
  • wessel
    wessel New Altair Community Member
    This process is awesome!

    Can I upload it to the community database?
  • StaryVena
    StaryVena New Altair Community Member
    Yes, of course ;)

    Best,
    Václav