"saved XML output bug"

labrat
labrat New Altair Community Member
edited November 5 in Community Q&A
Hi all,

The XML that RapidMiner exports when you save results is currently not well-formed: the closing </object-stream> tag is missing from the end of the file.

Cheers

Stuart

Answers

  • fischer
    fischer New Altair Community Member
    This is strange. The XML output is handled by a library, so this is hard to track down. Can you post an example?

    Cheers,
    Simon
  • labrat
    labrat New Altair Community Member
    Using RM 4.5 with the SVM/XVal example from the tutorial, run this process:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="Input" class="ExampleSource">
            <parameter key="attributes" value="../data/polynomial.aml"/>
        </operator>
        <operator name="XVal" class="XValidation" expanded="yes">
            <parameter key="sampling_type" value="shuffled sampling"/>
            <operator name="Training" class="LibSVMLearner">
                <parameter key="svm_type" value="epsilon-SVR"/>
                <parameter key="kernel_type" value="poly"/>
                <parameter key="C" value="1000.0"/>
                <list key="class_weights">
                </list>
            </operator>
            <operator name="ApplierChain" class="OperatorChain" expanded="yes">
                <operator name="Test" class="ModelApplier">
                    <list key="application_parameters">
                    </list>
                </operator>
                <operator name="Evaluation" class="RegressionPerformance">
                    <parameter key="root_mean_squared_error" value="true"/>
                    <parameter key="absolute_error" value="true"/>
                    <parameter key="relative_error" value="true"/>
                    <parameter key="normalized_absolute_error" value="true"/>
                    <parameter key="root_relative_squared_error" value="true"/>
                    <parameter key="squared_error" value="true"/>
                    <parameter key="correlation" value="true"/>
                </operator>
            </operator>
        </operator>
    </operator>

    If you then save the performance file (*.per), you get this:

    <object-stream>
      <PerformanceVector id="1">
        <currentValues id="2">
          <entry>
            <string>root_mean_squared_error</string>
            <double>7.271397088254498</double>
          </entry>
          <entry>
            <string>relative_error</string>
            <double>0.4261726449515895</double>
          </entry>
          <entry>
            <string>correlation</string>
            <double>0.9990774750706919</double>
          </entry>
          <entry>
            <string>normalized_absolute_error</string>
            <double>0.04030556352101554</double>
          </entry>
          <entry>
            <string>absolute_error</string>
            <double>5.107471175794692</double>
          </entry>
          <entry>
            <string>squared_error</string>
            <double>54.826375982674925</double>
          </entry>
          <entry>
            <string>root_relative_squared_error</string>
            <double>0.04407058437419177</double>
          </entry>
        </currentValues>
        <comparator class="com.rapidminer.operator.performance.PerformanceVector$DefaultComparator" id="3"/>
        <mainCriterion>first</mainCriterion>
        <averagesList id="4">
          <root__mean__squared__error id="5">
            <sum>10965.275196534985</sum>
            <squaresSum>3960036.9527361454</squaresSum>
            <exampleCount>200.0</exampleCount>
            <predictedAttribute class="NumericalAttribute" id="6">
              <attributeDescription id="7">
                <name>prediction(label)</name>
                <valueType>4</valueType>
                <blockType>1</blockType>
                <defaultValue>0.0</defaultValue>
                <index>6</index>
              </attributeDescription>
              <transformations id="8"/>
              <statistics class="linked-list" id="9">
                <NumericalStatistics id="10">
                  <sum>0.0</sum>
                  <squaredSum>0.0</squaredSum>
                  <valueCounter>0</valueCounter>
                </NumericalStatistics>
                <WeightedNumericalStatistics id="11">
                  <sum>0.0</sum>
                  <squaredSum>0.0</squaredSum>
                  <totalWeight>0.0</totalWeight>
                  <count>0.0</count>
                </WeightedNumericalStatistics>
                <com.rapidminer.example.MinMaxStatistics id="12">
                  <minimum>Infinity</minimum>
                  <maximum>-Infinity</maximum>
                </com.rapidminer.example.MinMaxStatistics>
                <UnknownStatistics id="13">
                  <unknownCounter>0</unknownCounter>
                </UnknownStatistics>
              </statistics>
              <constructionDescription>prediction(label)</constructionDescription>
            </predictedAttribute>
            <labelAttribute class="NumericalAttribute" id="14">
              <attributeDescription id="15">
                <name>label</name>
                <valueType>4</valueType>
                <blockType>1</blockType>
                <defaultValue>0.0</defaultValue>
                <index>5</index>
              </attributeDescription>
              <transformations id="16"/>
              <statistics class="linked-list" id="17">
                <NumericalStatistics id="18">
                  <sum>0.0</sum>
                  <squaredSum>0.0</squaredSum>
                  <valueCounter>0</valueCounter>
                </NumericalStatistics>
                <WeightedNumericalStatistics id="19">
                  <sum>0.0</sum>
                  <squaredSum>0.0</squaredSum>
                  <totalWeight>0.0</totalWeight>
                  <count>0.0</count>
                </WeightedNumericalStatistics>
                <com.rapidminer.example.MinMaxStatistics id="20">
                  <minimum>Infinity</minimum>
                  <maximum>-Infinity</maximum>
                </com.rapidminer.example.MinMaxStatistics>
                <UnknownStatistics id="21">
                  <unknownCounter>0</unknownCounter>
                </UnknownStatistics>
              </statistics>
              <constructionDescription>label</constructionDescription>
            </labelAttribute>
            <meanSum>72.71397088254498</meanSum>
            <meanSquaredSum>548.2637598267493</meanSquaredSum>
            <averageCount>10</averageCount>
          </root__mean__squared__error>
          <absolute__error id="22">
            <sum>1021.4942351589382</sum>
            <squaresSum>10965.275196534985</squaresSum>
            <exampleCount>200.0</exampleCount>
            <predictedAttribute class="NumericalAttribute" reference="6"/>
            <labelAttribute class="NumericalAttribute" reference="14"/>
            <meanSum>51.07471175794692</meanSum>
            <meanSquaredSum>269.8618246507336</meanSquaredSum>
            <averageCount>10</averageCount>
          </absolute__error>
          <relative__error id="23">
            <sum>85.2345289903179</sum>
            <squaresSum>1012.762540663155</squaresSum>
            <exampleCount>200.0</exampleCount>
            <predictedAttribute class="NumericalAttribute" reference="6"/>
            <labelAttribute class="NumericalAttribute" reference="14"/>
            <meanSum>4.261726449515895</meanSum>
            <meanSquaredSum>3.142985588188072</meanSquaredSum>
            <averageCount>10</averageCount>
          </relative__error>
          <normalized__absolute__error id="24">
            <predictedAttribute class="NumericalAttribute" reference="6"/>
            <labelAttribute class="NumericalAttribute" reference="14"/>
            <deviationSum>1021.4942351589382</deviationSum>
            <relativeSum>27075.057565148352</relativeSum>
            <trueLabelSum>4078.1396808612185</trueLabelSum>
            <exampleCounter>20.0</exampleCounter>
            <meanSum>0.40305563521015536</meanSum>
            <meanSquaredSum>0.018255354969483512</meanSquaredSum>
            <averageCount>10</averageCount>
          </normalized__absolute__error>
          <root__relative__squared__error id="25">
            <predictedAttribute class="NumericalAttribute" reference="6"/>
            <labelAttribute class="NumericalAttribute" reference="14"/>
            <deviationSum>10965.275196534985</deviationSum>
            <relativeSum>6475981.792977156</relativeSum>
            <trueLabelSum>4078.1396808612185</trueLabelSum>
            <exampleCounter>20.0</exampleCounter>
            <meanSum>0.4407058437419177</meanSum>
            <meanSquaredSum>0.021258629086615133</meanSquaredSum>
            <averageCount>10</averageCount>
          </root__relative__squared__error>
          <squared__error id="26">
            <sum>10965.275196534985</sum>
            <squaresSum>3960036.9527361454</squaresSum>
            <exampleCount>200.0</exampleCount>
            <predictedAttribute class="NumericalAttribute" reference="6"/>
            <labelAttribute class="NumericalAttribute" reference="14"/>
            <meanSum>548.2637598267493</meanSum>
            <meanSquaredSum>34173.58564301037</meanSquaredSum>
            <averageCount>10</averageCount>
          </squared__error>
          <correlation id="27">
            <labelAttribute class="NumericalAttribute" reference="14"/>
            <predictedLabelAttribute class="NumericalAttribute" reference="6"/>
            <exampleCount>200.0</exampleCount>
            <sumLabel>36083.680010339376</sumLabel>
            <sumPredict>36280.64884722099</sumPredict>
            <sumLabelPredict>1.3344662294616919E7</sumLabelPredict>
            <sumLabelSqr>1.3277723556890765E7</sumLabelSqr>
            <sumPredictSqr>1.34225663075396E7</sumPredictSqr>
            <meanSum>9.990774750706919</meanSum>
            <meanSquaredSum>9.981562312225481</meanSquaredSum>
            <averageCount>10</averageCount>
          </correlation>
        </averagesList>
        <source>Evaluation</source>
      </PerformanceVector>


    As you can see, the closing "</object-stream>" tag is missing. The same is true of the *.RES files.


    Stuart
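For anyone who wants to confirm the problem on their own files, here is a minimal sketch in Python (not part of RapidMiner) that checks whether a saved result is well-formed XML. The `is_well_formed` helper and the shortened sample strings are illustrative assumptions; only the `<object-stream>` root and the missing closing tag come from the report above.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as a single well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Shortened stand-in for a saved *.per file: the root element is never closed.
truncated = (
    "<object-stream>\n"
    '  <PerformanceVector id="1">\n'
    "    <source>Evaluation</source>\n"
    "  </PerformanceVector>\n"
)
# The same content with the closing root tag restored.
complete = truncated + "</object-stream>\n"

print(is_well_formed(truncated))
print(is_well_formed(complete))
```

A strict parser rejects the first string and accepts the second, which is the difference a fussy consumer would see.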
  • fischer
    fischer New Altair Community Member
    Confirmed. However, that does not prevent RM from reading the file back in, does it? At least not for me.

    This is in fact a problem with XStream. It was simple to fix on our side, although I think this is a flaw in XStream's implementation: it requires us to close the stream after every object, which now prevents us from sending several XML streams in a row.

    Cheers,
    Simon
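To illustrate the behaviour Simon describes, here is a rough Python sketch of a writer that wraps several objects in one root element. It is not XStream and not RapidMiner code; `ObjectStreamWriter` and its methods are hypothetical names, and the assumption is simply that the root tag is opened when the writer is created and closed only when `close()` is called.

```python
import io
import xml.etree.ElementTree as ET


class ObjectStreamWriter:
    """Hypothetical multi-object XML writer with a single enclosing root."""

    def __init__(self, out):
        self._out = out
        self._closed = False
        # The root element is opened immediately so objects can be appended.
        self._out.write("<object-stream>\n")

    def write_object(self, tag: str, text: str) -> None:
        if self._closed:
            raise ValueError("stream already closed")
        elem = ET.Element(tag)
        elem.text = text
        self._out.write("  " + ET.tostring(elem, encoding="unicode") + "\n")

    def close(self) -> None:
        # The closing root tag is only written here.
        if not self._closed:
            self._out.write("</object-stream>\n")
            self._closed = True


buf = io.StringIO()
writer = ObjectStreamWriter(buf)
writer.write_object("source", "Evaluation")
unfinished = buf.getvalue()  # what ends up on disk if close() is never called
writer.close()
finished = buf.getvalue()    # complete document after close()
```

In this sketch `unfinished` ends after the last object, just like the saved file above, while `finished` ends with `</object-stream>`. Once the root is closed nothing more can be appended, which is the trade-off mentioned in the post.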
  • labrat
    labrat New Altair Community Member
    Correct, RM is able to read it back. However, some programs (like Excel) can be very fussy about XML being correctly constructed.

    Well, I'm glad I could help.
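For files already saved by an affected version, one possible workaround before handing them to a stricter tool is to append the missing tag. The sketch below is an assumption-laden helper, not an official fix: `repair_object_stream` is a hypothetical name, and it only handles the specific case reported in this thread (a document whose sole defect is the missing `</object-stream>` at the end).

```python
import xml.etree.ElementTree as ET

CLOSING_TAG = "</object-stream>"


def repair_object_stream(xml_text: str) -> str:
    """Return xml_text unchanged if it parses, else try appending the root close tag.

    Raises xml.etree.ElementTree.ParseError if the text is still not
    well-formed after the closing tag has been appended.
    """
    try:
        ET.fromstring(xml_text)
        return xml_text  # already well-formed, nothing to do
    except ET.ParseError:
        pass
    candidate = xml_text.rstrip() + "\n" + CLOSING_TAG + "\n"
    ET.fromstring(candidate)  # validate the repaired text before returning it
    return candidate


# Shortened stand-in for a saved result file with the root left open.
broken = '<object-stream>\n  <PerformanceVector id="1"/>\n'
fixed = repair_object_stream(broken)
```

Reading the file, passing its text through the helper and writing the result back should be enough for this case; anything with other damage is left to fail loudly rather than being guessed at.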