Confused by the numerical XValidation output

User: "Legacy User"
New Altair Community Member
Updated by Jocelyn
Hi,

Here is my question this time: why does the RMS error printed by XValidation decrease as the number of validations increases?

Here is a simple example:

Data set:

X,Y
0, 0.18224201
1, 2.002307783
2, 4.187028114
...
49, 98.21944595

(this is simply Y = 2*X + rand() - 0.5)
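
For anyone who wants to recreate a comparable data set, here is a minimal Python sketch of the formula above. It assumes rand() returns a uniform value in [0, 1); the seed and the name make_dataset are only illustrative, so the values will not match the rows listed above.

```python
import random


def make_dataset(n=50, seed=0):
    """Return n (X, Y) pairs with Y = 2*X + rand() - 0.5."""
    rng = random.Random(seed)  # fixed seed, an arbitrary choice for repeatability
    # rng.random() is uniform on [0, 1), so the noise is uniform on [-0.5, 0.5)
    return [(x, 2 * x + rng.random() - 0.5) for x in range(n)]


data = make_dataset()
print("X,Y")
for x, y in data[:3]:
    print(f"{x}, {y}")
```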


Standard XVal experiment:

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="H:\tmp\lin.aml"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
        <parameter key="create_complete_model" value="true"/>
        <parameter key="keep_example_set" value="true"/>
        <parameter key="number_of_validations" value="60"/>
        <parameter key="sampling_type" value="shuffled sampling"/>
        <operator name="LinearRegression" class="LinearRegression">
            <parameter key="feature_selection" value="none"/>
            <parameter key="keep_example_set" value="true"/>
        </operator>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="Performance" class="Performance">
            </operator>
        </operator>
    </operator>
</operator>

When I increase number_of_validations, here is what happens:

no_of_val    rms_error
10           0.271 +- 0.040
20           0.258 +- 0.087
30           0.248 +- 0.117
40           0.252 +- 0.122
50           0.239 +- 0.140

I would expect that, as the number of validations increases, the error stays about the same (because it is determined by the rand() noise) while its uncertainty decreases. Instead, the error drops and its uncertainty grows. Why?
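
One way to probe this outside RapidMiner is the rough Python sketch below. It rests on an assumption that I have not confirmed: that the printed "a +- b" is the mean and standard deviation of the RMS errors computed separately on each fold. The helper names fit_line and fold_rms_errors are made up for this sketch, and the least-squares fit is a plain stand-in for the LinearRegression operator.

```python
import math
import random
import statistics


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


def fold_rms_errors(points, k, rng):
    """Shuffle, split into k folds, and return one RMS error per held-out fold."""
    shuffled = points[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    errors = []
    for i, test in enumerate(folds):
        # Train on every fold except the one held out
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        slope, intercept = fit_line(train)
        mse = sum((y - (slope * x + intercept)) ** 2 for x, y in test) / len(test)
        errors.append(math.sqrt(mse))
    return errors


rng = random.Random(0)  # arbitrary seed
data = [(x, 2 * x + rng.random() - 0.5) for x in range(50)]
for k in (10, 20, 30, 40, 50):
    errs = fold_rms_errors(data, k, rng)
    # Assumed reporting scheme: mean and spread of the per-fold RMS values
    print(k, round(statistics.mean(errs), 3), "+-", round(statistics.pstdev(errs), 3))
```

If that assumption holds, a larger number of validations means fewer examples per fold, and with 50 folds on 50 examples each fold's RMS is just the absolute error of a single point. For noise uniform on [-0.5, 0.5] the mean absolute error is 0.25, while the true RMS is sqrt(1/12), about 0.289. The average of per-fold square roots can never exceed the square root of the pooled mean squared error, which would be consistent with the downward trend and the growing spread in the table above.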

Thanks!
