
Confused by the numerical XValidation output

User: "Legacy User"
New Altair Community Member
Updated by Jocelyn
Hi,

Here is my question this time: why does the RMS error printed by XValidation decrease as the number of validations increases?

Here is a simple example:

Data set:

X,Y
0, 0.18224201
1, 2.002307783
2, 4.187028114
...
49, 98.21944595

(this is simply Y = 2*X + rand() - 0.5)
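
For anyone who wants to reproduce the data outside the spreadsheet, here is a minimal Python sketch of the same formula. The function name `make_dataset`, the seed, and the default of 50 rows (X = 0..49) are my own assumptions for illustration; the actual values in lin.aml came from a different random draw.

```python
import random

def make_dataset(n=50, seed=0):
    """Return (x, y) pairs with y = 2*x + u, where u is uniform in [-0.5, 0.5)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable (assumption)
    return [(x, 2 * x + rng.random() - 0.5) for x in range(n)]

data = make_dataset()
for x, y in data[:3]:
    print(x, y)
```

Since the noise term is uniform on an interval of width 1, its standard deviation is 1/sqrt(12), roughly 0.289, which is the RMS error a perfect fit of Y = 2*X would leave behind.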


Standard XVal experiment:

<operator name="Root" class="Process" expanded="yes">
    <operator name="ExampleSource" class="ExampleSource">
        <parameter key="attributes" value="H:\tmp\lin.aml"/>
    </operator>
    <operator name="XValidation" class="XValidation" expanded="yes">
        <parameter key="create_complete_model" value="true"/>
        <parameter key="keep_example_set" value="true"/>
        <parameter key="number_of_validations" value="60"/>
        <parameter key="sampling_type" value="shuffled sampling"/>
        <operator name="LinearRegression" class="LinearRegression">
            <parameter key="feature_selection" value="none"/>
            <parameter key="keep_example_set" value="true"/>
        </operator>
        <operator name="OperatorChain" class="OperatorChain" expanded="yes">
            <operator name="ModelApplier" class="ModelApplier">
                <list key="application_parameters">
                </list>
            </operator>
            <operator name="Performance" class="Performance">
            </operator>
        </operator>
    </operator>
</operator>

When I increase number_of_validations, here is what happens:

no_of_val    rms_error

10                0.271 +- 0.040
20                0.258 +- 0.087
30                0.248 +- 0.117
40                0.252 +- 0.122
50                0.239 +- 0.140

I would expect that, as the number of validations grows, the error stays about the same (because it is determined by the rand() term) and its uncertainty decreases. Instead, the error drops and its uncertainty grows.
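
One way to probe this outside RapidMiner is a small plain-Python cross-validation that reports the error in two ways: the average of the per-fold RMS values, and a single RMS pooled over all held-out predictions. This is only a sketch. It assumes (unconfirmed) that XValidation averages per-fold values, and the names `fit_line` and `cross_validate`, the seeds, and the fold assignment are hypothetical and not RapidMiner's implementation.

```python
import math
import random

def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx

def cross_validate(points, k, seed=0):
    """Shuffled k-fold CV; returns (mean of per-fold RMSE, pooled RMSE)."""
    rng = random.Random(seed)
    shuffled = points[:]
    rng.shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # requires k <= len(points)
    fold_rmse, sq_errors = [], []
    for i, test in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        slope, intercept = fit_line(train)
        errs = [(y - (slope * x + intercept)) ** 2 for x, y in test]
        fold_rmse.append(math.sqrt(sum(errs) / len(errs)))
        sq_errors.extend(errs)
    mean_rmse = sum(fold_rmse) / k
    pooled_rmse = math.sqrt(sum(sq_errors) / len(sq_errors))
    return mean_rmse, pooled_rmse

# Same formula as the data set above: Y = 2*X + rand() - 0.5, X = 0..49
rng = random.Random(1)
data = [(x, 2 * x + rng.random() - 0.5) for x in range(50)]

for k in (10, 25, 50):  # fold counts that divide 50 evenly
    mean_rmse, pooled_rmse = cross_validate(data, k)
    print(f"k={k:2d}  mean of fold RMSE={mean_rmse:.3f}  pooled RMSE={pooled_rmse:.3f}")
```

With equal-sized folds, the mean of the per-fold RMS values can never exceed the pooled RMS, because the square root is concave. In the extreme case of one example per fold, each fold's RMS is just the absolute error, and the mean absolute value of uniform noise on [-0.5, 0.5] is 0.25 rather than 0.289. If the averaged figure falls as k grows while the pooled figure stays put, the trend in the table may come from how the per-fold values are aggregated rather than from the model itself.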

Thanks!
