"Non-linear regression"

Strauss
edited November 5 in Community Q&A
Is there an operator for non-linear regression, e.g. polynomial regression? I didn't find anything like that.

Answers

  • haddock
    There is a polynomial kernel available in LibSVMLearner.
  • Strauss
    Thank you very much. Is there a code example somewhere that shows how to use it and which parameters have to be chosen?
  • IngoRM
    Hi,

    you can find many examples in the "sample" directory of RapidMiner. There should also be one for a general regression setting. For the polynomial LibSVM, you have to set the type to one of the two "SVR" types, select the kernel type "poly" (polynomial) and define an appropriate degree and value for C. Which parameter values are appropriate can be evaluated with one of the parameter optimization operators (please also refer to the sample directory). Here is a simple setup (the model is applied to the training data; never do this in real life ;-):

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="number_of_attributes" value="1"/>
            <parameter key="target_function" value="one variable non linear"/>
        </operator>
        <operator name="NoiseGenerator" class="NoiseGenerator">
            <list key="noise">
            </list>
        </operator>
        <operator name="LibSVMLearner" class="LibSVMLearner">
            <parameter key="C" value="10000.0"/>
            <list key="class_weights">
            </list>
            <parameter key="degree" value="2"/>
            <parameter key="keep_example_set" value="true"/>
            <parameter key="kernel_type" value="poly"/>
            <parameter key="svm_type" value="epsilon-SVR"/>
        </operator>
        <operator name="ModelApplier" class="ModelApplier">
            <list key="application_parameters">
            </list>
        </operator>
    </operator>
    However, I would usually prefer an RBF kernel or an (additional) feature construction (for example with YAGGA2) instead, but if the polynomial kernel works for your data, this is of course fine.

    Cheers,
    Ingo
  • Strauss
    I am having real problems using this regression type. My approach is to load an example set from a database and produce a prediction model. But I think it takes too much time (e.g. more than 2 minutes) and the results are not satisfying.

    My operator tree is the following:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
            <parameter key="database_system" value="HSQLDB"/>
            <parameter key="database_url" value="jdbc:hsqldb:file:SnapshotDB"/>
            <parameter key="label_attribute" value="SNAPSHOT"/>
            <parameter key="query" value="SELECT SID, SNAPSHOT FROM snapshots"/>
            <parameter key="username" value="sa"/>
        </operator>
        <operator name="LibSVMLearner" class="LibSVMLearner">
            <parameter key="C" value="10000.0"/>
            <list key="class_weights">
            </list>
            <parameter key="kernel_type" value="poly"/>
            <parameter key="svm_type" value="epsilon-SVR"/>
        </operator>
        <operator name="ModelWriter" class="ModelWriter">
            <parameter key="model_file" value="prediction.mod"/>
        </operator>
    </operator>
    I hope someone can help me solve these problems or explain how to compute this model...
  • IngoRM
    Hi,

    if the runtime is too high, you could try to reduce the value of "C". If the results are not satisfying, I always would try the RBF kernel with an optimized value for gamma / sigma. This often leads to much better fits. Instead of introducing the non-linearity in the learner, you could also construct additional (polynomial) features before learning and simply apply a linear regression scheme afterwards. This is often faster and leads to understandable models.

    Cheers,
    Ingo
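
The feature-construction idea in the reply above can be sketched outside RapidMiner. This is only a minimal illustration in plain Python, not a RapidMiner process: it builds the polynomial features 1, x, x^2 by hand and fits them with ordinary least squares. All function names and the sample data are hypothetical and chosen for the example.

```python
def polynomial_features(x, degree):
    """Return [1, x, x^2, ..., x^degree] for a single numeric input."""
    return [x ** k for k in range(degree + 1)]


def solve_linear_system(a, b):
    """Solve a * w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (m[r][n] - s) / m[r][r]
    return w


def fit_polynomial(xs, ys, degree):
    """Linear least squares on the constructed features (normal equations)."""
    rows = [polynomial_features(x, degree) for x in xs]
    n = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve_linear_system(xtx, xty)


def predict(weights, x):
    """Evaluate the fitted polynomial at x."""
    features = polynomial_features(x, len(weights) - 1)
    return sum(w * f for w, f in zip(weights, features))


# Hypothetical noise-free data generated from y = 2 + 3x - x^2.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [2.0 + 3.0 * x - 1.0 * x ** 2 for x in xs]
weights = fit_polynomial(xs, ys, degree=2)
print(weights)
```

The fitted weights are directly readable as the coefficients of the polynomial, which is the sense in which such a model is "understandable" compared with a kernel SVM.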
  • Strauss

    ... I always would try the RBF kernel with an optimized value for gamma / sigma.
    It would be great if you could explain to me how to do this... Which operator do I have to choose for this?
  • IngoRM
    Hi,

    here are the basic settings for an RBF SVM:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="ExampleSetGenerator" class="ExampleSetGenerator">
            <parameter key="attributes_lower_bound" value="-20.0"/>
            <parameter key="attributes_upper_bound" value="15.0"/>
            <parameter key="number_examples" value="300"/>
            <parameter key="number_of_attributes" value="1"/>
            <parameter key="target_function" value="one variable non linear"/>
        </operator>
        <operator name="LibSVMLearner" class="LibSVMLearner">
            <parameter key="C" value="2000.0"/>
            <list key="class_weights">
            </list>
            <parameter key="gamma" value="1.0"/>
            <parameter key="keep_example_set" value="true"/>
            <parameter key="svm_type" value="epsilon-SVR"/>
        </operator>
        <operator name="ModelApplier" class="ModelApplier">
            <list key="application_parameters">
            </list>
        </operator>
    </operator>

    For the parameter optimization, you could have a look into the sample directory (..._Meta.../...ParameterOptimization.xml).

    Cheers,
    Ingo
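
The parameter optimization mentioned above boils down to trying candidate values and keeping the one with the lowest error on held-out data. The sketch below shows that loop in plain Python under loud assumptions: it is not the RapidMiner optimization operator, and the learner is a swapped-in stand-in (an RBF-weighted average of the training labels, not an SVR), so it only has a gamma and no C. The sine target, the data range and the gamma grid are hypothetical.

```python
import math


def rbf_weight(u, v, gamma):
    """RBF similarity exp(-gamma * (u - v)^2) for one-dimensional inputs."""
    return math.exp(-gamma * (u - v) ** 2)


def smoother_predict(train, x, gamma):
    """Stand-in learner (NOT an SVR): RBF-weighted average of training labels."""
    weights = [rbf_weight(x, tx, gamma) for tx, _ in train]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed; fall back to the plain label mean.
        return sum(ty for _, ty in train) / len(train)
    return sum(w * ty for w, (_, ty) in zip(weights, train)) / total


def validation_mse(train, valid, gamma):
    """Mean squared error on the held-out examples for one gamma."""
    errors = [(smoother_predict(train, x, gamma) - y) ** 2 for x, y in valid]
    return sum(errors) / len(errors)


def grid_search_gamma(train, valid, gammas):
    """Evaluate every candidate gamma and return the best one plus all scores."""
    scores = {g: validation_mse(train, valid, g) for g in gammas}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical data: 300 points on [-20, 15] with a sine target.
xs = [-20.0 + 35.0 * i / 299 for i in range(300)]
data = [(x, math.sin(x)) for x in xs]
# Even indices for training, odd indices for validation (never score on training data).
train, valid = data[0::2], data[1::2]

gammas = [0.001, 0.01, 0.1, 1.0, 10.0, 100.0]
best_gamma, scores = grid_search_gamma(train, valid, gammas)
print(best_gamma, scores[best_gamma])
```

With a real SVR the same loop would run over pairs of C and gamma, and a cross-validation would usually replace the single train/validation split.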
  • Strauss
    Okay, thank you very much. I think I got it now.

    But could it be that setting the degree of the function has no influence on the result?
  • IngoRM
    Hi,

    the kernel parameter "degree" is only used for a polynomial kernel; the parameters "sigma" / "gamma" are only used for RBF kernels.

    Cheers,
    Ingo
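
To make the point about "degree" concrete, here is a small sketch of the two kernel functions in plain Python. The formulas follow the kernel definitions documented for LIBSVM's command-line tools; how the RapidMiner operator passes its parameters through is an assumption here, and the sample vectors are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, degree=3, gamma=1.0, coef0=0.0):
    """Polynomial kernel as documented for LIBSVM: (gamma * u.v + coef0)^degree."""
    return (gamma * dot(u, v) + coef0) ** degree


def rbf_kernel(u, v, gamma=1.0):
    """RBF kernel as documented for LIBSVM: exp(-gamma * |u - v|^2).

    Note that there is no degree argument at all.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * squared_distance)


# Hypothetical example vectors.
u = [1.0, 2.0]
v = [0.5, -1.0]

# Changing the degree changes the polynomial kernel value ...
print(polynomial_kernel(u, v, degree=2))
print(polynomial_kernel(u, v, degree=3))
# ... while the RBF kernel depends only on gamma.
print(rbf_kernel(u, v, gamma=1.0))
```

Since the RBF function takes no degree at all, changing "degree" in an RBF setup cannot change the result, which is consistent with the answer above. In LIBSVM's own formulation gamma also scales the polynomial kernel, so treat the RBF-only reading of "gamma" as a rule of thumb.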