kernelPCA

Quarrel
Quarrel New Altair Community Member
edited November 5 in Community Q&A
Hi,

Firstly, thanks for a fantastic program, I've used it extensively in my own research and have had great success.

On to my problem: I'm trying to use the KernelPCA feature transformation. However, it invariably fails with an error along these lines:

P Jul 9, 2009 1:42:49 AM: ModelApplier: Applying com.rapidminer.operator.features.transformation.KernelPCAModel
P Jul 9, 2009 1:42:49 AM: KernelPCA: Adding new the derived features...
P Jul 9, 2009 1:42:49 AM: KernelPCA: Calculating new features
G Jul 9, 2009 1:42:49 AM: [Fatal] ArrayIndexOutOfBoundsException occured in 1st application of ModelApplier (ModelApplier)
G Jul 9, 2009 1:42:49 AM: [Fatal] Process failed: operator cannot be executed (60). Check the log messages...
          Root[1] (Process)
          +- ExampleSource[1] (ExampleSource)
          +- KernelPCA[1] (KernelPCA)
here ==> +- ModelApplier[1] (ModelApplier)

This example is relatively simple compared to my actual setup, but it fails just the same. If I change KernelPCA to PCA or GHA, for instance, it works just fine. Is KernelPCA really different from PCA in terms of RapidMiner setup? (Obviously it is in effect, but I can worry about whether it's a good idea once I can get it to work :)

Each of these operators appears to take an ExampleSet and return an ExampleSet and a Model, which implies that KernelPCA should work with a ModelApplier just as GHA and PCA do.

Note that the index of the error (60 in the above) always appears to be 1 past the number of attributes I have in the ExampleSet.
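To illustrate the pattern this hints at, here is a minimal Java sketch of an off-by-one loop, assuming the 60 in the log is the offending array index. It is not the RapidMiner source; the class `OffByOneSketch` and the method `firstFailingIndex` are hypothetical names made up for this illustration.

```java
public class OffByOneSketch {

    // Hypothetical helper: walks an array of numAttributes values with a
    // deliberately wrong loop bound and reports the index that fails.
    static int firstFailingIndex(int numAttributes) {
        double[] values = new double[numAttributes];
        int i = 0;
        try {
            // Bug on purpose: "<=" visits index numAttributes,
            // which is one past the last valid index (numAttributes - 1).
            for (; i <= numAttributes; i++) {
                values[i] = i;
            }
        } catch (ArrayIndexOutOfBoundsException e) {
            return i; // the index that was out of bounds
        }
        return -1; // not reached with the buggy bound above
    }

    public static void main(String[] args) {
        // sonar.aml is assumed here to contribute 60 regular attributes.
        System.out.println("Failing index: " + firstFailingIndex(60));
    }
}
```

With 60 attributes the valid indices run from 0 to 59, so the first access that fails is at index 60, which would match the number reported in the log.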

I'm sure I'm missing something, but help would be much appreciated :)

All this is in RapidMiner 4.4 CE.


--Q

Answers

  • land
    land New Altair Community Member
    Hi,
    Unfortunately, or rather fortunately, I cannot reproduce this error. It seems to be connected with your data. Could you please try to reproduce it with an ExampleSet generator and paste the whole process here? Or, if it only occurs with your data set, could you paste it or send it by mail? I'm really interested in getting this bug fixed, because this is one of my favourite operators that I have written over the past years...


    Greetings,
      Sebastian
  • Quarrel
    Quarrel New Altair Community Member
    Hi,

    Thanks for looking.

    So my example above was actually a modified sample, using the sonar.aml data.

    I got there by loading the sample 05_GHA_Weights.xml, enabling the ModelApplier, and deleting ComponentWeights. If I run that with GHA still in place, it works. If I replace GHA with KernelPCA, it fails with the error message shown. I believe the two should be equivalent in basic usage (if not in the outcome of the returned Models). Is this correct?

    I get very much the same message with my own data, but this seems like a good place to start, as it's on "known" data.

    So, the whole process that fails:

    <operator name="Root" class="Process" expanded="yes">
        <description text="GHA is a faster principal components analysis algorithm which is more suitable for bigger data sets. In this process the ModelApplier has been disabled and the operator ComponentWeights is used to determine attribute weights from the GHA model. These weights can be used for feature selection for example."/>
        <parameter key="logverbosity" value="status"/>
        <operator name="ExampleSource" class="ExampleSource">
            <parameter key="attributes" value="../data/sonar.aml"/>
        </operator>
        <operator name="KernelPCA" class="KernelPCA">
        </operator>
        <operator name="ModelApplier" class="ModelApplier">
            <list key="application_parameters">
              <parameter key="keep_attributes" value="true"/>
              <parameter key="nr_components" value="2"/>
            </list>
        </operator>
    </operator>

    (Obviously the text is just a holdover from modifying 05_GHA_Weights)

    Checking the process gives no errors, but running it gives:

    P Jul 10, 2009 12:21:33 AM: ModelApplier: Set parameters for com.rapidminer.operator.features.transformation.KernelPCAModel
    P Jul 10, 2009 12:21:33 AM: KernelPCA: The learned model does not support parameter
    Last message repeated 1 times.
    P Jul 10, 2009 12:21:33 AM: ModelApplier: Applying com.rapidminer.operator.features.transformation.KernelPCAModel
    P Jul 10, 2009 12:21:33 AM: KernelPCA: Adding new the derived features...
    P Jul 10, 2009 12:21:33 AM: KernelPCA: Calculating new features
    G Jul 10, 2009 12:21:33 AM: [Fatal] ArrayIndexOutOfBoundsException occured in 1st application of ModelApplier (ModelApplier)
    G Jul 10, 2009 12:21:33 AM: [Fatal] Process failed: operator cannot be executed (60). Check the log messages...
              Root[1] (Process)
              +- ExampleSource[1] (ExampleSource)
              +- KernelPCA[1] (KernelPCA)
    here ==> +- ModelApplier[1] (ModelApplier)



    --Q
  • land
    land New Altair Community Member
    Hi,
    Thank you for this detailed error description. I was able to track this bug down and fix it. The corrected version will be included in the next release, which will be available for download within the next two weeks.

    And of course you are right: KernelPCA, PCA and GHA are in principle equivalent and can be used in the same place in a process.

    Greetings,
      Sebastian
  • Quarrel
    Quarrel New Altair Community Member
    Sebastian,

    Thanks very much for tracking this down. I probably should have glanced at the source myself, seeing as I have it lying around. Glad you found the problem, and I'm very excited to hear that a new release might be out in the next few weeks!


    --Q