"Basic Feature Selection Process"

smackdown33
smackdown33 New Altair Community Member
edited November 5 in Community Q&A
Hi, I've been trying to run the following XML process with various heap space settings, without success. I keep getting GC errors or heap space errors. The dataset consists of 1300 examples with 7200 attributes, but after about 10 seconds the errors mentioned above arise. Can someone please help?

<operator name="Root" class="Process" expanded="yes">
    <operator name="ArffExampleSource" class="ArffExampleSource">
        <parameter key="data_file" value="G:\Postgrad\Java\NetBean Projects\ImageUtil\lutImages\newPNGImages\10Colours\Features\features2.arff"/>
        <parameter key="datamanagement" value="short_sparse_array"/>
    </operator>
    <operator name="FS" class="FeatureSelection" expanded="yes">
        <operator name="FSChain" class="OperatorChain" expanded="yes">
            <operator name="XValidation" class="XValidation" expanded="yes">
                <operator name="Learner" class="LibSVMLearner">
                    <list key="class_weights">
                    </list>
                </operator>
                <operator name="ApplierChain" class="OperatorChain" expanded="yes">
                    <operator name="Applier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="Evaluator" class="Performance">
                    </operator>
                </operator>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <list key="log">
                    <parameter key="generation" value="operator.FS.value.generation"/>
                    <parameter key="performance" value="operator.FS.value.performance"/>
                </list>
            </operator>
        </operator>
    </operator>
</operator>

Answers

  • land
    land New Altair Community Member
    Hi,
    That's correct. Unfortunately, the current implementation of the feature selection needs far too much memory. Please take a look at this forum thread, where we have already had the complete discussion: http://rapid-i.com/rapidforum/index.php/topic,1089.0.html

    Greetings,
      Sebastian
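
For a sense of scale, here is a rough back-of-the-envelope estimate of the raw data size for the 1300 x 7200 dataset mentioned above. It assumes every value is held as a dense 8-byte double, which is an assumption for illustration only (the process above requests `short_sparse_array`, which would use a different layout). The function name `dense_table_bytes` is hypothetical.

```python
def dense_table_bytes(n_examples: int, n_attributes: int, bytes_per_value: int = 8) -> int:
    """Estimate the size of a dense numeric table in bytes.

    Assumes one fixed-width value per (example, attribute) cell and
    ignores any per-object or bookkeeping overhead.
    """
    return n_examples * n_attributes * bytes_per_value


# Figures taken from the question: 1300 examples, 7200 attributes.
size = dense_table_bytes(1300, 7200)
print(f"{size / 2**20:.1f} MiB")  # prints 71.4 MiB
```

Under that assumption the table itself is on the order of 71 MiB, which suggests the heap errors stem from what the selection operator allocates on top of the data rather than from the data alone.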
  • smackdown33
    smackdown33 New Altair Community Member
    Hi Sebastian,

    Thanks for the response. I've read the other thread, but it didn't really get me anywhere. I was wondering whether you or anyone else knows the best way to select a reduced feature set from the 7200 attributes I have, and then train an SVM on that reduced set given my computer's limitations.
  • land
    land New Altair Community Member
    Hi,
    Did you try the evolutionary Feature Selection?

    Greetings,
      Sebastian
  • smackdown33
    smackdown33 New Altair Community Member
    Hi Sebastian,

    Is that the evolutionary feature aggregation you are referring to? It's the only feature selection method I can find that is "evolutionary".

    Thanks again for your help.
  • smackdown33
    smackdown33 New Altair Community Member
    Hi, below is the setup I most recently used, and I'm still not getting anything from it; it keeps running out of memory after at most 1 minute. If you can help me get this sorted out, you will be a lifesaver, Sebastian. Thanks for your help.
    <operator name="Root" class="Process" expanded="yes">
        <operator name="ArffExampleSource" class="ArffExampleSource">
            <parameter key="data_file" value="G:\Postgrad\Java\NetBean Projects\ImageUtil\lutImages\newPNGImages\10Colours\Features\features2.arff"/>
        </operator>
        <operator name="EvolutionaryFeatureAggregation" class="EvolutionaryFeatureAggregation" expanded="yes">
            <operator name="XValidation" class="XValidation" expanded="yes">
                <operator name="JMySVMLearner" class="JMySVMLearner">
                </operator>
                <operator name="OperatorChain" class="OperatorChain" expanded="yes">
                    <operator name="ModelApplier" class="ModelApplier">
                        <list key="application_parameters">
                        </list>
                    </operator>
                    <operator name="Performance" class="Performance">
                    </operator>
                </operator>
            </operator>
            <operator name="ProcessLog" class="ProcessLog">
                <list key="log">
                </list>
            </operator>
        </operator>
    </operator>
  • land
    land New Altair Community Member
    Hi,
    Besides buying the mentioned plugin? Hmm. You could try an evolutionaryWeighting; this should switch off unimportant attributes by giving them weight 0. I hope that helps.

    Greetings,
      Sebastian
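
To illustrate the idea of switching attributes off via weights, here is a minimal sketch of how a weight vector translates into a reduced feature set. It is not RapidMiner code: the function `select_by_weight`, the attribute names, and the weights are all hypothetical, and the weights are simply assumed to have been produced by some weighting scheme.

```python
def select_by_weight(attribute_names, weights, threshold=0.0):
    """Return the attributes whose weight is strictly above the threshold.

    With the default threshold of 0.0, attributes that received
    weight 0 are dropped and all others are kept.
    """
    if len(attribute_names) != len(weights):
        raise ValueError("need exactly one weight per attribute")
    return [name for name, w in zip(attribute_names, weights) if w > threshold]


# Hypothetical attribute names and weights, for illustration only.
names = ["f1", "f2", "f3", "f4"]
weights = [0.8, 0.0, 0.3, 0.0]
print(select_by_weight(names, weights))  # prints ['f1', 'f3']
```

The reduced list could then be used to filter the dataset before training the SVM; raising `threshold` would give a more aggressive reduction.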
  • smackdown33
    smackdown33 New Altair Community Member
    Hi Sebastian,

    Can you PM me the price of the plugin? Thanks.