how to decrease model size - delete where weight

emolano New Altair Community Member
edited November 2024 in Community Q&A
Hi there, me again :)
I have a process that creates a text mining model. The model is too big, so I want it to use only the data where weight > 0. In the weight table I see lots of words with weight = 0 that I want to delete, i.e. not include in the model. Is there a way to do this?
Thanks again for your help!
Here is my code:
 
<operator name="Root" class="Process" expanded="yes">
    <description text="#ylt#h3#ygt#text Data Mining#ylt#/h3#ygt##ylt#p#ygt##ylt#/p#ygt#"/>
    <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
        <parameter key="database_url" value="jdbc:mysql://bi01:3306/database"/>
        <parameter key="username" value="user"/>
        <parameter key="password" value="pwd"/>
        <parameter key="query" value="SELECT `ID_NUM`, `SHORT_DESC`, `PLATFORM` FROM `TABLEX`;"/>
        <parameter key="label_attribute" value="PLATFORM"/>
        <parameter key="id_attribute" value="ID_NUM"/>
    </operator>
    <operator name="StringTextInput" class="StringTextInput" expanded="yes">
        <parameter key="filter_nominal_attributes" value="true"/>
        <parameter key="remove_original_attributes" value="true"/>
        <parameter key="default_content_language" value="english"/>
        <parameter key="output_word_list" value="crmtraining_words.list"/>
        <list key="namespaces">
        </list>
        <operator name="StringTokenizer" class="StringTokenizer">
        </operator>
        <operator name="EnglishStopwordFilter" class="EnglishStopwordFilter">
        </operator>
        <operator name="TokenLengthFilter" class="TokenLengthFilter">
            <parameter key="min_chars" value="2"/>
        </operator>
        <operator name="ToLowerCaseConverter" class="ToLowerCaseConverter">
        </operator>
        <operator name="PorterStemmer" class="PorterStemmer">
        </operator>
        <operator name="StopwordFilterFile" class="StopwordFilterFile">
            <parameter key="file" value="stop_filter_platform.txt"/>
        </operator>
        <operator name="TermNGramGenerator" class="TermNGramGenerator">
            <parameter key="max_length" value="3"/>
        </operator>
    </operator>
    <operator name="LibSVMLearner" class="LibSVMLearner">
        <parameter key="kernel_type" value="linear"/>
        <list key="class_weights">
        </list>
    </operator>
    <operator name="ModelWriter" class="ModelWriter">
        <parameter key="model_file" value="model.mod"/>
    </operator>
</operator>
Answers

  • land New Altair Community Member
    Hi,
    You could apply a weighting scheme before the learner. This would reduce the number of attributes and hence the length of the support vectors. The SVMWeighting operator produces weights similar to the SVM's own weight vector. If you need to apply the weights later on, you could use the AttributeWeightsApplier operator.

    Greetings,
      Sebastian
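
A rough sketch of what this suggestion could look like in the process above, placed between the StringTextInput and LibSVMLearner operators. This is only an illustration under assumptions, not a tested setup: the AttributeWeightSelection operator and all parameter keys shown here (normalize_weights, weight, weight_relation, use_absolute_weights) are not mentioned in the thread and should be checked against the operator reference of your RapidMiner version.

```xml
<!-- Sketch only: goes after StringTextInput and before LibSVMLearner. -->
<!-- Computes one weight per word attribute from a linear SVM. -->
<operator name="SVMWeighting" class="SVMWeighting">
    <!-- Assumed parameter: leave weights unnormalized so that 0 still means "unused". -->
    <parameter key="normalize_weights" value="false"/>
</operator>
<!-- Assumed operator: keeps only the attributes whose weight satisfies the relation. -->
<operator name="AttributeWeightSelection" class="AttributeWeightSelection">
    <!-- Assumed parameters: keep attributes with |weight| > 0, drop the rest. -->
    <parameter key="weight" value="0.0"/>
    <parameter key="weight_relation" value="greater"/>
    <parameter key="use_absolute_weights" value="true"/>
</operator>
```

With something like this in place, LibSVMLearner would only see the words that received a non-zero weight, so the stored model should contain shorter support vectors. The same attribute selection has to be applied to new data before the model is used, which is where applying the saved weights comes in.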
