how to decrease model size - delete where weight

emolano · May 2009

Hi there.. me again

I have a process that creates a text mining model. The model is too big, so I want it to use only the data where weight > 0. In the weight table I see lots of words with weight = 0 that I want to delete, i.e. not include in the model. Is there a way to do this?
Thanks again for your help!
Here is my code:

<operator name="Root" class="Process" expanded="yes">
    <description text="#ylt#h3#ygt#text Data Mining#ylt#/h3#ygt##ylt#p#ygt##ylt#/p#ygt#"/>
    <operator name="DatabaseExampleSource" class="DatabaseExampleSource">
        <parameter key="database_url" value="jdbc:mysql://bi01:3306/database"/>
        <parameter key="username" value="user"/>
        <parameter key="password" value="pwd"/>
        <parameter key="query" value="SELECT `ID_NUM`, `SHORT_DESC`, `PLATFORM` FROM `TABLEX`;"/>
        <parameter key="label_attribute" value="PLATFORM"/>
        <parameter key="id_attribute" value="ID_NUM"/>
    </operator>
    <operator name="StringTextInput" class="StringTextInput" expanded="yes">
        <parameter key="filter_nominal_attributes" value="true"/>
        <parameter key="remove_original_attributes" value="true"/>
        <parameter key="default_content_language" value="english"/>
        <parameter key="output_word_list" value="crmtraining_words.list"/>
        <list key="namespaces">
        </list>
        <operator name="StringTokenizer" class="StringTokenizer">
        </operator>
        <operator name="EnglishStopwordFilter" class="EnglishStopwordFilter">
        </operator>
        <operator name="TokenLengthFilter" class="TokenLengthFilter">
            <parameter key="min_chars" value="2"/>
        </operator>
        <operator name="ToLowerCaseConverter" class="ToLowerCaseConverter">
        </operator>
        <operator name="PorterStemmer" class="PorterStemmer">
        </operator>
        <operator name="StopwordFilterFile" class="StopwordFilterFile">
            <parameter key="file" value="stop_filter_platform.txt"/>
        </operator>
        <operator name="TermNGramGenerator" class="TermNGramGenerator">
            <parameter key="max_length" value="3"/>
        </operator>
    </operator>
    <operator name="LibSVMLearner" class="LibSVMLearner">
        <parameter key="kernel_type" value="linear"/>
        <list key="class_weights">
        </list>
    </operator>
    <operator name="ModelWriter" class="ModelWriter">
        <parameter key="model_file" value="model.mod"/>
    </operator>
</operator>

land · May 2009

Hi,
you could apply a weighting scheme before the learner. This would reduce the number of attributes and hence the length of the support vectors. The SVMWeighting operator gives weights similar to the SVM's own weight vector. If you need to apply the weights later on, you could use the AttributeWeightsApplier.
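
As a rough sketch only, the two operators could sit between StringTextInput and LibSVMLearner in the process posted above. This is untested: it assumes both operators run with their default parameters, that the class names are spelled as shown, and that SVMWeighting can handle the PLATFORM label (if it has more than two values, that may not hold). The same weights would presumably also have to be applied to any new data before the stored model is used on it.

```xml
<!-- Sketch, not a verified process: insert after the closing tag of StringTextInput -->
<!-- and before LibSVMLearner. No parameters are set, so defaults are assumed. -->
<operator name="SVMWeighting" class="SVMWeighting">
</operator>
<!-- Applies the weights computed above to the example set, so that the learner -->
<!-- is trained on the weighted attributes rather than on the full word vector. -->
<operator name="AttributeWeightsApplier" class="AttributeWeightsApplier">
</operator>
```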

Greetings,
Sebastian
