Problem with StopwordFilterFile
nguyenxuanhau
My process XML file is as follows:
<process version="4.6">
  <operator name="Root" class="Process" expanded="yes">
    <description text="Text Hau"/>
    <parameter key="logverbosity" value="init"/>
    <parameter key="random_seed" value="2001"/>
    <parameter key="send_mail" value="never"/>
    <parameter key="process_duration_for_mail" value="30"/>
    <parameter key="encoding" value="UTF-8"/>
    <operator name="TextInput" class="TextInput" expanded="yes">
      <list key="texts">
        <parameter key="graphics" value="dulieu"/>
      </list>
      <parameter key="default_content_type" value=""/>
      <parameter key="default_content_encoding" value="utf-8"/>
      <parameter key="default_content_language" value=""/>
      <parameter key="prune_below" value="-1"/>
      <parameter key="prune_above" value="-1"/>
      <parameter key="vector_creation" value="TermOccurrences"/>
      <parameter key="use_content_attributes" value="false"/>
      <parameter key="use_given_word_list" value="false"/>
      <parameter key="return_word_list" value="false"/>
      <parameter key="id_attribute_type" value="short"/>
      <list key="namespaces">
      </list>
      <parameter key="create_text_visualizer" value="false"/>
      <parameter key="on_the_fly_pruning" value="-1"/>
      <parameter key="extend_exampleset" value="false"/>
      <operator name="StringTokenizer" class="StringTokenizer">
      </operator>
      <operator name="StopwordFilterFile" class="StopwordFilterFile">
        <parameter key="file" value="dulieu/stopword.txt"/>
        <parameter key="case_sensitive" value="true"/>
      </operator>
    </operator>
  </operator>
</process>
When I run this process, it doesn't filter out the stopwords that are encoded in UTF-8.
land
Hi,
If you switch to expert mode in RapidMiner's parameters view, you will see that there is an encoding parameter. If you set this parameter to UTF-8, the process will work.
Greetings,
Sebastian
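For readers wondering why the encoding setting matters, here is a minimal Python sketch of the effect, not RapidMiner's actual implementation. It assumes the stopword file is UTF-8 on disk and that the reader falls back to a single-byte charset (Latin-1 is used here purely as an illustration). The stopwords, tokens, and the helper names `load_stopwords` and `filter_tokens` are all hypothetical.

```python
# Hypothetical contents of a stopword file, stored on disk as UTF-8 bytes.
stopword_bytes = "và\ncủa\n".encode("utf-8")


def load_stopwords(raw, encoding):
    """Decode the raw file bytes with the given charset and return a set of words."""
    text = raw.decode(encoding)
    return {line.strip() for line in text.splitlines() if line.strip()}


def filter_tokens(tokens, stopwords):
    """Drop every token that appears in the stopword set (case-sensitive)."""
    return [t for t in tokens if t not in stopwords]


# Hypothetical tokens from a correctly decoded UTF-8 document.
tokens = ["dữ", "liệu", "và", "văn", "bản"]

# Assumed wrong charset: the multi-byte UTF-8 sequences turn into mojibake,
# so none of the decoded stopwords equals a real token.
wrong = load_stopwords(stopword_bytes, "latin-1")

# Matching charset: the stopwords compare equal to the tokens.
right = load_stopwords(stopword_bytes, "utf-8")

filtered_wrong = filter_tokens(tokens, wrong)
filtered_right = filter_tokens(tokens, right)

print(filtered_wrong)
print(filtered_right)
```

With the mismatched charset nothing is filtered, because the stopword list no longer contains the strings that appear in the text. With UTF-8 on both sides the stopword "và" is removed as expected.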