"Filter Stopwords (Dictionary): how to connect the dictionary"

Krystyna
Krystyna New Altair Community Member
edited November 5 in Community Q&A
Hello everybody,

I have RapidMiner 5.2.0003, where the Filter Stopwords (Dictionary) module differs from the previous version. I can no longer connect the stopword file. Previously I just selected the txt file. Now there is a file input port. I tried to use Retrieve, Read from... etc., but it doesn't work. Could you please advise?

Thanks a lot!

My best
Krystyna

Answers

  • MariusHelf
    MariusHelf New Altair Community Member
    RapidMiner 5.2.3 was released more than 7 months ago. Please update both RapidMiner and the Text Processing extension to the latest version. If the problem still occurs, please give a detailed problem description with an example process, as described in the post linked in my signature.

    Best, Marius
  • Krystyna
    Krystyna New Altair Community Member
    Hi Marius,

    My software is updated. In the video tutorials I have only seen examples for the older version, where the Filter Stopwords (Dictionary) module had a different structure. This is my process:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.003">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" breakpoints="after" class="process" compatibility="5.2.003" expanded="true" name="Process">
        <process expanded="true" height="476" width="547">
          <operator activated="true" class="retrieve" compatibility="5.2.003" expanded="true" height="60" name="Retrieve" width="90" x="45" y="75">
            <parameter key="repository_entry" value="Nachfrager 2012-07_Lexikon"/>
          </operator>
          <operator activated="true" class="text:process_document_from_data" compatibility="5.2.004" expanded="true" height="76" name="Process Documents from Data" width="90" x="179" y="210">
            <parameter key="keep_text" value="true"/>
            <parameter key="prunde_below_percent" value="2.0"/>
            <parameter key="prune_above_percent" value="100.0"/>
            <list key="specify_weights"/>
            <process expanded="true" height="763" width="785">
              <operator activated="true" class="text:transform_cases" compatibility="5.2.004" expanded="true" height="60" name="Transform Cases" width="90" x="112" y="120"/>
              <operator activated="true" class="text:tokenize" compatibility="5.2.004" expanded="true" height="60" name="Tokenize" width="90" x="246" y="210"/>
              <operator activated="true" class="text:stem_german" compatibility="5.2.004" expanded="true" height="60" name="Stem (German)" width="90" x="380" y="300"/>
              <connect from_port="document" to_op="Transform Cases" to_port="document"/>
              <connect from_op="Transform Cases" from_port="document" to_op="Tokenize" to_port="document"/>
              <connect from_op="Tokenize" from_port="document" to_op="Stem (German)" to_port="document"/>
              <connect from_op="Stem (German)" from_port="document" to_port="document 1"/>
              <portSpacing port="source_document" spacing="0"/>
              <portSpacing port="sink_document 1" spacing="0"/>
              <portSpacing port="sink_document 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Retrieve" from_port="output" to_op="Process Documents from Data" to_port="example set"/>
          <connect from_op="Process Documents from Data" from_port="example set" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
  • MariusHelf
    MariusHelf New Altair Community Member
    Your example process does not help much, since it does not even contain the Filter Stopwords operator ("operator" is what we call "modules" in RapidMiner). However, if you disconnect the file input port, the option to select a text file will reappear. The file input port is meant to be used together with the Open File operator, which can also read from web resources and thus makes operators that rely on file input more flexible. But as I said, just disconnect the port to get the old behaviour back.

    Happy Mining!
    ~Marius
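
    For illustration, the inner subprocess of Process Documents from Data from the process above might look roughly like this once a Filter Stopwords (Dictionary) operator is added with its file input port left unconnected. This is only a sketch: the operator class `text:filter_stopwords_dictionary`, the parameter key `file`, and the path `/path/to/stopwords.txt` are assumptions and are not taken from the thread, so check them against the XML your own installation generates.

    ```xml
    <!-- Sketch only: class name, parameter key and path below are assumptions. -->
    <process expanded="true">
      <operator activated="true" class="text:transform_cases" compatibility="5.2.004" expanded="true" name="Transform Cases"/>
      <operator activated="true" class="text:tokenize" compatibility="5.2.004" expanded="true" name="Tokenize"/>
      <!-- The file input port is left unconnected, so the file parameter is used. -->
      <operator activated="true" class="text:filter_stopwords_dictionary" compatibility="5.2.004" expanded="true" name="Filter Stopwords (Dictionary)">
        <!-- Hypothetical path to a plain-text stopword list. -->
        <parameter key="file" value="/path/to/stopwords.txt"/>
      </operator>
      <operator activated="true" class="text:stem_german" compatibility="5.2.004" expanded="true" name="Stem (German)"/>
      <connect from_port="document" to_op="Transform Cases" to_port="document"/>
      <connect from_op="Transform Cases" from_port="document" to_op="Tokenize" to_port="document"/>
      <connect from_op="Tokenize" from_port="document" to_op="Filter Stopwords (Dictionary)" to_port="document"/>
      <connect from_op="Filter Stopwords (Dictionary)" from_port="document" to_op="Stem (German)" to_port="document"/>
      <connect from_op="Stem (German)" from_port="document" to_port="document 1"/>
    </process>
    ```

    Placing the filter between Tokenize and Stem (German) is a design choice made here so that the dictionary is matched against unstemmed tokens; if your stopword list already contains stemmed forms, the filter would go after the stemmer instead.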