"Text Mining beginner HELP"

ayaghci
New Altair Community Member
Updated by Jocelyn
Hello,

I am quite new to text mining. I am trying to use a user-defined external dictionary but am having a problem.

My question is: when I create a user-defined dictionary (in Notepad), what should its structure be? For instance,
I created my file like this, and I used (Open WordNet Dictionary):
artier, arty
artiest, arty

but I am getting an error.

I would appreciate any comments or suggested references (book, website).

 

    Hi,

    with which operator do you want to use your external dictionary? Depending on the operator the structure of the dictionary may change.

    Best,
    Nils
    ayaghci
    New Altair Community Member
    OP
    Hi Nils,

    I am trying to use the [Stem (Dictionary)] operator. My intention is to (1) generate a dictionary (in txt form) and (2) stem the tokens.

    Thanks in advance

    Art
    Hi,

    your dictionary has to look like this:

    arty:artier
    arty:artiest
    and your process may look like this:

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.007">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.007" expanded="true" name="Process">
        <process expanded="true" height="235" width="547">
          <operator activated="true" class="text:read_document" compatibility="5.2.001" expanded="true" height="60" name="Read Document" width="90" x="132" y="152">
            <parameter key="file" value=""/>
          </operator>
          <operator activated="true" class="text:tokenize" compatibility="5.2.001" expanded="true" height="60" name="Tokenize" width="90" x="313" y="165"/>
          <operator activated="true" class="text:stem_dictionary" compatibility="5.2.001" expanded="true" height="60" name="Stem (Dictionary)" width="90" x="447" y="165">
            <parameter key="file" value=""/>
          </operator>
          <connect from_op="Read Document" from_port="output" to_op="Tokenize" to_port="document"/>
          <connect from_op="Tokenize" from_port="document" to_op="Stem (Dictionary)" to_port="document"/>
          <connect from_op="Stem (Dictionary)" from_port="document" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
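If a word list already exists in the `word, stem` layout from the question, it can be converted instead of retyped. Here is a minimal Python sketch, separate from the RapidMiner process. The function name `to_stem_rules` is hypothetical, and it assumes one comma-separated pair per line and only the `stem:word` layout shown above.

```python
from typing import Iterable, List


def to_stem_rules(lines: Iterable[str]) -> List[str]:
    """Turn 'word, stem' lines into 'stem:word' lines (assumed target layout)."""
    rules = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = [p.strip() for p in line.split(",")]
        # Each line must hold exactly one word and one stem.
        if len(parts) != 2 or not all(parts):
            raise ValueError(f"expected 'word, stem', got: {raw!r}")
        word, stem = parts
        rules.append(f"{stem}:{word}")
    return rules


if __name__ == "__main__":
    # The two lines from the original question.
    original = ["artier, arty", "artiest, arty"]
    for rule in to_stem_rules(original):
        print(rule)
```

Save the printed lines to a plain text file and point the `file` parameter of Stem (Dictionary) at it.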
    Best,
    Nils