W-Apriori doesn't work, need help

edfred
edfred New Altair Community Member
edited November 5 in Community Q&A
Hi all,

I want to use the W-Apriori operator to generate some association rules, but it's not working.
I am using RapidMiner version 4.3.
This is my operator chain:

root
|  |
|  |-Textinput
|    |
|    |-StringTokenizer
|    |-GermanStopwordFilter
|    |-ToLowerCaseConverter
|    |-TokenLengthFilter
|
|-ExampleSetWriter
|
|-W-Apriori

When I press the start button, I get an exception like this:

Error: 905 External Error
Error in: W-Apriori (W-Apriori) W-Apriori caused an error: weka.core.UnsupportedAttributeTypeException: weka.associations.Apriori: Cannot handle numeric attributes! An external program or library has reported an error. Please see the documentation of this program or library for further information.

How can I get binary attributes? I think I have to convert them somehow.

Can you give me an example operator chain where it works?

Best regards
edfred

Answers

  • earmijo
    earmijo New Altair Community Member
    If you are using "Binary Occurrences" as your vector creation choice, you will get a matrix of 0/1 values. You still have to transform it into a matrix of true/false values, which is the input form accepted by associators like W-Apriori. You can do this with the Numerical2Binominal converter (Preprocessing/Attributes/Filter/Converter/...).
    <operator name="Root" class="Process" expanded="yes">
        <operator name="TextInput" class="TextInput" expanded="yes">
            <parameter key="attributes" value=""/>
            <parameter key="create_text_visualizer" value="true"/>
            <parameter key="default_content_encoding" value="ISO-8859-1"/>
            <list key="namespaces">
            </list>
            <parameter key="on_the_fly_pruning" value="3"/>
            <parameter key="prune_below" value="2"/>
            <list key="texts">
              <parameter key="graphics" value="../data/newsgroup/graphics"/>
              <parameter key="hardware" value="../data/newsgroup/hardware"/>
            </list>
            <parameter key="vector_creation" value="BinaryOccurrences"/>
            <operator name="StringTokenizer" class="StringTokenizer">
            </operator>
            <operator name="EnglishStopwordFilter" class="EnglishStopwordFilter">
            </operator>
            <operator name="TokenLengthFilter" class="TokenLengthFilter">
                <parameter key="min_chars" value="3"/>
            </operator>
        </operator>
        <operator name="Numerical2Binominal" class="Numerical2Binominal">
        </operator>
        <operator name="W-Apriori" class="W-Apriori">
        </operator>
    </operator>
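
    To illustrate what the conversion step does conceptually, here is a minimal Python sketch. It is not RapidMiner code: the function name `to_binominal` and the sample matrix are made up for illustration, and it assumes that 0 maps to "false" and any other value to "true".

```python
def to_binominal(rows):
    """Map each numeric cell to a nominal label.

    Assumption: 0 becomes "false", any non-zero value becomes "true".
    """
    return [["true" if value != 0 else "false" for value in row] for row in rows]


# Hypothetical binary-occurrence matrix: one row per document, one column per term.
occurrences = [
    [1, 0, 1],
    [0, 0, 1],
]

binominal = to_binominal(occurrences)
print(binominal)
```

    Apriori works on nominal items, which is why the numeric 0/1 columns are rejected with "Cannot handle numeric attributes!" until they are converted.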
  • edfred
    edfred New Altair Community Member
    Thank you, that was very helpful. It works now, but the German characters "ä", "ö", "ü" and "ß" aren't displayed correctly. Where can I set the encoding to UTF-8?
  • land
    land New Altair Community Member
    Hi,
    this can be switched in the TextInput operator. The parameter is called "default_content_encoding", as it appears in the process XML above.

    Greetings,
      Sebastian
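
    As a side note on why the umlauts come out wrong, here is a minimal Python sketch of the usual cause. It assumes the text files are stored as UTF-8 and read as ISO-8859-1, which this thread does not confirm; the sample string is made up.

```python
# The German characters ä, ö, ü, ß written as escapes so the snippet
# does not depend on the encoding of this file.
text = "\u00e4 \u00f6 \u00fc \u00df"

# Bytes as they would be stored in a UTF-8 encoded text file.
raw = text.encode("utf-8")

# Reading those bytes as ISO-8859-1 turns each two-byte character
# into two unrelated characters.
garbled = raw.decode("iso-8859-1")

# Reading them with the matching encoding restores the original text.
correct = raw.decode("utf-8")

print(repr(garbled))
print(correct)
```

    If the output of the process shows pairs of odd characters where a single umlaut should be, the encoding setting and the actual file encoding probably do not match.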
  • edfred
    edfred New Altair Community Member
    Hi,

    I tried this:

    <operator name="Root" class="Process" expanded="yes">
        <operator name="TextInput" class="TextInput" expanded="yes">
            <parameter key="attributes" value=""/>
            <parameter key="create_text_visualizer" value="true"/>
            <parameter key="default_content_encoding" value="UTF-8"/>
            <list key="namespaces">
            </list>
            <parameter key="on_the_fly_pruning" value="3"/>
            <parameter key="prune_below" value="2"/>
            <list key="texts">
              <parameter key="test" value="../rm_workspace/apriori/test"/>
            </list>
            <parameter key="vector_creation" value="BinaryOccurrences"/>
            <operator name="ToLowerCaseConverter" class="ToLowerCaseConverter">
            </operator>
            <operator name="StringTokenizer" class="StringTokenizer">
            </operator>
            <operator name="GermanStopwordFilter" class="GermanStopwordFilter">
            </operator>
            <operator name="TokenLengthFilter" class="TokenLengthFilter">
                <parameter key="min_chars" value="3"/>
            </operator>
        </operator>
        <operator name="Numerical2Binominal" class="Numerical2Binominal">
        </operator>
        <operator name="W-Apriori" class="W-Apriori">
        </operator>
    </operator>
    But this is not working. RapidMiner freezes after 5 minutes. I also tried starting it with more memory:
    java -Xms128M -Xmx1024M -jar rapidminer.jar
    But RapidMiner still freezes, and I have to close the whole program.
    If I use the default encoding (I leave the field empty), it works, but it doesn't display the German letters.
    Do you know why?

  • land
    land New Altair Community Member
    Hi,
    unfortunately I don't have any clue why this should happen, and I can't test it without the data.
    Did you wait a few minutes before closing RapidMiner? Some parts of the TextMiningPlugin somehow manage to block the GUI thread, but the GUI thread recovers once the calculation has finished.

    Greetings,
      Sebastian
  • edfred
    edfred New Altair Community Member
    I waited a long time, but the program stayed blocked and I had to abort it.