[SOLVED] Temporal Association Rules

Q-Dog
Q-Dog New Altair Community Member
edited November 5 in Community Q&A
Hi,

is it possible to mine temporal association rules with RapidMiner (or to do temporal data mining in general)?

For example:
"90% of the men who bought beer on Friday will buy headache tablets on Saturday"


Cheers Q-Dog

Answers

  • haddock
    haddock New Altair Community Member
    Hi,

    You could try the Generalized Sequential Patterns (GSP) operator?
  • Q-Dog
    Q-Dog New Altair Community Member
    Thanks for your answer haddock, I just did.

    But it seems there is an error with this operator. When I try to view the results of the GSPset, I get:

    Error executing background job 'Creating Display'
    java.lang.IndexOutOfBoundsException: Index 0, Size: 0


    Exception: java.lang.IndexOutOfBoundsException
    Message: Index: 0, Size: 0
    Stack trace:

     java.util.LinkedList.entry(LinkedList.java:365)
     java.util.LinkedList.get(LinkedList.java:315)
     com.rapidminer.gui.renderer.RendererService.getName(RendererService.java:298)
     com.rapidminer.gui.processeditor.results.ResultTab.createComponent(ResultTab.java:143)
     com.rapidminer.gui.processeditor.results.ResultTab.showResult(ResultTab.java:116)
     com.rapidminer.gui.processeditor.results.DockableResultDisplay.showResultNow(DockableResultDisplay.java:226)
     com.rapidminer.gui.processeditor.results.DockableResultDisplay.access$200(DockableResultDisplay.java:78)
     com.rapidminer.gui.processeditor.results.DockableResultDisplay$5.run(DockableResultDisplay.java:212)
     com.rapidminer.gui.tools.ProgressThread$2.run(ProgressThread.java:177)
     java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
     java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
     java.lang.Thread.run(Thread.java:662)
    Can anyone reproduce this error?

    Here is an example process with data from samples (only to demonstrate the error):

    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.2.000">
      <context>
        <input/>
        <output/>
        <macros/>
      </context>
      <operator activated="true" class="process" compatibility="5.2.000" expanded="true" name="Process">
        <process expanded="true" height="640" width="748">
          <operator activated="true" class="generate_nominal_data" compatibility="5.2.000" expanded="true" height="60" name="Generate Nominal Data" width="90" x="31" y="119">
            <parameter key="number_examples" value="10"/>
          </operator>
          <operator activated="true" class="generate_id" compatibility="5.2.000" expanded="true" height="76" name="Generate ID" width="90" x="179" y="120"/>
          <operator activated="true" class="nominal_to_binominal" compatibility="5.2.000" expanded="true" height="94" name="Nominal to Binominal" width="90" x="313" y="120"/>
          <operator activated="true" class="generalized_sequential_patterns" compatibility="5.2.000" expanded="true" height="76" name="GSP" width="90" x="447" y="120">
            <parameter key="customer_id" value="label"/>
            <parameter key="time_attribute" value="id"/>
            <parameter key="window_size" value="1.0"/>
            <parameter key="max_gap" value="1.0"/>
            <parameter key="min_gap" value="0.0"/>
          </operator>
          <connect from_op="Generate Nominal Data" from_port="output" to_op="Generate ID" to_port="example set input"/>
          <connect from_op="Generate ID" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
          <connect from_op="Nominal to Binominal" from_port="example set output" to_op="GSP" to_port="example set"/>
          <connect from_op="GSP" from_port="example set" to_port="result 1"/>
          <connect from_op="GSP" from_port="patterns" to_port="result 2"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
          <portSpacing port="sink_result 3" spacing="0"/>
        </process>
      </operator>
    </process>


    Cheers Q-Dog
  • Q-Dog
    Q-Dog New Altair Community Member
    You're too quick, haddock. I edited the post and added the process XML ;)
  • haddock
    haddock New Altair Community Member
    Sadly, it seems to work here!

    Feb 9, 2012 5:37:18 PM WARNING: Found only 2.0 sequences. Together with the small minimal support, this could result in very many patterns and a long calculation time.
    Feb 9, 2012 5:37:18 PM INFO: Generating Candidates of length 1
    Feb 9, 2012 5:37:18 PM INFO: Generated 190 candidates
    Feb 9, 2012 5:37:18 PM INFO: Building Hashtree for counting candidates of length 2
    Feb 9, 2012 5:37:18 PM INFO: Counting supporting sequences for candidates of length 2
    Feb 9, 2012 5:37:18 PM INFO: Filtered Candidates. Remaining: 46
    Feb 9, 2012 5:37:18 PM INFO: Generating Candidates of length 2
    Feb 9, 2012 5:37:18 PM INFO: Generated 138 candidates
    Feb 9, 2012 5:37:18 PM INFO: Building Hashtree for counting candidates of length 3
    Feb 9, 2012 5:37:18 PM INFO: Counting supporting sequences for candidates of length 3
    Feb 9, 2012 5:37:18 PM INFO: Filtered Candidates. Remaining: 35
    Feb 9, 2012 5:37:18 PM INFO: Generating Candidates of length 3
    Feb 9, 2012 5:37:18 PM INFO: Generated 14 candidates
    Feb 9, 2012 5:37:18 PM INFO: Building Hashtree for counting candidates of length 4
    Feb 9, 2012 5:37:18 PM INFO: Counting supporting sequences for candidates of length 4
    Feb 9, 2012 5:37:18 PM INFO: Filtered Candidates. Remaining: 14
    Feb 9, 2012 5:37:18 PM INFO: Generating Candidates of length 4
    Feb 9, 2012 5:37:18 PM INFO: Generated 2 candidates
    Feb 9, 2012 5:37:18 PM INFO: Building Hashtree for counting candidates of length 5
    Feb 9, 2012 5:37:18 PM INFO: Counting supporting sequences for candidates of length 5
    Feb 9, 2012 5:37:18 PM INFO: Filtered Candidates. Remaining: 2
    Feb 9, 2012 5:37:18 PM INFO: Generating Candidates of length 5
    Feb 9, 2012 5:37:18 PM INFO: Generated 0 candidates
    Feb 9, 2012 5:37:18 PM INFO: Saving results.
    Feb 9, 2012 5:37:18 PM INFO: Process //Data Files/Forum finished successfully after 0 s

    Actually I've just posted elsewhere about association rules and referred here

    http://rapid-i.com/rapidforum/index.php/topic,3619.msg13530.html#msg13530

    So perhaps this operator suffers from the same problems as the association rules operator, and you could stave off disaster (the Java choke messages) by setting the frequency bar (the minimum support) a bit higher.

    As it happens, I do a lot of exactly this: "if A & B & C hold at T1, is D then true at T2 with sufficient frequency?" If you are looking for rules with only one item in the head, you can copy the attribute that represents D, move all of its values up one example, run itemset mining only, and then look in the frequent itemsets for that new attribute. Simples!
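
A minimal Python sketch of the lagging trick described above, written outside RapidMiner. The function names, the `_next` suffix and the toy data are illustrative assumptions, not part of any RapidMiner API, and the brute-force subset counting only stands in for a real frequent-itemset miner.

```python
from collections import Counter
from itertools import combinations


def add_next_period_item(rows, target):
    """Copy `target` from the following timeslot into each row as `<target>_next`.

    `rows` must already be sorted by time. The last row is dropped because
    it has no following period to peek at.
    """
    lagged = []
    for current, following in zip(rows, rows[1:]):
        row = dict(current)
        row[target + "_next"] = following[target]
        lagged.append(row)
    return lagged


def frequent_itemsets(rows, min_support):
    """Brute-force itemset counting (fine for toy data, exponential in general)."""
    counts = Counter()
    for row in rows:
        items = sorted(name for name, value in row.items() if value)
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[combo] += 1
    n = len(rows)
    return {combo: c / n for combo, c in counts.items() if c / n >= min_support}


# Hypothetical binominal data: one row per timeslot, ordered by time.
rows = [
    {"A": True, "B": True, "D": False},
    {"A": False, "B": True, "D": True},
    {"A": True, "B": True, "D": False},
    {"A": False, "B": False, "D": True},
    {"A": True, "B": True, "D": False},
    {"A": False, "B": False, "D": True},
]

lagged = add_next_period_item(rows, "D")
itemsets = frequent_itemsets(lagged, min_support=0.5)

# Itemsets containing "D_next" correspond to rules "body at T1 => D at T2".
body = ("A", "B")
rule = body + ("D_next",)
confidence = itemsets[rule] / itemsets[body]
print(rule, "support:", itemsets[rule], "confidence:", confidence)
```

Raising `min_support` here plays the same role as setting the frequency bar higher: fewer itemsets survive, so there is less to count and display.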

    Because I do this on ~1,000,000 timeslots I ended up writing an operator to outsource the actual itemset generation via a separate service, which is very fast at this, because it is CUDA based; also I needed to write an operator to do the next period peeking. It has revealed interesting stuff in my domain - but it did take quite an effort.

    Good luck with your project.
  • Nils_Woehler
    Nils_Woehler New Altair Community Member
    I just tried to run your example process and for me it works too. Do you always get the error when running the process?

    Greetings,
    Nils
  • Q-Dog
    Q-Dog New Altair Community Member
    Hmm that's a pity  :-\

    I get the error when automatically switching to the results view. The process itself works fine; the problem only occurs when showing the results.

    Feb 10, 2012 9:58:34 AM INFO: No filename given for result file, using stdout for logging results!
    Feb 10, 2012 9:58:34 AM INFO: Process starts
    Feb 10, 2012 9:58:34 AM INFO: Loading initial data.
    Feb 10, 2012 9:58:34 AM WARNING: Found only 2.0 sequences. Together with the small minimal support, this could result in very many patterns and a long calculation time.
    Feb 10, 2012 9:58:34 AM INFO: Generating Candidates of length 1
    Feb 10, 2012 9:58:34 AM INFO: Generated 190 candidates
    Feb 10, 2012 9:58:34 AM INFO: Building Hashtree for counting candidates of length 2
    Feb 10, 2012 9:58:34 AM INFO: Counting supporting sequences for candidates of length 2
    Feb 10, 2012 9:58:34 AM INFO: Filtered Candidates. Remaining: 46
    Feb 10, 2012 9:58:34 AM INFO: Generating Candidates of length 2
    Feb 10, 2012 9:58:34 AM INFO: Generated 138 candidates
    Feb 10, 2012 9:58:34 AM INFO: Building Hashtree for counting candidates of length 3
    Feb 10, 2012 9:58:34 AM INFO: Counting supporting sequences for candidates of length 3
    Feb 10, 2012 9:58:34 AM INFO: Filtered Candidates. Remaining: 35
    Feb 10, 2012 9:58:34 AM INFO: Generating Candidates of length 3
    Feb 10, 2012 9:58:34 AM INFO: Generated 14 candidates
    Feb 10, 2012 9:58:34 AM INFO: Building Hashtree for counting candidates of length 4
    Feb 10, 2012 9:58:34 AM INFO: Counting supporting sequences for candidates of length 4
    Feb 10, 2012 9:58:34 AM INFO: Filtered Candidates. Remaining: 14
    Feb 10, 2012 9:58:34 AM INFO: Generating Candidates of length 4
    Feb 10, 2012 9:58:34 AM INFO: Generated 2 candidates
    Feb 10, 2012 9:58:34 AM INFO: Building Hashtree for counting candidates of length 5
    Feb 10, 2012 9:58:34 AM INFO: Counting supporting sequences for candidates of length 5
    Feb 10, 2012 9:58:34 AM INFO: Filtered Candidates. Remaining: 2
    Feb 10, 2012 9:58:34 AM INFO: Generating Candidates of length 5
    Feb 10, 2012 9:58:34 AM INFO: Generated 0 candidates
    Feb 10, 2012 9:58:34 AM INFO: Saving results.
    Feb 10, 2012 9:58:34 AM INFO: Process finished successfully after 0 s
    Feb 10, 2012 9:58:34 AM WARNING: Error executing background job 'Creating Display': java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
    @haddock:
    I did try lagging the attributes. Unfortunately, the offset between my attributes isn't constant, so I wanted to use a time interval instead.
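
To make the time-interval idea concrete, here is a small Python sketch that checks whether a consequent item follows an antecedent within a gap window for the same customer. It is not the GSP operator's algorithm; the function name, the event layout and the exact meaning given to `min_gap` and `max_gap` here are assumptions for illustration.

```python
def windowed_confidence(events, antecedent, consequent, min_gap, max_gap):
    """Share of antecedent events followed by the consequent for the same
    customer after more than `min_gap` and at most `max_gap` time units.

    `events` is an iterable of (customer, time, item) tuples.
    """
    by_customer = {}
    for customer, time, item in events:
        by_customer.setdefault(customer, []).append((time, item))

    hits = 0
    total = 0
    for history in by_customer.values():
        consequent_times = [t for t, item in history if item == consequent]
        for t, item in history:
            if item != antecedent:
                continue
            total += 1
            if any(min_gap < ct - t <= max_gap for ct in consequent_times):
                hits += 1
    return hits / total if total else 0.0


# Hypothetical purchases: (customer, day, item), with varying offsets.
events = [
    ("c1", 1.0, "beer"), ("c1", 2.0, "headache tablets"),
    ("c2", 1.0, "beer"), ("c2", 4.0, "headache tablets"),
    ("c3", 5.0, "beer"), ("c3", 5.5, "headache tablets"),
]

conf = windowed_confidence(events, "beer", "headache tablets", min_gap=0.0, max_gap=2.0)
print(round(conf, 3))
```

With these toy events, c1 and c3 fall inside the window and c2 does not, so the offset may vary freely as long as it stays within the interval.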

    // Edit
    I just uninstalled the newest version and reinstalled an older one (5.0.010), and now it seems to work. I wonder what will happen if I update to the newest version again...
  • Nils_Woehler
    Nils_Woehler New Altair Community Member
    Okay, I ran the process with the 5.2 release and now I get the same error. But with the current SVN version the error is gone. Someone must already have fixed the problem  ;D
    It should be gone with the next bugfix release...

    Greetings,
    Nils
  • Q-Dog
    Q-Dog New Altair Community Member
    Sounds good :)