
"Selecting a subset of data for Association Rule Mining"

User: "ckwcheng"
New Altair Community Member
Updated by Jocelyn
Here is a subset of my data that I'd like to use for association rule mining. It is a semi-binary table:

9762 CN14 0 0 0 0 0 0 1
9763 CN07 0 0 0 0 0 0 1
9764 CN07 0 0 0 0 0 0 1
9765 CN14 0 0 0 0 0 0 1
9766 CN14 0 0 0 0 0 0 1
9767 CN33 0 0 0 0 0 0 1
9768 CN02 0 0 0 0 0 0 1
9769 CN12 0 0 0 0 0 0 1
9770 CN14 0 0 0 0 0 0 1
9771 CN04 0 0 0 0 0 0 1
9772 CN04 0 0 0 0 0 0 1
9773 CN04 0 0 0 0 0 0 1
9774 CN05 0 0 0 0 0 0 1
9775 CN07 0 0 0 0 0 0 1
9776 CN07 0 0 0 0 0 0 1
9777 CN07 0 0 0 0 0 0 1
...etc.

I want to find the association rules for CN07, CN14, etc. individually. That is, I want to mine association rules for just the CN07 rows, then for just the CN14 rows, and so on. What data preprocessing step do I need in order to select each subset and run it through the association rule miner? I'm using RapidMiner 5.

Any help is much appreciated!  Thanks!

    User: "land"
    New Altair Community Member
    Hi,
    you can use the Filter Examples operator to select a subset of the example set. If you want to loop automatically over all values of an attribute, take a look at the following process:
    <?xml version="1.0" encoding="UTF-8" standalone="no"?>
    <process version="5.0">
      <context>
        <input>
          <location/>
        </input>
        <output>
          <location/>
          <location/>
        </output>
        <macros/>
      </context>
      <operator activated="true" class="process" expanded="true" name="Process">
        <process expanded="true" height="565" width="963">
          <operator activated="true" class="generate_data" expanded="true" height="60" name="Generate Data" width="90" x="45" y="30"/>
          <operator activated="true" class="discretize_by_bins" expanded="true" height="94" name="Discretize" width="90" x="179" y="30">
            <parameter key="range_name_type" value="short"/>
          </operator>
          <operator activated="true" class="loop_values" expanded="true" height="76" name="Loop Values" width="90" x="313" y="30">
            <parameter key="attribute" value="att1"/>
            <process expanded="true" height="565" width="963">
              <operator activated="true" class="filter_examples" expanded="true" height="76" name="Filter Examples" width="90" x="45" y="30">
                <parameter key="condition_class" value="attribute_value_filter"/>
                <parameter key="parameter_string" value="att1=%{loop_value}"/>
              </operator>
              <operator activated="true" class="nominal_to_binominal" expanded="true" height="94" name="Nominal to Binominal" width="90" x="179" y="30"/>
              <operator activated="true" class="materialize_data" expanded="true" height="76" name="Materialize Data" width="90" x="313" y="30"/>
              <operator activated="true" class="fp_growth" expanded="true" height="76" name="FP-Growth" width="90" x="447" y="30"/>
              <connect from_port="example set" to_op="Filter Examples" to_port="example set input"/>
              <connect from_op="Filter Examples" from_port="example set output" to_op="Nominal to Binominal" to_port="example set input"/>
              <connect from_op="Nominal to Binominal" from_port="example set output" to_op="Materialize Data" to_port="example set input"/>
              <connect from_op="Materialize Data" from_port="example set output" to_op="FP-Growth" to_port="example set"/>
              <connect from_op="FP-Growth" from_port="frequent sets" to_port="out 1"/>
              <portSpacing port="source_example set" spacing="0"/>
              <portSpacing port="sink_out 1" spacing="0"/>
              <portSpacing port="sink_out 2" spacing="0"/>
            </process>
          </operator>
          <connect from_op="Generate Data" from_port="output" to_op="Discretize" to_port="example set input"/>
          <connect from_op="Discretize" from_port="example set output" to_op="Loop Values" to_port="example set"/>
          <connect from_op="Loop Values" from_port="out 1" to_port="result 1"/>
          <portSpacing port="source_input 1" spacing="0"/>
          <portSpacing port="sink_result 1" spacing="0"/>
          <portSpacing port="sink_result 2" spacing="0"/>
        </process>
      </operator>
    </process>
    Greetings,
      Sebastian
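
For readers who want to see the same idea outside RapidMiner, here is a minimal Python sketch of the loop in the process above: split the rows by the value of one attribute, then mine frequent item sets within each subset. It is only an illustration under stated assumptions. The item names (`a`, `b`, `c`), the sample rows, and the helper names `split_by_group` and `frequent_itemsets` are all hypothetical, and the mining step is a brute-force support count, not the FP-Growth algorithm that the process uses.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical data: (row id, group code, 0/1 item flags). Values are made up.
ITEMS = ["a", "b", "c"]
ROWS = [
    (1, "CN07", (1, 1, 0)),
    (2, "CN07", (1, 1, 1)),
    (3, "CN07", (1, 0, 0)),
    (4, "CN14", (0, 1, 1)),
    (5, "CN14", (0, 1, 1)),
]


def split_by_group(rows):
    """Group rows by code; each row becomes the set of items flagged 1."""
    groups = defaultdict(list)
    for _row_id, code, flags in rows:
        items = frozenset(name for name, flag in zip(ITEMS, flags) if flag)
        groups[code].append(items)
    return dict(groups)


def frequent_itemsets(transactions, min_support):
    """Brute-force support counting (a stand-in for FP-Growth, fine for tiny data)."""
    if not transactions:
        return {}
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            support = sum(1 for t in transactions if candidate <= t) / n
            if support >= min_support:
                result[candidate] = support
    return result


# One independent mining run per group value, like one Loop Values iteration each.
per_group = {
    code: frequent_itemsets(transactions, min_support=0.6)
    for code, transactions in split_by_group(ROWS).items()
}

for code, itemsets in per_group.items():
    print(code, {tuple(sorted(s)): round(sup, 2) for s, sup in itemsets.items()})
```

Each entry of `per_group` plays the role of the frequent sets that one loop iteration delivers; association rules would then be derived from those sets separately for each code.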