I'm using the attached dataset to illustrate the problem. It is a very basic process to compute association rules: I read the binary matrix, transform the 1/0 values to true/false, and compute the frequent itemsets with the FPGrowth operator. This is where the problems start. "Blouse" is an item that appears in only 3 out of 20 transactions, so its support should be 0.15, but the program reports a support of 0.85. Obviously, the error carries over to the rule calculation part.
Here's my code in case I did something silly.
<operator name="Root" class="Process" expanded="yes">
    <operator name="CSVExampleSource" class="CSVExampleSource" breakpoints="after">
        <parameter key="filename" value="K:\clothingstore.csv"/>
        <parameter key="id_name" value="tid"/>
    </operator>
    <operator name="Numerical2Binominal" class="Numerical2Binominal" breakpoints="after">
    </operator>
    <operator name="FPGrowth" class="FPGrowth" breakpoints="after">
        <parameter key="min_support" value="0.2"/>
    </operator>
    <operator name="AssociationRuleGenerator" class="AssociationRuleGenerator">
        <parameter key="min_confidence" value="0.7"/>
    </operator>
</operator>
If I use the Apriori algorithm from the Weka list instead, everything is fine. I've also seen this problem with other (bigger) datasets. Can you replicate my problem? I'm using version 4.3.
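For reference, this is the hand check behind the support I expect, written as a minimal Python sketch. The `blouse` list is a made-up stand-in for the real column (3 ones in 20 rows, positions chosen arbitrarily), and the `support` helper is only for illustration, not anything from the tool. It also computes the support of the complemented column, because 17/20 happens to match the reported 0.85. That would be consistent with true and false being swapped somewhere in the conversion, but that is only a guess.

```python
def support(column):
    """Fraction of transactions in which the item is present (value 1)."""
    return sum(1 for v in column if v == 1) / len(column)


# Hypothetical stand-in for the "Blouse" column: 3 ones in 20 transactions.
# The positions of the ones are arbitrary; only the counts match the post.
blouse = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Support I expect: 3 transactions out of 20.
expected = support(blouse)

# Support of the complemented column: the 17 transactions WITHOUT the item.
complement = support([1 - v for v in blouse])

print("expected support:  ", expected)
print("complement support:", complement)
```

If the complement value is the one that shows up in the FPGrowth output, the true/false mapping after Numerical2Binominal would be the first thing I'd look at.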
[attachment deleted by admin]