FP--Growth <-> Apriori
cpc2
Hi,
I am currently using the FPGrowth and the WEKA-Apriori Operator on the Iris Dataset.
The Process looks like this:
<operator name="Root" class="Process" expanded="yes">
<operator name="ArffExampleSource" class="ArffExampleSource">
<parameter key="data_file" value="C:\Dokumente und Einstellungen\b\Eigene Dateien\rm_workspace\sample\data\iris.arff"/>
</operator>
<operator name="Numerical2Polynominal" class="Numerical2Polynominal">
</operator>
<operator name="W-Apriori" class="W-Apriori">
<parameter key="M" value="0.0010"/>
<parameter key="I" value="true"/>
</operator>
</operator>
Both the FPGrowth and the Apriori operator have a min_support of 0.001. The other options are left at their defaults.
My question is: why is the Weka operator able to find itemsets while the FPGrowth operator is not? Even when I lower min_support, FPGrowth doesn't find any itemsets at all.
Answers
-
You asked: "Why is the Weka Op able to find Itemsets and the FPGrowth Op not? Even when I lower the min_support FPGrowth doesn't find any Itemsets at all."
The answer is in the documentation: "Please note that the given data set is only allowed to contain binominal attributes, i.e. nominal attributes with only two different values. Simply use the provided preprocessing operators in order to transform your data set. The necessary operators are the discretization operators for changing the value types of numerical attributes to nominal and the operator Nominal2Binominal for transforming nominal attributes into binominal / binary ones."
And here is an example:
<operator name="Root" class="Process" expanded="yes">
<operator name="ArffExampleSource" class="ArffExampleSource">
<parameter key="data_file" value="C:\Documents and Settings\Alien\My Documents\rm_workspace\sample\data\iris.arff"/>
</operator>
<operator name="Numerical2Polynominal" class="Numerical2Polynominal">
</operator>
<operator name="Nominal2Binominal" class="Nominal2Binominal">
</operator>
<operator name="W-Apriori" class="W-Apriori" activated="no">
<parameter key="M" value="0.0010"/>
<parameter key="I" value="true"/>
</operator>
<operator name="FPGrowth" class="FPGrowth">
<parameter key="min_number_of_itemsets" value="1"/>
<parameter key="min_support" value="0.0010"/>
</operator>
</operator>
-
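The requirement quoted from the documentation can be illustrated outside RapidMiner. What follows is a minimal Python sketch under stated assumptions, not the FPGrowth operator's implementation: the helper names `to_binominal` and `frequent_itemsets` and the three toy rows are hypothetical. Each attribute=value pair becomes its own true/false item, and itemsets are then counted by brute force.

```python
from itertools import combinations

def to_binominal(rows):
    """Turn each nominal attribute=value pair into one true/false item.

    Returns one set of "attribute = value" items per row.
    """
    return [{f"{attr} = {value}" for attr, value in row.items()} for row in rows]

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force support counting (illustration only, not FP-Growth)."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            if count / n >= min_support:
                result[candidate] = count / n
    return result

# Hypothetical toy rows, already discretized to nominal values.
rows = [
    {"petalwidth": "0.2", "class": "Iris-setosa"},
    {"petalwidth": "0.2", "class": "Iris-setosa"},
    {"petalwidth": "1.3", "class": "Iris-versicolor"},
]

transactions = to_binominal(rows)
itemsets = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in itemsets.items():
    print(itemset, round(support, 3))
```

Without the binominal step there are no true/false items to count, which matches the empty result described in the question.
-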
Thanks man, that helped me a lot.
There's still something that I don't get.
The last 4 rules from the Apriori result:
7. petallength = 1.300=true 7 ==> class = Iris-setosa=true 7 conf:(1)
8. petallength = 1.600=true 7 ==> class = Iris-setosa=true 7 conf:(1)
9. petalwidth = 0.400=true 7 ==> class = Iris-setosa=true 7 conf:(1)
10. petalwidth = 0.300=true 7 ==> class = Iris-setosa=true 7 conf:(1)
are not generated by FPGrowth. Even the corresponding itemsets are not generated (the Apriori operator generates more sets than FPGrowth). Do you have any idea why?
Thanks in advance,
Birger
-
Hi there Birger,
Don't want to sound like the Thought Police, but you need to check out the algorithms, which take different inputs and produce different outputs, as we have seen. http://en.wikipedia.org/wiki/Association_rule_learning is as good a place to start as any.
That being said, it would be as useful as a fart in a space-suit if different algorithms were to produce wildly different associations. But fear not! If you set like against like with the minimum support, the results on the Iris set are consistent, as this shows:
<operator name="Root" class="Process" expanded="yes">
<operator name="ArffExampleSource" class="ArffExampleSource">
<parameter key="data_file" value="C:\Documents and Settings\Alien\My Documents\rm_workspace\sample\data\iris.arff"/>
</operator>
<operator name="Numerical2Polynominal" class="Numerical2Polynominal">
</operator>
<operator name="Nominal2Binominal" class="Nominal2Binominal">
</operator>
<operator name="FPGrowth" class="FPGrowth">
<parameter key="keep_example_set" value="true"/>
<parameter key="find_min_number_of_itemsets" value="false"/>
<parameter key="min_support" value="0.1"/>
</operator>
<operator name="W-Apriori" class="W-Apriori">
<parameter key="C" value="0.6"/>
<parameter key="R" value="true"/>
<parameter key="c" value="1.0"/>
</operator>
</operator>
-
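The effect of the threshold can be checked with simple arithmetic. This is a small Python sketch, assuming the file is the standard 150-example Iris data set and that support is the fraction of examples an itemset covers. The count of 7 is taken from rules 7 to 10 quoted earlier in the thread; the function name `passes` is hypothetical.

```python
N_EXAMPLES = 150  # assumption: the standard Iris data set has 150 examples
RULE_COUNT = 7    # examples covered by each of rules 7 to 10 in the W-Apriori output

def passes(count, n_examples, min_support):
    """True if an itemset covering `count` examples meets the relative threshold."""
    return count / n_examples >= min_support

support = RULE_COUNT / N_EXAMPLES
low = passes(RULE_COUNT, N_EXAMPLES, 0.001)
high = passes(RULE_COUNT, N_EXAMPLES, 0.1)
print(f"support = {support:.4f}, passes 0.001: {low}, passes 0.1: {high}")
```

Under these assumptions an itemset with 7 occurrences survives a threshold of 0.001 but not one of 0.1, so two operators only report the same sets when they are given the same minimum support.
-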
Thanks a ton, sorry for the stupid question.
-
Not stupid at all, glad to be of assistance.
-
Hi,
although FP-Growth and Apriori should return exactly the same results in theory, the implementations are quite different. This does not change the result if the input is equal, but the two operators make different assumptions. For example, the FP-Growth operator ignores special attributes, while it seems to me that W-Apriori doesn't. So if your label is a special attribute, for example one with the role label, FP-Growth ignores it, and hence no FrequentItemSet containing it is generated.
Greetings,
Sebastian
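-
The point about special attributes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code of either operator: the function `mine`, its `skip` parameter, and the toy rows are hypothetical, and `skip` merely mimics an operator that leaves the label out of the mining step.

```python
from itertools import combinations

def mine(rows, min_support, skip=()):
    """Brute-force frequent itemsets over (attribute, value) items.

    Attributes named in `skip` are left out, mimicking an operator
    that ignores special attributes such as the label.
    """
    transactions = [
        frozenset((a, v) for a, v in row.items() if a not in skip)
        for row in rows
    ]
    n = len(transactions)
    items = sorted(set().union(*transactions))
    found = set()
    for size in (1, 2):
        for cand in combinations(items, size):
            count = sum(1 for t in transactions if set(cand) <= t)
            if count / n >= min_support:
                found.add(cand)
    return found

# Hypothetical toy rows with "class" playing the role of the label.
rows = [
    {"petalwidth": "0.4", "class": "Iris-setosa"},
    {"petalwidth": "0.4", "class": "Iris-setosa"},
    {"petalwidth": "1.3", "class": "Iris-versicolor"},
]

with_label = mine(rows, min_support=0.5)
without_label = mine(rows, min_support=0.5, skip=("class",))
missing = with_label - without_label  # itemsets lost when the label is ignored
print(sorted(missing))
```

Every itemset in `missing` contains the `class` attribute, which is the pattern seen in the thread: the rules that only W-Apriori reported all have the class on the right-hand side.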