FP-GROWTH Itemset - one of the items is oversupported
Hi RM Team,
I have an issue with the FP-Growth operator.
My example set contains 32 columns across 12000 examples. For some reason one of the attributes (whichever has TRUE in the first example=first row) is always showing 94-95% support, although real support for this item is 4-5% across all examples. All other items are calculated properly. Any ideas?
Thanks!
Hi @bernardo_pagnon, could you share the sample data and process so we can investigate the issue? I tried to reproduce the bug using the template under //Samples/Templates/Market Basket Analysis/Market basket analysis. After changing min confidence from 0.1 to 0.2, the association rules updated correctly. BTW, I am using 9.6. Thanks
Sure, here it is. I am using the Supermarket_extracted file, available at http://rapidminerbook.com/
The reason is that you enabled the "find min number of itemsets" checkbox.
If you keep the same high support threshold but uncheck that option, you should get an updated (empty) list of frequent itemsets in the result view.
More details on the min number of itemsets option can be found here:
https://docs.rapidminer.com/latest/studio/operators/modeling/associations/fp_growth.html
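
To illustrate why that checkbox can return itemsets even with a very high support threshold, here is a minimal Python sketch. It assumes the option works by lowering min support until the requested number of itemsets is found. The function names, the `step` and `floor` parameters, and the toy transactions are hypothetical, and this is a brute-force count rather than RapidMiner's actual FP-Growth implementation.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=2):
    """Brute-force: return itemsets whose support is at least min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, max_size + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= t)
            support = count / n
            if support >= min_support:
                result[candidate] = support
    return result

def with_min_number_of_itemsets(transactions, min_support, min_itemsets,
                                step=0.8, floor=0.01):
    """Assumed behavior: relax the threshold until enough itemsets appear."""
    threshold = min_support
    found = frequent_itemsets(transactions, threshold)
    while len(found) < min_itemsets and threshold > floor:
        threshold *= step  # hypothetical relaxation schedule
        found = frequent_itemsets(transactions, threshold)
    return threshold, found

# Toy data, not the Supermarket_extracted file.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk"},
    {"bread", "milk", "butter"},
]

# Option unchecked: nothing reaches 95% support, so the result is empty.
strict = frequent_itemsets(transactions, 0.95)

# Option checked: the threshold is silently lowered until 2 itemsets exist.
threshold, relaxed = with_min_number_of_itemsets(transactions, 0.95, min_itemsets=2)
print(strict, threshold, relaxed)
```

With the option off, the strict threshold is respected and the result is empty; with it on, the effective threshold ends up below the one you typed in.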

Hello @bernardo_pagnon, I will also add that the online book at http://rapidminerbook.com/ is very out of date and has not been maintained in years. I would strongly recommend using the RapidMiner Academy instead.
Scott
Oh, you're a professor?
Let me change your rank and add you to the University Professor Stable. It has many KB pages, including lists of books, etc.
Why didn't you tell us?
Scott

Problem solved by converting TRUE/FALSE in the Excel file to 0 and 1, and then converting numerical to binominal in RM.
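
A small Python sketch of the same idea, outside RapidMiner: map TRUE/FALSE to 1/0 explicitly, then compute support as the share of rows equal to the positive value. The helper names and toy column are hypothetical. The last line shows one possible explanation for the original symptom, which is only an assumption and was not confirmed in this thread: if the positive and negative values of an attribute get swapped, a real support of 5% shows up as 95%.

```python
def to_binary(value):
    """Map spreadsheet-style TRUE/FALSE text to 1/0 explicitly."""
    text = str(value).strip().upper()
    if text == "TRUE":
        return 1
    if text == "FALSE":
        return 0
    raise ValueError(f"unexpected value: {value!r}")

def support(column, positive=1):
    """Fraction of rows whose value equals the positive value."""
    return sum(1 for v in column if v == positive) / len(column)

# Toy column: TRUE in the first row, 1 of 20 rows is TRUE overall.
raw = ["TRUE"] + ["FALSE"] * 19
encoded = [to_binary(v) for v in raw]

real_support = support(encoded)                  # counts the 1s
inverted_support = support(encoded, positive=0)  # counts the 0s (swapped mapping)
print(real_support, inverted_support)
```

Encoding the values explicitly removes any dependence on which value happens to appear in the first row.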
I have another question though :) In the Association Rules operator, I'm setting min confidence to 0.15, but in the results I don't see the rules with confidence between 0.15 and 0.2. I do see those rules if I set min confidence to 0.1. Why is this happening?
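
For reference, here is a minimal Python sketch of what a min confidence filter is expected to do, using the standard definition confidence(A -> B) = support(A and B) / support(A). It does not explain the behavior described above; the function names, candidate rules, and toy transactions are hypothetical and this is not the RapidMiner operator's code.

```python
def confidence(transactions, premise, conclusion):
    """confidence(A -> B) = count(A and B) / count(A)."""
    premise, conclusion = set(premise), set(conclusion)
    premise_count = sum(1 for t in transactions if premise <= t)
    both_count = sum(1 for t in transactions if (premise | conclusion) <= t)
    return both_count / premise_count if premise_count else 0.0

def filter_rules(transactions, candidate_rules, min_confidence):
    """Keep only rules whose confidence reaches min_confidence."""
    kept = []
    for premise, conclusion in candidate_rules:
        c = confidence(transactions, premise, conclusion)
        if c >= min_confidence:
            kept.append((premise, conclusion, c))
    return kept

# Toy data: milk -> eggs has confidence 1/6, which lies between 0.15 and 0.2.
transactions = [
    {"milk", "eggs"},
    {"milk", "bread"},
    {"milk", "bread"},
    {"milk", "bread"},
    {"milk"},
    {"milk"},
    {"bread"},
]
candidate_rules = [
    (("milk",), ("eggs",)),
    (("milk",), ("bread",)),
    (("bread",), ("eggs",)),
]

at_015 = filter_rules(transactions, candidate_rules, 0.15)
at_020 = filter_rules(transactions, candidate_rules, 0.20)
print(len(at_015), len(at_020))
```

Under this definition, a rule with confidence 1/6 passes a 0.15 threshold and is dropped at 0.2, so rules in that range would normally be expected in the output at min confidence 0.15.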