FP-Growth algorithm in the presence of missing values

zazass8 New Altair Community Member
edited November 2024 in Community Q&A
I am trying to run the FP-Growth algorithm on a dataset that is already binarised but also contains some missing values. Rather than applying data imputation techniques, I believe it would be better to compute the support and confidence metrics while ignoring the missing values. For example, if item A occurs in 4 out of 10 transactions and its value is missing in 2 of those 10, then the support should be 4/8 instead of 4/10, and the same rule would apply to every itemset. I tried to edit the open-source fpgrowth code from the mlxtend library, but it is very hard to modify because the code is quite abstract. Has anyone found a way to solve this? I know @MattTC13 asked exactly the same question on this forum a few years ago; if you have a solution, it would be great if you could share it!
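To make the intended metric concrete, here is a minimal sketch in plain pandas. It does not modify or call mlxtend and it is not FP-Growth: it enumerates candidate itemsets by brute force. The function names `support_ignoring_missing` and `frequent_itemsets`, the `max_len` parameter, and the toy data are all hypothetical. It also assumes one particular reading of the rule for larger itemsets, namely that the denominator counts only the transactions in which every item of the itemset is observed.

```python
from itertools import combinations

import numpy as np
import pandas as pd


def support_ignoring_missing(df, itemset):
    """Support of `itemset`, counted only over rows where all its items are observed."""
    cols = df[list(itemset)]
    # Assumption: a row counts toward the denominator only if none of the
    # itemset's columns is NaN in that row.
    observed = cols.notna().all(axis=1)
    n_observed = int(observed.sum())
    if n_observed == 0:
        return float("nan")  # nothing observed, so support is undefined
    present = (cols[observed] == 1).all(axis=1)
    return float(present.sum()) / n_observed


def frequent_itemsets(df, min_support, max_len=2):
    """Brute-force enumeration of itemsets up to `max_len` (hypothetical helper)."""
    rows = []
    for k in range(1, max_len + 1):
        for itemset in combinations(df.columns, k):
            s = support_ignoring_missing(df, itemset)
            if s >= min_support:  # NaN compares False, so undefined supports are skipped
                rows.append((frozenset(itemset), s))
    return pd.DataFrame(rows, columns=["itemset", "support"])


# Toy data: item A is present in 4 of 10 transactions and missing in 2.
df = pd.DataFrame({
    "A": [1, 1, 1, 1, 0, 0, 0, 0, np.nan, np.nan],
    "B": [1, 1, 0, 0, 1, 0, np.nan, 0, 1, 1],
})

support_a = support_ignoring_missing(df, ["A"])  # 4 / 8
result = frequent_itemsets(df, min_support=0.25)
```

One thing to keep in mind with this definition: because each itemset gets its own denominator, the support of a larger itemset can exceed the support of one of its subsets. Standard Apriori and FP-Growth pruning relies on that never happening, which may be part of why patching mlxtend's fpgrowth directly is awkward.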
