"Bug in MinimalEntropyPartitioning?"

New Altair Community Member
Updated by Jocelyn
Hello everybody,
I get strange results when I apply MinimalEntropyPartitioning to some datasets and wonder whether this is due to a bug in the implementation.
Let me illustrate the problem: I have a dataset with one attribute ("X") and one label with two possible values.
There are 6 possible values for X, 1 to 6. In total, I have 1116 rows, with the following target label distributions:
X-value  #negatives  #positives  #rows
1.0      124         62          186
2.0      124         62          186
3.0      0           186         186
4.0      0           186         186
5.0      124         62          186
6.0      124         62          186
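Now of course I would expect a discretization into [-∞, 2], ]2, 4], ]4, ∞] with 372 rows in each range, because the label distribution only changes between X = 2 and X = 3 and between X = 4 and X = 5. Instead, I get: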
range1 [-∞ - 2] (372), range2 [2 - 5] (558), range3 [5 - ∞] (186)
It seems like there is a bug in the operator that does not correctly distinguish open and closed interval limits.
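To make the expectation concrete, here is a small Python sketch that compares the weighted class entropy of the two partitions. It is not RapidMiner code and makes no claim about how the operator is implemented; the function names `partition`, `entropy`, and `weighted_entropy` are made up for illustration, and the counts are copied from the table above.

```python
from math import log2

# (negatives, positives) per X value, copied from the table in this post
counts = {1: (124, 62), 2: (124, 62), 3: (0, 186),
          4: (0, 186), 5: (124, 62), 6: (124, 62)}

def partition(upper_bounds):
    """Group the counts into bins (-inf, b1], (b1, b2], ..., (bk, inf)."""
    bounds = list(upper_bounds) + [float("inf")]
    bins = [[0, 0] for _ in bounds]
    for x, (neg, pos) in counts.items():
        # index of the first bin whose upper bound is >= x
        idx = next(i for i, b in enumerate(bounds) if x <= b)
        bins[idx][0] += neg
        bins[idx][1] += pos
    return bins

def entropy(neg, pos):
    """Class entropy in bits of a single bin (0 for a pure or empty bin)."""
    total = neg + pos
    return -sum(c / total * log2(c / total) for c in (neg, pos) if c > 0)

def weighted_entropy(bins):
    """Entropy of each bin, weighted by the share of rows in that bin."""
    n = sum(neg + pos for neg, pos in bins)
    return sum((neg + pos) / n * entropy(neg, pos)
               for neg, pos in bins if neg + pos > 0)

expected_bins = partition([2, 4])   # the ranges I expected
reported_bins = partition([2, 5])   # reproduces the row counts I actually got

print("expected:", [n + p for n, p in expected_bins], weighted_entropy(expected_bins))
print("reported:", [n + p for n, p in reported_bins], weighted_entropy(reported_bins))
```

Assuming the counts above are right, cutting at 2 and 5 reproduces the row counts 372, 558, and 186 that I got, and its weighted entropy is roughly 0.84 bits, compared to roughly 0.61 bits for the cuts at 2 and 4 that I expected.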
Does anybody know of a solution or a workaround?
Best,
Henrik