"Bug in MinimalEntropyPartitioning?"

User: "Legacy User"
New Altair Community Member
Updated by Jocelyn
Hello everybody,

I get strange results when I apply MinimalEntropyPartitioning to some datasets, and I wonder whether this is due to a bug in the implementation.

Let me illustrate the problem: I have a dataset with one attribute ("X") and one label with two possible values.
There are 6 possible values for X, 1 to 6. In total, I have 1116 rows, with the following target label distributions:

X-value    #negatives #positives #rows
1.0        124        62         186
2.0        124        62         186
3.0          0        186        186
4.0          0        186        186
5.0        124        62         186
6.0        124        62         186

Now of course I would expect a discretization into [-∞, 2], ]2, 4], ]4, ∞] with 372 rows each. Instead, I get:

range1 [-∞ - 2] (372), range2 [2 - 5] (558), range3 [5 - ∞] (186)

Judging by the row counts, the value 5 ends up in range2 instead of range3. It seems like there is a bug in the operator: it does not correctly distinguish open and closed interval limits.
Does anybody know of a solution or a workaround?
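
To double-check my expectation, here is a small Python sketch that compares the row-weighted class entropy of the partition I expected with the one I got. It is only a sanity check on the numbers in the table above, not a reimplementation of the operator: the function names are my own, and I am assuming that lower weighted entropy is what the operator should prefer, ignoring any stopping criterion it may apply.

```python
from math import log2

# (negatives, positives) per X value, copied from the table above
counts = {
    1: (124, 62),
    2: (124, 62),
    3: (0, 186),
    4: (0, 186),
    5: (124, 62),
    6: (124, 62),
}

def entropy(neg, pos):
    """Binary class entropy in bits; an empty or pure bin has entropy 0."""
    total = neg + pos
    if total == 0:
        return 0.0
    return -sum((c / total) * log2(c / total) for c in (neg, pos) if c > 0)

def weighted_entropy(bins):
    """Row-weighted average class entropy of a partition of the X values."""
    n_rows = sum(neg + pos for neg, pos in counts.values())
    result = 0.0
    for values in bins:
        neg = sum(counts[v][0] for v in values)
        pos = sum(counts[v][1] for v in values)
        result += (neg + pos) / n_rows * entropy(neg, pos)
    return result

# Partition I expected: {1, 2}, {3, 4}, {5, 6}
expected = weighted_entropy([(1, 2), (3, 4), (5, 6)])
# Partition implied by the reported row counts: {1, 2}, {3, 4, 5}, {6}
reported = weighted_entropy([(1, 2), (3, 4, 5), (6,)])

print(f"expected bins: {expected:.4f} bits")
print(f"reported bins: {reported:.4f} bits")
```

On this data the expected partition comes out at roughly 0.61 bits and the reported one at roughly 0.84 bits, so the reported ranges are clearly worse in terms of class entropy.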

Best,

Henrik
