"[SOLVED] Bug: Distinct Values in Advanced Charts"
Q-Dog
Hello,
I think there might be a bug in the new Advanced Charts, more precisely in "Grouping: Distinct Values".
Let's assume I have a dataset of 1000 examples and I want to create a histogram of a certain attribute a1:
- I drag a1 to the "domain" and the "range" dimension
- I select "grouping: distinct values" in the domain dimension
- I select "aggregation: count" in the range dimension
When I now sum up all the count values for the attribute, I get a total that is far less than 1000.
Is this a bug, or did I misunderstand "grouping: distinct values"?
If you want, I can either post a process or pictures showing this (or both of course).
Cheers Q-Dog
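The expectation behind this report can be written down as a small check. This is only a sketch in plain Python, not RapidMiner code; the attribute values and the helper name `distinct_value_counts` are made up for illustration. It shows that grouping by distinct values and aggregating with count should give counts that add up to the number of examples when nothing is dropped.

```python
from collections import Counter

def distinct_value_counts(values):
    """Group by distinct value and count how often each value occurs."""
    return Counter(values)

# Hypothetical attribute a1: 1000 examples, no missing values.
a1 = [i % 7 for i in range(1000)]

counts = distinct_value_counts(a1)
total = sum(counts.values())

# With no rows dropped, the bar heights must add up to the example count.
print(total == len(a1))
```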
MariusHelf
Hi,
Does a1 contain missing values? Those are not counted, so the total count can be less than the number of examples.
If there are no missing values, we would be very interested in your process and the data so that we can reproduce the problem.
Best regards,
Marius
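As a rough illustration of the missing-value explanation: if missing entries are skipped while counting, the counts fall short of the number of examples by exactly the number of missing entries. The sketch below is plain Python under the assumption that a missing value is represented as `None`; the data and the helper name `count_non_missing` are hypothetical.

```python
from collections import Counter

def count_non_missing(values):
    """Count distinct values, skipping missing entries (None is an assumption)."""
    return Counter(v for v in values if v is not None)

# Hypothetical attribute with two missing entries.
a1_with_missing = [1, 2, 2, None, 3, None, 3, 3]

value_counts = count_non_missing(a1_with_missing)
n_missing = sum(v is None for v in a1_with_missing)

# The gap between the example count and the summed counts is the missing count.
gap = len(a1_with_missing) - sum(value_counts.values())
print(gap == n_missing)
```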
Q-Dog
Hi Marius,
No, a1 does not contain any missing values. Is it somehow possible to attach the ExampleSet so that you can view it directly in RapidMiner (without importing the logfile first)?
Will the ".ioo" file do the job?
Anyway, here is a screenshot of my problem:
The example set has 17639 examples, but the plot shows far fewer.
The values on the x-axis are 0-163, i.e. 164 distinct values. Even if you assume that each value has a count of 100 on the y-axis (which clearly is not the case), you end up with only 164*100 = 16400 < 17639.
Cheers Q-Dog
// Edit
I just checked: for example, 0 appears 177 times in my example set, but in the plot the count for 0 is only 45.
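The numbers in this post can be checked with a few lines of arithmetic. The sketch below only uses figures quoted above (17639 examples, x-values 0-163, a generous 100 per bar, and 177 versus 45 for the value 0); the variable names are illustrative.

```python
n_examples = 17639
n_distinct = 163 - 0 + 1          # x-axis values 0..163
generous_count_per_bar = 100      # deliberate overestimate from the post

upper_bound = n_distinct * generous_count_per_bar   # 16400
shortfall = n_examples - upper_bound                # examples unaccounted for

true_count_zero = 177
plotted_count_zero = 45
plotted_fraction = plotted_count_zero / true_count_zero

print(upper_bound, shortfall, round(plotted_fraction, 3))
```

Even the overestimate leaves more than a thousand examples unaccounted for, and the bar for 0 shows only about a quarter of its true count.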
MariusHelf
Hm, the plotters reduce the number of data points by sampling, because otherwise drawing an example set with a large number of data points would be very slow. However, when aggregation and grouping are used, they *should* not sample. Anyway, can you please try increasing the property rapidminer.gui.plotter.rows.maximum in the Gui tab of Tools->Properties in RapidMiner to a value greater than 17000?
Best,
Marius
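To see why sampling before counting produces bars that are too low, here is a minimal simulation in plain Python. It is not RapidMiner's plotter code: the function `plot_counts`, the random data, and the row limit of 5000 are assumptions chosen only for illustration. The point is that the counts can never add up to more than the row limit.

```python
import random
from collections import Counter

def plot_counts(values, max_rows, rng):
    """Count distinct values after down-sampling to at most max_rows rows."""
    if len(values) > max_rows:
        values = rng.sample(values, max_rows)  # sampling without replacement
    return Counter(values)

rng = random.Random(0)
# Hypothetical data: 17639 examples with values in 0..163.
data = [rng.randrange(164) for _ in range(17639)]

sampled = plot_counts(data, 5000, rng)    # limit below the example count
unsampled = plot_counts(data, 20000, rng) # limit above the example count

print(sum(sampled.values()), sum(unsampled.values()))
```

Raising the limit above the number of examples, as suggested above, means no rows are dropped and the counts match the data again.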
Q-Dog
This looks far better, thanks a lot!
MariusHelf
Now we also fixed it in the code: if any of the grouping functions is set for a Plot, no sampling is applied for that Plot. It didn't make it into yesterday's release, though.
Best, Marius
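The rule described in this fix can be sketched as follows. This is a hypothetical Python illustration of the logic only, not the actual RapidMiner change; the function name `rows_for_plot`, its parameters, and the grouping label are invented for the example.

```python
import random

def rows_for_plot(values, max_rows, grouping=None, rng=None):
    """Return the rows a plot would use.

    If a grouping function is set, all rows are kept so that aggregated
    counts stay exact; otherwise large inputs are down-sampled.
    """
    values = list(values)
    if grouping is not None or len(values) <= max_rows:
        return values
    rng = rng or random.Random()
    return rng.sample(values, max_rows)

example_rows = list(range(17639))  # hypothetical example set

grouped = rows_for_plot(example_rows, 5000, grouping="distinct_values")
plain = rows_for_plot(example_rows, 5000, rng=random.Random(0))

print(len(grouped), len(plain))
```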