How to make a dense plot?

sawilla New Altair Community Member
edited November 5 in Community Q&A
RapidMiner version: 4.6

My data has 3 attributes (att1, att2, att3). I want to create a scatter plot (x-axis: att1, y-axis: att2, colour: att3). My problem is that I would like a dense square (my data is dense), but my scatter plot has large gaps between the data blocks.

The values of att1 and att2 are integers with the same range (in fact, the data points are symmetric), so my first version plotted the data points in a simple scatter plot with linearly scaled axes. After loading the data, I filter it using an attribute value filter (cost <= 1000). My data then has gaps (e.g., both the x-axis and y-axis values jump from 5 to 19, so the plot is dense up to 5, then has a gap from 6 to 18, then is dense from 19 to the next gap), and I want the plot not to contain the gaps.

I tried to accomplish this by converting att1 and att2 to nominals (I also tried strings). Now I still have gaps, and my axes are also unsorted. For example, the x-axis values are 118 84 25 86 20 21 95 42… Also, the filtered-out nominal/string values are still shown on the axes even though they have no corresponding data points. What is the best way to produce a dense plot without gaps for this type of data?

Reg

Answers

  • land
    land New Altair Community Member
    Hi Reg,
    This is a very unusual requirement. Umh. But your solution using nominals is a good idea. Sorting is on our agenda for the plotter, but since it isn't done yet, you will have to do the sorting yourself: just apply the Sort operator on the attribute containing the values before plotting. The order of the plotted values depends on the order in which they occur in the data set.

    Greetings,
      Sebastian
  • sawilla
    sawilla New Altair Community Member
    Good day Sebastian,

    Thanks for your reply. In fact, I am applying the Sort operator twice -- once for att1 and once for att2. (Sorry, I forgot to mention that in my original post.) In "data view" the data is sorted, but on the plot it is randomly ordered (as far as I can tell).

    The other thing I should have mentioned in my original post is that att1 and att2 are ID numbers, which is why I want a dense plot and don't care to see the gaps for non-existent data. To avoid the gaps and preserve the sorting, I have found a work-around: if I write the sorted, filtered results to a temporary file and then read them back in from that file, the plot is dense and sorted.

    Reg
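
For readers who preprocess the data outside RapidMiner, the same effect (sorted axes, no gaps for missing IDs) can be approximated by replacing each ID with its rank among the IDs that survive the filter, and then plotting those ranks on ordinary numeric axes. The following is only a minimal Python sketch under that assumption: the sample rows are made up, and `dense_positions` is a hypothetical helper, not a RapidMiner operator.

```python
def dense_positions(ids):
    """Map each distinct ID to its rank among the sorted distinct IDs."""
    ordered = sorted(set(ids))
    return {value: rank for rank, value in enumerate(ordered)}


# Hypothetical rows (att1, att2, att3) left after filtering; IDs 6..18 are absent.
rows = [(19, 5, 0.7), (4, 20, 0.2), (5, 19, 0.7), (20, 4, 0.2), (4, 4, 0.1)]

# Build one mapping per axis from the IDs that actually occur.
x_pos = dense_positions(r[0] for r in rows)
y_pos = dense_positions(r[1] for r in rows)

# Replace the sparse IDs by consecutive positions; att3 (the colour) is unchanged.
dense_rows = [(x_pos[a1], y_pos[a2], a3) for a1, a2, a3 in rows]

# Original IDs in axis order, usable as tick labels at positions 0, 1, 2, ...
x_tick_labels = sorted(x_pos)
y_tick_labels = sorted(y_pos)
```

Plotting `dense_rows` gives consecutive positions on both axes, and the tick labels restore the original ID values, so the jump from 5 to 19 no longer leaves empty space.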