Percentiles and distributions
Prozach
New Altair Community Member
I am completely new to RapidMiner.
I have gone through the tutorials but have not found exactly what I am looking for.
What I have so far: I am able to connect to a MySQL database and pull the data I want to analyze.
Most of the crunching has already been done and I just want to figure out the percentiles or distribution of the sample set.
I can't quite figure out how to do that in RapidMiner.
My data consists of two values: a possource and a HUNC.
The possource is one of 6 values (1, 6, 7, 18, 20), and the HUNC is a numeric value denoting accuracy.
I have also tried pulling out just the data for a specific possource, using a count grouped on HUNC, so my data looks like this:
HUNC COUNT
98 1256
100 95847
etc
This shows how many times each HUNC value occurs.
What I am after is the 95th and 67th percentiles of the HUNC for each possource value, or a curve showing the distribution (with its standard deviation), but I am having trouble getting RapidMiner to produce anything.
I am not trying to predict anything really, just trying to analyze and visualize existing datasets.
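To make the goal concrete, here is a minimal Python sketch (not RapidMiner) of the calculation described above, working directly from a HUNC/COUNT table. It assumes the nearest-rank percentile definition and a population standard deviation; other definitions would give slightly different numbers. The function names and the sample counts are hypothetical and only illustrate the shape of the data.

```python
from math import sqrt


def weighted_percentile(freq, pct):
    """Nearest-rank percentile from a {value: count} frequency table.

    pct is an integer between 1 and 100.
    """
    total = sum(freq.values())
    # 1-based rank of the target observation: ceil(pct / 100 * total),
    # computed with integer arithmetic to avoid floating-point rounding.
    rank = max(1, (pct * total + 99) // 100)
    running = 0
    for value in sorted(freq):
        running += freq[value]
        if running >= rank:
            return value
    raise ValueError("frequency table is empty")


def weighted_mean_std(freq):
    """Mean and population standard deviation from a {value: count} table."""
    total = sum(freq.values())
    mean = sum(value * count for value, count in freq.items()) / total
    variance = sum(count * (value - mean) ** 2 for value, count in freq.items()) / total
    return mean, sqrt(variance)


# Hypothetical counts, keyed by possource, then by HUNC value.
# These numbers are made up for illustration only.
counts_by_possource = {
    1: {90: 1, 95: 2, 98: 3, 100: 4},
    6: {97: 5, 99: 5},
}

for possource, freq in sorted(counts_by_possource.items()):
    p67 = weighted_percentile(freq, 67)
    p95 = weighted_percentile(freq, 95)
    mean, std = weighted_mean_std(freq)
    print(possource, p67, p95, round(mean, 2), round(std, 2))
```

The same result could be had by expanding the counts back into individual rows, but working on the grouped table avoids materializing every record.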