Partitioning the dataset into N samples
John_De_Jong
Is there a preprocessing filter in RapidMiner that takes a whole dataset and creates N samples with the same distribution as the original dataset?
An example:
I have a dataset of 1 million examples with two classes, so the original instance set has size 1 million. I want to subsample it into 50 subsamples of 20K examples each, i.e. sample1, sample2, ..., sample50 each have size 20K. When I run the filter, I should get 50 instance sets of 20K examples each. Each 20K sample should consist of unique examples from the 1 million and should have the same balance between the labels as the full dataset: if label1 makes up 90% and label2 makes up 10% of the 1 million, then each 20K sample has 18K of label1 and 2K of label2.
Any help would be appreciated.
John
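To make the requested behaviour concrete, here is a minimal sketch in plain Python, outside RapidMiner. It is not a RapidMiner operator, only an illustration of the partitioning described above. It assumes "unique" means the samples are disjoint, so every row lands in exactly one sample, which fits the numbers since 50 × 20K = 1 million. The names `stratified_partition`, `labels`, `n_parts` and `seed` are hypothetical, and the data is scaled down to 10,000 rows to keep the example quick.

```python
import random
from collections import Counter, defaultdict


def stratified_partition(labels, n_parts, seed=0):
    """Split row indices into n_parts disjoint samples that keep the label balance.

    Returns a list of n_parts lists of row indices. Each index appears in
    exactly one part. (Hypothetical helper, for illustration only.)
    """
    rng = random.Random(seed)

    # Group row indices by class label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    parts = [[] for _ in range(n_parts)]
    for indices in by_label.values():
        rng.shuffle(indices)
        # Deal the shuffled rows of this class out round-robin, so every
        # part receives an (almost) equal share of the class.
        for position, idx in enumerate(indices):
            parts[position % n_parts].append(idx)
    return parts


# Scaled-down version of the example: 10,000 rows, 90% / 10% labels, 50 parts.
labels = ["label1"] * 9000 + ["label2"] * 1000
parts = stratified_partition(labels, n_parts=50)
first = Counter(labels[i] for i in parts[0])
print(len(parts), len(parts[0]), first["label1"], first["label2"])
# prints: 50 200 180 20
```

With the full 1 million rows and the same 90/10 split, the same call would give 50 parts of 20K rows each, with 18K of label1 and 2K of label2 per part. When a class size is not divisible by the number of parts, the per-part counts for that class differ by at most one.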