sampling / learning curve
wessel
Dear all,
Sampling the training set can have a major impact on classification accuracy, especially when the data is skewed.
Let's say you have a dataset of 100k negative examples and 1k positive examples,
and you wish to experiment with different pos/neg ratios in the training set.
To do this you need:
example filter: select all negative
example filter: absolute amount
example filter: select all positive
example filter: absolute amount
merge
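As a rough illustration outside of the tool itself, the chain above could be sketched in plain Python like this. The helper names `filter_by_label` and `take_absolute`, the dictionary layout of an example, and the target of 5,000 negatives are all made up for this sketch; they are not actual operators or settings.

```python
import random

def filter_by_label(examples, label):
    """Keep only the examples with the given label (the 'example filter' step)."""
    return [ex for ex in examples if ex["label"] == label]

def take_absolute(examples, n, rng):
    """Draw n examples at random without replacement (the 'absolute amount' step)."""
    return rng.sample(examples, n)

# Toy dataset with the sizes from the post: 100k negatives, 1k positives.
data = [{"id": i, "label": "neg"} for i in range(100_000)]
data += [{"id": 100_000 + i, "label": "pos"} for i in range(1_000)]

rng = random.Random(42)  # fixed seed so the draw is repeatable

# One filter + one sample per class, then a merge: five steps for two classes.
negatives = take_absolute(filter_by_label(data, "neg"), 5_000, rng)
positives = take_absolute(filter_by_label(data, "pos"), 1_000, rng)
train = negatives + positives
```

Every additional class adds another filter and another sampling step before the merge.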
When there are more than two classes, it gets even more cumbersome.
It would be cool if this could be combined into a single operator.
This might also be faster and more memory efficient.
Best regards,
Wessel
fischer
Hi,
just to get it right: What would be the parameters of your operator? If I get it right, it would be
- a ratio for each class
- an absolute number of examples you want as output?
Cheers,
Simon
wessel
Let's see:
Input: a dataset
Parameter fields:
label = class_A [absolute amount] or [relative amount] and [sampling type]
label = class_B [absolute amount] or [relative amount] and [sampling type]
...
label = class_Z [absolute amount] or [relative amount] and [sampling type]
Defaults: absolute amount = '', relative amount = 1, sampling type = linear
Examples:
Input: a dataset with 2000 examples of class A
class_A [1000] or [] and [linear]: returns a dataset containing the first 1000 instances of class A
class_A [1000] or [] and [random]: returns a dataset containing 1000 instances of class A, randomly sampled
class_A [] or [0.5] and [linear]: returns a dataset containing the first 1000 instances of class A
class_A [] or [0.5] and [random]: returns a dataset containing 1000 instances of class A, randomly sampled
class_A [3000] or [] and [random]: returns an error?
class_A [] or [1.4] and [random]: returns an error?
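To make the proposal concrete, here is a minimal Python sketch of how such a single operator might behave. The function name `sample_per_class`, the `spec` dictionary format, and the choice to raise an error when more examples are requested than exist are all assumptions for illustration; the last one is only one possible answer to the open question in the final two examples.

```python
import random

def sample_per_class(examples, spec, seed=None):
    """Sample each class according to spec[label].

    spec[label] may hold 'absolute' (int), 'relative' (float) and
    'sampling' ('linear' or 'random'). Defaults follow the proposal:
    no absolute amount, relative amount 1, linear sampling.
    """
    rng = random.Random(seed)

    # Group the examples by label in a single pass over the data.
    pools = {}
    for ex in examples:
        pools.setdefault(ex["label"], []).append(ex)

    result = []
    for label, pool in pools.items():
        params = spec.get(label, {})
        absolute = params.get("absolute")
        relative = params.get("relative", 1.0)
        sampling = params.get("sampling", "linear")

        # An absolute amount, when given, takes precedence over the ratio.
        n = absolute if absolute is not None else round(relative * len(pool))
        if n < 0 or n > len(pool):
            # Assumption: over-requesting is treated as an error.
            raise ValueError(f"{label}: requested {n} of {len(pool)} examples")

        if sampling == "linear":
            result.extend(pool[:n])            # first n instances
        elif sampling == "random":
            result.extend(rng.sample(pool, n))  # n instances without replacement
        else:
            raise ValueError(f"unknown sampling type: {sampling}")
    return result

# The example input from above: 2000 examples of class A.
data = [{"id": i, "label": "class_A"} for i in range(2000)]
first_half = sample_per_class(data, {"class_A": {"relative": 0.5}})
random_1000 = sample_per_class(
    data, {"class_A": {"absolute": 1000, "sampling": "random"}}, seed=1
)
```

Grouping by label once, instead of filtering the full dataset once per class, is where the speed and memory savings mentioned earlier would likely come from.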