classification or clustering
Macd
Hi,
I am currently working with a dataset that consists of text, and I have some questions about how to handle it.
- Because of the size of the dataset, I want to use Filter Examples to keep one type of title and then sample to reduce the number of items. How exactly can this be done?
- I want to apply a suitable classification to solve the business problem. I use these operators: Retrieve, Nominal to Text, Process Documents and Tokenize. Can somebody help me see what I am doing wrong here?
BalazsBaranyRM
Hi!
The Filter Examples operator offers conditions for nominal attributes such as "contains", "starts with" or "matches". These should help you filter on the title.
Sampling is done with one of the Sample operators.
Academy video:
https://academy.rapidminer.com/learn/video/sampling-weighting-intro
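For readers who want to see the filter-then-sample idea outside the tool, here is a minimal Python sketch. It is not how the RapidMiner operators are implemented. The rows, the `title` and `text` column names, and the helper functions `filter_rows` and `sample_rows` are all hypothetical, and "matches" is assumed to mean a full-string regular expression match.

```python
import random
import re

# Hypothetical example rows; the real dataset's columns are not shown in the thread.
rows = [
    {"title": "Data Engineer", "text": "build pipelines"},
    {"title": "Senior Data Engineer", "text": "lead the platform team"},
    {"title": "Accountant", "text": "prepare monthly reports"},
    {"title": "Data Analyst", "text": "create dashboards"},
    {"title": "Data Engineer II", "text": "maintain ETL jobs"},
]

def filter_rows(rows, attribute, mode, value):
    """Keep rows whose nominal attribute satisfies the chosen condition."""
    checks = {
        "contains": lambda s: value in s,
        "starts with": lambda s: s.startswith(value),
        # Assumption: "matches" means the whole string matches the regex.
        "matches": lambda s: re.fullmatch(value, s) is not None,
    }
    check = checks[mode]
    return [row for row in rows if check(row[attribute])]

def sample_rows(rows, n, seed=42):
    """Draw up to n rows without replacement; the seed makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))

# Filter first to keep one type of title, then sample to shrink the result.
engineers = filter_rows(rows, "title", "contains", "Engineer")
subset = sample_rows(engineers, 2)
```

Filtering before sampling means the sample is drawn only from the titles of interest, so the reduced set stays focused on them.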
I don't think you're doing anything wrong in the steps you describe for your document classification. You need a target (label) attribute for the classification, and you should apply a learner such as Naive Bayes or Support Vector Machine to the data inside a cross validation.
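To make the role of the label and the cross validation concrete, here is a small self-contained Python sketch of a multinomial Naive Bayes text classifier evaluated with k-fold cross validation. It is an illustration only, not RapidMiner's implementation. The documents, the `billing` and `technical` labels, the simple tokenizer, and the functions `tokenize`, `train_nb`, `predict_nb` and `cross_validate` are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase and split on non-letters; a rough stand-in for a tokenize step."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]

def train_nb(docs, labels):
    """Count documents per class and words per class (multinomial Naive Bayes)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, doc):
    """Return the class with the highest log posterior, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for token in tokenize(doc):
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1)
                                  / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_validate(docs, labels, k=3):
    """k-fold cross validation; fold i holds out every k-th document starting at i."""
    correct = 0
    for fold in range(k):
        train_idx = [i for i in range(len(docs)) if i % k != fold]
        test_idx = [i for i in range(len(docs)) if i % k == fold]
        model = train_nb([docs[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct += sum(predict_nb(model, docs[i]) == labels[i] for i in test_idx)
    return correct / len(docs)

# Hypothetical labelled documents; the label is the target attribute.
docs = [
    "refund not received for my order",
    "app crashes when I open settings",
    "charged twice on my credit card",
    "cannot log in after the update",
    "invoice shows the wrong amount",
    "error message when uploading a file",
]
labels = ["billing", "technical", "billing", "technical", "billing", "technical"]

accuracy = cross_validate(docs, labels, k=3)  # estimate on held-out folds
model = train_nb(docs, labels)                # final model trained on all data
prediction = predict_nb(model, "wrong amount on my invoice")
```

The cross validation only estimates how well the learner generalizes. The model used for new documents is trained afterwards on all labelled data.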
Text Mining is a large topic. Please check out this course in the Academy:
https://academy.rapidminer.com/courses/text-and-web-mining-with-rapidminer
Regards,
Balázs