classification or clustering
Macd
Hi,
I am currently working with a dataset that consists of text, and I have some questions about how to handle it.
- Because of the size of the dataset, I want to use Filter Examples to keep one type of title and then sample to reduce the number of items. How exactly can this be done?
- I want to apply the necessary classification to solve the business problem. I use these operators: Retrieve, Nominal to Text, Process Documents, and Tokenize. Can somebody help me see what I am doing wrong here?
BalazsBaranyRM
Hi!
The Filter Examples operator has conditions for nominal attributes like "contains", "starts with" or "matches". These should help you filter on the title.
Sampling is done with one of the Sample operators.
Academy video:
https://academy.rapidminer.com/learn/video/sampling-weighting-intro
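If it helps to see the filter-then-sample logic outside the GUI, here is a minimal Python sketch of the same idea. It is not RapidMiner code: the rows, the attribute names "title" and "text", and the helpers `filter_rows` and `sample_rows` are all made up for illustration, and "matches" is assumed here to mean a full regular-expression match.

```python
import random
import re

# Hypothetical rows standing in for the text dataset; the attribute
# names "title" and "text" are assumptions, not from the original post.
rows = [
    {"title": "Data Scientist", "text": "build models and analyse data"},
    {"title": "Senior Data Scientist", "text": "lead modelling projects"},
    {"title": "Data Engineer", "text": "maintain data pipelines"},
    {"title": "Accountant", "text": "prepare financial statements"},
    {"title": "Data Analyst", "text": "create reports and dashboards"},
]


def filter_rows(rows, attribute, condition, value):
    """Keep rows whose nominal attribute satisfies the given condition."""
    checks = {
        "contains": lambda s: value in s,
        "starts with": lambda s: s.startswith(value),
        # Assumption: "matches" means the whole value matches the regex.
        "matches": lambda s: re.fullmatch(value, s) is not None,
    }
    check = checks[condition]
    return [row for row in rows if check(row[attribute])]


def sample_rows(rows, n, seed=42):
    """Draw up to n rows without replacement; the seed makes it repeatable."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


# Filter first, then sample from the filtered subset.
data_titles = filter_rows(rows, "title", "starts with", "Data")
scientists = filter_rows(rows, "title", "contains", "Scientist")
sampled = sample_rows(data_titles, 2)
```

Filtering before sampling means the sample only contains the title type you care about. Sampling first could leave you with very few matching examples.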
I don't think you're doing anything wrong with the steps you describe for your document classification. You need a target (label) attribute for the classification, and you apply a learner like Naive Bayes or Support Vector Machine to the data inside a cross validation.
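To make the "label plus learner inside a cross validation" idea concrete, here is a rough, self-contained Python sketch of a multinomial Naive Bayes text classifier evaluated with k-fold cross validation. It is only an illustration of the concept, not what the RapidMiner operators do internally: the documents, the labels, and all function names are hypothetical, and the tokenizer simply splits on non-letters.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical labelled documents; in practice the label is the target attribute.
docs = [
    "great product works well",
    "terrible product broke quickly",
    "great service and great price",
    "terrible service very slow",
    "works well great value",
    "broke quickly terrible quality",
]
labels = ["pos", "neg", "pos", "neg", "pos", "neg"]


def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter."""
    return [t for t in re.split(r"[^a-z]+", text.lower()) if t]


def train_nb(docs, labels):
    """Collect per-class document counts, per-class token counts, and the vocabulary."""
    class_counts = Counter(labels)
    token_counts = defaultdict(Counter)
    vocab = set()
    for doc, label in zip(docs, labels):
        tokens = tokenize(doc)
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, token_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest log posterior (Laplace smoothing)."""
    class_counts, token_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokenize(doc):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def cross_validate(docs, labels, k=3):
    """k-fold cross validation with interleaved folds; returns mean accuracy."""
    accuracies = []
    for fold in range(k):
        test_idx = [i for i in range(len(docs)) if i % k == fold]
        train_idx = [i for i in range(len(docs)) if i % k != fold]
        model = train_nb([docs[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        hits = sum(predict_nb(model, docs[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / k


model = train_nb(docs, labels)
accuracy = cross_validate(docs, labels, k=3)
```

The point of the cross validation is that each document is scored by a model that never saw it during training, so the accuracy is an estimate of how the learner behaves on new text. With a toy dataset this small the number itself means little.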
Text Mining is a large topic. Please check out this course in the Academy:
https://academy.rapidminer.com/courses/text-and-web-mining-with-rapidminer
Regards,
Balázs