Coding open-ended data from surveys

Question

A RapidMiner user wants to know the answer to this question: "Hey there, I am looking to code open-ended data from surveys. I'm used to QDA that uses a cluster algorithm to help find similar open-ends for easy categorization, does RapidMiner have such option? Thank you!"

YYH · Accepted Answer

For open-ended survey questions, you can vectorize the text with TF-IDF and then build a clustering model on the resulting vectors. The clusters group similar responses together, and responses with near-identical vectors can be flagged as duplicates.
Here is an example of a text clustering process on job description data.
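
To make the idea concrete outside of RapidMiner, the same two steps (TF-IDF vectorization, then clustering) can be sketched in plain Python. This is not the RapidMiner process mentioned above, only a minimal illustration under stated assumptions: the function names (`tfidf_vectors`, `cluster`, `near_duplicates`) are hypothetical, the IDF is smoothed, the clustering is a small spherical k-means with deterministic farthest-first seeding, and the sample responses are invented for the demo.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def normalize(vec):
    """Scale a sparse vector (term -> weight) to unit length."""
    norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
    return {t: w / norm for t, w in vec.items()}


def dot(a, b):
    """Dot product of two sparse vectors; equals cosine similarity for unit vectors."""
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def tfidf_vectors(docs):
    """Return one unit-length sparse TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF (an assumption of this sketch), so no weight is zero.
        vec = {t: c * (math.log((1 + n_docs) / (1 + doc_freq[t])) + 1.0)
               for t, c in counts.items()}
        vectors.append(normalize(vec))
    return vectors


def cluster(vectors, k, max_iter=20):
    """Spherical k-means with deterministic farthest-first seeding."""
    centroids = [vectors[0]]
    while len(centroids) < k:
        # Next seed: the document least similar to the centroids chosen so far.
        seed = min(vectors, key=lambda v: max(dot(v, c) for c in centroids))
        centroids.append(seed)
    labels = None
    for _ in range(max_iter):
        # Assign each document to its most similar centroid.
        new_labels = [max(range(k), key=lambda j: dot(v, centroids[j]))
                      for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Recompute each centroid as the normalized sum of its members.
        for j in range(k):
            total = defaultdict(float)
            for v, label in zip(vectors, labels):
                if label == j:
                    for t, w in v.items():
                        total[t] += w
            if total:
                centroids[j] = normalize(total)
    return labels


def near_duplicates(vectors, threshold=0.9):
    """Return index pairs whose cosine similarity is at least `threshold`."""
    return [(i, j)
            for i in range(len(vectors))
            for j in range(i + 1, len(vectors))
            if dot(vectors[i], vectors[j]) >= threshold]


# Invented sample responses, for illustration only.
responses = [
    "The price is too high",
    "Too expensive, the price should be lower",
    "Price is too high for me",
    "Customer support was slow to respond",
    "Support was slow and did not respond",
    "Slow customer support",
]
vectors = tfidf_vectors(responses)
labels = cluster(vectors, k=2)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The cluster labels are only a starting point for coding: you would still read each group and give it a category name. The number of clusters `k` and the duplicate threshold are tuning choices, and real survey data will usually need stop-word removal or stemming before the groups become this clean.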