How to split strings contained in a text column of csv file into words
Ayushi_Aggarwal
As of now, I am reading a CSV file which has review (text), n1, n2, n3, and overall (text) columns.
I am using Select Attributes to include only the review column, which gives me output in RapidMiner of the form:
Row Review
1 Poor service
2 There were torn seats
What I want to do is split the contents of the Review column into individual words, like: Poor, service, There, etc.
I am using Process Documents to Data > Tokenize, but somehow I am not getting the required output.
Please help.
Accepted answers
David_A
Hi,
If you don't necessarily have to use the Text extension, you could also simply use the "Split" operator (not to be confused with "Split Data") with a regular expression. I would say something simple like
\s+|\W+
should do the trick: it splits on whitespace or on runs of non-word characters (word characters being letters, digits, and the underscore).
Best,
David
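For readers who want to see what that pattern does outside RapidMiner, here is a minimal Python sketch. It only illustrates the regular expression quoted above; it is not how the "Split" operator is implemented, and the helper name `split_words` is made up for this example.

```python
import re

# The pattern quoted in the answer: a run of whitespace OR a run of non-word characters.
SPLIT_PATTERN = re.compile(r"\s+|\W+")

def split_words(text):
    """Split text on the pattern and drop the empty strings that
    re.split can produce (e.g. after trailing punctuation)."""
    return [token for token in SPLIT_PATTERN.split(text) if token]

# The two sample reviews from the question.
reviews = ["Poor service", "There were torn seats"]
words = [word for review in reviews for word in split_words(review)]
print(words)  # ['Poor', 'service', 'There', 'were', 'torn', 'seats']
```

Since whitespace characters are themselves non-word characters, `\W+` on its own would very likely give the same words here; the alternation is kept only to match the pattern as posted.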
Telcontar120
Can you be more clear about why Tokenize is not giving you what you expect? What are you getting? If you share your process and a data sample, it will be easier to troubleshoot. In general, Tokenize should do exactly what you are asking for: take a text column and split it up into individual words.
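As a point of comparison when troubleshooting, the expected result can be sketched in plain Python. This is only a rough illustration, assuming tokens are runs of letters; it is not RapidMiner's Tokenize implementation. The sample values in the n1, n2, n3, and overall columns are hypothetical, as is the function name `tokenize_column`.

```python
import csv
import io
import re

# Hypothetical sample with the columns described in the question;
# the numeric and "overall" values are invented for illustration.
SAMPLE_CSV = """review,n1,n2,n3,overall
Poor service,1,2,1,bad
There were torn seats,2,1,3,poor
"""

def tokenize_column(csv_text, column="review"):
    """Read CSV text, keep one text column, and return (row number, tokens) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row_number, row in enumerate(reader, start=1):
        # Assumption: a token is a run of letters (no digits, no underscore).
        tokens = re.findall(r"[^\W\d_]+", row[column])
        rows.append((row_number, tokens))
    return rows

tokenized = tokenize_column(SAMPLE_CSV)
for row_number, tokens in tokenized:
    print(row_number, tokens)
```

If the output in RapidMiner differs from a per-row word list like this, sharing the process and a data sample, as requested above, makes it easier to see which step differs.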