"how to remove rows containing a particular string/word from an excel file?"
soham0077
New Altair Community Member
Hi, I want to delete the rows that contain a specific word in an Excel file and get the output without those rows. I am using RapidMiner version 5.0.13 and have only recently started with it. Can anyone suggest how to go about it and which operators to choose?
I have read about the "Filter Examples" operator. Given an Excel file in .xls format, what is the best way to get output without the rows containing a particular word? Please reply.
Answers
You have already imported the data with the Read Excel operator, right? Then just add a Filter Examples operator. In RapidMiner 5 you can then filter on one column: select attribute_value_filter as the condition_class and set the parameter_string to
column1 != .*badWord.*
will keep all rows where column1 does not contain the string "badWord".
To match only whole words, your filter should look like this:
column1 != ^(.+\s)*badWord(\s.*)*$
The cryptic syntax used here is that of regular expressions. Google that term to get more information.
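If you want to check what the two patterns do before building the process, here is a minimal Python sketch. It is not RapidMiner code, and it assumes the filter compares the pattern against the whole cell value (which is why the first pattern needs the surrounding `.*`). The sample rows and the helper name `drop_matching` are made up for illustration; `column1` and `badWord` are the placeholders from above.

```python
import re

# Hypothetical sample rows standing in for the imported Excel sheet.
rows = [
    {"column1": "this has badWord inside"},
    {"column1": "notabadWordhere"},
    {"column1": "perfectly fine"},
    {"column1": "badWord"},
]

# The two patterns from the answer, unchanged.
CONTAINS = re.compile(r".*badWord.*")
WHOLE_WORD = re.compile(r"^(.+\s)*badWord(\s.*)*$")


def drop_matching(rows, column, pattern):
    """Keep only rows whose value in `column` does NOT fully match `pattern`.

    fullmatch() is used on the assumption that the filter tests the entire
    cell value rather than searching for the pattern anywhere inside it.
    """
    return [row for row in rows if not pattern.fullmatch(row[column])]


kept_contains = drop_matching(rows, "column1", CONTAINS)
kept_whole_word = drop_matching(rows, "column1", WHOLE_WORD)

print([r["column1"] for r in kept_contains])
# ['perfectly fine']
print([r["column1"] for r in kept_whole_word])
# ['notabadWordhere', 'perfectly fine']
```

The second result shows the difference: "notabadWordhere" survives the whole-word filter because badWord is not separated from its neighbours by whitespace.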
Best regards,
Marius