How can I calculate the frequency of specific words for each row in the Excel data

New Altair Community Member

Nov 22, 2018

Updated Nov 5, 2024 by Jocelyn

Hi,

I'm working with a dataset in which each sentence is in a separate row. I want to count how often the words from a word list I have created occur in each row. Then I would like to add these counts to my dataframe as a new variable.

For example:

Let's say I have a list of words that contains apple and banana (this is my dictionary), and I have independent sentences in rows like this:

1. X x x apple x x banana x apple.

2. X apple x x x x.

3. X x banana x apple x.
.
..
...

Now I want to count, separately for each row, how many times the words in my list occur. The new column I want to create would be:

1. = 3

2. = 1

3. = 2
.
..
...

Thanks in advance.

Telcontar120

New Altair Community Member

Accepted Answer

Nov 23, 2018

If I understand your question, this is pretty straightforward in RapidMiner. Process your text data using the "Process Documents from Data" operator, which accepts both a predefined word list and your data source as inputs. Inside that operator, use Tokenize to split each text into words, and set the operator's vector creation parameter to "Term Occurrences". The output will contain a new attribute (column) for each word in your word list, holding the number of times that word occurs in each text (each text is its own row, or example).
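
For anyone who wants to check the expected numbers outside RapidMiner, here is a minimal Python sketch of the same idea: tokenize each row, count occurrences of each word from the word list, and sum them for a per-row total. This is not the RapidMiner process itself, only an illustration. The names `WORDLIST`, `ROWS`, `tokenize`, and `term_occurrences` are made up for this example, and the regex tokenizer is a simple assumption standing in for the Tokenize operator.

```python
import re
from collections import Counter

# Hypothetical word list ("dictionary") and rows, copied from the example in the question.
WORDLIST = ["apple", "banana"]
ROWS = [
    "X x x apple x x banana x apple.",
    "X apple x x x x.",
    "X x banana x apple x.",
]


def tokenize(text):
    """Lowercase the text and split it into runs of letters (assumed tokenization rule)."""
    return re.findall(r"[a-z]+", text.lower())


def term_occurrences(text, wordlist):
    """Return {word: count} for every word in wordlist, i.e. one row of term occurrences."""
    counts = Counter(tokenize(text))
    return {word: counts[word] for word in wordlist}


# One dict of per-word counts per row, analogous to one new column per word.
per_word = [term_occurrences(row, WORDLIST) for row in ROWS]

# Summing the per-word counts gives the single total per row asked for in the question.
totals = [sum(row_counts.values()) for row_counts in per_word]

for row_counts, total in zip(per_word, totals):
    print(row_counts, "total =", total)
```

For the three example sentences this gives totals of 3, 1, and 2, matching the column described in the question. The per-word counts correspond to the attributes the operator produces; the total is just their sum.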
