Altair RISE
A program to recognize and reward our most engaged community members
Nominate Yourself Now!
Home
Discussions
Community Q&A
Calculate TFIDF
barthos
Hello,
I would like to calculate the TF-IDF of words in different documents, in order to determine the words that are the most significant for each document.
I use the create document block for each new document and I add an attribute (a name) to each document. I then use a "documents to data" operator to generate an example set. Then I use a "process document from data" operator to compute the TF-IDF (which I selected on the parameter board).
The problem is that I don't get TF-IDF but only the number of occurences of the words and the number of documents in which they appear. Moreover, I don't see anymore the label of the document, so I am not able to distinguish the different documents.
Can somebody help me?
Thanks a lot,
Barthélémy
Find more posts tagged with
AI Studio
Accepted answers
All comments
colo
Hi Barthélémy,
it sounds like you are only looking at the wordlist output (where the word occurences are shown). But also take a look at the example set output of the "Process Documents" operator. There you will see TF-IDF values and also the document's label.
Instead of chaining "Documents to Data" and "Process Documents from Data" you can use the single operator "Process Documents" instead.
Best regards
Matthias
barthos
Thanks Matthias !
Barth
Quick Links
All Categories
Recent Discussions
Activity
Unanswered
日本語 (Japanese)
한국어(Korean)
Groups