10 most important words
Ev_Lazarou
New Altair Community Member
I have a problem that I have not been able to solve so far:
I am trying to find the most important words from a dataset. How could I do this?
Best Answer
-
Hi!
You can get the word list from the Process Documents operator. There you will find statistics for each term in relation to the labels.
You can also select the documents with the highest confidence for each class and search for the terms with the highest values (e.g. aggregate, sum, then transpose the table).
Best regards,
Balázs
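The "aggregate, sum" idea from the answer above can be sketched outside RapidMiner in plain Python. This is only an illustration under assumptions: the `docs` rows, their values, and the helper name `top_terms_per_class` are all hypothetical, and it sums over every document of a class rather than only the highest-confidence ones. The result (terms ranked per class) plays the role of the transposed table.

```python
from collections import defaultdict

# Hypothetical document-term rows: (label, {term: value}). Values are made up
# and stand in for whatever term weights the text processing step produced.
docs = [
    ("fake", {"shocking": 0.8, "report": 0.1, "secret": 0.5}),
    ("fake", {"shocking": 0.6, "secret": 0.4}),
    ("real", {"report": 0.7}),
    ("real", {"shocking": 0.1, "report": 0.5}),
]


def top_terms_per_class(rows, n=10):
    """Sum term values per class and return the n highest-valued terms for each."""
    # Aggregate step: totals[label][term] is the sum over that class's documents.
    totals = defaultdict(lambda: defaultdict(float))
    for label, terms in rows:
        for term, value in terms.items():
            totals[label][term] += value
    # Ranking step: sort each class's terms by summed value, largest first.
    return {
        label: sorted(sums, key=sums.get, reverse=True)[:n]
        for label, sums in totals.items()
    }


top = top_terms_per_class(docs, n=2)
print(top)
```

Note that summing raw values tends to favor frequent terms, so restricting the input to high-confidence documents, as suggested above, or using a weighting such as TF-IDF may be closer to "important" than "common".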
Answers
Hello Sara!
I uploaded two CSV files, preprocessed them (following an exercise for my university exam), and cross-validated three algorithms on them. The last part of the exercise asks us to prepare a graph showing the 10 most important (not most common) words in fake news (one CSV file) and the 10 most important words in real news (the other CSV file).
I am attaching screenshots of the processes I have run so far, to give a better idea of the setup.
0 -
Dear BalazsBarany
I have already found the most important words in the entire text using the Weight by Information Gain operator. Separately, I used WordList to Data and obtained the document occurrences and total occurrences. How can I merge the two and see the results?
Where do I have to use aggregate and sum?
Thank you!
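The merge asked about above amounts to a join on the word. A minimal sketch in plain Python, assuming both results can be exported as simple tables keyed by word; the sample values, the column names, and the helper `merge_on_word` are hypothetical and not taken from RapidMiner:

```python
# Hypothetical word weights (e.g. exported information gain values).
weights = {"shocking": 0.9, "report": 0.7, "secret": 0.4}

# Hypothetical word list: word -> (document occurrences, total occurrences).
occurrences = {
    "report": (40, 55),
    "secret": (12, 15),
    "shocking": (25, 31),
    "today": (60, 90),
}


def merge_on_word(weights, occurrences):
    """Inner join: keep words present in both tables, highest weight first."""
    rows = [
        {
            "word": w,
            "weight": weights[w],
            "document_occurrences": occurrences[w][0],
            "total_occurrences": occurrences[w][1],
        }
        for w in weights
        if w in occurrences
    ]
    return sorted(rows, key=lambda r: r["weight"], reverse=True)


# Keep at most the 10 highest-weighted words for the graph.
top10 = merge_on_word(weights, occurrences)[:10]
for row in top10:
    print(row)
```

Words that appear in only one of the two tables (here "today") are dropped, which is usually what is wanted when only weighted words should be plotted.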