Filtering ExampleSet Keywords
Hi!
I have built a process around the "Process Documents from Files" operator, with Tokenize, Filter Stopwords, Filter Tokens, Transform Cases, Create n-Grams and Stem inside it, and I selected TF-IDF for vector creation. Since I am trying to find keywords in dozens of documents, the result is an ExampleSet with over 5000 columns. Does anyone know how I could filter these results down to the top 100 or so most relevant keywords?
Alternatively, is there a way to visualize all the keywords graphically, so that the most important ones become obvious?
Any help would be greatly appreciated :)
Connect a WordList to Data operator to the wor (word list) output port of your Process Documents from Files operator, then use a Sort operator to sort the resulting ExampleSet in descending order by occurrence count. The most frequent words will then be at the top of the ExampleSet.
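If it helps to see the logic outside RapidMiner, here is a minimal Python sketch of what that word list plus sort amounts to, assuming the documents are already tokenized. The function name top_keywords and the sample documents are made up for illustration, and this is not RapidMiner's own implementation. Note that it ranks by raw frequency, not by TF-IDF.

```python
from collections import Counter


def top_keywords(documents, n=100):
    """Return the n most frequent tokens as (word, total, in_documents) tuples.

    documents: an iterable of token lists, one list per document.
    """
    total = Counter()         # occurrences of each word across all documents
    in_documents = Counter()  # number of documents containing each word
    for tokens in documents:
        total.update(tokens)
        in_documents.update(set(tokens))
    # Sort by total count, descending; break ties alphabetically.
    ranked = sorted(total, key=lambda word: (-total[word], word))
    return [(word, total[word], in_documents[word]) for word in ranked[:n]]


# Hypothetical, already tokenized sample documents.
docs = [
    ["data", "mining", "text", "data"],
    ["text", "mining", "keywords"],
    ["data", "keywords", "data"],
]

for word, count, doc_count in top_keywords(docs, n=2):
    print(word, count, doc_count)
```

Calling top_keywords(docs, n=100) on a real corpus would give the top 100 words by frequency, which is the same cut you would make by keeping only the first 100 rows of the sorted ExampleSet.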