Filtering ExampleSet Keywords

User: "Mireille"
New Altair Community Member
Updated by Jocelyn

Hi!

I have built a process with the "Process Documents from Files" operator, containing Tokenize, Filter Stopwords, Filter Tokens, Transform Cases, Create n-Grams and Stem. I also selected TF-IDF as the vector creation option. I am trying to find keywords across dozens of documents, but the resulting ExampleSet has over 5000 columns. Does anyone know how I could filter these results down to the top 100 or so most relevant keywords?

Alternatively, is there a way to visualize all the keywords graphically, so that the most important ones become obvious?

Any help would be greatly appreciated :)

 

 

    User: "Thomas_Ott"
    New Altair Community Member

    Attach a WordList to Data operator to the WOR port of your Process Documents operator, then use a Sort operator to sort the result in descending order. You will get an ExampleSet with the most frequent words at the top.

    User: "Mireille"
    New Altair Community Member
    OP

    Thank you very much for your help!

     

    I am just a little confused about how to use the "Sort" operator: it asks me which attribute to sort by, but each word is listed as a separate attribute.

    Thanks again!

     

     

    User: "Thomas_Ott"
    New Altair Community Member

    Sort by Total. 
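
For readers who want to see the idea outside RapidMiner, here is a minimal Python sketch of what the word list plus a descending sort on the total produces. It is an illustration only, not the operators' actual implementation; the function name `word_totals` and the sample documents are made up for the example, and the documents are assumed to be already tokenized.

```python
from collections import Counter

def word_totals(documents):
    """Return (word, total, document_count) rows sorted by total, descending.

    `documents` is a list of token lists (hypothetical, already tokenized).
    """
    totals = Counter()      # occurrences of each word across all documents
    doc_counts = Counter()  # number of documents containing each word
    for tokens in documents:
        totals.update(tokens)
        doc_counts.update(set(tokens))
    rows = [(word, totals[word], doc_counts[word]) for word in totals]
    # Highest total first; ties are broken alphabetically for stable output.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows

docs = [
    ["data", "mining", "data"],
    ["text", "mining"],
    ["data", "text", "text", "text"],
]
top = word_totals(docs)[:2]  # keep only the two most frequent words
```

Slicing the sorted rows (here `[:2]`, or `[:100]` for the original question) keeps only the most frequent words.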

    User: "Mireille"
    New Altair Community Member
    OP

    Thank you,

     

    Is there any way to order words by their TF-IDF weighted value rather than their frequency?
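
One possible way to think about this, sketched in plain Python rather than RapidMiner: compute a TF-IDF weight for every word in every document, sum the weights per word, and sort by that sum. This is a rough sketch under assumptions. The function name `tfidf_ranking` is hypothetical, the formula used is the textbook `tf * log(N / df)`, and RapidMiner's own TF-IDF vectors may be normalized differently, so the numbers will not match its output exactly.

```python
import math
from collections import Counter

def tfidf_ranking(documents, top_k=100):
    """Rank words by their TF-IDF weight summed over all documents.

    `documents` is a list of token lists (assumed already tokenized).
    Uses tf = count / document length and idf = log(N / document frequency);
    this is an assumption, not RapidMiner's exact weighting.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    scores = Counter()
    for tokens in documents:
        if not tokens:
            continue  # skip empty documents to avoid dividing by zero
        for word, count in Counter(tokens).items():
            tf = count / len(tokens)
            idf = math.log(n_docs / doc_freq[word])
            scores[word] += tf * idf

    # Highest summed weight first; ties are broken alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_k]

docs = [
    ["the", "cat", "cat"],
    ["the", "dog"],
    ["the", "cat", "bird", "bird", "bird"],
]
ranking = tfidf_ranking(docs, top_k=10)
```

In this toy data, "the" appears in every document, so its IDF is zero and it falls to the bottom of the ranking even though it is one of the most frequent words. That is the difference from sorting by raw frequency.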