"I'm going nuts -- word vector frequency by category"
I'm processing documents from files and categorizing each document by the folder it comes from. When I run the process and request a word list result, I get a great word list table showing all the words from my process, with "Total Occurrences" and "Document Occurrences" as columns. The table also includes a column for each of my categories, but every cell in those category columns shows 0 rather than what I want, which is the total occurrences of the word in that category. I'm sure I'm missing a simple operator to obtain this result, but I can't figure it out. Any help would be appreciated.
Thanks.
Andy

I've isolated this a bit more. I get the proper results if I remove the "Extract Content" operator from the process. Why would this operator change the per-category frequency results? Is there any other way to get the frequencies for the categories after extracting the content from the HTML?
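To make the expected result concrete, here is a minimal Python sketch of the table I'm after. It runs outside RapidMiner and says nothing about how the operators work internally; the names `strip_html` and `word_list`, the simple lowercase tokenizer, and the sample documents are all hypothetical and only there for illustration.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML document and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(html):
    """Return the plain text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return " ".join(parser.parts)


def word_list(docs):
    """Build a word list from (category, html) pairs.

    Returns {word: {"total": int, "documents": int,
                    "by_category": {category: int}}}.
    """
    docs = list(docs)
    categories = sorted({category for category, _ in docs})
    table = {}
    for category, html in docs:
        # Assumed tokenizer: lowercase runs of ASCII letters.
        words = re.findall(r"[a-z]+", strip_html(html).lower())
        for word, n in Counter(words).items():
            row = table.setdefault(
                word,
                {"total": 0, "documents": 0,
                 "by_category": {c: 0 for c in categories}},
            )
            row["total"] += n                  # total occurrences
            row["documents"] += 1              # document occurrences
            row["by_category"][category] += n  # occurrences in this category
    return table


# Made-up sample: the folder name serves as the category.
sample = [
    ("sports", "<p>The team won the game</p>"),
    ("sports", "<b>game</b> over"),
    ("news", "<div>the election game</div>"),
]
table = word_list(sample)
print(table["game"])
```

With this sample, "game" appears three times in total across three documents, twice under "sports" and once under "news". Those per-category counts are what I expect to see in the category columns instead of 0.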