"Text processing operators on example set"
Hello Everyone,
I have several CSV files that all look the same: they have 2 attributes, a word list (extracted from a document) and the number of occurrences of each word. First, I have to filter them; for that, I made a stopword dictionary. Then I have to make one huge matrix out of them, with the remaining words in the header and one line per document.
The "Process Documents from Files" operator works almost perfectly, BUT the occurrences are lost. This operator wants to count the occurrences itself, so each value ends up as 1 or 0, depending on whether the given word is present in a document or not. How can I use the previously counted numbers?
I also tried the "Read CSV", "Nominal to Text" and "Process Documents from Data" operators, but this way I can't even filter the words.
I'll also need the file names in the final matrix, at the beginning of each line. I already found out how to use an existing macro, but I do not know how to create one myself. I would like to make a file_name macro.
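To make the goal more concrete, here is a rough sketch in plain Python (not RapidMiner) of the result I am after. Everything in it is made up for illustration: the file names doc1.csv and doc2.csv, the "word,count" column layout, the sample words and the one-word stopword set are only assumptions, and the helper names read_counts and build_matrix are hypothetical. It only shows the shape of the matrix I want, with the original counts kept and the file name in the first column.

```python
import csv
import io


def read_counts(csv_text, stopwords):
    """Parse one 'word,count' CSV text and drop the stopwords."""
    counts = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # skip empty or malformed lines
        word, count = row[0].strip(), row[1].strip()
        if not count.isdigit():
            continue  # skips a header line such as "word,count"
        if word.lower() in stopwords:
            continue  # filtered out by the stopword dictionary
        counts[word] = counts.get(word, 0) + int(count)
    return counts


def build_matrix(files, stopwords):
    """Return (header, rows): one row per file, one column per remaining word."""
    per_file = {name: read_counts(text, stopwords) for name, text in files.items()}
    vocabulary = sorted({word for counts in per_file.values() for word in counts})
    header = ["file_name"] + vocabulary
    rows = [
        [name] + [per_file[name].get(word, 0) for word in vocabulary]
        for name in sorted(per_file)
    ]
    return header, rows


# Hypothetical sample data standing in for my real CSV files.
files = {
    "doc1.csv": "word,count\nthe,12\napple,3\npear,1\n",
    "doc2.csv": "word,count\nthe,7\napple,5\nplum,2\n",
}
stopwords = {"the"}

header, rows = build_matrix(files, stopwords)
print(header)
for row in rows:
    print(row)
```

So a word that is missing from a document should get 0, and every other cell should hold the count that is already in my CSV file, not a count computed again by the operator.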
I am a newbie, so if you know the answer to one of these questions, please explain it in as much detail as possible, because what is obvious to you may not be obvious to me.
Thank you in advance!
Laura