[SOLVED] Text Processing - Tokenize: keep word order
Dear All!
Can anybody help me tokenize a text in a way that preserves the original word order?
I have a sample text like: "delta gamma alpha beta". I use a Process Documents operator with a Tokenize operator inside it. This creates a word vector, which becomes an example set after a WordList to Data operator. Unfortunately, the result is an alphabetically ordered list: 'alpha; beta; delta; gamma' [first, second, third, fourth rows]. I want the original word order instead, that is, an example set where the first example is 'delta', the second is 'gamma', the third is 'alpha', and the fourth is 'beta'. Without the WordList to Data operator, I get a WordList that is also alphabetically ordered.
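To make the desired result concrete, here is a minimal Python sketch of what I am after. It is not a RapidMiner process. The function name tokenize_in_order is hypothetical, and splitting on non-letter characters is only my assumption of roughly how Tokenize behaves. It shows the token list in document order next to the alphabetically sorted word list I currently get.

```python
import re

def tokenize_in_order(text):
    """Return the runs of letters found in text, in document order."""
    # [^\W\d_]+ matches one or more Unicode letters (no digits, no underscore)
    return re.findall(r"[^\W\d_]+", text)

tokens = tokenize_in_order("delta gamma alpha beta")

# One row per token, numbered by position, similar to the example set I want
rows = list(enumerate(tokens, start=1))
for position, token in rows:
    print(position, token)

# For contrast: the distinct words sorted alphabetically, like the WordList
word_list = sorted(set(tokens))
print(word_list)
```

Note that the ordered version keeps repeated words as separate rows, while the sorted word list holds each word only once.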
Of course, the problem can be solved with a Loop operator, but that is a cumbersome workaround and not an efficient one.
So how can I tokenize in a way that preserves the original word order?
Thank you!!