Text Pre-processing
Hi there
I am trying to do some text preprocessing in RapidMiner and am looking for the relevant operators, if they are indeed available.
I am extracting features from a sentence using the Information Gain operator, which seems to be possible. From there, I need to construct a feature vector using Bag of Words (BOW) and Term Frequency (TF), so that I end up with a vector of unigrams. I want this vector of unigrams to be based on the Part of Speech (POS) of each term in the sentence.
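To make the target concrete, here is a rough sketch in plain Python (not RapidMiner) of the kind of vector I am after. The sentence, the POS tags, and the function name `pos_unigram_tf` are all made up for illustration; the tags are hard-coded by hand where a real POS tagger would normally supply them.

```python
from collections import Counter

# Hand-tagged example sentence. In practice a POS tagger would produce these
# (token, tag) pairs; the tags below are assumptions for illustration only.
tagged_sentence = [
    ("the", "DT"), ("cat", "NN"), ("chased", "VBD"),
    ("the", "DT"), ("mouse", "NN"),
]

def pos_unigram_tf(tagged_tokens):
    """Return the relative term frequency of each token/POS unigram."""
    # Bag of words: each unigram is the lowercased token joined with its tag.
    unigrams = [f"{token.lower()}/{tag}" for token, tag in tagged_tokens]
    counts = Counter(unigrams)
    total = len(unigrams)
    # Term frequency: occurrences of the unigram divided by sentence length.
    return {unigram: count / total for unigram, count in counts.items()}

tf_vector = pos_unigram_tf(tagged_sentence)
for unigram, tf in sorted(tf_vector.items()):
    print(unigram, round(tf, 2))
```

Each entry in the result is one unigram (term plus POS tag) with its term frequency, so the same word with two different tags would count as two separate features.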
The operators I am looking for are:
1. BOW;
2. TF;
3. POS tagging.
Are these available in RapidMiner, or am I looking in the wrong operator directories?
Thanks