Morphological stemming in RapidMiner?
batstache611
New Altair Community Member
Answers
Hi,
Two options you might want to try both rely on new operators from the Operator Toolbox extension.
The first option is the Stem Tokens Using ExampleSet operator, which takes a predefined ExampleSet as its source of potential word stems. It is similar to the Stem (Dictionary) operator, but reads the stems from an ExampleSet input instead of a file. An example process is included as a tutorial.
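To illustrate the idea behind dictionary-style stemming outside of RapidMiner, here is a minimal Python sketch. It is not the operator's implementation: the two-column layout (word form, stem) and the names `build_lookup` and `stem_tokens` are assumptions made for illustration, and the actual ExampleSet format the operator expects may differ.

```python
def build_lookup(rows):
    """Build a word-form -> stem lookup from (variant, stem) pairs.

    The pairs stand in for the rows of an ExampleSet; this layout is
    an assumption, not the operator's documented input format.
    """
    return {variant.lower(): stem for variant, stem in rows}


def stem_tokens(tokens, lookup):
    """Replace each token by its stem if one is listed, else keep it."""
    return [lookup.get(token.lower(), token) for token in tokens]


rows = [("walking", "walk"), ("walked", "walk"), ("houses", "house")]
lookup = build_lookup(rows)
stemmed = stem_tokens(["Walking", "houses", "talked"], lookup)
```

Tokens that do not appear in the table are passed through unchanged, so the quality of the result depends entirely on how complete the stem table is.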
The second option is to use the Levenshtein distance. For each token, you could search for the other tokens with a low Levenshtein distance to it and choose the shortest of them as the stem. The Operator Toolbox also provides a Generate Levenshtein Distance operator for this.
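The Levenshtein approach can be sketched in plain Python as follows. This is only an illustration of the logic, not how the Generate Levenshtein Distance operator works internally; the function names and the `max_distance` threshold are hypothetical choices.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def pick_stem(token, vocabulary, max_distance=3):
    """Return the shortest vocabulary word within max_distance of token.

    max_distance is an assumed threshold; ties in length are broken
    alphabetically so the result is deterministic.
    """
    candidates = [w for w in vocabulary if levenshtein(token, w) <= max_distance]
    candidates.append(token)  # fall back to the token itself
    return min(candidates, key=lambda w: (len(w), w))


vocabulary = ["walk", "walked", "walks", "walking", "talk"]
stem = pick_stem("walking", vocabulary)
```

Keep in mind that a low edit distance does not guarantee that two words are related: "walk" and "talk" are only one edit apart, so the threshold needs tuning on your own data.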
Best,
Philipp