"[SOLVED] Stemming: Keep Information {original word, stem}"
Urselinho
Hi there,
I'm currently doing some text processing using the different stemming operators. Right now I'm wondering whether there is a way to keep or show which words are conflated to which stem. Without any adjustment, the results of stemming (word list, example set) contain only the stems and associated information such as occurrences.
What I primarily need is something like {original word, stem}.
I'm sure this is quite an easy task, but as I'm not that familiar with RM yet, I don't see it. Any idea how to do this?
Many thanks in advance,
Regards,
Urs
MariusHelf
Hi Urs,
actually, the stemming operators discard the original tokens, so it is not possible to see which stem results from which token. The only solution may be to compare the stemmed document with the original document token by token in a rather complex process and write the mapping manually into an example set.
Best, Marius
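The token-wise comparison Marius describes can be sketched outside RapidMiner in a few lines of Python. This is only an illustration under stated assumptions: `toy_stem` is a hypothetical suffix stripper standing in for whichever stemming algorithm you actually use, and the function names are made up for this example. The idea is to stem each token yourself and record the {original word, stem} pair before the original is thrown away.

```python
import re
from collections import defaultdict

def toy_stem(token):
    """Hypothetical stand-in for a real stemmer: lowercase, strip a few suffixes."""
    word = token.lower()
    for suffix in ("ing", "es", "s"):
        # Only strip if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def word_stem_pairs(text, stem=toy_stem):
    """Return (original word, stem) pairs in order of first occurrence."""
    seen = {}
    for token in re.findall(r"\w+", text):
        seen.setdefault(token, stem(token))
    return list(seen.items())

def stem_mapping(text, stem=toy_stem):
    """Return {stem: sorted list of original words conflated to that stem}."""
    mapping = defaultdict(set)
    for token in re.findall(r"\w+", text):
        mapping[stem(token)].add(token)
    return {s: sorted(words) for s, words in mapping.items()}

print(word_stem_pairs("Autos Auto Autos"))
print(stem_mapping("Autos Auto Autos"))
```

The resulting pairs could then be written to a CSV file and read back into RapidMiner as an example set; that import step is not shown here.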
Urselinho
Hi Marius,
that's quite unpleasant. But OK, I do see the workaround. Thanks for your help.
Best,
Urs
Urselinho
Hi Marius,
it's me once again. I really have to ask, otherwise it will take me a long time to find the right operators/functions.
How can I use the stemming operator so that words are "replaced" within a given document rather than "conflated"? Right now, if I have, for example, a document with the words "Autos" and "Auto", the word list will only contain the stem "auto".
Thanks in advance,
Urs
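To make the distinction in this question concrete, here is a minimal Python sketch of "replacing" rather than "conflating": each token is substituted by its stem at its original position, so the document keeps its length and word order. It does not show how to configure the RapidMiner operators; `toy_stem` and `stem_in_place` are hypothetical names, and the suffix stripper is only a placeholder for a real stemmer.

```python
import re
from collections import Counter

def toy_stem(token):
    """Hypothetical stand-in for a real stemmer: lowercase, strip a few suffixes."""
    word = token.lower()
    for suffix in ("ing", "es", "s"):
        # Only strip if at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def stem_in_place(text, stem=toy_stem):
    """Replace every word with its stem, leaving spacing and punctuation untouched."""
    return re.sub(r"\w+", lambda match: stem(match.group(0)), text)

stemmed = stem_in_place("Autos und Auto")
print(stemmed)

# A word list built from the replaced text still conflates both forms
# into one entry, but the document itself keeps one token per position.
print(Counter(stemmed.split()))
```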