Textual ETL: Stemming from dictionary

Wanttoknow
Wanttoknow New Altair Community Member
edited November 2024 in Community Q&A
Hi,

First of all I have to say that RM5.0 is a wonderful tool. :o Congratulations.

I started with preprocessing text for classification and I am having some problems with the "Stem (Dictionary)" component.

I am pointing it to a text file for the patterns, but I am not sure about the syntax of the entries/records in that file. The help is very brief about this.

Right now the first line in my designated TXT file looks like this:

"move: moving moved move"

But it is not replacing any of the terms to their stem.

Any idea?


Answers

  • arminmania
    arminmania New Altair Community Member
    Hi,

    I am not sure, but I think you have to write it as follows:

    move , moving moved move
  • TobiasMalbrecht
    TobiasMalbrecht New Altair Community Member
    Hi,
    Wanttoknow wrote:

    Right now the first line in my designated TXT file looks like this:

    "move: moving moved move"

    Did you try putting a blank before the colon?

    Kind regards,
    Tobias
  • Wanttoknow
    Wanttoknow New Altair Community Member
    Well, after a lot of trial and error, this seems to work:

    "
    aanleveren:aanlever.*
    aanleveren:aangelever.*
    zorgverzekering:zorgverzeker.*
    "
    But putting multiple patterns on one line, like "aanleveren : aanlever* aangelever*", doesn't work.

  • Wanttoknow
    Wanttoknow New Altair Community Member
    Another question:

    Is it possible to use an external list for the ReplaceToken component? That would be more convenient than entering records with the list editor of the component.
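
For the dictionary format that ended up working earlier in the thread (one `stem:pattern` pair per line), here is a rough Python sketch of how such entries could be read and applied. It is not RapidMiner's actual implementation. It assumes that each pattern is a regular expression that has to match the whole token and that the first matching line wins. The names `parse_stem_dictionary` and `stem_token` are made up for this illustration.

```python
import re


def parse_stem_dictionary(text):
    """Parse lines of the form 'stem:pattern' into (stem, compiled regex) pairs."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank or malformed lines
        stem, pattern = line.split(":", 1)
        rules.append((stem.strip(), re.compile(pattern.strip())))
    return rules


def stem_token(token, rules):
    """Return the stem of the first rule whose pattern matches the whole token.

    Assumption: whole-token matching and first-match-wins; the real operator
    may behave differently.
    """
    for stem, pattern in rules:
        if pattern.fullmatch(token):
            return stem
    return token  # no rule matched, keep the token unchanged


# The three entries reported as working in the thread.
DICTIONARY = """\
aanleveren:aanlever.*
aanleveren:aangelever.*
zorgverzekering:zorgverzeker.*
"""

rules = parse_stem_dictionary(DICTIONARY)
tokens = ["aangeleverd", "zorgverzekeraar", "fiets"]
stemmed = [stem_token(t, rules) for t in tokens]
print(stemmed)  # ['aanleveren', 'zorgverzekering', 'fiets']
```

If the patterns really are regular expressions, `aanlever*` (without the dot) would mean "aanleve" followed by any number of "r", which is different from `aanlever.*`. That might be part of why the one-line variant did not behave as expected, but this is only a guess.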
