"regex not working"

allerkonge New Altair Community Member
edited November 2024 in Community Q&A

Dear all,

 

I'm trying to clean a dataset, and I'm working with a couple of regexes. If I test a regex on the website regexpal, it works fine, but if I put the same regex into RapidMiner with the Replace operator, it says there is a mistake.

 

This is one of the regex I'm testing:

 

(?=\b[\m*#])\w+

 

When I try this one, it says it's incorrect.

 

I tried to correct it by adding backslashes:

 

(?=\\b[\\m*#])\\w+

 

Now it no longer says the regex is incorrect, but it doesn't replace anything. The attribute is gender, so I'd like to replace, for example, "mm" or "male" with "M".

 

Thanks a lot for your help.

 

 

Best Answer

  • IngoRM
    IngoRM New Altair Community Member
    Answer ✓

    Hi,

     

    regexpal only works for JavaScript, but RapidMiner uses the Java regex parser.  Despite the similar names, the two languages are completely different, so there is no guarantee that a JavaScript regex will work in Java.  See here for example: http://stackoverflow.com/questions/21883629/java-vs-javascript-regex-matching
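
To see where the two patterns from the question go wrong, here is a minimal sketch in plain Java. It assumes that RapidMiner hands the text of the field to java.util.regex.Pattern unchanged; the class name JavaRegexCheck and the helper compiles are made up for illustration. Java rejects a backslash in front of a letter that has no defined meaning (such as \m), while JavaScript quietly treats \m as a plain "m". Doubling the backslashes in the field makes each \\ a literal backslash character, so the pattern is valid but cannot match a value like "male".

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class JavaRegexCheck {

    // Hypothetical helper: true if java.util.regex accepts the pattern text.
    public static boolean compiles(String patternText) {
        try {
            Pattern.compile(patternText);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pattern text as typed in the field: (?=\b[\m*#])\w+
        // (backslashes are doubled here only because this is a Java string literal)
        String original = "(?=\\b[\\m*#])\\w+";

        // Pattern text as typed in the field: (?=\\b[\\m*#])\\w+
        String doubled = "(?=\\\\b[\\\\m*#])\\\\w+";

        // \m is not a defined escape in Java, so compilation fails.
        System.out.println(compiles(original)); // false

        // Compiles, but \\ now means "a literal backslash character".
        System.out.println(compiles(doubled)); // true

        // "male" contains no backslash, so nothing is found or replaced.
        System.out.println(Pattern.compile(doubled).matcher("male").find()); // false
    }
}
```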

     

    Anyway, in your case can't you just use "m.*" (without quotes) in "replace_what" and "M" in "replace_by"?
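
As a rough illustration of that suggestion, the sketch below applies the same pattern with String.replaceAll in plain Java. It assumes the Replace operator behaves like replaceAll on each value, which is not confirmed here; GenderNormalizer and its two methods are hypothetical names. One thing it makes visible: "m.*" also matches from the "m" inside "female", so anchoring the pattern with ^ may be the safer choice if such values occur in the data.

```java
public class GenderNormalizer {

    // Mimics replace_what = "m.*" and replace_by = "M" (assumed semantics).
    public static String normalize(String value) {
        return value.replaceAll("m.*", "M");
    }

    // Variant anchored at the start, so only values beginning with "m" change.
    public static String normalizeAnchored(String value) {
        return value.replaceAll("^m.*", "M");
    }

    public static void main(String[] args) {
        System.out.println(normalize("mm"));             // M
        System.out.println(normalize("male"));           // M
        System.out.println(normalize("female"));         // feM
        System.out.println(normalizeAnchored("female")); // female
        System.out.println(normalizeAnchored("male"));   // M
    }
}
```

Both patterns are case-sensitive, so a value such as "Male" would be left unchanged unless the pattern is prefixed with (?i).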

     

    I am not sure if it needs to be more complicated than that but I don't know your data of course...

     

    Hope this helps,

    Ingo
