"Issues with regular expressions"

User: "IngoRM"
Updated by Jocelyn
Original message from SourceForge forum at http://sourceforge.net/forum/forum.php?thread_id=2034662&forum_id=390413

The following regular expression works in RapidMiner when applied to any text, extracting words that begin with an uppercase letter: '[A-Z][a-z]+'

However, as soon as I add a space to the pattern, it stops working. For example, '[A-Z][a-z][ ][A-Z][a-z]+' is not recognized as a valid regular expression.

The same expressions work well in other regex text editors.

Any ideas on why RapidMiner is not recognizing the space definition?

Thanks,

FDR


Edit by topic starter:

I found the answer shortly after posting this; spaces seem to be defined by \s as in:
'\s[A-Z][a-z]+\s[A-Z][a-z]+'

The expression above works. However, it finds only the first occurrence of the match. Any ideas on how to get all occurrences?


Answer by Ingo Mierswa:

Hi,

The regular expressions should be the same as those supported by Java, as explained here:

http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html

I am not too sure, but capturing groups might help here:

http://java.sun.com/j2se/1.5.0/docs/api/java/util/regex/Pattern.html#cg

Cheers,
Ingo
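
Editor's note: since the answer points to Java's regex syntax, here is a minimal sketch of how plain java.util.regex returns every occurrence and how capturing groups are read. It only illustrates the Java side. The class name AllMatches, the helper findAll and the sample sentence are made up for this example, and nothing here shows how RapidMiner itself passes the expression to Java or how many matches its operators return.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of RapidMiner.
class AllMatches {

    // Collects every non-overlapping match of regex in text, left to right.
    public static List<String> findAll(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        List<String> matches = new ArrayList<>();
        while (matcher.find()) {          // each call moves on to the next match
            matches.add(matcher.group()); // group() with no argument is the whole match
        }
        return matches;
    }

    public static void main(String[] args) {
        String text = "Ingo Mierswa met John Smith in New York."; // made-up sample

        // Parentheses define capturing groups 1 and 2 (first and second word).
        Matcher m = Pattern.compile("([A-Z][a-z]+)\\s([A-Z][a-z]+)").matcher(text);
        while (m.find()) {
            System.out.println(m.group(1) + " | " + m.group(2));
        }

        // In plain Java, a literal space inside a character class is also accepted.
        System.out.println(findAll("[A-Z][a-z]+[ ][A-Z][a-z]+", text));
    }
}
```

Calling find() in a loop is what yields all occurrences; a single call stops at the first one. Two details from the question are worth checking as well. The expression quoted there, '[A-Z][a-z][ ][A-Z][a-z]+', has no + after the first [a-z], so it can only match when the first word is exactly two letters long. And the \s version with a leading \s requires whitespace before the first word, so a pair at the very start of the text is skipped.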
