Words containing Umlaute in Text Mining
Thesis_12
New Altair Community Member
Dear all,
Apparently RapidMiner is not able to search for certain words containing German Umlaute such as ä, ö, ü, or the ß. When I search for the word "Änderung" with a regular expression (in "Filter Tokens by Region", condition: "contains match"), it doesn't return any results.
I use version 5.3.005 on a Mac and am working with HTML documents. I know that the problem described above does not occur with an older version running on Windows.
However, I need to solve this problem with version 5.3.005 on a Mac.
I tried " .{1,2}nderung", which worked but also returned results like "Minderung", which was not intended.
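To illustrate why the wildcard pattern over-matches, here is a minimal sketch in Python. It only stands in for the regular expression engine RapidMiner uses, and the token list is made up for the example. The stricter pattern is an assumption, not something tested in RapidMiner 5.3.005: it writes the umlaut as a Unicode escape so the pattern text itself is plain ASCII, and it also accepts the decomposed spelling (A followed by a combining diaeresis, U+0308). The leading space from the pattern above is left out here.

```python
import re

# Hypothetical tokens; escapes are used so the example does not depend
# on how this snippet itself is encoded.
tokens = [
    "\u00c4nderung",    # "Änderung", precomposed Ä (U+00C4)
    "A\u0308nderung",   # "Änderung", A + combining diaeresis (U+0308)
    "Minderung",
    "Aenderung",
]

# Wildcard pattern from the post: any one or two characters, then "nderung".
loose = re.compile(r".{1,2}nderung")

# Stricter alternative (assumption): match only the two Unicode spellings of Ä.
strict = re.compile(r"(?:\u00c4|A\u0308)nderung")

loose_hits = [t for t in tokens if loose.search(t)]
strict_hits = [t for t in tokens if strict.search(t)]

print(loose_hits)   # includes "Minderung" and "Aenderung"
print(strict_hits)  # only the two spellings of "Änderung"
```

If a pattern like the stricter one still finds nothing, the text was probably not read as the characters one expects, which points to an encoding problem rather than a regex problem.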
I would be very glad if somebody knew a solution for this problem.
Thanks a lot
Answers
How do you retrieve your data?
For some data retrieval operators you have to configure the correct encoding. If your input data is encoded in UTF-8, for example, you have to set that encoding in the respective operator.
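A minimal Python sketch of what a wrong encoding setting does to the search. It is not RapidMiner code, and it assumes, purely for illustration, that the HTML files are UTF-8 while the reader decodes them as MacRoman:

```python
import re

word = "\u00c4nderung"       # "Änderung"
raw = word.encode("utf-8")   # bytes as they would be stored in a UTF-8 file

# Same bytes, decoded with the assumed wrong charset and with the right one.
wrong = raw.decode("mac_roman")
right = raw.decode("utf-8")

pattern = re.compile(r"\u00c4nderung")

found_wrong = bool(pattern.search(wrong))
found_right = bool(pattern.search(right))

print(repr(wrong), found_wrong)
print(repr(right), found_right)
```

With the wrong charset, the two UTF-8 bytes of "Ä" turn into two unrelated characters, so a search for "Änderung" finds nothing. That would also be consistent with ".{1,2}nderung" matching, since the wildcard accepts any two characters in front of "nderung".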
Best regards,
Marius