"UTF-8 Strings generated by Execute R not displaying properly"

aruberutou
aruberutou New Altair Community Member
edited November 5 in Community Q&A
Hi,

I am using RapidMiner with the "Execute R" operator. I have a script that downloads and parses JSON documents.

In RapidMiner, this process is looped, and the results are then appended and displayed.

The problem is that UTF-8 strings are not displayed correctly in RapidMiner, e.g. "Study objective is to assign patients<U+2019> management with". As you can see, the non-ASCII character (U+2019, a right single quotation mark) shows up as an escape sequence instead of the character itself.

I have already changed my encoding preferences to "UTF-8", and checked that the script displays the text as intended in R.

I am also converting the polynominal attributes returned by "Execute R" to text via "Nominal to String".

I'm just guessing, but perhaps it has something to do with how RapidMiner handles character strings output from "Execute R"?

Any help would be appreciated. Thanks,

Answers

  • Marco_Boeck
    Marco_Boeck New Altair Community Member
    Hi,

    Due to various R components ignoring all our efforts to globally set the encoding to UTF-8 for the entire script, communication with R is sadly limited to the system default encoding at present. Characters that cannot be represented in this encoding will break when processed in R.

    Regards,
    Marco
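Given that limitation, one lossy workaround is to replace the offending characters with ASCII equivalents inside the R script before the data frame is handed back. The sketch below is only an illustration, not an official fix: `to_ascii_punct`, `has_non_ascii` and `clean_text_columns` are hypothetical helper names, the replacement table covers just a few typographic characters, and it assumes the strings are UTF-8 inside R.

```r
# Hypothetical helpers, not part of R or RapidMiner.

# Replace a few common typographic characters with ASCII look-alikes.
to_ascii_punct <- function(x) {
  from <- c("\u2018", "\u2019", "\u201C", "\u201D", "\u2013", "\u2014")
  to   <- c("'",      "'",      "\"",     "\"",     "-",      "-")
  for (i in seq_along(from)) {
    x <- gsub(from[i], to[i], x, fixed = TRUE)
  }
  x
}

# TRUE for strings that still contain characters outside ASCII.
has_non_ascii <- function(x) {
  !is.na(x) & is.na(iconv(x, from = "UTF-8", to = "ASCII"))
}

# Apply the clean-up to every character or factor column of a data frame.
clean_text_columns <- function(df) {
  for (col in names(df)) {
    if (is.factor(df[[col]])) df[[col]] <- as.character(df[[col]])
    if (is.character(df[[col]])) df[[col]] <- to_ascii_punct(df[[col]])
  }
  df
}

# Example data standing in for the parsed JSON (an assumption).
example <- data.frame(
  text = "Study objective is to assign patients\u2019 management with",
  stringsAsFactors = FALSE
)
cleaned <- clean_text_columns(example)
print(cleaned$text)
print(has_non_ascii(cleaned$text))
```

Anything `has_non_ascii` still flags after the clean-up would need its own replacement rule, or would remain subject to the system default encoding.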
  • aruberutou
    aruberutou New Altair Community Member
    Hi,

    Thanks for the quick response. My system default settings should allow UTF-8. It works in RStudio without having the locale set.

    I noticed, however, that text/character columns in R data frames are output to RapidMiner as polynominal rather than text.

    I was wondering if this, perhaps, was responsible?

    If so, is there a data type in R that RapidMiner will encode as text upon output?

    Thanks,
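Both points can be checked from inside the script. The base-R sketch below reports whether the session's native encoding is UTF-8 (a console that renders a string correctly does not by itself show this) and lists each column's class. `encoding_report` and `column_report` are hypothetical helper names and `parsed` is made-up data. Note that before R 4.0.0, `data.frame()` converted strings to factors by default. Whether RapidMiner imports a character column differently from a factor is not confirmed anywhere in this thread, so the sketch only shows the R side.

```r
# Hypothetical diagnostic helpers, not part of R or RapidMiner.

# Report what the R session considers its native encoding.
encoding_report <- function() {
  list(
    locale_ctype   = Sys.getlocale("LC_CTYPE"),
    native_is_utf8 = isTRUE(l10n_info()[["UTF-8"]])
  )
}

# Summarise each column's class and whether any value is marked as UTF-8.
column_report <- function(df) {
  data.frame(
    column   = names(df),
    class    = vapply(df, function(col) class(col)[1], character(1)),
    has_utf8 = vapply(df, function(col) {
      any(Encoding(as.character(col)) == "UTF-8")
    }, logical(1)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

# Made-up data standing in for the parsed JSON.
parsed <- data.frame(
  id = 1:2,
  text = c("plain text", "patients\u2019 management"),
  stringsAsFactors = TRUE  # mimic the default before R 4.0.0
)
print(encoding_report())
print(column_report(parsed))

# Convert the factor column to plain character strings and check again.
parsed$text <- as.character(parsed$text)
print(column_report(parsed))
```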