[solved] Import excel charset/codepage

greg
greg New Altair Community Member
edited November 5 in Community Q&A
Hello

I'm working from an .xls file in codepage 1252. When I import it into RM, all accented characters appear as question marks ("?"). If I open it in OpenOffice, they display correctly.

How can I fix this?

TIA

greg

Answers

  • MariusHelf
    MariusHelf New Altair Community Member
    Did you set the "encoding" parameter of the Read Excel operator? (To see the parameter, you have to enter expert mode by clicking the guy with the hat at the top of the parameter list.)

    Best, Marius
  • greg
    greg New Altair Community Member
    Thanks for the quick answer :)

    I'm always running in expert mode; however, I didn't use the "read excel" operator, I used "file->import data->import excel sheet". Should I use the operator instead? I used the repository feature because I assumed it would make things smoother when I'm sent an updated Excel file.

    greg
  • MariusHelf
    MariusHelf New Altair Community Member
    Hm, the import wizard seems to be missing the encoding option. But especially if you frequently update the Excel file, it would be easier to create a process with the Read Excel operator, configure it once, and use a Store operator to store the result in the repository. That way you only have to re-execute your process to push the updated data into the repo.

    Best, Marius
  • greg
    greg New Altair Community Member
    OK thanks, I tried using the operator; I don't have any "encoding" option, but I do have "time zone" and "locale". I set them to "paris" and "french", with the same result: the accented characters still show up as "?".
  • MariusHelf
    MariusHelf New Altair Community Member
    Which version of RapidMiner are you using? If it is older than 5.2.6, please update to the latest version.

    The time zone and locale parameters only affect date formatting.
  • greg
    greg New Altair Community Member
    Hum sorry, I was assuming the auto updater would keep me up to date, but it seems it didn't.

    I manually downloaded the latest version, set the encoding parameter, and now it's working fine :)

    Thanks a lot!!

    greg
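
For readers who hit the same symptom, here is a minimal sketch of why accented characters can turn into "?" when the charset is wrong. It is plain Python and does not involve RapidMiner at all; the sample string and the variable names are made up for illustration. The thread does not say which charset the older import assumed, so the mismatched charsets below are only examples.

```python
# Illustration only: not RapidMiner code. The sample text is hypothetical.
text = "café à Noël"

# The bytes as they would be stored under Windows codepage 1252.
raw = text.encode("cp1252")

# Decoding with the matching codepage restores the accented characters.
correct = raw.decode("cp1252")

# Decoding with a mismatched charset (UTF-8 chosen as an example) loses them;
# errors="replace" substitutes U+FFFD for each undecodable byte.
wrong = raw.decode("utf-8", errors="replace")

# Converting to a charset that has no accented letters (ASCII as an example)
# turns each of them into a literal question mark.
question_marks = text.encode("ascii", errors="replace").decode("ascii")

print(correct)
print(question_marks)
```

Setting the reader's encoding to match the file, as was done above with the "encoding" parameter, corresponds to the `correct` case.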