
Closest encoding to utf8mb4

User: "Robi_Me"
New Altair Community Member
Updated by Jocelyn
I am working with social media data and the emojis are driving me crazy: when I import them, they are converted to the system encoding and turn into a bunch of squiggles. Which encoding is closest to utf8mb4, so that I can preserve the emojis when reading from a CSV?

    User: "Robi_Me"
    New Altair Community Member
    OP
    Accepted Answer
    @jwpfau When I import into the DB, it fails, saying the character is not UTF8, with the error message: Incorrect string value: '\xE2 \x94 \x82....'

    This affected basically all of the emojis, which were being rejected. I was under the impression that I needed to set the encoding inside RapidMiner, but the change was actually needed on the DB. Changing the free-text field to TEXT and setting its encoding to utf8mb4 sorted the issue out.
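
For anyone hitting the same thing, here is a rough sketch of how to check which rows in a CSV actually need a utf8mb4 column. It is plain Python rather than RapidMiner, and it assumes the file itself is UTF-8 and the database is MySQL or MariaDB, where the older utf8 (utf8mb3) character set stores at most 3 bytes per character while utf8mb4 allows 4. The helper names `needs_utf8mb4` and `read_csv_utf8` and the sample data are made up for illustration.

```python
import csv
import io


def needs_utf8mb4(text: str) -> bool:
    """Return True if any character takes 4 bytes in UTF-8.

    Code points above U+FFFF (most emojis) encode to 4 bytes, which
    a 3-byte utf8/utf8mb3 column cannot store.
    """
    return any(ord(ch) > 0xFFFF for ch in text)


def read_csv_utf8(raw: bytes) -> list[list[str]]:
    """Decode CSV bytes explicitly as UTF-8 (assumed file encoding)."""
    return list(csv.reader(io.StringIO(raw.decode("utf-8"), newline="")))


# Hypothetical sample standing in for a social media export.
sample = "id,comment\n1,great product 😀\n2,plain text\n".encode("utf-8")

rows = read_csv_utf8(sample)
# Skip the header row and flag comments that need a utf8mb4 column.
flags = [needs_utf8mb4(row[1]) for row in rows[1:]]
```

With this sample, only the first comment is flagged. Decoding explicitly as UTF-8 on the reading side avoids the fallback to the system encoding, but as noted above, the column on the DB side still has to be utf8mb4 for the insert to succeed.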