When loading text files, RapidMiner introduces spaces and special characters

lavramu
lavramu New Altair Community Member
edited November 2024 in Community Q&A
Hi,

I am trying to do a very simple task: loading text files using the operator "Process Documents from Files". After loading, I see that there are spaces between each character in the file and also a special character sequence (ÿþ) at the beginning of every file.

Example:

b a l a n c e  s h e e t

I am really stuck and would appreciate any help.
I chose the regular options while loading the files and didn't see this problem in any of the tutorials, yet it is happening to me.

Answers

  • lavramu
    lavramu New Altair Community Member
    Adding to my question: I notice this does not happen to all files, only to the ones I exported from NVivo. I exported them as normal text files and they look normal to me, but they turn up weird in RapidMiner. Please help.
  • aborg
    aborg New Altair Community Member
    Hello,
    Are you sure those second characters are spaces and not characters with code 0? (A space has code 32.) If they are 0s, it seems the NVivo files are saved as UTF-16 with a byte order mark (BOM) set. (I guess RM does not try to use the encoding specified by the BOM.)
    Cheers, gabor
  • lavramu
    lavramu New Altair Community Member
    Thanks a lot for the reply. They look like spaces to me. I am not sure if they are anything else. In Notepad I see them as white space.
  • MariusHelf
    MariusHelf New Altair Community Member
    Did you try changing the encoding parameter of Process Documents from Files?

    Best regards,
    Marius
  • lavramu
    lavramu New Altair Community Member
    Let me try and post back. Thanks!
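
The diagnosis suggested above (UTF-16 text with a byte order mark being read as a single-byte encoding) can be checked outside RapidMiner. The following is only a minimal Python sketch under that assumption, not something taken from the thread: the helper names `sniff_bom` and `to_utf8` and the `cp1252` fallback are hypothetical choices for illustration. It reproduces the "ÿþ" prefix and the gaps between letters, then re-encodes the bytes as UTF-8.

```python
import codecs

def sniff_bom(raw):
    """Return a codec name implied by a leading BOM, or None if there is none."""
    # UTF-32 is checked first because the UTF-32 LE BOM begins with the
    # same two bytes (FF FE) as the UTF-16 LE BOM.
    candidates = (
        (codecs.BOM_UTF32_LE, "utf-32"),
        (codecs.BOM_UTF32_BE, "utf-32"),
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    )
    for bom, name in candidates:
        if raw.startswith(bom):
            return name
    return None

def to_utf8(raw, fallback="cp1252"):
    """Decode using the BOM if present (else the assumed fallback), return UTF-8 bytes."""
    encoding = sniff_bom(raw) or fallback
    return raw.decode(encoding).encode("utf-8")

# Build the bytes a UTF-16 (little-endian, with BOM) export would contain.
raw = codecs.BOM_UTF16_LE + "balance sheet".encode("utf-16-le")

# Reading those bytes with a single-byte encoding reproduces the symptom:
# the BOM shows up as "ÿþ" and every letter is followed by a NUL (code 0),
# which many editors draw as a blank.
garbled = raw.decode("latin-1")
print(repr(garbled[:6]))

# Honouring the BOM recovers the original text.
print(to_utf8(raw).decode("utf-8"))
```

To apply this to real files, one could read each export with `pathlib.Path.read_bytes()`, pass the bytes through `to_utf8`, and write the result to a new file before loading it. Setting the operator's encoding parameter to match the files, as suggested in the reply above, should make this conversion unnecessary.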
