CSV File becomes very large when imported to RapidMiner

konmad New Altair Community Member
edited November 5 in Community Q&A
Hey everyone,

I have a .csv file that is 283 MB on disk, but once it is loaded into RapidMiner it takes up roughly 6.7 GB, which is too large for my system to handle for text mining. The file contains 3 columns and around 220,000 rows. Two of the columns are simply IDs, and the third holds the actual text, around 100 words per row. Maybe some of you have run into the same issue and can help me fix it, or at least understand what is going on here.

Thank you guys in advance!

Answers

  • Caperez
    Caperez Altair Community Member
    Hi @konmad

    This issue could be caused by the encoding and the metadata.
    Have you tried using the Store operator after the CSV import and then loading the data directly from RapidMiner with the Retrieve operator?

    Best, 
    Cesar
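
As a rough sanity check on the figures in the question, the Python sketch below (plain Python, not RapidMiner code) works out the average row size before and after import, and shows how the same ASCII text doubles in size when held as UTF-16 instead of UTF-8. The sample string is made up, decimal units are assumed, and whether RapidMiner keeps text in a wider encoding internally is an assumption, not something confirmed in this thread.

```python
# Figures quoted in the question (decimal units assumed).
csv_bytes = 283 * 10**6      # 283 MB on disk
loaded_bytes = 6.7 * 10**9   # roughly 6.7 GB after import
rows = 220_000


def per_row(total_bytes, n_rows):
    """Average number of bytes per row."""
    return total_bytes / n_rows


# How many times larger the data is once loaded.
growth = loaded_bytes / csv_bytes

# Encoding effect on a purely illustrative ASCII sample (not from the real file).
sample = "word " * 100
utf8_size = len(sample.encode("utf-8"))
utf16_size = len(sample.encode("utf-16-le"))

print(f"on disk: {per_row(csv_bytes, rows):.0f} bytes/row")
print(f"loaded:  {per_row(loaded_bytes, rows):.0f} bytes/row")
print(f"growth:  {growth:.1f}x")
print(f"UTF-8: {utf8_size} bytes, UTF-16: {utf16_size} bytes")
```

In this sketch the wider encoding accounts for a factor of about two, while the reported growth is much larger than that, so encoding alone would not explain it. That is consistent with the suggestion that metadata or other per-value overhead may also play a part.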