"Issues Importing CSV"

jbartot
jbartot New Altair Community Member
edited November 5 in Community Q&A
Hi,

I am trying to import a CSV file that is about 25 MB in size.  RM really struggles to process the file.  It maxes out the processors for about 10 minutes before it finally gives up and runs out of memory.  I explicitly set the Java heap size at launch and can see that the OS is giving RM the 2 GB of memory I specified.  I tried this on a smaller file (1/5 the size) and got similar behavior.

I have tried importing both to a repository and to the workspace, and I get the same behavior either way.  The data itself is 500 x 12,000 (bag-of-words document vectors).  Even assuming each feature value takes up 8 bytes (for doubles), it doesn't make sense that this is such a struggle.
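
For reference, the estimate above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes a fully dense table of 8-byte doubles with no per-cell or per-object overhead, which may not match how RM actually stores the data internally. The function name `dense_size_bytes` is made up for illustration.

```python
# Rough memory estimate for a dense numeric table.
# Assumption: every cell is a raw 8-byte double, with no object or metadata overhead.
ROWS = 500
COLUMNS = 12_000
BYTES_PER_DOUBLE = 8


def dense_size_bytes(rows: int, columns: int, bytes_per_value: int = BYTES_PER_DOUBLE) -> int:
    """Return the raw size in bytes of a dense rows x columns table."""
    return rows * columns * bytes_per_value


size_bytes = dense_size_bytes(ROWS, COLUMNS)
size_mib = size_bytes / (1024 ** 2)
print(f"{ROWS * COLUMNS:,} cells -> {size_bytes:,} bytes (~{size_mib:.1f} MiB)")
# prints: 6,000,000 cells -> 48,000,000 bytes (~45.8 MiB)
```

Under those assumptions the raw values come to under 3% of a 2 GB heap, so the values alone would not seem to explain running out of memory.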

Any ideas?  Am I thinking about this right?

Any help would be appreciated.

Jay

Answers

  • land
    land New Altair Community Member
    Hi,
    Well, that shouldn't happen. Is the data confidential? If not, would it be possible to send it to me? Then I will include it in our checks.

    Greetings,
      Sebastian
  • jbartot
    jbartot New Altair Community Member
    Happy to share the data.  Given its size, where should I send it?

    Thanks

    Jay
  • land
    land New Altair Community Member
    Hi,
    Please send me an email. If you compress the data, it should fit in my mail account.

    Greetings,
      Sebastian