Database doesn't appear after importing

Jack1701 New Altair Community Member
edited November 5 in Community Q&A
Hello,

I'm trying to import a .csv dataset into RapidMiner. After I tell it where to store the data, it says that it is importing the data to the location I specified, but once the import window closes, the dataset doesn't appear.

I'm trying to import the 2014 .csv from the Stanford Database on Ideology, Money in Politics, and Elections (public version 2.0), and I've already extracted it from the .gz. I use the settings that it recommends in the data format step, and no errors seem to appear. I have it replace errors with missing values, then I try to place the data in the local repository. It says it is importing the data, but nothing happens when it looks like it is finished.

I don't know what is going on, and why it is doing this.

Thank you for the help,

Jack.

Answers

  • sgenzer
    Altair Employee
    Hi @Jack1701, that clearly should not happen. Can you please share a screenshot of the point where "Then after the importing data window closes the dataset doesn't appear"? Also, please send me your rapidminer-studio.log file. It is in your .RapidMiner folder.

    Scott

  • Jack1701
    Jack1701 New Altair Community Member


    Here is a screenshot right before the import appears to stop, and what it shows after.



  • sgenzer
    Altair Employee
    Hi @Jack1701, OK, you did not warn me that this was a 10 GB CSV file :smile: I just tried to import it myself and got an error (exactly the same as what I saw in your log file).


    Keep in mind that it took RM about 15 min to get here. How much RAM do you have on your machine?

    Scott

  • Jack1701
    Jack1701 New Altair Community Member
    Sorry, I didn't realize that was abnormally large.

    I have 16.0 GB of RAM on my machine.
  • Jack1701
    Jack1701 New Altair Community Member
    I opened it in Excel, which was able to load the first 1,048,575 rows, and RapidMiner was able to import that when I saved it as another .csv file. It ended up being about 350 MB. Is there a way to get Excel to load different parts of the file so I can bring the file in piece by piece, or is that the most Excel can do in this case?
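One way to get the "load it in parts" effect without Excel is to stream the CSV and write it out as several smaller files, each with the header row repeated. The following is only a minimal sketch, not something from RapidMiner itself: the function name `split_csv`, the `part_NNNN.csv` file naming, the default of one million rows per part, and the UTF-8 encoding are all assumptions for illustration.

```python
import csv
from pathlib import Path


def split_csv(src, out_dir, rows_per_part=1_000_000):
    """Stream `src` and write numbered part files, each starting with the header.

    Only one row is held in memory at a time, so the size of the source file
    does not matter. Returns the list of part file paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    parts = []

    # Encoding is an assumption; adjust it to match the actual file.
    with open(src, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:
            return parts  # empty input, nothing to split

        out = None
        writer = None
        count = 0
        try:
            for row in reader:
                # Start a new part file on the first row and whenever the
                # current part has reached its row limit.
                if writer is None or count >= rows_per_part:
                    if out is not None:
                        out.close()
                    path = out_dir / f"part_{len(parts) + 1:04d}.csv"
                    out = open(path, "w", newline="", encoding="utf-8")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    parts.append(path)
                    count = 0
                writer.writerow(row)
                count += 1
        finally:
            if out is not None:
                out.close()

    return parts
```

Each resulting part could then be imported separately, in the same way the 350 MB file saved from Excel was.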
  • sgenzer
    Altair Employee
    So as you can see, Excel is a piece of cr@p when it comes to handling large data sets. My local installation (Office 365 Excel for Mac) only uses ONE logical processor, so it's not even parallelized. My advice would be to load the data set into a MySQL database and forget Excel.

    Scott
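As a rough illustration of the "put it in a database" advice, the sketch below streams a CSV into a database table in batches so the whole file never has to fit in RAM. Note the swap: the advice above names MySQL, but this example uses SQLite (via Python's built-in `sqlite3` module) purely so that it runs without a database server. The function name `load_csv_to_sqlite`, the table name `contributions`, the batch size, the UTF-8 encoding, and storing every column as TEXT are all assumptions.

```python
import csv
import sqlite3
from itertools import islice


def load_csv_to_sqlite(csv_path, db_path, table="contributions", batch_size=50_000):
    """Load a CSV into a SQLite table in fixed-size batches.

    The table name is interpolated into SQL, so it must come from a trusted
    source. Every column is created as TEXT (an assumption to keep the sketch
    simple). Returns the number of data rows inserted.
    """
    total = 0
    # Encoding is an assumption; adjust it to match the actual file.
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        n = len(header)

        # Quote column names so headers with spaces or dots still work.
        cols = ", ".join('"' + c.replace('"', '""') + '" TEXT' for c in header)
        placeholders = ", ".join("?" for _ in header)

        conn = sqlite3.connect(db_path)
        try:
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
            while True:
                batch = list(islice(reader, batch_size))
                if not batch:
                    break
                # Pad short rows with NULL and trim long ones so every row
                # matches the header width.
                rows = [(row + [None] * n)[:n] for row in batch]
                conn.executemany(
                    f'INSERT INTO "{table}" VALUES ({placeholders})', rows
                )
                conn.commit()
                total += len(rows)
        finally:
            conn.close()
    return total
```

The same batching pattern should carry over to MySQL with a MySQL client library in place of `sqlite3`, though the connection setup and placeholder syntax would differ.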
  • Jack1701
    Jack1701 New Altair Community Member
    Ok, thank you so much for the help. :)