I've encountered the same issue when importing a 1.2GB CSV dataset into RapidMiner Studio; it won't succeed with less than 8GB of RAM.
The dataset is from Kaggle, as linked:
https://www.kaggle.com/backblaze/hard-drive-test-data
Files of this size are pretty common, I think. By comparison, a BI tool like Tableau extracts the same dataset using no more than 900MB of memory. Maybe RapidMiner could improve these operators when reading or importing datasets of this size.
Hi @RayJhong - to be honest, I think that if you're working with 1GB+ data files, you should either upgrade your RAM (8GB is really the baseline for Studio) or use a database.
Scott
Hi Jie,
I agree with you that the memory use is excessive. This is somewhat common in Java software.
Did you try loading the data with a Python or R script? This could save some memory, depending on how efficient the conversion from a data.table / pandas data frame to a RapidMiner ExampleSet is.
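As a rough illustration (not tested against your data), here is a minimal sketch of what that could look like in an Execute Python operator, assuming the usual rm_main convention and a hypothetical local path to the Kaggle CSV. Downcasting the numeric columns can noticeably shrink the in-memory footprint before the table is handed over to RapidMiner:

```python
import pandas as pd

# Hypothetical path -- point this at wherever you saved the Kaggle CSV.
CSV_PATH = "harddrive.csv"

def rm_main():
    # Read the file with pandas instead of the Read CSV operator.
    df = pd.read_csv(CSV_PATH, low_memory=False)

    # Downcast 64-bit numeric columns to smaller types where possible,
    # which can roughly halve the memory used by the data frame.
    for col in df.select_dtypes(include="float64").columns:
        df[col] = pd.to_numeric(df[col], downcast="float")
    for col in df.select_dtypes(include="int64").columns:
        df[col] = pd.to_numeric(df[col], downcast="integer")

    # Returning the DataFrame hands it back to the process as an ExampleSet.
    return df
```

You could also pass usecols= to read_csv to load only the attributes you actually need, which helps even more with wide files like this one.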
Of course, you will probably still need more RAM to train models on the dataset. You can also try a combination of the Store and Free Memory operators so that you don't have a lot of tables hanging around in memory.
Best,
Sebastian