Marco Boeck wrote:

Hi,

I just did some testing here. I created two .csv files, both with 500 attributes: one had 6000 examples, the second one 30000 examples. (I tried with 60000 examples, but after I filled the file, Notepad++ refused to open it because it was too big.) I let RapidMiner open both: the first one (55MB) needed about 150MB of memory, the second one (275MB) needed about 750MB of memory. Both were opened by the latest RapidMiner development version without any problems (I have 8GB of RAM on this machine). Note that these .csv files were filled with only double values.

Now for the .csv files with strings:
500 attributes, 6000 examples, each string consisting of 26 chars: 77MB file, RM needed ~1GB to load the data.
500 attributes, 30000 examples, each string consisting of 26 chars: 386MB file, RM needed ~3.5GB to load the data.

This leads me to this:
1) Please upgrade RapidMiner to the latest version.
2) If you still run into this kind of problem, please consider using a more appropriate way of storing large amounts of data, e.g. a database, or if you can't switch from .csv, try using multiple files. A 500MB .csv file is not the most efficient way of doing things - I couldn't even open it via Notepad++ on my machine.

Regards,
Marco
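For the "multiple files" suggestion in point 2, here is a minimal sketch of how a large .csv could be cut into smaller parts before loading. It is plain Python using only the standard library, not a RapidMiner feature. The function name split_csv, the rows_per_file parameter and the "_partNNN" file naming are made up for illustration, and the sketch assumes the first row of the file is a header with the attribute names.

```python
import csv
from pathlib import Path


def split_csv(source, out_dir, rows_per_file):
    """Split `source` into numbered part files, repeating the header in each.

    Assumption: the first row of `source` is a header row.
    Returns the list of part file paths that were written.
    """
    if rows_per_file < 1:
        raise ValueError("rows_per_file must be at least 1")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stem = Path(source).stem

    written = []
    out_file = None
    writer = None
    rows_in_part = 0

    with open(source, newline="") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return written  # empty input, nothing to split

        try:
            # Rows are streamed one at a time, so the whole file is never
            # held in memory at once.
            for row in reader:
                if writer is None or rows_in_part >= rows_per_file:
                    if out_file is not None:
                        out_file.close()
                    # Hypothetical naming scheme: <name>_part001.csv, ...
                    part_path = out_dir / f"{stem}_part{len(written) + 1:03d}.csv"
                    out_file = open(part_path, "w", newline="")
                    writer = csv.writer(out_file)
                    writer.writerow(header)
                    written.append(part_path)
                    rows_in_part = 0
                writer.writerow(row)
                rows_in_part += 1
        finally:
            if out_file is not None:
                out_file.close()

    return written
```

Calling split_csv("big.csv", "parts", 5000) on the 30000-example file would, under these assumptions, produce six part files of 5000 examples each, which can then be loaded one at a time.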