Process Failed - out of memory

Question

Hi all, I have several hundred text files to process, some of them over 1 GB. But even with a few dozen files of at most 50 MB each, I get an error after some time: "This process would need more than the maximum amount of available memory. ..." I have a notebook with 8 GB of RAM, and I can see that RM uses about 5 GB. Is there any way to get this working? Can Radoop handle something like this? I cannot find it in the extensions. Here is my XML:

nickeforos · Answer

Hi,

Thanks for your response.

I don't want to do TF/IDF statistics, I need the extracted words.

The keep_text option doesn't help.

The OutOfMemoryException certainly occurs in Process Documents from Files - Extract Content, and maybe also in the Write Excel operator.

So, is there any way to deal with such big files in RM?

Best regards

MariusHelf · Answer

Hi,

If you only need the TF/IDF statistics and no longer need the text after extracting it in Process Documents, you can disable the keep_text option in that operator. That should prevent RapidMiner from keeping the text of all the files in memory.

Apart from that, in which operator does the OutOfMemoryException occur?

Best regards,
Marius
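
For anyone who only needs the extracted words and keeps hitting the memory limit, one possible workaround outside RapidMiner is to stream the files instead of loading them whole. The sketch below is not a RapidMiner feature. It is a plain Python illustration that assumes UTF-8 text files with line breaks, a simple `\w+` tokenizer, and CSV output instead of Excel. The names `iter_tokens` and `extract_words` are made up for this example.

```python
import csv
import re
from pathlib import Path

# Assumption: a "word" is any run of letters, digits, or underscores.
TOKEN_RE = re.compile(r"\w+")


def iter_tokens(path, encoding="utf-8"):
    """Yield lowercase tokens from one file, reading one line at a time."""
    with open(path, "r", encoding=encoding, errors="replace") as handle:
        for line in handle:  # only the current line is held in memory
            for match in TOKEN_RE.finditer(line):
                yield match.group(0).lower()


def extract_words(input_dir, output_csv, pattern="*.txt"):
    """Write one (file, word) row per token and return the row count."""
    rows = 0
    with open(output_csv, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["file", "word"])
        for path in sorted(Path(input_dir).glob(pattern)):
            for token in iter_tokens(path):
                # Rows are written as they are produced, never collected.
                writer.writerow([path.name, token])
                rows += 1
    return rows
```

Calling `extract_words("texts", "words.csv")` would process every `.txt` file in a hypothetical `texts` folder. Memory use depends on the longest line rather than on file size, so a file without line breaks would still be read in one piece. Writing CSV also sidesteps the limit of 1,048,576 rows per sheet in the xlsx format.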