Auto Model - large(?) CSV dataset memory issues
I'm doing research on the CIC IDS 2017 dataset, where a single file contains 200-300 MB of data.
I'm trying to run Auto Model to predict the source IP from the other attributes. I run into memory issues doing this (I have 16 GB of RAM), but I assume the dataset is too large, or I used too many attributes, for the modeling.
So my question is: how many rows and how many attributes can I expect Auto Model to handle?
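One thing I considered is thinning the CSV before loading it into Auto Model. Here is a rough pandas sketch of what I mean; the file name and column names are examples from the dataset and may not match every copy exactly (some distributions of these CSVs have leading spaces in the headers):

```python
# Sketch (outside RapidMiner): shrink the CSV before loading it into Auto Model.
# File name and column names are examples and may differ in your copy.
import pandas as pd

# Read only the columns you actually want to model on.
keep = ["Source IP", "Destination Port", "Flow Duration", "Total Fwd Packets"]
df = pd.read_csv("Monday-WorkingHours.pcap_ISCX.csv", usecols=keep)

# Take a 10% random sample of the rows to cut memory use further.
sample = df.sample(frac=0.10, random_state=42)
sample.to_csv("monday_sample.csv", index=False)
```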
Hello,
RapidMiner is a bit resource-hungry, but loading a file of that size shouldn't be a problem. I have 4 GB of RAM on my MacBook Air and can load the file.
The thing is, with such a limited amount of memory, I usually do three things to make the most of it:
Beyond that, I also tune my RapidMiner Studio installation to use more memory. In this case:
Go to Preferences > System > Data Management and set that number to at least twice the size of your training data.
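If it helps to put a number on that rule of thumb, here is a small sketch (plain Python, the file name is a placeholder) that derives a suggested value from the on-disk file size:

```python
# Rough sizing sketch: suggest a memory allotment of at least twice the
# training-data size, per the rule of thumb above. Path is a placeholder.
import os

csv_path = "Monday-WorkingHours.pcap_ISCX.csv"  # placeholder file name
size_mb = os.path.getsize(csv_path) / (1024 * 1024)

# In-memory representations are usually larger than the file on disk,
# so treat twice the file size as a floor, not a target.
suggested_mb = int(2 * size_mb)
print(f"CSV on disk: {size_mb:.0f} MB -> set Data Management to >= {suggested_mb} MB")
```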
Hope this helps,