Out of memory error

Saurabh_Sawant_24
Saurabh_Sawant_24 New Altair Community Member
edited November 5 in Community Q&A
I have a data set of 1.7 million transactions running on a Windows server with 32 GB of RAM, but I am still getting an "Out of memory" error with the HBOS algorithm.
Can someone help?

Answers

  • Telcontar120
    Telcontar120 New Altair Community Member
    You have a couple of options here.  These are listed in the order I would pursue them:
    1. Sample your dataset first, then define the outlier boundaries based on HBOS run on the sample, and then create rules to tag outliers in your full dataset. This should work fine, because you probably don't need all 1.7 million records to define your outliers with the HBOS technique.
    2. Temporarily increase the size of your server RAM (if you are running in a cloud environment like AWS or Azure this is pretty easy to do).
    3. Try the RapidMiner Cloud offering, which gives you access to a RapidMiner-provided server for exceptionally large jobs, billed per credit hour.
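A minimal sketch of option 1 above, written in plain Python rather than as a RapidMiner process, may help make the idea concrete. Everything in it is an assumption for illustration: the synthetic two-column data, the helper names `fit_histograms` and `hbos_score`, the `floor` value used for empty or out-of-range bins, and the 99th-percentile cutoff. It is a simplified equal-width-histogram version of HBOS, not the implementation used by the RapidMiner operator.

```python
import math
import random

def fit_histograms(sample, n_bins):
    """Build one equal-width histogram per column from the sample rows."""
    models = []
    for col in zip(*sample):
        lo, hi = min(col), max(col)
        width = (hi - lo) / n_bins or 1.0  # guard against a constant column
        counts = [0] * n_bins
        for v in col:
            counts[min(int((v - lo) / width), n_bins - 1)] += 1
        peak = max(counts)
        # Normalise so the tallest bin has height 1.
        heights = [c / peak for c in counts]
        models.append((lo, hi, width, heights))
    return models

def hbos_score(row, models, floor=1e-3):
    """Sum of log(1 / bin height) over columns; higher means more unusual.

    `floor` (an assumed value) stands in for empty bins and for values
    that fall outside the range seen in the sample.
    """
    score = 0.0
    for v, (lo, hi, width, heights) in zip(row, models):
        if lo <= v <= hi:
            height = heights[min(int((v - lo) / width), len(heights) - 1)]
        else:
            height = 0.0
        score += math.log(1.0 / max(height, floor))
    return score

random.seed(0)
# Hypothetical stand-in for the real transaction table.
normal_rows = [(random.gauss(100, 15), random.gauss(5, 1)) for _ in range(20000)]
planted = (900.0, 40.0)  # one obviously extreme row
full = normal_rows + [planted]

# Step 1: work on a sample only.
sample = random.sample(normal_rows, 2000)
n_bins = round(math.sqrt(len(sample)))
models = fit_histograms(sample, n_bins)

# Step 2: derive a cutoff from the sample's own scores.
sample_scores = sorted(hbos_score(r, models) for r in sample)
threshold = sample_scores[int(0.99 * len(sample_scores))]

# Step 3: apply the fitted histograms and cutoff to every row.
flags = [hbos_score(r, models) > threshold for r in full]
```

Only the sample is held in memory while the histograms are built; scoring the full table afterwards is a single pass that could be done in batches.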
  • M_Martin
    M_Martin New Altair Community Member
    In addition to the suggestions from Telcontar120 above, have you allocated the maximum feasible amount of memory to RapidMiner Studio on your Windows Server machine? You mention that your machine has 32 GB of RAM in total, but how much of that 32 GB is RapidMiner Studio allowed to use?
    From the Settings --> Preferences menu, you can specify the maximum amount of RAM RM Studio can use. One of the machines I use for RapidMiner development has 32 GB of RAM, and I have allocated up to 23 GB of it to RapidMiner Studio because I have had processes fail from running out of memory.
    RapidMiner Studio doesn't automatically grab all of the memory you allocate when it loads, but if you allocate (for example) 20 GB of RAM, RapidMiner Studio will use up to that amount if needed. The Resource Monitor panel always shows how much RAM RM Studio is using at any given time.
    I have also found it useful to close and restart RM Studio after running a memory-intensive process. This frees up considerable memory, because at startup RM Studio does not need all of the RAM you have allocated in Settings --> Preferences.
    Hope this has been helpful and best wishes, Michael Martin
  • Saurabh_Sawant_24
    Saurabh_Sawant_24 New Altair Community Member
    @M_Martin
    We are running this process on RM Server, with 25 GB of RAM allocated to one job agent container and 5 GB to Studio.
    After the process has started, we close Studio to free up the memory.
  • Saurabh_Sawant_24
    Saurabh_Sawant_24 New Altair Community Member
    @Telcontar120
    I am new to both data science and RapidMiner.
    Can you tell me how to define or identify boundaries with HBOS?

    An example would really help me to understand. Thanks.
  • Telcontar120
    Telcontar120 New Altair Community Member
    @Saurabh_Sawant_24 HBOS doesn't require you to define the boundaries for outliers, just the number of bins used to generate the histograms. In my experience it is pretty robust and not overly sensitive to this parameter, so you can start with the default setting of -1 and see what kind of results that generates. [-1 is a special value that sets the number of bins to sqrt(N).]
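To give a feel for what the sqrt(N) default means in practice, here is a small arithmetic sketch. It assumes N is the number of rows passed to HBOS and that the result is rounded to the nearest whole number; neither detail is confirmed in this thread, and `sqrt_rule_bins` is a hypothetical helper name.

```python
import math

def sqrt_rule_bins(n_rows):
    """Number of histogram bins under the sqrt(N) rule (rounding is assumed)."""
    return round(math.sqrt(n_rows))

bins_full = sqrt_rule_bins(1_700_000)   # the full 1.7 million transactions
bins_sample = sqrt_rule_bins(100_000)   # a hypothetical 100,000-row sample
print(bins_full, bins_sample)
```

Under these assumptions the full dataset gets roughly 1,300 bins per attribute, while a 100,000-row sample gets roughly 300.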
  • kanika15
    kanika15 New Altair Community Member
    Hi, in the above scenario, if the pipeline fails on AI Hub because it runs out of memory, can that failure be captured via exception handling?