Nominal2Binominal always throws OutOfMemory
bob123
New Altair Community Member
Hello,
The data set I am working on has almost 500k entries: about 500 entries for each of about 1000 stock tickers. Each row contains a stock ticker symbol (like C or XIN) and a binominal value (0 or 1).
I am trying to generate frequent item sets using FPGrowth, but it complains that the ticker symbols are not binominal. So I tried to run Nominal2Binominal on the data first, but it quickly runs out of heap space. I increased the heap size to 3 GB. It ran for a lot longer, but in the end it still threw an OutOfMemory exception because the heap space was exhausted.
So my question is: is there another way to do this? I just started using RapidMiner yesterday, so if I am missing something obvious, please point it out.
Thanks!
Answers
Hi,
I am not entirely sure that I understood you correctly, but it seems that you have transactional data and want to transform it into basket data. For that purpose you could use pivoting. There was a thread about that some days ago:
http://rapid-i.com/rapidforum/index.php/topic,648.0.html
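To illustrate what pivoting transactional data into basket data means, here is a minimal sketch in plain Python (not RapidMiner). It assumes each row also carries a transaction identifier, such as a trading date, which the original post does not mention; the function name `pivot_to_baskets` and the sample dates are hypothetical. Each basket stores only the tickers whose flag is 1, so no 1000-column matrix is ever built.

```python
from collections import defaultdict


def pivot_to_baskets(rows):
    """Group (transaction_id, ticker, flag) rows into one basket per transaction.

    Assumption: flag is 0 or 1, and transaction_id (e.g. a date) identifies
    which rows belong together. Only tickers with flag == 1 enter the basket.
    """
    baskets = defaultdict(set)
    for transaction_id, ticker, flag in rows:
        basket = baskets[transaction_id]  # creates an empty basket if needed
        if flag == 1:
            basket.add(ticker)
    return dict(baskets)


# Hypothetical sample data: two transactions, two tickers.
rows = [
    ("2009-01-05", "C", 1),
    ("2009-01-05", "XIN", 0),
    ("2009-01-06", "C", 1),
    ("2009-01-06", "XIN", 1),
]

baskets = pivot_to_baskets(rows)
for transaction_id, tickers in sorted(baskets.items()):
    print(transaction_id, sorted(tickers))
```

With about 500 transactions and 1000 tickers, this sparse form holds at most 500 small sets instead of one dense row per original entry.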
If you want to apply Nominal2Binominal in memory (not in the database) on your data set, this would result in 500K * 1000 symbols * 8 bytes, which is approx. 3.8 GB of raw data without any overhead. This is possible on a 64-bit machine with 8 GB+ of memory, but certainly not on a 32-bit machine, where Java can only use about 1.5 GB (no matter what you specify).
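The estimate above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes "500K" means 500 * 1024 rows and that "GB" is counted in units of 2^30 bytes, which is the reading that gives the 3.8 figure. The variable names are illustrative.

```python
# Raw size of a dense table after Nominal2Binominal, ignoring all overhead.
num_rows = 500 * 1024      # assumption: "500K" read as 500 * 1024 entries
num_columns = 1000         # one binominal column per ticker symbol
bytes_per_value = 8        # 8 bytes per stored value, as stated in the post

raw_bytes = num_rows * num_columns * bytes_per_value
raw_gib = raw_bytes / 2**30

print(f"{raw_bytes} bytes = {raw_gib:.1f} GiB")
```

Reading "500K" as exactly 500,000 rows gives about 3.7 instead, so the conclusion is the same either way: the dense table alone exceeds a 3 GB heap.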
Cheers,
Ingo
Thanks for the reply. Yes, it seems like transforming the data into basket data would be helpful, but I cannot find the pivoting operator, Example2AttributePivoting, that you mention in the other thread.
I searched the user guide, but could not find any mention of it. I am using RapidMiner 4.3.
Sorry, I posted as guest.