ModelApplier needs too much memory with high-dimensional data?
Legacy User
Hi again,
I was playing around with cross-validation for some time, using one of the templates that come with RapidMiner and the sparse toy data file. With the toy data, the standard XVal with a LibSVM classification learner + ModelApplier + Evaluator runs in less than 2 seconds.
Then I changed the dimension of the data from the current 25 features to something larger (e.g. 100000), simply by adding 1 additional feature with the index 99999 and some value to each of my 10 sparse data vectors.
Unfortunately, the application (!) of the learned model to the test data now takes extremely long and uses incredible amounts of memory. When I do the same without RapidMiner, using a simple Perl script and the standard LibSVM implementation, the XVal is again done in seconds. Am I using the wrong ModelApplier or the wrong options?
Thank you so much,
Mome
land
Hi Mome,
this might result from some internal conversions, but I'm not sure. Could you please send me the example data file and the process?
Greetings,
Sebastian
Legacy User
Sorry for the late reply; another project occupied all my time. Meanwhile, I found out that RapidMiner does indeed work very well. I found my stupid mistake:
The SparseFormatExampleSource has a "DataManagement" parameter. When I store 1 million (very sparse) attributes for thousands of samples using a double_array, I assume this leads to an extremely large (and almost entirely empty) matrix. Choosing "boolean_sparse_array" instead worked well for my problem. I promise to read the operator description more carefully next time.
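To put rough numbers on that assumption, here is a back-of-the-envelope sketch in Python. It is not RapidMiner's actual storage code: the 8 bytes per double, the 12 bytes per sparse entry (a 4-byte index plus an 8-byte value), the 26 non-zero values per example, and the figure of 1000 examples standing in for "thousands" are all assumptions chosen for illustration.

```python
def dense_bytes(n_examples, n_attributes, bytes_per_value=8):
    """Memory for a dense layout: every attribute of every example is stored."""
    return n_examples * n_attributes * bytes_per_value


def sparse_bytes(n_nonzero, bytes_per_entry=12):
    """Memory for a sparse layout: only non-zero entries are stored.

    Assumption: each entry costs a 4-byte index plus an 8-byte value.
    """
    return n_nonzero * bytes_per_entry


# Toy case from the first post: 10 examples, highest attribute index 99999.
# Assumption: at most 26 non-zero values per example (25 original + 1 added).
toy_dense = dense_bytes(10, 100_000)
toy_sparse = sparse_bytes(10 * 26)

# Larger case: 1 million attributes, with 1000 examples as a hypothetical
# stand-in for "thousands of samples".
big_dense = dense_bytes(1000, 1_000_000)

print(f"toy data, dense:  {toy_dense:,} bytes")
print(f"toy data, sparse: {toy_sparse:,} bytes")
print(f"1M attributes x 1000 examples, dense: {big_dense / 1e9:.1f} GB")
```

Under these assumptions the dense layout grows with the highest attribute index rather than with the number of values actually present, which would fit the slowdown I saw after adding a single feature at index 99999.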
Thanks a lot
Mome