Large Data Handling
sangihi
Hi,
I'm trying to classify a large dataset (about 500K rows) using Decision Trees in RapidMiner.
Unfortunately, the GUI runs out of memory even with 50K rows.
Since C4.5 can learn from the data instance by instance, is it possible to "stream" the data so the whole dataset doesn't have to be held in memory?
Accepted answers
marcin_blachnik
1) RM manages memory very efficiently, so if you are able to store the dataset in memory, you shouldn't have any problems.
2) If you still run into problems, use a database to store the data.
3) The main problem is that the decision tree (DT) built into RM is very weak. My advice: don't use it; use J48 from the Weka plugin instead, or use other models such as kNN, Naive Bayes, etc.
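To illustrate the "instance by instance" idea from the question: some of the models suggested above, such as Naive Bayes, can be trained incrementally, so the full 500K rows never need to sit in memory at once. This is a minimal Python sketch (not a RapidMiner or Weka API) of a categorical Naive Bayes that is updated one row at a time:

```python
from collections import defaultdict
import math

class StreamingNaiveBayes:
    """Categorical Naive Bayes updated one instance at a time,
    so the full dataset never has to be held in memory."""

    def __init__(self):
        self.class_counts = defaultdict(int)   # label -> number of rows seen
        # (attribute index, attribute value) -> {label: count}
        self.feature_counts = defaultdict(lambda: defaultdict(int))
        self.n = 0                             # total rows seen

    def partial_fit(self, row, label):
        """Update the counts with a single labeled instance."""
        self.class_counts[label] += 1
        self.n += 1
        for i, value in enumerate(row):
            self.feature_counts[(i, value)][label] += 1

    def predict(self, row):
        """Return the most likely label under the counts seen so far."""
        best_label, best_score = None, float("-inf")
        for label, c_count in self.class_counts.items():
            score = math.log(c_count / self.n)  # log prior
            for i, value in enumerate(row):
                counts = self.feature_counts.get((i, value), {})
                # Laplace smoothing avoids log(0) for unseen combinations
                score += math.log((counts.get(label, 0) + 1) / (c_count + 2))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Stream rows in (e.g. from a CSV reader) instead of loading them all:
model = StreamingNaiveBayes()
for row, label in [(["sunny", "hot"], "no"), (["sunny", "mild"], "no"),
                   (["rain", "cool"], "yes"), (["rain", "mild"], "yes")]:
    model.partial_fit(row, label)
```

The same pattern works with a `csv.reader` over the full 500K-row file, since only the count tables grow, not the raw data.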