Tackle large files
choose_username
New Altair Community Member
Hello all,
I have a large data set (15 attributes and almost 50,000 records). The problem is this: if I use, for example, the Detect Outlier operator, RapidMiner needs a very long time to run it. Is there a solution to this (I mean without using a different computer)? Or do I need to look for a new data set?
Thanks in advance
User
Answers
Hello,
Well, there is no general answer for this. Some algorithms simply have long runtimes (neural networks, the relevance vector machine and, as far as it seems, also the outlier detection operator). In contrast to other data mining solutions, RapidMiner does not remove such algorithms, since they work quite well on smaller data sets (or faster machines). Actually, there is not much you can do besides:
- using only a sample of the data
- trying different schemes or different approaches for your problem, in this case for outlier detection
- checking whether the algorithm is available in a parallel working mode and using more than one CPU core
- inspecting the source code and checking whether it can be optimized / parallelized, which we are then happy to include in RapidMiner if you allow this
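To illustrate the first suggestion (working on a sample), here is a minimal Python sketch. It is not RapidMiner code and makes no claim about how the Detect Outlier operator works internally. It assumes a naive distance-based detector that compares every record with every other record, so its cost grows roughly with the square of the number of records. The helper names `sample_rows`, `knn_distance_scores` and `top_outliers` are hypothetical, and the data is synthetic.

```python
import math
import random


def sample_rows(rows, sample_size, seed=0):
    """Return a uniform random sample without replacement (hypothetical helper)."""
    if sample_size >= len(rows):
        return list(rows)
    return random.Random(seed).sample(rows, sample_size)


def knn_distance_scores(rows, k=5):
    """Score each row by the Euclidean distance to its k-th nearest neighbour.

    Assumption: a naive all-pairs loop, so the work is O(n^2) in the row count.
    """
    scores = []
    for i, a in enumerate(rows):
        dists = sorted(math.dist(a, b) for j, b in enumerate(rows) if j != i)
        scores.append(dists[k - 1])
    return scores


def top_outliers(rows, n_outliers=10, k=5):
    """Return the n_outliers rows with the largest k-th-neighbour distance."""
    scores = knn_distance_scores(rows, k)
    ranked = sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)
    return [rows[i] for i in ranked[:n_outliers]]


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stand-in for the data set: 50,000 records, 15 numeric attributes.
    data = [tuple(rng.gauss(0, 1) for _ in range(15)) for _ in range(50_000)]
    # Score only a 2,000-row sample instead of all 50,000 rows.
    subset = sample_rows(data, 2_000)
    for row in top_outliers(subset, n_outliers=5):
        print(row[:3], "...")
```

Under the quadratic-cost assumption, going from 50,000 to 2,000 rows cuts the number of pairwise comparisons by a factor of about 625. The trade-off is that only sampled records get scored, and neighbourhoods in the sample are sparser than in the full data, so the results are an approximation.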
Ingo
Thank you for your fast answer. I think I will look for another data set.
Greetings
User