Tackle large files

Hello all,

I have a large data set (15 attributes and almost 50,000 records). The problem is this: if I use, for example, the Detect Outlier operator, RapidMiner needs a very long time to run it. Is there a solution to this (I mean without using a different computer)? Or do I need to look for a new data set?

Thanks in advance

User

IngoRM

Hello,

well, there is no general answer to this. Some algorithms simply have long runtimes (like neural networks, the relevance vector machine and, as far as it seems, also the outlier detection operator). In contrast to other data mining solutions, RapidMiner does not remove such algorithms, since they work quite well on smaller data sets (or faster machines). Actually, there is not much you can do besides

using only a sample of the data
trying different schemes or different approaches for your problem, in this case for outlier detection
checking whether the algorithm is available in a parallel working mode, so that it can use more than one CPU core
inspecting the source code and checking whether it can be optimized / parallelized, which we are then happy to include in RapidMiner if you allow this
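
To illustrate the first option outside of RapidMiner, here is a minimal Python sketch. It assumes a brute-force, distance-based outlier score (distance to the k-th nearest neighbour), which may or may not match what the Detect Outlier operator does internally. The function names `sample_rows` and `knn_outlier_scores` and the synthetic data are made up for this example. The point is only that a pairwise-distance method scales roughly quadratically with the number of records, so scoring a 1,000-record sample is far cheaper than scoring all 50,000.

```python
import math
import random


def sample_rows(rows, n, seed=0):
    """Draw up to n rows without replacement; a fixed seed keeps the sample reproducible."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


def knn_outlier_scores(rows, k=5):
    """Score each row by the distance to its k-th nearest neighbour.

    Brute force: every row is compared with every other row, so the cost
    grows roughly with len(rows) ** 2. Requires k < len(rows).
    """
    scores = []
    for i, a in enumerate(rows):
        dists = sorted(math.dist(a, b) for j, b in enumerate(rows) if j != i)
        scores.append(dists[k - 1])
    return scores


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic stand-in for the real data: 50,000 records, 15 numeric attributes.
    data = [tuple(rng.gauss(0, 1) for _ in range(15)) for _ in range(50_000)]

    subset = sample_rows(data, 1_000)          # work on a sample only
    scores = knn_outlier_scores(subset, k=5)   # about 1e6 distance computations

    # Report the highest-scoring sampled records as outlier candidates.
    ranked = sorted(zip(scores, range(len(subset))), reverse=True)[:10]
    for score, idx in ranked:
        print(f"sample row {idx}: score {score:.3f}")
```

Keep in mind that outliers that are not drawn into the sample are never scored, so it can be worth repeating the run with a few different seeds.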

Cheers,
Ingo

choose_username

Thank you for your fast answer. I think I will look for another data set.

greetings

user