Filtering examples based on number of occurrences in attribute

Hi,
For example, I have examples that contain information about visits. Every visit is assigned to a visitor_id. I want to filter the examples (rows) where the visitor_id occurs more than 5 times, so that there are no more than 4 rows for any visitor_id. I tried filter, but that was not helpful.
Any idea how to do this in RapidMiner?
Thanks.
Answers
-
Hi,
While I am pretty sure that the answer to this question will involve the operators "Aggregate", "Pivot", and "Filter Examples", I am unfortunately not sure I fully understood the problem. Can you give us a small data sample (original data) and show what the desired output for this sample should look like?
Merci,
Ingo
-
Hi, no, the Filter Examples operator on its own is not going to help you here (as you saw). The way I see it, you first need to create an attribute that holds the number of occurrences, and then you can filter for n > 5 or whatever. Personally, I would use the Aggregate operator, grouping by visitor_id and aggregating visitor_id with the count function. Then join the result with your original data set on the visitor_id attribute.
Scott
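For anyone who wants to see the logic spelled out, here is a minimal sketch of the same count, join, and filter idea in plain Python. It is not a RapidMiner process. The function names, the visit_count attribute, the page column, and the sample rows are all made up for illustration, and the threshold is passed in as a condition because the question leaves open which side of the cutoff should be kept.

```python
from collections import Counter

def add_occurrence_count(rows, key="visitor_id", count_name="visit_count"):
    """Count rows per key (the Aggregate step) and attach the count
    to every original row (the Join step)."""
    counts = Counter(row[key] for row in rows)
    return [{**row, count_name: counts[row[key]]} for row in rows]

def filter_by_count(rows, keep, count_name="visit_count"):
    """Keep only rows whose occurrence count satisfies `keep` (the filter step)."""
    return [row for row in rows if keep(row[count_name])]

# Hypothetical sample data: visitor "a" has 3 visits, "b" has 2, "c" has 1.
visits = [
    {"visitor_id": v, "page": p}
    for v, p in [("a", "home"), ("a", "cart"), ("b", "home"),
                 ("a", "pay"), ("c", "home"), ("b", "faq")]
]

counted = add_occurrence_count(visits)

# Rows of visitors seen more than twice, and the complementary set.
frequent = filter_by_count(counted, lambda n: n > 2)
rare = filter_by_count(counted, lambda n: n <= 2)
```

With the real data, the condition would become something like `lambda n: n > 5` or `lambda n: n < 5`, depending on which visitors should remain.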