Choose elements from Column

My problem is that I need to remove all rows from a data sheet whose value in a specific column is unique, i.e. occurs only once in that column.
For example, let's say there is a column with results from 1 to 9, and each value can appear anywhere from 0 to 100 times or more. If the numbers 1 and 2 each appear only once in the column, I want to remove their rows.
Any ideas?
Thanks
Ok, got it. It sounds like you could try the "Aggregate" operator with the aggregation function "Count", grouped by the attribute in question, to find the values that should be filtered out (because they rarely show up). Then you could use those values as input to the "Filter Examples" operator, e.g. via a macro ("Extract Macro" operator). You would need the "Multiply" operator to get separate copies of your data, though. It may become a bit labor-intensive if you have a large number of attributes, but there would probably be a way to handle that with a loop operator.
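In case it helps to see the count-then-filter logic spelled out, here is a minimal sketch in plain Python rather than RapidMiner. It only illustrates the idea and is not a RapidMiner process; the function name `drop_rows_with_unique_values`, the column name `result`, and the sample rows are all made up for illustration.

```python
from collections import Counter


def drop_rows_with_unique_values(rows, column, min_count=2):
    """Keep only rows whose value in `column` occurs at least `min_count` times."""
    # Step 1 (roughly what Aggregate with "count" gives you):
    # count how often each value appears in the column.
    counts = Counter(row[column] for row in rows)
    # Step 2 (roughly what Filter Examples does):
    # keep only rows whose value is frequent enough.
    return [row for row in rows if counts[row[column]] >= min_count]


# Hypothetical sample data: the values 1 and 2 each occur exactly once.
rows = [
    {"id": "a", "result": 1},
    {"id": "b", "result": 3},
    {"id": "c", "result": 3},
    {"id": "d", "result": 2},
    {"id": "e", "result": 5},
    {"id": "f", "result": 5},
    {"id": "g", "result": 5},
]

filtered = drop_rows_with_unique_values(rows, "result")
print([row["id"] for row in filtered])  # ['b', 'c', 'e', 'f', 'g']
```

Raising `min_count` would also drop values that show up only a handful of times, which matches the "rarely show up" case.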
Interestingly, a similar problem and solution are taught in the official RM Radoop training.
I recommend going through as many RapidMiner training courses as you can: as well as a snazzy certificate, there are quite a few practical tips on how to approach data mining problems like this.
Use the RegEx parameter in Select Attributes, write the RegEx, and then toggle on Invert Condition.