I have a
dataset of over 42,000 records that contains both duplicate and unique values.
I would like to clean it up by removing ONLY the non-duplicate (unique) records
and keeping the duplicated ones. I know the “Remove Duplicates” operator removes
duplicates, but in my case I want to do the opposite. It’s quite easy to
accomplish this task in Excel, but as you know, Excel can’t comfortably handle
a dataset of this size. Is there any way I can perform this task in RapidMiner?
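
To make the goal concrete, here is a minimal Python sketch of the filtering I have in mind. It is not RapidMiner code. The sample records and the function name keep_duplicates are made up for illustration, and it assumes a record counts as a duplicate only when every field matches another record exactly.

```python
from collections import Counter

def keep_duplicates(records):
    """Return only the records that occur more than once, preserving order."""
    counts = Counter(records)  # how many times each record appears
    return [r for r in records if counts[r] > 1]

# Hypothetical sample data: each record is a tuple of field values.
sample = [("a", 1), ("b", 2), ("a", 1), ("c", 3), ("b", 2)]

# ("c", 3) appears only once, so it is dropped; every copy of the
# duplicated records is kept.
print(keep_duplicates(sample))
```

In other words, I want every copy of a duplicated record to stay, and any record that appears exactly once to be removed.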