How to remove non-duplicate values?

New Altair Community Member

Mar 5, 2019

Updated Nov 5, 2024 by Jocelyn

A RapidMiner user wants to know the answer to this question: "Hey! I have a data set of over 42000 records that has several duplicate and unique values. However, I would like to clean it up and remove only non-duplicate values and leave duplicate records. I know the “remove duplicates” operator removes duplicates but in my case, I want to do the opposite. Any idea how I could do this? Thank you."

MartinLiebig

Altair Employee

Mar 5, 2019

Hi,

Can't you just join the duplicates on the original data? Then you have only duplicates remaining.
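For anyone who wants to see the join idea outside RapidMiner, here is a minimal sketch in plain Python. It is not the RapidMiner process itself: the function name `keep_duplicated_by_join`, the `key_fields` parameter, and the sample rows are all made up for illustration, and records are assumed to be dictionaries.

```python
def keep_duplicated_by_join(records, key_fields):
    """Keep every record whose key occurs more than once (hypothetical helper)."""
    seen = set()            # keys met at least once
    duplicate_keys = set()  # keys met a second time or more
    for rec in records:
        key = tuple(rec[f] for f in key_fields)
        if key in seen:
            duplicate_keys.add(key)
        else:
            seen.add(key)
    # The "join": match the original records against the duplicate keys,
    # so all copies of a duplicated record survive and unique ones drop out.
    return [
        rec for rec in records
        if tuple(rec[f] for f in key_fields) in duplicate_keys
    ]


# Illustrative data only, not from the original data set.
rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Lima"},
    {"id": 1, "city": "Oslo"},
    {"id": 3, "city": "Rome"},
]
result = keep_duplicated_by_join(rows, ["id", "city"])
print(result)
```

With this sample, only the two `id` 1 rows remain, because they are the only records that appear more than once.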

BR,

Martin

sgenzer

Altair Employee

Mar 5, 2019

hi @MarlaBot so the Remove Duplicates operator has both options:

Image: https://us.v-cdn.net/6030995/uploads/editor/o6/1xzn3o8g550x.png

Does this help?

Scott

rfuentealba

New Altair Community Member

Mar 5, 2019

Hey,

You have 42000 records.

Some are duplicate.
Some are unique.

If you need the non-unique records, the dup output of the Remove Duplicates operator returns the records that aren't unique.

Sorry, I got lost in translation and had to reorganize the question because I understood, like, 3 different things. Yes, @sgenzer's question is fine. If what is required is an aggregation (like the count of duplicated events), what @mschmitz says helps, too.
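In case the aggregation reading is the one you need, this is roughly what "count of duplicated events" could look like in plain Python. It is only a sketch: `count_duplicates` and the sample events are hypothetical, and records are assumed to be hashable tuples. Inside RapidMiner you would do the equivalent with the operators mentioned above.

```python
from collections import Counter


def count_duplicates(records):
    """Return {record: occurrences} for records seen more than once (hypothetical helper)."""
    counts = Counter(records)
    return {rec: n for rec, n in counts.items() if n > 1}


# Illustrative data only.
events = [("a", 1), ("b", 2), ("a", 1), ("c", 3), ("a", 1), ("b", 2)]
dup_counts = count_duplicates(events)
print(dup_counts)
```

Records that occur exactly once, like `("c", 3)` here, are left out of the result.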

novice_miner

New Altair Community Member

Mar 5, 2019

Thanks for all your help. It worked like magic.

Best,

Telcontar120

New Altair Community Member

Mar 6, 2019

I think this is the same question as in this thread, where I provided a similar answer: https://community.rapidminer.com/discussion/comment/57000#Comment_57000
