Keep samples based on preferred attribute value
aileenzhou
I have a dataset that contains some duplicated DOIs. For each duplicated DOI, I must keep exactly one row, chosen by the 'source' attribute with the preference B>C>A, and delete the rest.
For example, in the data below, I want to keep rows 1261 and 643 and delete the rest.
Row DOI Source
18 10.1002/67 A
1261 10.1002/67 B
1400 10.1002/67 C
... ...
643 10.102/et.67 C
1428 10.102/et.67 A
Thank you in advance.
Accepted answers
lionelderkrikor
Hi @aileenzhou,
For the preference B>C>A, use the same method as in the other thread, but generate a new attribute called "Source_2" as described:
- Reorder attributes (1/ Source_2, 2/ DOI)
- Generate a new attribute (for example called "Source_2") and replace in this new attribute:
* B by 1
* C by 2
* A by 3
- Generate the concatenation of the "Source_2" and "DOI" attributes (via Generate Aggregation)
- Sort the concatenated attribute alphabetically (via Sort, with sorting direction = increasing)
- Remove duplicates of this concatenated attribute.
- Split back the concatenated attribute to retrieve the original attributes (without the duplicates) or remove them.
Take a look at the attached process and tell me if it answers your need.
Regards,
Lionel
Remove_duplicates_concat_sorting_B.rmp
All comments
MartinLiebig
Since Remove Duplicates always keeps the first example, I think you can sort and then use Remove Duplicates on the DOI.
Best,
Martin
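For readers who want to check the logic outside of a process, here is a minimal Python sketch of the "sort, then keep the first row per DOI" idea. It is not the attached process: the sample rows are copied from the question, while the `SOURCE_RANK` mapping (B=1, C=2, A=3) and the `keep_preferred` function are hypothetical names chosen for illustration.

```python
# Sample rows copied from the question: (row id, DOI, source).
rows = [
    (18, "10.1002/67", "A"),
    (1261, "10.1002/67", "B"),
    (1400, "10.1002/67", "C"),
    (643, "10.102/et.67", "C"),
    (1428, "10.102/et.67", "A"),
]

# Assumed numeric encoding of the preference B > C > A (lower = preferred).
SOURCE_RANK = {"B": 1, "C": 2, "A": 3}


def keep_preferred(rows, rank=SOURCE_RANK):
    """Return one row per DOI, keeping the row with the most preferred source."""
    # Sort by DOI, then by source rank, so the preferred row is first in each DOI group.
    ordered = sorted(rows, key=lambda r: (r[1], rank[r[2]]))
    seen = set()
    kept = []
    for row in ordered:
        doi = row[1]
        if doi not in seen:  # the first occurrence of each DOI wins
            seen.add(doi)
            kept.append(row)
    return kept


result = keep_preferred(rows)
print(result)
# [(1261, '10.1002/67', 'B'), (643, '10.102/et.67', 'C')]
```

The sort only has to guarantee that, within each DOI, the preferred source comes first. Removing duplicates on the DOI alone then drops every less preferred row.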