Compare attribute columns based on value ranges?
Hi,
I want to compare the values of 2 attribute columns from 2 different Excel files, e.g. radius1 and radius2.
Now I want to "identify" those as equal (meaning their ID is the same) if they are equal within a certain range, e.g. radius1 = 1.77 and radius2 = 1.78.
As a formula: if radius1 is between 0.98*radius2 and 1.02*radius2, then they are equal!
Then I want to join all the rows whose entries match the formula above.
Is it somehow possible to identify equality based on ranges like this?
If you are only interested in a casewise (row-by-row) comparison of radius1 and radius2 values, then @BalazsBarany's method works equally well without the Cartesian join: just use Generate Attributes to calculate the difference, then filter for the examples that meet your threshold. But if you do want a pairwise comparison of all possible combinations of radius1 and radius2, I hope you have a small dataset! The number of combinations inflates pretty quickly :-) .
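For anyone who wants to check the logic outside RapidMiner, here is a minimal Python sketch of the casewise comparison. It is not a RapidMiner process; the sample values, the helper name within_tolerance and the 2% tolerance default are assumptions taken from the example in the question.

```python
def within_tolerance(r1, r2, tol=0.02):
    """Return True if r1 lies between (1 - tol) * r2 and (1 + tol) * r2."""
    # sorted() keeps the bounds in the right order even if r2 is negative
    low, high = sorted(((1 - tol) * r2, (1 + tol) * r2))
    return low <= r1 <= high


# Hypothetical sample data, paired row by row (casewise)
radius1 = [1.77, 2.50, 0.90]
radius2 = [1.78, 2.00, 0.91]

# Keep only the rows where the two radii count as "equal"
matches = [
    (i, r1, r2)
    for i, (r1, r2) in enumerate(zip(radius1, radius2))
    if within_tolerance(r1, r2)
]
print(matches)
```

Rows 0 and 2 pass the check here, while row 1 (2.50 vs. 2.00) is filtered out.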
Best,
Hi!
If you don't have too much data, you could do a Cartesian Join, then use Generate Attributes for calculating the difference and then Filter Examples for only keeping the examples with a small difference.
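As a rough illustration of those three steps (Cartesian Join, Generate Attributes, Filter Examples), here is the same idea sketched in plain Python. The table contents, the column names id1/id2 and the 2% tolerance are made-up assumptions, not part of any RapidMiner process.

```python
from itertools import product

TOLERANCE = 0.02  # assumed: 2% of radius2, as in the question

# Hypothetical rows standing in for the two Excel files
table1 = [{"id1": "A", "radius1": 1.77}, {"id1": "B", "radius1": 2.50}]
table2 = [
    {"id2": "X", "radius2": 1.78},
    {"id2": "Y", "radius2": 2.46},
    {"id2": "Z", "radius2": 3.10},
]

joined = []
# Cartesian join: every row of table1 combined with every row of table2
for row1, row2 in product(table1, table2):
    # "Generate Attributes": absolute difference between the two radii
    diff = abs(row1["radius1"] - row2["radius2"])
    # "Filter Examples": keep only pairs with a small relative difference
    if diff <= TOLERANCE * abs(row2["radius2"]):
        joined.append({**row1, **row2, "diff": diff})

for row in joined:
    print(row["id1"], row["id2"])
```

With 2 and 3 rows this checks 6 combinations; with n and m rows it checks n*m, which is why this only suits small data.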
If your example sets have many rows, Cartesian Join will create a huge data set. In that case, you might want to try this Generic Join approach with the built-in scripting:
http://datascientist.at/2016/06/generic-joins-in-rapidminer/#english
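To show why a range join does not need the full Cartesian product, here is a small Python sketch that sorts one column and looks up the matching range with binary search. This is not the script from the linked article, just one possible technique; the function name range_join and the sample values are assumptions.

```python
from bisect import bisect_left, bisect_right


def range_join(radii1, radii2, tol=0.02):
    """Return sorted (i, j) index pairs where radii1[i] is within tol of radii2[j]."""
    # Sort radii1 once, remembering the original row positions
    order = sorted(range(len(radii1)), key=lambda i: radii1[i])
    sorted_r1 = [radii1[i] for i in order]

    pairs = []
    for j, r2 in enumerate(radii2):
        low, high = sorted(((1 - tol) * r2, (1 + tol) * r2))
        # Binary search for the slice of sorted_r1 inside [low, high]
        start = bisect_left(sorted_r1, low)
        stop = bisect_right(sorted_r1, high)
        pairs.extend((order[k], j) for k in range(start, stop))
    return sorted(pairs)


# Hypothetical sample data
pairs = range_join([1.77, 2.50, 0.90], [1.78, 0.91, 5.00])
print(pairs)
```

Each lookup costs a binary search plus the number of actual matches, instead of a scan over every row of the other table.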
Regards,
Balázs