"For each XLS row, calculate similarity among the 3 text cells in that row"
dfischer
New Altair Community Member
Hi everyone,
I would appreciate it if you could share any thoughts on how I could solve the problem below:
INPUT: An Excel file with multiple rows and 3 columns (say columns A, B and C). All Excel content is text.
PROBLEM: For each row, calculate the similarity among the 3 text cells in that row. Then save the calculated similarities.
Example:
If Sim(x,y) is the text similarity between any two cells 'x' and 'y' in the Excel file, an ideal output would be another Excel file that follows the format below:
Sim(A1,B1) Sim(A1,C1) Sim(B1,C1)
Sim(A2,B2) Sim(A2,C2) Sim(B2,C2)
Sim(A3,B3) Sim(A3,C3) Sim(B3,C3)
Sim(A4,B4) Sim(A4,C4) Sim(B4,C4)
Sim(A5,B5) Sim(A5,C5) Sim(B5,C5)
...
Sim(An,Bn) Sim(An,Cn) Sim(Bn,Cn)
I've seen a number of RapidMiner videos to learn this task but haven't succeeded yet.
Any ideas? Since I am still learning the basics, I would appreciate it if you could tell me what the entire process looks like.
Thank you in advance
Answers
Hi,
The operator you might need is Cross Distances. It calculates similarity, but usually between documents that are given as examples. So I think you need to use a Loop and a Transpose (or De-Pivot?) operator to get a vertical example set for each round.
If you could post an example set, I or another helper might find time to build an example process.
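To make the target output concrete, here is a minimal sketch of the per-row computation in plain Python rather than as a RapidMiner process. It assumes Sim(x,y) is the cosine similarity of simple word-count vectors, which is only one possible choice of text similarity; the function names and the sample rows are made up for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def cosine_similarity(text_x, text_y):
    """Cosine similarity between the word-count vectors of two strings."""
    counts_x = Counter(re.findall(r"\w+", text_x.lower()))
    counts_y = Counter(re.findall(r"\w+", text_y.lower()))
    shared = counts_x.keys() & counts_y.keys()
    dot = sum(counts_x[w] * counts_y[w] for w in shared)
    norm_x = math.sqrt(sum(c * c for c in counts_x.values()))
    norm_y = math.sqrt(sum(c * c for c in counts_y.values()))
    if norm_x == 0 or norm_y == 0:
        return 0.0  # an empty cell is treated as similar to nothing
    return dot / (norm_x * norm_y)


def row_similarities(rows):
    """For each (A, B, C) row, return [Sim(A,B), Sim(A,C), Sim(B,C)]."""
    return [
        [cosine_similarity(x, y) for x, y in combinations(row, 2)]
        for row in rows
    ]


# Hypothetical sample rows standing in for the Excel content.
rows = [
    ("the quick brown fox", "the quick brown fox", "a lazy dog"),
    ("red apple", "green apple", "red pear"),
]
for sims in row_similarities(rows):
    print("\t".join(f"{s:.3f}" for s in sims))
```

Each printed line corresponds to one input row, with the three values in the order Sim(A,B), Sim(A,C), Sim(B,C). Inside RapidMiner the same idea would still need the Loop and Cross Distances setup described above; this sketch only shows what each round is expected to produce.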
cheers,
Martin1