How does the Data to Similarity operator work on large datasets?

statspro New Altair Community Member
edited November 5 in Altair RapidMiner
Dear Community Members,

I would like to understand how the Data to Similarity operator works on large datasets. As I understand it, the operator compares every pair of examples, i.e. nC2 = n(n-1)/2 combinations. With only 50 texts, each text is compared with the other 49 (1,225 pairs in total), and the results window shows the similarity results (First, Second and Similarity). But how does it work with a large dataset, e.g. 100,000 texts, where there would be roughly 5 billion pairs? Is there a specific filter I need to use to check text similarity on a large dataset?
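
To make the nC2 reasoning concrete, here is a minimal Python sketch of an all-pairs comparison. It is not RapidMiner's actual implementation: the function names, the whitespace tokenisation and the choice of cosine similarity on raw word counts are all my own assumptions, used only to illustrate the pair counting and the (First, Second, Similarity) layout of the results.

```python
from collections import Counter
from itertools import combinations
from math import comb, sqrt


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of two texts using raw word counts (assumed measure)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def pairwise_similarity(texts):
    """Return one (first, second, similarity) row per unordered pair of texts."""
    return [
        (i + 1, j + 1, cosine_similarity(texts[i], texts[j]))
        for i, j in combinations(range(len(texts)), 2)
    ]


# Hypothetical sample data, only for illustration.
texts = ["the cat sat", "the cat ran", "dogs bark loudly"]
rows = pairwise_similarity(texts)
for first, second, sim in rows:
    print(first, second, round(sim, 3))

# Number of pairs grows quadratically with the number of texts.
print(comb(50, 2))       # 1225
print(comb(100_000, 2))  # 4999950000
```

The last two lines show why I am worried about scale: 50 texts need only 1,225 comparisons, but 100,000 texts would need about 5 billion if every pair is compared.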

Can anyone help me with this?

Thanks,
Arun