Building Similarity Matrix
Hi all,
My problem is as follows: given two groups of documents, I want to compute the cosine similarity between every document in the first group and every document in the second, and output the results as a similarity matrix. The rows and columns of the matrix should be labelled with document names (not terms).
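To make the desired output concrete, here is a rough sketch in plain Python (not RapidMiner) of the kind of matrix I am after. The file names, the sample texts, and the helper functions `term_counts`, `cosine` and `similarity_matrix` are all made up for illustration, and it uses raw term counts with whitespace tokenization rather than whatever weighting the real process would use.

```python
import math
from collections import Counter


def term_counts(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_matrix(group_a, group_b):
    # Rows are document names from group_a, columns are names from group_b.
    vec_a = {name: term_counts(text) for name, text in group_a.items()}
    vec_b = {name: term_counts(text) for name, text in group_b.items()}
    return {
        row: {col: cosine(va, vb) for col, vb in vec_b.items()}
        for row, va in vec_a.items()
    }


# Made-up sample documents standing in for the two folders.
group_a = {"a1.txt": "the cat sat on the mat", "a2.txt": "dogs chase cats"}
group_b = {"b1.txt": "the cat sat on the mat", "b2.txt": "stock prices fell"}

matrix = similarity_matrix(group_a, group_b)
for row, cols in matrix.items():
    print(row, {col: round(score, 3) for col, score in cols.items()})
```

So every cell is one document from the first group compared with one document from the second group, and the labels are document names.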
For the moment, the operator pipeline is:
Process Documents from Files -> Data to Similarity
My questions:
1) Is it OK to use the Process Documents from Files operator and create 2 entries under "text directories", one for each group of documents to compare (so I will have 2 class names, i.e. 2 folders with different documents to compare)?
2) Which operator lets me visualize a document similarity matrix?
Any advice is very much appreciated!