Finding the most similar document(s) in a collection to a test document
crcowan
While I was using version 4 of RapidMiner, I built a chain to perform this function. It is discussed here:
http://rapid-i.com/rapidforum/index.php/topic,1201.msg4577.html#msg4577
and here:
http://rapid-i.com/rapidforum/index.php/topic,680.msg2587.html#msg2587
With the advent of RapidMiner 5, I was wondering whether there are new or better operators that support this function.
The basic requirement is to compare a single (test) document to a set of documents and find the document in the set that best matches the test document (cosine similarity).
Any recommendations?
Thank you.
Charles
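For readers who want to see the requirement outside RapidMiner, here is a minimal Python sketch of the comparison described above: weight the terms of each document, then rank the collection by cosine similarity to the test document. It is not a RapidMiner operator chain. The function names (`tokenize`, `cosine_similarity`, `most_similar`), the whitespace tokenizer, and the smoothed TF-IDF weighting are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace. A real pipeline would
    # also strip punctuation, drop stop words, and possibly stem.
    return text.lower().split()


def cosine_similarity(a, b):
    # a and b are sparse term-weight vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(test_doc, collection):
    # Returns (index of the best match, list of scores for every document).
    if not collection:
        raise ValueError("collection must contain at least one document")
    tokenized = [tokenize(d) for d in collection]
    n = len(tokenized)
    # Document frequencies are taken from the collection only.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Assumption: smoothed inverse document frequency.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

    def vectorize(tokens):
        # Terms unseen in the collection are ignored.
        tf = Counter(tokens)
        return {t: c * idf[t] for t, c in tf.items() if t in idf}

    test_vec = vectorize(tokenize(test_doc))
    scores = [cosine_similarity(test_vec, vectorize(toks)) for toks in tokenized]
    best = max(range(n), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    docs = [
        "the cat sat on the mat",
        "stock markets fell sharply today",
        "a dog chased the cat",
    ]
    best, scores = most_similar("the cat on a mat", docs)
    print(best, [round(s, 3) for s in scores])
```

The document at index `best` is the closest match. Sorting the scores instead of taking the maximum would give a ranked list of the most similar documents.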
radone
Hello Charles,
I have not dealt with a similar problem, but my idea is to use the entropy-based representation of documents (available in the Text Mining extension) and then, for example, use k-NN to check the similarity of the documents.
Regards
radone
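A rough Python sketch of this idea, under stated assumptions: the exact formula behind the extension's entropy-based representation is not given in this thread, so the code below substitutes the common log-entropy term weighting, and "k-NN" is taken to mean returning the k documents with the highest cosine similarity to the test document. The names `entropy_weights`, `vectorize`, `cosine`, and `k_nearest` are hypothetical.

```python
import math
from collections import Counter


def entropy_weights(token_lists):
    # Assumed global weight per term: 1 + sum_j p_ij * log(p_ij) / log(n),
    # where p_ij is the share of the term's occurrences found in document j.
    # A term confined to one document gets weight 1; a term spread evenly
    # over all documents gets weight 0.
    n = len(token_lists)
    counts = [Counter(toks) for toks in token_lists]
    totals = Counter()
    for c in counts:
        totals.update(c)
    weights = {}
    for term, total in totals.items():
        if n < 2:
            weights[term] = 1.0
            continue
        h = sum((c[term] / total) * math.log(c[term] / total)
                for c in counts if term in c)
        weights[term] = 1.0 + h / math.log(n)
    return weights


def vectorize(tokens, weights):
    # Local weight log(1 + tf) times the global entropy weight.
    tf = Counter(tokens)
    return {t: math.log(1 + c) * weights[t] for t, c in tf.items() if t in weights}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def k_nearest(test_doc, collection, k=3):
    # Returns up to k (index, similarity) pairs, most similar first.
    token_lists = [d.lower().split() for d in collection]
    weights = entropy_weights(token_lists)
    test_vec = vectorize(test_doc.lower().split(), weights)
    scored = [(cosine(test_vec, vectorize(toks, weights)), i)
              for i, toks in enumerate(token_lists)]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [(i, score) for score, i in scored[:k]]


if __name__ == "__main__":
    docs = ["red apple pie", "green apple tart", "blue steel beam"]
    for index, score in k_nearest("apple pie recipe", docs, k=2):
        print(index, round(score, 3))
```

With k set to 1 this reduces to finding the single best match, which is the original requirement.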