Classification using user-specified keywords

mdc
mdc New Altair Community Member
edited November 5 in Community Q&A
Hi,

Is it possible in RM to define a class using user-specified keywords? My understanding of classification is that you have to train a model for a given class and then apply the model to new documents.

What I really want is to input keywords xxx and yyy and have RM find all the relevant documents, using for example similarity or classification.

thanks,
Matthew

Answers

  • IngoRM
    IngoRM New Altair Community Member
    Hi Matthew,

    Just a suggestion: I would index your documents by TF-IDF with the text input operator. Store this example set together with the word list. Then build a new document containing only your keywords and index it using the same word list. Instead of classification, you can now merge both example sets and calculate the similarity using, for example, cosine similarity. Keep only those similarities that involve the keyword document and sort by similarity. The basic operators are all part of RM and the text plugin.

    Cheers,
    Ingo
  • mdc
    mdc New Altair Community Member
    Hi Ingo,

    What operator is used to "merge both example sets"?

    thanks,
    Matthew
  • IngoRM
    IngoRM New Altair Community Member
    You won't believe it: it's called "ExampleSetMerge"  ;D

    Cheers,
    Ingo
  • mdc
    mdc New Altair Community Member

    Thanks, but I guess "ExampleSetMerge" was added in 4.3. My PC has 4.2 and I couldn't find it. I'm waiting for 4.4 before upgrading.

    Matthew
  • IngoRM
    IngoRM New Altair Community Member
    Ah, yes, that could be. Updating to 4.3 does not make much sense, since 4.4 is currently in final testing before release.

    Cheers,
    Ingo
  • mdc
    mdc New Altair Community Member
    Hi Ingo,

    I have not implemented it yet (I'm waiting for RM 4.4). However, I have a question about this method. As I understand it, the similarity will be calculated between every pair of documents. Is this correct? Is there a way to calculate the similarity of just one document against several documents?

    thanks
    Matthew
  • IngoRM
    IngoRM New Altair Community Member
    Hi,

    Only with a trick: iterate over the examples with an IteratingOperatorChain, where the number of iterations comes from a macro set by the new DataMacroDefinition operator (the number of examples). Filter the examples down to the current one with the ExampleRangeFilter and merge the keyword document with this single example. Calculate the similarity and store it via ProcessLog. After the loop, you can transform the ProcessLog back into a data set, sort it...

    Cheers,
    Ingo
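
For readers who want to see the idea outside of RM, the approach discussed in this thread (index the documents by TF-IDF, index a keyword-only document with the same word list, then compare that one document against each document in turn and sort the results) can be sketched in plain Python. This is only an illustration under stated assumptions, not a RapidMiner process: the function names, the whitespace tokenizer, and the smoothed IDF formula are all choices made for this sketch, and the list named `log` merely stands in for the ProcessLog.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately naive (assumption): lowercase and split on whitespace.
    return text.lower().split()


def build_word_list(docs):
    """Learn the word list (word -> IDF weight) from the collection only."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(tokenize(doc)))
    n = len(docs)
    # Smoothed IDF (an assumption of this sketch) keeps every weight positive.
    return {w: math.log((1 + n) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def tfidf_vector(text, word_list):
    """Index a text with an existing word list; unknown words are dropped."""
    counts = Counter(t for t in tokenize(text) if t in word_list)
    return {w: c * word_list[w] for w, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(v * b.get(w, 0.0) for w, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_keywords(docs, keywords):
    """Compare one keyword document against each document, then sort."""
    word_list = build_word_list(docs)
    query = tfidf_vector(" ".join(keywords), word_list)
    log = []  # stands in for the ProcessLog: one row per iteration
    for i, doc in enumerate(docs):
        similarity = cosine(query, tfidf_vector(doc, word_list))
        log.append({"document": i, "similarity": similarity})
    return sorted(log, key=lambda row: row["similarity"], reverse=True)


docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stock markets fell sharply",
]
for row in rank_by_keywords(docs, ["cat", "mat"]):
    print(row["document"], round(row["similarity"], 3))
```

Because the keyword document is compared against each document exactly once, this needs only as many similarity calculations as there are documents, rather than one for every pair of documents.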