"How to calculate the distance between two text documents?"

User: "gfyang"
New Altair Community Member
Updated by Jocelyn
Hi,

Suppose there are two text documents, d1 and d2. I can build a vector for each and read them via Iterator<Example>. How can I then calculate the distance or similarity between them, for example the cosine distance? Is there an operator or function for this in RM?

Thank you very much.

Sincerely yours,
gfyang

    User: "fischer"
    New Altair Community Member
    Hi,

    yes. Please look into com.rapidminer.tools.math.similarity.DistanceMeasure

    Cheers,
    Simon
    User: "gfyang"
    New Altair Community Member
    OP
    Hi,

    Thanks a lot for the reply. However, it is still not clear to me. Could you please give some Java code?
    I tried the following, but it failed:

    ExampleSet ex=...
    Example ex1 = ex.getExample(1);
    Example ex2 = ex.getExample(2);
    DistanceMeasure myDis = new DistanceMeasure();
    double dis = myDis.calculateDistance(ex1, ex2);
    It reported that DistanceMeasure could not be instantiated.

    Thank you.

    Sincerely yours,
    gfyang
    User: "fischer"
    New Altair Community Member
    Hi,

    DistanceMeasure is abstract. You can only instantiate its subclasses.

    Also, if you are using a distance measure inside an operator, try using a DistanceMeasureHelper.

    Cheers,
    Simon
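
For readers following along, here is a minimal plain-Java sketch of the point made in the reply above: an abstract measure cannot be created with `new`, but a concrete subclass can. It does not use RapidMiner's classes. `VectorMeasure`, `CosineMeasure`, and `CosineDemo` are hypothetical names invented for illustration, and the choice to define distance as 1 minus cosine similarity is an assumption of this sketch. Inside RapidMiner you would instantiate one of the library's own DistanceMeasure subclasses instead.

```java
// Hypothetical stand-in for an abstract distance API; NOT RapidMiner's DistanceMeasure.
abstract class VectorMeasure {
    // Each concrete subclass supplies its own similarity formula.
    abstract double calculateSimilarity(double[] a, double[] b);

    // Assumption of this sketch: distance is defined as 1 - similarity.
    double calculateDistance(double[] a, double[] b) {
        return 1.0 - calculateSimilarity(a, b);
    }
}

// Concrete subclass: cosine similarity between two term vectors.
class CosineMeasure extends VectorMeasure {
    @Override
    double calculateSimilarity(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // Convention chosen here: an all-zero vector has similarity 0 to anything.
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}

class CosineDemo {
    public static void main(String[] args) {
        // VectorMeasure m = new VectorMeasure(); // would not compile: the class is abstract
        VectorMeasure measure = new CosineMeasure(); // instantiate the subclass instead

        double[] d1 = {1, 2, 0}; // made-up term weights for document d1
        double[] d2 = {2, 4, 0}; // made-up term weights for document d2

        System.out.println("similarity = " + measure.calculateSimilarity(d1, d2));
        System.out.println("distance   = " + measure.calculateDistance(d1, d2));
    }
}
```

Because d2 is a scaled copy of d1, the two vectors point in the same direction, so the similarity comes out at approximately 1 and the distance at approximately 0.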
    User: "gfyang"
    New Altair Community Member
    OP
    Hi,

    I see. The subclasses work well. Thank you.

    Sincerely yours,
    gfyang