TFIDF Mean

jaskiemr
jaskiemr New Altair Community Member
edited November 5 in Community Q&A
I ran TF-IDF on four small text files:

1) alpha bravo
2) alpha bravo
3) alpha bravo charlie delta
4) alpha bravo charlie delta

How is the "statistic" field in the Meta Data View output calculated here? Is the mean shown there the result of the tf/idf measure, (f[ij] / f[dj]) * log(D / f)?

When I run it on "charlie" from above, RapidMiner gives 0.354. When I do the calculation by hand, 1/4 * log(4 / 2), I get 0.075. Is the value normalized somehow, or is the log a natural log or base 2 rather than base 10?

Thank you for any input.
        mj
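
For reference, the hand calculation in the question can be repeated with each candidate log base. This is only a sketch of the plain formula quoted above; the function name `tfidf` and its parameters are made up for illustration and are not RapidMiner's API. It shows that 0.075 corresponds to log base 10, and that switching the base alone does not produce 0.354.

```python
import math

def tfidf(term_count, doc_length, n_docs, docs_with_term, log=math.log10):
    """Plain TF-IDF: (term_count / doc_length) * log(n_docs / docs_with_term)."""
    return (term_count / doc_length) * log(n_docs / docs_with_term)

# "charlie": occurs once in a 4-word document and appears in 2 of the 4 documents.
by_base = {
    "log10": tfidf(1, 4, 4, 2, math.log10),
    "ln": tfidf(1, 4, 4, 2, math.log),
    "log2": tfidf(1, 4, 4, 2, math.log2),
}

for name, value in by_base.items():
    print(f"{name}: {value:.3f}")
# log10: 0.075
# ln: 0.173
# log2: 0.250
```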

Answers

  • land
    land New Altair Community Member
    Hi,
    as I already explained in another topic, the mean is simply the statistical mean of all values in this attribute. Please take a look in the other topic for more information.

    Greetings,
      Sebastian
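
One way to reproduce the reported 0.354 is sketched below. It relies on an assumption that is not confirmed in this thread: that each document's TF-IDF vector is scaled to unit Euclidean length before the per-attribute mean is taken over all four documents. The helper name `normalized_tfidf` is hypothetical. Under that assumption the log base cancels out, because "alpha" and "bravo" occur in every document and get weight 0, leaving "charlie" and "delta" with equal weights in documents 3 and 4.

```python
import math

docs = [
    "alpha bravo",
    "alpha bravo",
    "alpha bravo charlie delta",
    "alpha bravo charlie delta",
]

def normalized_tfidf(docs):
    """TF-IDF per document, each vector scaled to unit length (assumed behavior)."""
    tokenized = [d.split() for d in docs]
    n_docs = len(tokenized)
    vocab = sorted({t for doc in tokenized for t in doc})
    doc_freq = {t: sum(t in doc for doc in tokenized) for t in vocab}
    rows = []
    for doc in tokenized:
        raw = {
            t: (doc.count(t) / len(doc)) * math.log(n_docs / doc_freq[t])
            for t in vocab
        }
        norm = math.sqrt(sum(v * v for v in raw.values()))
        # An all-zero vector (documents 1 and 2) stays all zero.
        rows.append({t: (v / norm if norm else 0.0) for t, v in raw.items()})
    return rows

rows = normalized_tfidf(docs)
# Statistical mean of the "charlie" attribute over all four documents.
charlie_mean = sum(r["charlie"] for r in rows) / len(rows)
print(round(charlie_mean, 3))  # 0.354
```

The per-document values for "charlie" come out as 0, 0, 0.707 and 0.707, so the mean is (0.707 + 0.707) / 4 ≈ 0.354, which matches the statistic in the question.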