About text mining

morphism
morphism New Altair Community Member
edited November 5 in Community Q&A

Hello, how are you?


I am interested in text mining with RapidMiner.


Is there any way to apply "Nonnegative Matrix Factorization", "Probabilistic Latent Semantic Analysis", or a "Nonlinear Transformation" to a document-term matrix?


I want to do classification, clustering, summarization, information retrieval, etc. on text data.


Thank you in advance and have a nice day.

Answers

  • Telcontar120
    Telcontar120 New Altair Community Member
    RapidMiner has no native operators for these types of text mining functions. However, you could potentially accomplish them using the relevant R or Python packages through the scripting operators. Having said that, these techniques are also somewhat more advanced, or even esoteric, approaches to text mining. Have you tried the more straightforward bag-of-words approach using standard word vector creation (TF-IDF or similar) yet? You might want to start with that and see what kind of results you get before moving on to the more complex approaches.
  • morphism
    morphism New Altair Community Member

    Hello, Telcontar120.

    Thank you for your explanation.

    I am a beginner in text mining with RapidMiner.

    I read in books that SVD or Nonnegative Matrix Factorization can be applied before clustering and similar tasks,

    and I guessed that RapidMiner has no operators for them.

    I wanted to know the full range of text mining functions RapidMiner offers,

    and I wanted to use "Nonnegative Matrix Factorization".


    Then I found that RapidMiner has a "Singular Value Decomposition (SVD)" operator.


    So could you please explain how SVD can be applied to text mining tasks

    such as clustering and classification?
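
One common way SVD is used on text is latent semantic analysis (LSA): the TF-IDF document-term matrix is reduced to a few dense components, and clustering or classification is then run on those components instead of on the raw word vectors. The sketch below illustrates the idea in Python with scikit-learn rather than with RapidMiner's own SVD operator; the toy corpus, the number of components, and the number of clusters are assumptions chosen only to keep the example small.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import Normalizer

# Hypothetical toy corpus with two rough themes (pets, finance).
docs = [
    "cats and dogs are popular pets",
    "my cats chase the dogs",
    "stock markets fell sharply today",
    "investors sold stock as markets dropped",
]

# Step 1: TF-IDF document-term matrix (sparse, one column per term).
dtm = TfidfVectorizer(stop_words="english").fit_transform(docs)

# Step 2: truncated SVD reduces each document to 2 dense components;
# the rows are then rescaled to unit length so that Euclidean distance
# in k-means behaves like cosine distance.
lsa = make_pipeline(
    TruncatedSVD(n_components=2, random_state=0),
    Normalizer(copy=False),
)
reduced = lsa.fit_transform(dtm)

# Step 3: cluster the documents in the reduced space.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(reduced)
print(labels)
```

For classification the same `reduced` matrix would be passed to a classifier together with the known labels, instead of to k-means.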