Clustering in Text Mining

cupboard New Altair Community Member
edited November 5 in Community Q&A
Hi,
I've been using the text processing package for RapidMiner and am currently trying to do clustering and association rules with text documents. I followed all of the steps in this Vancouver Data help video (http://vancouverdata.blogspot.com/2010/11/text-analytics-with-rapidminer-part-3.html) and built the exact same process, but have not been able to generate results. When I run the process, it runs for around 20-30 minutes (as opposed to a few seconds in the video) before telling me that I have run out of memory. I'm not dealing with large documents, only two small text files. I allocated 4GB to the program, so memory shouldn't be an issue, but I keep getting this error message. A similar thing happens with any other clustering process I try.
Does anyone have any advice as to how to solve the problem? 
Thanks!

Answers

  • MariusHelf New Altair Community Member
    How many documents are you processing? Are you sure the 4GB are actually available to RapidMiner? How did you allocate them, how do you start RapidMiner, and which operating system are you using?

    Best regards,
    Marius
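
One way to approach the question of whether the 4GB are really available: RapidMiner runs on a Java virtual machine, so the heap limit that JVM actually received is what matters, not the amount that was requested. The sketch below is only an illustration and not part of RapidMiner. The class name `HeapCheck` and the helper `maxHeapMb` are made up for this example. It reports the limit of the JVM it runs in, so it would have to be started with the same Java installation and the same `-Xmx` setting as RapidMiner to say anything useful.

```java
// Hypothetical helper, not part of RapidMiner: prints the maximum heap
// size that the current JVM will attempt to use.
class HeapCheck {

    // Maximum heap in megabytes, as reported by the running JVM.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMb() + " MB");
    }
}
```

Running it as `java -Xmx4g HeapCheck` should report a value close to 4096 MB if the setting is honoured. A much smaller number, or a JVM that refuses to start with that flag (as a 32-bit JVM would), suggests the allocation never took effect.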