Mining websites - how do I get started?

Basswanker
Basswanker New Altair Community Member
edited November 5 in Community Q&A
Hi all - I am a newbie to this, but I hope you can clarify a few things for me and help me get started. Any help is appreciated. I am looking for a solution that automatically crawls a forum for specific words and then logs the associated data. For example, I would like to harvest all topics or comments in a forum that mention the word "windows" in connection with "blue screen", and then automatically log the URLs, the user names, the date of entry, etc.

Would RapidMiner be the right tool for this?

If yes, how do I get started - if no, what would you recommend instead?

Hope to hear from you.

Best,
Allan.

Answers

  • MariusHelf
    MariusHelf New Altair Community Member
    Hi Allan,

    You can definitely do the job with RapidMiner. Have you seen the video tutorials on http://rapid-i.com yet? They are a good starting point for getting to grips with RapidMiner in general.
    For your specific task you need the Text Mining and Web Mining extensions. To crawl the web, use the Crawl Web operator, then apply a Process Documents operator to the crawled pages.

    Best,
    Marius
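
To make the goal concrete, here is a minimal Python sketch of the filter-and-log step that would follow the crawl. It is not a RapidMiner process and does not crawl anything: the `posts` list, its field names (`url`, `user`, `date`, `text`), and the example.com URLs are hypothetical stand-ins for pages that a crawler has already fetched and parsed. It only illustrates keeping the posts that mention every search term and logging their URL, user name, and date.

```python
import csv
import io


def find_matching_posts(posts, required_terms):
    """Return the posts whose text contains every required term (case-insensitive)."""
    terms = [t.lower() for t in required_terms]
    return [p for p in posts if all(t in p["text"].lower() for t in terms)]


def write_log(matches, out):
    """Write the URL, user name, and date of each matching post as CSV rows."""
    writer = csv.writer(out)
    writer.writerow(["url", "user", "date"])
    for p in matches:
        writer.writerow([p["url"], p["user"], p["date"]])


# Hypothetical posts standing in for already crawled and parsed forum pages.
posts = [
    {"url": "http://example.com/t/1", "user": "alice", "date": "2012-11-05",
     "text": "Windows shows a blue screen after the update."},
    {"url": "http://example.com/t/2", "user": "bob", "date": "2012-11-06",
     "text": "My Windows laptop is slow."},
]

matches = find_matching_posts(posts, ["windows", "blue screen"])

# Log to an in-memory buffer here; a real run would open a file instead.
buffer = io.StringIO()
write_log(matches, buffer)
print(buffer.getvalue())
```

Plain substring matching is the simplest possible choice and will also match words such as "windowsill"; tokenizing the text first, as a Process Documents step would, avoids that.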