Hi! Can I implement a focused web crawler in RapidMiner using its operators?

AnumBse16 New Altair Community Member
edited November 2024 in Community Q&A
I'm a newbie. I looked into some operators like Crawl Web and TF-IDF. I need to implement a focused web crawler for my project.

Best Answer

  • SGolbert
    SGolbert New Altair Community Member
    Answer ✓


    With the Crawl Web operator of the Web Mining extension you can set up crawling rules as regular expressions. Note, however, that the web mining tools in RapidMiner are fairly basic and limited to analyzing HTML code. You will need an external tool for complex web crawling (JavaScript, PHP, authentication, lots of pages, etc.).

    Regards,
    Sebastian
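
For anyone who ends up needing an external tool, here is a rough sketch of the same idea in plain Python: a regular expression decides which links may be followed (similar in spirit to the crawling rules mentioned above), and a simple keyword score decides whether a page is on topic. This is not how the Crawl Web operator works internally. The names `focused_crawl`, `keyword_score`, `follow_pattern` and `min_score` are made up for this illustration, and the scoring is a plain keyword frequency, not TF-IDF.

```python
import re
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkAndTextParser(HTMLParser):
    """Collects href targets and text nodes from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Simplification: this also picks up script/style text.
        self.text_parts.append(data)


def fetch_html(url):
    """Default fetcher: plain HTTP GET, no JavaScript, no login."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def keyword_score(text, keywords):
    """Fraction of words in the text that are focus keywords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    return sum(1 for w in words if w in keywords) / len(words)


def focused_crawl(seed, follow_pattern, keywords,
                  fetch=fetch_html, min_score=0.01, max_pages=50):
    """Breadth-first crawl that only expands pages scoring >= min_score."""
    follow = re.compile(follow_pattern)
    keywords = {k.lower() for k in keywords}
    queue = deque([seed])
    seen = {seed}
    relevant = {}
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except (OSError, ValueError):
            continue  # unreachable or malformed URL: skip it
        fetched += 1
        parser = LinkAndTextParser()
        parser.feed(html)
        score = keyword_score(" ".join(parser.text_parts), keywords)
        if score < min_score:
            continue  # off-topic page: do not store it or follow its links
        relevant[url] = score
        for href in parser.links:
            link, _ = urldefrag(urljoin(url, href))
            if link not in seen and follow.search(link):
                seen.add(link)
                queue.append(link)
    return relevant
```

A call might look like `focused_crawl("https://example.com/", r"^https://example\.com/", ["mining", "crawler"])`, which returns a dict mapping each on-topic URL to its score. The `fetch` parameter is there so the fetcher can be swapped out, for example for one that handles logins or rate limiting.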

Answers

  • Telcontar120
    Telcontar120 New Altair Community Member
    You should definitely download the web mining and text mining extensions, which are free. There are many operators that will support web mining and text processing. You can do almost anything you want in RapidMiner in this area if you are willing to put in some time to learn the platform. There are excellent free video tutorials available and good in-program documentation with sample processes as well.
  • AnumBse16
    AnumBse16 New Altair Community Member
    @Telcontar120 Can I get some guidance? I'm confused about which data mining operators to use to build a keyword-focused web crawler.