[SOLVED] Crawl Web not producing any results!

stringer_bell
stringer_bell New Altair Community Member
edited November 2024 in Community Q&A
I am trying to crawl and save every box score linked from http://www.pro-football-reference.com/years/2007/games.htm

The process below produces no results: it starts and finishes in 0 s.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<process version="5.2.008">
 <context>
   <input/>
   <output/>
   <macros/>
 </context>
 <operator activated="true" class="process" compatibility="5.2.008" expanded="true" name="Process">
   <process expanded="true" height="190" width="279">
     <operator activated="true" class="web:crawl_web" compatibility="5.2.003" expanded="true" height="60" name="Crawl Web" width="90" x="179" y="75">
       <parameter key="url" value="http://www.pro-football-reference.com/years/2007/games.htm"/>
       <list key="crawling_rules">
         <parameter key="follow_link_with_matching_url" value=".*boxscores/2007.*"/>
         <parameter key="store_with_matching_url" value=".*boxscores/2007.*"/>
       </list>
       <parameter key="output_dir" value="C:\Users\Stringer Bell\Desktop\scrape"/>
       <parameter key="extension" value="html"/>
       <parameter key="max_depth" value="3"/>
       <parameter key="obey_robot_exclusion" value="false"/>
       <parameter key="really_ignore_exclusion" value="true"/>
     </operator>
     <connect from_op="Crawl Web" from_port="Example Set" to_port="result 1"/>
     <portSpacing port="source_input 1" spacing="0"/>
     <portSpacing port="sink_result 1" spacing="0"/>
     <portSpacing port="sink_result 2" spacing="0"/>
   </process>
 </operator>
</process>

If anyone can help it would be appreciated. I have spent hours on this and cannot figure it out.

Answers

  • MariusHelf
    MariusHelf New Altair Community Member
    Hi, you have to increase the max_page_size parameter of the Crawl Web operator.

    Best, Marius
  • stringer_bell
    stringer_bell New Altair Community Member
    Thank you Marius!  :)
  • Soumitra
    Soumitra New Altair Community Member

    Hi Marius, I am also facing the same issue. I tried running the process shared by stringer_bell, but no luck.
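
For anyone trying the suggestion above, here is a minimal sketch of where the setting would go, assuming it is added as a max_page_size parameter on the Crawl Web operator from the original process. The value 10000 is an arbitrary placeholder and does not come from this thread; the unit and default should be checked in the operator's help. Everything except the max_page_size line is copied from the posted process.

```xml
<operator activated="true" class="web:crawl_web" compatibility="5.2.003" expanded="true" height="60" name="Crawl Web" width="90" x="179" y="75">
  <parameter key="url" value="http://www.pro-football-reference.com/years/2007/games.htm"/>
  <list key="crawling_rules">
    <parameter key="follow_link_with_matching_url" value=".*boxscores/2007.*"/>
    <parameter key="store_with_matching_url" value=".*boxscores/2007.*"/>
  </list>
  <parameter key="output_dir" value="C:\Users\Stringer Bell\Desktop\scrape"/>
  <parameter key="extension" value="html"/>
  <parameter key="max_depth" value="3"/>
  <!-- Assumption: placeholder value; raise it until the start page is no longer skipped -->
  <parameter key="max_page_size" value="10000"/>
  <parameter key="obey_robot_exclusion" value="false"/>
  <parameter key="really_ignore_exclusion" value="true"/>
</operator>
```

If the crawl still finishes immediately with this change, the page size limit may not be the only cause, so it is worth checking the log for skipped or rejected URLs.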
