"Web Crawl wikipedia"
maccten
New Altair Community Member
Hi All
I hope you can help me.
I have a Wikipedia page that I would like to crawl: https://en.wikipedia.org/wiki/2012_in_film
I want to extract the details that determined whether each of the films was a success or a bust, i.e. text mine the wiki page of each film.
Based on this information, I would like to predict whether the movies due to be released in 2013 are going to be a success.
My problem is that I'm unable to crawl this top-level page and the pages it links to (i.e. each movie): the crawl returns no records or files.
I can web crawl wikipedia.org without any issues.
Does anyone know what the problem is?
Thanks for your time
Answers
Hi,
As (almost) always, we can't help you if we don't know how you configured your operators. Please post your process XML as described in my signature.
Best regards,
Marius
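
For reference while checking the crawl configuration, here is a minimal Python sketch of the link-following step described in the question. It is not the RapidMiner operator and it does not diagnose the problem; it only illustrates which URLs a "follow links to each film page" rule would need to match. Wikipedia article links appear in the page HTML as relative paths such as `/wiki/Skyfall`, so they have to be resolved against the start URL before any pattern is applied. The names `WikiLinkParser`, `article_links` and `SAMPLE_HTML` are hypothetical, and the sample HTML is made up for illustration rather than copied from the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

START_URL = "https://en.wikipedia.org/wiki/2012_in_film"


class WikiLinkParser(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def article_links(html, base_url=START_URL):
    """Return absolute URLs of article pages linked from the given HTML."""
    parser = WikiLinkParser()
    parser.feed(html)
    base_host = urlparse(base_url).netloc
    links = []
    for href in parser.hrefs:
        # Resolve relative links such as /wiki/Skyfall against the start URL.
        parts = urlparse(urljoin(base_url, href))
        if parts.netloc != base_host or not parts.path.startswith("/wiki/"):
            continue  # external site or non-article path
        title = parts.path[len("/wiki/"):]
        if not title or ":" in title:
            continue  # skip namespaces such as File: or Special:
        # Drop fragments and query strings so each page is listed once.
        clean = parts._replace(query="", fragment="").geturl()
        if clean != base_url and clean not in links:
            links.append(clean)
    return links


# Made-up sample markup, assumed to resemble links on the start page.
SAMPLE_HTML = """
<a href="/wiki/The_Avengers_(2012_film)">The Avengers</a>
<a href="/wiki/File:Poster.jpg">poster</a>
<a href="#cite_note-1">[1]</a>
<a href="https://www.example.com/">external</a>
<a href="/wiki/Skyfall#Plot">Skyfall</a>
"""

if __name__ == "__main__":
    for url in article_links(SAMPLE_HTML):
        print(url)
```

If the URL patterns in the crawling rules do not match the resolved form of these links, a crawl could plausibly come back empty, but that is only one possibility; the posted process XML is still needed to tell.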