Comments on: How to scrape and parse Wikipedia
https://blog.scraperwiki.com/2011/12/how-to-scrape-and-parse-wikipedia/

By: Eddie
https://blog.scraperwiki.com/2011/12/how-to-scrape-and-parse-wikipedia/#comment-733
Tue, 27 Nov 2012 15:53:38 +0000

Your wikiscrape script just saved me. I needed to get a list of regions from counties in the UK, and other databases I’d found online kept letting me down. I set up a small script in Python with Flask and ran the results through Google Refine. Very helpful!
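Roughly the sort of glue script Eddie describes, as a minimal sketch rather than his actual code: the /regions route, the page title, and the JSON shape are assumptions for illustration. It fetches the raw wikitext of a Wikipedia article through the MediaWiki API with requests and serves it via Flask, so a tool such as Google Refine can pull it in over HTTP.

    import requests
    from flask import Flask, jsonify

    app = Flask(__name__)
    WIKIPEDIA_API = "https://en.wikipedia.org/w/api.php"

    @app.route("/regions")
    def regions():
        # Ask the MediaWiki API for the raw wikitext of the article
        # (the page title here is an assumed example).
        params = {
            "action": "query",
            "prop": "revisions",
            "rvprop": "content",
            "rvslots": "main",
            "format": "json",
            "titles": "Regions of England",
        }
        data = requests.get(WIKIPEDIA_API, params=params).json()
        page = next(iter(data["query"]["pages"].values()))
        wikitext = page["revisions"][0]["slots"]["main"]["*"]
        # Return the raw wikitext; turning it into county -> region rows is
        # left to the caller (e.g. Google Refine or a parsing step).
        return jsonify({"title": page["title"], "wikitext": wikitext})

    if __name__ == "__main__":
        app.run(port=5000)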

By: The Data Hob | ScraperWiki Data Blog
https://blog.scraperwiki.com/2011/12/how-to-scrape-and-parse-wikipedia/#comment-732
Tue, 06 Mar 2012 04:00:21 +0000

[…] I would have loved to have derived it from the editable source of the Wikipedia article, as I described elsewhere, but it is impossible to do because it is insanely […]

By: Chris Davis
https://blog.scraperwiki.com/2011/12/how-to-scrape-and-parse-wikipedia/#comment-731
Fri, 27 Jan 2012 08:32:30 +0000

Waiting six months for the latest DBpedia update isn’t a concern any more. They now have a version that is synchronized live with Wikipedia – see http://live.dbpedia.org/.

I agree with the criticism that semantic web applications sometimes assume far too clear-cut boundaries between entities. However, this isn’t an inherent problem with DBpedia, since they have essentially outsourced the task of defining entities to the Wikipedia community. The only way these caves would show up as connected in one system is if people on Wikipedia said that they were.

In practical terms, I think that ScraperWiki can still be an awesome tool for scraping Wikipedia since the DBpedia parser does sometimes have problems parsing certain fields, and I don’t think they have very good support yet for parsing tables.

By: Julian
https://blog.scraperwiki.com/2011/12/how-to-scrape-and-parse-wikipedia/#comment-730
Thu, 08 Dec 2011 17:30:22 +0000

Where is this file k2.py of yours?

scraperwiki.swimport() is a function in the library as described at the bottom of this page:
https://scraperwiki.com/docs/python/python_help_documentation/
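For context, a minimal sketch of what the import step in k2.py presumably looks like, based on the traceback below. One likely cause of the AttributeError is running the script outside scraperwiki.com with a scraperwiki module that lacks swimport; the hasattr guard and the error message here are illustrative additions, not part of the original example.

    import scraperwiki

    if hasattr(scraperwiki, "swimport"):
        # Pull in the shared "wikipedia_utils" scraper as an importable module.
        wikipedia_utils = scraperwiki.swimport("wikipedia_utils")
    else:
        raise RuntimeError(
            "scraperwiki.swimport is only available in the ScraperWiki library "
            "described above; run the scraper on scraperwiki.com or copy the "
            "wikipedia_utils code into your local project."
        )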

By: Alice
https://blog.scraperwiki.com/2011/12/how-to-scrape-and-parse-wikipedia/#comment-729
Thu, 08 Dec 2011 16:47:25 +0000

Great article, but I’m running into a problem. Your first example works fine; when I try the second one, though, it fails. Here is the traceback. Thank you.
  File "k2.py", line 2, in <module>
    wikipedia_utils = scraperwiki.swimport("wikipedia_utils")
AttributeError: 'module' object has no attribute 'swimport'
