location – ScraperWiki
https://blog.scraperwiki.com
Extract tables from PDFs and scrape the web

Where do tweets come from?
https://blog.scraperwiki.com/2014/06/where-do-tweets-come-from/
Mon, 16 Jun 2014 08:40:51 +0000

Geography of Twitter @replies by Eric Fisher, reproduced under a Creative Commons Attribution 2.0 Generic license.

In our Twitter search tool, we provide the location of tweets via the latitude and longitude data Twitter offers. Unfortunately, if you want to know where a user was when they created a particular tweet, most Twitter users (including me) don’t enable this feature. What you usually find is a rare sighting of latitude and longitude among mostly empty columns.

However, you can often get a good idea of a user’s location from either the location they enter in their profile or their time zone. We already collect this information when you use the Twitter Friends tool, but not when searching for tweets. Now we’ve added it to our Twitter search tool too, so you can get an idea of where individual tweets were sent from.
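To make the fallback idea concrete, here is a minimal sketch in Python of how you might pick the most specific location hint for each row of a search dataset. The column names (`lat`, `lng`, `user_location`, `user_time_zone`) and the function name are assumptions for illustration, not necessarily what the tool outputs.

```python
from typing import Optional, Tuple


def best_location_hint(row: dict) -> Optional[Tuple[str, str]]:
    """Return (source, value) for the most specific location hint in a row.

    Column names here are hypothetical; adjust them to match your dataset.
    """
    lat, lng = row.get("lat"), row.get("lng")
    # Exact coordinates are the best evidence, but they are rarely present.
    if lat not in (None, "") and lng not in (None, ""):
        return ("coordinates", f"{lat},{lng}")
    # Otherwise fall back to the free-text profile location, then the time zone.
    for source in ("user_location", "user_time_zone"):
        value = (row.get(source) or "").strip()
        if value:
            return (source, value)
    return None


print(best_location_hint({"lat": "", "lng": "", "user_time_zone": "Amsterdam"}))
```

The ordering reflects how precise each field is likely to be: coordinates describe the tweet itself, while the profile location and time zone only describe the user.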

This snippet of a search shows you what we now get and highlights the clear difference between the lonely lat, lng columns and the much busier user location and time zone:

[Screenshot: Twitter search location data]

Create a new Twitter search dataset and you should see this extra data too!

What’s Twitter time zone data good for?
https://blog.scraperwiki.com/2014/06/whats-twitter-time-zone-data-good-for/
Thu, 05 Jun 2014 10:08:20 +0000

“Curioso elemento el tiempo” by leoplus, available under a Creative Commons Attribution-ShareAlike license.

The Twitter Friends tool has just been improved to retrieve users’ time zones. This is actually more useful than it might first sound.

If you’ve looked at Twitter profiles before, you’ve probably noticed that users can, and sometimes do, enter anything they like as their location.

Looking at @ScraperWiki’s followers, we can see from a small snippet of users that this can sometimes give us messy data:

...Denver. & Beyond
Hyper Island | Stockholm
London
Manchester
Niteroi, Brazil
Somerset
There's a wine blog too .....
London / Berkshire...

People may enter the same location in a number of ways, and may provide data that isn’t even a location.

Locations from time zones

Time zones are different: Twitter only allows users to pick from a fixed set of well-defined time zones. (There are 141 in total; I’ve collated the entire set here.) The data this returns is much neater, and we’d expect it to typically reflect the user’s home location:

...Abu Dhabi
Adelaide
Alaska
Almaty
America/Toronto
Amsterdam...

We find far fewer unique time zone entries than unique location entries for @ScraperWiki’s followers: there are 1586 different location entries, but just 106 time zones. If we wanted to discover which countries or regions our users are likely to be in, the time zone data would be far simpler to work with.
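If you want to run the same comparison on your own followers, a rough sketch like the following would do it. It assumes the dataset has been loaded as a list of dictionaries with `location` and `time_zone` keys (hypothetical names), and the sample rows are made up for illustration using values like those shown above.

```python
def count_distinct(rows, column):
    """Count the distinct non-empty values in one column of the dataset."""
    values = {(row.get(column) or "").strip() for row in rows}
    values.discard("")  # empty entries are not a location or a time zone
    return len(values)


# Made-up sample rows; a real dataset would come from the Twitter Friends tool.
followers = [
    {"location": "London", "time_zone": "London"},
    {"location": "London / Berkshire", "time_zone": "London"},
    {"location": "Manchester", "time_zone": "London"},
    {"location": "", "time_zone": "Amsterdam"},
]

print(count_distinct(followers, "location"))   # 3
print(count_distinct(followers, "time_zone"))  # 2
```

Even in this tiny sample, the free-text locations fragment into more distinct values than the time zones do.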

Furthermore, for Twitter users who don’t specify a location but have selected a time zone, the time zone data can give us some insight into where they are.

For ScraperWiki’s followers, we found that 670 had an empty location and around the same number had an empty time zone. But far fewer accounts (only 255) have both of these fields empty. So, in some cases, we can now make a good guess at the location of users we previously couldn’t place from the data the tool was providing.
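The counting behind those figures can be sketched as below. This is an illustration rather than the code we actually ran, and the `location` and `time_zone` keys are assumed names for the two columns.

```python
def is_blank(value):
    """Treat None, empty strings and whitespace-only strings as missing."""
    return value is None or not str(value).strip()


def summarise_gaps(rows):
    """Count rows missing a location, a time zone, or both."""
    no_location = sum(1 for r in rows if is_blank(r.get("location")))
    no_time_zone = sum(1 for r in rows if is_blank(r.get("time_zone")))
    neither = sum(
        1 for r in rows
        if is_blank(r.get("location")) and is_blank(r.get("time_zone"))
    )
    return {
        "no_location": no_location,
        "no_time_zone": no_time_zone,
        "neither": neither,
        # Users with no location whose time zone can still hint at where they are.
        "time_zone_only": no_location - neither,
    }


# Made-up sample rows for illustration.
sample = [
    {"location": "Somerset", "time_zone": "London"},
    {"location": "", "time_zone": "Amsterdam"},
    {"location": "Niteroi, Brazil", "time_zone": ""},
    {"location": "", "time_zone": ""},
]
print(summarise_gaps(sample))
```

Applied to the figures above, the last entry would be 670 - 255 = 415: followers with no location for whom the time zone is the only hint we have.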

We’re always working to improve the Twitter tools! If you have ideas for features you’d like to see, let us know!
