Collecting geo-tagged tweets using Python by Edward Briggler

Twitter, like other social media sites, offers ways of collecting geospatial data. The volume of available data is enormous, and applying spatial bounds to an area of interest (AOI) provides a means of homing in on a specific geographic location. I use pycurl, tweetstream, and MySQLdb to create my own Python class for collecting tweets and storing them in a MySQL database, either to display on a map or for further analysis. This talk will focus on the Twitter Streaming API, Python, database usage, and some of the roadblocks of using free data. Tapping into social media data can provide insight into an AOI and supports analysis of vast amounts of data from users around the world.

More Decks by Arkansas GIS Users Forum Conference

Transcript

  1. Overview
     • Collecting Social Media Data
     • Twitter Streaming API
     • Python
     • Necessary Libraries
     • Database, Web Server
     • Python Windows Service
     • Displaying Tweets
  2. Collecting Social Media Data
     • Facebook, Foursquare, Twitter
       – Majority have an API (Application Programming Interface)
       – Gain access to publicly shared data (status, check-ins, etc.)
       – Interested in geo-located user data
       – Geo-located data includes latitude, longitude coordinates
     • Could group by user’s set location too (Conway, Arkansas, etc.)
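The two kinds of location mentioned on this slide can be sketched as follows. This is a minimal illustration, assuming tweets arrive as dictionaries decoded from Twitter's JSON, where a geo-tagged tweet carries a GeoJSON `coordinates` point in longitude, latitude order and the user's self-reported location sits under `user.location`. The function names `tweet_position` and `user_set_location` are hypothetical and do not come from the talk.

```python
def tweet_position(tweet):
    """Return (lat, lon) for a geo-tagged tweet, or None if it has no point."""
    point = tweet.get("coordinates")
    if not point or point.get("type") != "Point":
        return None
    # GeoJSON stores points as [longitude, latitude], so swap the order.
    lon, lat = point["coordinates"]
    return (lat, lon)


def user_set_location(tweet):
    """Return the free-text location from the user's profile, or None."""
    location = (tweet.get("user") or {}).get("location")
    if location and location.strip():
        return location.strip()
    return None


# Illustrative tweet; the values are made up for this example.
sample = {
    "text": "Hello from the conference",
    "coordinates": {"type": "Point", "coordinates": [-92.44, 35.09]},
    "user": {"location": " Conway, Arkansas "},
}
print(tweet_position(sample))
print(user_set_location(sample))
```

Tweets without a point return `None` from `tweet_position`, so a collector can fall back to grouping by the profile location instead.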
  3. Twitter Streaming API https://dev.twitter.com/docs/streaming-apis
     • 3 types of streams (user, site, public)
     • Public Stream
       ◦ streams of the public data flowing through Twitter; this is what we will use to access tweets
     • Will need a Twitter account
     • Sign up as a developer
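To restrict the public stream to an area of interest, the Streaming API's filter accepted a `locations` parameter: a comma-separated bounding box given as the southwest corner (longitude, latitude) followed by the northeast corner. The sketch below builds that string and checks whether a point falls inside the box. The `BoundingBox` class is hypothetical, and the Arkansas bounds are rough values chosen only for illustration.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """An area of interest described by its west/south/east/north edges."""
    west: float
    south: float
    east: float
    north: float

    def to_locations_param(self) -> str:
        # Southwest corner first, then northeast, each as longitude,latitude.
        return ",".join(str(v) for v in (self.west, self.south, self.east, self.north))

    def contains(self, lat: float, lon: float) -> bool:
        # Simple check that ignores boxes crossing the antimeridian.
        return self.south <= lat <= self.north and self.west <= lon <= self.east


# Approximate bounds for Arkansas (an assumption, not from the talk).
ARKANSAS = BoundingBox(west=-94.62, south=33.0, east=-89.64, north=36.5)

print(ARKANSAS.to_locations_param())
print(ARKANSAS.contains(35.09, -92.44))
```

A client-side `contains` check is still useful because a stream filtered by bounding box may also return tweets whose tagged place merely overlaps the box.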
  4. Python
     • Using Python to access this stream and work with the data
     • Python 2.7.3 32-Bit version
     • Environment Variables and PYTHONPATH
  5. Necessary Libraries
     • MySQLdb, pycurl, pywin32, tweepy
       ◦ MySQLdb for database communication
       ◦ pycurl, tweepy for the Twitter stream
       ◦ pywin32 for setting up as a Windows service
     • http://www.lfd.uci.edu/~gohlke/pythonlibs/
       ◦ has tons of libraries
     • Utilities that are useful
       ◦ easy_install, pip
  6. Database and Web Server
     • Using XAMPP installation
       ◦ packages up MySQL, Apache, PHP, Tomcat, Perl together
       ◦ http://www.apachefriends.org/en/xampp-windows.html
     • Other options…
       ◦ SQL Server, Oracle, Postgres backend
       ◦ IIS or other web servers
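Storing collected tweets can be sketched with Python's DB-API. The example below uses the standard-library `sqlite3` module as a stand-in for MySQLdb so that it runs without a MySQL server; it is not the schema used in the talk. The table layout and the function names are assumptions. With MySQLdb the same pattern applies, but placeholders are written `%s` instead of `?`, and the duplicate-skipping insert is spelled `INSERT IGNORE`.

```python
import sqlite3

CREATE_SQL = """
CREATE TABLE IF NOT EXISTS tweets (
    tweet_id  INTEGER PRIMARY KEY,
    user_name TEXT NOT NULL,
    body      TEXT NOT NULL,
    lat       REAL NOT NULL,
    lon       REAL NOT NULL
)
"""

# OR IGNORE skips rows whose tweet_id is already stored.
INSERT_SQL = (
    "INSERT OR IGNORE INTO tweets (tweet_id, user_name, body, lat, lon) "
    "VALUES (?, ?, ?, ?, ?)"
)

BBOX_SQL = (
    "SELECT tweet_id, user_name, body, lat, lon FROM tweets "
    "WHERE lat BETWEEN ? AND ? AND lon BETWEEN ? AND ? ORDER BY tweet_id"
)


def open_store(path=":memory:"):
    """Open the database and make sure the tweets table exists."""
    conn = sqlite3.connect(path)
    conn.execute(CREATE_SQL)
    return conn


def save_tweet(conn, tweet_id, user_name, body, lat, lon):
    """Insert one tweet; parameters are bound, never formatted into the SQL."""
    conn.execute(INSERT_SQL, (tweet_id, user_name, body, lat, lon))
    conn.commit()


def tweets_in_bbox(conn, south, west, north, east):
    """Return stored tweets whose point lies inside the bounding box."""
    return conn.execute(BBOX_SQL, (south, north, west, east)).fetchall()
```

Binding values as parameters matters here because tweet text is untrusted input and frequently contains quotes.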
  7. Python Windows Service
     • Allows for collecting tweets in the background
     • Easy start and stop of service
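The start/stop behavior of a background collector can be sketched in a platform-independent way with a worker thread and a stop event. This is not the service code from the talk. Under pywin32, a class derived from `win32serviceutil.ServiceFramework` would typically call something like `start()` from `SvcDoRun` and `stop()` from `SvcStop`. The `TweetCollector` class and the `fetch_batch` and `handle_tweet` callables are hypothetical stand-ins for reading from the stream and saving to the database.

```python
import threading


class TweetCollector:
    """Runs a collection loop in the background until asked to stop."""

    def __init__(self, fetch_batch, handle_tweet, poll_seconds=1.0):
        self._fetch_batch = fetch_batch      # callable returning a list of tweets
        self._handle_tweet = handle_tweet    # callable invoked once per tweet
        self._poll_seconds = poll_seconds
        self._stop_event = threading.Event()
        self._thread = None

    def start(self):
        """Begin collecting on a daemon thread."""
        self._stop_event.clear()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def stop(self, timeout=5.0):
        """Signal the loop to exit and wait for the thread to finish."""
        self._stop_event.set()
        if self._thread is not None:
            self._thread.join(timeout)

    def is_running(self):
        return self._thread is not None and self._thread.is_alive()

    def _run(self):
        while not self._stop_event.is_set():
            for tweet in self._fetch_batch():
                self._handle_tweet(tweet)
            # wait() returns early when stop() sets the event.
            self._stop_event.wait(self._poll_seconds)
```

Waiting on the event rather than sleeping lets a stop request take effect immediately instead of after the next poll interval.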
  8. Displaying the Tweets
     • Using Google Maps API
     • PHP to dynamically create XML of requested tweets and to populate map
     • Use setInterval to refresh map to pull new tweets to the map
     • http://localhost:8090/presentation
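The talk generates the marker XML with PHP; purely as an illustration in Python, the same idea can be sketched with the standard-library `xml.etree.ElementTree` module. The `markers`/`marker` element names, the attribute names, and the `tweets_to_marker_xml` function are assumptions for this sketch, not the format used in the presentation.

```python
import xml.etree.ElementTree as ET


def tweets_to_marker_xml(tweets):
    """Serialize tweets as <markers><marker .../></markers> for a map page."""
    root = ET.Element("markers")
    for tweet in tweets:
        ET.SubElement(
            root,
            "marker",
            {
                "user": tweet["user"],
                "text": tweet["text"],
                # Attribute values must be strings; ElementTree escapes them.
                "lat": str(tweet["lat"]),
                "lng": str(tweet["lon"]),
            },
        )
    return ET.tostring(root, encoding="unicode")


# Illustrative rows, e.g. as they might come back from the database.
rows = [
    {"user": "alice", "text": 'Hello & "welcome" <Conway>', "lat": 35.09, "lon": -92.44},
    {"user": "bob", "text": "Second tweet", "lat": 34.75, "lon": -92.29},
]
print(tweets_to_marker_xml(rows))
```

Building the document through an XML library rather than string concatenation keeps tweet text containing `&`, `<`, or quotes from breaking the response the map page parses.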