Open States TCamp11 Presentation

jamesturk
September 26, 2011

slides on OpenStates from TCamp 2011

Transcript

  1. The "govtrack" role
     • Collect
     • Augment
     • Publish
     • Also working with Participatory Politics Foundation to make state-level OpenCongress-like site
  2. Brute Force Solution
     • Have to scrape in most states
     • Large volunteer effort
       ◦ 3,500 commits
       ◦ 35 contributors
       ◦ 20,000 LoC
         ▪ (Python)
  3. Technical Approach: Scraping
     • main tools: python, lxml
     • derived libraries:
       ◦ scrapelib - intelligently fetch webpages
       ◦ validictory - validation of arbitrary data structures
       ◦ jellyfish - string metrics (fuzzy matching)
     • we provide an API for scrapers so they can write code like
         bill = Bill('upper', 'HB 1', '2010 Budgetary Proposal')
         bill.add_action('lower', 'Introduced', '2010-01-20')
     • when run locally, scrapers write JSON to disk; when we run them, the output can be imported into MongoDB
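
The scraper pattern on slide 3 (build a bill object, attach actions, write JSON to disk) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `Bill` class here is a hypothetical stand-in, the meaning of the first constructor argument (chamber) is assumed from the slide's example, and `save_bill` with its file-naming scheme is invented for the sketch.

```python
import json
import os
import tempfile


class Bill(dict):
    """Hypothetical stand-in for the scraper API's Bill object."""

    def __init__(self, chamber, bill_id, title):
        # ASSUMPTION: the first argument is the chamber, as the slide's
        # Bill('upper', 'HB 1', ...) call suggests.
        super().__init__(chamber=chamber, bill_id=bill_id, title=title, actions=[])

    def add_action(self, actor, action, date):
        # Record one legislative action in the order it was scraped.
        self["actions"].append({"actor": actor, "action": action, "date": date})


def save_bill(bill, output_dir):
    """Write the bill as JSON; the file name scheme is illustrative only."""
    filename = bill["bill_id"].replace(" ", "_") + ".json"
    path = os.path.join(output_dir, filename)
    with open(path, "w") as f:
        json.dump(bill, f, indent=2)
    return path


# Usage mirroring the calls shown on the slide.
bill = Bill('upper', 'HB 1', '2010 Budgetary Proposal')
bill.add_action('lower', 'Introduced', '2010-01-20')
out_dir = tempfile.mkdtemp()
path = save_bill(bill, out_dir)
```

Keeping the scraper output as plain JSON files means a contributor can run and inspect a scraper locally without a database, while the same files can later be loaded into a store such as MongoDB.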
  4. Technical Approach: Backend
     • all scrapers are run via a Hudson instance
       ◦ can use EC2 instances on-demand
     • data goes into MongoDB
     • exposed via simple API and we push dumps to S3
  5. Project Status
     • ~6 months ago public beta of our API
       ◦ http://openstates.sunlightlabs.com/api/
     • 13 states "ready" and 11 "experimental"
       ◦ AK, CA, MD, MN, NJ, NC, OH, PA, TX, UT, VT, WI
       ◦ AZ, CT, FL, IN, MI, MS, NV, SD, VA, WA, DC
     • Lots of data
       ◦ 4,012 legislators
       ◦ 123,239 bills
       ◦ 1,001,473 actions
       ◦ 96,383 votes
     • All 50 by early 2012
  6. Use the data
     • Get Sunlight API Key @ http://services.sunlightlabs.com/
     • API Docs: http://openstates.sunlightlabs.com/api/
     • RESTful JSON-based API
       ◦ state metadata
       ◦ bills, legislators, committees
         ▪ search by attribute or lookup by ID
       ◦ legislator lookup by lat+long
     • govkit ruby gem http://github.com/opengovernment/govkit
     • python-openstates http://github.com/sunlightlabs/python-openstates
         openstates.Legislator.search(state='ca', first_name='Mike')
         openstates.Bill.search('agriculture', state='vt')
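
For readers not using one of the client libraries, an attribute search like the legislator example on slide 6 boils down to a keyed HTTP GET. The sketch below only builds the request URL and makes no network call. The API root comes from the slide, but the `v1/<resource>/` path layout and the `apikey` parameter name are assumptions made for illustration, and `build_search_url` is a hypothetical helper, so check the API docs before relying on any of them.

```python
from urllib.parse import urlencode

API_ROOT = "http://openstates.sunlightlabs.com/api/"  # from the slide


def build_search_url(resource, api_key, **filters):
    """Return a search URL for a resource such as 'legislators' or 'bills'.

    ASSUMPTION: the 'v1/<resource>/' path and the 'apikey' query parameter
    are illustrative guesses, not confirmed by the slides.
    """
    # Sort the filters so the resulting URL is deterministic.
    params = dict(sorted(filters.items()))
    params["apikey"] = api_key
    return f"{API_ROOT}v1/{resource}/?{urlencode(params)}"


# Roughly the request behind Legislator.search(state='ca', first_name='Mike').
url = build_search_url("legislators", "YOUR_KEY", state="ca", first_name="Mike")
```

The response would be JSON, so any HTTP client plus a JSON parser is enough to consume it; the govkit and python-openstates libraries wrap this step.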
  7. Adopt a State
     • If you have python/scraping skills or want a good project to learn on
       ◦ Contributor Guide http://goo.gl/CceA
       ◦ Join the Google Group: http://goo.gl/M3An
       ◦ http://github.com/sunlightlabs/openstates