
City of #BigData


Presentation delivered multiple times describing City of Chicago innovation and technology programs.

Tom Schenk Jr

May 04, 2017



Transcript

  1. Source: techplan.cityofchicago.org IN CHICAGO, WE BELIEVE THAT THE POWER OF

    TECHNOLOGY IS DRIVEN BY THE PEOPLE WHO USE AND BENEFIT FROM IT.
  2. Potholes are reported by residents and city staff

    through the 311 system, and the data are then published on the City’s #opendata portal, updated daily. data.cityofchicago.org
  3. Chicago has released more #opendata, including important items such as

    crimes, permits, tickets, taxi trips, and most popular library items. data.cityofchicago.org/view/caas-knxs
  4. In 2012, Chicago issued an executive order which formalized the

    #opendata portal, granted powers to the Chief Data Officer, created an advisory committee to advise on the expansion of new datasets, and required an annual open data report. Executive Order 2012-02
  5. The City will continue to increase and improve the quality

    of City data available internally and externally, and facilitate methods for analyzing that data to help create a smarter and more efficient city. Increase & Improve City Data techplan.cityofchicago.org/initiatives-by-strategy/effective-government/initiative-14/
  6. #OPENDATA PROVIDES A MEANS TO CREATE AN ECOSYSTEM AROUND DATA,

    WHICH INCLUDES MULTIPLE STAKEHOLDERS AND INITIATIVES THAT EXTEND BEYOND TRANSPARENCY.
  7. NLC issued a report discussing the role of Chicago’s leadership

    in developing a leading #opendata portal. The first chapter reviews Chicago’s open data program and its benefits to the city, residents, and others. National League of Cities
  8. - National League of Cities, p. 22 “Open data initiatives

    are an increasingly popular component of governance. At the national level, Chicago’s #opendata initiative has been held up as a model for cities that are seeking to start their own open data programs.” Image © 2012 National League of Cities
  9. Civic Tech Community Chicago has a large, vibrant, productive, civic

    community built around #opendata. It is led by Chicago residents interested in technology and society who, along with non-profits, help Chicagoans. © Tom Schenk Jr, 2016. CC-BY
  10. Using #opendata, this service developed by the civic community alerts

    individuals to street sweeping activity by providing email, text, or calendar alerts. sweeparound.us
  11. Chicago Flu Shots uses #opendata to easily find flu-shot locations

    across Chicago. The code, created by a volunteer, is open source, so the site was adopted by Boston, Philadelphia, and San Francisco. chicagoflushots.org
  12. The City of Chicago has a number of high-quality research

    universities and groups willing to engage in projects with the city. We can leverage the #opendata portal and the data itself to create cooperative relationships. Academia
  13. Array of Things arrayofthings.github.io University of Chicago has partnered with

    multiple institutions to build a mesh network of small sensors, dubbed the Array of Things, that will frequently post data for public consumption. © 2015 University of Chicago
  14. The Array of Things will provide hyper-local, temporal data

    using a variety of sensors: ▪Sensors measuring sound and vibration ▪Low-resolution infrared cameras measuring sidewalk temperature ▪Climate and environmental data, such as air quality and temperature Array of Things © 2016 University of Chicago
  15. An #opensource platform which allows you to explore events such

    as 311 calls, crimes, permits, inspections, and DIVVY trips on an interactive map. This software can be used by the public, and an internal version drives situational awareness. OpenGrid.io
  16. UNDERGROUND INFRASTRUCTURE IS HIT ON AVERAGE EVERY 60 SECONDS. THE

    TOTAL COST TO THE NATIONAL ECONOMY IS ESTIMATED TO BE $1.6 BILLION
  17. Using off-the-shelf DSLR cameras, photos are stitched together to create

    a 3-D model of the city’s underground infrastructure. City Digital, City of Chicago and a consortium of partners are piloting the tech. Underground Map
  18. #Predictions Using advanced research techniques to forecast and predict events

    in the city. #Optimization Optimizing the allocation of resources across the city for greater efficiency. #Evaluation Evaluating programs, including the effectiveness of advanced analytics.
  19. City of Chicago found 31 factors that predicted when and

    where rodent complaints are most likely in the next week. We used spatial-temporal relationships to create these #predictions, which started as an investigation of over 350 different factors. Spatial Correlation Temporal Correlation
  20. The #predictions generate a list of likely locations, which is published

    to an internal site used to route preventative baiting crews to those locations.
  21. Chicago leveraged the #opendata portal to share data with external

    researchers, using the city’s premier method of sharing data and saving time on data-sharing agreements to create #predictions. Using #opendata
  22. ▪Establishments with previous critical or serious violations

    ▪Three-day average high temperature ▪Nearby garbage and sanitation complaints ▪Nearby burglaries ▪Whether the establishment has a tobacco or alcohol license ▪Length of time since last inspection ▪Length of time the establishment has been operating ▪Inspector assigned. The model predicts the likelihood of a food establishment having a critical violation, a violation most likely to lead to foodborne illness. Over a dozen #opendata sources were used to help define the model. Ultimately, ten different variables proved useful for creating #predictions of critical violations. Significant Predictors
  23. The #predictions revealed an opportunity to deliver results faster.

    Within the first half of the work, 69% of critical violations would have been found by inspectors using a data-driven approach. During the same period, only 55% of violations were found using the status quo method. [Chart: critical violations found, data-driven vs. status quo]
  24. The food inspection model is able to deliver results faster.

    Compared with the status quo, the data-driven approach accelerated the finding of violations by an average of 7.4 days in the 60-day pilot. That means the #predictions would lead inspectors to find more violations sooner. IMPROVEMENT 7 days
  25. OPTIMIZING FOOD INSPECTIONS The #predictions let inspectors discover violations sooner,

    which will reduce the risk of patrons becoming ill and help reduce medical expenses, lost time at work, and even some fatalities.
  26. The data science team has built a website which lets

    CDPH prioritize inspections based on #predictions.
  27. The analytical model will be released as an open source

    project on GitHub, allowing other cities to study or even adopt the model in their respective cities. No other city has released their analytic models before this release. #OPENSOURCE
  28. WEST NILE VIRUS • Between 5 and 884 human cases

    reported annually in Illinois since 2002 • 2,371 confirmed human infections since 2002 • Most people who become infected with West Nile virus never develop any symptoms • About 1 in 5 people who are infected will develop flu-like symptoms • Less than 1% of people who are infected will develop a serious neurologic illness
  29. PREVENTION & COLLECTION • The Chicago Department of Public Health

    (CDPH) uses a multi-pronged approach to fight the spread of WNV – Larvicide in stormwater drains – DNA tests of mosquitoes (pictured) – Spraying when WNV is present
  30. DNA MONITORING • At any given time there are 60+

    traps in Chicago collecting (mostly) Culex pipiens and Culex restuans mosquitoes • The traps are collected twice a week • Batches of up to 50 mosquitoes are DNA tested • The data is published on https://data.cityofchicago.org/ • The results and model predictions are displayed in WindyGrid
  31. Chicago partnered with the Robert Wood Johnson Foundation to offer $40,000

    in prize money for #hackathon participants to predict WNV cases. Top three winners had to make their code #opensource. Kaggle Competition
  32. WE WERE ABLE TO IDENTIFY WNV ONE WEEK IN ADVANCE

    IN OUT-OF-SAMPLE DATA 78% OF THE TIME, AND OUR PREDICTION WAS CORRECT 65% OF THE TIME
  33. #OPENDATA #OPENSCIENCE How might research, when combined with #opendata and

    #engagement with researchers, look for a municipal government? It would resemble the ongoing #openscience movement. #OPENSOURCE #ENGAGEMENT
  34. Chicago Beaches More than 20 million people visit the beach

    each season across 30 beaches that stretch along the 15-mile shoreline. During the summer, beaches exceed allowable E. coli thresholds around 150 times.
  35. MODEL EVALUATION

    [Chart: hit rate (percent), benchmark vs. new model]
  36. Journals are misaligned with applied city problems. Peer review is unwelcoming

    to those outside of academia. We often choose blog posts and pre-print servers to distribute findings. https://doi.org/10.1101/250480 Peer-review?
  37. Based out of Harvard’s Kennedy School of Government, Civic Analytics

    Network is an association of 22 Chief Data Officers from the United States who frequently discuss collaboration and coordination for open data and analytics. Civic Analytics Network
  38. THANK YOU Contact Info: Tom Schenk Jr., Chief Data

    Officer, City of Chicago, @ChicagoCDO, [email protected] Websites: data.cityofchicago.org, digital.cityofchicago.org, opengrid.io, techplan.cityofchicago.org, speakerdeck.com/tomschenkjr
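
Slides 23 and 24 compare a data-driven inspection order with the status quo using two figures: the share of critical violations found in the first half of the work, and the average number of days by which violations are found sooner. The slides do not say how these figures were calculated, so the Python sketch below is only one plausible reading. It assumes that the same inspections are carried out on the same days but reordered by a model score, and that "first half" means the first half of the inspections. The `Inspection` class, the scores, and the toy data are all made up for illustration and are not the city's code or results.

```python
from dataclasses import dataclass


@dataclass
class Inspection:
    day: int        # pilot day on which the inspection actually took place
    score: float    # hypothetical model score; higher means more likely violation
    critical: bool  # whether a critical violation was found


def data_driven_plan(inspections):
    """Pair the highest-scored inspections with the earliest actual day slots."""
    slots = sorted(i.day for i in inspections)
    ranked = sorted(inspections, key=lambda i: i.score, reverse=True)
    return list(zip(slots, ranked))


def share_found_in_first_half(day_flag_pairs):
    """Fraction of all critical violations found in the first half of inspections."""
    ordered = sorted(day_flag_pairs, key=lambda pair: pair[0])
    half = len(ordered) // 2
    total = sum(flag for _, flag in ordered)
    return sum(flag for _, flag in ordered[:half]) / total


def mean_days_gained(inspections):
    """Average number of days earlier each critical violation would be found."""
    gains = [insp.day - slot
             for slot, insp in data_driven_plan(inspections) if insp.critical]
    return sum(gains) / len(gains)


# Made-up toy data: one inspection per day for eight days.
toy = [
    Inspection(1, 0.2, False), Inspection(2, 0.9, True),
    Inspection(3, 0.1, False), Inspection(4, 0.3, False),
    Inspection(5, 0.8, True), Inspection(6, 0.4, False),
    Inspection(7, 0.7, True), Inspection(8, 0.6, True),
]

status_quo = share_found_in_first_half([(i.day, i.critical) for i in toy])
data_driven = share_found_in_first_half(
    [(slot, insp.critical) for slot, insp in data_driven_plan(toy)])
days_gained = mean_days_gained(toy)

print(f"status quo, first half:  {status_quo:.0%}")
print(f"data-driven, first half: {data_driven:.0%}")
print(f"mean days gained:        {days_gained:.1f}")
```

On this toy data the status quo finds 25% of critical violations in the first half, the reordered plan finds 100%, and violations are found 3.0 days sooner on average. These numbers only illustrate the calculation and have no connection to the 69%, 55%, and 7.4-day figures reported in the slides.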