Strata SG 2017 - Open Budgets India: Lessons from the front line

We share key lessons from two years of building Open Budgets India and discuss best practices for Data-for-Good projects.

Gaurav Godhwani

December 07, 2017

Transcript

  1. Open Budgets India: Lessons from the front line | Gaurav Godhwani | @gggodhwani
  2. Government budgets explain the real priorities and values of the state and its people
  3. But... budgets are hard to consume & difficult to understand

  4. Major issues with India’s Budgets • Scattered and unstructured PDF documents • Limited availability of Budgets online • Inconsistent formats • No metadata • Inconsistent and incomplete Budget Codes (aka Unique IDs)
  5. But these problems are common across all public information systems & civic-tech projects
  6. Primary Education • Public Health • Judiciary • Agriculture • Drinking Water & Sanitation • Energy
  7. MAJOR LESSONS

  8. LESSON #1 Invest in Problem Munging

  9. 150+ Budget Source Websites

  10. 150+ Budget Formats

  11. Collaborate with Communities

  12. LESSON #2 Explore existing Data Platforms

  13. Understand Existing Data Platforms

  14. LESSON #3 Go the Agile Way

  15. Process Development Cycle

  16. LESSON #4 Build a Robust Pipeline

  17. Data Pipeline: Scrape → Parse → Transform → Publish → Analyse
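
The five stages named on this slide can be sketched as a chain of small functions. This is only an illustration of the staging idea, not the project's actual pipeline: every function name, the pipe-delimited sample format, and the `head`/`amount` field names are assumptions made up for this example.

```python
import csv
import io


def scrape(source):
    # Stand-in for downloading a budget document; the text is passed in directly.
    return source


def parse(raw_text):
    # Split the raw text into rows of cells (hypothetical pipe-delimited layout).
    return [line.split("|") for line in raw_text.strip().splitlines()]


def transform(rows):
    # Drop the header row and convert amounts to numbers.
    _header, *body = rows
    return [{"head": head.strip(), "amount": float(amount)} for head, amount in body]


def publish(records):
    # Serialise the cleaned records as CSV text, ready to upload to a data portal.
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["head", "amount"])
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def analyse(records):
    # A trivial analysis step: total allocation across all budget heads.
    return sum(record["amount"] for record in records)


def run_pipeline(source):
    records = transform(parse(scrape(source)))
    return publish(records), analyse(records)
```

Keeping each stage as a separate function means a new source format only needs a new `parse`, while the later stages stay unchanged.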

  18. Parse - Line-based Segmentation: table_bounds = { "top": …, "left": …, "bottom": …, "right": … }; column_coordinates = [c1, c2, c3, ..., cN]
  19. https://github.com/tabulapdf/tabula { Table Attributes } Parse - Line-based Segmentation
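
A minimal sketch of how `table_bounds` and `column_coordinates` from the slides above might drive line-based segmentation. It is not the project's parser and does not call Tabula. It assumes words have already been extracted as `(x, y, text)` tuples with `y` increasing down the page, and the function name and the `row_tolerance` parameter are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def segment_rows(words, table_bounds, column_coordinates, row_tolerance=2.0):
    """Group positioned words into table rows and columns.

    words: iterable of (x, y, text) tuples (assumed input format).
    column_coordinates: sorted x positions of the boundaries between columns.
    """
    n_cols = len(column_coordinates) + 1
    lines = defaultdict(list)
    for x, y, text in words:
        inside = (
            table_bounds["left"] <= x <= table_bounds["right"]
            and table_bounds["top"] <= y <= table_bounds["bottom"]
        )
        if not inside:
            continue  # ignore headers, footers and anything outside the table
        # Words whose y positions fall in the same bucket share a text line.
        lines[round(y / row_tolerance)].append((x, text))

    table = []
    for key in sorted(lines):
        cells = [[] for _ in range(n_cols)]
        for x, text in sorted(lines[key]):
            # The column index is the number of boundaries at or left of the word.
            cells[bisect_right(column_coordinates, x)].append(text)
        table.append([" ".join(cell) for cell in cells])
    return table
```

With a single boundary at `x = 150`, a word at `x = 10` lands in the first column and a word at `x = 200` in the second.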

  20. But..

  21. Parse - Block-based Segmentation

  22. Clean Machine Readable Data https://openbudgetsindia.org/api/action/datastore_search?resource_id=38e553a0-4dd9-46f5-8d62-4938e1f7df3d
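
The URL on this slide follows the shape of CKAN's `datastore_search` action, so a query could be built and its response unpacked roughly as below. This is a sketch under assumptions: the helper names are hypothetical, the response layout is the standard CKAN envelope (`success`, `result.records`), and whether this endpoint or resource is still live has not been checked.

```python
import json
from urllib.parse import urlencode

BASE_URL = "https://openbudgetsindia.org/api/action/datastore_search"


def build_query_url(resource_id, limit=5, offset=0):
    # resource_id, limit and offset are standard datastore_search parameters.
    params = {"resource_id": resource_id, "limit": limit, "offset": offset}
    return f"{BASE_URL}?{urlencode(params)}"


def extract_records(response_text):
    # CKAN wraps results as {"success": ..., "result": {"records": [...]}}.
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("datastore_search reported failure")
    return payload["result"]["records"]
```

To try it against the live portal, the URL from `build_query_url("38e553a0-4dd9-46f5-8d62-4938e1f7df3d")` can be fetched with `urllib.request.urlopen` and the decoded body passed to `extract_records`.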

  23. LESSON #5 Keep everything Open-by-default

  24. Keeping Code, Data, Research, Design - All Open https://github.com/cbgaindia

  25. LESSON #6 Enable Data Consumption

  26. Educate https://openbudgetsindia.org/budget-basics/union-budget.html#money-flow

  27. Simplify http://unionbudget2017.cbgaindia.org/

  28. Compare https://cbgaindia.github.io/story-generator/

  29. Compare https://cbgaindia.github.io/story-generator/

  30. Enable Replication https://datakind-blr.github.io/antara/

  31. Customize

  32. Customize

  33. LESSON #7 Document Everything!

  34. https://github.com/cbgaindia/parsers

  35. LESSON #8 Track Various Data Adoptions

  36. State of Aadhaar Report - Social Protection, May 2017 http://stateofaadhaar.in/wp-content/uploads/State-of-Aadhaar-Ch5-Social-Protection.pdf

  37. The Huffington Post http://www.huffingtonpost.in/vineet-john-samuel/the-gorakhpur-and-farrukhabad-tragedies-are-symptoms-of-a-larger-malaise_a_23194697/

  38. Collaborate. Help us to: • Use our tools to analyse your data & write your data stories • Generate more Open Government Data in your geography • Help us improve these algorithms and evolve the codebase. We are open to new ideas, suggestions and feedback.
  39. Attributions • https://pixabay.com/p-330580/ - AkshayaPatra Foundation at Pixabay [CC0] • https://flic.kr/p/9bybv7 - United Nations Photo - Maternal Health in Developing Countries at Flickr [CC BY 2.0] • https://commons.wikimedia.org/wiki/File%3ASupreme_Court_of_India_-_200705_(edited).jpg - Legaleagle86 at en.wikipedia [CC BY-SA 3.0 (https://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons • https://commons.wikimedia.org/wiki/File%3AAgriculture_main.jpg - By Meera'rah (Own work) [CC BY-SA 4.0 (https://creativecommons.org/licenses/by-sa/4.0)], via Wikimedia Commons • https://pixabay.com/en/faucet-fountain-water-dispenser-1684902/ [CC0] • https://www.meetup.com/DataKind-Bangalore/photos/26368975/460442903/ - DataKind Bangalore, Project Accelerator • https://heaven00.github.io/pycon_delhi_2017/#/ - Jayant Pahuja, PyCon Delhi - 2017 • https://www.zopyx.com/andreas-jung/contents/integrating-sphinx-documentation-into-a-pyramid-application/image - Sphinx Logo • https://www.shareicon.net/react-js-logo-react-js-117367 - ReactJS Logo • http://newprolab.com/en/dataengineer/img/logos/supersetcolor.png - Apache Superset Logo
  40. Thanks Code: https://github.com/cbgaindia Email: gaurav.godhwani@gmail.com @gggodhwani @OpenBudgetsIn @CBGAIndia @DataKindBLR