Open source inspired workflows 
for open (and closed) geospatial data

Ben Balter
October 15, 2015

Open source software is produced by distributed teams that are rarely in the same place at the same time and rarely working on the same thing at the same time, yet they consistently produce better results than their closed source, proprietary counterparts. A big part of that continued success is open source's unique workflow and tools. What would it look like to treat geospatial data with the same respect that open source developers give their code? What if we took the tools and workflows of open source and used them to create collaborative geospatial tools and datasets, even if that data were ultimately never made public?

Transcript

  1. Open source inspired workflows for open (and closed) geospatial data
    @benbalter
    [email protected]
    government.github.com

  2.
    1. How open source got here
    2. Where geodata is today
    3. What data publishers can learn

  3. Open source ≠ published code

  4. Open source as a philosophy

  5. Open Source (software)
    software that can be freely used, modified, and shared (in both modified and unmodified form) by anyone

  6. Open Source
    a philosophy of collaboration in which working materials are made available online for anyone to fork, modify, discuss, and contribute to.

  7. Open source as a workflow

  8. Open source workflows must be location- and time-agnostic

  9. Wikipedia v. Encyclopedia Britannica

  10. A brief history of open source tooling

  11. In the beginning, we had drawers

  12. PDP-1
    (or so I’m told)

  13. Open source at the Tech Model Railroad Club

  14. We eventually upgraded to email

  15. ...and FTP

  16. ...and the read-only web

  17.
    1. Had to be there
    2. Had to know them
    3. Had to be perfect

  18. This is where (geo)data is today

  19. Computering is hard

  21. You are constantly one character away from crashing the entire site

  22. Version Control
    * 2d96cfe - (HEAD, tag: v3.1.1, origin/master, origin/HEAD, master) :gem: bump (43 minutes ago)
    * f4b446b - remove stray backtick (44 minutes ago)
    * 83599e3 - Merge branch 'master' of https://github.com/benbalter/g-man (46 minutes ago)
    |\
    | * 42514ea - Merge pull request #61 from devscott/laxco (50 minutes ago)
    | |\
    | | * 072d9b5 - Adding in additional entry for La Crosse County, WI (54 minutes ago)
    | |/
    * | 1e95d95 - remove unresolvable domains (46 minutes ago)
    * | 1a8645a - remove uwyo.edu/CES (86 minutes ago)
    |/
    * 70410ba - Merge pull request #60 from jpmckinney/canada (2 hours ago)
    |\
    | * a77ad43 - Use consistent comments for Canada hosts (2 hours ago)
    | * 1776e45 - Add more Canadian hosts (2 hours ago)
    * | 05211a0 - Merge pull request #58 from mitio/bulgarian-government-domains (3 hours ago)
    |\ \
    | * | fe8f862 - Add Bulgaria's government main domain (3 hours ago)
    | |/
    * | 85d0c7b - Merge pull request #59 from mitio/fix-readme-typos (3 hours ago)
    |\ \
    | |/
    |/|
    | * f558a90 - Add missing word in the readme (3 hours ago)

  23. Version control tracks who made what change when
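As a rough illustration of the "what change" part applied to geodata, the sketch below diffs two versions of a text-based record line by line, which is what version control tools do under the hood. It uses only Python's standard library; the feature name, the coordinates, and the file names `before.geojson` and `after.geojson` are made up for the example and do not come from the talk.

```python
import difflib
import json


def serialize(feature):
    """Serialize with sorted keys and indentation so the same data always
    produces the same lines, which keeps diffs small and readable."""
    return json.dumps(feature, indent=2, sort_keys=True).splitlines()


# Hypothetical record: the name and coordinates are placeholders.
before = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-91.0, 43.0]},
    "properties": {"name": "Example County"},
}

# The same record after a contributor corrects the name.
after = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [-91.0, 43.0]},
    "properties": {"name": "Example County, WI"},
}

diff = list(
    difflib.unified_diff(
        serialize(before),
        serialize(after),
        fromfile="before.geojson",
        tofile="after.geojson",
        lineterm="",
    )
)

print("\n".join(diff))
```

A version control system would attach an author and a timestamp to the same kind of diff, which supplies the "who" and "when".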

  24. Author publishes → User downloads → User finds a bug → User submits a patch

  25. Is this a bug?
    Has anyone else experienced this?
    Is this the best solution?
    Can someone help test this?
    Can you release a new version with the fix?
    Email all the things!

  26. The instructions for contributing to the Linux kernel are 50,000 words

  27. Collaborative version control
    Decentralized & social

  28. Decentralized
    Everyone has the opportunity to contribute

  29. Social
    Everything happens in the open and by people

  30. Standardized
    Don't need to RTFM

  31. Captures and exposes process
    Proposed alternatives, what decision was made, why

  32. Is this a bug?
    Has anyone else experienced this?
    Is this the best solution?
    Can someone help test this?
    Can you release a new version with the fix?
    (Google and then) post all the things!

  33. "Anyone is encouraged to contribute to the project by forking and submitting a pull request. (If you are new to GitHub, you might start with a basic tutorial.)"
    Contributing to whitehouse/petitions

  34. Open source ≠ published code

  35. Open source ≠ published code

  36. Open source ≠ published code

  37. Three lessons geodata can learn from open source

  38. 1. Prefer open formats
    to increase potential for data consumers

  39. Purpose-built, proprietary tools

  41. Shapefile GeoJSON
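To make the open-format point concrete, here is a minimal sketch of a GeoJSON file built as plain text with Python's standard library alone, with no purpose-built tooling. The helper names `make_feature` and `to_geojson`, the place name, and the coordinates are illustrative assumptions, not part of the talk.

```python
import json


def make_feature(name, lon, lat):
    """Build one GeoJSON Feature with a Point geometry.

    GeoJSON orders coordinates as [longitude, latitude].
    """
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {"name": name},
    }


def to_geojson(features):
    """Wrap features in a FeatureCollection and serialize to stable text."""
    collection = {"type": "FeatureCollection", "features": features}
    # Sorted keys and indentation keep the output readable and easy to diff.
    return json.dumps(collection, indent=2, sort_keys=True)


# Hypothetical data point, used only for illustration.
text = to_geojson([make_feature("Example City Hall", -77.0, 38.9)])
print(text)
```

Because the result is ordinary text, any editor, programming language, or version control system can read it, which is harder to say of a binary, multi-file format such as the Shapefile.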

  42. 2. Adopt shared standards

  44. Foster open standards

  46. 3. Free tightly held data

  47. Published data

  49. Open data

  51. Collaborative data

  53. Bonus: collaborative tools

  55. Your first (or second) step

  56. Don't make multi-year, multi-million dollar investments on a hunch for what developers might find useful

  57. 1. Establish a "data" repository

  58. 2. Publish all the data you can

  63. Open source inspired workflows for open (and closed) geospatial data
    @benbalter
    [email protected]
    government.github.com

  64. Photo credits
    ‣ PDP-1 — flickr.com/photos/hiddenloop/307119987/
    ‣ Punch Card Decks — mehul panchal, via Wikimedia Commons