Data Pipelines As Software Structures

Sean Braithwaite
June 13, 2017

Thesis: Data Pipelines emerge and grow to reflect collaboration between domains and are impeded by incidental coordination.

Transcript

  1. Data Pipelines: The software structures which emerge to process and disseminate information. A connected set of map reduce jobs for loading data into (a) database(s).
  2. Why Data Pipelines? To integrate diverse perspectives. Enable and empower collaboration between diverse sets of domain experts.
  3. Thesis: Data Pipelines emerge and grow to reflect collaboration between domains and are impeded by incidental coordination.
  4. Thank You: Emily Green, Omid Aladini, Sebastian Ohm, Fronx Wurmus, Matthias Georgi, David Whiting, Lorand Kasler, Gavin Bell, Jon Glover, Erik Bartels