Slide 1

Slide 1 text

Data Pipelines as Software Structures
Berlin Buzzwords 2017
@brapse

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

Data Pipelines
The software structures which emerge to process and disseminate information.
A connected set of MapReduce jobs for loading data into one or more databases.
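To make the second definition concrete, here is a minimal sketch of two connected map-reduce jobs whose result is loaded into a database. It is an illustration, not the speaker's implementation: the event shape, the `map_reduce` helper, the job functions, and the `user_views` table are all hypothetical, and an in-memory SQLite database stands in for whatever store a real pipeline would target.

```python
import sqlite3
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run one in-memory map-reduce job (hypothetical helper).

    Each record is mapped to (key, value) pairs, values are grouped by key,
    and each group is reduced to a single output record.
    """
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return [reducer(key, values)
            for key, values in sorted(groups.items(), key=lambda kv: kv[0])]


# Job 1: count views per (user, page) from raw events.
def views_mapper(event):
    yield (event["user"], event["page"]), 1


def views_reducer(key, counts):
    user, page = key
    return {"user": user, "page": page, "views": sum(counts)}


# Job 2: total views per user, consuming the output of job 1.
def totals_mapper(row):
    yield row["user"], row["views"]


def totals_reducer(user, views):
    return (user, sum(views))


def run_pipeline(events, conn):
    """Chain the two jobs and load the final result into the database."""
    per_page = map_reduce(events, views_mapper, views_reducer)
    per_user = map_reduce(per_page, totals_mapper, totals_reducer)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS user_views "
        "(user_id TEXT PRIMARY KEY, views INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO user_views VALUES (?, ?)", per_user)
    conn.commit()
    return per_user


# Assumed sample input, invented for illustration.
EVENTS = [
    {"user": "a", "page": "/x"},
    {"user": "a", "page": "/y"},
    {"user": "b", "page": "/x"},
    {"user": "a", "page": "/x"},
]

if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(run_pipeline(EVENTS, connection))
```

The output of the first job is the only input of the second, which is what makes the jobs a connected set rather than two independent scripts.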

Slide 4

Slide 4 text

No content

Slide 5

Slide 5 text

Why Data Pipelines? To integrate diverse perspectives. Enable and empower collaboration between diverse sets of domain experts.

Slide 6

Slide 6 text

How do we build them? Often Badly. Misunderstood domain. Misunderstood integration. Misunderstood coordination.

Slide 7

Slide 7 text

Thesis: Data Pipelines emerge and grow to reflect collaboration between domains and are impeded by incidental coordination.

Slide 8

Slide 8 text

Abstract

Slide 9

Slide 9 text

Evolution

Slide 10

Slide 10 text

Bounded Context

Slide 11

Slide 11 text

Storage

Slide 12

Slide 12 text

Teams

Slide 13

Slide 13 text

Nested splits: Teams > Contexts > Storage

Slide 14

Slide 14 text

Mapping

Slide 15

Slide 15 text

Mapping Storage Boundaries

Slide 16

Slide 16 text

Mapping Context Boundaries

Slide 17

Slide 17 text

Mapping Team Boundaries

Slide 18

Slide 18 text

Coordination

Slide 19

Slide 19 text

Coordinating Definitions

Slide 20

Slide 20 text

Coordinating Correctness

Slide 21

Slide 21 text

Coordinating Failure

Slide 22

Slide 22 text

Concrete

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

Correctness

Slide 25

Slide 25 text

Fingerprints

Slide 26

Slide 26 text

Fingerprints

Slide 27

Slide 27 text

Fingerprints

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

Coordinate Change

Slide 31

Slide 31 text

Coordinate Change

Slide 32

Slide 32 text

Coordinate Change

Slide 33

Slide 33 text

Convergence

Slide 34

Slide 34 text

Failure

Slide 35

Slide 35 text

Retroactivity

Slide 36

Slide 36 text

Mutation

Slide 37

Slide 37 text

Mutation

Slide 38

Slide 38 text

Immutability

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

Conclusions
Data pipelines enable coordination but require shared protocols to determine when and how we read and write data.
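One possible reading of a shared protocol for reads and writes, sketched under assumptions: writers publish immutable dataset versions identified by a content fingerprint, and readers pin the fingerprint they depend on. The slides name fingerprints and immutability but do not spell out a mechanism, so the `DatasetStore` class and its `publish`, `latest`, and `read` methods are hypothetical.

```python
import hashlib
import json


class DatasetStore:
    """Hypothetical in-memory store of immutable, fingerprinted dataset versions."""

    def __init__(self):
        self._payloads = {}  # (name, fingerprint) -> serialized rows
        self._latest = {}    # name -> fingerprint of the most recent publish

    def publish(self, name, rows):
        """Write a new version; existing versions are never modified."""
        payload = json.dumps(rows, sort_keys=True).encode("utf-8")
        fingerprint = hashlib.sha256(payload).hexdigest()
        self._payloads[(name, fingerprint)] = payload
        self._latest[name] = fingerprint
        return fingerprint

    def latest(self, name):
        """Return the fingerprint of the most recently published version."""
        return self._latest[name]

    def read(self, name, fingerprint):
        """Read a pinned version and verify it still matches its fingerprint."""
        payload = self._payloads[(name, fingerprint)]
        if hashlib.sha256(payload).hexdigest() != fingerprint:
            raise ValueError("stored data does not match its fingerprint")
        return json.loads(payload)


if __name__ == "__main__":
    store = DatasetStore()
    v1 = store.publish("users", [{"id": 1}])
    v2 = store.publish("users", [{"id": 1}, {"id": 2}])
    # A reader pinned to v1 is unaffected by the later publish.
    print(store.read("users", v1))
    print(store.latest("users") == v2)
```

Under this sketch, the "when" of a read is decided by which fingerprint the reader pins, and the "how" of a write is append-only publication, so a later write cannot retroactively change what an earlier reader saw.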

Slide 41

Slide 41 text

Thank You
Emily Green, Omid Aladini, Sebastian Ohm, Fronx Wurmus, Matthias Georgi, David Whiting, Lorand Kasler, Gavin Bell, Jon Glover, Erik Bartels