Who Am I
Data Science & Engineering
* Machine Learning
* Distributed Systems
* Computational Art
Slide 2
Mechanics of Data Pipelines
Slide 3
Outline
* Problem motivation at SoundCloud
* Proposed properties
* Mechanics
* Open problems
Slide 4
SoundCloud
* ~300 employees
* Most working with data
* Product driven
* Microservices
Slide 5
Organisations which design systems ... are constrained to produce designs
which are copies of the communication structures of these organisations
— M. Conway
Conway's law
Slide 8
Counting service
* 10^6 reads/sec
* 10^4 writes/sec
* Maintains counts for all time
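A minimal sketch of the read/write split described above, assuming a single in-memory process. The class name `CountingService`, its methods, and the key format are hypothetical and not taken from the talk; a real service at this volume would be distributed and persistent.

```python
from collections import defaultdict


class CountingService:
    """Hypothetical in-memory stand-in for an all-time counting service."""

    def __init__(self):
        self._totals = defaultdict(int)  # key -> all-time count

    def record(self, key, amount=1):
        # Write path: fold each event into the running total.
        self._totals[key] += amount

    def count(self, key):
        # Read path: a single lookup, no scan over history.
        return self._totals.get(key, 0)


service = CountingService()
for _ in range(3):
    service.record("track:42:plays")
service.record("track:7:plays")
print(service.count("track:42:plays"))
```

Keeping only running totals makes reads cheap, which suits a workload where reads outnumber writes by roughly 100 to 1, but it discards the individual events behind each total.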
Slide 9
Challenges
* Counts are subject to spam
* Deleting data is painful
* Can’t do a full lambda architecture
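One way to see why spam and deletion are painful: an aggregate count alone does not say which increments came from spam. The sketch below assumes the raw events are retained so counts can be recomputed; the event shape, the function `count_plays`, and the user names are illustrative assumptions, not the system described in the talk.

```python
from collections import Counter

# Hypothetical play events: (track_id, user_id)
events = [
    ("track-1", "alice"),
    ("track-1", "bot-9"),
    ("track-1", "bot-9"),
    ("track-2", "bob"),
]


def count_plays(events, spammers=frozenset()):
    """Recount from raw events, skipping events from known spammers."""
    return Counter(track for track, user in events if user not in spammers)


before = count_plays(events)                     # spam still included
after = count_plays(events, spammers={"bot-9"})  # spam filtered out
print(before["track-1"], after["track-1"])
```

Recounting from events is simple but costs a full pass over history, which is the trade-off a counts-only store avoids.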
Materialised Views
* Ephemeral representation
* Non durable
* Asynchronous
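The three properties above can be sketched with a view derived from an append-only log. This is an illustration under stated assumptions: `MaterialisedView`, `publish`, `apply_pending`, and `rebuild` are hypothetical names, the log is a plain Python list, and the asynchronous step is run by hand so the example stays deterministic.

```python
from collections import deque


class MaterialisedView:
    """Hypothetical derived view over an append-only event log."""

    def __init__(self, log):
        self._log = log          # source of truth (a plain list here)
        self._pending = deque()  # events accepted but not yet applied
        self.counts = {}         # ephemeral state, safe to discard

    def publish(self, key):
        # Writes go to the log first; the view catches up later.
        self._log.append(key)
        self._pending.append(key)

    def apply_pending(self):
        # Asynchronous step, triggered manually in this sketch.
        while self._pending:
            key = self._pending.popleft()
            self.counts[key] = self.counts.get(key, 0) + 1

    def rebuild(self):
        # Non-durable: drop the view and replay the log from the start.
        self.counts = {}
        self._pending = deque(self._log)
        self.apply_pending()


log = []
view = MaterialisedView(log)
view.publish("track-1")
view.publish("track-1")
stale = dict(view.counts)  # view lags the log until events are applied
view.apply_pending()
fresh = dict(view.counts)
view.counts = {}           # simulate losing the view
view.rebuild()
```

Because the view can always be rebuilt from the log, losing it costs recomputation time rather than data.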
Slide 40
Data Discovery
* Consumers/Producers
* Service which coordinates jobs
* Movable data sets
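A rough sketch of how a discovery service might let data sets move without breaking consumers, assuming producers and consumers refer to data by name only. `DatasetRegistry`, its methods, the job name, and the `hdfs://` paths are all hypothetical examples, not the service described in the talk.

```python
class DatasetRegistry:
    """Hypothetical registry mapping data set names to locations."""

    def __init__(self):
        self._locations = {}  # data set name -> current location
        self._consumers = {}  # data set name -> set of consumer job names

    def publish(self, name, location):
        # Producers announce (or move) a data set by name.
        self._locations[name] = location

    def subscribe(self, name, consumer):
        # Consumers declare which data sets they depend on.
        self._consumers.setdefault(name, set()).add(consumer)

    def resolve(self, name):
        # Consumers look up the location at run time instead of hard-coding it.
        return self._locations[name]

    def consumers_of(self, name):
        return sorted(self._consumers.get(name, ()))


registry = DatasetRegistry()
registry.publish("plays-daily", "hdfs://cluster-a/plays/daily")
registry.subscribe("plays-daily", "top-tracks-job")
registry.publish("plays-daily", "hdfs://cluster-b/plays/daily")  # data set moved
print(registry.resolve("plays-daily"))
```

Tracking consumers alongside locations also gives a coordinating service the dependency information it needs to decide which jobs to run after a data set changes.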
Slide 42
* Conway's law is real
* Design as data structure
* Abstract and apply
Conclusion
Slide 43
Thank You
* Emily Green
* Omid Aladini
* Sebastian Ohm
* Fronx Wurmus
* Matthias Georgi
* David Whiting
* Lorand Kasler
* Gavin Bell
* Jon Glover
* Erik Bartels