Building a Data Pipeline with Clojure and Kafka

by David Pick

Published October 28, 2014 in Programming

At some point in its lifetime, every large software application must turn to a service-oriented architecture to deal with complexity. This often involves separating data between applications and creating a way for those applications to talk to each other. Inevitably, pieces of the system (e.g. a data warehouse and search) end up needing to know more about the shape of the data in the main application than a separate piece of architecture should. To combat this issue, Braintree developed a data pipeline built on PGQ, Kafka, ZooKeeper, and Clojure. In this talk, David Pick will give an in-depth review of how the data pipeline functions and talk through some of the issues encountered along the way.