Stream data processing, data pipeline architecture, unified log system, event sourcing, CQRS, Complex Event Processing: these are just some of the names for an approach to system design that embraces the idea that data *flows* like a stream. There’s been a recent surge in discussions about building systems this way, from LinkedIn to Confluent to Yahoo. There are lots of fascinating and inspiring articles, books, and conference talks, but many of them are bold, broad, and fundamental. There’s a dearth of guidance on the nuts and bolts of actually building systems this way. So, in the spirit of Go, this talk will start to fill that gap, from the bottom up.
I’ll describe an existing queue-based data processing system at Timehop that was starting to break down, and the steps we took to replace it with a stream-based system. I’ll walk through the overall dataflow of each system and review the Go code we used to interface with Kinesis and process the streaming data.