
Unified Log at State

Mischa
May 20, 2015

How we have implemented a Unified Log at http://state.com/


Transcript

  1. I had read Jay Kreps’ blog post: “The Log: What every software engineer should know about real-time data's unifying abstraction”
  2. And knew it was the best way for us to move forward, to move faster, to engineer better systems …
  3. Product Market Fit
    • Efficiency and speed of iteration
    • Ability to be nimble
    • Ability to make changes quickly
    • Ability to test out hypotheses
  4. Challenges
    • State is a startup
    • we burn cash
    • we have no revenue
    • pushy product team
    • super smart engineering team
  5. Desires
    • Decouple system architecture into individual standalone components
    • Enable engineers to have the freedom/flexibility to use the right tools for each and every problem
    • Framework / Pattern for building new components
  6. I wanted it to be fun (most importantly). Photo by https://www.flickr.com/photos/jdhancock/
  7. From a monolith …
    • Single repo codebase
    • Monolithic deploys requiring full regression testing
    • Unwieldy Ruby on Rails web application, battered through numerous cycles of product change
  8. … to a SOA
    • Multiple daily deploys
    • Individual, isolated, deployable web services
    • Engineers can choose the right tools for the tasks at hand
    • A formula for creating as many Micro-Services as needed
  9. Me
    • Mischa Tuffield http://mmt.me.uk/
    • State CTO - Since Apr 2014
    • 3rd Startup - Garlik, PeerIndex
    • PhD in CS - Capturing Autobiographical Metadata
    • Web and Data Geek
    • I ♥ Semantic Web
    • TBL is a legend (don’t doubt this)
    • ex-W3C contributor
    • Social Web XG Editor
    • RDF 1.1 Turtle Contributor
    • I also ♥ Star Wars and Batman
    • Am more of a Luke person than a Han person
  10. About State
    • Opinion Network
    • You shouldn't need a following to get heard
    • Exchange opinions with people around the world
    • Social Network structured like News, where people share opinions on topics they care about
    • Founded by Alex & Mark Asseily (Jawbone & Skype)
  11. TL;DR
    • We created some workflows and infrastructure to allow data to be persisted into our Unified Log as separate topics - e.g. Users, Opinions, “Well Said”s
    • We created a framework to allow the data in our Unified Log to be joined together to create new streams
    • We could move away from treating our database as the canonical source of truth for data - we are undecided
    • This allows the Engineers to use whatever technology suits the problem at hand; all they have to do is ensure that, if their service dies, it can recreate itself by replaying all of the data from its topic in our Unified Log
    • We made use of segment.com as our analytics tool, unifying this concept of using an Event Based Unified Log
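The replay idea in the last-but-one bullet can be sketched in a few lines. This is a minimal illustration in plain Python, not State's code: the `Event` shape, the `rebuild_state` helper and the use of a `None` value as a delete marker are all assumptions made for the example, and the topic is modelled as an in-memory list rather than a real Kafka topic.

```python
from collections import namedtuple

# Hypothetical event shape: one keyed record on a topic of the log.
Event = namedtuple("Event", ["key", "value"])


def rebuild_state(events):
    """Fold a topic's events, oldest first, into a key -> latest value map."""
    state = {}
    for event in events:
        if event.value is None:
            # Assumption: a None value marks the key as deleted.
            state.pop(event.key, None)
        else:
            state[event.key] = event.value
    return state


# A stand-in for the "Users" topic, in the order the events were written.
users_topic = [
    Event("u1", {"name": "Ada"}),
    Event("u2", {"name": "Bob"}),
    Event("u1", {"name": "Ada L."}),
    Event("u2", None),
]

state = rebuild_state(users_topic)
```

A service that keeps only this derived state can lose it and start again from the beginning of its topic, which is the property the slide asks every service to have.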
  12. State’s Unified Log
    • We have implemented our Log using Apache Kafka
    • Our Unified Log is our Data API for our Micro-Services
    • We have two types of Kafka topics
    • Shared Topics - Data API available to all Micro-Services
    • Internal Topics - not shared, contracted for use by individual Micro-Services only
  13. The take-home message is Decoupling of Services, in order to avoid Regressions & Spaghetti Code
  14. Operation Framework
    • We have defined a framework (Devops) for writing these Micro-Services. A well understood JSON over HTTP interface
    • 12 factor, JSON over HTTP
  15. The key requirement for this work was to continue Product Development. We did NOT perform a rewrite of our stack
  16. And then we built our “Op Tailer”
    • Which reads our Mongo DB “Replica Set Oplog”
    • It should be noted that, unlike MySQL, where the replication data format isn’t an agreed API (see LinkedIn’s papers on their Log) and is susceptible to change, the Mongo Replica Set Oplog is just another Mongo Collection
    • It could change :)
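To make the Op Tailer idea concrete, here is a hedged sketch of the mapping step only: turning one oplog entry into a keyed message for the log. It relies on the commonly documented oplog fields `op`, `ns`, `o` and `o2`; the function name, the collection-to-topic mapping and the output shape are hypothetical, and the entries are plain dicts (real entries carry BSON types such as ObjectId and would need a BSON-aware serializer). As the slide says, the oplog format is not a contract and could change.

```python
import json


def oplog_entry_to_message(entry):
    """Map one oplog entry to a (topic, key, value) tuple, or None to skip it."""
    op = entry["op"]
    if op not in ("i", "u", "d"):
        # Skip anything that is not an insert, update or delete (e.g. no-ops).
        return None
    # "ns" is "<database>.<collection>"; collection names may contain dots.
    collection = entry["ns"].split(".", 1)[1]
    # Updates carry the target document's _id in "o2"; inserts and deletes in "o".
    doc_id = entry["o2"]["_id"] if op == "u" else entry["o"]["_id"]
    value = {"op": op, "data": entry["o"]}
    # Assumption: the collection name doubles as the topic name.
    return (collection, str(doc_id), json.dumps(value, sort_keys=True))


insert_entry = {"op": "i", "ns": "state.users", "o": {"_id": "u1", "name": "Ada"}}
update_entry = {
    "op": "u",
    "ns": "state.users",
    "o": {"$set": {"name": "Ada L."}},
    "o2": {"_id": "u1"},
}
noop_entry = {"op": "n", "ns": "", "o": {"msg": "periodic noop"}}

insert_msg = oplog_entry_to_message(insert_entry)
update_msg = oplog_entry_to_message(update_entry)
noop_msg = oplog_entry_to_message(noop_entry)
```

Keying each message by document id keeps every change to one document on the same key, so a consumer replaying the topic sees those changes in order.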
  17. Kafka REST
    • Kafka REST is a web service written by the folk at Confluent.io which writes data to a given Kafka topic
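As an illustration of what writing to a topic over HTTP looks like, the sketch below builds (but does not send) a produce request. It assumes the Confluent REST Proxy's `POST /topics/{topic}` endpoint and its v2 JSON embedded format; the slides do not say which version State ran, and the base URL, topic name and helper name are placeholders.

```python
import json
import urllib.request


def build_produce_request(base_url, topic, records):
    """Build a POST request that would produce (key, value) pairs to a topic."""
    body = json.dumps({"records": [{"key": k, "value": v} for k, v in records]})
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=body.encode("utf-8"),
        # Assumption: the proxy accepts the v2 JSON embedded format.
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        method="POST",
    )


# Hypothetical proxy address and topic name.
req = build_produce_request(
    "http://localhost:8082", "users", [("u1", {"name": "Ada"})]
)
# urllib.request.urlopen(req) would send it; it is deliberately not called here.
```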
  18. [Architecture diagram] Components shown: iOS; WWW; Droid; State API Backend; Ruby on Rails Webapp; Mongo DB; OpTailer; Kafka REST; User Topic (User Info Data); Unified Log; Kafka; …
  19. Then we wrote a simple indexer
    • Which took data from the Kafka topic and pushed it into Elastic Search
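The indexing step can be sketched as a pure function that turns a batch of topic messages into a request body for Elasticsearch's bulk endpoint (newline-delimited JSON: an action line, then the document). This is an assumption-laden illustration, not State's indexer: the helper name and index name are made up, messages are modelled as `(key, document)` pairs, and older Elasticsearch versions would also expect a `_type` in the action line.

```python
import json


def bulk_index_body(index, messages):
    """Build a bulk-API body that indexes each (key, document) pair."""
    lines = []
    for key, doc in messages:
        # Action line: index this document under the message key.
        lines.append(json.dumps({"index": {"_index": index, "_id": key}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


body = bulk_index_body("users", [("u1", {"name": "Ada"}), ("u2", {"name": "Bob"})])
```

Using the message key as the document `_id` means that replaying the topic overwrites documents instead of duplicating them.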
  20. [Architecture diagram] Components shown: iOS; WWW; Droid; State API Backend; Ruby on Rails Webapp; Mongo DB; OpTailer; Kafka REST; User Topic (User Info Data); Unified Log; Kafka; Search Indexer; Elastic Search; …
  21. Then we wrote a simple Search Service
    • Which implements our Micro-Services Framework, based on Dropwizard (from the folk at Yammer)
  22. [Architecture diagram] Components shown: iOS; WWW; Droid; State API Backend; Ruby on Rails Webapp; Mongo DB; OpTailer; Kafka REST; User Topic (User Info Data); Unified Log; Kafka; Search Indexer; Elastic Search; Search Service; …
  23. We are moving to V2 of our Search Service
    • This time we wanted to include the User’s Social Graph (which we call our Connections Graph)
    • But we had to join the data in our User Kafka topic with our Connections Graph
    • We used Apache Samza for this Task
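The join described above can be illustrated without Samza. The sketch below is plain Python and does not use the Samza API: it keeps the latest user record and the set of connections per user id in local dictionaries (a stream processor would typically hold this in a local state store) and emits a joined record whenever either side changes. All class, method and field names are hypothetical.

```python
class UserConnectionsJoin:
    """Join user records with connection edges, keyed by user id."""

    def __init__(self):
        self.users = {}        # user_id -> latest user record
        self.connections = {}  # user_id -> set of connected user ids

    def _joined(self, user_id):
        # Emit nothing until the user record itself has been seen.
        if user_id not in self.users:
            return None
        return {
            "user_id": user_id,
            "user": self.users[user_id],
            "connections": sorted(self.connections.get(user_id, set())),
        }

    def on_user(self, user_id, user):
        """Handle a message from the User topic."""
        self.users[user_id] = user
        return self._joined(user_id)

    def on_connection(self, user_id, other_id):
        """Handle a message from the User Connections topic."""
        self.connections.setdefault(user_id, set()).add(other_id)
        return self._joined(user_id)


join = UserConnectionsJoin()
first = join.on_connection("u1", "u2")        # user not seen yet
second = join.on_user("u1", {"name": "Ada"})  # now joinable
third = join.on_connection("u1", "u3")        # re-emitted with the new edge
```

The joined records form a new stream, which an indexer can consume in the same way as the original User topic.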
  24. [Architecture diagram] Components shown: iOS; WWW; Droid; State API Backend; Ruby on Rails Webapp; Mongo DB; OpTailer; Kafka REST; User Topic (User Info Data); User Connections (User Social Graph); Unified Log; Kafka; Samza (User JOIN User Connections); Search Indexer; Elastic Search; Search Service; …
  25. Segment.com
    • Buy instead of Build
    • Event based Analytics
    • Open-source Client APIs
    • Write once for the clients
    • can make use of different services
    • can be pushed back into product via Webhook Interface, obvs. into the Unified Log
  26. Gotchas
    • You need to be super good at Ops to get this stuff working
    • Schema Evolution is difficult - We are using Apache Avro and Schema Registry from the Confluent folk
    • Kafka needs to be fault tolerant - it is lower-level than a DB
    • Must agree upon a Devops Framework for writing these Micro-Services. A well understood JSON over HTTP interface
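One reason schema evolution is workable with Avro is its resolution rule: a reader fills in a field that is missing from older data using the default declared in its own schema, and ignores fields it does not know. The sketch below mimics that rule on plain dicts. It is a simplification that does not use the Avro library (real resolution compares a writer schema with a reader schema), and the `User` schema and `resolve` helper are hypothetical.

```python
import json

# Hypothetical reader schema: "locale" was added later, with a default.
READER_SCHEMA = json.loads("""
{
  "type": "record",
  "name": "User",
  "fields": [
    {"name": "id", "type": "string"},
    {"name": "name", "type": "string"},
    {"name": "locale", "type": "string", "default": "en"}
  ]
}
""")


def resolve(record, reader_schema):
    """Project a decoded record onto the reader schema, applying defaults."""
    resolved = {}
    for field in reader_schema["fields"]:
        name = field["name"]
        if name in record:
            resolved[name] = record[name]
        elif "default" in field:
            # Field unknown to the writer: fall back to the reader's default.
            resolved[name] = field["default"]
        else:
            raise ValueError(f"no value or default for field {name!r}")
    # Fields present in the record but absent from the schema are dropped.
    return resolved


old_record = resolve({"id": "u1", "name": "Ada"}, READER_SCHEMA)
extra_field = resolve(
    {"id": "u2", "name": "Bob", "locale": "fr", "legacy": 1}, READER_SCHEMA
)
```

Adding a field without a default is the change that breaks readers of old data, which is the kind of mistake a schema registry's compatibility checks are meant to catch.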
  27. THANK YOU to JD Hancock for these amazing stormtrooper photos (what a legend). All photos are shared under CC 2.0
  28. Thank you for listening. Questions? Dan Harvey will be talking in more detail at the Hadoop User Group. [email protected] @mischat