Making sense of stream processing

Talk given at /dev/winter 2015, Cambridge, UK. http://devcycles.net/2015/winter/sessions/index.php?session=8

Abstract:

Some people call it stream processing. Others call it Event Sourcing or CQRS. Some even call it Complex Event Processing. Sometimes, such self-important buzzwords are just smoke and mirrors, invented by consultants to sell you stuff. But sometimes, they contain a kernel of wisdom which can really help us design better systems.

In this talk, we will go in search of the wisdom behind the buzzwords. We will discuss how event streams can help make your application more scalable, more reliable and more maintainable. Grounded in the experience of building large-scale data systems at LinkedIn, and implemented in open source projects like Apache Samza, stream processing is finally coming of age.

Martin Kleppmann

January 24, 2015

Transcript

  1. Input (write):

    { "user_id": 17055506, "timestamp": 1421777578, "status": "Hello world!" }

    Output (read):

    { "timeline_id": 17055506,
      "tweets": [
        { "tweet_id": 557595969962127360, "username": "hotnumbers", "name": "Hot Numbers Coffee", "timestamp": 1421777123, "status": "Open till 9pm tonight", "picture_url": "http://twimg.com/…" },
        { "tweet_id": 557515007622414337, … },
        …
      ] }
  2. SELECT tweets.*, users.*
    FROM tweets
    JOIN users ON users.id = tweets.sender_id
    JOIN follows ON follows.followee_id = users.id
    WHERE follows.follower_id = $user
    ORDER BY tweets.time DESC
    LIMIT 100;
  3. Input (write):

    { "user_id": 10152654725303061, "action": "like", "item_id": 10101851078206231 }

    Output (read):

    { "post_id": 10101851078206231,
      "author": { "name": "Mark Zuckerberg", "username": "zuck", "photo_url": "http://fbcdn.akamai…" },
      "post_text": "You can’t kill an idea.\n…",
      "timestamp": 1421025628,
      "total_likes": 160213,
      "total_shares": 6027,
      "top_comments": [ {"name": "Saida Maaoui", …}, … ],
      … }
  4. Input (write):

    { "user": "Foo",
      "edit_timestamp": 1421777578,
      "text": "Elliptic curve cryptography (ECC) is an approach to [[public-key cryptography]] based on the algebraic structure of [[elliptic curve]]s over [[finite field]]s. …" }

    Output (read):

    { "text": "Elliptic curve cryptography (ECC) is an approach to [[public-key cryptography]] based on the algebraic structure of [[elliptic curve]]s over [[finite field]]s. …" }
  5. Input (write):

    { "user_id": 12526586,
      "action": "add-job-to-profile",
      "job_info": { "job_title": "Author", "company": "O’Reilly", "start_date": "2013-08", "end_date": null, "description": "…" } }

    Output (read):

    accountant → 168929, 929431, …
    administrative → 481143, 937298, …
    actor → 656468, 807204, 894765, …
    advertising → 702221, 715066, …
    airline → 71955, 215020, 545045, …
    animal → 107553, 478445, 720498, …
    auction → 770989, 833569, …
    author → 218037, 755543, …
    banker → 408729, 758862, …
    biotechnology → 106272, 228421, …
    business → 22388, 539165, …
    chef → 94341, 363176, 365579, …
    college → 459788, 830339, …
    computer → 598379, 693195, …
    …
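
The slides above pair a small write event with a much larger read-side structure. As a rough illustration of how such read views might be derived from a stream of write events, here is a minimal Python sketch. It is not taken from the talk: the function names `apply_event` and `build_views` are hypothetical, the event shapes are copied from slides 3 and 5 (with the elided description field left out), and indexing a job title by its lowercased words is an assumption about how the keyword index is built.

```python
from collections import defaultdict


def apply_event(event, like_counts, title_index):
    """Fold one write event into the derived read views (hypothetical helper)."""
    action = event.get("action")
    if action == "like":
        # Read view for slide 3: running like count per item.
        # Simplification: repeated likes by the same user are not deduplicated.
        like_counts[event["item_id"]] += 1
    elif action == "add-job-to-profile":
        # Read view for slide 5: keyword -> set of user IDs.
        # Assumption: keywords are the lowercased words of the job title.
        for word in event["job_info"]["job_title"].lower().split():
            title_index[word].add(event["user_id"])


def build_views(events):
    """Replay a sequence of events from the start and return both views."""
    like_counts = defaultdict(int)
    title_index = defaultdict(set)
    for event in events:
        apply_event(event, like_counts, title_index)
    return like_counts, title_index


# Sample events, using the values shown on the slides.
events = [
    {"user_id": 10152654725303061, "action": "like",
     "item_id": 10101851078206231},
    {"user_id": 12526586, "action": "add-job-to-profile",
     "job_info": {"job_title": "Author", "company": "O’Reilly",
                  "start_date": "2013-08", "end_date": None}},
]

like_counts, title_index = build_views(events)
print(like_counts[10101851078206231])
print(sorted(title_index["author"]))
```

Because the views are computed purely by replaying events, they could in principle be discarded and rebuilt from the event log at any time; a real system would also need to handle unlikes, job removals and persistence, which this sketch ignores.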