
Gwen Shapira on Realtime Data Processing at Facebook


Realtime data processing powers many use cases at Facebook, including realtime reporting of the aggregated, anonymized voice of Facebook users, analytics for mobile applications, and insights for Facebook page administrators. Many companies have developed their own systems; we have a realtime data processing ecosystem at Facebook that handles hundreds of Gigabytes per second across hundreds of data pipelines.

Many decisions must be made while designing a realtime stream processing system. In this paper, we identify five important design decisions that affect their ease of use, performance, fault tolerance, scalability, and correctness. We compare the alternative choices for each decision and contrast what we built at Facebook to other published systems.

Our main decision was targeting seconds of latency, not milliseconds. Seconds is fast enough for all of the use cases we support and it allows us to use a persistent message bus for data transport. This data transport mechanism then paved the way for fault tolerance, scalability, and multiple options for correctness in our stream processing systems Puma, Swift, and Stylus...

Papers_We_Love

June 26, 2017

Transcript

  1. 1
    Papers We Love:
    Realtime Data Processing at Facebook
    Gwen Shapira
    Confluent Inc.


  2. 2
    Papers We Love:
    Realtime Data Processing at Facebook


  3. 3
    Published in 2016 (!)


  4. 4
    What kind of paper is this?


  5. 5
    This is NOT the one true architecture.
    Please don’t cargo-cult this paper


  6. 6
    A few real-time systems at Facebook
    • Chorus – aggregate trends
    • Realtime feedback for mobile app developers
    • Page analytics – likes, engagement…
    • Offload CPU-intensive dashboard queries


  7. 10
    Looking for trending topics in 5 minute windows
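
A minimal sketch of the five-minute windowing idea on this slide, assuming events arrive as `(timestamp_seconds, topic)` pairs. The function name, the event shape, and the sample data are hypothetical and are not taken from the paper or from Facebook's systems; the sketch only shows how a tumbling window assigns each event to a bucket before counting.

```python
from collections import Counter, defaultdict

WINDOW_SECONDS = 5 * 60  # five-minute tumbling windows, as on the slide


def trending_by_window(events, top_n=1):
    """Count topics per tumbling window and return the top_n per window.

    events: iterable of (timestamp_seconds, topic) pairs (hypothetical shape).
    Returns {window_start_seconds: [(topic, count), ...]}.
    """
    counts = defaultdict(Counter)
    for ts, topic in events:
        # Round the timestamp down to the start of its window.
        window_start = ts - (ts % WINDOW_SECONDS)
        counts[window_start][topic] += 1
    return {w: c.most_common(top_n) for w, c in sorted(counts.items())}


# Made-up sample events for illustration only.
events = [(0, "cats"), (10, "dogs"), (200, "cats"),
          (310, "dogs"), (320, "dogs"), (599, "cats")]
result = trending_by_window(events)
```

Here the first window (0 to 299 seconds) is led by "cats" and the second (300 to 599 seconds) by "dogs". A real pipeline would also have to decide what to do with late events, which this sketch ignores.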


  8. 11
    The Tofu & Potatoes of the paper:
    Design Decisions


  9. 12
    [Slide image not captured in transcript; surviving annotations: "/ KafkaStreams", "+ exactly once"]


  10. 13
    Decision #1 – Language Paradigm
    • Declarative (SQL) – easy & limited
    • Functional
    • Procedural (C++, Java, Python) – most flexibility, control, performance. Longer dev cycle.


  12. 15
    Decision #2: Data Transfer
    • RPC (Millwheel, Flink, SparkStreaming)
        • All about speed
    • Message-forwarding broker (Heron)
        • Applies back-pressure, multiplex
    • Persistent stream storage (Samza, Kafka’s Stream API)
        • Most reliable
        • Decouples processors
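
The "decouples processors" point can be sketched with a toy in-memory log, assuming each consumer tracks its own read offset. `PersistentLog` and `Consumer` are hypothetical names for illustration; they stand in for a durable store such as Scribe or Kafka and do not reproduce either system's API.

```python
class PersistentLog:
    """Toy append-only log; a real one would persist records durably."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the new record

    def read(self, offset, max_records=100):
        return self._records[offset:offset + max_records]


class Consumer:
    """Reads from the log at its own pace, tracking its own offset."""

    def __init__(self, log, handler):
        self.log = log
        self.handler = handler
        self.offset = 0

    def poll(self):
        batch = self.log.read(self.offset)
        for record in batch:
            self.handler(record)
        self.offset += len(batch)
        return len(batch)


log = PersistentLog()
for record in ["a", "b", "c"]:
    log.append(record)

fast_seen, slow_seen = [], []
fast = Consumer(log, fast_seen.append)
slow = Consumer(log, slow_seen.append)

fast.poll()       # the fast consumer catches up immediately
log.append("d")   # the producer never waits for either consumer
fast.poll()
slow.poll()       # the slow consumer reads everything later, unaffected
```

Because the log holds the data, a slow or restarted consumer does not push back on the producer or on other consumers; it simply resumes from its own offset.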


  13. 16
    Decision #2: Data Transfer


  14. 17
    Love Song to Scribe
    Independent stream processing nodes
    And storing inputs / outputs
    Made everything great


  15. 18
    Decision #3 – Processing Semantics


  16. 19
    Decision #3 – Processing Semantics
    Facebook Verdict: It depends on requirements
    • Ranker writes to idempotent system – at least once
    • Scuba can tolerate data loss but cannot handle duplicates – at most once
    • …. Exactly once is REALLY HARD and requires transactions
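
The difference between the first two semantics comes down to ordering: whether the processor saves its checkpoint before or after emitting output. The sketch below simulates a single crash between those two steps. All names (`run`, `run_with_one_crash`, the `checkpoint` dict) are hypothetical and chosen for illustration.

```python
def run(records, start_offset, sink, checkpoint, semantics, crash_at=None):
    """Process records from start_offset onward.

    crash_at simulates a failure between the two steps for that offset.
    """
    for offset in range(start_offset, len(records)):
        if semantics == "at_most_once":
            checkpoint["offset"] = offset + 1   # checkpoint first ...
            if offset == crash_at:
                raise RuntimeError("simulated crash")
            sink.append(records[offset])        # ... then emit
        else:  # "at_least_once"
            sink.append(records[offset])        # emit first ...
            if offset == crash_at:
                raise RuntimeError("simulated crash")
            checkpoint["offset"] = offset + 1   # ... then checkpoint


def run_with_one_crash(records, semantics, crash_at):
    """Run, crash once at crash_at, then restart from the saved checkpoint."""
    sink, checkpoint = [], {"offset": 0}
    try:
        run(records, 0, sink, checkpoint, semantics, crash_at)
    except RuntimeError:
        pass  # the process "restarts" below
    run(records, checkpoint["offset"], sink, checkpoint, semantics)
    return sink


at_least = run_with_one_crash(["a", "b", "c"], "at_least_once", crash_at=1)
at_most = run_with_one_crash(["a", "b", "c"], "at_most_once", crash_at=1)
```

With emit-then-checkpoint, record "b" is written twice, which is harmless if the sink is idempotent. With checkpoint-then-emit, "b" is lost, which is acceptable only if the sink tolerates missing data.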


  17. 20
    Don’t miss the side-note on side-effects
    • Exactly once means writing output + offsets to a transactional system
    • This takes time
    • Why just wait when you can deserialize? And maybe do other stateless stuff?
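
A minimal sketch of "writing output + offsets to a transactional system", using Python's built-in `sqlite3` as a stand-in for the transactional store. The table names, the `process` function, and the uppercase "transformation" are assumptions made for illustration; the paper's systems use their own storage. The key point is that the output row and the new offset commit or roll back together.

```python
import sqlite3


def open_store():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE output (value TEXT)")
    db.execute("CREATE TABLE checkpoint (id INTEGER PRIMARY KEY, next_offset INTEGER)")
    db.execute("INSERT INTO checkpoint VALUES (0, 0)")
    db.commit()
    return db


def process(db, records, crash_at=None):
    """Resume from the stored offset; return False if a simulated crash occurs."""
    (start,) = db.execute("SELECT next_offset FROM checkpoint").fetchone()
    for offset in range(start, len(records)):
        try:
            with db:  # commits on success, rolls back if an exception escapes
                db.execute("INSERT INTO output VALUES (?)", (records[offset].upper(),))
                if offset == crash_at:
                    raise RuntimeError("simulated crash")
                db.execute("UPDATE checkpoint SET next_offset = ?", (offset + 1,))
        except RuntimeError:
            return False
    return True


db = open_store()
records = ["a", "b", "c"]
first_run_ok = process(db, records, crash_at=1)   # crashes while handling "b"
after_crash = [row[0] for row in db.execute("SELECT value FROM output")]
second_run_ok = process(db, records)              # restart resumes at "b"
final = [row[0] for row in db.execute("SELECT value FROM output")]
```

The crash rolls back the half-written "B", so the restart neither loses nor duplicates it. The commit is the slow step the slide refers to; a processor could deserialize or do other side-effect-free work for the next batch while waiting, which this single-threaded sketch does not attempt.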


  18. 21
    Decision #4 – State Saving
    • In-memory state with replication (Old VoltDB)
        • Requires lots of hardware and network
    • Local database (Samza, Kafka Streams API)
    • Remote database (Millwheel)
    • Upstream (i.e. replay everything on failure)
    • Global consistent snapshot (Flink)


  19. 22
    Decision #4 – State Saving
    Facebook Verdict: It depends
    [Slide image labels: Rhode Island, Alaska]


  20. 23
    Best Part of the Paper – by far
    How to efficiently work with state in remote DB?
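
The transcript records the question but not the answer. One common answer, assumed here rather than quoted from the slide, applies when updates are associative (counters, sums): aggregate partial state locally and send it as a server-side merge instead of doing a read-modify-write per event. `FakeRemoteDB` and its `merge_add` method are hypothetical stand-ins for a remote key-value store with a merge operation.

```python
from collections import Counter


class FakeRemoteDB:
    """Hypothetical remote key-value store that counts round trips."""

    def __init__(self):
        self.data = {}
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self.data.get(key, 0)

    def put(self, key, value):
        self.round_trips += 1
        self.data[key] = value

    def merge_add(self, key, delta):
        # Server-side merge: the caller does not need to read first.
        self.round_trips += 1
        self.data[key] = self.data.get(key, 0) + delta


def read_modify_write(db, events):
    """One read and one write per event."""
    for key in events:
        db.put(key, db.get(key) + 1)


def batched_merge(db, events):
    """Aggregate locally, then send one merge per distinct key."""
    for key, delta in Counter(events).items():
        db.merge_add(key, delta)


events = ["a", "b", "a", "a", "b"]
rmw_db, merge_db = FakeRemoteDB(), FakeRemoteDB()
read_modify_write(rmw_db, events)
batched_merge(merge_db, events)
```

Both stores end with the same counts, but the read-modify-write path costs ten round trips for five events while the merge path costs two. The saving depends on the update being associative; state that needs the previous value to compute the next one still requires a read.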


  21. 24
    Decision #5 - Reprocessing
    • Stream only – requires long retention in the stream store
    • Maintain both batch and stream systems
    • Develop systems that can run in streams and batch (Flink, Spark)


  22. 25
    Decision #5 - Reprocessing
    • Stream only – requires long retention in the stream store
    • Maintain both batch and stream systems
    • Develop systems that can run in streams and batch (Flink, Spark)
    Facebook Verdict:
    SQL runs everywhere
    And binary generation FTW


  23. 26
    Applications – Or a whirlwind tour of good patterns
    One example:


  24. 27
    Lessons Learned!
    The biggest win is pipelines composed of independent processors
    • Mixing multiple systems let us move fast
    • High level abstractions let us improve implementation
    • Ease of debugging – Independent nodes and ability to replay
    • Ease of deployment – Puma as-a-service
    • Ease of monitoring – Lag is the most important metric. Everything is instrumented out of the box.
    • In the future – auto-scale based on lag
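
A rough sketch of lag as a metric and of a lag-based scaling rule, assuming lag is measured as the gap between the newest offset in the stream and the offset a processor has finished. The function names and the per-worker target are hypothetical; the slide only says that auto-scaling on lag is planned, not how it would work.

```python
import math


def lag(latest_offset, processed_offset):
    """Records written to the stream but not yet processed."""
    return max(0, latest_offset - processed_offset)


def desired_workers(total_lag, target_lag_per_worker, max_workers):
    """Pick a worker count so each worker handles at most the target lag."""
    needed = math.ceil(total_lag / target_lag_per_worker)
    return min(max_workers, max(1, needed))


# Made-up offsets for two partitions of one pipeline.
total = lag(10_000, 8_500) + lag(7_000, 6_000)
workers = desired_workers(total, target_lag_per_worker=1_000, max_workers=8)
```

With a combined lag of 2,500 records and a target of 1,000 per worker, the rule asks for three workers. A production policy would likely smooth the signal over time to avoid flapping, which is left out here.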


  25. 28
    Thank You!
