Reliable_Event_Pipeline___scale.pdf

Ananth Packkildurai

February 27, 2019

Transcript

  1. Reliable Events Pipeline
    Ananth Packkildurai
    February 27, 2019

  2. Events
    “An event is a single occurrence within an environment, usually
    involving an attempted state change.”

  3. Logs
    “A log is a collection of event records.”

  4. Logs @ Slack
    2M events per second
    4 Kafka clusters
    3TB per hour

  5. Me
    ➢ @ananthdurai
    ➢ Data Infrastructure Engineer @ Slack
    ➢ Passionate about all things related to ethical data management

  6. Team REP
    Derek Smith
    Jackson Argo

  7. About Slack
    Public launch: 2014
    1000+ employees across 7 countries worldwide
    HQ in San Francisco
    $841M in capital raised
    Key investors include Softbank, Accel, a16z, Social Capital, Index,
    Thrive, GV, Kleiner Perkins, GGV, Horizons, Spark, IVP and DST.
    Diverse set of industries including software/technology, retail,
    media, telecom and professional services.

  8. An unprecedented adoption rate

  9. Data Decisions

  10. Growth Metrics

  11. Service Quality Metrics

  12. Billing Metrics

  13. How did we start?

  14. Is it reliable?

  15. REP Characteristics
    Trust in Logs

  16. REP Characteristics
    Trust in Logs
    High Availability

  17. REP Characteristics
    Trust in Logs
    High Availability
    Low Latency

  18. REP Characteristics
    Trust in Logs
    High Availability
    Low Latency
    Efficient

  19. REP Characteristics
    Trust in Logs
    High Availability
    Low Latency
    Efficient

  20. REP pipeline

  21. Murron logging agent
    Murron is a sidecar that runs on each instance and collects logs
    from the host and its containers
    ● Guarantees at-least-once message delivery
    ● Supports retry, back pressure and configurable dynamic routing
    ● Supports gRPC, TCP, HTTP and Unix domain sockets
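
The delivery guarantees listed for Murron can be illustrated with a minimal sketch. Nothing below is Murron's actual code or API: the class `AtLeastOnceSender`, its parameters, and the `send` callback are hypothetical names chosen for illustration. The sketch assumes a sink that returns `True` once it has acknowledged a message. A message leaves the buffer only after that acknowledgement (at-least-once), failed sends are retried with exponential backoff (retry), and a bounded buffer refuses new messages when full (back pressure).

```python
import time
from collections import deque


class AtLeastOnceSender:
    """Hypothetical agent-side sender; not Murron's real implementation."""

    def __init__(self, send, max_buffered=1000, max_retries=5, base_delay=0.01):
        self._send = send                  # callable(message) -> bool, True means acknowledged
        self._buffer = deque()             # messages not yet acknowledged by the sink
        self._max_buffered = max_buffered
        self._max_retries = max_retries
        self._base_delay = base_delay

    def offer(self, message):
        """Buffer a message; return False when full so the producer can slow down."""
        if len(self._buffer) >= self._max_buffered:
            return False
        self._buffer.append(message)
        return True

    def flush(self):
        """Send buffered messages in order and return how many were acknowledged."""
        delivered = 0
        while self._buffer:
            message = self._buffer[0]      # peek: keep the message until it is acknowledged
            for attempt in range(self._max_retries):
                if self._send(message):
                    break
                time.sleep(self._base_delay * (2 ** attempt))
            else:
                return delivered           # sink still failing; message stays buffered
            self._buffer.popleft()
            delivered += 1
        return delivered
```

Because a message is resent whenever its acknowledgement is not seen, the receiving side may observe duplicates. A per-message unique ID is one common way to deduplicate downstream, though whether the pipeline uses its UID that way is an assumption here.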

  22. Murron Protocol

  23. UID

  24. Message Signature

  25. Container

  26. Measuring Reliability
    Log correctness: Did we log correctly?
    Log reliability: Are we missing any data?
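
One way to answer "Are we missing any data?" is to compare the message IDs emitted at the source with the IDs observed at the destination. The sketch below is an assumption about how such a check could look, not the method shown on the slides. The function `reliability_report` and its inputs are hypothetical, and it assumes every message carries a unique ID.

```python
from collections import Counter


def reliability_report(sent_uids, received_uids):
    """Compare IDs emitted at the source with IDs observed at the sink (illustrative only)."""
    sent = set(sent_uids)
    received = Counter(received_uids)
    missing = sent.difference(received)    # emitted but never observed downstream
    duplicates = {uid: n for uid, n in received.items() if n > 1}
    delivered = len(sent) - len(missing)
    return {
        "delivery_ratio": delivered / len(sent) if sent else 1.0,
        "missing": sorted(missing),
        "duplicates": duplicates,
    }
```

Under at-least-once delivery, duplicates are expected now and then, so the report lists them separately from losses instead of counting them as errors.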

  27. Log reliability

  28. Log reliability

  29. Log Inspector

  30. Apache Pinot
    Pinot is a realtime distributed OLAP datastore
    ● A column-oriented database with various compression schemes
    such as Run Length, Fixed Bit Length
    ● Pluggable indexing technologies - Sorted Index, Bitmap Index,
    Inverted Index
    ● Near-real-time ingestion from Kafka and batch ingestion from
    Hadoop
    ● SQL-like language that supports selection, aggregation,
    filtering, group by, order by, distinct queries on fact data
    ● Horizontally scalable and fault tolerant
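
The two compression schemes named above can be illustrated in a few lines. This is a generic sketch of the ideas, not Pinot's storage format or API, and all function names are hypothetical. Run-length encoding stores each run of repeated values once with a count, which pays off on sorted or low-cardinality columns. Fixed-bit-length encoding stores dictionary IDs using only as many bits as the number of distinct values requires.

```python
def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1               # extend the current run
        else:
            runs.append([value, 1])        # start a new run
    return [(value, count) for value, count in runs]


def run_length_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]


def fixed_bit_width(distinct_count):
    """Bits needed per dictionary ID for a column with `distinct_count` distinct values."""
    return max(1, (distinct_count - 1).bit_length())
```

For example, a column with four distinct values needs 2 bits per entry instead of a full 32-bit integer.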

  31. REP extended

  32. Log Inspector

  33. Thank You!
    For more information go to: slack.com
