Scalability and resilience in practice: current trends and opportunities

Julien Ponge
October 03, 2019

Presentation at the 38th International Symposium on Reliable Distributed Systems (SRDS 2019) in Lyon, France.

Transcript

  1. SRDS Industrial Session
    October 2019 — Lyon, France
    Scalability and resilience in practice:
    current trends and opportunities
    Dr Julien Ponge
    Principal Software Engineer
    Dr Mark Little
    VP Middleware Engineering

  2. Disclaimer
    The following content reflects the views of the authors, not
    necessarily those of Red Hat.
    They do not constitute in any way a binding or legal agreement
    or impose any legal obligation or duty on Red Hat.
    This information is provided for discussion purposes only and
    is subject to change for any or no reason.

  3. Today’s topic
    Scalability + Resilience
    (in practice)
    ❌ Deep learning
    ❌ AI
    ❌ Dark mode
    ❌ Blockchain

  4. Modern Distributed Systems
    “You can’t ignore the network anymore…”

  5. Sample use-case
    Walk 10k steps every day
    Wear a pedometer
    Be congratulated!

  6. Event-driven micro-services
    [Architecture diagram]
    Services: user profile service, activity service, ingestion service, event stats service, congrats service, public API
    Web applications: user webapp, dashboard webapp
    Infrastructure: Kafka topics, MongoDB, PostgreSQL, SMTP, ActiveMQ Artemis
    Protocols: AMQP, HTTP

  7. Elasticity and application state
    Persistent & replicated state
    Micro-service
    (or “function”)
    Events
    Streams
    Other services

  8. Elasticity and application state
    Persistent & replicated state
    Micro-service
    (or “function”)
    Events
    Streams
    State boundaries + lifetime
    Idempotency?
    Other services
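
The idempotency question above matters because event streams typically redeliver messages after a failure, so a handler may see the same event twice. A minimal sketch, assuming every event carries a unique identifier (the `IdempotentStepCounter` class and its `eventId` parameter are hypothetical and not taken from the talk), of a handler that tolerates duplicates:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical handler for the pedometer use-case: each step-count
// event is applied at most once, however many times it is delivered.
final class IdempotentStepCounter {
  private final Set<String> seenEventIds = new HashSet<>();
  private long totalSteps;

  // Returns true if the event was applied, false if it was a duplicate.
  synchronized boolean apply(String eventId, int steps) {
    if (!seenEventIds.add(eventId)) {
      return false; // already processed: replaying it changes nothing
    }
    totalSteps += steps;
    return true;
  }

  synchronized long totalSteps() {
    return totalSteps;
  }
}
```

In a real service the set of processed identifiers would presumably live in the persistent, replicated state shown on the slide, and would need to be bounded, so that a restarted or newly started instance does not re-apply old events.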

  9. Service mesh: connect, secure, control and observe

  10. Density is key
    From https://kubernetes.io/docs/concepts/overview/what-is-kubernetes/

  11. Reactive Systems
    “Searching for resource-efficiency”

  12. Reactive: “Responding to stimuli”
    Reactive systems
      Manifesto, Actor, Messages
      Resilience, Elasticity, Scalability, Asynchronous, non-blocking
      Akka, Vert.x
    Reactive streams
      Data flow, Back-pressure, Non-blocking
      Akka Streams, RxJava, Reactor, Vert.x
    Reactive programming
      Data flow, Events, Observable, Spreadsheets
      Reactor, Reactive Spring, MS Excel, RxJava, Vert.x

  13. Reactive Manifesto
    Responsive: consistent latency
    Elastic: start / stop instances
    Resilient: isolation, replication
    Message Driven: asynchronous, location-transparent

  14. Async I/O to the rescue!
    x 1000 =

  15. Isolation and error management
    Service Database
    Request

    Can we still provide a response?
    Cached data
    Default response
    …or just a timely error
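
The three options above (cached data, a default response, or a timely error) can be sketched with the JDK's `CompletableFuture`. This is only an illustration under stated assumptions: the `FallbackSketch` class, its method names, and the `"default response"` value are hypothetical, and `orTimeout` requires Java 9 or later.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: bounds the wait on a database query and
// decides what to answer when the query is late or fails.
final class FallbackSketch {
  static final String DEFAULT_RESPONSE = "default response";

  // Cached data if we have some, otherwise a default response.
  static CompletableFuture<String> withFallback(
      CompletableFuture<String> dbQuery, Optional<String> cached, long timeoutMillis) {
    return dbQuery
        .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
        .exceptionally(err -> cached.orElse(DEFAULT_RESPONSE));
  }

  // No sensible fallback: fail, but fail within the time budget.
  static CompletableFuture<String> timelyError(
      CompletableFuture<String> dbQuery, long timeoutMillis) {
    return dbQuery.orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
  }
}
```

Either way the caller gets an answer within `timeoutMillis`, which is what keeps a slow database from stalling every service upstream of it.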

  16. Damage control with a circuit breaker
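    call -> Circuit breaker -> Database
    States: Closed, Open, Half-open
    Closed -> Closed: success, or fail (below threshold)
    Closed -> Open: fail (threshold reached)
    Open -> Half-open: reset timeout
    Half-open -> Closed: success
    Half-open -> Open: fail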
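
The state machine on this slide can be sketched in a few lines of plain Java. This is a minimal illustration, not the implementation of any particular library: the `CircuitBreakerSketch` class is hypothetical, it counts consecutive failures (one possible reading of "threshold"), and it takes an injectable clock so that the reset timeout can be exercised without sleeping.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class CircuitBreakerSketch {
  enum State { CLOSED, OPEN, HALF_OPEN }

  private final int failureThreshold;
  private final long resetTimeoutMillis;
  private final LongSupplier clock; // injectable time source (assumption)
  private State state = State.CLOSED;
  private int failures;
  private long openedAt;

  CircuitBreakerSketch(int failureThreshold, long resetTimeoutMillis, LongSupplier clock) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMillis = resetTimeoutMillis;
    this.clock = clock;
  }

  // Open -> Half-open once the reset timeout has elapsed.
  synchronized State state() {
    if (state == State.OPEN && clock.getAsLong() - openedAt >= resetTimeoutMillis) {
      state = State.HALF_OPEN;
    }
    return state;
  }

  <T> T call(Supplier<T> operation, Supplier<T> fallback) {
    if (state() == State.OPEN) {
      return fallback.get(); // fail fast without touching the database
    }
    try {
      T result = operation.get();
      onSuccess();
      return result;
    } catch (RuntimeException e) {
      onFailure();
      return fallback.get();
    }
  }

  // Any success closes the circuit and resets the failure count.
  private synchronized void onSuccess() {
    failures = 0;
    state = State.CLOSED;
  }

  // A failed trial call, or reaching the threshold, opens the circuit.
  private synchronized void onFailure() {
    failures++;
    if (state == State.HALF_OPEN || failures >= failureThreshold) {
      state = State.OPEN;
      openedAt = clock.getAsLong();
    }
  }
}
```

A caller would wrap each database access, for example `breaker.call(() -> queryDatabase(), () -> cachedValue)`, where `queryDatabase` and `cachedValue` are placeholders. Note that this sketch lets several concurrent trial calls through while half-open; a production breaker would usually admit only one.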

  17. Reactive toolkit for the JVM
    All kinds of distributed services
    Resource-friendly
    Fast

  18. Related Research & Opportunities

  19. Async is hard(er)
    (callback hell is just one facet)
    Image from https://adrianalonso.es/desarrollo-web/apis/trabajando-con-promises-pagination-promise-chain/

  20. Taming asynchronous operations
    Promise / Future
      1 item (or none)
        while(stream.hasNext()) {
          stream.fetchNextElement()
            .then(this::storeInDb)
            .then(this::incrementDistributedCounter)
            .catch(this::handleError);
        }
    Reactive extensions
      Hot / cold streams, back-pressure*, functional combinators
        stream.toFlowable()
          .flatMap(db::store)
          .flatMap(distributedCounter::increment)
          .timeout(5, SECONDS)
          .retry(3)
          .subscribe(this::onNext, this::onError, this::onComplete);
    Coroutines / fibers
      Async disguised as regular imperative, rewritten as continuations
        try {
          while(stream.hasNext()) {
            item = stream.next();
            db.store(item);
            distributedCounter.increment();
          }
        } catch (Throwable err) {
          weHaveAProblem(err);
        }
    * only with reactive-streams implementations, not in the original Erik Meijer paper
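
The promise-style snippet on this slide is pseudocode. For comparison, a similar chain can be written with the JDK's `CompletableFuture` alone. The sketch below is an assumption-laden illustration: `PromiseChainSketch`, its in-memory `db` list and its local `counter` stand in for a real database and a distributed counter, and `-1` is an arbitrary error marker.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicLong;

final class PromiseChainSketch {
  final List<String> db = new CopyOnWriteArrayList<>(); // stand-in for a database
  final AtomicLong counter = new AtomicLong();          // stand-in for a distributed counter

  // Asynchronous store: the returned future fails if the item is missing.
  CompletableFuture<String> storeInDb(String item) {
    return CompletableFuture.supplyAsync(() -> {
      if (item == null) {
        throw new IllegalArgumentException("no item");
      }
      db.add(item);
      return item;
    });
  }

  CompletableFuture<Long> incrementCounter() {
    return CompletableFuture.supplyAsync(() -> counter.incrementAndGet());
  }

  // store, then increment, then recover from any error in the chain.
  CompletableFuture<Long> process(String item) {
    return storeInDb(item)
        .thenCompose(stored -> incrementCounter())
        .exceptionally(err -> -1L);
  }
}
```

As on the slide, each future carries a single result (or none), and an error anywhere in the chain skips the remaining steps and lands in the final handler.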

  21. Language and runtime
    Coroutines and reactive extensions
    do not solve all problems
    Asynchronous abstractions
    in programming languages remain an interesting topic!
    Deadlocks
    Soundness
    Expressiveness
    Back-pressure tuning
    Memory exhaustion
    Error handling
    (…)

  22. Compilation and runtime
    JVM
      Open world assumption
      Speculative code generation
      Peak performance
      Needs more RAM
      OpenJDK: backed by 20+ years of research
    Native images
      Closed world assumption
      No JIT compiler
      Boots fast
      Needs less RAM
      GraalVM from Oracle Labs (with Linz University and more)

  23. Powered by and more!

  24. Distributed consensus
    { active_users : 32168 }
    [123, “Lyon”, 10000]
    Replicated data stores
    Discovery, global state, …

  25. Raft
    “Paxos, but with bolts and nuts to implement it”

  26. Flexible Paxos (Heidi Howard)

  27. These are exciting times
    for research and practice
    in distributed systems!

  28. Thank you
    linkedin.com/company/red-hat
    youtube.com/user/RedHatVideos
    facebook.com/redhatinc
    twitter.com/RedHat
    Red Hat is the world’s leading provider of enterprise
    open source software solutions. Award-winning
    support, training, and consulting services make
    Red Hat a trusted adviser to the Fortune 500.