
Reactive Patterns and Distributed Systems - OUGN 2018


Presented by me and my colleague Arturo Viveros (https://twitter.com/gugalnikov) at OUGN 2018.

Jorge Quilcate

March 08, 2018


Transcript

  1. data on the outside vs data on the inside

    “Going [from monolithic architecture] to SOA is like going from Newton’s physics to Einstein’s physics. Newton’s time marched forward uniformly with instant knowledge at a distance. Before SOA, distributed computing strove to make many systems look like one with RPC, 2PC, etc [...]
  2. data on the outside vs data on the inside

    [...] In Einstein’s universe, everything is relative to one’s perspective. SOA has “now” inside and the “past” arriving in messages.” - Pat Helland
  3. data on the outside vs data on the inside

    “perhaps we should rename the “extract microservice” refactoring operation to “change model of time and space” ;).” - Adrian Colyer
  4. What is Back Pressure?

    • Systems should be designed and implemented to react assertively under sustained load
    • This is often neglected although such conditions are usually expected and highly predictable
    • Negotiating back pressure usually requires some sort of Flow Control
    • Which in simpler terms is the ability to say “hey, slow down!!”, or even “stop it!!”
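The “hey, slow down!!” signal can be sketched with a bounded queue. This is a minimal illustration in Python's `asyncio`, not code from the talk; the names `producer`, `consumer`, `run_pipeline`, and `run`, and the capacity of 4, are assumptions chosen for the example. When the queue is full, `await queue.put(...)` suspends the producer coroutine without blocking a thread, so the consumer's pace limits the publisher's pace.

```python
import asyncio


async def producer(queue, n_items):
    # put() suspends this coroutine while the queue is full:
    # that suspension is the back-pressure signal reaching the publisher.
    for i in range(n_items):
        await queue.put(i)
    await queue.put(None)  # sentinel: no more items


async def consumer(queue, received, peak):
    while True:
        # Record the deepest the buffer ever gets (hypothetical metric for the demo).
        peak[0] = max(peak[0], queue.qsize())
        item = await queue.get()
        if item is None:
            break
        await asyncio.sleep(0)  # stand-in for slow work; yields to the producer
        received.append(item)


async def run_pipeline(n_items, capacity):
    queue = asyncio.Queue(maxsize=capacity)  # bounded buffer
    received, peak = [], [0]
    await asyncio.gather(
        producer(queue, n_items),
        consumer(queue, received, peak),
    )
    return received, peak[0]


def run(n_items=20, capacity=4):
    return asyncio.run(run_pipeline(n_items, capacity))
```

Calling `run(20, 4)` returns every item in order together with the peak buffer depth, which never exceeds the configured capacity. An unbounded queue (`maxsize=0`) would accept everything and let memory grow instead.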
  5. There are Several types of Back Pressure

    • Fast publisher / slow subscriber
    • Insufficient buffer
    • Accumulation without discharge
    • Packet size variation
    • Clogged channel
    • Etc.
  6. Dealing with Back Pressure within Integrations is quite Important…

    Failing to account for it can have a “domino effect”, overloading and eventually crippling many of the participating systems.
  7. The more Complex the Integration, the more Important (and challenging) it is to Handle Back Pressure

    [Diagram labels, in slide order: Meterpoint Data Collection IoT Devices Customer Management Work Order Management Delivery Experience Web Data Warehouse System AQ REST / FTP SOAP Event Streams REST CQRS DSO Central Authorities Grid Companies REST Mobile]
  8. There are Several ways to Handle Back Pressure…

    • Dynamic push / pull
    • Static / Dynamic Quotas
    • Buffers
    • Ack / Nack
    • Throttling
    • Caching
    • Circuit breaker
    • Scaling
    • Some / all of the above
    • Etc…
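One of the listed techniques, throttling, is often built as a token bucket. The sketch below is illustrative only and is not taken from the talk; the class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made so the behaviour can be checked deterministically.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, then at most `rate` requests per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable for tests (assumption of this sketch)
        self.tokens = float(capacity)
        self.last = clock()

    def try_acquire(self):
        # Refill according to elapsed time, capped at capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True   # caller may proceed
        return False      # caller should slow down, retry later, or shed load
```

A caller that receives `False` is being told to back off; whether it then waits, retries, or drops the request is a separate policy decision (the Ack / Nack and buffering items in the list above).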
  9. But make sure you are doing something to handle it

    “Simply put, an unchecked Kafka consumer is a DoS attack on your application waiting to happen” - https://medium.com/@petermelias/kafka-consumer-patterns-and-gotchas-1bfc04cd643b
    “Unbounded queues are fundamentally broken because it puts the correctness of your system in someone else's hand” - @ztellman
  10. And Always Take into Account System’s Reactions to Back Pressure…

    • are highly contagious (Newton’s 3rd law)
    • can propagate exponentially (butterfly effect)
  11. So, a Robust Solution for Back Pressure / Flow Control

    • should always be non-blocking
    • requires some complex service choreography
  12. Backpressure with Akka Streams + Kafka

    • Akka Streams
      ◦ Ideal for Fast Data
      ◦ Seamless Integration
      ◦ Abstraction to configure end-to-end Backpressure
      ◦ Fully Asynchronous & non-blocking
    • Kafka
      ◦ Dynamic durable buffer
      ◦ Horizontally scalable
      ◦ Load distribution
      ◦ No data loss settings (Producer, Consumer, Broker)
      ◦ Quotas
  13. “With the use of asynchronous and back-pressured APIs, we’re able to push our systems to their limits, but not beyond them”

    - Why Reactive?, O’Reilly, Konrad Malawski
  14. Demo: Back-Pressure

    - Fast Producer, Fast Consumers
    - Fast Producer, Fast and Slow Consumers
    - Fast Producer, Fast and Slow Consumers with buffering: OverflowStrategy.dropNew
    - Fast Producer, Fast and Slow Consumers with buffering: OverflowStrategy.backpressure
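The difference between the last two demo scenarios can be approximated with a small tick-based simulation. This is plain Python, not the Akka Streams API and not the demo code from the talk; the strategy strings `"drop_new"` and `"backpressure"` only mirror the names `OverflowStrategy.dropNew` and `OverflowStrategy.backpressure`, and the function `simulate` with its rates and buffer size is a hypothetical setup.

```python
from collections import deque


def simulate(strategy, ticks=10, produce_per_tick=3, consume_per_tick=1, buffer_size=4):
    """Fast producer, slow consumer, bounded buffer in between."""
    buffer = deque()
    next_item = 0   # next element the producer wants to emit
    delivered = []  # elements the consumer actually processed
    dropped = []    # elements discarded because the buffer was full

    for _ in range(ticks):
        for _ in range(produce_per_tick):
            if len(buffer) < buffer_size:
                buffer.append(next_item)
                next_item += 1
            elif strategy == "drop_new":
                # Buffer full: the newly arriving element is discarded.
                dropped.append(next_item)
                next_item += 1
            elif strategy == "backpressure":
                # Buffer full: the producer stops emitting for this tick; nothing is lost.
                break
            else:
                raise ValueError(f"unknown strategy: {strategy}")
        for _ in range(consume_per_tick):
            if buffer:
                delivered.append(buffer.popleft())

    return delivered, dropped
```

With the default settings, `simulate("drop_new")` delivers a sequence with gaps and a non-empty `dropped` list, while `simulate("backpressure")` delivers a gap-free sequence because the producer is slowed to the consumer's rate.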
  15. data input

    # ETL
    * Batches of records (Bounded sequence)
    * Time inside records
    * Denormalized View
    * Stage Tables, Files
    # EAI
    * Individual Messages
    * Batches are discouraged
    * MQ, Web Services
    * Normalized View
    # Stream Processing
    * Streams of Events (Unbounded sequence)
    * Event Log
  16. data processing (i.e. vetro)

    # ETL
    * Transforming operational model into target schema.
    * Denormalization
    # EAI
    * Stateless processing
    * Remote Lookups (e.g. DB queries)
    * Cache for performance (e.g. Coherence)
    * Imperative approach / Side effects
    # Streaming Platform
    * Stateless and *Stateful* processing
    * Local Materialized views / Avoid remote Lookups
    * Functional approach / No side effects
  17. data output

    # ETL
    * Data Warehouse
    # EAI
    * Operational back-ends
    # Streaming Platform
    * Both (Lambda Architecture)
  18. Stream Processing visions

    # Event-Driven Services
    * embedded library in any application
    * event-log + your application
    * makes stream processing accessible to any use case (very low latency)
    # Big data, Real-Time Map-Reduce
    * central CLUSTER
    * custom packaging, deployment & monitoring
    * suitable for analytics-type use-cases (+/- low latency)
  19. immutability changes everything

    “transaction logs record ALL THE CHANGES made to the database [...] THE TRUTH IS IN THE LOG. the database is a cache of a subset of the log.”
    Immutability changes everything - Pat Helland http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
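The idea that the database is a cache of a subset of the log can be sketched as a replay function. The event shape `(op, key, value)` and the names `apply_event` and `replay` are assumptions made for this illustration; they do not come from Helland's paper or from the talk.

```python
def apply_event(state, event):
    """Apply one immutable log entry to the current state."""
    op, key, value = event
    if op == "put":
        state[key] = value
    elif op == "delete":
        state.pop(key, None)
    else:
        raise ValueError(f"unknown operation: {op}")
    return state


def replay(log, upto=None):
    """Rebuild the 'database' by folding over the log, optionally only a prefix of it."""
    state = {}
    for event in log[:upto]:
        apply_event(state, event)
    return state


# The log is append-only; the state is derived and can always be rebuilt from it.
log = [
    ("put", "a", 1),
    ("put", "b", 2),
    ("put", "a", 3),
    ("delete", "b", None),
]
```

Here `replay(log)` yields the latest state, and `replay(log, upto=2)` yields the state as it was after the first two entries, something a database that only keeps current values cannot answer.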
  20. From Caches to Materialized Views

    r = cache.get(key)
    if (!r) {
      r = db.get(key)    // cache miss: read from the database
      cache.put(key, r)  // populate the cache for later reads
    }
    return r

    [Diagram: App, Cache, Database]
  21. From Caches to Materialized Views

    * cache invalidation?
    * race conditions?
    * consistency issues?
    * cold start/bootstrapping?

    [Diagram: App, Cache, Database]
  22. From Caches to Materialized Views

    Traditional Views:
    * *procrastinating* approach
    create view example as select foo from bar …;
    => rewrite result at query time

    Materialized Views:
    * *proactive* approach
    create materialized view example as select foo from bar …;
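The procrastinating versus proactive distinction can be sketched outside SQL as two small classes. Both class names, the `(customer, amount)` row shape, and the `on_insert` hook are hypothetical and chosen only for this illustration.

```python
class QueryTimeView:
    """'Procrastinating' view: nothing is precomputed, every read scans the base rows."""

    def __init__(self, rows):
        self.rows = rows  # shared reference to the base table

    def total_for(self, customer):
        return sum(amount for c, amount in self.rows if c == customer)


class MaterializedView:
    """'Proactive' view: the result is updated on every write, reads are a lookup."""

    def __init__(self):
        self.totals = {}

    def on_insert(self, customer, amount):
        # Called for each new row (for example, by a change stream in a real system).
        self.totals[customer] = self.totals.get(customer, 0) + amount

    def total_for(self, customer):
        return self.totals.get(customer, 0)


def load(inserts):
    """Feed the same inserts to the base table and to the materialized view."""
    rows, materialized = [], MaterializedView()
    for customer, amount in inserts:
        rows.append((customer, amount))
        materialized.on_insert(customer, amount)
    return QueryTimeView(rows), materialized
```

Both views return the same answers; the difference is where the work happens. The query-time view pays on each read, while the materialized view pays a small cost on each write and must be kept in sync with the base data.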
  23. materialized views in an *unbounded database*

    Turning the Database Inside Out - Martin Kleppmann https://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/
  24. Kafka Streams: Streams-Relational Processing Platform

    Build Services on a Backbone of Events - Ben Stopford https://www.confluent.io/blog/build-services-backbone-events/
  25. Observability is for **unknown unknowns**

    Is about making a system more:
    * Debuggable: tracking down failures and bugs
    * Understandable: answer questions, trends
    A Superset of:
    * Monitoring: how to operate a system
    * Instrumentation: how to develop a system to be monitorable
    Observability for Emerging Infra: What Got You Here Won't Get You There - Charity Majors https://www.youtube.com/watch?v=1wjovFSCGhE