Reactive Patterns and Distributed Systems - OUGN 2018

reactive patterns and distributed systems [email protected] [email protected]

agenda
• Intro to Reactive Systems
• Back-Pressure
• Stream-Processing
• Observability

Reactive Systems

data on the outside vs data on the inside
“Going [from monolithic architecture] to SOA is like going from Newton’s physics to Einstein’s physics. Newton’s time marched forward uniformly with instant knowledge at a distance. Before SOA, distributed computing strove to make many systems look like one with RPC, 2PC, etc [...]

data on the outside vs data on the inside
[...] In Einstein’s universe, everything is relative to one’s perspective. SOA has “now” inside and the “past” arriving in messages.”
- Pat Helland

data on the outside vs data on the inside
“perhaps we should rename the “extract microservice” refactoring operation to “change model of time and space” ;).”
- Adrian Colyer

distributed systems & complexity

“we want more, of everything, and we want it now”
- YOUR CUSTOMERS

Reactive Manifesto
[diagram: Value, Form, Means]

Adopting Reactive Patterns (i.e. Ivy Pattern)
- Why Reactive?, O’Reilly, Konrad Malawski

Back-Pressure

What is Back Pressure?
• Systems should be designed and implemented to react assertively under sustained load
• This is often neglected although such conditions are usually expected and highly predictable
• Negotiating back pressure usually requires some sort of Flow Control
• Which in simpler terms is the ability to say “hey, slow down!!”, or even “stop it!!”

There are Several types of Back Pressure
• Fast publisher / slow subscriber
• Insufficient buffer
• Accumulation without discharge
• Packet size variation
• Clogged channel
• Etc.

Dealing with Back Pressure within Integrations is quite Important…
Failing to account for it can have a “domino effect”, overloading and eventually crippling many of the participating systems.

The more Complex the Integration, the more Important (and challenging) it is to Handle Back Pressure
[diagram labels: Meterpoint Data Collection, IoT Devices, Customer Management, Work Order Management, Delivery Experience, Web, Mobile, Data Warehouse, System AQ, REST / FTP, SOAP, Event Streams, REST, CQRS, DSO, Central Authorities, Grid Companies]

There are Several ways to Handle Back Pressure…
• Dynamic push / pull
• Static / Dynamic Quotas
• Buffers
• Ack / Nack
• Throttling
• Caching
• Circuit breaker
• Scaling
• Some / all of the above
• Etc…
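One of the options listed above, the circuit breaker, can be sketched in a few lines. This is a minimal illustration, not taken from the talk: the class name `CircuitBreaker`, its parameters (`max_failures`, `reset_after`) and the injected `clock` function are all hypothetical. After a number of consecutive failures the breaker rejects calls immediately, which gives the struggling downstream system time to recover instead of piling more load onto it.

```python
class CircuitBreaker:
    """Hypothetical minimal circuit breaker; names and thresholds are illustrative."""

    def __init__(self, max_failures, reset_after, clock):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to stay open before a trial call
        self.clock = clock                # function returning the current time
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast without touching the downstream system.
                raise RuntimeError("circuit open: call rejected")
            # Half-open: allow one trial call; a single failure re-opens the circuit.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # success closes the circuit again
        return result
```

Passing the clock in as a function keeps the sketch deterministic; in real use it could be something like `time.monotonic`.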

But make sure you are doing something to handle it
“Simply put, an unchecked Kafka consumer is a DoS attack on your application waiting to happen”
- https://medium.com/@petermelias/kafka-consumer-patterns-and-gotchas-1bfc04cd643b
“Unbounded queues are fundamentally broken because it puts the correctness of your system in someone else's hand”
- @ztellman

And Always Take into Account that Systems’ Reactions to Back Pressure…
• are highly contagious (Newton’s 3rd law)
• can propagate exponentially (butterfly effect)

So, a Robust Solution for Back Pressure / Flow Control
• should always be non-blocking
• requires some complex service choreography

Backpressure with Akka Streams + Kafka
• Akka Streams
  ◦ Ideal for Fast Data
  ◦ Seamless Integration
  ◦ Abstraction to configure end-to-end Backpressure
  ◦ Fully Asynchronous & non-blocking
• Kafka
  ◦ Dynamic durable buffer
  ◦ Horizontally scalable
  ◦ Load distribution
  ◦ No data loss settings (Producer, Consumer, Broker)
  ◦ Quotas

“With the use of asynchronous and back-pressured APIs, we’re able to push our systems to their limits, but not beyond them”
- Why Reactive?, O’Reilly, Konrad Malawski

Demo: Back-Pressure
- Fast Producer, Fast Consumers
- Fast Producer, Fast and Slow Consumers
- Fast Producer, Fast and Slow Consumers with buffering: OverflowStrategy.dropNew
- Fast Producer, Fast and Slow Consumers with buffering: OverflowStrategy.backpressure
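The difference between the two buffering scenarios in the demo can be approximated with a small tick-based simulation. This is not the Akka Streams API and not the demo code: the function `run` and the strategy names `"drop_new"` and `"backpressure"` are hypothetical stand-ins that only mirror the intent of `OverflowStrategy.dropNew` and `OverflowStrategy.backpressure`. The producer tries to emit one item per tick, while the consumer takes one item every `consume_every` ticks.

```python
from collections import deque


def run(strategy, items, buffer_size, consume_every):
    """Simulate a fast producer and a slow consumer sharing a bounded buffer.

    Returns (consumed, dropped, ticks).
    """
    if strategy not in ("drop_new", "backpressure"):
        raise ValueError(f"unknown strategy: {strategy}")
    buffer = deque()
    pending = deque(items)
    consumed, dropped = [], []
    tick = 0
    while pending or buffer:
        tick += 1
        if pending:
            if len(buffer) < buffer_size:
                buffer.append(pending.popleft())
            elif strategy == "drop_new":
                # Buffer full: the newly arriving element is discarded.
                dropped.append(pending.popleft())
            # "backpressure": the producer keeps the element and retries next tick.
        if tick % consume_every == 0 and buffer:
            consumed.append(buffer.popleft())
    return consumed, dropped, tick


lossy = run("drop_new", range(6), buffer_size=2, consume_every=3)
safe = run("backpressure", range(6), buffer_size=2, consume_every=3)
```

With dropping, the producer finishes quickly but some elements never reach the consumer. With back-pressure, nothing is lost and the producer is slowed to the consumer's pace, so the run takes more ticks.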

Stream Processing and Event-Driven Services

“MAKING SURE **ALL DATA** ENDS UP IN THE **RIGHT PLACES**”
data integration

data input
# ETL
* Batches of records (Bounded sequence)
* Time inside records
* Denormalized View
* Stage Tables, Files
# EAI
* Individual Messages
* Batches are discouraged
* MQ, Web Services
* Normalized View
# Stream Processing
* Streams of Events (Unbounded sequence)
* Event Log

data processing (i.e. vetro)
# ETL
* Transforming operational model into target schema.
* Denormalization
# EAI
* Stateless processing
* Remote Lookups (e.g. DB queries)
* Cache for performance (e.g. Coherence)
* Imperative approach / Side effects
# Streaming Platform
* Stateless and *Stateful* processing
* Local Materialized views / Avoid remote Lookups
* Functional approach / No side effects

data output
# ETL
* Data Warehouse
# EAI
* Operational back-ends
# Streaming Platform
* Both (Lambda Architecture)

Stream Processing visions
# Event-Driven Services
* embedded library in any application
* event-log + your application
* makes stream processing accessible to any use case (very low latency)
# Big data, Real-Time Map-Reduce
* central CLUSTER
* custom packaging, deployment & monitoring
* suitable for analytics-type use-cases (+/- low latency)

One more reason to start with event-driven stream processing

state and events: “the future is a function of the past”
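Read literally, the quote says that current state can be derived by folding a function over the events seen so far. The sketch below illustrates that idea under assumed names: the event shape `(kind, account, amount)` and the functions `apply_event` and `replay` are hypothetical, chosen only for the example.

```python
from functools import reduce


def apply_event(state, event):
    """Return the next state given the current state and one event.

    The state is never mutated; each step builds a new dict.
    """
    kind, account, amount = event
    balance = state.get(account, 0)
    if kind == "deposited":
        balance += amount
    elif kind == "withdrawn":
        balance -= amount
    else:
        return state  # unknown event kinds are ignored in this sketch
    return {**state, account: balance}


def replay(events, initial=None):
    """Derive state as a left fold of apply_event over the event sequence."""
    return reduce(apply_event, events, initial or {})


events = [
    ("deposited", "a", 100),
    ("withdrawn", "a", 30),
    ("deposited", "b", 5),
]
current = replay(events)
```

Because the state is a pure function of the log, replaying a prefix of the events reproduces the state as it was at that point in time.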

immutability changes everything
“transaction logs record ALL THE CHANGES made to the database [...] THE TRUTH IS IN THE LOG. the database is a cache of a subset of the log.”
- Immutability Changes Everything, Pat Helland http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf

From Caches to Materialized Views
r = cache.get(key)
if (!r) {
    r = db.get(key)
    cache.put(key, r)
}
return r
[diagram: App, Cache, Database]

From Caches to Materialized Views
* cache invalidation?
* race conditions?
* consistency issues?
* cold start/bootstrapping?
[diagram: App, Cache, Database]

From Caches to Materialized Views
Traditional Views:
* *procrastinating* approach
create view example as select foo from bar …;
=> the query is rewritten and evaluated at query time
Materialized Views:
* *proactive* approach
create materialized view example as select foo from bar …;
=> the result is computed ahead of time and stored
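The contrast between the two approaches can be shown without a database. In this minimal sketch, the class `Orders` and its methods are hypothetical names for illustration. The traditional view recomputes its answer from the base rows on every query, while the materialized view is updated as each change arrives and is simply read at query time.

```python
class Orders:
    """Hypothetical in-memory table with one derived view: total per customer."""

    def __init__(self):
        self.rows = []               # base table: (customer, amount)
        self.total_by_customer = {}  # materialized view, maintained on write

    def insert(self, customer, amount):
        self.rows.append((customer, amount))
        # Proactive: update the stored result as the change arrives.
        previous = self.total_by_customer.get(customer, 0)
        self.total_by_customer[customer] = previous + amount

    def total_view(self, customer):
        # "Procrastinating": scan the base rows at query time.
        return sum(amount for c, amount in self.rows if c == customer)

    def total_materialized(self, customer):
        # Reads the precomputed result; no scan of the base rows.
        return self.total_by_customer.get(customer, 0)


orders = Orders()
orders.insert("a", 10)
orders.insert("b", 5)
orders.insert("a", 7)
```

Both reads return the same totals; the difference is when the work happens. The materialized variant pays on every write so that queries stay cheap, which is the trade-off the slide labels "proactive".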

materialized views in an *unbounded database*
Turning the Database Inside Out - Martin Kleppmann https://www.confluent.io/blog/turning-the-database-inside-out-with-apache-samza/

Kafka Streams: Streams-Relational Processing Platform
Build Services on a Backbone of Events - Ben Stopford https://www.confluent.io/blog/build-services-backbone-events/

Demo: Comparing an EAI Integration with Stream Processing

observability

Observability is for **unknown unknowns**
Is about making a system more:
* Debuggable: tracking down failures and bugs
* Understandable: answering questions, seeing trends
A Superset of:
* Monitoring: how to operate a system
* Instrumentation: how to develop a system to be monitorable
Observability for Emerging Infra: What Got You Here Won't Get You There - Charity Majors https://www.youtube.com/watch?v=1wjovFSCGhE

Observability methods

> demo: understanding what your systems are and what they could become

sysco reactive integration platform

Thanks! github.com/sysco-middleware/talk-observability-tracing-kafka


More Decks by Jorge Quilcate
