Resilient distributed systems with Netflix Hystrix
Hystrix is a latency and fault tolerance library designed to enable resilience in distributed systems where failure is inevitable.
Oleksiy will introduce this library and show how and why they use it in mission-critical projects.
o Preventing any single dependency from using all container (Tomcat, etc.) user threads
o Shedding load and failing fast instead of queueing
o Providing fallbacks wherever feasible to protect users from failure
o Real-time metrics and monitoring
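The first three points above can be illustrated without Hystrix at all. The following is a minimal plain-Java sketch, not Hystrix's implementation: the class name `IsolatedDependency` and its parameters are hypothetical. It gives one dependency its own small thread pool with no queue, so a saturated pool rejects work immediately, and it returns a fallback value on rejection, timeout, or failure.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not a Hystrix class): isolates one dependency behind its own pool.
class IsolatedDependency<T> {
    private final ThreadPoolExecutor pool;
    private final long timeoutMillis;

    IsolatedDependency(int maxThreads, long timeoutMillis) {
        // SynchronousQueue holds no tasks: when every thread is busy, submit() is
        // rejected immediately (default AbortPolicy) instead of queueing.
        this.pool = new ThreadPoolExecutor(maxThreads, maxThreads, 0L,
                TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        this.timeoutMillis = timeoutMillis;
    }

    T call(Callable<T> work, T fallback) {
        Future<T> future;
        try {
            future = pool.submit(work);
        } catch (RejectedExecutionException e) {
            return fallback; // pool saturated: shed load and fail fast
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the slow call so the thread is freed
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the dependency call itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

Because the pool is owned by a single dependency, a slow or failing downstream service can exhaust only its own threads and never the container's request threads.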
import com.netflix.hystrix.HystrixCommand;
import com.netflix.hystrix.HystrixCommandGroupKey;

public class CommandHelloWorld extends HystrixCommand<String> {

    private final String name;

    public CommandHelloWorld(String name) {
        // The group key is used to group commands for reporting and thread pools
        super(HystrixCommandGroupKey.Factory.asKey("ExampleGroup"));
        this.name = name;
    }

    @Override
    protected String run() {
        // a real example would do work like a network call here
        return "Hello " + name + "!";
    }
}

// In a JUnit test class:
@Test
public void testExecute() {
    assertEquals("Hello World!", new CommandHelloWorld("World").execute());
}
Session storage with a fallback datagrid (sacrificing consistency for availability):
o App container (Jetty, Tomcat, etc.) with the Spring Session Filter
o Primary datagrid
o Secondary datagrid (WAN replication) as the fallback
o You might not need this if the entire infrastructure is replicated in another DC
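The primary/secondary arrangement above might look roughly like the following plain-Java sketch. The `SessionStore` interface and `FallbackSessionReader` class are hypothetical names introduced for illustration; the slides do not show the actual datagrid API or how it is wired into a Hystrix command. Treating a double failure as "no session" is also an assumption of this sketch.

```java
import java.util.Optional;

// Hypothetical abstraction over a datagrid; the real datagrid API is not shown in the slides.
interface SessionStore {
    Optional<String> load(String sessionId) throws Exception;
}

// Reads from the primary datagrid and falls back to the secondary one on failure.
class FallbackSessionReader {
    private final SessionStore primary;
    private final SessionStore secondary;

    FallbackSessionReader(SessionStore primary, SessionStore secondary) {
        this.primary = primary;
        this.secondary = secondary;
    }

    Optional<String> load(String sessionId) {
        try {
            return primary.load(sessionId);
        } catch (Exception primaryFailure) {
            try {
                // The replicated copy may lag behind the primary:
                // consistency is traded for availability.
                return secondary.load(sessionId);
            } catch (Exception secondaryFailure) {
                // Assumption of this sketch: with both grids down, behave as if no session exists.
                return Optional.empty();
            }
        }
    }
}
```

With Hystrix, the primary read would typically live in `run()` and the secondary read in `getFallback()`, so the circuit breaker and timeouts also apply to the primary datagrid.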
Command execution flow:
o .execute() is called
o Circuit breaker open? If yes, short-circuit to .getFallback(); if no, continue
o Thread pool rejected? If yes, reject and go to .getFallback(); if no, continue
o .run() is invoked. If execution fails or times out, go to .getFallback(); otherwise return the result of run()
o Fallback successful? If yes, return the result of the fallback; if no, return an exception
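The flow above can be sketched in plain Java. This is a deliberately simplified illustration and not how Hystrix is implemented: the class `SimpleCircuitBreaker` is hypothetical, it opens after a number of consecutive failures rather than using Hystrix's rolling error statistics, and it leaves out the thread pool rejection and timeout branches.

```java
import java.util.function.Supplier;

// Simplified illustration of the flow; not Hystrix's actual implementation.
class SimpleCircuitBreaker {
    private final int failureThreshold; // consecutive failures that open the circuit
    private final long openMillis;      // how long the circuit stays open before a trial call
    private int consecutiveFailures = 0;
    private long openedAt = -1;         // -1 means the circuit is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    synchronized boolean isOpen() {
        return openedAt >= 0 && System.currentTimeMillis() - openedAt < openMillis;
    }

    private synchronized void recordSuccess() {
        consecutiveFailures = 0;
        openedAt = -1;
    }

    private synchronized void recordFailure() {
        consecutiveFailures++;
        if (consecutiveFailures >= failureThreshold) {
            openedAt = System.currentTimeMillis(); // open (or re-open) the circuit
        }
    }

    <T> T execute(Supplier<T> run, Supplier<T> fallback) {
        if (isOpen()) {
            return fallback.get(); // short-circuit: run is never invoked
        }
        try {
            T result = run.get();
            recordSuccess();
            return result;
        } catch (RuntimeException e) {
            recordFailure();
            // If the fallback throws as well, the caller sees that exception.
            return fallback.get();
        }
    }
}
```

Once the circuit is open, callers get the fallback immediately and the failing dependency receives no traffic until `openMillis` has elapsed, at which point a single trial call decides whether the circuit closes again.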
o Resilience can be a strong requirement
o Distributed systems are complex
o Isolate your dependencies
o It’s not only about microservices, but very applicable there
o Circuit Breaker is your friend
o Monitoring is a must
o Use