
Hysteria in Distributed Systems - Hystrix in Action

Benjamin Wilms

March 29, 2017

Transcript

  1. Agenda

    Resilience, Timeout, Bulkheads, Circuit Breaker, Simian Army, Hystrix, Fallback, Configuration, Dynamic Configuration, Chaos Monkey, Archaius, Metrics Stream, Dashboard, Turbine, RabbitMQ, Spring Boot Admin
  2. Patterns of Resilience

    Timeout, Bulkheads, Circuit Breaker, Steady State, Fail Fast, Handshaking, Test Harness, Decoupling Middleware
  3. Patterns of Resilience

    Timeout (Hystrix), Bulkheads (Hystrix), Circuit Breaker (Hystrix), Steady State, Fail Fast (Hystrix), Handshaking (Hystrix), Test Harness, Decoupling Middleware
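
The circuit breaker and fail-fast patterns listed above can be illustrated with a small state machine. This is a minimal sketch and not the Hystrix implementation: the class name, the option names and the injectable `now` clock are assumptions made for the example, and the breaker opens on a simple count of consecutive failures rather than on a rolling error percentage.

```javascript
// Minimal circuit breaker sketch (hypothetical names, not the Hystrix API).
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetTimeoutMs = 5000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold; // consecutive failures before opening
    this.resetTimeoutMs = resetTimeoutMs;     // how long the circuit stays open
    this.now = now;                           // injectable clock, handy for tests
    this.failures = 0;
    this.openedAt = null;                     // null means the circuit is closed
  }

  state() {
    if (this.openedAt === null) return "CLOSED";
    // After the reset timeout a single trial call is allowed through.
    return this.now() - this.openedAt >= this.resetTimeoutMs ? "HALF_OPEN" : "OPEN";
  }

  call(action, fallback) {
    // Fail fast: while open, the protected action is not invoked at all.
    if (this.state() === "OPEN") return fallback(new Error("circuit open"));
    try {
      const result = action();
      this.failures = 0;
      this.openedAt = null; // a success closes the circuit again
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed trial call or too many failures (re)opens the circuit.
      if (this.state() === "HALF_OPEN" || this.failures >= this.failureThreshold) {
        this.openedAt = this.now();
      }
      return fallback(err);
    }
  }
}
```

A caller would wrap each remote call, for example `breaker.call(() => loadCustomer(1), () => defaultCustomer)`, where both functions are placeholders for application code.
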
  4. Semaphore vs. Threads

    Threads: timeout handling, isolation from the caller, fallbacks, circuit breaker
    Semaphore: no isolation from the caller, fallbacks, circuit breaker (counting)
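
To make the semaphore variant more concrete, here is a minimal sketch of counting-based isolation. It is not Hystrix code: `SemaphoreBulkhead` and its methods are hypothetical names. The point it illustrates is that the work runs in the caller's own context and the only protection is a permit counter, so calls over the limit are rejected immediately and handed to the fallback.

```javascript
// Semaphore-style isolation sketch (hypothetical helper, not the Hystrix API).
class SemaphoreBulkhead {
  constructor(maxConcurrent) {
    this.maxConcurrent = maxConcurrent;
    this.inFlight = 0; // number of permits currently taken
  }

  // Takes a permit if one is free and returns true; never blocks or queues.
  tryAcquire() {
    if (this.inFlight >= this.maxConcurrent) return false;
    this.inFlight += 1;
    return true;
  }

  release() {
    this.inFlight -= 1;
  }

  // Runs an async action in the caller's context, guarded only by the counter.
  async run(action, fallback) {
    if (!this.tryAcquire()) return fallback(new Error("bulkhead full"));
    try {
      return await action();
    } catch (err) {
      return fallback(err);
    } finally {
      this.release(); // always give the permit back
    }
  }
}
```

Because nothing here can interrupt a slow `action`, a timeout has to come from the action itself, which is the trade-off against the thread-based variant on the slide.
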
  5. Hystrix Overhead

    10%: > 3 ms overhead
    1%: > 9 ms overhead
    10+ billion requests
    Source: Netflix
  6. Service Discovery - Eureka

    System Status
      Environment: test
      Data center: default
      Current time: 2017-03-29T05:45:56 +0000
      Uptime: 00:06
      Lease expiration enabled: true
      Renews threshold: 22
      Renews (last min): 48
    DS Replicas
      localhost (http://localhost:8761/eureka/)
    Instances currently registered with Eureka
      Application | AMIs | Availability Zones | Status
      ADDRESS-SERVICE | n/a (2) | (2) | UP (2) - 2008f3e0f6fb:address-service (http://172.19.0.10:8080/info), ad0992a451bc:address-service (http://172.19.0.13:8080/info)
      BOOKING-SERVICE | n/a (2) | (2) | UP (2) - 181daeac41ee:booking-service (http://172.19.0.14:8080/info), 1a19b053c2b3:booking-service (http://172.19.0.11:8080/info)
      CONNOTE-SERVICE | n/a (2) | (2) | UP (2) - 8f95af38afa3:connote-service (http://172.19.0.15:8080/info), 5d7d6317b504:connote-service (http://172.19.0.9:8080/info)
      CUSTOMER-SERVICE | n/a (2) | (2) | UP (2) - 528b8d4c0960:customer-service (http://172.19.0.16:8080/info), 5771c7be2118:customer-service (http://172.19.0.12:8080/info)
      HYSTRIX-TURBINE-DASHBOARD | n/a (1) | (1) | UP (1) - a583a53203e9:hystrix-turbine-dashboard:8080 (http://172.19.0.7:8080/info)
      SPRING-BOOT-ADMIN | n/a (1) | (1) | UP (1) - cf908142d5ff:spring-boot-admin (http://172.19.0.6:8080/info)
  7. Transport Service - Booking Request

    code/booking-create.js
    var request = require("request");
    var moment = require('moment');
    var startTime = moment();
    // request
    var jsonRequest = {
      "customerId": 1,
      "sender-country": "de",
      "sender-city": "Solingen",
      "sender-postcode": "42697",
      "sender-street": "Hochstraße",
      "sender-streetnumber": "11",
      // ...
  8. Dynamic Configuration at Runtime

    Archaius: dynamic & typed, thread safe, polling framework, callback mechanism, dynamic configuration
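
The ideas on this slide (typed properties, a polling source, change callbacks) can be sketched in a few lines. The sketch below is an illustration only and does not use or mirror the Archaius API: `DynamicProperty`, `startPolling` and the snapshot object are hypothetical. The key `chaos.monkey.active` is the one used elsewhere in this deck.

```javascript
// Dynamic, typed property with change callbacks (hypothetical names, not the Archaius API).
class DynamicProperty {
  constructor(name, defaultValue, parse) {
    this.name = name;
    this.defaultValue = defaultValue;
    this.value = defaultValue;
    this.parse = parse; // converts the raw string from the source into the typed value
    this.callbacks = [];
  }

  get() {
    return this.value;
  }

  onChange(callback) {
    this.callbacks.push(callback);
  }

  // Called on every poll with the latest snapshot of the configuration source.
  refresh(snapshot) {
    const raw = snapshot[this.name];
    const next = raw === undefined ? this.defaultValue : this.parse(raw);
    if (next !== this.value) {
      const previous = this.value;
      this.value = next;
      this.callbacks.forEach((cb) => cb(next, previous)); // notify only on real changes
    }
  }
}

// Polls a snapshot function at a fixed interval; returns a function that stops polling.
function startPolling(readSnapshot, properties, intervalMs) {
  const timer = setInterval(() => {
    const snapshot = readSnapshot();
    properties.forEach((p) => p.refresh(snapshot));
  }, intervalMs);
  return () => clearInterval(timer);
}

const chaos = new DynamicProperty("chaos.monkey.active", false, (raw) => raw === "true");
chaos.onChange((next) => console.log("chaos.monkey.active is now", next));
chaos.refresh({ "chaos.monkey.active": "true" }); // triggers the callback once
```

In a running service, `startPolling` would be given a function that reads the current values from the configuration store; the example calls `refresh` directly so that it terminates.
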
  9. Activating Chaos Monkey

    code/archaius_demo/step_1.js
    var request = require("request");
    var baseUrl = "http://0.0.0.0:2379/v2/keys/hystrix";
    var data = {value: "true"};
    request({
      url: baseUrl + '/chaos.monkey.active',
      method: 'PUT',
      qs: data,
    }, function(error, response, body){
      if(error) {
        console.log(error);
      } else {
        // ...
  10. Transport Service - Booking Request (Chaos)

    code/booking-create.js
    var request = require("request");
    var moment = require('moment');
    var startTime = moment();
    // request
    var jsonRequest = {
      "customerId": 1,
      "sender-country": "de",
      "sender-city": "Solingen",
      "sender-postcode": "42697",
      "sender-street": "Hochstraße",
      "sender-streetnumber": "11",
      // ...
  11. Possible Fallback Strategies?

    Caching
    Messaging
    Stubbed fallback - static defaults
    Scaling: docker-compose scale booking-service=2 connote-service=2 customer-service=2 address-service=2
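
Two of the strategies above, caching and stubbed static defaults, combine naturally into a fallback chain. The following is a minimal sketch under assumptions: `withFallbacks` is a hypothetical helper, the primary call is synchronous for brevity, and the messaging strategy is left out.

```javascript
// Fallback chain sketch: primary call, then cache, then static default (hypothetical names).
function withFallbacks(primary, cache, key, staticDefault) {
  try {
    const fresh = primary(key);
    cache.set(key, fresh); // remember the last good answer for later failures
    return { value: fresh, source: "primary" };
  } catch (err) {
    // First fallback: the last known good value, if there is one.
    if (cache.has(key)) return { value: cache.get(key), source: "cache" };
    // Last resort: a stubbed, static default.
    return { value: staticDefault, source: "static-default" };
  }
}

const cache = new Map();
const healthy = (id) => ({ id, city: "Solingen" });
const broken = () => { throw new Error("service unavailable"); };

console.log(withFallbacks(healthy, cache, 1, { id: 1, city: "unknown" }).source);
console.log(withFallbacks(broken, cache, 1, { id: 1, city: "unknown" }).source);
console.log(withFallbacks(broken, cache, 2, { id: 2, city: "unknown" }).source);
```

The `source` field is only there to make visible which strategy answered; a cached answer may be stale, so whether it is acceptable depends on the data.
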
  12. Service Discovery - Eureka

    System Status
      Environment: test
      Data center: default
      Current time: 2017-03-29T05:45:56 +0000
      Uptime: 00:06
      Lease expiration enabled: true
      Renews threshold: 22
      Renews (last min): 48
    DS Replicas
      localhost (http://localhost:8761/eureka/)
    Instances currently registered with Eureka
      Application | AMIs | Availability Zones | Status
      ADDRESS-SERVICE | n/a (2) | (2) | UP (2) - 2008f3e0f6fb:address-service (http://172.19.0.10:8080/info), ad0992a451bc:address-service (http://172.19.0.13:8080/info)
      BOOKING-SERVICE | n/a (2) | (2) | UP (2) - 181daeac41ee:booking-service (http://172.19.0.14:8080/info), 1a19b053c2b3:booking-service (http://172.19.0.11:8080/info)
      CONNOTE-SERVICE | n/a (2) | (2) | UP (2) - 8f95af38afa3:connote-service (http://172.19.0.15:8080/info), 5d7d6317b504:connote-service (http://172.19.0.9:8080/info)
      CUSTOMER-SERVICE | n/a (2) | (2) | UP (2) - 528b8d4c0960:customer-service (http://172.19.0.16:8080/info), 5771c7be2118:customer-service (http://172.19.0.12:8080/info)
      HYSTRIX-TURBINE-DASHBOARD | n/a (1) | (1) | UP (1) - a583a53203e9:hystrix-turbine-dashboard:8080 (http://172.19.0.7:8080/info)
      SPRING-BOOT-ADMIN | n/a (1) | (1) | UP (1) - cf908142d5ff:spring-boot-admin (http://172.19.0.6:8080/info)
  13. Feature Toggle - Enable "Second Service Call"

    code/enable_second_call.js
    var request = require("request");
    var baseUrl = "http://0.0.0.0:2379/v2/keys/hystrix";
    var data = {value: "true"};
    request({
      url: baseUrl + '/second.service.call',
      method: 'PUT',
      qs: data,
    }, function(error, response, body){
      if(error) {
        console.log(error);
      } else {
        // ...
  14. Transport Service - Booking Request (Second Service Call)

    code/booking-create.js
    var request = require("request");
    var moment = require('moment');
    var startTime = moment();
    // request
    var jsonRequest = {
      "customerId": 1,
      "sender-country": "de",
      "sender-city": "Solingen",
      "sender-postcode": "42697",
      "sender-street": "Hochstraße",
      "sender-streetnumber": "11",
      // ...
  15. Hystrix Command Timeout - Connote

    code/timeouts.js
    var request = require("request");
    var baseUrl = "http://0.0.0.0:2379/v2/keys/hystrix";
    request({
      url: baseUrl + '/hystrix.command.ConnoteServiceClient.execution.isolation.thread.timeoutInMilliseconds',
      method: 'PUT',
      qs: {value: "300"},
    }, function(error, response, body){
      console.log(response.statusCode, body);
    });
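
The three configuration scripts in this deck (Chaos Monkey, second service call, command timeout) differ only in key and value. As a sketch, the shared part could be pulled into one helper that builds the options object in the same shape the slides pass to `request`. The helper name `configPut` is hypothetical; the base URL and keys are the ones shown on the slides.

```javascript
// Builds the options for one configuration write, mirroring the calls on the slides.
// The base URL is taken from the slides; the helper name is hypothetical.
const BASE_URL = "http://0.0.0.0:2379/v2/keys/hystrix";

function configPut(key, value) {
  return {
    url: BASE_URL + "/" + encodeURIComponent(key),
    method: "PUT",
    qs: { value: String(value) }, // the value travels as a query string parameter
  };
}

const toggles = [
  configPut("chaos.monkey.active", true),
  configPut("second.service.call", true),
  configPut("hystrix.command.ConnoteServiceClient.execution.isolation.thread.timeoutInMilliseconds", 300),
];

toggles.forEach((options) => console.log(options.method, options.url, options.qs.value));
```

Each entry can then be sent with `request(options, callback)` as in the original scripts; checking `error` before reading `response.statusCode` avoids a crash when the store is unreachable.
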
  16. Transport Service - Booking Request (Second Service Call)

    code/booking-create.js
    var request = require("request");
    var moment = require('moment');
    var startTime = moment();
    // request
    var jsonRequest = {
      "customerId": 1,
      "sender-country": "de",
      "sender-city": "Solingen",
      "sender-postcode": "42697",
      "sender-street": "Hochstraße",
      "sender-streetnumber": "11",
      // ...
  17. Hystrix Monitoring & Metrics

    via REST endpoint, via JMX, via log file, via Elastic, Splunk and co., via messaging
  18. REST Endpoint

    Access via HystrixCommandMetrics: HystrixCommandMetrics.getInstance(ConnoteRESTCommand.CONNOTE_KEY)
    Result:
    {
      averageExecutionTime: 0,
      commandGroupKey: "ConnoteRESTCommandGroupKey",
      commandKey: "ConnoteCommand",
      concurrentExecutionCount: 0,
      errorCount: 1,
      healthCounts: "HealthCounts[1 / 1 : 100%]",
      totalRequests: 1
    }
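
The `healthCounts` string in the result above packs three numbers into one field. As a small sketch, it can be parsed on the client side and compared against circuit breaker thresholds. The layout `HealthCounts[errors / total : percent%]` is assumed from the sample value, both function names are hypothetical, and the thresholds of 20 requests and 50% are passed in as parameters so that they can be replaced by whatever the service is actually configured with.

```javascript
// Parses a health summary such as "HealthCounts[1 / 1 : 100%]" into numbers.
// Assumes the layout "HealthCounts[errors / total : percent%]" seen in the sample result.
function parseHealthCounts(text) {
  const match = /^HealthCounts\[(\d+) \/ (\d+) : (\d+)%\]$/.exec(text);
  if (!match) throw new Error("unexpected health counts format: " + text);
  return { errors: Number(match[1]), total: Number(match[2]), percent: Number(match[3]) };
}

// True when both the request volume and the error percentage reach their thresholds.
function wouldTrip(counts, requestVolumeThreshold, errorThresholdPercent) {
  return counts.total >= requestVolumeThreshold && counts.percent >= errorThresholdPercent;
}

const counts = parseHealthCounts("HealthCounts[1 / 1 : 100%]");
console.log(counts, wouldTrip(counts, 20, 50));
```

With a single request in the window, the volume threshold is not reached, so a 100% error rate alone does not open the circuit in this sketch.
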
  19. Hystrix Dashboard

    Spring Boot Admin - Spring Boot applications
      ADDRESS-SERVICE (2): UP
      BOOKING-SERVICE (2): UP
      CONNOTE-SERVICE (2): UP
      CUSTOMER-SERVICE (2): UP
      HYSTRIX-TURBINE-DASHBOARD (63b55a23), http://172.19.0.7:8080: UP
      SPRING-BOOT-ADMIN (adddb55b), http://172.19.0.6:8080: UP
      TRANSPORT-API-GATEWAY (2ed68569), http://172.19.0.8:8080: UP
      ZIPKIN-SERVICE (260fe666), http://172.19.0.5:9411: UP
    Reference Guide: https://codecentric.github.io/spring-boot-admin/1.4.5
  20. Hystrix Metrics via RabbitMQ

    Metrics are published via RabbitMQ
    Turbine aggregates them as a consumer
    The Hystrix Dashboard processes the Turbine stream
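
The aggregation step can be pictured as summing per-host numbers into one entry per command before the dashboard renders them. The sketch below only illustrates that idea: the function name and the field names (`command`, `requests`, `errors`) are hypothetical and do not reflect the Hystrix stream format or Turbine's implementation.

```javascript
// Sums per-host command metrics into one cluster-wide entry per command.
// Field names are hypothetical and not the Hystrix stream format.
function aggregate(hostMetrics) {
  const clusters = new Map();
  for (const m of hostMetrics) {
    const entry = clusters.get(m.command) || { command: m.command, hosts: 0, requests: 0, errors: 0 };
    entry.hosts += 1;
    entry.requests += m.requests;
    entry.errors += m.errors;
    clusters.set(m.command, entry);
  }
  // Derive the cluster error percentage from the summed counters.
  return [...clusters.values()].map((e) => ({
    ...e,
    errorPercent: e.requests === 0 ? 0 : Math.round((e.errors / e.requests) * 100),
  }));
}

console.log(aggregate([
  { command: "ConnoteServiceClient", requests: 4, errors: 1 },
  { command: "ConnoteServiceClient", requests: 6, errors: 4 },
  { command: "BookingServiceClient", requests: 5, errors: 0 },
]));
```

Computing the percentage from the summed counters, rather than averaging per-host percentages, keeps hosts with little traffic from skewing the cluster view.
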
  21. Transport Service - Booking Request Simulation

    code/booking-create-simulate.js
    var request = require("request");
    var moment = require('moment');
    var startTime = moment();
    // request
    var jsonRequest = {
      "customerId": 1,
      "sender-country": "de",
      "sender-city": "Solingen",
      "sender-postcode": "42697",
      "sender-street": "Hochstraße",
      "sender-streetnumber": "11",
      // ...
  22. Hystrix Dashboard & Turbine Stream

    Hystrix Stream: http://localhost:8989/
    Circuit
      Sort: Error then Volume | Alphabetical | Volume | Error | Mean | Median | 90 | 99 | 99.5
      Legend: Success | Short-Circuited | Timeout | Rejected | Failure | Error %
      tran...ustomerServiceClient: 50.0 %; 2 0 3 5 0; Hosts 1; 90th 1019ms; Median 15ms; 99th 2002ms; Mean 516ms; 99.5th 2002ms; Host: 1.0/s; Cluster: 1.0/s; Circuit Closed
      tran...AddressServiceClient: 25.0 %; 3 0 1 12 0; Hosts 1; 90th 1010ms; Median 10ms; 99th 2003ms; Mean 290ms; 99.5th 2003ms; Host: 1.6/s; Cluster: 1.6/s; Circuit Closed
      tran...BookingServiceClient: 0.0 %; 0 0 0 4 0; Hosts 1; 90th 320ms; Median 17ms; 99th 611ms; Mean 139ms; 99.5th 611ms; Host: 0.5/s; Cluster: 0.5/s; Circuit Closed
      book...ConnoteServiceClient: 41.5 %; 3 0 0 4 0; Hosts 2; 90th 350ms; Median 9ms; 99th 458ms; Mean 100ms; 99.5th 458ms; Host: 0.4/s; Cluster: 0.7/s; Circuit Closed
    Thread Pools
      Sort: Alphabetical | Volume
      AddressServiceClientGroup: Active 3; Max Active 2; Queued 0; Executions 16; Pool Size 10; Queue Size 5; Host: 1.6/s; Cluster: 1.6/s
      CustomerServiceClientGroup: Active 1; Max Active 2; Queued 0; Executions 10; Pool Size 10; Queue Size 5; Host: 1.0/s; Cluster: 1.0/s
      ConnoteServiceClientGroup: Active 2; Max Active 4; Queued 0; Executions 7; Pool Size 20; Queue Size 2; Host: 0.7/s; Cluster: 1.4/s
      BookingServiceClientGroup: Active 0; Max Active 1; Queued 0; Executions 5; Pool Size 10; Queue Size 5; Host: 0.5/s; Cluster: 0.5/s
  23. Finished

    Thank you for your attention! Feedback & questions?
    GitHub: https://github.com/MrBW/resilient-transport-service
    Short URL: https://goo.gl/cIqNoi