Slide 1

Slide 1 text

Bring in the Supernanny for your microservices
Find out what they are getting up to in production
Author: Claudio Oliveira | [email protected]
Author: Matheus Moraes | [email protected]
Date: 06/05/2019

Slide 2

Slide 2 text

AGENDA
▰ Microservices
▰ North/South vs East/West Traffic
▰ Monitoring
▰ Observability
  ○ Logs
  ○ Traces
  ○ Metrics
▰ Kiali
▰ Demo

Slide 3

Slide 3 text

whoami
I am Claudio de Oliveira
Book Author, Speaker, Software Architect and Developer @sensedia
Spring, Java, Microservices and Docker enthusiast

Slide 4

Slide 4 text

whoami
I am Matheus Moraes
Developer & Speaker @sensedia
Java, NoSQL and Microservices enthusiast

Slide 5

Slide 5 text

sensedia.com Microservices

Slide 6

Slide 6 text

The term "Microservice Architecture" ... there are certain common characteristics around organization around business capability...
https://martinfowler.com/articles/microservices.html

Slide 7

Slide 7 text

Microservices architecture is not about technologies and frameworks; it is about how to scale the business

Slide 8

Slide 8 text

Conway's law
https://www.thoughtworks.com/pt/insights/blog/applying-conways-law-improve-your-software-development

Slide 9

Slide 9 text

it means...

Slide 10

Slide 10 text

Microservices are, in general, distributed systems

Slide 11

Slide 11 text

A user's transaction spans several services

Slide 12

Slide 12 text

Usually these transactions happen over the HTTP protocol

Slide 13

Slide 13 text

sensedia.com

Slide 14

Slide 14 text

in the east/west direction, inside our microservices infrastructure

Slide 15

Slide 15 text

sensedia.com

Slide 16

Slide 16 text

Our microservices

Slide 17

Slide 17 text

like children, our microservices can...

Slide 18

Slide 18 text

they can go down

Slide 19

Slide 19 text

they may not scale

Slide 20

Slide 20 text

they can get overloaded

Slide 21

Slide 21 text

they can leak memory

Slide 22

Slide 22 text

they may be unhealthy

Slide 23

Slide 23 text

Monitoring

Slide 24

Slide 24 text

“Monitoring is the practice of collecting signals, telemetry, traces, etc. and aggregating them, and matching them against some pre-defined criteria of system states we should carefully watch. When we find that one of our signals has crossed a threshold and may be heading toward a known bad state, we take action to remedy the system.”
Christian Posta, https://www.manning.com/books/istio-in-action
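
The monitoring loop described in the quote (collect a signal, aggregate it, match it against a pre-defined criterion) might look roughly like the sketch below. It is only an illustration: the disk-usage signal, the sample values, and the 0.8 threshold are assumptions, not values from the talk.

```python
from statistics import mean

# Hypothetical threshold: alert when average disk usage exceeds 80%.
DISK_USAGE_THRESHOLD = 0.8


def check_signal(samples, threshold):
    """Aggregate the samples and match the result against a pre-defined criterion."""
    value = mean(samples)
    return {"value": value, "alert": value > threshold}


# Made-up disk-usage samples (fraction of disk used) collected over time.
healthy = check_signal([0.41, 0.43, 0.44], DISK_USAGE_THRESHOLD)
degraded = check_signal([0.79, 0.85, 0.91], DISK_USAGE_THRESHOLD)
```

A check like this only catches failure modes that were anticipated when the threshold was chosen.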

Slide 25

Slide 25 text

Monitoring is a subset of Observability

Slide 26

Slide 26 text

Observability

Slide 27

Slide 27 text

The term observability has sprung up in the software community

Slide 28

Slide 28 text

“Observability on the other hand supposes up front that our systems are highly unpredictable and we cannot know all of the possible failure modes up front”
https://www.manning.com/books/istio-in-action

Slide 29

Slide 29 text

“We need to collect much more data, even high-cardinality data like userIDs, requestIDs, source IPs, etc. where the entire set could be exponentially large”
https://www.manning.com/books/istio-in-action

Slide 30

Slide 30 text

“a user goes to pay for the items in their cart and experiences a 10s delay choosing a payment option. All of the pre-defined metric thresholds (disk usage, queue depth, machine health, etc.) might be at acceptable levels”
https://www.manning.com/books/istio-in-action

Slide 31

Slide 31 text

“With observability, we need fine-grained data”
https://www.manning.com/books/istio-in-action

Slide 32

Slide 32 text

Pillars

Slide 33

Slide 33 text

The 3 pillars of observability
1. Logs: What is causing it? Root Cause & Forensics
2. Traces: Where exactly is my problem? Cross-Service Debug & Performance Optimization
3. Metrics: Do I have a problem? Dashboarding, Trending & Problem Detection

Slide 34

Slide 34 text

Logs

Slide 35

Slide 35 text

Who, What, When, Where, Why
INFO, DEBUG, WARNING, ERROR

Slide 36

Slide 36 text

[INFO ][2019-05-28 00:51:18][de4c1b04-9ca1][c.e.m.g.u.domain.service.PaymentsService] - finding user by id 140708
[WARN ][2019-05-28 00:51:18][de4c1b04-9ca1][c.e.m.g.u.domain.service.PaymentsService] - user non-cached, calling user service
[ERROR][2019-05-28 00:51:18][de4c1b04-9ca1][c.e.m.g.u.domain.service.PaymentsService] - error when calling /users/140708
org.springframework.web.server.ResponseStatusException: 404 NOT_FOUND
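
A log layout like the one above (level, timestamp, request id, logger name, message) can be approximated with Python's standard `logging` module. This is a minimal sketch, not the configuration used in the talk: the logger name and the in-memory sink are assumptions, and Python prints `WARNING` where the slide shows `WARN`.

```python
import io
import logging

LOG_FORMAT = "[%(levelname)-5s][%(asctime)s][%(request_id)s][%(name)s] - %(message)s"

stream = io.StringIO()  # in-memory sink so the example is self-contained
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(LOG_FORMAT, datefmt="%Y-%m-%d %H:%M:%S"))

base_logger = logging.getLogger("payments.PaymentsService")  # hypothetical logger name
base_logger.setLevel(logging.INFO)
base_logger.addHandler(handler)
base_logger.propagate = False

# LoggerAdapter attaches the request id to every record it emits.
log = logging.LoggerAdapter(base_logger, {"request_id": "de4c1b04-9ca1"})

log.info("finding user by id %s", 140708)
log.warning("user non-cached, calling user service")
log.error("error when calling /users/%s", 140708)

lines = stream.getvalue().splitlines()
```

Because every line carries the same request id, the three messages can be correlated once logs from several services are aggregated in one place.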

Slide 37

Slide 37 text

We need to aggregate the logs

Slide 38

Slide 38 text

sensedia.com

Slide 39

Slide 39 text

1. Logs: What is causing it? Root Cause & Forensics
2. Traces: Where exactly is my problem? Cross-Service Debug & Performance Optimization
3. Metrics: Do I have a problem? Dashboarding, Trending & Problem Detection

Slide 40

Slide 40 text

Distributed Tracing

Slide 41

Slide 41 text

We need to understand the application's behavior and be able to troubleshoot problems. https://microservices.io/patterns/observability/distributed-tracing.html

Slide 42

Slide 42 text

sensedia.com

Slide 43

Slide 43 text

▰ Assign each external request a unique request id
▰ Pass the request id to all services involved in handling the request
▰ Include the request id in all log messages
▰ Record timing information (e.g., start and end time)
https://microservices.io/patterns/observability/distributed-tracing.html
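
The steps above can be sketched without any tracing library. The code below simulates services as nested function calls: it assigns a request id at the edge, passes it downstream, and records start and end times per service. The header name `X-Request-Id`, the service names, and the in-memory `recorded_spans` list are assumptions made for illustration only.

```python
import time
import uuid

REQUEST_ID_HEADER = "X-Request-Id"  # assumed header name, not from the talk
recorded_spans = []  # stands in for a tracing backend


def handle(service, headers, downstream=()):
    """Simulate one service handling a request and calling its downstream services."""
    # Reuse the incoming request id, or assign one at the edge.
    request_id = headers.get(REQUEST_ID_HEADER) or uuid.uuid4().hex
    start = time.monotonic()
    for next_service, next_downstream in downstream:
        # Pass the same id to every service involved in the request.
        handle(next_service, {REQUEST_ID_HEADER: request_id}, next_downstream)
    end = time.monotonic()
    recorded_spans.append(
        {"request_id": request_id, "service": service, "start": start, "end": end}
    )
    return request_id


# Hypothetical call chain: gateway -> payments -> users.
request_id = handle("gateway", {}, [("payments", [("users", [])])])
```

Grouping `recorded_spans` by `request_id` gives the timeline of a single user transaction across services, which is the view a tracing tool builds.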

Slide 44

Slide 44 text

sensedia.com

Slide 45

Slide 45 text

sensedia.com

Slide 46

Slide 46 text

▰ The solution should have minimal overhead
▰ An external tool is needed to analyze the data
https://microservices.io/patterns/observability/distributed-tracing.html

Slide 47

Slide 47 text

We need an instrumentation standard. Introducing...

Slide 48

Slide 48 text

OpenTracing: vendor-neutral APIs and instrumentation for distributed tracing, available in 9 languages

Slide 49

Slide 49 text

We need a tool to enable us to analyze data. Introducing...

Slide 50

Slide 50 text

Jaeger: monitor and troubleshoot transactions in complex distributed systems
https://www.jaegertracing.io/

Slide 51

Slide 51 text

https://www.jaegertracing.io/

Slide 52

Slide 52 text

sensedia.com

Slide 53

Slide 53 text

sensedia.com

Slide 54

Slide 54 text

1. Logs: What is causing it? Root Cause & Forensics
2. Traces: Where exactly is my problem? Cross-Service Debug & Performance Optimization
3. Metrics: Do I have a problem? Dashboarding, Trending & Problem Detection

Slide 55

Slide 55 text

Metrics

Slide 56

Slide 56 text

Health Check ≠ Metrics

Slide 57

Slide 57 text

payments_http_requests_total{env="prod",method="POST",code="200",type="infra"} 647.0
payments_http_requests_total{env="prod",method="POST",code="400",type="infra"} 74.0
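
As a worked example, the two samples above are enough to answer "what share of POST requests failed with a client error?". The parser below is a simplified sketch that handles only lines shaped like the ones on the slide; it is not a full implementation of the Prometheus text format, and the helper names are hypothetical.

```python
import re

exposition = '''\
payments_http_requests_total{env="prod",method="POST",code="200",type="infra"} 647.0
payments_http_requests_total{env="prod",method="POST",code="400",type="infra"} 74.0
'''

SAMPLE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse(text):
    """Turn each 'name{labels} value' line into a (name, labels, value) tuple."""
    samples = []
    for line in text.splitlines():
        match = SAMPLE.match(line.strip())
        if match:
            labels = dict(LABEL.findall(match.group("labels")))
            samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


samples = parse(exposition)
total = sum(value for _, _, value in samples)
client_errors = sum(value for _, labels, value in samples if labels["code"].startswith("4"))
error_ratio = client_errors / total  # 74 / 721, roughly 10%
```

In practice a monitoring system computes ratios like this from the stored counters; the sketch only shows the arithmetic behind it.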

Slide 58

Slide 58 text

Counter ↑: Cumulative, increasing metric
payments_technology_total{env="prod",method="nfc",type="business"} 152.0
Gauge ↑ ↓: Single metric that goes up or down
payments_hikaricp_connections_active{env="prod",type="infra"} 4.0
Timer: Samples and buckets observations
payments_crypto_seconds_count{env="prod",type="infra"} 77.0
payments_crypto_seconds_sum{env="prod",type="infra"} 34.97
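
The difference between the three metric types can be illustrated with a few toy classes. These are hypothetical stand-ins written for this example, not the Micrometer or Prometheus client APIs, and the recorded values are made up.

```python
class Counter:
    """Cumulative metric that only increases."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter can only increase")
        self.value += amount


class Gauge:
    """Single value that can go up or down."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = float(value)


class Timer:
    """Tracks how many observations were made and their total duration."""

    def __init__(self):
        self.count = 0
        self.sum = 0.0

    def record(self, seconds):
        self.count += 1
        self.sum += seconds


payments_by_nfc = Counter()
active_connections = Gauge()
crypto_timer = Timer()

payments_by_nfc.inc()
payments_by_nfc.inc()
active_connections.set(4)
active_connections.set(3)  # a gauge may decrease
crypto_timer.record(0.25)
crypto_timer.record(0.5)

mean_crypto_seconds = crypto_timer.sum / crypto_timer.count
```

The `_count` and `_sum` series on the slide play the same role as `Timer.count` and `Timer.sum` here: dividing one by the other gives the mean duration.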

Slide 59

Slide 59 text

sensedia.com

Slide 60

Slide 60 text

Storing metrics is not the responsibility of microservices

Slide 61

Slide 61 text

But... where are these metrics stored? Prometheus

Slide 62

Slide 62 text

Prometheus is an open source, metrics-based monitoring system

Slide 63

Slide 63 text

sensedia.com

Slide 64

Slide 64 text

sensedia.com

Slide 65

Slide 65 text

We need dashboards that are more useful and beautiful than the previous one

Slide 66

Slide 66 text

Grafana: the open platform for beautiful analytics and monitoring

Slide 67

Slide 67 text

sensedia.com

Slide 68

Slide 68 text

1. Logs: What is causing it? Root Cause & Forensics
2. Traces: Where exactly is my problem? Cross-Service Debug & Performance Optimization
3. Metrics: Do I have a problem? Dashboarding, Trending & Problem Detection

Slide 69

Slide 69 text

sensedia.com

Slide 70

Slide 70 text

The Kiali project provides answers to the questions:
▰ What microservices are part of my Istio service mesh?
▰ How are they connected?
▰ How are they performing?

Slide 71

Slide 71 text

sensedia.com

Slide 72

Slide 72 text

sensedia.com

Slide 73

Slide 73 text

DEMO

Slide 74

Slide 74 text

sensedia.com

Slide 75

Slide 75 text

sensedia.com

Slide 76

Slide 76 text

sensedia.com

Slide 77

Slide 77 text

And... the final question

Slide 78

Slide 78 text

Where has the Supernanny been working?

Slide 79

Slide 79 text

sensedia.com

Slide 80

Slide 80 text

sensedia.com

Slide 81

Slide 81 text

sensedia.com

Slide 82

Slide 82 text

sensedia.com

Slide 83

Slide 83 text

Sensedia API Platform
▰ Metrics: Prometheus, Grafana, Micrometer
▰ Log Aggregator: Fluentd
▰ Tracing: OpenTracing, Jaeger
▰ Kiali
▰ Istio
▰ Kubernetes