Microservices Observability Zup Open Talks

Slide 1

Slide 1 text

Microservices Observability

Slide 2

Slide 2 text

Hello! I am Cláudio Oliveira
Technical Lead, API Team
Book Author
@luizalabs
Java, Golang, k8s & microservices

Slide 3

Slide 3 text

Agenda
● Metrics
● Distributed Tracing
● Logs
● Progressive Delivery
● Demos

Slide 4

Slide 4 text

Slide 5

Slide 5 text

Glossary
Telemetry: how to collect the data that will provide observability (sensors)
Observability: monitoring, alerting and visualizations, distributed tracing, and log aggregation

Slide 6

Slide 6 text

Glossary
Monitoring: the practice of collecting signals, aggregating them, and matching them against some predefined criteria

Slide 7

Slide 7 text

Microservices Drawbacks
Sh*** happens

Slide 8

Slide 8 text

Fallacies of Distributed Computing

Slide 9

Slide 9 text

Fallacies of Distributed Computing
● The network is reliable
● Latency is zero
● Bandwidth is infinite

Slide 10

Slide 10 text

Microservices imply “some” challenges
● Understand how microservices connect to each other
● Network latency can be a bottleneck (intense IPC)
● The network can be unreliable
● Control the up-and-running instances
● Increased non-functional requirements

Slide 11

Slide 11 text

Slide 12

Slide 12 text

Metrics

Slide 13

Slide 13 text

Metrics are the only way to get your job done

Slide 14

Slide 14 text

RED pattern to monitor services

Slide 15

Slide 15 text

R (Rate) - the number of requests per second

Slide 16

Slide 16 text

E (Errors) - the number of failed requests per second

Slide 17

Slide 17 text

D (Duration) - the distribution of the amount of time each request takes
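
The three RED signals can be sketched in a few lines of Python. This is only an illustration, not how a real metrics backend computes them: the `Request` record, the `red_summary` function, and the choice to count HTTP 5xx responses as "failed" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One handled request (hypothetical record shape)."""
    duration_ms: int
    status: int


def percentile(sorted_values, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    if not sorted_values:
        return None
    rank = (p * len(sorted_values) + 99) // 100  # integer ceil(p * n / 100)
    return sorted_values[rank - 1]


def red_summary(requests, window_seconds):
    """Summarize Rate, Errors and Duration over one observation window."""
    durations = sorted(r.duration_ms for r in requests)
    # Assumption: a "failed" request is one answered with HTTP 5xx.
    failed = sum(1 for r in requests if r.status >= 500)
    return {
        "rate_per_s": len(requests) / window_seconds,
        "errors_per_s": failed / window_seconds,
        "p50_ms": percentile(durations, 50),
        "p90_ms": percentile(durations, 90),
    }


# Ten requests seen in a 5-second window, two of them failing.
sample = [Request(duration_ms=10 * i, status=200) for i in range(1, 9)]
sample += [Request(duration_ms=90, status=500), Request(duration_ms=100, status=503)]
summary = red_summary(sample, window_seconds=5)
```

Because every service exposes the same three numbers, the same dashboards and alerts can be reused for all of them.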

Slide 18

Slide 18 text

“The benefits of treating each service the same, from a monitoring perspective, is scalability in your operations teams”

Slide 19

Slide 19 text

Use case

Slide 20

Slide 20 text

Slide 21

Slide 21 text

Slide 22

Slide 22 text

Distributed Tracing

Slide 23

Slide 23 text

How does it work???
● Assign a unique ID to each external request
● Pass it to all services that are involved
● Include the request ID in log messages
● Record timing information, e.g. start and end time
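
A minimal sketch of these steps, assuming the ID travels in an HTTP-style header. The header name `X-Request-ID` and the helpers `ensure_request_id` and `traced_call` are hypothetical names chosen for the example; real tracing libraries handle propagation for you.

```python
import time
import uuid

REQUEST_ID_HEADER = "X-Request-ID"  # assumed header name, a common convention


def ensure_request_id(headers):
    """Return a copy of the headers carrying a request ID, assigning one if absent."""
    out = dict(headers)
    if not out.get(REQUEST_ID_HEADER):
        out[REQUEST_ID_HEADER] = uuid.uuid4().hex
    return out


def traced_call(service, headers, work, records):
    """Run `work` with the propagated headers and record its timing."""
    headers = ensure_request_id(headers)
    start = time.time()
    try:
        return work(headers)
    finally:
        # The record doubles as a log entry: it always includes the request ID.
        records.append({
            "request_id": headers[REQUEST_ID_HEADER],
            "service": service,
            "start": start,
            "end": time.time(),
        })


records = []


def bets_service(headers):
    # Downstream service: receives the same headers, so it reuses the same ID.
    return traced_call("bets", headers, lambda h: "ok", records)


# The edge service gets a request without an ID and assigns one.
result = traced_call("gateway", {}, bets_service, records)
```

Both entries in `records` share one request ID, which is what lets a tracing or logging backend stitch the calls together.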

Slide 24

Slide 24 text

OpenTracing / OpenTelemetry
● Cloud Native Computing Foundation (CNCF)
● It standardizes the instrumentation of apps for distributed tracing

Slide 25

Slide 25 text

OpenTracing / OpenTelemetry Concepts
● A trace tells the story of a transaction
● A span represents a single call
● Distributed tracing systems collect the spans, and we can see the graph in a nice interface
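
To make the trace and span vocabulary concrete, here is a toy data model. It is not the OpenTelemetry API: the `Span` fields, the `render_trace` helper, and the service names are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """A single call inside a trace (toy model, not the OpenTelemetry API)."""
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    name: str
    start_ms: int
    end_ms: int


def render_trace(spans):
    """Return the spans of one trace as an indented call tree."""
    children = {}
    for span in sorted(spans, key=lambda s: s.start_ms):
        children.setdefault(span.parent_id, []).append(span)

    lines = []

    def walk(parent_id, depth):
        for span in children.get(parent_id, []):
            lines.append(f"{'  ' * depth}{span.name} ({span.end_ms - span.start_ms} ms)")
            walk(span.span_id, depth + 1)

    walk(None, 0)  # the root span has no parent
    return lines


# One transaction that crosses three hypothetical services and a database.
trace = [
    Span("t1", "a", None, "gateway", 0, 120),
    Span("t1", "b", "a", "bets", 10, 90),
    Span("t1", "c", "b", "db", 20, 60),
    Span("t1", "d", "a", "payments", 95, 115),
]
tree = render_trace(trace)
```

A tracing UI draws essentially this tree on a timeline, one bar per span.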

Slide 26

Slide 26 text

Slide 27

Slide 27 text

Logs

Slide 28

Slide 28 text

Use the 5 W's!!!!
● who
● what
● when
● where
● why
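
One way to apply the five W's is to emit each log entry as a structured record with one field per question. The sketch below assumes JSON output; the `log_event` helper and the example values are made up for illustration.

```python
import json
from datetime import datetime, timezone


def log_event(who, what, where, why, when=None):
    """Build one JSON log line that answers who, what, when, where and why."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "who": who,        # user or client that triggered the action
            "what": what,      # the action or event
            "when": when.isoformat(),
            "where": where,    # service or component that emitted the log
            "why": why,        # cause or context
        },
        sort_keys=True,
    )


line = log_event(
    who="user-42",
    what="bet rejected",
    where="bets-service",
    why="insufficient balance",
    when=datetime(2020, 1, 1, tzinfo=timezone.utc),
)
```

Structured fields like these are much easier to search once the logs of many services are aggregated in one place.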

Slide 29

Slide 29 text

Use severity correctly!!!
● INFO
● DEBUG
● WARNING
● ERROR

Slide 30

Slide 30 text

Slide 31

Slide 31 text

Aggregate Logs
Microservices are distributed systems

Slide 32

Slide 32 text

Slide 33

Slide 33 text

Slide 34

Slide 34 text

What are my options to get observability done???

Slide 35

Slide 35 text

There are two ways to solve this problem

Slide 36

Slide 36 text

Before Service Mesh

Slide 37

Slide 37 text

Slide 38

Slide 38 text

Things to think about

Slide 39

Slide 39 text

Concerns about observability in the app
● Increases the size of the application
● Configuration has to be done inside the application
● It consumes the application's resources
● More control to “customize” metrics and distributed tracing
● There is no sidecar involved

Slide 40

Slide 40 text

After Service Mesh

Slide 41

Slide 41 text

Slide 42

Slide 42 text

Things to think about

Slide 43

Slide 43 text

Concerns about observability with a sidecar
● One more thing to take care of
● The control plane should configure the sidecars
● Not so intrusive
● The developers can focus on business rules
● It is a kind of industry standard today

Slide 44

Slide 44 text

Slide 45

Slide 45 text

Progressive App Delivery with ArgoCD && Rollouts

Slide 46

Slide 46 text

Progressive App Delivery
● Roll out new features gradually
● Avoid downtime as much as possible
● A stateless application is mandatory
● The versions should be backwards compatible
● Blue-green, canary release, and others

Slide 47

Slide 47 text

Argo CD is a declarative, GitOps continuous delivery tool for Kubernetes

Slide 48

Slide 48 text

Slide 49

Slide 49 text

Argo Rollouts is a Kubernetes controller and set of CRDs which provide advanced deployment capabilities such as blue-green, canary, canary analysis, experimentation, and progressive delivery features to Kubernetes.

Slide 50

Slide 50 text

But how does it connect with the observability stuff????

Slide 51

Slide 51 text

Slide 52

Slide 52 text

It should be automated

Slide 53

Slide 53 text

With everything covered by metrics, we can automate the release process

Slide 54

Slide 54 text

HTTP calls with status code ~2.* should be more than 95%
Release is good to go!!!
Else
Ohhh sh****!!!
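
The rule on this slide can be written as a small decision function. This is a sketch of the idea only, not how Argo Rollouts evaluates an analysis: the `canary_verdict` name, the verdict strings, and the "inconclusive" result for a canary that has received no traffic are assumptions.

```python
def success_ratio(status_counts):
    """Share of requests answered with a 2xx status, or None without traffic."""
    total = sum(status_counts.values())
    if total == 0:
        return None
    ok = sum(count for code, count in status_counts.items() if 200 <= code < 300)
    return ok / total


def canary_verdict(status_counts, threshold=0.95):
    """Promote only when more than `threshold` of the calls succeeded."""
    ratio = success_ratio(status_counts)
    if ratio is None:
        # Assumption: with no traffic we neither promote nor roll back.
        return "inconclusive"
    return "promote" if ratio > threshold else "rollback"


# Request counts per HTTP status code observed on the canary.
good_release = canary_verdict({200: 960, 201: 10, 500: 30})
bad_release = canary_verdict({200: 90, 503: 10})
no_traffic = canary_verdict({})
```

In practice the status counts come from the metrics system, so the whole decision can run without a human in the loop.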

Slide 55

Slide 55 text

Slide 56

Slide 56 text

Prometheus Query
sum(irate(istio_requests_total{reporter="source",destination_service=~"bets-canary.istio.svc.cluster.local",response_code=~"2.*"}[2m]))
/
sum(irate(istio_requests_total{reporter="source",destination_service=~"bets-canary.istio.svc.cluster.local"}[2m]))

Slide 57

Slide 57 text

Slide 58

Slide 58 text

Conclusions

Slide 59

Slide 59 text

Always follow the industry standards. A homemade solution is not a good way to go.

Slide 60

Slide 60 text

You can start simple and then evolve step by step

Slide 61

Slide 61 text

Microservices without observability (monitoring, distributed tracing and log aggregation) are the worst thing in the world

Slide 62

Slide 62 text

Microservices are effective for delivering software frequently. But THINK seriously about observability

Slide 63

Slide 63 text


Slide 64

Slide 64 text

Thanks!
Any questions?
You can find me on Twitter and LinkedIn
● @claudioed