Glossary
5
Telemetry
How to collect data that will provide
observability (sensors)
Observability
Monitoring, alerting and visualizations,
distributed tracing and log aggregation
Slide 6
Slide 6 text
Glossary
6
Monitoring
Is the practice of collecting signals,
aggregating them, and matching them
against some predefined criteria
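The definition above can be sketched in a few lines. This is only an illustration: the signal (CPU utilisation), the averaging step and the 80% threshold are assumptions chosen for the example, not something the slides prescribe.

```python
from statistics import mean


def evaluate(samples, threshold):
    """Aggregate raw samples and match the result against a fixed criterion."""
    aggregated = mean(samples)               # aggregate: a plain average here
    return aggregated, aggregated > threshold  # match: alert when above threshold


# Hypothetical collected signal: CPU utilisation (%) sampled from one instance.
cpu_samples = [62.0, 71.5, 93.0, 97.5]
average, alert = evaluate(cpu_samples, threshold=80.0)
print(f"avg={average:.1f}% alert={alert}")
```

A real monitoring system would collect the samples continuously and evaluate the rule over a sliding time window.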
Slide 7
Slide 7 text
7
Microservices Drawbacks
Sh*** happens
Slide 8
Slide 8 text
Fallacies of Distributed
Computing
Slide 9
Slide 9 text
Fallacies of Distributed Computing
9
● Network is Reliable
● Latency is Zero
● Bandwidth is Infinite
Slide 10
Slide 10 text
10
Microservices imply “some” challenges
● Understand how microservices connect to each other
● Network latencies can be a bottleneck (intense IPC)
● The network can be unreliable
● Keep control of the up-and-running instances
● More non-functional requirements
Slide 11
Slide 11 text
11
Slide 12
Slide 12 text
Metrics
Slide 13
Slide 13 text
13
Metrics are the only way to get your
job done
Slide 14
Slide 14 text
14
RED pattern to monitor Services
Slide 15
Slide 15 text
15
R (Rate) - the number of requests per second
Slide 16
Slide 16 text
16
E (Errors) - the number of failed requests per
second
Slide 17
Slide 17 text
17
D (Duration) - the distribution of the amount of
time each request takes
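As a rough sketch, the three RED signals can be derived from a window of request records. The `Request` type, the choice of "status >= 500" as the definition of a failed request, and the nearest-rank percentile are all assumptions made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    status: int         # HTTP status code of the response
    duration_ms: float  # time taken to serve the request


def red_summary(requests, window_seconds):
    """Compute Rate, Errors and Duration percentiles for one time window."""
    durations = sorted(r.duration_ms for r in requests)
    # Assumption: a request counts as failed when the status is 5xx.
    failed = sum(1 for r in requests if r.status >= 500)

    def percentile(p):
        # Nearest-rank percentile over the sorted durations.
        if not durations:
            return 0.0
        rank = max(1, math.ceil(p * len(durations) / 100))
        return durations[rank - 1]

    return {
        "rate": len(requests) / window_seconds,  # R: requests per second
        "errors": failed / window_seconds,       # E: failed requests per second
        "p50_ms": percentile(50),                # D: duration distribution
        "p99_ms": percentile(99),
    }


# Ten hypothetical requests observed in a 5-second window, two of them failing.
window = [Request(500 if i >= 8 else 200, 10.0 * (i + 1)) for i in range(10)]
summary = red_summary(window, window_seconds=5)
print(summary)
```

In practice a metrics library would usually record durations in a histogram rather than keeping every request in memory.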
Slide 18
Slide 18 text
“The benefit of treating each service
the same, from a monitoring
perspective, is scalability in your
operations teams”
18
Slide 19
Slide 19 text
Use case
Slide 20
Slide 20 text
20
Slide 21
Slide 21 text
21
Slide 22
Slide 22 text
Distributed
Tracing
Slide 23
Slide 23 text
How does it work?
23
● Assign a unique ID to each external request
● Pass it to all services that are involved
● Include the request ID in log messages
● Record timing information, e.g. start and end
time
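The steps above can be sketched with the Python standard library only. The `X-Request-ID` header name is a common convention assumed here, and `order_service` and `inventory_service` are hypothetical services that call each other in-process instead of over HTTP.

```python
import contextvars
import logging
import time
import uuid

HEADER = "X-Request-ID"  # assumed header name, a common convention
request_id = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record):
        record.request_id = request_id.get()
        return True


handler = logging.StreamHandler()
handler.addFilter(RequestIdFilter())
handler.setFormatter(logging.Formatter("%(levelname)s [%(request_id)s] %(message)s"))
log = logging.getLogger("tracing-sketch")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False


def inventory_service(headers):
    """Downstream service: reuses the ID it received."""
    request_id.set(headers[HEADER])
    start = time.monotonic()
    log.info("reserving stock")
    log.info("stock reserved in %.1f ms", (time.monotonic() - start) * 1000)
    return headers[HEADER]


def order_service(headers):
    """Edge service: assigns an ID if the caller did not send one."""
    rid = headers.get(HEADER) or str(uuid.uuid4())
    request_id.set(rid)
    start = time.monotonic()
    log.info("order received")
    downstream_id = inventory_service({HEADER: rid})  # pass the ID along
    log.info("order handled in %.1f ms", (time.monotonic() - start) * 1000)
    return rid, downstream_id


rid, seen_downstream = order_service({})
```

Every log line from both services carries the same ID, which is what lets a log aggregator stitch one request back together.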
Slide 24
Slide 24 text
OpenTracing OpenTelemetry
24
● Cloud Native Computing Foundation (CNCF) project
● It standardizes the instrumentation of apps
for distributed tracing
Slide 25
Slide 25 text
OpenTracing OpenTelemetry
Concepts
25
● Trace tells the story of a transaction
● Span represents a single call
● Distributed tracing systems collect the spans, and
we can see the resulting graph in a nice interface
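To make the trace and span concepts concrete, here is a minimal data-model sketch. It is not the OpenTelemetry API: the `Span` fields, the span names and the timings are invented for illustration, and `render` stands in for the graph a tracing UI would draw.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    trace_id: str             # shared by every span of one transaction
    span_id: str              # identifies this single call
    parent_id: Optional[str]  # None for the root span
    name: str
    start_ms: int
    end_ms: int


def render(spans, parent_id=None, depth=0):
    """Rebuild the call tree of a trace from its flat list of spans."""
    lines = []
    children = sorted(
        (s for s in spans if s.parent_id == parent_id),
        key=lambda s: s.start_ms,
    )
    for span in children:
        indent = "  " * depth
        lines.append(f"{indent}{span.name} ({span.end_ms - span.start_ms} ms)")
        lines.extend(render(spans, span.span_id, depth + 1))
    return lines


# Spans arrive from different services in arbitrary order.
spans = [
    Span("t1", "c", "a", "inventory.reserve", 20, 45),
    Span("t1", "a", None, "GET /orders", 0, 120),
    Span("t1", "b", "a", "auth.verify", 5, 15),
]
tree = render(spans)
print("\n".join(tree))
```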
Slide 26
Slide 26 text
26
Slide 27
Slide 27 text
Logs
Slide 28
Slide 28 text
Use the 5 W’s!
28
● who
● what
● when
● where
● why
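One possible way to apply the five W's is to emit each log entry as a structured record with one field per question. The field values and the `five_w_record` helper below are hypothetical; only the five keys come from the list above.

```python
import json
import logging
from datetime import datetime, timezone


def five_w_record(who, what, where, why, when=None):
    """Build one JSON log line that answers who, what, when, where and why."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "who": who,
            "what": what,
            "when": when.isoformat(),
            "where": where,
            "why": why,
        }
    )


logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("five-w-sketch")

line = five_w_record(
    who="user:42",
    what="payment declined",
    where="payment-service/charge",
    why="card expired",
    when=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
)
log.warning(line)
```

Structured records like this are easier to filter once the logs from many services are aggregated in one place.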
Slide 29
Slide 29 text
Use severity correctly!!!
29
● INFO
● DEBUG
● WARNING
● ERROR
Slide 30
Slide 30 text
30
Slide 31
Slide 31 text
Aggregate Logs
Microservices are distributed systems
Slide 32
Slide 32 text
32
Slide 33
Slide 33 text
33
Slide 34
Slide 34 text
What are my
options to get
observability
done?
34
Slide 35
Slide 35 text
There are two ways to solve
this problem
35
Slide 36
Slide 36 text
Before Service Mesh
Slide 37
Slide 37 text
37
Slide 38
Slide 38 text
38
Things to think about
Slide 39
Slide 39 text
Concerns about observability in the
app
39
● Increases the size of the application
● Configuration has to be done inside the
application
● Consumes the application’s resources
● More control to “customize” metrics and
distributed tracing
● There is no sidecar involved
Slide 40
Slide 40 text
After Service Mesh
Slide 41
Slide 41 text
41
Slide 42
Slide 42 text
42
Things to think about
Slide 43
Slide 43 text
Concerns about observability with
sidecar
43
● One more thing to care about
● The control plane has to configure the sidecars
● Less intrusive
● The developers can focus on business rules
● It is a kind of industry standard today
Slide 44
Slide 44 text
44
Slide 45
Slide 45 text
Progressive App
Delivery with
Argo CD and Argo Rollouts
Slide 46
Slide 46 text
Progressive App Delivery
46
● Rolling out new features gradually
● Avoid downtime as much as possible
● Stateless applications are mandatory
● The versions should be backwards
compatible
● Blue-Green, Canary Release and others
Slide 47
Slide 47 text
47
Argo CD is a declarative, GitOps
continuous delivery tool for
Kubernetes
Slide 48
Slide 48 text
48
Slide 49
Slide 49 text
49
Argo Rollouts is a Kubernetes controller and set of CRDs which
provide advanced deployment capabilities such as blue-green,
canary, canary analysis, experimentation, and progressive
delivery features to Kubernetes.
Slide 50
Slide 50 text
50
But how does it connect to the observability
stuff?
Slide 51
Slide 51 text
51
Slide 52
Slide 52 text
52
It should be Automated
Slide 53
Slide 53 text
With everything
measured, we can
automate the release
process
53
Slide 54
Slide 54 text
54
If more than 95% of HTTP calls return a 2xx status code:
Release is good to go!
Else:
Ohhh sh****!
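The rule above can be sketched as a small release gate. The function names and the sample counts are hypothetical, and reading the "else" branch as a rollback is an interpretation; in a real setup the status counts would come from the metrics backend that the rollout controller queries.

```python
def success_ratio(status_counts):
    """Share of HTTP calls that returned a 2xx status code."""
    total = sum(status_counts.values())
    if total == 0:
        return 0.0  # no traffic observed: treat as not healthy
    ok = sum(count for status, count in status_counts.items() if 200 <= status < 300)
    return ok / total


def release_decision(status_counts, threshold=0.95):
    """Promote the new version only when the success ratio beats the threshold."""
    return "promote" if success_ratio(status_counts) > threshold else "rollback"


# Hypothetical status-code counts observed on the new version.
healthy_canary = {200: 960, 201: 10, 404: 10, 500: 20}
broken_canary = {200: 900, 500: 100}

print(release_decision(healthy_canary))
print(release_decision(broken_canary))
```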