Slide 1

Slide 1 text

Cloud Run + Observability / Reliability @ KAUCHE Yuki Ito (@mrno110) Cloud Run Casual Talk #2

Slide 2

Slide 2 text

KAUCHE Architect / Platform Team Yuki Ito @mrno110

Slide 3

Slide 3 text

No content

Slide 4

Slide 4 text

Agenda • Architecture • Approaches - Logs - Traces - Metrics - SLI / SLO

Slide 5

Slide 5 text

Architecture (diagram; labels: Run, Tasks, Pub/Sub, Mobile App, External Service, Mobile API, Web Hook API, Job API, Scheduler)

Slide 6

Slide 6 text

Architecture

Slide 7

Slide 7 text

Observability "our definition of “observability” for software systems is a measure of how well you can understand and explain any state your system can get into, no matter how novel or bizarre. ... If you can understand any bizarre or novel state without needing to ship new code, you have observability." https://www.oreilly.com/library/view/observability-engineering/9781492076438/

Slide 8

Slide 8 text

Observability - Goal ✅ Enable new members to understand system states on Day 1.

Slide 9

Slide 9 text

Agenda • Architecture • Approaches - Logs - Traces - Metrics - SLI / SLO

Slide 10

Slide 10 text

Logs Cloud Run generates two types of logs: • Request logs • Container logs https://cloud.google.com/run/docs/logging

Slide 11

Slide 11 text

Logs Cloud Run generates two types of logs: • Request logs • Container logs https://cloud.google.com/run/docs/logging

Slide 12

Slide 12 text

Logs Cloud Run generates Request Logs

Slide 13

Slide 13 text

Logs Cloud Run generates two types of logs: • Request logs • Container logs https://cloud.google.com/run/docs/logging

Slide 14

Slide 14 text

Logs Container (Application) Logs Structured Log

Slide 15

Slide 15 text

Logs Request Logs + Container Logs Correlate Logs https://cloud.google.com/run/docs/logging#correlate-logs

Slide 16

Slide 16 text

Logs Request Logs Container Logs Correlate Logs

Slide 17

Slide 17 text

Logs Correlate Logs

Slide 18

Slide 18 text

Logs Container (Application) Logs
{
  "message": "grpc request",
  "logger": "grpc.request_logger",
  "method": "/customer.v1.CustomerService/GetXXX",
  "level": "info",
  "timestamp": 1613885945098.689,
  "logging.googleapis.com/trace": "projects/.../traces/xxx"
}
https://cloud.google.com/logging/docs/structured-logging
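As a minimal sketch of how a container log entry like the one above could be produced, the following builds a JSON line whose `logging.googleapis.com/trace` field is derived from the `X-Cloud-Trace-Context` request header, following the approach in the Cloud Run "correlate logs" documentation. The function names `trace_field` and `log_entry`, the project ID, and the header value are hypothetical; the other field names simply mirror the slide and are not necessarily what the KAUCHE services emit.

```python
import json
import time


def trace_field(project_id: str, trace_header: str) -> str:
    """Build the logging.googleapis.com/trace value from X-Cloud-Trace-Context."""
    # Header format is TRACE_ID/SPAN_ID;o=OPTIONS. Only the trace ID is used here.
    trace_id = trace_header.split("/", 1)[0]
    return f"projects/{project_id}/traces/{trace_id}"


def log_entry(message: str, project_id: str, trace_header: str, **fields) -> str:
    """Return one structured log line (JSON) carrying the trace correlation field."""
    entry = {
        "message": message,
        "level": "info",                  # field name mirrors the slide
        "timestamp": time.time() * 1000,  # milliseconds, as in the slide's example
        **fields,
        "logging.googleapis.com/trace": trace_field(project_id, trace_header),
    }
    return json.dumps(entry)


if __name__ == "__main__":
    # Hypothetical project ID and header value, for illustration only.
    print(log_entry(
        "grpc request",
        "my-project",
        "105445aa7843bc8bf206b12000100000/1;o=1",
        logger="grpc.request_logger",
    ))
```

Writing such a line to stdout from the container lets Cloud Logging group it with the request log that shares the same trace.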

Slide 19

Slide 19 text

Agenda • Architecture • Approaches - Logs - Traces - Metrics - SLI / SLO

Slide 20

Slide 20 text

Cloud Trace

Slide 21

Slide 21 text

OpenTelemetry OpenTelemetry is a collection of tools, APIs, and SDKs. Use it to instrument, generate, collect, and export telemetry data (metrics, logs, and traces) to help you analyze your software’s performance and behavior. https://opentelemetry.io/

Slide 22

Slide 22 text

Trace Just tracing is not enough...

Slide 23

Slide 23 text

Trace Attributes

Slide 24

Slide 24 text

Trace Events

Slide 25

Slide 25 text

Trace Correlate Logs
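One way to link a span to its logs, sketched under assumptions: take the W3C `traceparent` value of the current span context and turn it into the Cloud Logging correlation fields (`trace`, `spanId`, `trace_sampled`). The helper name `correlation_fields` and the project ID are hypothetical, and the slides do not show how KAUCHE wires this up.

```python
import re

# W3C Trace Context header: version-traceid-spanid-flags, all lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def correlation_fields(project_id: str, traceparent: str) -> dict:
    """Map a traceparent value to Cloud Logging's trace correlation fields."""
    match = _TRACEPARENT.match(traceparent.strip())
    if match is None:
        # Malformed or missing header: emit the log without correlation fields.
        return {}
    _version, trace_id, span_id, flags = match.groups()
    return {
        "logging.googleapis.com/trace": f"projects/{project_id}/traces/{trace_id}",
        "logging.googleapis.com/spanId": span_id,
        # The lowest bit of the flags byte is the "sampled" flag.
        "logging.googleapis.com/trace_sampled": bool(int(flags, 16) & 0x01),
    }


if __name__ == "__main__":
    fields = correlation_fields(
        "my-project",  # hypothetical project ID
        "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
    )
    print(fields)
```

Merging the returned dictionary into each structured log entry attaches the log line to the matching span.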

Slide 26

Slide 26 text

Agenda • Architecture • Approaches - Logs - Traces - Metrics - SLI / SLO

Slide 27

Slide 27 text

Metrics Cloud Run Logging Monitoring Metrics Log Log Based Metrics

Slide 28

Slide 28 text

Metrics Log Based Metrics
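To illustrate the idea behind a log-based counter metric, the sketch below counts structured log lines that match a filter and groups them by a label. Cloud Logging does this server-side from a metric definition; this local version only shows the concept. The function name `count_matching` and the sample entries are hypothetical, with field names borrowed from the structured log example earlier in the deck.

```python
import json
from collections import Counter


def count_matching(log_lines, logger: str, label_key: str) -> Counter:
    """Count log entries emitted by `logger`, grouped by the value of `label_key`."""
    counts = Counter()
    for line in log_lines:
        entry = json.loads(line)
        if entry.get("logger") == logger:          # the "filter" of the metric
            label = entry.get(label_key, "unknown")  # the "label" of the metric
            counts[label] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log lines for illustration only.
    lines = [
        '{"logger": "grpc.request_logger", "method": "/a.v1.A/Get"}',
        '{"logger": "grpc.request_logger", "method": "/a.v1.A/Get"}',
        '{"logger": "grpc.request_logger", "method": "/a.v1.A/List"}',
        '{"logger": "other", "method": "/a.v1.A/Get"}',
    ]
    for method, n in count_matching(lines, "grpc.request_logger", "method").items():
        print(method, n)
```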

Slide 29

Slide 29 text

Metrics Cloud Run OpenTelemetry Monitoring Metrics Metrics OpenTelemetry

Slide 30

Slide 30 text

Agenda • Architecture • Approaches - Log - Trace - Metrics - SLI / SLO

Slide 31

Slide 31 text

Cloud Monitoring

Slide 32

Slide 32 text

Cloud Monitoring PromQL
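The slides do not show the SLI definitions or the PromQL queries themselves. As a hedged illustration of the arithmetic that an availability SLI and its SLO typically involve, the sketch below computes the ratio of good requests to total requests and the fraction of error budget left. The function names and the numbers are hypothetical and are not taken from KAUCHE's setup.

```python
def availability_sli(good: int, total: int) -> float:
    """Fraction of requests that were good; treated as 1.0 when there is no traffic."""
    return good / total if total else 1.0


def error_budget_remaining(good: int, total: int, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative once the SLO is breached)."""
    allowed_bad = (1 - slo_target) * total  # failures the SLO tolerates in the window
    actual_bad = total - good
    if allowed_bad == 0:
        # No budget at all (no traffic, or a 100% target).
        return 0.0 if actual_bad else 1.0
    return 1 - actual_bad / allowed_bad


if __name__ == "__main__":
    # Hypothetical window: 10,000 requests, 5 failures, 99.9% availability target.
    print(availability_sli(9995, 10000))
    print(error_budget_remaining(9995, 10000, 0.999))
```

In Cloud Monitoring the good and total counts would come from metric queries rather than literal numbers.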

Slide 33

Slide 33 text

Agenda • Architecture • Approaches - Log - Trace - Metrics - SLI / SLO

Slide 34

Slide 34 text

No content