Enable effective Observability with Python

Enable effective observability with Python Ernesto Arbitrio

🤓 Who am I
Senior Software Engineer at Crunch.io / YouGov.
Passionate about Python. Vice President of Python Italia. I love cooking and eating 🍷 🍝 🥩


What is observability?

Looking inside the system

The 3 pillars of observability
-Logs
-Metrics
-Traces

Everything is based on events
-Logging: recording events
-Metrics: data combined from measuring events
-Tracing: recording events with causal ordering
Credit: codahale

Everything is based on events
-Log: single events (response time)
-Metric: aggregated data (response time)
-Trace: detailed tree (response time)

Logs show response time
10.100.5.3 - - [23/Feb/2018:10:27:30 +0530] GET /store/ HTTP/1.1 200 6406 - Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36 493ms
This request took 493ms!
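To make the "single event" idea concrete, here is a minimal sketch that pulls the response time out of the log line above with a regular expression. The field layout is an assumption inferred from this one sample line (timestamp in brackets, then method, path, protocol, status, size, and the response time as the last field); the names LOG_PATTERN and parse_log_line are hypothetical.

```python
import re

# Sample access-log line from the slide (response time is the last field).
LOG_LINE = (
    '10.100.5.3 - - [23/Feb/2018:10:27:30 +0530] GET /store/ HTTP/1.1 200 6406 - '
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/64.0.3282.167 Safari/537.36 493ms'
)

# Assumed layout: ... [timestamp] METHOD PATH PROTOCOL STATUS SIZE ... <N>ms
LOG_PATTERN = re.compile(
    r'\[(?P<timestamp>[^\]]+)\]\s+'
    r'(?P<method>[A-Z]+)\s+(?P<path>\S+)\s+\S+\s+'
    r'(?P<status>\d{3})\s+(?P<size>\d+)\s+'
    r'.*?(?P<response_ms>\d+)ms$'
)


def parse_log_line(line):
    """Return the log event as a dict, or None if the line does not match."""
    match = LOG_PATTERN.search(line.strip())
    if match is None:
        return None
    event = match.groupdict()
    # Convert the numeric fields so they can be compared and aggregated.
    for key in ("status", "size", "response_ms"):
        event[key] = int(event[key])
    return event


event = parse_log_line(LOG_LINE)
print(event["path"], event["status"], event["response_ms"])
```

A log gives you exactly this: one event, one response time, easy to read but with no notion of whether the value is unusual.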

Metrics show response time
Is 493ms slow? How fast were most requests at 10:27 AM?
[Chart: response times by day of week (SUN to FRI) and time of day (12:00 AM to 10:00 PM), on a scale from SLOWEST to FASTEST]
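Answering "is 493ms slow?" means aggregating many events. The sketch below groups response times per minute and computes nearest-rank percentiles. All sample values except the 493 ms request are made up for illustration, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def summarize_by_minute(events):
    """Group (minute, response_ms) pairs and aggregate each group."""
    buckets = defaultdict(list)
    for minute, response_ms in events:
        buckets[minute].append(response_ms)
    return {
        minute: {
            "count": len(times),
            "p50": percentile(times, 50),
            "p95": percentile(times, 95),
            "max": max(times),
        }
        for minute, times in buckets.items()
    }


# Hypothetical response times in ms; only the 493 ms value comes from the slide.
events = [
    ("10:27", 120), ("10:27", 95), ("10:27", 493), ("10:27", 110), ("10:27", 130),
    ("10:28", 101), ("10:28", 99),
]
summary = summarize_by_minute(events)
print(summary["10:27"])
```

With these invented numbers the median at 10:27 is 120 ms, so the 493 ms request stands out as an outlier. That is the kind of trend a metric exposes and a single log line cannot.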

Traces show response time
What caused the request to take ~493 ms?

Thoughts
-Log: easy to “grep”, manually read
-Metric: trend identification
-Trace: identify cause across services

Distributed tracing

Distributed tracing
-Span ID
-Parent Span ID
-Start time
-End time
-Additional context (metadata)

Distributed tracing
-Store FE, ID: 1, PID: none, start: 8:30, end: 8:50
-Catalog, ID: 2, PID: 1, start: 8:32, end: 8:40
-Stock, ID: 3, PID: 2, start: 8:34, end: 8:36
-Stock, ID: 4, PID: 2, start: 8:35, end: 8:37
-Stock, ID: 5, PID: 2, start: 8:36, end: 8:38
-Auth, ID: 6, PID: 1, start: 8:40, end: 8:47
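The span fields listed above are enough to rebuild the call tree. The following sketch models a span as a small dataclass and links children to parents through the parent span ID. It uses the six spans from the slide; treating the start and end values as "H:MM" clock times is an assumption, and the Span class and helper functions are hypothetical, not an OpenTelemetry API.

```python
from dataclasses import dataclass, field
from typing import Optional


def to_minutes(clock):
    """Convert an "H:MM" string to minutes (assumed unit for the slide values)."""
    hours, minutes = clock.split(":")
    return int(hours) * 60 + int(minutes)


@dataclass
class Span:
    name: str
    span_id: int
    parent_id: Optional[int]
    start: str
    end: str
    children: list = field(default_factory=list)

    def duration_minutes(self):
        return to_minutes(self.end) - to_minutes(self.start)


def build_tree(spans):
    """Attach every span to its parent and return the root spans."""
    by_id = {span.span_id: span for span in spans}
    roots = []
    for span in spans:
        if span.parent_id is None:
            roots.append(span)
        else:
            by_id[span.parent_id].children.append(span)
    return roots


def render(span, depth=0):
    """Return the tree as indented text lines, children ordered by start time."""
    lines = [f"{'  ' * depth}{span.name} (id={span.span_id}, {span.duration_minutes()} min)"]
    for child in sorted(span.children, key=lambda s: to_minutes(s.start)):
        lines.extend(render(child, depth + 1))
    return lines


SPANS = [
    Span("Store FE", 1, None, "8:30", "8:50"),
    Span("Catalog", 2, 1, "8:32", "8:40"),
    Span("Stock", 3, 2, "8:34", "8:36"),
    Span("Stock", 4, 2, "8:35", "8:37"),
    Span("Stock", 5, 2, "8:36", "8:38"),
    Span("Auth", 6, 1, "8:40", "8:47"),
]
roots = build_tree(SPANS)
tree_lines = render(roots[0])
print("\n".join(tree_lines))
```

Reading the rendered tree, the Store FE span covers the whole request, and its time is split between the Catalog call (with three overlapping Stock calls underneath) and the Auth call.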

Sampling
Do you really need all of this data? Or would the right sampling be sufficient?
-Traces with errors
-Traces with high latency
-Traces with specific attributes
-Traces that finish with no issues: a statistically representative sample of all OK traces
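One way to express these sampling rules is a decision function applied once a trace has finished. This is only a sketch in plain Python: the FinishedTrace class, the 400 ms threshold, the 10% rate for OK traces, and the attribute names are all assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class FinishedTrace:
    trace_id: str
    duration_ms: float
    has_error: bool = False
    attributes: dict = field(default_factory=dict)


def keep_trace(trace, *, latency_threshold_ms=400.0, ok_sample_rate=0.1,
               interesting_attributes=None, rng=random.random):
    """Decide whether a finished trace should be stored."""
    interesting_attributes = interesting_attributes or {}
    # Always keep traces with errors.
    if trace.has_error:
        return True
    # Always keep traces with high latency.
    if trace.duration_ms >= latency_threshold_ms:
        return True
    # Always keep traces that carry a specific attribute value.
    for key, value in interesting_attributes.items():
        if trace.attributes.get(key) == value:
            return True
    # Otherwise keep only a random share of the traces that finished OK.
    return rng() < ok_sample_rate


slow = FinishedTrace("t-1", duration_ms=493)
print(keep_trace(slow))
```

The rng parameter exists only so the random branch can be made deterministic in tests; in normal use the default random.random is enough.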

Sampling techniques
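As one possible technique, the decision can also be taken up front from the trace ID alone, so that every service reaches the same verdict for the same trace. The sketch below is a plain-Python illustration of ratio-based sampling on the low 64 bits of a 128-bit trace ID. It is similar in idea to the ratio samplers found in tracing SDKs but is not any library's actual implementation, and the function names are hypothetical.

```python
import secrets

# Only the low 64 bits of the 128-bit trace ID are compared against the bound.
TRACE_ID_MASK = (1 << 64) - 1


def new_trace_id():
    """Generate a random 128-bit trace ID as an integer."""
    return secrets.randbits(128)


def should_sample(trace_id, ratio):
    """Keep roughly `ratio` of all traces, deterministically per trace ID."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    bound = round(ratio * (TRACE_ID_MASK + 1))
    return (trace_id & TRACE_ID_MASK) < bound


trace_id = new_trace_id()
print(should_sample(trace_id, 0.25))
```

Because the verdict depends only on the ID, no coordination between services is needed. The trade-off is that the decision is made before anyone knows whether the trace will contain an error or be slow.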

How to?
OpenTelemetry is a collection of APIs, SDKs, and tools. Use it to instrument, generate, collect, and export telemetry data (metrics, logs, and traces). It helps you analyze your software’s performance and behavior.

A bit of context

Reduce tools

Using the OTEL collector

Architecture of OTEL collector

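Example

    # Imports are not shown on the slide; these are the usual module paths.
    import os

    import requests
    from flask import Flask, render_template
    from opentelemetry import trace
    from opentelemetry.exporter.jaeger.thrift import JaegerExporter
    from opentelemetry.instrumentation.flask import FlaskInstrumentor
    from opentelemetry.instrumentation.requests import RequestsInstrumentor
    from opentelemetry.sdk.resources import SERVICE_NAME, Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    # Register a tracer provider that tags every span with the service name.
    trace.set_tracer_provider(
        TracerProvider(resource=Resource.create({SERVICE_NAME: "ecommerce-web-service"}))
    )

    app = Flask(__name__)
    # Auto-instrument incoming Flask requests and outgoing `requests` calls.
    FlaskInstrumentor().instrument_app(app)
    RequestsInstrumentor().instrument()

    # Export finished spans in batches to the Jaeger agent.
    jaeger_exporter = JaegerExporter(agent_host_name="jaeger", agent_port=6831)
    trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(jaeger_exporter))

    products_api_url = os.environ.get("PRODUCTS_API_URL")
    tracer = trace.get_tracer(__name__)

    @app.get("/")
    def index():
        with tracer.start_as_current_span("/ GET"):
            with tracer.start_as_current_span("/ products"):
                r = requests.get(products_api_url)
                items = r.json()["items"]
                return render_template("index.html", items=items)

    if __name__ == "__main__":
        app.run(debug=os.environ.get("DEBUG") or True, host="0.0.0.0", port=5001)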

Example (same code as on the previous slide), annotated: Manually sent to backend

DEMO   🤞

Thanks!
github.com/ernestoarbitrio
