Observability of your app in production

Kurt Nelson, Senior Software Engineer, Pinterest (@kurtisnelson)
September 1st, 2022

Today’s Talk: Observing
1. What is it?
2. When do I set it up?
3. How do I do it?
4. Now what?

Observability is the ability to see what is happening in production, on user devices, with your binary. Unstructured logging, crash reporting, performance metrics, and analytics are the common forms.

As soon as an app is in the public’s hands, it should have basic, manual observability. Don’t rely on just a QA process.

Google provides the basics, for free! Bigger companies often roll their own for at least part of their observability stack, usually analytics, so that it can tie into other data pipelines.

Bread & Butter

Traditional Logging: Think Logcat output, non-fatal stack traces, and unexpected but recoverable behavior, all sent as strings.

Crash Reporting: The crash itself is a structured log; a good system will collate these with relevant unstructured logs. Clustering and attaching of metadata pertaining to the build is also performed.

Structured Logging: This is a catch-all term, and is often specific to the business needs of your application. Think performance data, analytic events, or ad impression data.

Unstructured Logging: Think JSON without a well-defined schema, or Logcat output on the terminal. Extra work required for machine consumption.

Structured Logging: Think something represented by a data class, protocol buffer, or Thrift definition. Easily consumed by machines.
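To make the contrast concrete, here is a minimal sketch of one event logged both ways. The ImageLoadEvent type, its fields, and the helper functions are hypothetical names invented for illustration; they do not come from the talk or from any particular logging library.

```kotlin
// Hypothetical event type; the field names are illustrative only.
data class ImageLoadEvent(
    val screen: String,
    val durationMs: Long,
    val cacheHit: Boolean
)

// Unstructured: a free-form string that a machine has to pick apart again.
fun unstructuredLine(event: ImageLoadEvent): String =
    "Loaded image on ${event.screen} in ${event.durationMs}ms (cache hit: ${event.cacheHit})"

// Structured: fixed, typed fields that a pipeline can consume directly.
fun structuredFields(event: ImageLoadEvent): Map<String, Any> = mapOf(
    "screen" to event.screen,
    "duration_ms" to event.durationMs,
    "cache_hit" to event.cacheHit
)

// The "extra work" for machine consumption: recovering a number from the string.
fun parseDurationMs(line: String): Long? =
    Regex("""in (\d+)ms""").find(line)?.groupValues?.get(1)?.toLongOrNull()
```

The structured form keeps the duration as a number from the start, while the string form only yields it back through a regular expression that breaks as soon as someone rewords the message.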

Do you use structured logging?

Unstructured data is always better than no data. (Break out grep in case of emergency.)

Getting the data flowing into systems is the important part. The process resulting from observability is second. Actions to take based on the data can be determined on the fly until a process is established.

Toasting your Bread & Butter

Experimentation: Observing the known unknowns.

Alerting: Observing the unknown unknowns.

Feature Flags/Circuit Breakers: Tools to trigger when you observe something wrong.
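As a rough illustration of the circuit breaker idea, the sketch below switches a feature off after a run of consecutive failures and serves a fallback instead. The CircuitBreaker class, its execute method, and the threshold are assumptions made for this example, not an API from the talk or from a specific library, and the class is not thread-safe.

```kotlin
// Hypothetical circuit breaker; class name and threshold are illustrative.
class CircuitBreaker(private val failureThreshold: Int) {
    private var consecutiveFailures = 0

    // Open means the guarded feature is switched off.
    val isOpen: Boolean
        get() = consecutiveFailures >= failureThreshold

    // Runs the feature while the breaker is closed, otherwise returns the fallback.
    fun <T> execute(fallback: T, feature: () -> T): T {
        if (isOpen) return fallback
        return try {
            // A success resets the failure count.
            feature().also { consecutiveFailures = 0 }
        } catch (e: Exception) {
            consecutiveFailures++
            fallback
        }
    }
}
```

A remotely controlled feature flag plays the same role with a human or a backend rule deciding when to flip it; this sketch only covers the local, automatic case and never closes the breaker again once it has opened.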

When do we implement?

In a microservice architecture, this comes out of the box. But as usual, your mobile app is probably the only piece that does not get it for free.

Google Play Console Vitals: crashes, ANRs, and glaring performance issues.

When do I need more?

Align with success criteria.
Communicate to business stakeholders what you are able to observe!

How can I implement observability?

Structured Crash Logging

Structured Everything Logging

Honorable Mentions

Bugsnag: A commercial crash reporting system.

mobile.dev: Catch performance issues pre-production.

Jaeger Tracing: With proper backend instrumentation, observe the user flow through the backend too.

Big Tech likes to do these in house.

Now what?

Relax. Next time there is an outage, you’ll be equipped. But what if you can avoid the outage in the first place?

Google came up with SRE (Site Reliability Engineering) to describe backend engineers who treat operations as code. For mobile, the service is the app itself. We can crib some of their best practices. See https://sre.google/sre-book for a deep dive!

Once we have observability into our app we can…
1. Alert when things get strange but still seem to work.
2. Guard changes and automatically revert them based on observations.
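The two practices above can be sketched as a single decision function that compares sessions running a change against a baseline without it. Everything here is an assumption for illustration: the Observation and Decision types, the use of a non-fatal error rate as the signal, and the ratio and sample-size thresholds are not taken from the talk.

```kotlin
// Hypothetical types; names and thresholds are assumptions for illustration.
data class Observation(val sessions: Int, val nonFatalErrors: Int) {
    val errorRate: Double
        get() = if (sessions == 0) 0.0 else nonFatalErrors.toDouble() / sessions
}

enum class Decision { KEEP, ALERT, REVERT }

// Compares sessions with a change enabled against a baseline without it.
fun evaluateChange(
    baseline: Observation,
    withChange: Observation,
    alertRatio: Double = 1.5,
    revertRatio: Double = 3.0,
    minSessions: Int = 1000
): Decision {
    // Too little data to judge; keep observing.
    if (withChange.sessions < minSessions) return Decision.KEEP
    return when {
        // Clearly worse than baseline: revert automatically.
        withChange.errorRate > baseline.errorRate * revertRatio -> Decision.REVERT
        // Strange but still working: tell a human.
        withChange.errorRate > baseline.errorRate * alertRatio -> Decision.ALERT
        else -> Decision.KEEP
    }
}
```

In practice the REVERT outcome would flip whatever feature flag guards the change. Note that a baseline error rate of zero makes any error in the changed group trigger a revert, so a real system would likely want an absolute floor as well.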
