Performance Monitoring

Herein we explore a different approach to service level objectives and show how one can measure time with less overhead, enabling accounting of low-latency operations.

Theo Schlossnagle

September 23, 2016

Transcript

  1. PERFORMANCE MONITORING: AND NOW FOR SOMETHING ENTIRELY DIFFERENT @postwait

  2. None
  3. PERFORMANCE IMPACTS PEOPLE. REMEMBER WHY YOU DO THIS

  4. CONSIDER A GOAL: 99TH PERCENTILE AT 1500MS

  5. QUICK TL;DR ON PERCENTILES: THEY AREN’T HARD TO UNDERSTAND, JUST DECEPTIVE AT TIMES. • 99th percentile: q(0.99) • 99% of the samples are lower • 1% of the samples are higher • q(0.99) = 149μs • q(1) = 63ms
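As an aside to make the slide's definition concrete, here is a minimal sketch in C of a nearest-rank quantile over a small set of latency samples; the quantile() and cmp_double() helpers and the sample values are invented for illustration and are not from the talk or from libcircmetrics.

```c
/* Illustrative only: nearest-rank quantile over a sorted sample set. */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

static int cmp_double(const void *a, const void *b) {
  double x = *(const double *)a, y = *(const double *)b;
  return (x > y) - (x < y);
}

/* q(p): the smallest sample such that at least a fraction p of all
 * samples are less than or equal to it (nearest-rank definition). */
static double quantile(double *samples, size_t n, double p) {
  qsort(samples, n, sizeof(*samples), cmp_double);
  size_t rank = (size_t)ceil(p * (double)n);   /* 1-based rank = ceil(p*n) */
  if (rank > 0) rank--;                        /* convert to 0-based index */
  if (rank >= n) rank = n - 1;                 /* clamp for p == 1         */
  return samples[rank];
}

int main(void) {
  /* Hypothetical request latencies in milliseconds. */
  double lat_ms[] = { 0.9, 1.2, 1.4, 2.0, 3.1, 4.7, 9.0, 15.0, 40.0, 500.0 };
  size_t n = sizeof(lat_ms) / sizeof(lat_ms[0]);
  printf("q(0.99) = %g ms   q(1) = %g ms\n",
         quantile(lat_ms, n, 0.99), quantile(lat_ms, n, 1.0));
  return 0;
}
```

With only ten samples, q(0.99) and q(1) coincide at the maximum, which is exactly the kind of small-sample subtlety that makes percentiles "deceptive at times."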
  6. OCCUPY! PERFORMANCE OF THE 99%

  7. None
  8. NOW CONSIDER THE PEOPLE YOU’RE BLIND (1266ms, 860)

  9. PERCENTAGES ARE NOT PEOPLE

  10. None
  11. COMPARE YOUR SLA: TWO MOMENTARY VIOLATIONS VS. AN EPIC OUTAGE

  12. WITH THE ACTUAL TRAGEDY

  13. I KNOW IT SOUNDS CRAZY, BUT WHAT IF I TOLD YOU IT WAS OKAY TO CARE
  14. SYSTEMS: THEY’RE FASTER THAN USERS

  15. PROBE EFFECT: IT’S REAL important_op() vs. st := hrtime(); important_op(); fn := hrtime(); log(fn-st)
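A minimal runnable sketch of the slide's instrumented form, assuming hrtime() corresponds to a high-resolution monotonic clock: clock_gettime(CLOCK_MONOTONIC) stands in for it here, and important_op() is a hypothetical workload. The two extra clock reads and the log call are work the bare important_op() does not do, which is the probe effect the slide is naming.

```c
/* Sketch of the slide's pseudocode; clock_gettime(CLOCK_MONOTONIC)
 * stands in for hrtime(), important_op() is a stand-in workload. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static uint64_t hrtime_ns(void) {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

static void important_op(void) {
  volatile int x = 0;                 /* hypothetical low-latency work */
  for (int i = 0; i < 1000; i++) x += i;
}

int main(void) {
  uint64_t st = hrtime_ns();          /* st := hrtime() */
  important_op();                     /* important_op() */
  uint64_t fn = hrtime_ns();          /* fn := hrtime() */
  printf("important_op: %llu ns\n",   /* log(fn - st)   */
         (unsigned long long)(fn - st));
  return 0;
}
```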
  16. RULES OF PROBING • fixed O(1) operations • no latency bubbles • no allocations
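One way to honor these rules, sketched below as an illustration rather than as libcircmetrics' actual design: record each latency sample into a fixed, preallocated set of power-of-two buckets with a single relaxed atomic increment, so the hot path does bounded constant work, takes no locks that could introduce latency bubbles, and never allocates.

```c
/* Illustrative O(1), allocation-free recording path: preallocated
 * power-of-two latency buckets bumped with one atomic increment. */
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

#define NBUCKETS 64                          /* one bucket per power of two (ns) */

static _Atomic uint64_t buckets[NBUCKETS];   /* allocated up front, never grows */

static void record_latency_ns(uint64_t ns) {
  unsigned idx = 0;
  while (ns >>= 1) idx++;                    /* floor(log2(ns)); at most 63 steps */
  if (idx >= NBUCKETS) idx = NBUCKETS - 1;
  atomic_fetch_add_explicit(&buckets[idx], 1, memory_order_relaxed);
}

int main(void) {
  /* Hypothetical samples in nanoseconds. */
  uint64_t samples_ns[] = { 800, 1200, 1500, 95000, 1266000000ULL };
  for (size_t i = 0; i < sizeof(samples_ns) / sizeof(samples_ns[0]); i++)
    record_latency_ns(samples_ns[i]);
  for (unsigned i = 0; i < NBUCKETS; i++)
    if (buckets[i])
      printf("~2^%u ns: %llu sample(s)\n", i, (unsigned long long)buckets[i]);
  return 0;
}
```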
  17. TIME IS AN ILLUSION, LUNCHTIME DOUBLY SO - Douglas Adams

  18. TIME BUT FASTER: MTEV_TIME https://github.com/circonus-labs/libmtev
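As a hedged illustration of why a faster clock matters (this sketch does not use the mtev_time API), the loop below estimates the average per-call cost of clock_gettime(CLOCK_MONOTONIC); that per-probe overhead is what a cheaper time source aims to shrink.

```c
/* Estimate the per-call cost of clock_gettime(CLOCK_MONOTONIC); this is
 * the overhead every probe pays, and what a faster time source reduces. */
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static uint64_t now_ns(void) {
  struct timespec ts;
  clock_gettime(CLOCK_MONOTONIC, &ts);
  return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

int main(void) {
  const uint64_t iters = 10000000;       /* ten million clock reads */
  uint64_t sink = 0;                     /* keeps calls from being elided */
  uint64_t start = now_ns();
  for (uint64_t i = 0; i < iters; i++) sink += now_ns();
  uint64_t elapsed = now_ns() - start;
  printf("avg clock read: %.1f ns/call (checksum %llu)\n",
         (double)elapsed / (double)iters, (unsigned long long)(sink % 1000));
  return 0;
}
```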

  19. FAST & CORRECT: LIBCIRCMETRICS https://github.com/circonus-labs/libcircmetrics

  20. THE TECHNOLOGY OCTOPUS

  21. FIGHT THE OCTOPUS. GET OUT THERE - @postwait

  22. None