
Monitoring time in a distributed database: a play in three acts

Monitoring time is tricky given its fluid nature, and doing so across distributed database hosts is trickier still. Latency, probe intervals, and clock synchronization all affect the metrics, and taking action based on those metrics makes matters even more complex. How does one measure time? What is the baseline? What accuracy and tradeoffs can we expect? Can we use time itself to affect the outcome? At GitHub, we monitor time in our database topologies for throttling and for consistent reads. We present our use case and our findings.

Shlomi Noach

May 14, 2019

Transcript

  1. Monitoring time in distributed databases: a play in three acts
    Shlomi Noach
    GitHub
    StatsCraft 2019


  2. Agenda
    TL;DR: time adventures and mishaps

    Throttling
    Consistent reads
    And all that follows


  3. About me
    @github/database-infrastructure
    Author of orchestrator, gh-ost, freno, ccql
    and other open source tools.
    Blog at http://openark.org

    github.com/shlomi-noach

    @ShlomiNoach


  4. GitHub

    Built for developers
    Largest open source hosting
    100M+ repositories

    36M+ developers

    1B+ contributions
    Largest supplier of octocat T-Shirts and stickers


  5. Asynchronous replication
    Single writer node
    Asynchronous replicas
    Multi layered
    Scale reads across replicas

  6. Replication lag
    Desired behavior: smallest possible lag
    • Consistent reads (aka read your own writes)
    • Faster/lossless/less lossy failovers

  7. Replication lag

  8. Replication lag

  9. Measuring lag via heartbeat
    Inject heartbeat on master
    Read replicated value on replica, compare with time now()

  10. Inject and read
    Heartbeat generated locally on writer node
    (diagram: heartbeat injected on the writer node; each replica reads the value and compares)

  11. Heartbeat
    create table heartbeat (
      anchor int unsigned not null,
      ts timestamp(6),
      primary key (anchor)
    );

  12. Heartbeat: inject on master
    create table heartbeat (
      anchor int unsigned not null,
      ts timestamp(6),
      primary key (anchor)
    );
    replace into heartbeat values (
      1, now(6)
    );

  13. Heartbeat: read on replica
    create table heartbeat (
      anchor int unsigned not null,
      ts timestamp(6),
      primary key (anchor)
    );
    select
      unix_timestamp(now(6)) - unix_timestamp(ts) as lag
    from
      heartbeat
    where
      anchor = 1;

  14. Replication lag: graphing

  15. Objective: throttling


  16. Throttling
    Break large writes into small tasks
    Allow writes to take place if lag is low
    Hold off writes when lag is high
    Threshold: 1sec
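
    A minimal Go sketch of this throttling loop; checkLag and copyNextChunk are hypothetical
    placeholders (at GitHub the lag check would be a call to freno), and the 1sec threshold is
    the one above:

    package throttle

    import (
        "context"
        "time"
    )

    // Hypothetical helpers, not a real API:
    //   checkLag asks the lag monitor for the cluster's current aggregated lag.
    //   copyNextChunk performs one small write task (e.g. copies one batch of
    //   rows) and reports whether the overall job is done.
    var (
        checkLag      func(ctx context.Context) (time.Duration, error)
        copyNextChunk func(ctx context.Context) (done bool, err error)
    )

    const lagThreshold = time.Second // the 1sec threshold above

    // copyTableInChunks breaks a large write into small tasks and only issues
    // the next task while replication lag is below the threshold.
    func copyTableInChunks(ctx context.Context) error {
        for {
            lag, err := checkLag(ctx)
            if err != nil || lag > lagThreshold {
                // Lag is high (or unknown): hold off writes and re-check shortly.
                time.Sleep(100 * time.Millisecond)
                continue
            }
            done, err := copyNextChunk(ctx)
            if err != nil {
                return err
            }
            if done {
                return nil
            }
        }
    }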


  17. Heartbeat injection
    (timeline: heartbeat injected on the master at 15:07:00.000)

  18. Heartbeat injection: applied on replica
    (timeline: injected at 15:07:00.000; applied on the replica at 15:07:00.004)

  19. Heartbeat injection: read by app
    (timeline: injected at 15:07:00.000; applied at 15:07:00.004; app reads at 15:07:00.007 and measures lag 0.007s)

  20. Heartbeat injection: delayed app read
    (timeline: injected at 15:07:00.000; applied at 15:07:00.004; app reads at 15:07:00.047 and measures lag 0.047s)

  21. Heartbeat injection: delayed apply
    (timeline: injected at 15:07:00.000; applied at 15:07:00.044; app reads at 15:07:00.047 and measures lag 0.047s)

  22. Heartbeat injection: granularity
    +50ms


  23. Practical constraints


  24. Lag monitor service
    We use freno to monitor replication lag:
    • Polls all replicas at a 50ms interval
    • Aggregates data per cluster at a 25ms interval
    • https://githubengineering.com/mitigating-replication-lag-and-reducing-read-load-with-freno/
    • https://github.com/github/freno
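
    A rough Go sketch of that poll-and-aggregate loop, using the 50ms and 25ms intervals above;
    probeReplica and the "report the worst replica" aggregation rule are illustrative assumptions,
    not freno's actual implementation:

    package lagmonitor

    import (
        "sync"
        "time"
    )

    // probeReplica is a hypothetical helper that runs the heartbeat SELECT shown
    // earlier against one replica and returns the measured lag.
    var probeReplica func(replica string) (time.Duration, error)

    // Monitor keeps the latest lag sample per replica plus an aggregated
    // per-cluster value that clients ask about.
    type Monitor struct {
        mu         sync.Mutex
        replicas   []string
        lags       map[string]time.Duration // latest sample per replica
        clusterLag time.Duration            // aggregated value served to clients
    }

    // Run polls every replica at a 50ms interval and aggregates at a 25ms
    // interval, mirroring the intervals listed above.
    func (m *Monitor) Run() {
        poll := time.NewTicker(50 * time.Millisecond)
        aggregate := time.NewTicker(25 * time.Millisecond)
        for {
            select {
            case <-poll.C:
                for _, r := range m.replicas {
                    if lag, err := probeReplica(r); err == nil {
                        m.mu.Lock()
                        if m.lags == nil {
                            m.lags = make(map[string]time.Duration)
                        }
                        m.lags[r] = lag
                        m.mu.Unlock()
                    }
                }
            case <-aggregate.C:
                m.mu.Lock()
                worst := time.Duration(0)
                for _, lag := range m.lags {
                    if lag > worst {
                        worst = lag // assumption: aggregate = worst replica lag
                    }
                }
                m.clusterLag = worst
                m.mu.Unlock()
            }
        }
    }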


  25. Heartbeat injection
    (timeline: heartbeat injected on the master at 15:07:00.000)

  26. Heartbeat injection: applied on replica
    (timeline: injected at 15:07:00.000; applied on the replica at 15:07:00.004)

  27. Heartbeat injection: read by freno
    (timeline: injected at 15:07:00.000; applied at 15:07:00.004; freno samples at 15:07:00.007 and records lag 0.007s)

  28. Heartbeat injection: read by app
    (timeline: freno recorded lag 0.007s at 15:07:00.007; the app asks freno at 15:07:00.009)

  29. Heartbeat injection: delayed app read
    (timeline: freno recorded lag 0.007s at 15:07:00.007; the app asks freno only at 15:07:00.048 and still gets that sample)

  30. Delayed app read, broken replica
    (timeline: freno recorded lag 0.007s at 15:07:00.007; the replica has since broken, yet the app asking at 15:07:00.048 still gets the stale 0.007s sample)

  31. Heartbeat injection with freno: granularity
    ±50ms


  32. Actual safety margins:
    50ms freno sampling interval
    25ms freno aggregation interval
    Allow additional 25ms for “extra complications”
    Total 100ms


  33. Throttling: granularity is not important


  34. Granularity is important


  35. Objective: consistent reads


  36. Consistent reads, aka read-your-own-writes
    A classic problem of distributed databases
    (diagram: the app writes to the master, then expects to read that data back from a replica)

  37. Consistent read checks
    App asks freno:
    “I made a write 350ms ago. Are all replicas up to date?”
    The client automatically adds a 100ms error margin,
    so we compare replication lag against 250ms
    (diagram: write to the master, check with freno, read from a replica)
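
    A Go sketch of that check from the client's side; clusterLag is a hypothetical stand-in for
    querying freno (whose real interface is an HTTP API), and the 100ms margin is the one
    derived earlier:

    package consistentreads

    import "time"

    // clusterLag is a hypothetical stand-in for asking freno for the cluster's
    // aggregated replication lag.
    var clusterLag func() (time.Duration, error)

    // errorMargin covers sampling and aggregation delays, per the margins above.
    const errorMargin = 100 * time.Millisecond

    // replicasCaughtUp answers: a write was made `elapsed` ago; have all
    // replicas caught up with it? The margin is deducted first, so a write made
    // 350ms ago is compared against a 250ms lag, as in the example above.
    func replicasCaughtUp(elapsed time.Duration) bool {
        lag, err := clusterLag()
        if err != nil {
            // Unknown lag: play it safe and read from the writer instead.
            return false
        }
        return lag <= elapsed - errorMargin
    }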


  38. Everything is terrible
    100ms is where interesting stuff happens, and it’s within our error margin.


  39. The metrics dilemma
    Can’t we just reduce the interval?


  40. Beyond our control


  41. High latency networks
    Minimal lag

  42. Latency: consistent reads
    App close to writer node, far from replica
    (diagram: the app writes to the nearby writer node and checks lag on the distant replica)

  43. Latency: consistent reads
    App close to writer node, far from replica

  44. Skewed clocks


  45. Heartbeat injection
    (timeline: heartbeat injected on the master at 15:07:00.000)

  46. Heartbeat injection: applied on skewed replica
    (timeline: injected at 15:07:00.000; applied on the replica at 15:07:00.004, but the replica's skewed clock reads 15:06:59.994)

  47. Heartbeat injection: read by app
    (timeline: injected at 15:07:00.000; applied at 15:07:00.004 with the replica clock reading 15:06:59.994; app reads at 15:07:00.007 and measures lag -0.003s)

  48. Heartbeat injection on skewed master
    (timeline: heartbeat injected at 15:07:00.000, but the skewed master clock stamps it 15:07:00.025)

  49. Heartbeat injection: applied on skewed replica
    (timeline: heartbeat stamped 15:07:00.025 by the skewed master; applied on the replica at 15:07:00.004)

  50. Heartbeat injection: read by app
    (timeline: heartbeat stamped 15:07:00.025; applied at 15:07:00.004; app reads at 15:07:00.007 and measures lag -0.018s)
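
    Both skewed readings follow the same identity:
    measured lag = (replica clock at read time) - (heartbeat timestamp written by the master)
                 = true elapsed time since injection + (replica clock offset - master clock offset)
    With the replica 10ms behind: 0.007 + (-0.010 - 0) = -0.003. With the master 25ms ahead:
    0.007 + (0 - 0.025) = -0.018. Clock skew shifts every measurement by the relative offset
    between the two clocks, regardless of how often we poll.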


  51. Granularity limitation


  52. Everything is still terrible


  53. Atomic clocks


  54. Clock synchronization: verification


  55. A late mitigation


  56. An untimely postlude: can we do without clocks?


  57. Consensus protocols


  58. Lamport timestamps


  59. MySQL: GTID
    Each transaction generates a GTID:
    00020192-1111-1111-1111-111111111111:830541
    Each server keeps track of gtid_executed: all transactions ever executed:
    00020192-1111-1111-1111-111111111111:1-830541
    SELECT GTID_SUBSET(
      '00020192-1111-1111-1111-111111111111:830541',
      @@gtid_executed
    );
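
    As a sketch of how this enables read-your-own-writes without relying on clocks: assume the
    app captured the GTID of its own write (for example via MySQL's session_track_gtids; the
    capture mechanism is outside this sketch), then ask a replica whether that GTID is already
    contained in its gtid_executed. A minimal Go helper:

    package gtidreads

    import (
        "context"
        "database/sql"
    )

    // writeHasReplicated reports whether the transaction identified by gtid
    // (e.g. "00020192-1111-1111-1111-111111111111:830541", captured right after
    // the write) is already contained in the replica's gtid_executed set.
    func writeHasReplicated(ctx context.Context, replica *sql.DB, gtid string) (bool, error) {
        var ok bool
        err := replica.QueryRowContext(ctx,
            "SELECT GTID_SUBSET(?, @@gtid_executed)", gtid,
        ).Scan(&ok)
        return ok, err
    }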


  60. And yet the search for time metrics endures…


  61. Questions?
    github.com/shlomi-noach
    @ShlomiNoach
    Thank you!
