Slide 1

Slide 1 text

Maslow’s Hierarchy of Needs For Databases
Charity Majors, CTO honeycomb.io
@mipsytipsy

Slide 3

Slide 3 text

Dr. Abraham Maslow, Psychologist
real talk about storage systems … for non-DBAs

Slide 4

Slide 4 text

“Database Reliability Engineering” with Laine Campbell

Slide 5

Slide 5 text

Dr DevOps is in the house breakin’ down yr silos

Slide 6

Slide 6 text

Maslow’s hierarchy of needs

Slide 7

Slide 7 text

Databases: Should Be Just Like Other Services!

Slide 8

Slide 8 text

“self-actualized database”
• Empowering for developers
• Resilient to common failures
• Friendly to operations (debuggable, understandable)

Slide 9

Slide 9 text

The “right database” is the one that empowers you to achieve your mission.

Slide 10

Slide 10 text

Survival

Slide 11

Slide 11 text

Selecting a storage system
• Choose boring technology, when you can
• Reuse solutions. Resist software sprawl
• Spend innovation tokens only on key differentiators
h/t @mcfunley

Slide 12

Slide 12 text

• If your company is very young, optimize for velocity and developer productivity.
• The more mature your company is, the more long-term operational impact trumps everything else.

Slide 13

Slide 13 text

Safety

Slide 14

Slide 14 text

Backups
• How much data can you afford to lose?
• Monitor backup freshness
• Archive remotely. Test your restore process.
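
One way to monitor backup freshness is to compare the age of the newest backup against the amount of data loss you can tolerate. The sketch below is only an illustration: the function names and the 24-hour threshold are assumptions, not something the talk prescribes.

```python
from datetime import datetime, timedelta, timezone


def backup_age(last_backup_at: datetime, now: datetime) -> timedelta:
    """How old the most recent successful backup is."""
    return now - last_backup_at


def backup_is_fresh(last_backup_at: datetime, now: datetime,
                    max_age: timedelta) -> bool:
    """True if the newest backup is no older than the tolerated data-loss window."""
    return backup_age(last_backup_at, now) <= max_age


# Hypothetical timestamps: the last backup finished 30 hours before "now".
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_backup_at = datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc)

# Assumed tolerance: at most 24 hours of lost data.
fresh = backup_is_fresh(last_backup_at, now, timedelta(hours=24))
print("backup fresh:", fresh)
```

A check like this would typically feed an alert, so a silently failing backup job is noticed before the backup is needed.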

Slide 15

Slide 15 text

If you are not regularly testing restores, you have Schrödinger’s backups.
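
A restore test can be as simple as restoring into a scratch instance and checking that the data is actually there. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for whatever database you run; the table name and the row-count comparison are assumptions chosen to keep the example small, and a real test would restore from the archived backup rather than from the live source.

```python
import sqlite3


def row_count(conn: sqlite3.Connection, table: str) -> int:
    """Count rows in a table (table name is trusted here, not user input)."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def restore_matches(source: sqlite3.Connection,
                    restored: sqlite3.Connection, table: str) -> bool:
    """Very rough verification: the restored copy has the same number of rows."""
    return row_count(source, table) == row_count(restored, table)


# Build a tiny "production" database in memory.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
source.executemany("INSERT INTO users (name) VALUES (?)",
                   [("ada",), ("grace",)])
source.commit()

# "Back up and restore" into a scratch database using SQLite's backup API.
restored = sqlite3.connect(":memory:")
source.backup(restored)

restore_ok = restore_matches(source, restored, "users")
print("restore verified:", restore_ok)
```

Row counts are a weak check; comparing checksums or running application-level queries against the restored copy gives more confidence.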

Slide 16

Slide 16 text

Replication
• Multiple live copies
• Consider write concern, replication factor, quorum, votes, dedicated backup nodes …
• Distribute across AZs, regions, or DCs
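
The arithmetic behind quorum and replication factor is worth having at hand. The sketch below assumes a simple majority-vote model and the usual quorum overlap condition (write acknowledgements plus read replicas exceeding the replication factor); actual behavior depends on the specific database, and the function names are made up for illustration.

```python
def majority(voting_nodes: int) -> int:
    """Smallest number of votes that is more than half of the voters."""
    return voting_nodes // 2 + 1


def tolerated_failures(voting_nodes: int) -> int:
    """How many voters can be lost while a majority can still be formed."""
    return voting_nodes - majority(voting_nodes)


def reads_overlap_writes(replication_factor: int, write_acks: int,
                         read_replicas: int) -> bool:
    """Quorum overlap: every read set shares at least one node with every write set."""
    return write_acks + read_replicas > replication_factor


for n in (3, 4, 5):
    print(n, "voters: majority", majority(n),
          "tolerates", tolerated_failures(n), "failures")
```

Note that four voters tolerate no more failures than three, which is one reason odd-sized clusters are common.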

Slide 17

Slide 17 text

Failover
• What happens when any node dies?
• Practice and document under controlled circumstances
• Human SPOFs are just as bad as machine SPOFs

Slide 18

Slide 18 text

Ponder your path to horizontal scalability
Forecasting more than 10x growth is mostly impossible
Just try to not screw over your future self.

Slide 19

Slide 19 text

Love & Belonging

Slide 20

Slide 20 text

“Show me the DevOps!”
Making your data a first-class citizen of both dev and ops processes

Slide 21

Slide 21 text

Dev, Ops, DBA
• Use the same code review and deploy processes
• Use the same infrastructure provisioning and config management
• Data-land should feel familiar and intuitive, not alienating or “special”

Slide 22

Slide 22 text

Embrace failure

Slide 23

Slide 23 text

Guard Rails
• Backups, restores. Verified.
• Unit tests
• Boring failovers
• Shared on call rotations
• Network isolation between prod & other
• Don’t get pissy when people mess up, help them fix it
• Don’t swoop in and do it all yourself

Slide 24

Slide 24 text

Right-sizing your automation
• Indexing
• Scaling up or down
• Failover
• Killing bad queries
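
As one illustration of automating "killing bad queries", the sketch below only selects candidates: queries that have run longer than a threshold and are not replication traffic. The `RunningQuery` shape, the field names, and the 30-second limit are all hypothetical; a real tool would read this from the database's own process list and issue the database's own kill command.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunningQuery:
    # Hypothetical record; real field names depend on the database.
    query_id: int
    seconds_running: float
    is_replication: bool = False


def queries_to_kill(queries: List[RunningQuery],
                    max_seconds: float) -> List[int]:
    """Return ids of long-running queries, never touching replication threads."""
    return [
        q.query_id
        for q in queries
        if q.seconds_running > max_seconds and not q.is_replication
    ]


running = [
    RunningQuery(query_id=1, seconds_running=2.5),
    RunningQuery(query_id=2, seconds_running=95.0),
    RunningQuery(query_id=3, seconds_running=4000.0, is_replication=True),
]

# Assumed limit of 30 seconds for interactive traffic.
print("would kill:", queries_to_kill(running, max_seconds=30.0))
```

Starting with a dry run that only reports candidates is a cautious way to right-size this kind of automation before letting it act.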

Slide 25

Slide 25 text

Esteem

Slide 26

Slide 26 text

Your services need:
• Observability
• Instrumentation
• Graceful degradation

Slide 27

Slide 27 text

Your humans need:
• Tools for debugging
• Checklists, documentation
• Tractable levels of graphs and alerting

Slide 28

Slide 28 text

Lifecycle of instrumentation
Is it up? Is it slow?
Canned graphs for system and db metrics
Hand crafted dashboards
Lots of outages
Collection of “heroic” debugging commands
Automated query profiling / analysis
Auto-remediation
Unique request ids, full-stack tracing
Realtime exploratory tools
Predictive analysis / precognition

Slide 29

Slide 29 text

Golden rules for alerting
• Emphasize end-to-end checks on key indicators
• Health of the service, not individual nodes
• Page only if actionable
• Track auto-remediation events
• Shared dev/ops rotation by service owners
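
A paging decision that follows these rules looks at the service as a whole rather than at any single node. The sketch below is one possible reading, with assumed inputs: an end-to-end check result and a count of healthy nodes, plus an arbitrary 50% threshold. None of these names or numbers come from the talk.

```python
def should_page(end_to_end_ok: bool, healthy_nodes: int, total_nodes: int,
                min_healthy_fraction: float = 0.5) -> bool:
    """Page a human only when the service itself is failing or close to it."""
    if not end_to_end_ok:
        # Users are affected: always actionable.
        return True
    if total_nodes == 0:
        # Nothing is serving at all.
        return True
    # A single dead node is not page-worthy while enough capacity remains.
    return healthy_nodes / total_nodes < min_healthy_fraction


print("one node down, service fine:", should_page(True, 4, 5))
print("end-to-end check failing:", should_page(False, 5, 5))
print("most nodes down:", should_page(True, 2, 5))
```

Node-level failures that do not page can still be recorded, which fits the advice to track auto-remediation events.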

Slide 31

Slide 31 text

Self-Actualization

Slide 32

Slide 32 text

basically … (credit: @zakgreant)

Slide 33

Slide 33 text

Self-actualized database
• Your data is safe and recoverable
• Resilient to common errors
• Understandable, debuggable
• Shares processes and tooling with the rest of the stack
• Empowers you to achieve your mission.

Slide 34

Slide 34 text

in conclusion
Don’t be scared.
Mind your damn backups.
Everything else will probably be ok.