Deconstructing an Abstraction to Reconstruct an Outage

Abstractions are what allow us to build the complex applications that we all use day-to-day. For example, it's rare for us to care about the precise details of on-disk storage when building an application — that's why databases exist!

Debugging is different though. It forces us to break through those abstractions in order to understand what the computer is really doing.

In this talk, we'll explore the aftermath of a complex outage in a Postgres cluster. We'll retrace the steps we took to reliably reproduce the failure in a local environment and pull out lessons about debugging complex systems along the way. At one point, we'll dive into the depths of how Postgres represents data on disk, and realise that even unfamiliar layers of a system don't need to be scary.

Chris Sinjakli

March 28, 2023

Transcript

  1. Deconstructing an Abstraction to Reconstruct an Outage
    sinjo.dev

    View Slide

  2. A familiar


    story 📚

    View Slide

  7. [Chart: API response status over time, shown as the percentage of 2xx vs 5xx responses]

    View Slide

  8. DB::ConnectionFailure - could not connect to server: Connection refused 💥

    View Slide

  10. Hi

    View Slide

  11. sinjo.dev

    View Slide

  12. sinjo.dev

    View Slide

  13. Infra Engineer

    View Slide

  14. Databases &


    Distributed Systems


    😍

    View Slide

  17. Deconstructing an Abstraction to Reconstruct an Outage
    sinjo.dev

    View Slide

  18. First:


    Our cluster setup

    View Slide

  19. Postgres
    API backend

    View Slide

  20. Postgres
    Postgres
    Postgres
    Repl Repl
    API backend

    View Slide

  21. Postgres
    Postgres
    Postgres Repl Repl
    Pacemaker Pacemaker Pacemaker
    API backend

    View Slide

  22. Postgres
    Postgres
    Postgres Repl Repl
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend

    View Slide

  23. Postgres
    Postgres
    Postgres Repl Repl
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend

    View Slide

  24. Postgres
    Postgres
    Postgres Repl Repl
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend

    View Slide

  25. Postgres
    Postgres
    Postgres
    Repl
    Pacemaker Pacemaker Pacemaker
    API backend
    VIP

    View Slide

  26. Postgres
    Postgres
    Postgres
    Repl
    VIP
    Pacemaker Pacemaker Pacemaker
    API backend

    View Slide

  27. Postgres
    Postgres
    Postgres
    Repl
    VIP
    Pacemaker Pacemaker Pacemaker
    API backend

    View Slide

  28. Postgres
    Postgres
    Postgres
    Repl
    VIP
    Pacemaker Pacemaker Pacemaker
    API backend

    View Slide

  29. Postgres
    Postgres
    Postgres
    Repl
    VIP
    Pacemaker Pacemaker Pacemaker
    Repl
    API backend

    View Slide

  30. Note:


    One replica

    always synchronous

    View Slide

  31. So...

    View Slide

  32. So...
    Unfortunately...

    View Slide

  33. Postgres
    Postgres
    Postgres Repl Repl
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend

    View Slide

  34. Postgres
    Postgres
    Postgres Repl Repl
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend

    View Slide

  35. Postgres
    Postgres
    Postgres Repl Repl
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend

    View Slide

  36. Except it


    didn't

    View Slide

  37. Our API


    was down

    View Slide

  38. Fallback:


    fully manual setup

    View Slide

  39. Postgres
    Postgres
    Postgres
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend
    👩💻

    View Slide

  40. Postgres
    Postgres
    Postgres
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend
    👩💻

    View Slide

  41. Postgres
    Postgres
    Postgres
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend
    👩💻
    Repl

    View Slide

  42. Postgres
    Postgres
    Postgres
    Pacemaker Pacemaker Pacemaker
    👩💻
    Repl
    API backend

    View Slide

  43. Postgres
    Postgres
    Postgres
    Pacemaker Pacemaker Pacemaker
    👩💻
    Repl
    API backend
    Repl

    View Slide

  44. https://gocardless.com/blog/incident-review-api-and-dashboard-outage-on-10th-october/

    View Slide

  45. We're safe,


    for now...

    View Slide

  46. But only one


    failure away


    from downtime

    View Slide

  47. Mission:


    Recreate the outage

    View Slide

  48. There's a lot


    We'll go step-by-step

    View Slide

  49. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  50. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  51. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  52. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  53. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  54. 2023-02-24 17:23:01 GMT LOG: restored log file "000000020000000000000003" from archive
    2023-02-24 17:23:02 GMT LOG: invalid record length at 0/3000180

    Suspicious log on synchronous replica

    View Slide

  55. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  56. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  57. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  58. Everyone's
    favourite fault-
    injection tool

    View Slide

  59. You know


    it well...

    View Slide

  60. KILL(1)                 General Commands Manual                 KILL(1)

    NAME
         kill – terminate or signal a process

    SYNOPSIS
         kill [-s signal_name] pid ...
         kill -l [exit_status]
         kill -signal_name pid ...
         kill -signal_number pid ...

    DESCRIPTION
         The kill utility sends a signal to the processes specified by the pid operands.

         Only the super-user may send signals to other users' processes.

         The options are as follows:

    View Slide

  62. # on primary - hard kill
    kill -SIGKILL

    # on synchronous replica - subprocess crash
    kill -SIGABRT

    View Slide

  63. # on primary - hard kill
    kill -SIGKILL

    # on synchronous replica - subprocess crash
    kill -SIGABRT

    View Slide
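
The same two signals can also be sent from a script. The sketch below is not from the talk: it uses Ruby's `Process.kill` against a throwaway `sleep` child (an assumption standing in for a real Postgres process) to show how a hard kill and an abort look from the parent's side.

```ruby
# Spawn a disposable child process to stand in for a Postgres process.
# (Assumption: a Unix-like system with a `sleep` binary on the PATH.)
def kill_and_report(signal)
  pid = Process.spawn("sleep", "60")
  Process.kill(signal, pid)        # same effect as `kill -SIG<signal> <pid>`
  _, status = Process.wait2(pid)   # reap the child and collect its exit status
  { signaled: status.signaled?, termsig: status.termsig }
end

hard_kill = kill_and_report("KILL")  # primary: hard kill
crash     = kill_and_report("ABRT")  # synchronous replica: subprocess crash

puts "SIGKILL -> #{hard_kill}"
puts "SIGABRT -> #{crash}"
```

Both children end up terminated by a signal rather than exiting normally, which is the property a supervisor such as Pacemaker or the Postgres postmaster would react to.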

  64. We kept our


    expectations


    low...

    View Slide

  65. ...which was


    the right


    choice

    View Slide

  66. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  67. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  68. 2023-02-24 17:23:01 GMT LOG: restored log file "000000020000000000000003" from archive
    2023-02-24 17:23:02 GMT LOG: invalid record length at 0/3000180

    Suspicious log on synchronous replica

    View Slide

  69. What do we
    mean by "log"?

    View Slide

  70. [2023-02-26 23:02:37Z] GET / - 200


    [2023-02-26 23:02:49Z] GET /favicon.ico - 200


    [2023-02-26 23:02:52Z] POST /login - 200


    [2023-02-26 23:33:52Z] POST /posts - 201


    [2023-02-26 23:33:57Z] GET /posts/binary-logs—talk - 200
    What we normally mean by logs

    View Slide

  71. A different kind
    of log:


    binary logs

    View Slide

  72. INSERT INTO users VALUES ('codd');


    INSERT INTO users VALUES ('lovelace');


    INSERT INTO users VALUES ('turing');
    Some extremely boring SQL

    View Slide

  75. Warning:


    simplifying lie ahead

    View Slide

  76. INSERT INTO users VALUES ('codd');


    INSERT INTO users VALUES ('lovelace');


    INSERT INTO users VALUES ('turing');





    Wrote 'codd' into table 'users'


    Wrote 'lovelace' into table 'users'


    Wrote 'turing' into table 'users'
    A different kind of logs

    (if they were textual)

    View Slide

  77. Postgres calls these
    "Write Ahead Logs"


    (WALs)

    View Slide

  78. But why bother
    doing that?

    View Slide

  79. Crash safety

    View Slide

  80. Index Table
    id username
    1 codd
    2 lovelace
    id
    1
    2

    View Slide

  81. Index Table
    id username
    1 codd
    2 lovelace
    3 turing
    id
    1
    2

    View Slide

  82. Index Table
    id
    1
    2
    💥 .
    id username
    1 codd
    2 lovelace
    3 turing

    View Slide

  83. Index Table
    id
    1
    2
    ???
    id username
    1 codd
    2 lovelace
    3 turing

    View Slide

  84. We can replay this operation
    INSERT INTO users VALUES ('codd');


    INSERT INTO users VALUES ('lovelace');


    INSERT INTO users VALUES ('turing');





    Wrote 'codd' into table 'users'


    Wrote 'lovelace' into table 'users'


    Wrote 'turing' into table 'users'

    View Slide

  85. Index Table
    id
    1
    2
    ???
    id username
    1 codd
    2 lovelace
    3 turing

    View Slide

  86. Index Table
    id
    1
    2
    3
    id username
    1 codd
    2 lovelace
    3 turing

    View Slide
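
The crash-and-replay sequence in the last few slides can be sketched as a toy model. This is a deliberate simplification in the spirit of the "simplifying lie" warning: the class and method names are hypothetical, and real WAL records are binary and describe page-level changes rather than rows.

```ruby
# Toy model only: not how Postgres stores or replays its WAL.
class ToyDatabase
  attr_reader :wal, :table, :index

  def initialize
    @wal = []     # append-only log of intended writes
    @table = {}   # id => username
    @index = []   # ids known to the index
  end

  # Log first, then apply. `crash_before_index` simulates a crash that
  # lands after the table write but before the index write.
  def insert(id, username, crash_before_index: false)
    @wal << [id, username]
    @table[id] = username
    return if crash_before_index
    @index << id
  end

  # Recovery replays every logged write; replaying a write twice is harmless.
  def recover
    @wal.each do |id, username|
      @table[id] = username
      @index << id unless @index.include?(id)
    end
  end
end

db = ToyDatabase.new
db.insert(1, "codd")
db.insert(2, "lovelace")
db.insert(3, "turing", crash_before_index: true)
puts "index before recovery: #{db.index.inspect}"
db.recover
puts "index after recovery:  #{db.index.inspect}"
```

Because the log entry is written before the table and index are touched, recovery always has enough information to finish the half-applied insert.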

  87. Also:


    replication

    View Slide

  88. Postgres
    Postgres
    Postgres
    Repl Repl
    API backend

    View Slide

  89. Postgres
    Postgres
    Postgres
    Repl Repl
    API backend
    WALs

    View Slide

  90. 2023-02-24 17:23:01 GMT LOG: restored log file "000000020000000000000003" from archive
    2023-02-24 17:23:02 GMT LOG: invalid record length at 0/3000180

    Suspicious log on synchronous replica

    View Slide

  91. 2023-02-24 17:23:01 GMT LOG: restored log file "000000020000000000000003" from archive
    2023-02-24 17:23:02 GMT LOG: invalid record length at 0/3000180

    Suspicious log on synchronous replica

    View Slide

  92. 2 ways to replicate

    View Slide

  93. Streaming replication


    &


    WAL archival

    View Slide

  94. Postgres
    Postgres
    Postgres
    Repl Repl
    Streaming replication

    View Slide

  95. WAL archival

    View Slide

  96. Primary
    WAL archival

    View Slide

  97. Primary
    WAL archival
    archive_command

    View Slide

  98. Primary
    WAL archival
    Replica
    archive_command
    restore_command

    View Slide

  99. Why use


    both?

    View Slide

  100. Reduce in-cluster


    storage


    &


    Reduce


    bootstrap load

    View Slide

  101. Suspicious log on synchronous replica
    2023-02-24 17:23:01 GMT LOG: restored log file "000000020000000000000003" from archive
    2023-02-24 17:23:02 GMT LOG: invalid record length at 0/3000180

    View Slide

  102. Issue restoring WAL





    Cause of failure to
    promote replica?

    View Slide

  103. We already had
    those writes!

    View Slide

  104. Just because something
    shouldn't happen


    doesn't mean it


    didn't happen

    View Slide

  105. 2023-02-24 17:23:01 GMT LOG: restored log file "000000020000000000000003" from archive
    2023-02-24 17:23:02 GMT LOG: invalid record length at 0/3000180

    Suspicious log on synchronous replica

    View Slide

  106. I had zero experience


    working with


    binary


    formats

    View Slide

  107. None of it


    is magic

    View Slide

  108. We can cheat:


    Postgres is


    open source

    View Slide

  109. But!

    View Slide

  110. These techniques
    also work on closed
    source software

    View Slide

  111. We just call that
    reverse engineering

    View Slide

  112. $ git checkout REL9_4_26 # we were running 9.4
    $ git grep -n "invalid record length"
    src/backend/access/transam/xlogreader.c:295: [...]
    src/backend/access/transam/xlogreader.c:604: [...]
    src/backend/access/transam/xlogreader.c:678: [...]

    Let's find the error

    View Slide

  113. src/backend/access/transam/xlogreader.c:291-300:

        {
            /* XXX: more validation should be done here */
            if (total_len < SizeOfXLogRecord)
            {
                report_invalid_record(state, "invalid record length at %X/%X",
                                      (uint32) (RecPtr >> 32), (uint32) RecPtr);
                goto err;
            }
            gotheader = false;
        }

    Let's find the error

    View Slide

  114. src/backend/access/transam/xlogreader.c:291-300:

        {
            /* XXX: more validation should be done here */
            if (total_len < SizeOfXLogRecord)
            {
                report_invalid_record(state, "invalid record length at %X/%X",
                                      (uint32) (RecPtr >> 32), (uint32) RecPtr);
                goto err;
            }
            gotheader = false;
        }

    Let's find the error

    View Slide

  115. src/backend/access/transam/xlogreader.c:291-300:

        {
            /* XXX: more validation should be done here */
            if (total_len < SizeOfXLogRecord)
            {
                report_invalid_record(state, "invalid record length at %X/%X",
                                      (uint32) (RecPtr >> 32), (uint32) RecPtr);
                goto err;
            }
            gotheader = false;
        }

    Let's find the error

    View Slide

  116. src/include/access/xlog.h:58:

        #define SizeOfXLogRecord MAXALIGN(sizeof(XLogRecord))

    Let's find the error

    View Slide

  117. Wouldn't it be convenient
    if we could make
    total_len == 0?

    View Slide

  118. src/backend/access/transam/xlogreader.c:272-273:

        record = (XLogRecord *) (state->readBuf + RecPtr % XLOG_BLCKSZ);
        total_len = record->xl_tot_len;

    Let's find the error

    View Slide

  119. src/include/access/xlog.h:41:

        typedef struct XLogRecord
        {
            uint32        xl_tot_len;   /* total len of entire record */
            TransactionId xl_xid;       /* xact id */
            uint32        xl_len;       /* total len of rmgr data */
            uint8         xl_info;      /* flag bits, see below */
            RmgrId        xl_rmid;      /* resource manager for this record */
            /* 2 bytes of padding here, initialize to zero */
            XLogRecPtr    xl_prev;      /* ptr to previous record in log */
            pg_crc32      xl_crc;       /* CRC for this record */

            /* If MAXALIGN==8, there are 4 wasted bytes here */

            /* ACTUAL LOG DATA FOLLOWS AT END OF STRUCT */
        } XLogRecord;

    Let's find the error

    View Slide

  120. src/include/access/xlog.h:41-56:

        typedef struct XLogRecord
        {
            uint32        xl_tot_len;   /* total len of entire record */
            TransactionId xl_xid;       /* xact id */
            uint32        xl_len;       /* total len of rmgr data */
            uint8         xl_info;      /* flag bits, see below */
            RmgrId        xl_rmid;      /* resource manager for this record */
            /* 2 bytes of padding here, initialize to zero */
            XLogRecPtr    xl_prev;      /* ptr to previous record in log */
            pg_crc32      xl_crc;       /* CRC for this record */

            /* If MAXALIGN==8, there are 4 wasted bytes here */

            /* ACTUAL LOG DATA FOLLOWS AT END OF STRUCT */
        } XLogRecord;

    Let's find the error

    View Slide

  122. src/backend/access/transam/xlogreader.c:291-300:

        {
            /* XXX: more validation should be done here */
            if (total_len < SizeOfXLogRecord)
            {
                report_invalid_record(state, "invalid record length at %X/%X",
                                      (uint32) (RecPtr >> 32), (uint32) RecPtr);
                goto err;
            }
            gotheader = false;
        }

    What was that check doing?

    View Slide

  123. What was that check doing?
    total_len: size the record says it is
    SizeOfXLogRecord: smallest possible size it can be
    src/backend/access/transam/xlogreader.c:291-300:

        {
            /* XXX: more validation should be done here */
            if (total_len < SizeOfXLogRecord)
            {
                report_invalid_record(state, "invalid record length at %X/%X",
                                      (uint32) (RecPtr >> 32), (uint32) RecPtr);
                goto err;
            }
            gotheader = false;
        }

    View Slide
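
The check above can be mimicked outside Postgres. The following is a minimal sketch rather than the real reader: it assumes a little-endian, 64-bit build where the 28-byte header is padded to 32 bytes by MAXALIGN, and the constant and function names are hypothetical.

```ruby
# Assumption: 64-bit build, where the 28-byte XLogRecord header is padded
# to 32 bytes by MAXALIGN. Treat this constant as illustrative.
SIZE_OF_XLOG_RECORD = 32

# Returns an error message if the record header fails the length check,
# or nil if it passes. `record_bytes` is a binary String starting at the
# record header; xl_tot_len is its first field (uint32).
def check_record_length(record_bytes)
  total_len = record_bytes.unpack1("V")  # "V" = 32-bit unsigned, little-endian
  return "invalid record length" if total_len < SIZE_OF_XLOG_RECORD
  nil
end

healthy = [63].pack("V") + ("\x00".b * 59)  # header claims 63 bytes
zeroed  = [0].pack("V") + ("\x00".b * 59)   # xl_tot_len overwritten with 0

puts check_record_length(healthy).inspect
puts check_record_length(zeroed).inspect
```

A record whose first four bytes decode to anything below the minimum header size, including zero, takes the error path. That is why forcing `total_len` to 0 is a convenient way to trigger the message.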

  124. INSERT INTO users VALUES ('codd');


    INSERT INTO users VALUES ('lovelace');


    INSERT INTO users VALUES ('turing');





    Wrote 'codd' into table 'users'


    Wrote 'lovelace' into table 'users'


    Wrote 'turing' into table 'users'
    A different kind of logs

    (if they were textual)

    View Slide

  125. Let's see what they
    look like in
    practice

    View Slide

  126. INSERT INTO users VALUES ('codd');


    INSERT INTO users VALUES ('lovelace');


    INSERT INTO users VALUES ('turing');
    Some extremely boring SQL

    View Slide

  127. Grab the binary log file, and...

    View Slide

  128. A barely comprehensible

    wall of data 😅

    View Slide

  129. A barely comprehensible

    wall of data 😅
    Hex ASCII

    View Slide

  130. Same data, rendered differently
    Decimal Hexadecimal Character
    62 3E >
    63 3F ?
    64 40 @
    65 41 A
    66 42 B

    View Slide

  131. Decimal Hexadecimal Character
    62 3E >
    63 3F ?
    64 40 @
    65 41 A
    66 42 B
    Same data, rendered differently

    View Slide
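
The table can be regenerated for any string, which helps when reading a hex dump next to its ASCII column. A small sketch follows (the helper name is made up for illustration):

```ruby
# Render each byte of a string as decimal, hexadecimal and its ASCII character.
def render_bytes(data)
  data.bytes.map do |byte|
    format("%-7d %-11s %s", byte, format("%02X", byte), byte.chr)
  end
end

puts format("%-7s %-11s %s", "Decimal", "Hexadecimal", "Character")
puts render_bytes(">?@AB")
```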

  132. Hex ASCII
    Some good news

    View Slide

  133. Some good news
    We can see our users!!

    View Slide

  134. How can we find xl_tot_len?

    View Slide

  135. INSERT INTO repro VALUES ('A');


    INSERT INTO repro VALUES ('AB');


    INSERT INTO repro VALUES ('ABC');


    INSERT INTO repro VALUES ('ABCD');


    INSERT INTO repro VALUES ('ABCDE');


    ...
    Some even more boring SQL

    View Slide

  136. Look for a field increasing by 1

    View Slide
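
The "insert rows that grow by one byte, then look for a field that grows by one" trick can be automated. The sketch below runs on synthetic records, not real WAL output; `fake_record` and its layout are invented purely to give the search something to find.

```ruby
# Given byte strings for successive records, return the offsets whose byte
# value goes up by exactly 1 from each record to the next.
def offsets_increasing_by_one(records)
  shortest = records.map(&:bytesize).min
  (0...shortest).select do |offset|
    values = records.map { |record| record.getbyte(offset) }
    values.each_cons(2).all? { |a, b| b == a + 1 }
  end
end

# Synthetic stand-ins for WAL records (not real Postgres output): a length
# byte, three zero bytes, a fixed filler, then the inserted text.
def fake_record(text)
  body = ("\x00".b * 3) + "hdr".b + text.b
  [body.bytesize + 1].pack("C") + body
end

records = %w[A AB ABC ABCD].map { |text| fake_record(text) }
p offsets_increasing_by_one(records)
```

On real data the candidate offsets still need a sanity check by eye, since unrelated counters (transaction IDs, for example) can also step by one.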

  137. Guesswork incoming!

    View Slide

  138. Guesswork incoming!
    The data we inserted

    View Slide

  139. A little help: ASCII codes
    Decimal Hexadecimal Character
    62 3E >
    63 3F ?
    64 40 @
    65 41 A
    66 42 B

    View Slide

  140. Notice anything?
    The data we inserted

    View Slide

  141. Notice anything?
    The data we inserted
    Familiar characters

    View Slide

  142. Notice anything?
    Decimal Hexadecimal Character
    63 3F ?
    64 40 @
    65 41 A
    The data we inserted
    Familiar characters

    View Slide

  143. Notice anything?
    The data we inserted
    Familiar characters

    View Slide

  144. Notice anything?
    The data we inserted
    Familiar characters

    View Slide

  145. Notice anything?
    The data we inserted
    Familiar characters
    Familiar characters (hex)
    The data we inserted (hex)

    View Slide

  146. Wouldn't it be convenient
    if we could make
    total_len == 0?

    View Slide

  147. We could import the
    Postgres structs and do
    this properly...

    View Slide

  148. ...or we could write a
    regex 🤔

    View Slide

  149. Let's write a regex

    View Slide

  151. Let's break this one

    View Slide

  152. wal_file_name = ARGV[0]
    puts wal_file_name
    wal_contents = IO.read(wal_file_name, encoding: "BINARY")
    hex = wal_contents.unpack("H*").first
    replaced = hex.gsub(/3f(000000.+41424300)/, "00\\1")
    bindata = [replaced].pack("H*")
    File.write(wal_file_name + ".broken", bindata)

    break_wal.rb

    View Slide

  153. wal_file_name = ARGV[0]
    puts wal_file_name
    wal_contents = IO.read(wal_file_name, encoding: "BINARY")
    hex = wal_contents.unpack("H*").first
    replaced = hex.gsub(/3f(000000.+41424300)/, "00\\1") # Replaces 'ABC' size
    bindata = [replaced].pack("H*")
    File.write(wal_file_name + ".broken", bindata)

    break_wal.rb

    View Slide
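
To see what that substitution does without touching a real WAL segment, the same regex can be run on a short hex string. The input below is synthetic, chosen only so the pattern has something to match.

```ruby
# Synthetic hex, not a real WAL segment: a length byte (0x3f), three zero
# bytes, some filler, then 'ABC' followed by a NUL byte.
hex = "aa" + "3f000000" + "deadbeef" + "41424300" + "bb"

# Same substitution as break_wal.rb: keep everything after the length byte
# (capture group 1) and put 00 where 3f used to be.
broken = hex.gsub(/3f(000000.+41424300)/, "00\\1")

puts hex
puts broken
```

Two properties of the pattern are worth keeping in mind on real files: `.+` is greedy, and the match runs over hex digits without regard to byte boundaries. Comparing the original and `.broken` files in a hex viewer is a cheap way to confirm that only the intended byte changed.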

  154. Let's break this one

    View Slide

  155. Broken!!

    View Slide

  156. And if we give it to
    a Postgres

    replica?

    View Slide

  157. 2023-02-28 19:24:11 GMT LOG: restored log file "000000020000000000000003" from archive
    2023-02-28 19:24:11 GMT LOG: invalid record length at 0/3000148

    We reproduced the error!

    View Slide
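
The position in the error message and the file name in the line before it are related. The sketch below shows the arithmetic, assuming the default 16 MiB WAL segment size and taking timeline 2 from the leading `00000002` of the file name; the function name is hypothetical.

```ruby
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MiB segments
SEGMENTS_PER_LOG_ID = 0x1_0000_0000 / WAL_SEGMENT_SIZE

# Split a position such as "0/3000148" into the WAL segment file that
# holds it and the byte offset inside that segment.
def locate_lsn(lsn, timeline)
  high, low = lsn.split("/").map { |part| part.to_i(16) }
  position = (high << 32) | low
  segment_number, offset = position.divmod(WAL_SEGMENT_SIZE)
  file_name = format("%08X%08X%08X", timeline,
                     segment_number / SEGMENTS_PER_LOG_ID,
                     segment_number % SEGMENTS_PER_LOG_ID)
  { file_name: file_name, offset: offset }
end

p locate_lsn("0/3000148", 2)
```

Under those assumptions, the reproduced error at `0/3000148` falls at byte offset 0x148 of the restored segment `000000020000000000000003`.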

  158. Success 😄

    View Slide

  159. Success, with a
    caveat...😔

    View Slide

  160. This wasn't enough
    to reproduce


    the outage

    View Slide

  161. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica

    View Slide

  162. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica


    6. ...

    View Slide

  163. Postgres
    Postgres
    Postgres Repl Repl
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend

    View Slide

  164. Postgres
    Postgres
    Postgres Repl Repl
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend
    Backup


    VIP

    View Slide

  165. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica


    6. Backup VIP on synchronous replica

    View Slide

  166. We added it to
    the cluster

    View Slide

  167. Ran the repro
    script

    View Slide

  168. and...

    View Slide

  169. Success


    (no caveats)


    😄

    View Slide

  170. Abstraction: deconstructed


    Outage: recreated

    View Slide

  171. but...


    why?

    View Slide

  172. Background:


    how Pacemaker
    schedules resources

    View Slide

  173. 2 relevant settings

    View Slide

  174. By default:
    reschedule
    without penalty

    View Slide

  175. Postgres
    Postgres
    Postgres Repl Repl
    Pacemaker Pacemaker Pacemaker
    VIP
    API backend

    View Slide

  176. Postgres
    Postgres
    Postgres
    Repl
    VIP
    Pacemaker Pacemaker Pacemaker
    Repl
    API backend

    View Slide

  177. Setting:


    default-resource-stickiness

    View Slide

  178. By default:
    resources can run
    anywhere

    View Slide

  179. Setting:


    colocation

    View Slide

  180. default-resource-stickiness = 100


    &


    colocation -inf: BackupVIP Primary

    View Slide

  181. default-resource-stickiness = 100


    &


    colocation -inf: BackupVIP Primary

    View Slide

  182. default-resource-stickiness = 100


    &


    colocation -inf: BackupVIP Primary

    View Slide

  183. A very subtle
    semantic
    difference

    View Slide

  184. -1000 -inf

    View Slide

  185. -1000 -inf
    "Avoid
    scheduling
    these together"

    View Slide

  186. -1000 -inf
    "Avoid
    scheduling
    these together"
    "Literally never
    schedule these
    together"

    View Slide

  187. default-resource-stickiness = 100


    &


    colocation -inf: BackupVIP Primary

    View Slide

  188. default-resource-stickiness = 100


    &


    colocation -1000: BackupVIP Primary

    View Slide
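
Why does swapping -inf for -1000 matter when stickiness is only 100? A toy sketch of Pacemaker-style score arithmetic makes the difference visible. It is not Pacemaker's scheduler: it models only the documented rule that anything combined with -INFINITY stays -INFINITY, and the function name is made up.

```ruby
INFINITY = 1_000_000  # Pacemaker treats scores at or beyond this as infinite

# Add two scores using Pacemaker-style rules: -INFINITY beats everything,
# then +INFINITY, otherwise ordinary addition clamped to the infinite range.
def add_scores(a, b)
  return -INFINITY if a <= -INFINITY || b <= -INFINITY
  return INFINITY if a >= INFINITY || b >= INFINITY
  (a + b).clamp(-INFINITY, INFINITY)
end

stickiness = 100
puts add_scores(stickiness, -1000)      # finite: a strong preference
puts add_scores(stickiness, -INFINITY)  # still -INFINITY: a hard ban
```

With a finite penalty the scheduler keeps a ranking it can trade off against other preferences. With -inf there is nothing to trade: the combination is ruled out.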

  189. Failover works


    properly

    View Slide

  190. P.S. The WAL error was a red herring

    View Slide

  191. Sorry

    View Slide

  192. I know it was the most interesting part

    View Slide

  193. and it would have been kinda cool

    View Slide

  194. but it was part of the debugging process

    View Slide

  195. 💖

    View Slide

  196. 1. RAID array loses disks


    2. Kernel sets filesystem read-only


    3. Pacemaker detects primary failure


    4. Synchronous replica crash


    5. Suspicious log on synchronous replica


    6. Backup VIP on synchronous replica

    View Slide

  197. What can we


    learn?

    View Slide

  198. None of the


    stack


    is magic

    View Slide

  199. None of the


    stack


    is magic
    😁

    View Slide

  200. None of the


    stack


    is magic
    😁
    😰

    View Slide

  201. "It's just someone else's
    computer"

    View Slide

  202. "It's just someone else's
    abstraction"

    View Slide

  203. Read


    other people's


    code...

    View Slide

  204. ...and


    try to


    modify it

    View Slide

  205. Automation


    erodes


    knowledge

    View Slide

  206. Game days are a partial fix

    View Slide

  207. "What if we had to
    recover our database
    server manually?"

    View Slide

  208. Don't stop
    questioning
    your repro

    View Slide

  209. 1. No magic in the stack


    2. Automation erodes knowledge


    3. Always question the repro

    View Slide

  211. JSON


    over


    HTTP

    View Slide

  212. Binary


    formats


    are coming


    to web development

    View Slide

  213. Protobuf


    over


    HTTP/2

    View Slide

  214. Protobuf


    over


    HTTP/2
    (e.g. gRPC)

    View Slide

  215. There is tooling
    help

    View Slide

  216. https://buf.build/blog/buf-curl/

    View Slide

  217. https://github.com/wader/fq

    View Slide

  218. Hex: the lowest common
    denominator

    View Slide

  219. It's


    worth


    getting


    familiar

    View Slide

  220. One last thing to
    ask of


    you

    View Slide

  221. Most computing
    happens
    successfully

    View Slide

  222. The


    0.00001%


    * not a real statistic

    View Slide

  223. Outsized


    negative


    impact

    View Slide

  224. It's a shame


    not to


    learn

    View Slide

  225. "We noticed a problem."
    "We fixed the problem."
    "We'll make sure the problem doesn't happen again."

    View Slide

  226. 3 good examples

    View Slide

  227. https://slack.engineering/slacks-outage-on-january-4th-2021/

    View Slide

  228. https://incident.io/blog/intermittent-downtime

    View Slide

  229. https://about.gitlab.com/blog/2017/02/10/postmortem-of-database-outage-of-january-31/

    View Slide

  230. Please share the difficult stories too

    View Slide

  231. Thank you
    ✌❤
    @planetscaledata
    sinjo.dev

    View Slide

  233. Image credits
    • Programmer's Laptop - Wall Boat - Public Domain - https://www.flickr.com/photos/wallboat/36819065315/
    • Pouring Latte Art - Craft Coffee Spot - CC-BY - https://www.flickr.com/photos/195403219@N08/52200966448/
    • microscope - Milosz1 - CC-BY - https://www.flickr.com/photos/mikolski/3269906279
    • Hard Disk Guts - CC-BY - https://www.flickr.com/photos/mattandkim/97533589/
    • Corsair ForceGT 180GB - CC-BY - https://www.flickr.com/photos/ruocaled/8173124575/

    View Slide

  234. Image credits
    • Server - The Noun Project (via WikiMedia) - CC0 - https://commons.wikimedia.org/wiki/File:Server_-_The_Noun_Project.svg
    • Rope - Robo Android - CC-BY - https://www.flickr.com/photos/49140926@N07/6798304070/
    • Stargazing - Max Delaquis - CC-BY - https://www.flickr.com/photos/115000114@N07/28861043652

    View Slide

  235. Questions?
    ✌❤
    @planetscaledata
    sinjo.dev

    View Slide