ко» из коробки Patroni решает вопрос «HA» из HA» из ко» Иногда случаются проблемы разной степени тяжести Расскажу, с чем сталкивались мы у наших клиентов Как диагностировали, что подкрутили в итоге
из ко • Агентская служба, init-системасистема для postgres • Хранит «состояние кластера» в DCS • Автофайловер включен по умолчанию • ... есть много других вещей
автофайловер * - Patroni не должен следить за оборудованием, слать нотификации в слак, делать сложные вещи по потенциальному предотвращению аварий и т.п.
by acquiring session lock pgdb-2 patroni: WARNING: Loop time exceeded, rescheduling immediately. pgdb-2 patroni: INFO: Lock owner: pgdb-2; I am pgdb-2 pgdb-2 patroni: INFO: updated leader lock during promote pgdb-2 patroni: server promoting pgdb-2 patroni: INFO: cleared rewind state after becoming the leader pgdb-2 patroni: INFO: Lock owner: pgdb-2; I am pgdb-2
patroni.utils.RetryFailedError: 'Exceeded retry deadline' pgdb-2 patroni: ERROR: Error communicating with DCS pgdb-2 patroni: INFO: demoted self because DCS is not accessible and i was a leader pgdb-2 patroni: WARNING: Loop time exceeded, rescheduling immediately.
leader by acquiring session lock pgdb-2 patroni: WARNING: Loop time exceeded, rescheduling immediately. pgdb-2 patroni: INFO: Lock owner: pgdb-2; I am pgdb-2 pgdb-2 patroni: INFO: updated leader lock during promote pgdb-2 patroni: server promoting pgdb-2 patroni: INFO: cleared rewind state after becoming the leader pgdb-2 patroni: INFO: Lock owner: pgdb-2; I am pgdb-2
Traceback (most recent call last): ... python trace maindb-1 patroni: RetryFailedError: 'Exceeded retry deadline' maindb-1 patroni: ERROR: Error communicating with DCS maindb-1 patroni: INFO: closed patroni connection to the postgresql cluster maindb-1 patroni: INFO: postmaster pid=214121 ... ... maindb-1 patroni: INFO: demoted self because DCS is not accessible and i was a leader maindb-1 patroni: WARNING: Loop time exceeded, rescheduling immediately. Когда и почему произошел файловер?
I am maindb-1 maindb-1 patroni: INFO: does not have lock maindb-1 patroni: INFO: running pg_rewind from maindb-2 maindb-1 patroni: INFO: running pg_rewind from user=postgres host=192.168.11.18 port=5432 ... maindb-1 patroni: servers diverged at WAL location 1FA/A38FF4F8 on timeline 6 maindb-1 patroni: rewinding from last common checkpoint at 1FA/A38FF450 on timeline 6 maindb-1 patroni: INFO: Lock owner: maindb-2; I am maindb-1 maindb-1 patroni: INFO: running pg_rewind from maindb-2 in progress maindb-1 patroni: Done!
patroni: INFO: Lock owner: maindb-1; I am maindb-1 maindb-1 patroni: INFO: no action. i am the leader with the lock maindb-1 patroni: INFO: Lock owner: maindb-1; I am maindb-1 maindb-1 patroni: INFO: no action. i am the leader with the lock maindb-1 patroni: INFO: Lock owner: maindb-1; I am maindb-1 maindb-1 patroni: INFO: no action. i am the leader with the lock Что предшествовало файловеру?
serf: EventMemberFailed: maindb-1 192.168.11.19 maindb-2 consul[11581]: [INFO] serf: EventMemberFailed: maindb-1 192.168.11.19 maindb-2 consul[11581]: memberlist: Suspect maindb-1 has failed, no acks received maindb-2 consul[11581]: [INFO] memberlist: Suspect maindb-1 has failed, no acks received Они сомневаются в существовании и других соседей тоже... И вы тоже сомневаетесь... Что предшествовало файловеру?
patroni[1425]: ERROR: failed to update leader lock Pg02 patroni[1425]: INFO: demoted self because failed to update leader lock in DCS pg02 patroni[1425]: WARNING: Loop time exceeded, rescheduling immediately. pg02 patroni[1425]: INFO: Lock owner: None; I am pg02.dev.zirconus.local pg02 patroni[1425]: INFO: not healthy enough for leader race pg02 patroni[1425]: INFO: starting after demotion in progress
promoted self to leader by acquiring session lock pg03 patroni[9648]: server promoting pg03 patroni[9648]: INFO: cleared rewind state after becoming the leader
с перемотки журнала транзакций pg02 patroni[1425]: INFO: running pg_rewind from pg03 pg02 patroni[1425]: INFO: running pg_rewind from user=postgres host=10.202.1.103 port=5432 ... pg02 patroni[1425]: servers diverged at WAL location 33F/68E6AD10 on timeline 28 pg02 patroni[1425]: could not open file "/data/pgdb/11/pg_wal/0000001C0000033F00000059": No such file or directory pg02 patroni[1425]: could not find previous WAL record at 33F/59FFE368 pg02 patroni[1425]: Failure, exiting pg02 patroni[1425]: ERROR: Failed to rewind from healty master: pg03 pg02 patroni[1425]: WARNING: Postgresql is not running.
by acquiring session lock pgdb-1 patroni[17836]: server promoting pgdb-1 patroni[17836]: INFO: cleared rewind state after becoming the leader pgdb-1 patroni[17836]: INFO: Lock owner: pgdb-1; I am pgdb-1
from pgdb-1 pgdb-2 patroni[4149]: Lock owner: pgdb-1; I am pgdb-2 pgdb-2 patroni[4149]: Deregister service pgdb/pgdb-2 pgdb-2 patroni[4149]: running pg_rewind from pgdb-1 in progress pgdb-2 patroni[4149]: running pg_rewind from user=replica host=10.10.1.31 port=5432 ... pgdb-2 patroni[4149]: servers diverged at WAL location 0/5AD1BFD8 on timeline 66 pgdb-2 patroni[4149]: rewinding from last common checkpoint at 0/59DD1070 on timeline 66 pgdb-2 patroni[4149]: Done!
realtime • Риск «HA» из потеряшек» остается: • В худшем случае: maximum_lag_on_failover + ttl • В среднем случае: maximum_lag_on_failover + (loop_wait / 2) Хорошая новость — в «HA» из потеряшках» могут быть WA» из коL от фоновых процессов
pgdb-1 patroni[6202]: INFO: not healthy enough for leader race pgdb-1 patroni[6202]: INFO: changing primary_conninfo and restarting in progress ... pgdb-1 patroni[6202]: INFO: following a different leader because i am not the healthiest node pgdb-1 patroni[6202]: INFO: following a different leader because i am not the healthiest node pgdb-1 patroni[6202]: INFO: following a different leader because i am not the healthiest node pgdb-1 patroni[6202]: INFO: following a different leader because i am not the healthiest node pgdb-1 patroni[6202]: INFO: following a different leader because i am not the healthiest node
data to client: Broken pipe 14:48:58.078 [idle] LOG: could not send data to client: Broken pipe 14:48:58.078 [idle] FATAL: connection to client lost 14:48:58.107 [idle in transaction] FATAL: connection to client lost
process 4298 acquired ExclusiveLock on tuple (2,10) of relation 52082 of database 50587 after 52487.463 ms 14:49:26.929 [UPDATE waiting] STATEMENT: UPDATE sessions SET lastaccess='1565005714' WHERE sessionid=... 14:49:27.929 [UPDATE waiting] LOG: process 4298 still waiting for ShareLock on transaction 364118337 after 1000.088 ms 14:49:27.929 [UPDATE waiting] DETAIL: Process holding the lock: 4294. Wait queue: 4298. 14:49:27.929 [UPDATE waiting] CONTEXT: while locking tuple (7,60) in relation "sessions" 14:49:27.929 [UPDATE waiting] STATEMENT: UPDATE sessions SET lastaccess='1565005714' WHERE sessionid=...
received immediate shutdown request 14:49:34.689 [authentication] WARNING: terminating connection because of crash of another server process 14:49:34.689 [authentication] DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory
За это время одна из реплик сделала «HA» из promote» pgsql03 patroni: 14:48:25,000 INFO: Lock owner: pgsql03; I am pgsql03 pgsql03 patroni: 14:48:25,013 INFO: no action. i am the leader with the lock pgsql03 patroni: 14:48:37,159 INFO: Lock owner: pgsql03; I am pgsql03 pgsql03 patroni: 14:49:31,155 WARNING: Exception hened during processing of request from 10.1.0.12 pgsql03 patroni: 14:49:31,297 WARNING: Exception hened during processing of request from 10.1.0.11 pgsql03 patroni: 14:49:31,298 WARNING: Exception hened during processing of request from 10.1.0.11
leader by acquiring session lock pgsql01 patroni: server promoting pgsql01 patroni: 14:48:57,214 INFO: cleared rewind state after becoming the leader pgsql01 patroni: 14:49:05,013 INFO: Lock owner: pgsql01; I am pgsql01 pgsql01 patroni: 14:49:05,023 INFO: updated leader lock during promote
not take out TTL lock pgsql02 patroni: 14:48:57,137 INFO: following new leader after trying and failing to obtain lock pgsql02 patroni: 14:49:05,014 INFO: Lock owner: pgsql01; I am pgsql02 pgsql02 patroni: 14:49:05,025 INFO: changing primary_conninfo and restarting in progress pgsql02 patroni: 14:49:15,011 INFO: Lock owner: pgsql01; I am pgsql02 pgsql02 patroni: 14:49:15,014 INFO: changing primary_conninfo and restarting in progress pgsql02 patroni: 14:49:25,011 INFO: Lock owner: pgsql01; I am pgsql02 pgsql02 patroni: 14:49:25,014 INFO: changing primary_conninfo and restarting in progress pgsql02 patroni: 14:49:35,011 INFO: Lock owner: pgsql01; I am pgsql02 pgsql02 patroni: 14:49:35,014 INFO: changing primary_conninfo and restarting in progress
Реплика аварийно перезапускается и начинает recovery Recovery прерывается, т.к. старый мастер в это время выключался 14:49:34.293 [idle] LOG: received replication command: IDENTIFY_SYSTEM WARNING: terminating connection because of crash of another server process DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory. 14:49:35.232 FATAL: could not receive data from WAL stream: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. 14:49:35.232 LOG: record with incorrect prev-link 142D46/315602C at 14CF/CF38C160 14:49:35.305 FATAL: could not connect to the primary server: FATAL: the database system is shutting down 14:49:40.241 FATAL: could not connect to the primary server: FATAL: the database system is shutting down
with incorrect prev-link 142D46/315602C at 14CF/CF38C160 14:50:14.028 [] LOG: fetching timeline history file for timeline 72 from primary server 14:50:14.104 [] FATAL: could not start WAL streaming: ERROR: requested starting point 14CF/CF000000 on timeline 71 is not in this server's history DETAIL: This server's history forked from timeline 71 at 14CF/CEC32E40. 14:50:14.104 [] LOG: new timeline 72 forked off current database system timeline 71 before current recovery point 14CF/CF38C160
коTA» из коL: could not start WA» из коL streaming: ERROR: requested starting point 14CF/watchdoCF000000 on timeline 71 is not in this server's history DETA» из коIL: This server's history forked from timeline 71 at 14CF/watchdoCEC32E40. 14:50:44.143 LOG: new timeline 72 forked off current database system timeline 71 before current recovery point 14CF/watchdoCF38C160
идет recovery, и даже база открылась 15:14:13.511 LOG: consistent recovery state reached at 14CF/A3F657B0 15:14:13.511 LOG: database system is ready to accept read only connections
14CF/CF38C160 15:14:17.077 LOG: started streaming WAL from primary at 14CF/CF000000 on timeline 72 15:14:17.536 LOG: invalid record length at 14CF/CF38C160: wanted 24, got 1 и через минуту отвалилась 15:15:27.823 FATAL: terminating walreceiver process due to administrator command 15:15:27.895 LOG: invalid record length at 14CF/CF38C160: wanted 24, got 1 15:15:27.895 LOG: invalid record length at 14CF/CF38C160: wanted 24, got 1
contrecord length 5913 at 14CF/CEFFF7B0 15:17:37.304 LOG: started streaming WAL from primary at 14CF/CE000000 on timeline 72 и снова останавливается, причем сообщение уже другое 15:18:12.208 FATAL: terminating walreceiver process due to administrator command 15:18:12.240 LOG: record with incorrect prev-link 60995000/589DF000 at 14CF/CEFFF7B0 15:18:12.240 LOG: record with incorrect prev-link 60995000/589DF000 at 14CF/CEFFF7B0
запускается уже с другого места и больше не рвется 15:22:43.727 LOG: invalid record length at 14D1/BCF3610: wanted 24, got 0 15:22:43.731 LOG: started streaming WAL from primary at 14D1/B000000 on timeline 72
кластер • количество реплик • лаг репликации • ошибки в логах Автоматика может успешно отработать, но при этом объекты автоматики могут находиться в полурабочем состоянии Нужно уметь чинить руками Иметь в виду при эксплуатации Patroni
Файловер имеет два timestamp — начало и конец Почему произошел файловер? Ищем по логам первые события, когда что-системато пошло не так: • Логи patroni • Логи postgres • Логи DCS • Логи системы Этапы диагностики
я Patroni? Определенно, да Хотите больше интересных историй? https://github.com/zalando/patroni/issues Спасибо Zalando, A» из коlexander Kukushkin, Oleksii Kliukin и всем контрибьюторам Послесловие