Stuff to cover •How Postgres fits in • Migrations • Middleware, stored procedures, crontabber • How to love JSON • Writing Queries • Common Admin issues
What setupdb_app.py does • Connects as database superuser, creates a database, creates socorro users, disconnects • Connects as user, creates a schema based on socorro/external/postgresql/model.py • Optional: loads data using fakedata.py • Stamps alembic revision to latest
Stuff to cover • How Postgres fits in •Migrations • Middleware, stored procedures, crontabber • How to love JSON • Writing Queries • Common Admin issues
Most common issues • Failure to test migration (see: Makefile) • Problem with non-linear history (two migrations inherit from same parent) • Assumptions that don’t pan out on downgrade (protip: run downgrade then upgrade after a setupdb_app.py run)
Internal middleware • socorro/external/postgresql/XXX.py • see: socorro/external/postgresql/ crashes.py for an example • Expose a route to the new endpoint
Stuff to cover • How Postgres fits in • Migrations • Middleware, stored procedures, crontabber •How to love JSON • Writing Queries • Common Admin issues
with a as ( select processed_crash from processed_crashes limit 1 ), b as ( select json_array_elements(processed_crash->'addons') as addons from a ) select addons->>0 from b;
But, easier than reports • Big unanswered question: Can we kill the reports table? • advanced search • reports/list • update_reports_clean() • 3 reports THAT SHOULD DIE
Stuff to cover • How Postgres fits in • Migrations • Middleware, stored procedures, crontabber • How to love JSON •Writing Queries • Common Admin issues
Protips! • Don’t ever use now()::timestamp (magical, immediate performance killer) • Run EXPLAIN, talk to a buddy • Check for unnecessary JOINs • Don’t expect local/stage performance to be similar to prod performance • Always use “WHERE date_processed ...” (more on this later!)
Avoid reports • Avoid using the reports table • Deprecation target AND usually slower than reports_clean • matview? Use raw_crashes or processed_crashes and make indexes!
Adding indexes to partitioned tables • see report_partition_info • Happens weekly • To add to old partitions, do “manually” in a migration based on pg_class data
Adding other indexes • Must exactly match predicate! • Best practice: Create in a transaction and test exactly the query you want to use the index to verify your index is correct
Altering tables • Never specify a DEFAULT in a CREATE or ALTER that creates a column on an existing table • Drop children before parents • Never LOCK EXCLUSIVE on parents unless you want to have a downtime • DROPs and ALTERs on children might deadlock with processors (they retry, NBD)
Stuff to cover • How Postgres fits in • Migrations • Middleware, stored procedures, crontabber • How to love JSON • Writing Queries •Common Admin issues
High load on master • Check if a backup is running (OOPS) SOLUTION: kill backup LONG TERM: Investigate why backup was running against master, correct configuration/process documents
High load on master • Check if load is coming from middleware (cacheable query?) SOLUTION: kill middleware processes, sometimes have to restart middleware LONG TERM: Cache a query or refactor SQL
Maintenance of tables • We now drop old partitions without an exclusive lock on the parent • This is mostly safe -- might screw up READ ONLY queries looking at the data at the time of the drops • Worth it, because otherwise we need downtime windows • TODO: Now can be automated!
Maintenance of tables • Probably could be doing more • Probably not worth the time given that we drop old partitions • Newer Postgres will automatically FREEZE tables, reducing overhead of periodic VACUUMs for transaction wraparound