- Promote PostgreSQL Declarative Table Partitioning
- Grow knowledge and usage, especially for app developers
- Discuss challenges, mistakes, and share open source code samples from a recent conversion
Audience Polling
- #1: Who has used table partitioning in production in any capacity?
- #2: Who has converted a table from unpartitioned to partitioned?
- #3: Who has done an online conversion with pgslice or a similar tool?

bit.ly/PartPG2023 Slide 3 of 60
Assumptions
- Audience has some familiarity with what PostgreSQL table partitioning is.
- Suggested reading:
  - Crunchy Playground: Partitioning
  - Large Table Partitioning with PostgreSQL and Django
  - How to use table partitioning to scale PostgreSQL
  - A slice of life: table partitioning in PostgreSQL databases
Learner Outcomes
- PostgreSQL table partitioning: types, history, and case study review
- Online table conversion and swap: practical details
- Lessons from our challenges and mistakes
- How partitioned tables help with cost management and data lifecycle management
App Background and Context
- HR tech app: SaaS, B2B, monolithic (~100K LOC, >90% test coverage)
- Running Ruby on Rails 6.1
- Using the public_activity Ruby gem (~3K GitHub stars)
- High-growth activities table (not partitioned)
- Event log for applicants in their hiring process
- Events are visible to hiring managers and recruiters
- A lot of data: denormalized YAML
- Around ~40 activities per applicant, and there are millions of applicants
- Errors in this area of the app are highly visible to users
- PostgreSQL 13.x
- ~10 production databases, each with an activities table (sizes from 50 GB to 1.5 TB)
- Mix of regular RDS PostgreSQL and Aurora
- Sum across the ~10 DBs is ~3 billion rows and ~2 TB of data
- PK column is id, a bigint populated from a sequence
Problems
- The activities table was very large; modifications like adding indexes were painful
- Range queries can time out
- Despite 8 years of data going back to 2015, users interact mostly with recent data, so most of the data is never accessed
- We'd like to stop paying ever-increasing costs to store data that's not accessed
- We'd like to reduce overall storage costs, including consumption, backups, and transfers
- We'd like to modify the table easily
Isn't Partitioning too much work?
- The old method is a lot of work: inheritance-based, a.k.a. "trigger-based" partitioning
- Requires functions, triggers, and CHECK constraints to INSERT data into the correct spot
- Uses table inheritance (the INHERITS keyword)
- Considered a "legacy method"
- Sources: #1
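For contrast, here is a rough sketch of the machinery the legacy method required, held in a Ruby heredoc to match the rest of this deck's stack. The table and function names are invented for illustration, not the app's actual code:

```ruby
# Legacy inheritance-based partitioning: a child table with a CHECK
# constraint, plus a trigger function that routes INSERTs manually.
legacy_partitioning_sql = <<~SQL
  -- Child table attached via INHERITS, with a CHECK constraint
  CREATE TABLE activities_2023_04 (
    CHECK (created_at >= '2023-04-01' AND created_at < '2023-05-01')
  ) INHERITS (activities);

  -- Trigger function routes each INSERT to the right child table
  CREATE OR REPLACE FUNCTION activities_insert() RETURNS trigger AS $$
  BEGIN
    IF NEW.created_at >= '2023-04-01' AND NEW.created_at < '2023-05-01' THEN
      INSERT INTO activities_2023_04 VALUES (NEW.*);
    ELSE
      RAISE EXCEPTION 'no partition for %', NEW.created_at;
    END IF;
    RETURN NULL;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER activities_insert_trigger
  BEFORE INSERT ON activities
  FOR EACH ROW EXECUTE FUNCTION activities_insert();
SQL

puts legacy_partitioning_sql
```

Every new partition means another CHECK constraint and another branch in the trigger function, which is why this method is considered legacy.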
How to get started?
Engineering's job is to build a reliable and scalable SaaS platform.
1. Form hypotheses about acceptable changes for customers
2. Test and validate hypotheses
Hypothesis: 2 years of data is more than enough
- Added a feature flag that limited queries to recent data, to test the hypothesis
- Ran that as a "scream test" for a few weeks
- After the weeks passed without concerns being raised, we moved forward with partitioning
Partition Strategy
- Use the RANGE type
- Partition on the created_at timestamp column
- Monthly partitions
Declarative Partitioning
- Launched in version 10 (2017)
- Partition types: RANGE and LIST (10), HASH (11); we used RANGE
- Slices data from one big table into a parent table and child tables, with much less to set up
- No triggers, functions, or CHECK constraints needed
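A minimal sketch of the declarative DDL for the strategy above (monthly RANGE partitions on created_at), generated from Ruby. The _intermediate table name matches pgslice's convention; the column list is abbreviated:

```ruby
require "date"

# Parent table declares the partition key; no triggers needed.
parent_ddl = <<~SQL
  CREATE TABLE activities_intermediate (
    id bigint NOT NULL,
    created_at timestamp NOT NULL
    -- ...remaining activities columns...
  ) PARTITION BY RANGE (created_at);
SQL

# Each monthly child is a single CREATE TABLE ... PARTITION OF statement,
# named table_YYYYMM (pgslice's naming convention).
def monthly_partition_ddl(parent, month_start)
  <<~SQL
    CREATE TABLE #{parent}_#{month_start.strftime('%Y%m')}
    PARTITION OF #{parent}
    FOR VALUES FROM ('#{month_start}') TO ('#{month_start.next_month}');
  SQL
end

puts parent_ddl
puts monthly_partition_ddl("activities_intermediate", Date.new(2023, 4, 1))
```

The upper bound of each range is exclusive, so consecutive monthly partitions share a boundary without overlapping.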
Declarative Partitioning Improvements
- 10: Introduction
- 11: Added the DEFAULT partition and the HASH type
- 11: Support for PK and FK constraints, indexes, and triggers
- 11: INSERT routed to the correct partition; UPDATE can move rows between partitions
- 12: FKs can reference partitioned tables
- Sources: #1, #2
Partitioning Options
- pg_partman (1.5k GitHub stars), with pg_cron (1.5k)
- pg_party (400)
- No libraries or extensions, roll your own: John Nunemaker blog, Brandur blog
- pgslice (405)
Selected pgslice
- CLI program written in Ruby (the language our team uses), open source
- Small, minimal library
- Non-destructive, row-copying approach
- Iterative rollout, with the ability to step forward and backward
- Leaves the unmodified original table behind as "retired"
pgslice Quick Tour
- prep: creates the _intermediate table
- add_partitions: adds past and future partitions
- fill: batched row copies (we used the --where option to limit to 2 years)
- analyze: performs SQL ANALYZE on all partitions
- swap: swaps the table names
pgslice Implementation
- Added the gem to the Gemfile
- Created a binstub with bundle binstubs pgslice, producing bin/pgslice
- Added a helper Ruby class (PgsliceHelper) to integrate with the app
- Invoked the helper with rails runner
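The talk doesn't show PgsliceHelper itself, so here is a hypothetical sketch of what such a helper could look like: it wraps the pgslice CLI steps in order, runnable via rails runner, with a dry-run mode for reviewing commands first. The flag values (--past 24, --future 3) are illustrative assumptions:

```ruby
# Hypothetical PgsliceHelper sketch (not the app's actual class).
class PgsliceHelper
  # pgslice steps in order; real CLI subcommands, illustrative arguments.
  STEPS = [
    "prep activities created_at month",
    "add_partitions activities --intermediate --past 24 --future 3",
    "fill activities",
    "analyze activities",
    "swap activities"
  ].freeze

  def initialize(dry_run: true)
    @dry_run = dry_run
  end

  # Returns the command strings; executes them unless dry_run.
  def run
    STEPS.map do |step|
      command = "bin/pgslice #{step}"
      system(command) unless @dry_run
      command
    end
  end
end

puts PgsliceHelper.new(dry_run: true).run
```

Invocation might then be: rails runner 'PgsliceHelper.new(dry_run: false).run'.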
Downsides of pgslice and Row Copying
- Intended for append-only tables, i.e. no UPDATE or DELETE statements during the copy
- Copying rows requires additional storage space
- The online migration adds load to the server, so spare capacity is needed
- Requires a lot of up-front development and testing
Managing Partitions In Prod and Dev
- We wanted developers to use a partitioned table to discover bugs or performance problems, but not have to manage partitions
- In prod, we wanted 2 years of partitions
- Dev doesn't need partitions at all
Selected DEFAULT Partition (11+)
- Keeps the table partitioned in dev
- Avoids partition maintenance in dev
- A nice middle ground
Partition Management
- In prod, a cron job creates future partitions (monitor this!)
- In dev, the DEFAULT partition info is dumped to db/structure.sql
- A cron job could also DETACH partitions to cycle out old ones
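A small sketch of the date math such a cron job needs: compute the partition suffixes (pgslice's table_YYYYMM naming) for the current month plus a few future months, then compare against existing partitions and create any that are missing. The helper name and months_ahead value are assumptions:

```ruby
require "date"

# Suffixes for the current month plus N future months, matching the
# table_YYYYMM naming convention used for monthly partitions.
def future_partition_suffixes(today, months_ahead: 3)
  first_of_month = Date.new(today.year, today.month, 1)
  # Date#>> advances a date by whole months
  (0..months_ahead).map { |i| (first_of_month >> i).strftime("%Y%m") }
end

p future_partition_suffixes(Date.new(2023, 4, 20))
# => ["202304", "202305", "202306", "202307"]
```

Note the year rollover is handled by Date itself, e.g. November 2023 yields suffixes running into 2024.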
Implementation Notes
- Added retire/unretire of the DEFAULT partition
- Added customization of the schema
- Configured the where clause to fill only 2 years
- Added dump (pg_dump) and DROP support for future partition management
- Source: andyatkinson/rideshare/pgslice
Rails and CI Integration
- No Rails integration
- No migrations for the _intermediate table or the DEFAULT partition
- No automatic partition creation or cron guidance (we used the whenever gem)
- No test or CI guidance
Find Bugs Using the Test Suite
- We should have performed the partition conversion in the CI database from the beginning
- Code compatibility with the partitioned table was not as "seamless" as expected
- Surprises: UPDATEs of records, YAML-serialized data in a column
Code Compatibility Problems
- PRIMARY KEY changes
- The activerecord-import gem did not work
- Rails needed to know how to find the id
- Serialized content in a text column required careful escaping
Code Compatibility Fixes
- Explicit primary key configuration: self.primary_key = :id (dual support for old and new tables)
- Replaced activerecord-import with .insert_all() for multi-row inserts
- Set the unique_by: option for the .insert_all() ON CONFLICT clause, adding a matching UNIQUE constraint/index
- Added created_at to INSERT statements
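The fixes above can be sketched like this. The Rails model pieces are shown as comments since they need a database; the runnable part is a plain-Ruby helper illustrating why every row needs created_at (PostgreSQL uses it to route each row to its partition). The Activity model name comes from the talk; the helper and its column list are assumptions:

```ruby
require "time"

# class Activity < ApplicationRecord
#   self.primary_key = :id   # explicit PK, works pre- and post-swap
# end
#
# Every row passed to insert_all must include the partition key
# (created_at); this helper fills it in when a row is missing it.
def rows_for_insert_all(attrs_list, now: Time.now.utc)
  attrs_list.map { |attrs| { created_at: now }.merge(attrs) }
end

rows = rows_for_insert_all([{ key: "applicant.hired" }],
                           now: Time.utc(2023, 4, 20))
p rows
# Then (hypothetical call; unique_by needs a matching UNIQUE index
# for the generated ON CONFLICT clause):
#   Activity.insert_all(rows, unique_by: %i[id created_at])
```

Note the merge order: an explicitly supplied created_at in the row wins over the default.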
Fixing Test Code
- Tests write activities records as base data; unfortunately they hard-coded historical dates, so INSERTs failed due to missing partitions
- Updated dozens of tests to stop using hard-coded dates, e.g. switched to Time.current
Ready To Go: General Steps
1. Intermediate table is ready to fill
2. Add env vars like PGSLICE_URL
3. Run the steps one by one at an off-peak time, coordinating with the team
- Run pgslice fill in a tmux session, detach, and monitor
- Generated SQL has the form:

  INSERT INTO ... SELECT * FROM ... WHERE id > 0 AND id < 10000

- Options: --batch-size, --where, --sleep
Customer Accommodation
- Out of hundreds of customers, one wanted the older data
- The solution: add more partitions and fill from further back for that customer
- Added a customer identifier column to use with pgslice and --where "customer_id = 123"
- Added a partial index and populated the column
ALTER TABLE activities ADD COLUMN account_id INTEGER;
ALTER TABLE activities_intermediate ADD COLUMN account_id INTEGER;

CREATE SCHEMA temp;
CREATE TABLE temp.customer (
  id SERIAL PRIMARY KEY,
  applicant_id BIGINT UNIQUE, -- JOIN on activities.owner_id
  account_id INTEGER
);

-- fill temp.customer
INSERT INTO temp.customer (applicant_id, account_id)
SELECT id AS applicant_id, account_id
FROM ... -- source table elided on the slide
WHERE account_id = ... -- customer's account id, elided
  AND id > 0 AND id < 10000;
pgslice: multiple fill passes
- pgslice only issues INSERT statements
- It stops when it finds duplicates, so it can't make multiple passes
- Our solution: TRUNCATE some recent partitions, then fill again from there
Beware of the Schema Cache
- Following pgslice swap, the running app had a stale Active Record schema cache
- Solution: restart the app pods, but also rescue the exception and force a reset of the schema cache while the rolling restart completed
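A sketch of the rescue-and-reset pattern, written as a generic retry-once wrapper (the real app code isn't shown in the talk). The error class and recovery hook are injected; in Rails the hook would be something like resetting the connection's schema cache:

```ruby
# Retry-once wrapper: on the given error, run a recovery hook and retry.
# In the app this would look roughly like:
#   with_retry_after(ActiveRecord::StatementInvalid,
#     on_error: -> { ActiveRecord::Base.connection.schema_cache.clear! })
def with_retry_after(error_class, on_error:)
  retried = false
  begin
    yield
  rescue error_class
    raise if retried
    retried = true
    on_error.call
    retry
  end
end

# Simulated usage: the first call fails with a stale-schema style error,
# the recovery hook runs once, and the retry succeeds.
attempts = 0
result = with_retry_after(RuntimeError, on_error: -> { attempts += 1 }) do
  raise "stale schema cache" if attempts.zero?
  "ok"
end
```

A second failure re-raises rather than retrying forever, which keeps the workaround safe if the restart doesn't actually fix the cache.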
Speeding Up Row Copies
- DROP all or most indexes from the intermediate table
- Set the table as unlogged:

  ALTER TABLE activities_intermediate SET UNLOGGED;

- When copying is done, add the indexes back manually (CREATE INDEX, matching the source table's columns)
- Set the table back to LOGGED
Analyzing Results
With only 2 years of data, the replacement tables were much smaller:
- 215 GB → 39 GB (82% decrease)
- 163 GB → 45 GB (72% decrease)
- 1500 GB → 500 GB (67% decrease)
(relation_size_extended.sql)
AWS PostgreSQL Costs and Savings
- RDS PostgreSQL and Aurora costs differ
- For RDS, storage and IOPS are combined: General Purpose SSD $0.115 per GB-month; backups $0.095 per GB-month; plus snapshot exports and transfers
- S3: $0.025 per GB-month
- 1 TB comparison: RDS $115 vs. S3 $25, i.e. 78% less
AWS RDS Cost Reduction
- You pay for what's provisioned, not for I/O used
- With storage auto-scaling (2019), you keep paying as storage grows
- Even after reducing space used, you still pay for what's provisioned
- Not "easy" to pay less: you need a new instance, and less space comes with fewer IOPS
- Source: CloudZero
Aurora Cost Reduction
- "You can reduce storage charges by dropping tables, indexes, databases, and so on that you no longer need." (billed in 10 GB increments)
- Set up partitioning, plus regularly archive old partitions
- $0.10 per GB-month; 1 TB for 1 month = $100
- Perhaps most useful as a "cost control mechanism"
- Source: #1
Cleaning Up
- Remove transitional code
- Remove feature flagging
- Document steps for team members
- Add monitoring for partition management
- What about the _retired table?
Unlocked Capabilities
- The retired table can now be archived to low-cost storage like S3
- Old partitions can be automatically detached and archived
- By regularly archiving old partitions, overall growth slows and may flatten
- Steps like DETACH CONCURRENTLY, pg_dump, and DROP TABLE can all be automated
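The detach/dump/drop sequence can be sketched as a command generator, emitted for review before anything destructive runs. Bucket and partition names are assumptions; note DETACH PARTITION ... CONCURRENTLY requires PostgreSQL 14+:

```ruby
# Emit the archive-then-drop steps for one old monthly partition.
def archive_partition_commands(parent, suffix, bucket)
  partition = "#{parent}_#{suffix}"
  [
    # Detach without blocking writers (PostgreSQL 14+)
    "psql -c \"ALTER TABLE #{parent} DETACH PARTITION #{partition} CONCURRENTLY;\"",
    # Dump only the detached partition in custom format
    "pg_dump --table=#{partition} --format=custom --file=#{partition}.dump",
    # Ship the dump to low-cost storage
    "aws s3 cp #{partition}.dump s3://#{bucket}/",
    # Reclaim the space
    "psql -c \"DROP TABLE #{partition};\""
  ]
end

puts archive_partition_commands("activities", "202101", "acme-archive")
```

Running this monthly from cron is what slows, and may flatten, overall storage growth.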
Archive to S3

SELECT aws_commons.create_s3_uri(
  '', -- bucket (elided on the slide)
  '', -- file path (elided)
  'us-west-2' -- region
) AS s3_uri_1 \gset

Sources: AWS Exports
Summary
- Partitioned a 2-billion-row table using pgslice, without planned downtime
- With tables smaller by up to 80%, modifications, queries, and maintenance may be faster
- Regularly exporting unneeded data is now possible
- Helped improve our data lifecycle management practices
SQL Goodie Bag To Go
- Find the tables: 10 largest tables
- Find append-onlys: number of DMLs (append only)
- Index restore assistance: list CREATE INDEX statements
- Partition row counts: list row counts per partition
- Partition size estimates: estimated sizes for partitions following a naming pattern
Special Recognition!
- PG Day Chicago organizers for accepting this talk!
- Team members who supported this project
- Andrew, the creator of pgslice, and a reviewer!
- Reviewers: Haki, Ryan, Elizabeth