Slide 1

Slide 1 text

Life in the Fast Lane: Speeding Up an AWS RDS for PostgreSQL Migration

Slide 2

Slide 2 text

Azavea builds mobile and web geospatial data applications.

Slide 3

Slide 3 text

Founded 2001 · 69 Employees · Philly Headquartered · 4 Products · 10% Research Time · B Corporation

Slide 4

Slide 4 text

Hector Castro (@hectcastro), VP of Engineering

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

Amazon Database Migration Service

Slide 8

Slide 8 text

NAME
    pg_dump - extract a PostgreSQL database into a script file or other archive file

SYNOPSIS
    pg_dump [connection-option...] [option...] [dbname]

DESCRIPTION
    pg_dump is a utility for backing up a PostgreSQL database. It makes consistent backups even if the database is being used concurrently. pg_dump does not block other users accessing the database (readers or writers).

    pg_dump only dumps a single database. To back up global objects that are common to all databases in a cluster, such as roles and tablespaces, use pg_dumpall(1).

    Dumps can be output in script or archive file formats. Script dumps are plain-text files containing the SQL commands required to reconstruct the database to the state it was in at the time it was saved. To restore from such a script, feed it to psql(1). Script files can be used to reconstruct the database even on other machines and other architectures; with some modifications, even on other SQL database products.

Slide 9

Slide 9 text

NAME
    pg_restore - restore a PostgreSQL database from an archive file created by pg_dump

SYNOPSIS
    pg_restore [connection-option...] [option...] [filename]

DESCRIPTION
    pg_restore is a utility for restoring a PostgreSQL database from an archive created by pg_dump(1) in one of the non-plain-text formats. It will issue the commands necessary to reconstruct the database to the state it was in at the time it was saved. The archive files also allow pg_restore to be selective about what is restored, or even to reorder the items prior to being restored. The archive files are designed to be portable across architectures.

    pg_restore can operate in two modes. If a database name is specified, pg_restore connects to that database and restores archive contents directly into the database. Otherwise, a script containing the SQL commands necessary to rebuild the database is created and written to a file or standard output. This script output is equivalent to the plain text output format of pg_dump. Some of the options controlling the output are therefore analogous to pg_dump options.

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Amazon RDS

Slide 12

Slide 12 text

Amazon Relational Database Service

Slide 13

Slide 13 text

Amazon Relational Database Service (Amazon RDS) is a web service that makes it easier to set up, operate, and scale a relational database in the cloud. It provides cost-efficient, resizable capacity for an industry-standard relational database and manages common database administration tasks.
–Amazon Web Services

Slide 15

Slide 15 text

https://rds.us-east-1.amazonaws.com/
  ?Action=CreateDBInstance
  &AllocatedStorage=64
  &DBInstanceClass=db.m4.large
  &DBInstanceIdentifier=business-db
  &Engine=Postgres
  &MasterUserPassword=s3cr3t
  &MasterUsername=business
  &SignatureMethod=HmacSHA256
  &SignatureVersion=4
  &Version=2014-09-01
  &X-Amz-Algorithm=AWS4-HMAC-SHA256
  &X-Amz-Credential=AKIADQKE4SARGYLE/20140424/us-east-1/rds/aws4_request
  &X-Amz-Date=20140424T194844Z
  &X-Amz-SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date
  &X-Amz-Signature=bee4aabc750bf7dad0cd9e22b952bd6089d91e2a16592c2293

Slide 16

Slide 16 text

No content

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

Amazon Relational Database Service (Amazon RDS) is a web service that makes it easier to set up, operate, and scale a relational database in the cloud. It provides cost-efficient, resizable capacity for an industry-standard relational database and manages common database administration tasks.
–Amazon Web Services

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

https://rds.us-east-1.amazonaws.com/
  ?Action=CreateDBInstance
  &AllocatedStorage=64
  &DBInstanceClass=db.m4.large
  &DBInstanceIdentifier=business-db
  &Engine=Postgres
  &MasterUserPassword=s3cr3t
  &MasterUsername=business
  &SignatureMethod=HmacSHA256
  &SignatureVersion=4
  &Version=2014-09-01
  &X-Amz-Algorithm=AWS4-HMAC-SHA256
  &X-Amz-Credential=AKIADQKE4SARGYLE/20140424/us-east-1/rds/aws4_request
  &X-Amz-Date=20140424T194844Z
  &X-Amz-SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date
  &X-Amz-Signature=bee4aabc750bf7dad0cd9e22b952bd6089d91e2a16592c2293

Slide 23

Slide 23 text

Instance Type    Memory (GB)  vCPU  Network Performance  Price (Hourly)*
db.m4.large      8            2     Moderate             $0.182
db.m4.xlarge     16           4     High                 $0.365
db.m4.2xlarge    32           8     High                 $0.730
db.m4.4xlarge    64           16    High                 $1.461
db.m4.10xlarge   160          40    10 Gigabit           $3.654
* Prices from us-east-1 as of 7/4/2016

Slide 24

Slide 24 text

Instance Type    Memory (GB)  vCPU  Network Performance  Price (Hourly)*
db.r3.large      15.25        2     Moderate             $0.250
db.r3.xlarge     30.5         4     Moderate             $0.500
db.r3.2xlarge    61           8     High                 $0.995
db.r3.4xlarge    122          16    High                 $1.990
db.r3.8xlarge    244          32    10 Gigabit           $3.980
* Prices from us-east-1 as of 7/4/2016

Slide 25

Slide 25 text

Instance Type    Memory (GB)  vCPU  Network Performance  Price (Hourly)*
db.t2.micro      1            1     Low to Moderate      $0.018
db.t2.small      2            1     Low to Moderate      $0.036
db.t2.medium     4            2     Low to Moderate      $0.073
db.t2.large      8            2     Low to Moderate      $0.145
* Prices from us-east-1 as of 7/4/2016
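The pricing tables above suggest the tactic the rest of the deck leans on: pay for a large instance class only during the migration window, then scale back down. A back-of-the-envelope cost sketch (hourly rate from the db.m4 table; the 6-hour window is a made-up example):

```shell
# Cost of running the largest db.m4 class just for a 6-hour import window.
# The rate is from the pricing table above; the window length is hypothetical.
hours=6
awk -v rate=3.654 -v h="$hours" \
  'BEGIN { printf "db.m4.10xlarge for %dh: $%.2f\n", h, rate * h }'
```

Roughly twenty dollars buys the biggest instance for the whole import, which is usually far cheaper than the engineer-hours a slow migration costs.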

Slide 26

Slide 26 text

https://rds.us-east-1.amazonaws.com/
  ?Action=CreateDBInstanceReadReplica
  &DBInstanceIdentifier=business-db-replica
  &SignatureMethod=HmacSHA256
  &SignatureVersion=4
  &SourceDBInstanceIdentifier=business-db
  &Version=2014-09-01
  &X-Amz-Algorithm=AWS4-HMAC-SHA256
  &X-Amz-Credential=AKIADQKE4SARGYLE/20140425/us-east-1/rds/aws4_request
  &X-Amz-Date=20140425T170525Z
  &X-Amz-SignedHeaders=content-type;host;x-amz-content-sha256;x-amz-date
  &X-Amz-Signature=a5bc7bb9648272e9967c76fc582b308d3ee37d6c4f7a4eb62c

Slide 28

Slide 28 text

Amazon Relational Database Service (Amazon RDS) is a web service that makes it easier to set up, operate, and scale a relational database in the cloud. It provides cost-efficient, resizable capacity for an industry-standard relational database and manages common database administration tasks.
–Amazon Web Services

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

No content

Slide 34

Slide 34 text

RDS Parameter Groups

Slide 35

Slide 35 text

https://www.postgresql.org/docs/current/static/populate.html

Slide 36

Slide 36 text

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/PostgreSQL.Procedural.Importing.html

Slide 37

Slide 37 text

"

Slide 38

Slide 38 text

Parameter             Value           Unit
maintenance_work_mem  524288-4194304  Kilobytes
max_wal_size          256             16 Megabytes
checkpoint_timeout    1800            Seconds
synchronous_commit    off             N/A
wal_buffers           8192            8 Kilobytes
autovacuum            0               N/A
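These parameters use PostgreSQL's native units, which are easy to misread. A quick sanity check of what the values above actually work out to (a sketch; the unit sizes are taken from the table):

```shell
# Convert the parameter-group values above into megabytes.
# max_wal_size counts 16 MB WAL segments; wal_buffers counts 8 KB pages;
# maintenance_work_mem is in kilobytes (524288 KB is the low end of the range).
max_wal_size_mb=$((256 * 16))
wal_buffers_mb=$((8192 * 8 / 1024))
maintenance_work_mem_mb=$((524288 / 1024))
echo "max_wal_size=${max_wal_size_mb}MB wal_buffers=${wal_buffers_mb}MB maintenance_work_mem>=${maintenance_work_mem_mb}MB"
```

So the table amounts to 4 GB of WAL between checkpoints, 64 MB of WAL buffers, and at least 512 MB of memory for index builds, with synchronous commit and autovacuum switched off for the duration of the import.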

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

No content

Slide 47

Slide 47 text

$ ssh hector@current-database

Slide 48

Slide 48 text

osm=# SELECT * FROM pg_stat_activity;
-[ RECORD 1 ]----+--------------------------------
datid            | 16385
datname          | hector
pid              | 93232
usesysid         | 16384
usename          | hector
application_name | psql
client_addr      |
client_hostname  |
client_port      | -1
backend_start    | 2017-07-03 22:02:36.071841-04
xact_start       | 2017-07-03 22:11:08.038188-04
query_start      | 2017-07-03 22:11:08.038188-04
state_change     | 2017-07-03 22:11:08.038192-04
wait_event_type  |
wait_event       |
state            | active
backend_xid      |
backend_xmin     | 1871
query            | SELECT * FROM pg_stat_activity;
osm=#

Slide 49

Slide 49 text

$ pg_dump -Fc -h localhost -U hector -v -f osm.dump osm 2>&1 | tee dump.log

Slide 57

Slide 57 text

$ aws configure set default.s3.max_concurrent_requests 16
$ aws configure set default.s3.max_queue_size 10000
$ aws configure set default.s3.multipart_threshold 64MB
$ aws configure set default.s3.multipart_chunksize 16MB
$ aws s3 cp osm.dump s3://business-db/osm.dump
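The multipart settings interact with S3's hard limit of 10,000 parts per upload, so the chunk size bounds the largest dump a single `aws s3 cp` can push. A quick check (the 50 GB dump size is hypothetical):

```shell
# With multipart_chunksize=16MB, a 50 GB dump uploads in how many parts?
# S3 caps a multipart upload at 10,000 parts, i.e. ~156 GB at 16 MB chunks.
dump_gb=50
chunk_mb=16
parts=$(( dump_gb * 1024 / chunk_mb ))
echo "parts=${parts}"
[ "$parts" -le 10000 ] && echo "within the 10,000-part limit"
```

If the dump is larger than the chunk size allows, raise `multipart_chunksize` accordingly; the 16 concurrent requests are what actually keep the pipe full.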

Slide 63

Slide 63 text

No content

Slide 64

Slide 64 text

No content

Slide 65

Slide 65 text

✓ At least 10 Gigabit network performance
✓ At least enough instance storage for database dump
✓ Launch in same availability zone as target database
✓ Use spot market, if you're feeling lucky
✓ If within private subnet, enable S3 VPC endpoint

Slide 71

Slide 71 text

$ ssh ec2-user@beefy-ec2-instance

Slide 72

Slide 72 text

$ sudo mkfs.ext4 /dev/xvdb
$ sudo mount /dev/xvdb /mnt
$ sudo chown -R ec2-user:ec2-user /mnt/
$ aws configure set default.s3.max_concurrent_requests 16
$ aws configure set default.s3.max_queue_size 10000
$ aws configure set default.s3.multipart_threshold 64MB
$ aws configure set default.s3.multipart_chunksize 16MB
$ aws s3 cp s3://business-db/osm.dump /mnt/osm.dump
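Before the copy it's worth confirming the freshly mounted instance-store volume actually has room for the dump. A minimal guard (both sizes here are hypothetical stand-ins; in practice substitute the dump's size from `aws s3 ls` and the free space `df` reports for /mnt):

```shell
# Guard: ensure the mounted volume can hold the dump before copying it down.
# Both values are illustrative stand-ins for real aws s3 ls / df output.
dump_bytes=$((48 * 1024 * 1024 * 1024))    # 48 GB dump on S3
avail_bytes=$((60 * 1024 * 1024 * 1024))   # free space reported for /mnt
if [ "$avail_bytes" -gt "$dump_bytes" ]; then
  echo "enough room for the dump"
else
  echo "volume too small" >&2
fi
```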

Slide 76

Slide 76 text

No content

Slide 77

Slide 77 text

No content

Slide 78

Slide 78 text

✓ At least 10 Gigabit network performance
✓ Set parameter group tuned for imports
✓ Disable Multi-AZ feature
✓ Launch in same availability zone as restore instance

Slide 83

Slide 83 text

$ hostname
beefy-ec2-instance

Slide 84

Slide 84 text

$ pg_restore -C -j16 -Fc -h …rds.amazonaws.com -U hector -v \
    -d postgres /mnt/osm.dump 2>&1 | tee restore.log
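With `-v` the entire restore is captured in restore.log, and it pays to scan it before trusting the result, since `pg_restore` keeps going past many non-fatal errors. A sketch using a tiny stand-in log (the log lines shown are illustrative, not real output):

```shell
# Scan the captured restore log for errors before trusting the result.
# restore.log here is a tiny illustrative stand-in for the real file.
printf 'pg_restore: creating TABLE "planet_osm_point"\npg_restore: creating INDEX "planet_osm_point_index"\n' > restore.log
if grep -qi 'error' restore.log; then
  echo "errors found - inspect restore.log"
else
  echo "no errors logged"
fi
rm restore.log
```

`-j16` runs sixteen restore workers in parallel, which is why the beefy instance and the 10 Gigabit network matter; a reasonable starting point is one job per vCPU on the restore instance.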

Slide 94

Slide 94 text

No content

Slide 95

Slide 95 text

$

Slide 96

Slide 96 text

osm=# VACUUM (VERBOSE, ANALYZE);

Slide 97

Slide 97 text

✓ Set parameter group back to normal
✓ Enable Multi-AZ feature
✓ Scale back down to desired instance type
✓ Keep an eye on metrics once traffic starts back up

Slide 102

Slide 102 text

No content

Slide 103

Slide 103 text

No content

Slide 104

Slide 104 text

Thank You