Slide 1

Slide 1 text

Postgres & Rails 6 Multi-DB: Pitfalls, Patterns and Performance Gabriel Enslein

Slide 2

Slide 2 text

Who uses Postgres and Rails? Who is on the latest and greatest versions of both?

Slide 3

Slide 3 text

What does Rails 6 Multi-DB allow? ● Read and Write databases can be created in rake tasks ● Specify read and write DBs in the ActiveRecord Models ● API request-based DB selection ○ GET, HEAD -> Read DB ○ POST, PUT -> Write DB But how does this interact with Postgres?

Slide 4

Slide 4 text

Rails 6 Multi-DB Part 1: https://github.com/rails/rails/pull/32274 Part 2: https://github.com/rails/rails/pull/33637 Part 3: https://github.com/rails/rails/pull/33770 Part 4: https://github.com/rails/rails/pull/34052 Part 5: https://github.com/rails/rails/pull/34491 Part 6: https://github.com/rails/rails/pull/34495 Part 7: https://github.com/rails/rails/pull/34505 Part 8: https://github.com/rails/rails/pull/35073 Parts ongoing: too many to list Thank you to the Rails contributors!

Slide 5

Slide 5 text

Who is this Gabriel character? Heroku Data Engineer since 2016 Sinatra and RoR API Services Worked with many different NoSQL and SQL Databases Interested in Postgres high traffic and data growth

Slide 6

Slide 6 text

Terms/Phrases in the talk you’ll hear ● Bloat ● Vacuum ● Partitioning ● Sharding If you want to learn more, see me after ● Replication ● Statistics ● Caching ● Connection Pooling

Slide 7

Slide 7 text

Self-Help/Life Coaching Business is booming!

Slide 8

Slide 8 text

Super Secret Social interest platform revisited ● Customers have profiles ○ Shipping addresses ○ Email preferences ○ Social Media links ○ Payment Info ○ Etc. ● Each customer can posts from areas of interest ○ Favorite Quotes ○ TV Shows ○ Movies ○ Foods ○ Etc. Millions of users and hundreds of millions of updates per day

Slide 9

Slide 9 text

Super Secret Social interest platform revisited Why do profile updates take so long? Why does it take so long to update addresses? Why does it take forever to load quotes a users added?

Slide 10

Slide 10 text

Super Secret Social interest platform revisited - The Characters table is huge (750 GB+) - I can’t wait for all the queries on to finish on that one table - Updates on the Characters table turn my Rails app to a snail What can I do to speed up my Rails App?

Slide 11

Slide 11 text

What can I do to speed up my Rails App? Add a read-replica Move Expensive Character queries to replica Why have a read replica if my app can’t use it directly? It feels like there’s a cap...

Slide 12

Slide 12 text

Let’s Talk about the CAP...Theorem

Slide 13

Slide 13 text

Let’s Talk about the CAP...Theorem There are 3 dimensions of Robust Distributed System 1. Consistency 2. Availability 3. Partition Tolerance The theorem states you can at most achieve 2 of the 3

Slide 14

Slide 14 text

Let’s Talk about the CAP...Theorem ● Help identify failure points ● Increase overall responsiveness ● Ensure longevity and stability of operations

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

Curated Content: Characters of Interest Interest Outlet: Our favorite Characters Hundreds of thousands of TV Shows and Movies Show/Movie has many characters (some have hundreds) Varying information about each character or show

Slide 17

Slide 17 text

Curated Content: Characters of Interest Interest Outlet: Our favorite Characters Movies and TV Shows are internally updated Characters per show or movie will be updated similarly Depending on a show/movie/character’s popularity depends query volume

Slide 18

Slide 18 text

Curated Content: Characters of Interest Interest Outlet: Our favorite Characters Movies and TV Shows are internally updated Characters per show or movie will be updated similarly Depending on a show/movie/character’s popularity depends query volume

Slide 19

Slide 19 text

What can I do to speed up my Rails App?

Slide 20

Slide 20 text

Rails 6 can use them now with multi-DB This enables Rails to Read/Write split Interest Outlet: Our favorite Characters Pattern: Read DB for Characters of Interest

Slide 21

Slide 21 text

What tools/changes do I need to make to set up or recognize environments? How do I define different databases? Interest Outlet: Our favorite Characters Pattern: Read DB for Characters of Interest

Slide 22

Slide 22 text

Super Secret Social interest platform revisited production: primary: <<: *default url: <%= ENV['DATABASE_URL'] %> characters: <<: *default url: <%= ENV['CHARACTERS_DATABASE_URL'] %> migrations_paths: "db/characters_migrate" characters_replica: <<: *default url: <%= ENV['CHARACTERS_FOLLOWER_DATABASE_URL'] %> config/database.yml support was added for Multi-DB

Slide 23

Slide 23 text

How does my Character class understand multi-DB? What does the ActiveRecord look like for inheritance? Interest Outlet: Our favorite Characters Pattern: Read DB for Characters of Interest

Slide 24

Slide 24 text

Interest Outlet: Our favorite Characters class Character < ApplicationRecord connects_to database: { writing: :characters, reading: :characters_replica } end class TvShowCharacter < Character end class MovieCharacter < Character end Pattern: Read DB for Characters of Interest

Slide 25

Slide 25 text

Interest Outlet: Our favorite Characters Having the Base model define the connection strategy via connects_to Child models will maintain the connection strategy for reusability Enables more robust strategies for certain operations to prioritize workloads Performance: Read DB for Characters of Interest

Slide 26

Slide 26 text

Curated Content: Characters of Interest Interest Outlet: Our favorite Characters ActiveRecord::Base.connected_to(role: :reading) do TvShowCharacter.first # finds TvShowCharacter from replica connected to Character end Example: TV Show Characters query the read replica

Slide 27

Slide 27 text

Where’s the Hottest trending TV character my data upload process added in the database? Why aren’t my users seeing their newest Characters added to their favorites?

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

Interest Outlet: Our favorite Characters Pitfall: Follower Lag on Read Replicas heroku pg:info CHARACTERS_FOLLOWER_DATABASE-a super-secret-social Following: CHARACTERS_DATABASE_URL Behind By: 100K commits

Slide 30

Slide 30 text

Interest Outlet: Our favorite Characters Pitfall: Follower Lag on Read Replicas What is follower lag really? ● Postgres sends Write-Ahead Log (WAL) to replay changes ● The more traffic to primary, the more WAL replicas replay ● Other factors on read replicas can affect replay

Slide 31

Slide 31 text

Interest Outlet: Our favorite Characters Pitfall: Follower Lag on Read Replicas Are we missing any indexes for the read replica? What’s our connection usage look like? How are other database resources?

Slide 32

Slide 32 text

Interest Outlet: Our favorite Characters Performance: Optimizing Read Replicas ● Optimizing indexing in Postgres for reads is very different from indexing for writes ● Size of datasets can change query plans ● Caching changes on the DB can affect performance

Slide 33

Slide 33 text

EXPLAIN ANALYZE

Slide 34

Slide 34 text

Interest Outlet: Our favorite Characters Performance: Optimizing Read Replicas EXPLAIN ANALYZE SELECT * FROM characters WHERE show_id > 7000; QUERY PLAN ------------------------------------------------------------ Seq Scan on characters (cost=0.00..483350.00 rows=15078620 width=244) Filter: (show_id > 7000)

Slide 35

Slide 35 text

Interest Outlet: Our favorite Characters Performance: Optimizing Read Replicas ● Things to look out for ○ Sort Method ○ Seq (Sequence) Scan ○ Groupings ○ Conditionals

Slide 36

Slide 36 text

Interest Outlet: Our favorite Characters Performance: Optimizing Read Replicas ● Types of scans to favor ○ Bitmap Index Scan ○ Bitmap Heap Scan ○ Index or Index Only scans

Slide 37

Slide 37 text

Interest Outlet: Our favorite Characters Performance: Optimizing Read Replicas EXPLAIN ANALYZE SELECT * FROM characters WHERE show_id > 7000; QUERY PLAN ---------------------------------------------------------------------- -------- Bitmap Heap Scan on characters (cost=2.27..204.35 rows=106 width=244) Recheck Cond: (show_id > 7000) -> Bitmap Index Scan on characters_show_id (cost=0.00..2.27 rows=106 width=0) Index Cond: (show_id > 7000)

Slide 38

Slide 38 text

Interest Outlet: Our favorite Characters ● I updated my indexes and my lag is gone but queries aren’t improving ● I checked I’m not above connection limits too ● I reviewed my query analyzer Now what? Performance: Optimizing Read Replicas

Slide 39

Slide 39 text

No content

Slide 40

Slide 40 text

Interest Outlet: Our favorite Characters Pitfall: Write-heavy primaries slowing replication Users are adding tons of quotes every day (>1 TB+) They want even more features interacting with quotes Complaining about wanting to add Characters we don’t already have

Slide 41

Slide 41 text

Interest Outlet: Our favorite Characters Pitfall: Write-heavy primaries slowing replication ● Volume of heavy usage queries to the read database can slow replay by locking shared Postgres internals ● Connection churning or pool saturation on the database can slow replication ● Large table sizes ( > 1 TB+) ● Volume of data being written from the primary database can slow replication

Slide 42

Slide 42 text

Users only want quotes posted or updated in the last week Quotes older than a week just fill the DB cache… The database has a cache? We have way more Quotes than Users care about

Slide 43

Slide 43 text

Interest Outlet: Our favorite Characters Pitfall: Write-heavy primaries slowing replication ● Database caches recently accessed records in memory ● Writing a lot of new data tends to cause a lot of churn on the Memory usage, slowing the database down significantly ● Most index positions are also stored as pointer in memory for faster disk access ● Indexes that are being accessed with the same high rate of change tables can add to the DB load

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

We can Partition the Quotes Table!

Slide 46

Slide 46 text

Interest Outlet: Inspirational Quotes Performance: Partitioning Large Tables Partitioning is a good way to less active data Default partitions enabled in Postgres 11 Partition-level aggregation creates deep parallelization in queries

Slide 47

Slide 47 text

Interest Outlet: Inspirational Quotes Performance: Partitioning Large Tables psql=> ALTER TABLE quotes ATTACH PARTITION quotes_y2016m07 FOR VALUES FROM ('2016-07-01') TO ('2016-08-01');

Slide 48

Slide 48 text

POSTGRES PARTITIONING: NEEDS ITS OWN TALK

Slide 49

Slide 49 text

But what about User-sourced Characters AND Quotes?

Slide 50

Slide 50 text

User-sourced Content: Famous Character catchphrases Interest Outlet: Our favorite Characters Users want to be able to add Characters AND their famous Quotes What if we have separate databases for Quotes and Characters? Both Quotes and Characters have a primary and a read-replica database What does that look like from a code perspective?

Slide 51

Slide 51 text

Hang on a second; even getting last week’s posts are super slow though, they’re so many quotes... User-sourced Content: Famous Character catchphrases Interest Outlet: Our favorite Characters

Slide 52

Slide 52 text

Quotes TOTAL size 5 TB+ Characters are now 3 TB+ Uh-oh, we have way more Characters & Quotes...

Slide 53

Slide 53 text

How can we handle the data growth?

Slide 54

Slide 54 text

We can shard the Quotes to its own DB!

Slide 55

Slide 55 text

Interest Outlet: Our favorite Characters Pattern: Sharding the write database for Quotes ● Separation of Concerns ● Prioritizes critical functionality of the main Postgres primary ● DRY principle easier to achieve with simplistic view of the specific Quotes DB workload

Slide 56

Slide 56 text

Interest Outlet: Our favorite Characters Performance: Sharding the write database for Quotes ● Moving Quotes to its own DB relieves the rest of the Rails App ● Characters are flowing in a consistent fashion ● Other tables are performing as expected now

Slide 57

Slide 57 text

Interest Outlet: Our favorite Characters class Quotes < ApplicationRecord connects_to database: { writing: :quotes, reading: :quotes_replica } End # A Custom write handler to write each to their respective primary DBs ActiveRecord::Base.connected_to(mode: :writing) do TvShowCharacter.create! Quote.create! end User-sourced Content: Famous Character catchphrases

Slide 58

Slide 58 text

What’s happening to my Rails app?

Slide 59

Slide 59 text

No content

Slide 60

Slide 60 text

Interest Outlet: Our favorite Characters Pitfall: Sharding repeatedly resulting into DB loops ● Logic in App may repeatedly query data from sharded locations ○ e.g. Joins would would be done in Rails, not DBs ● Too many re-connections to multiple DBs can slow network time

Slide 61

Slide 61 text

Interest Outlet: Our favorite Characters Pitfall: Sharding repeatedly resulting into DB loops ● DB/QueryCache may be churning adding additional slowdown to all results ● 95/99th Percentiles can be causing the worst of partial or full outages or even resource saturation

Slide 62

Slide 62 text

No content

Slide 63

Slide 63 text

Where’s the CAP? Flip it to PACELC Theorem When Partition Tolerance (P) is the critical case choose one: ● Availability (A) ● Consistency(C) ELSE (E) choose one: ● Latency (L) ● Consistency(C)

Slide 64

Slide 64 text

What does that mean for me? ● Ultimately, Latency is our biggest concern now ● How do we avoid the network round trips? ● How can we minimize networking issues? ● How can we maximize our data usage for its size and this limitation? Where’s the CAP? Flip it to PACELC Theorem

Slide 65

Slide 65 text

Considerations

Slide 66

Slide 66 text

● Which connection pools are you referring to? ○ DB connection pools or Rails connection pools? ○ Are the Rails app servers connection pooling properly? ○ Are they not over server limits or DB limits? Consideration: What about connection pools?

Slide 67

Slide 67 text

● What about Database connection pools? ○ Are all connections going through direct to Postgres DBs? ○ Is connection pooling enabled or set up? ○ Is Rails saturating DB connection pooling limits? Consideration: What about connection pools?

Slide 68

Slide 68 text

● Which caches are you referring to? ○ QueryCache in your Rails app? ○ One of your database caches? ■ Is it your writable database? ■ Is it one of your read-replica databases? Consideration: Why do my caches seem useless?

Slide 69

Slide 69 text

● What happens if you’re putting things into memory that you don’t use? ○ Postgres will use temp disk space to manage large queries ● It caches each whole record, then creates pointers to disk for data overflow ● Postgres can stop accepting connections if we run too many of these queries at the same time Refresher: My data is too big for DB Caches, what happens now?

Slide 70

Slide 70 text

● How many resources are left on the Rails app servers? ○ Memory ○ Connections ○ CPU Consideration: Where are all my resources?

Slide 71

Slide 71 text

● How many resources are there left to spare on the DBs? ○ Memory ○ I/O ○ CPU ○ Connections Consideration: Where are all my resources?

Slide 72

Slide 72 text

No content

Slide 73

Slide 73 text

● Apps should have safety measures ○ Maintenance Mode capabilities ○ Feature Flagging ○ Circuit Breakers for when too many alerts fire Food For Thought

Slide 74

Slide 74 text

● Data Services should have thresholds ○ Tracking utilization of Data services to leave room for spikes ○ Web Traffic spikes can spike data storage and usage equally ○ Have a growth plan in place Food For Thought

Slide 75

Slide 75 text

● When needing new strategies, evaluate ideas early and often ○ Architecting Services for the principle of least privilege ○ Identifying services to lower scope to single purpose Food For Thought

Slide 76

Slide 76 text

● When needing new strategies, evaluate ideas early and often ○ Breaking up Monoliths ○ Identifying Critical/Real-time vs. non-real time or fault capable services ○ Refactoring Data relational restrictions Food For Thought

Slide 77

Slide 77 text

Heroku Postgres improvements Postgresql 11 Released Improved Postgres CLI analytics from improved Postgres Query Analytics Data.heroku.com Database Analytics Heroku Postgres Connection Pooling

Slide 78

Slide 78 text

Heroku Postgres 11 Generally Available New features include: Stored Procedures Additional Partitioning Features and Optimizations Enhanced Parallelization Learn More about Postgres 11 in our blog

Slide 79

Slide 79 text

Heroku Postgres CLI analytics Database tuning and monitoring still vital - https://devcenter.heroku.com/articles/heroku-postgres-database-tuning Extended CLI analytics improved from Postgres 10 internals changes - https://github.com/heroku/heroku-pg-extras heroku pg:diagnose #and more!

Slide 80

Slide 80 text

Heroku Connection Pooling: PgBouncer Beta ● Manages connections for your Postgres Database ○ Reduces need to open new connections ● Lives on the Database Server ● Uses Transactional connection Pooling What’s PgBouncer?

Slide 81

Slide 81 text

Heroku Connection Pooling: PgBouncer Beta ● Better Visibility on connection statistics ● Exposes guard rails to protect your database ● Excellent for high-throughput asynchronous workloads Why use Connection pooling?

Slide 82

Slide 82 text

Heroku Connection Pooling: PgBouncer Beta On Server Connection pooling is only available on Postgres 9.6, 10, 11 - Read more about it: Postgres connection pooling - Still in public beta, more features to come! - Default connections direct to DBs still available after enabling Still offering Client-side buildpack - Client-side PgBouncer article

Slide 83

Slide 83 text

Other resources for more learning ● Learn more about Rails DB performance monitoring with data.heroku.com ● Inspect your app with Seven Ways to Fortify your Application

Slide 84

Slide 84 text

Questions welcome! Come by the Booth!