Can Postgres scale like DynamoDB?

May 20, 2021

DynamoDB is one of the most highly regarded services from Amazon Web Services. It offers users a very simple model and has some notable limitations, yet it can scale almost endlessly. It reportedly peaked at about 80M requests per second while serving Amazon’s retail platform during Prime Day 2020.

Key to DynamoDB’s scalability is a shared-nothing, scale-out, multi-tenant architecture. Postgres has no native sharding capability, but is one needed to offer performance and scalability characteristics similar to DynamoDB’s? How could it be done?

This talk covers DynamoDB’s architecture and its similarities to and differences from Postgres, and explores how Postgres might scale in a similar way.

OnGres

Transcript

Can Postgres Scale like DynamoDB?

`whoami`
Álvaro Hernández, aht.es, @ahachete
• Founder & CEO, OnGres
• 20+ years Postgres user and DBA
• Mostly doing R&D to create new, innovative software on Postgres
• Frequent speaker at Postgres and database conferences
• Principal Architect of StackGres, ToroDB
• Founder and President of the NPO Fundación PostgreSQL
• AWS Data Hero
A little bit about DynamoDB
Is DynamoDB good? https://aws.amazon.com/blogs/aws/amazon-prime-day-2020-powered-by-aws/
A high-traffic Postgres example: GitLab.com spikes to >300K Postgres tx/s on a single cluster: https://about.gitlab.com/blog/2020/09/11/gitlab-pg-upgrade/
DynamoDB is a building block, too https://aws.amazon.com/message/5467D2/

What is DynamoDB
• A scale-out, NoSQL database
• Key-Value:
  ◦ Key: a simple or composite PK
  ◦ Value: a JSON blob
• Consistent performance at any scale: single-digit ms queries
• Serverless
• Pay-per-use
  ◦ WCUs, RCUs
  ◦ Storage, data transfer

What makes DynamoDB so special
• Yeah, that it’s serverless.
• Yeah, that it scales without limits.
• But in reality, what makes DynamoDB unique is: consistent and low latency at any scale. Below 10ms.
• What, 10ms???? My Postgres answers queries in less than 1ms!
• At any scale?
• Consistently? What are your p99 response times?
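
The p99 question above can be made concrete with a small sketch. It assumes a nearest-rank percentile and a hypothetical, made-up set of latency samples (98 fast queries plus 2 slow outliers); neither the numbers nor the helper name `percentile` come from the talk.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds (illustrative data only):
# 98 sub-millisecond queries and 2 slow outliers.
latencies_ms = [0.8] * 98 + [250.0] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms:.2f}ms p50={p50_ms}ms p99={p99_ms}ms")
```

With this made-up data the median stays below 1ms and the mean below 10ms, while the p99 is dominated by the outliers. That gap is why a "less than 1ms" claim says little about consistency.
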
DynamoDB Data Model
DynamoDB Sharding Logic
DynamoDB (simplified) Request Routing

DynamoDB (relevant) Operations
• Single-value, single-partition operations:
  ◦ PutItem, DeleteItem, GetItem, UpdateItem
  ◦ Compute hash of partition key, go to shard, operate on value
• Multiple-value, single-partition operations:
  ◦ Query. Reads values with the same hash, sorted by sort key
• Multiple-value, multiple-partition operations:
  ◦ Scan
  ◦ Supports (server assisted) parallelism
• Multiple-value operations: max 1MB results, provides pagination mechanisms, filtering (still consumes RCUs!)
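
The "compute hash of partition key, go to shard" step can be sketched as a lookup over contiguous hash ranges. This is a minimal illustration, not DynamoDB's actual implementation: the 64-bit hash space, the use of MD5, the equally sized ranges, and the names `key_hash`, `make_ranges` and `partition_for` are all assumptions made for the example.

```python
import bisect
import hashlib

HASH_SPACE = 2 ** 64  # assumed size of the hash space (illustrative)


def key_hash(partition_key: str) -> int:
    """Map a partition key to an integer in [0, HASH_SPACE).
    MD5 is an arbitrary choice here, not DynamoDB's real hash function."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def make_ranges(num_partitions: int) -> list:
    """Exclusive upper bounds of equally sized, contiguous hash ranges."""
    return [HASH_SPACE * (i + 1) // num_partitions for i in range(num_partitions)]


def partition_for(partition_key: str, upper_bounds: list) -> int:
    """Index of the partition whose hash range contains the key's hash."""
    return bisect.bisect_right(upper_bounds, key_hash(partition_key))


bounds = make_ranges(4)
for key in ("user#1", "user#2", "order#99"):
    print(key, "-> partition", partition_for(key, bounds))
```

Keeping the mapping as a list of range bounds, rather than a fixed modulo, is one way to leave room for splitting a single range later without moving every key.
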

DynamoDB (missing?) Operations
• No joins
• No aggregations
• No advanced queries (windows, subqueries…)
Why?? By design. To keep latency single-digit ms.
DynamoDB Scaling
Can Postgres scale like DynamoDB?
Option #1. Coordinator model: Citus

Citus limitations for DynamoDB scale
• Single coordinator
  ◦ The coordinator has a bit of state (metadata + local tables)
  ◦ It’s possible to have multiple coordinators (with replication among them), but it is not mainstream
  ◦ Don’t use local tables
• Main reason: processing time in the coordinator is not guaranteed to scale like DynamoDB. Complex queries and scatter-gather communication with shards are an anti-pattern in the DynamoDB model.
Option #2. Coordinator model: postgres_fdw

postgres_fdw limitations for DynamoDB scale
• postgres_fdw limitations
  ◦ Doesn’t push down all the clauses
  ◦ When talking to multiple shards, it works serially
  ◦ Requires connection pooling
• Main reason: processing time in the coordinator is not guaranteed to scale like DynamoDB. Complex queries and scatter-gather communication with shards are an anti-pattern in the DynamoDB model.

Application-based sharding
• Noticed that the main reason for not achieving DynamoDB scale with either Citus or postgres_fdw is essentially the same?
• Processing time in the coordinator and complexity of allowed operations violate DynamoDB’s main promise: single-digit ms response times.
• What’s the alternative then?
• Application-based sharding.
• Involving the client or application in the sharding process, sending the queries directly to the appropriate shard.
• Except for scan, all operations are single-shard (single partition)
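
A client that does its own routing, as described above, might look roughly like the following sketch. Everything here is hypothetical: the `Shard` class is an in-memory stand-in for a real Postgres connection, the `ShardingClient` name and its methods are invented for illustration, and the SHA-256 plus modulo placement is just the simplest possible choice.

```python
import hashlib
import json


class Shard:
    """Stand-in for one Postgres shard. A dict replaces the real
    connection so the example runs without a database."""

    def __init__(self, name):
        self.name = name
        self.rows = {}      # partition key -> JSON text
        self.requests = 0   # how many operations reached this shard


class ShardingClient:
    """Hypothetical client that picks the shard itself: no coordinator."""

    def __init__(self, shards):
        self.shards = shards

    def _shard_for(self, partition_key):
        digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.shards)
        return self.shards[index]

    def put_item(self, item):
        shard = self._shard_for(item["partitionKey"])
        shard.requests += 1
        shard.rows[item["partitionKey"]] = json.dumps(item)

    def get_item(self, partition_key):
        shard = self._shard_for(partition_key)
        shard.requests += 1
        raw = shard.rows.get(partition_key)
        return json.loads(raw) if raw is not None else None


shards = [Shard(f"shard-{i}") for i in range(4)]
client = ShardingClient(shards)
client.put_item({"partitionKey": "user#42", "name": "Ada"})
print(client.get_item("user#42"))
print({s.name: s.requests for s in shards})
```

Each operation touches exactly one shard, which is the property the slide is after. A plain modulo makes re-sharding expensive, so a real system would more likely keep a partition-to-server map in a metadata service.
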
Postgres application-based sharding

Possible table structure
          Table "public.pglikedy_simple"
┌─────────┬────────┬───────────┬──────────┬─────────┐
│ Column  │  Type  │ Collation │ Nullable │ Default │
├─────────┼────────┼───────────┼──────────┼─────────┤
│ hash    │ bigint │           │ not null │         │
│ content │ jsonb  │           │ not null │         │
└─────────┴────────┴───────────┴──────────┴─────────┘
Indexes:
    "pglikedy_simple_hash_key" UNIQUE CONSTRAINT, btree (hash)
    "pglikedy_simple_pk" UNIQUE, btree ((content -> 'partitionKey'::text))

         Table "public.pglikedy_composite"
┌─────────┬────────┬───────────┬──────────┬─────────┐
│ Column  │  Type  │ Collation │ Nullable │ Default │
├─────────┼────────┼───────────┼──────────┼─────────┤
│ hash    │ bigint │           │ not null │         │
│ content │ jsonb  │           │ not null │         │
└─────────┴────────┴───────────┴──────────┴─────────┘
Indexes:
    "pglikedy_composite_pk" UNIQUE, btree ((content -> 'partitionKey'::text), (content -> 'sortKey'::text))
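
As a rough idea of how GetItem and Query could be expressed against these tables, the sketch below builds parameterized SQL whose WHERE expressions mirror the indexed expressions shown above. The function names, the `%s` placeholder style (as used by psycopg) and the default limit are assumptions for illustration. The statements are only constructed here, not executed against a database, and the talk does not show the queries it actually used.

```python
import json


def get_item_sql(partition_key):
    """GetItem on the simple-key table: one row by partition key.
    The key is JSON-encoded because `content -> 'partitionKey'` is jsonb."""
    sql = (
        "SELECT content FROM pglikedy_simple "
        "WHERE content -> 'partitionKey' = %s::jsonb"
    )
    return sql, (json.dumps(partition_key),)


def query_sql(partition_key, limit=100):
    """Query on the composite-key table: all rows for one partition key,
    ordered by sort key, with a page-size limit (assumed default: 100)."""
    sql = (
        "SELECT content FROM pglikedy_composite "
        "WHERE content -> 'partitionKey' = %s::jsonb "
        "ORDER BY content -> 'sortKey' "
        "LIMIT %s"
    )
    return sql, (json.dumps(partition_key), limit)


sql, params = query_sql("user#42", limit=25)
print(sql)
print(params)
```

Both statements stay within a single partition key, so each one can be sent to a single shard.
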

Would it scale like DynamoDB?
• Scaling is essentially linear with the number of shards (partitions)
• Almost all (permitted) operations are single-partition, and the issuer knows which partition to direct them to: hash(primaryKey) -> partition
• Scan is essentially a composition of Query commands, potentially out-of-order.
• Architecture is complex, needing request routers, metadata servers for partition -> server placement, re-sharding…
• But would allow, theoretically, Postgres to scale like DynamoDB!
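
The point that Scan is a composition of per-partition Query commands can be illustrated with a small generator. This is a sketch under simplifying assumptions: each partition is an in-memory sorted list rather than a Postgres shard, the cursor is a list index, and `PAGE_SIZE`, `query_page` and `scan` are made-up names.

```python
PAGE_SIZE = 2  # deliberately tiny page size, for illustration only


def query_page(partition, after, limit):
    """One paginated Query against a single partition.
    `after` is the index of the last item already returned, or None.
    Returns (items, next_cursor); next_cursor is None when exhausted."""
    start = 0 if after is None else after + 1
    page = partition[start:start + limit]
    end = start + len(page)
    next_cursor = end - 1 if end < len(partition) else None
    return page, next_cursor


def scan(partitions):
    """Scan = run paginated Queries on every partition, one after another.
    Items are ordered within a partition but not across partitions."""
    for partition in partitions:
        cursor = None
        while True:
            page, cursor = query_page(partition, cursor, PAGE_SIZE)
            yield from page
            if cursor is None:
                break


partitions = [["b1", "b2", "b3"], [], ["a1"]]
print(list(scan(partitions)))
```

The result contains every item exactly once, but the global order follows partition placement rather than key order. Because the per-partition loops are independent, they could also be run in parallel by separate workers.
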
Because, after all...
DynamoDB is “just” an HTTP application backed by MySQL! https://news.ycombinator.com/item?id=13173927
Stay tuned. Coming soon…. Postgres scaling like DynamoDB benchmark! Follow @ahachete
Questions?