Slide 1

Slide 1 text

MONGODB vs POSTGRESQL BENCHMARKS
Álvaro Hernández

Slide 2

Slide 2 text

`whoami`
Álvaro Hernández @ahachete
● Founder & CEO, OnGres
● 20+ years PostgreSQL user and DBA
● Mostly doing R&D to create new, innovative software on Postgres
● Frequent speaker at PostgreSQL and database conferences
● Principal Architect of ToroDB
● Founder and President of the NPO Fundación PostgreSQL

Slide 3

Slide 3 text

Introduction

Slide 4

Slide 4 text

OnGres Ethics Policy
This work was sponsored by EnterpriseDB and performed by OnGres. It was conducted according to the OnGres Ethics Policy, which requires that:
● All work is conducted with the maximum degree of professionalism and independence.
● No technology is favored over another.
● No results are edited or omitted.
● The sponsor of the work does not intervene in the strategy, implementation or execution of the work.
● Results are verifiable by external third parties.

Slide 5

Slide 5 text

Benchmarking is hard
● Bench-marketing is easy, but not trustworthy.
● Benchmarking is hard.
● Benchmarking databases is harder.
● Benchmarking databases that follow different design models is even harder.
● Are MongoDB and PostgreSQL comparable?
● The market demands this comparison: users need to make informed decisions, and performance is key.

Slide 6

Slide 6 text

Pursuing benchmarking fairness
How can we present a fair arena in which the technologies compete in an apples-to-apples scenario?
● Transparency and reproducibility. Infrastructure-as-Code.
  https://gitlab.com/ongresinc/benchplatform/
  https://gitlab.com/ongresinc/txbenchmark/
  http://benchplatform.ongres.com.s3.amazonaws.com/
● Multiple benchmarks
● Close-to-real workloads
● Production-grade setups

Slide 7

Slide 7 text

Types of benchmarks
Three main benchmark categories:
● Transactions benchmark
● OLTP
  ○ In-memory dataset (4GB)
  ○ Larger, on-disk dataset (2TB)
● OLAP

Slide 8

Slide 8 text

The contenders
MongoDB 4.0
● Community version used
● Journaling active
● Replication active (single node)
● No further tuning required
PostgreSQL 11
● Self-managed instance
● Basic production tuning
● Tested with and without PgBouncer

Slide 9

Slide 9 text

Architecture: client-server, running on AWS
Data volume: io1, with the number of reserved IOPS depending on the test.

Slide 10

Slide 10 text

Benchmarks: Transactions

Slide 11

Slide 11 text

Previous discussion: isolation levels

Slide 12

Slide 12 text

Benchmark description
● Custom-developed benchmark.
● Inspired by a MongoDB post about MongoDB transactions.
● Simulates an airline reservation system: check flight schedule, plane availability, purchase, audit log.
● Each transaction (Tx) combines a query, insertions and an upsert.
● Programmed in Java, using the official JDBC and MongoDB drivers.
● Open source: https://gitlab.com/ongresinc/benchplatform
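
To make the transaction shape described above concrete, here is a minimal sketch of one booking transaction that combines a query, two insertions and an upsert. It is not the benchmark's Java code: it assumes Python's built-in sqlite3 module as a stand-in database so that it runs anywhere, and the table and column names (schedule, seat, payment, audit) are hypothetical.

```python
import sqlite3


def open_db():
    # isolation_level=None puts sqlite3 in autocommit mode, so the explicit
    # BEGIN/COMMIT/ROLLBACK statements below fully control the transaction.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    # Hypothetical schema; the real benchmark schema may differ.
    conn.executescript(
        """
        CREATE TABLE schedule (flight_id INTEGER PRIMARY KEY, capacity INTEGER NOT NULL);
        CREATE TABLE seat (flight_id INTEGER NOT NULL, user_id INTEGER NOT NULL);
        CREATE TABLE payment (flight_id INTEGER NOT NULL, user_id INTEGER NOT NULL,
                              amount INTEGER NOT NULL);
        CREATE TABLE audit (flight_id INTEGER PRIMARY KEY, bookings INTEGER NOT NULL);
        """
    )
    return conn


def book_seat(conn, flight_id, user_id, amount):
    """Run one booking as a single transaction; return True if a seat was sold."""
    conn.execute("BEGIN")
    try:
        # Query: how many seats are still free on this flight?
        row = conn.execute(
            "SELECT s.capacity - (SELECT COUNT(*) FROM seat WHERE seat.flight_id = s.flight_id) "
            "FROM schedule AS s WHERE s.flight_id = ?",
            (flight_id,),
        ).fetchone()
        if row is None or row[0] <= 0:
            conn.execute("ROLLBACK")
            return False
        # Insertions: the seat and the payment.
        conn.execute("INSERT INTO seat (flight_id, user_id) VALUES (?, ?)", (flight_id, user_id))
        conn.execute(
            "INSERT INTO payment (flight_id, user_id, amount) VALUES (?, ?, ?)",
            (flight_id, user_id, amount),
        )
        # Upsert: create the audit row for the flight or bump its counter.
        conn.execute(
            "INSERT INTO audit (flight_id, bookings) VALUES (?, 1) "
            "ON CONFLICT(flight_id) DO UPDATE SET bookings = bookings + 1",
            (flight_id,),
        )
        conn.execute("COMMIT")
        return True
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

Calling book_seat repeatedly on a flight with capacity 2 sells two seats and then refuses the third; under a stricter isolation level on a real server, the same unit of work is what would be retried after a conflict.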

Slide 13

Slide 13 text

Transaction performance

Slide 14

Slide 14 text

Transaction performance

Slide 15

Slide 15 text

Transaction retries

Slide 16

Slide 16 text

PostgreSQL latency (@ SERIALIZABLE)

Slide 17

Slide 17 text

MongoDB latency

Slide 18

Slide 18 text

Benchmarks: OLTP

Slide 19

Slide 19 text

Benchmark description
● The industry-standard Sysbench was used.
● Supports PostgreSQL and MongoDB (via sysbench-mongodb-lua).
● Resembles a real-world OLTP workload.
● Different dimensions benchmarked:
  ○ Dataset size: fits in memory (4GB) or on disk (2TB).
  ○ Read/write work split: 50/50 or 95/5.
  ○ Filesystem: XFS or ZFS.
  ○ Different levels of concurrency.
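
The dimensions listed above multiply into a test matrix. The sketch below enumerates it, assuming every combination is of interest; the slides do not state which combinations were actually run, and the concurrency levels are not given, so they are passed in as a hypothetical parameter.

```python
from itertools import product

# Dimensions taken from the slide.
DATASETS = ("in-memory (4GB)", "on-disk (2TB)")
READ_WRITE_SPLITS = ((50, 50), (95, 5))  # (read %, write %)
FILESYSTEMS = ("XFS", "ZFS")


def build_matrix(concurrency_levels):
    """Return one dict per benchmark configuration.

    concurrency_levels is a hypothetical input: the slide only says that
    different levels of concurrency were tested.
    """
    return [
        {
            "dataset": dataset,
            "read_pct": read_pct,
            "write_pct": write_pct,
            "filesystem": filesystem,
            "clients": clients,
        }
        for dataset, (read_pct, write_pct), filesystem, clients in product(
            DATASETS, READ_WRITE_SPLITS, FILESYSTEMS, concurrency_levels
        )
    ]
```

With three concurrency levels this yields 2 x 2 x 2 x 3 = 24 configurations per database, which gives a sense of why the full run needs automation.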

Slide 20

Slide 20 text

Discussion about PostgreSQL Connection Pooling
● PostgreSQL best practice is to always run behind a connection pool.
● PostgreSQL proved to be highly sensitive to the number of connections, with degraded performance when overwhelmed.
● A connection pool offers close-to-ideal performance for almost any workload.
● MongoDB does not require a separate connection pool: pooling is included in its drivers.
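
The pooling idea above can be illustrated with a small in-process sketch: many clients share a fixed number of connections, so the server never sees more than the pool size at once. This is only a conceptual model, not how PgBouncer is implemented (PgBouncer is a separate proxy process); the class and function names are hypothetical, and the "connections" are plain placeholder objects.

```python
import queue
import threading
import time
from contextlib import contextmanager


class BoundedPool:
    """Hand out at most `size` connections; extra clients wait their turn."""

    def __init__(self, factory, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(factory())

    @contextmanager
    def connection(self):
        conn = self._idle.get()  # blocks while every connection is busy
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return the connection for reuse


def peak_connections_in_use(pool, clients):
    """Run `clients` concurrent workers and report the busiest moment."""
    lock = threading.Lock()
    state = {"in_use": 0, "peak": 0}

    def client():
        with pool.connection():
            with lock:
                state["in_use"] += 1
                state["peak"] = max(state["peak"], state["in_use"])
            time.sleep(0.001)  # stand-in for running a query
            with lock:
                state["in_use"] -= 1

    threads = [threading.Thread(target=client) for _ in range(clients)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state["peak"]
```

Running 32 clients against a pool of 4 never uses more than 4 connections, which is the property that shields the database from being overwhelmed.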

Slide 21

Slide 21 text

Performance: dataset in memory

Slide 22

Slide 22 text

PG connection pooling effect (dataset in memory)

Slide 23

Slide 23 text

Performance: dataset on disk

Slide 24

Slide 24 text

Benchmarks: OLAP

Slide 25

Slide 25 text

Benchmark description
● JSON dataset (GitHub Archive). Native to MongoDB, jsonb in PG.
● 2015 dataset: 212M records (340GB on PG, 206GB on MongoDB).
● Timing of OLAP-style queries that resemble natural BI questions:
  ○ Repositories ordered by most opened issues.
  ○ Most frequent git event types, ordered.
  ○ Top 10 most active actors.
  ○ Repositories that have more than two comments and a specific event type, ordered.
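
As a rough illustration of what two of the questions above compute, the sketch below aggregates a handful of events in plain Python. It is not one of the benchmarked queries; it assumes GitHub Archive-style records with `type` and `actor.login` fields, and the sample events are made up.

```python
import json
from collections import Counter

# Made-up sample in JSON-lines form, one event per line.
SAMPLE_EVENTS = [
    '{"type": "PushEvent", "actor": {"login": "alice"}, "repo": {"name": "org/app"}}',
    '{"type": "IssuesEvent", "actor": {"login": "bob"}, "repo": {"name": "org/app"}}',
    '{"type": "PushEvent", "actor": {"login": "alice"}, "repo": {"name": "org/lib"}}',
    '{"type": "PushEvent", "actor": {"login": "carol"}, "repo": {"name": "org/lib"}}',
    '{"type": "IssueCommentEvent", "actor": {"login": "alice"}, "repo": {"name": "org/app"}}',
    '{"type": "IssuesEvent", "actor": {"login": "bob"}, "repo": {"name": "org/lib"}}',
]


def top_actors(lines, limit=10):
    """Most active actors: count events per actor login, highest first."""
    counts = Counter(json.loads(line)["actor"]["login"] for line in lines)
    return counts.most_common(limit)


def event_type_frequencies(lines):
    """Most frequent event types, ordered by descending count."""
    counts = Counter(json.loads(line)["type"] for line in lines)
    return counts.most_common()
```

Both functions are a group-by followed by an ordered count, which is the general shape that a SQL `GROUP BY ... ORDER BY` or a MongoDB aggregation pipeline would express over the full 212M records.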

Slide 26

Slide 26 text

Query A in PostgreSQL and MongoDB

Slide 27

Slide 27 text

Results

Slide 28

Slide 28 text

QUESTIONS?
Álvaro Hernández
@ahachete / www.ongres.com
Thank you