A glimpse of Microsoft's open source journey (through the lens of PostgreSQL)

Also see: https://speakerdeck.com/citusdata/what-microsoft-is-doing-with-postgres-and-the-citus-data-acquisition-pgconf-eu-2019-utku-azman?slide=12

PostgreSQL is not only a widely acclaimed database but one of the most venerable open source projects. In this talk, delivered at the Road to FOSDEM meetup in Mechelen, I share some of our PostgreSQL investment areas and how using Postgres internally at Microsoft is a reflection of the underlying cultural transformation around open source in the cloud.

José Miguel Parrella

January 30, 2020

Transcript

  1. A glimpse of Microsoft's open source journey (through the lens of PostgreSQL)
    Jose Miguel Parrella
    Office of the Azure CTO, Microsoft
    @bureado

  2. Open source at Microsoft: a cultural change driven by demographics and leadership affinity
    Phase I: 2000-2005
    • "Shared Source"
    • "Accidental" product truths (Interix)
    Phase II: 2005-2010
    • CodePlex
    • "Insular" product truths (PHP on Windows, but also Linux on Hyper-V)
    Phase III: 2010-2015
    • "Trying too hard"
    • Microsoft Open Technologies
    • Node.js, TypeScript
    Phase IV: 2015-2020
    • Collaborative
    • Linux: Canonical, Red Hat
    • Hadoop: Hortonworks, Cloudera
    Phase V: Tomorrow
    • Innovative
    • Docker & Kubernetes
    • Rust & Golang
    • Postgres
    Windows Azure
    Microsoft Azure

  3. “We can support 100s of concurrent users & more than 6M queries every day. With Citus, response times for 75% of queries are less than 200 ms. And response times for 95% of queries are less than 3 seconds.”

  4. Single Server
    Fully-managed, single-node PostgreSQL database service with built-in HA
    Example use cases
    • Transactional and operational analytics workloads
    • Apps requiring JSON, geospatial support, or full-text search
    • Greenfield apps built with modern frameworks
    Hyperscale (Citus) NEW
    Worry-free PostgreSQL in the cloud with an architecture that is built to scale out
    Example use cases
    • Scaling PostgreSQL multi-tenant, SaaS applications
    • Real-time operational analytics
    • Building high throughput transactional apps

  5. Take single node PostgreSQL across 100s of nodes
    Shard your PostgreSQL database across multiple nodes to give your application more memory, compute, and disk storage
    Easily add worker nodes to achieve horizontal scale, while being able to deliver parallelism even within each node
    Scale out to 100s of nodes, without downtime
    Coordinator: table metadata
    Each node: PostgreSQL with Citus installed
    1 shard = 1 PostgreSQL table
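
    The coordinator-and-shards layout on this slide can be illustrated with a small sketch. This is a simplified model under stated assumptions, not how Citus itself is implemented: the function names (`build_shards`, `route`), the shard naming scheme, the round-robin placement, and the use of CRC32 as the hash are all hypothetical stand-ins chosen for illustration.

```python
import zlib
from bisect import bisect_right


def build_shards(table, shard_count, workers):
    """Build toy coordinator metadata: one hash range per shard.

    Assumption: the 32-bit hash space is split into equal ranges and
    shards are placed on workers round-robin. Real systems may differ.
    """
    span = 2**32 // shard_count
    return [
        {
            "name": f"{table}_{i}",          # hypothetical shard table name
            "min_hash": i * span,            # inclusive lower bound of the range
            "worker": workers[i % len(workers)],
        }
        for i in range(shard_count)
    ]


def route(shards, distribution_key):
    """Return the shard (and therefore the worker) that owns a key."""
    # CRC32 is only a stand-in hash; it yields a value in 0..2**32-1.
    h = zlib.crc32(str(distribution_key).encode("utf-8"))
    starts = [s["min_hash"] for s in shards]
    # Pick the last shard whose lower bound is <= h.
    return shards[bisect_right(starts, h) - 1]


shards = build_shards("events", 8, ["worker-1", "worker-2"])
target = route(shards, "tenant-42")
print(target["name"], "on", target["worker"])
```

    In this model each shard is just a named table on some worker, and the coordinator only needs the small metadata list to decide where a row or query goes. Adding worker nodes changes the placement of shards, not the routing rule.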

  6. Recent additions

  7. Postgres is more popular than ever
    One of the most loved & wanted databases in the Stack Overflow 2019 Developer Survey
    Ranked DBMS of the Year for 2017 & 2018 by DB-Engines

  9. On-premises: PostgreSQL/MySQL/MariaDB
    • Datacenter management
    • Hardware
    • O/S provision/patching
    • Database provision/Patch/Scaling
    • Virtualization
    • Data
    • Applications
    • High availability/DR/Backups
    IaaS: Azure VMs with PostgreSQL/MySQL/MariaDB
    • Datacenter management
    • Hardware
    • Virtualization
    • O/S
    • Database provision/Patch/Scaling
    • Data
    • Applications
    • High availability/DR/Backups
    PaaS: Azure Database for MySQL/PostgreSQL/MariaDB
    • Data
    • Applications
    • Datacenter management
    • Hardware
    • Virtualization
    • O/S
    • Database provision/Patch/Scaling
    • High availability/DR/Backups
    • Intelligent performance/security
    Legend: Managed by Microsoft, Managed by customer, Machine learning capability
    More Postgres everywhere

  11. Postgres Is Underrated—It Handles More than You Think
    A webdev platform built entirely in PostgreSQL
    System design hack: Postgres is a great pub/sub and job server
    Turning PostgreSQL into a queue serving 10k jobs per second (2013)
    How much faster is Redis at storing a blob of JSON compared to Postgres?
    Advanced Kubernetes Namespace Management with the PostgreSQL Operator
    Why the Guardian Switched From MongoDB to PostgreSQL
    postgres-websockets
    Visualizing PostgreSQL Vacuum Progress

  12. Primary Use Cases for PostgreSQL Hyperscale (Citus)
    Digital transformations & data estate modernization
    Data intensive OSS relational apps: Scale from 100 GB to multiple PBs
    Multi-tenant & SaaS applications
    Real-time, operational analytics applications
    Analytics on JSON data, Geospatial, Timeseries, In-Memory / HTAP workloads
    Transactional / OLTP applications
    B2B apps in Enterprise, Sharding, ISVs building SaaS applications
    Strong consistency, Relational semantics (foreign keys, joins), limitless data

  13. 5 requirements that real-time analytics applications all have
