
Building modern apps that scale to billions of events with Azure Database for Postgres | Ignite 2019 | Umur Cubukcu


Come learn why customers call Hyperscale (Citus) a game changer (their words, not ours). With the launch of Hyperscale (Citus), you can now scale out real-time analytics workloads horizontally using Azure Database for PostgreSQL. This session explores why so many developers are adopting the open source Postgres database, and highlights what makes our managed Postgres service on Azure unique. Then, we dive into real-world use cases to show how you can build scalable real-time analytics apps using Azure, Postgres, and Hyperscale (Citus). In particular, we show how one team built a petabyte-scale analytics dashboard that supports real-time decision making with response times of 90 ms, even with 6M queries/day across billions of rows.

Azure Database for PostgreSQL

November 05, 2019


Transcript

  1. Building modern apps that scale with Azure Database for PostgreSQL & Hyperscale (Citus)


  2. PostgreSQL is more popular than ever
    [Figure: Stack Overflow Developer Survey 2019, "loved" and "wanted" rankings]
    https://insights.stackoverflow.com/survey/2019?utm_source=so-owned&utm_medium=blog&utm_campaign=dev-survey-2019&utm_content=launch-blog
    DBMS of the Year
    https://db-engines.com/en/blog_post/76
    DB-Engines’ ranking of PostgreSQL popularity
    https://db-engines.com/en/ranking_trend/system/PostgreSQL


  3. Why PostgreSQL?
    Open source
    Proven resilience & stability
    Rich feature set
    enterprise-ready
    • Zero data loss
    • Rich indexing, high performance
    • Extensible
    and tooling


  4. What is Hyperscale (Citus)?
    Making PostgreSQL future-proof, at any scale
    Grow to 100s of database nodes, without re-architecting your application
    vs.
    Block growth on 1 (monolithic) database
    [Diagram label: 18 Total Nodes]


  5. Creating the world's best PostgreSQL on Azure
    Uniquely delivering on all pillars of the open database platform
    Open source
    Proven resilience & stability
    Rich feature set
    Cloud management
    • Built-in high availability
    • Intelligent security & performance
    • Backups, monitoring
    Highly scalable, with Hyperscale (Citus)
    • Add more nodes anytime
    • Limitless compute & memory
    • Scale cost-effectively


  6. Azure Database for PostgreSQL is available in two deployment options
    Single Server
    Fully-managed, single-node PostgreSQL
    Example use cases
    • Apps with JSON, geospatial support, or full-text search
    • Transactional and operational analytics workloads
    • Cloud-native apps built with modern frameworks
    Hyperscale (Citus)
    High-performance Postgres for scale out
    Example use cases
    • Scaling PostgreSQL multi-tenant, SaaS apps
    • Real-time operational analytics
    • Building high throughput transactional apps
    Enterprise-ready, fully managed community PostgreSQL with built-in HA and multi-layered security


  7. Microsoft Windows relies on Citus for mission-critical decisions
    6M+ queries per day;
    75%
    https://techcommunity.microsoft.com/t5/Azure-Database-for-PostgreSQL/Architecting-petabyte-scale-analytics-by-scaling-out-Postgres-on/ba-p/969685


  8. Enlyft provides 10x faster intelligence to their customers, with simpler infrastructure


  9. Customers rely on Hyperscale (Citus) for mission-critical workloads across industries
    Industrial IoT & Insurance
    Use Case: Multi-tenant, industrial IoT storing measurement data from IoT platform.
    Value Prop: Ability to parallelize data ingest, and roll up to aggregated tables on the same database. Keep up with data capture from sensors in the field, with fast read/write access.
    Healthcare
    Use Case: Patient data retention and access through bi-directional interface engine.
    Value Prop: Scalability: average customer generates 3-4 TB. Now running 5 (5-node) clusters, each with 10 TB of customer data. Can rapidly expand cluster to meet customer requirement.
    Retail
    Use Case: Computer vision software to optimize Flipkart supply chain.
    Value Prop: Efficient supply chain inventory tracking. Leveraging AI to improve product tracking data accuracy that scales. 20x faster queries. Geospatial data with PostGIS.
    ISVs: SaaS applications
    Use Case: Real-time analytics with future multi-tenant and OLTP needs. B2B SaaS platform: AI for sales motions, buying “proclivity” engine.
    Value Prop: Better scale, and 10x faster performance. PaaS and Microsoft ecosystem.


  10. Under the Hood
    Azure Database for PostgreSQL with Hyperscale (Citus)


  11. Scale horizontally across hundreds of cores with Hyperscale (Citus)
    Shard your Postgres database across multiple nodes to give your application more memory, compute, and disk storage
    Easily add worker nodes to achieve horizontal scale
    Scale up to 100s of nodes
    Coordinator: table metadata
    Each node: PostgreSQL with Citus installed
    1 shard = 1 PostgreSQL table
    Sharding data across multiple nodes
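
The routing idea on this slide (a coordinator holding metadata that maps shards to worker nodes) can be illustrated with a small Python sketch. This is a simplified model under stated assumptions, not how Citus computes placements internally: the shard count, the worker names, the SHA-256 hash, and the modulo and round-robin rules are all chosen for illustration only.

```python
from __future__ import annotations

import hashlib
from collections import defaultdict

SHARD_COUNT = 32  # assumed value for illustration
WORKERS = ["worker-1", "worker-2", "worker-3"]  # hypothetical node names


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a distribution-column value to a shard id with a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def build_metadata(shard_count: int, workers: list[str]) -> dict[int, str]:
    """Coordinator-style metadata: shard id -> worker, assigned round-robin."""
    return {shard: workers[shard % len(workers)] for shard in range(shard_count)}


def route(key: str, metadata: dict[int, str]) -> tuple[int, str]:
    """Return the (shard, worker) pair that would store rows for this key."""
    shard = shard_for(key, len(metadata))
    return shard, metadata[shard]


metadata = build_metadata(SHARD_COUNT, WORKERS)

# Spread 1,000 hypothetical device ids across the workers.
placement = defaultdict(list)
for device_id in (f"device-{n}" for n in range(1000)):
    shard, worker = route(device_id, metadata)
    placement[worker].append(device_id)

for worker in WORKERS:
    print(worker, len(placement[worker]))
```

Because the hash is stable, the same key always routes to the same shard, which is what lets a coordinator send a single-key query to exactly one worker.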


  12. Hyperscale (Citus) effectively manages data scale-out
    • Shard rebalancer redistributes shards across old and new worker nodes for balanced data scale-out
    • Shard rebalancer will recommend rebalance when shards can be placed more evenly
    • For more control, use tenant isolation to easily allocate dedicated nodes to specific tenants with greater needs
    Hyperscale (Citus) Cloud Shard Rebalancer
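
To make the rebalancing idea concrete, here is a minimal Python sketch of one possible strategy: greedily move shards from the fullest worker to the emptiest one until shard counts differ by at most one. The greedy rule, the function names, and the cluster layout are assumptions for illustration; the slide does not describe the algorithm the actual shard rebalancer uses.

```python
from __future__ import annotations


def needs_rebalance(placement: dict[str, list[int]]) -> bool:
    """Recommend a rebalance when shard counts differ by more than one."""
    counts = [len(shards) for shards in placement.values()]
    return max(counts) - min(counts) > 1


def rebalance(placement: dict[str, list[int]]) -> list[tuple[int, str, str]]:
    """Greedily move shards from the fullest worker to the emptiest one.

    Mutates `placement` in place and returns the (shard, source, target)
    moves that were made.
    """
    moves = []
    while True:
        source = max(placement, key=lambda w: len(placement[w]))
        target = min(placement, key=lambda w: len(placement[w]))
        if len(placement[source]) - len(placement[target]) <= 1:
            return moves
        shard = placement[source].pop()
        placement[target].append(shard)
        moves.append((shard, source, target))


# Two existing workers with four shards each, plus a newly added empty worker.
cluster = {"worker-1": [0, 1, 2, 3], "worker-2": [4, 5, 6, 7]}
cluster["worker-3"] = []

recommended = needs_rebalance(cluster)
moves = rebalance(cluster)
print(recommended, moves, cluster)
```

In this toy cluster the new worker ends up with two shards and the old workers keep three each, so no further move would improve the balance.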


  13. Scaling out storage vs. compute for your RDBMS
    [Diagram: "Scaling storage": query endpoints, a primary and two replicas, on a shared storage layer]
    [Diagram: "Scaling out compute, memory & storage": data nodes 1, 2, 3, ... N]


  14. Hyperscale (Citus) use cases
    Scaling PostgreSQL multi-tenant, SaaS applications
    Real-time operational analytics
    Building high throughput transactional apps


  15. Primary Use Cases for PostgreSQL Hyperscale (Citus)
    Digital transformations & data estate modernization
    Data intensive OSS relational apps: Scale from 100 GB to multiple PBs
    Multi-tenant & SaaS applications: B2B apps in Enterprise, Sharding, ISVs building SaaS applications
    Real-time, operational analytics applications: Analytics on JSON data, Geospatial, Timeseries, In-Memory / HTAP workloads
    Transactional / OLTP applications: Strong consistency, Relational semantics (foreign keys, joins), limitless data


  16. Real-time operational analytics and reporting
    Sub-second queries on billions of events.
    Diagram components: Devices, Event Hubs, Raw Events, Hyperscale (Citus), Aggregations, Scheduled process, Azure Databricks, App Service, Browser, Notification Hubs
    1. Stream millions of events per second from devices and sensors into a scalable system that speaks and understands Apache Kafka. Build downstream pipelines to process, manipulate, and ingest your data
    2. Ingest millions of raw transactional events into Hyperscale (Citus) per second, allowing you to query and alert on granular events
    3. Perform incremental rollups directly in your database at granularities you define, such as minutely, hourly, or daily, to power real-time reporting and dashboarding for downstream applications
    4. Take advantage of Azure Databricks to clean, transform, and analyze the streaming data, and combine it with structured data from operational databases or data warehouses
    5. Provide insights to users and operators on current device status
    6. Push timely notifications directly to your users on their preferred service or medium
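
The incremental rollup in step 3 can be sketched in Python as follows. This is an in-memory illustration only, under stated assumptions: the `MinuteRollup` class, the `(event_id, device_id, timestamp)` event shape, and the high-water-mark bookkeeping are hypothetical and stand in for what the slide describes as rollups run directly in the database by a scheduled process.

```python
from __future__ import annotations

from collections import Counter
from datetime import datetime


class MinuteRollup:
    """Per-minute event counts, updated incrementally from raw events."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()  # (device_id, minute) -> event count
        self.last_event_id = 0  # high-water mark of events already rolled up

    def roll_up(self, raw_events: list[tuple[int, str, datetime]]) -> int:
        """Aggregate only events newer than the high-water mark.

        Returns the number of events processed in this run.
        """
        processed = 0
        new_mark = self.last_event_id
        for event_id, device_id, ts in raw_events:
            if event_id <= self.last_event_id:
                continue  # already included in an earlier run
            minute = ts.replace(second=0, microsecond=0)
            self.counts[(device_id, minute)] += 1
            new_mark = max(new_mark, event_id)
            processed += 1
        self.last_event_id = new_mark
        return processed


raw = [
    (1, "sensor-a", datetime(2019, 11, 5, 12, 0, 5)),
    (2, "sensor-a", datetime(2019, 11, 5, 12, 0, 40)),
    (3, "sensor-b", datetime(2019, 11, 5, 12, 1, 10)),
]
rollup = MinuteRollup()
first_run = rollup.roll_up(raw)

# A later scheduled run sees one new event and touches only that bucket.
raw.append((4, "sensor-a", datetime(2019, 11, 5, 12, 1, 59)))
second_run = rollup.roll_up(raw)
print(first_run, second_run, dict(rollup.counts))
```

Dashboards then read the small aggregated table instead of scanning raw events, which is the reason for rolling up at a fixed granularity.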


  17. High-throughput transactional / OLTP applications
    Power of PostgreSQL and globally consistent transactions at scale with low latency
    • Combine the power of relational semantics with horizontal scalability
    • Simplify your application development, without re-architecting your applications just for scale
    • Leverage Postgres: reliability, rich data types, extensions, & expertise.
    • Build with open source. Avoid lock-in.
    • Cut costs and reduce data duplication. Use high-performance JOINs, or combine with JSON data when you want to.
    • Keep millisecond responses, while growing to 100s of terabytes of data.
    Scale up
    • Limited compute and memory scaling to single node. Storage scale out limited to 64 TB.
    • Performance deteriorates with increased user concurrency
    • Performance: Resource utilization
    • Proprietary options, with lock-in.
    Azure DB for PostgreSQL Hyperscale (Citus) “scale out”
    • Lower costs: better use of memory, and storage scaling. Does not need to split into multiple DBs
    • Sub-second responses across many users (high concurrency)
    • Easy to add new nodes
    • Evolve with large, open source ecosystem
    • Faster: load and indexing


  18. Scaling multi-tenant & SaaS applications
    Diagram components: Data from multiple sources, Azure Kubernetes Service, Azure Cache for Redis, Hyperscale (Citus), Tenant (customer), Aggregations, Azure Machine Learning, Training & Predictive Experimentation, PostgreSQL Power BI Connector, Power BI, Notification Hubs, Consumers
    1. Ingest and sync data from disparate sources in real time with transactional guarantees
    2. Offload database demands by managing session state and asset caching with Azure Cache for Redis
    3. Shard by tenant and allow Hyperscale to elastically scale out your data. With co-location and tenant isolation features, don’t worry about the scale limits of your database
    4. Use scalable machine learning/deep learning techniques to derive deeper insights from this data
    5. Report and visualize the state of your devices at a granular or aggregated level
    6. Push timely notifications directly to your users on their preferred service or medium
    Build applications that scale simply from one tenant to 1,000s
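
Sharding by tenant with co-location, as mentioned in step 3, can be illustrated with a small Python model. Everything here is a sketch under assumptions: the `DistributedTable` class, the `orders` and `customers` tables, the CRC32 hash, and the shard count are hypothetical and only show why rows that share a tenant id can be joined on a single shard.

```python
from __future__ import annotations

import zlib
from collections import defaultdict

SHARD_COUNT = 8  # assumed value for illustration


def shard_of(tenant_id: int) -> int:
    """Every table uses the same function, so a tenant's rows are co-located."""
    return zlib.crc32(str(tenant_id).encode("utf-8")) % SHARD_COUNT


class DistributedTable:
    """A table whose rows are split into shards by tenant_id."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.shards: dict[int, list[dict]] = defaultdict(list)

    def insert(self, row: dict) -> None:
        self.shards[shard_of(row["tenant_id"])].append(row)


def tenant_join(orders: DistributedTable, customers: DistributedTable,
                tenant_id: int) -> list[tuple[str, int]]:
    """Join orders to customers for one tenant by reading a single shard."""
    shard = shard_of(tenant_id)
    names = {c["customer_id"]: c["name"]
             for c in customers.shards[shard] if c["tenant_id"] == tenant_id}
    return [(names[o["customer_id"]], o["total"])
            for o in orders.shards[shard] if o["tenant_id"] == tenant_id]


customers = DistributedTable("customers")
orders = DistributedTable("orders")
customers.insert({"tenant_id": 1, "customer_id": 10, "name": "Ada"})
customers.insert({"tenant_id": 2, "customer_id": 10, "name": "Grace"})
orders.insert({"tenant_id": 1, "customer_id": 10, "total": 30})
orders.insert({"tenant_id": 2, "customer_id": 10, "total": 99})
orders.insert({"tenant_id": 1, "customer_id": 10, "total": 12})

result = tenant_join(orders, customers, tenant_id=1)
print(result)
```

The join never looks at another shard, so adding tenants (and shards) does not make a single tenant's queries more expensive in this model.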


  19. Danny Piangerelli
    Chief Enterprise Architect,
    North American Community Markets


  20. Scaling a Monolith: Evolution of our multi-tenant application
    At inception
    - Relational data model: Natural fit
    - Ecosystem: Language frameworks & tooling
    - Enterprise-grade reliability
    - Open source
    Today: Monolithic
    Future: Cloud native
    Requirements
    - Minimize app changes & disruption
    - Preserve enterprise-grade, relational semantics
    - Add/remove nodes on demand, with zero downtime
    - Ability to isolate tenants for performance & security
    - Worry-free cloud manageability (PaaS)


  21. Scaling a Monolith: Delivering on the promise of the cloud
    Options compared: Migrate to a commercial RDBMS; Scale using NoSQL; Hyperscale (Citus); Manually shard (Database or Schema level)
    Criteria:
    - Minimize app changes & disruption (Migration time and cost)
    - Maintenance costs
    - Add nodes on demand, with zero downtime
    - Tenant isolation
    - PaaS
    [Comparison matrix: the X marks per option and criterion are not recoverable from this transcript]


  22. Questions?
    aka.ms/azure-postgres
    aka.ms/azure-postgres-blog
    aka.ms/citus


  23. You might also be interested in…
    Session ID | Day / Time | Title | Speaker(s)
    BRK3018 | Tue 11/5, 11:45 AM | Building modern apps that scale to billions of events with Azure Database for PostgreSQL and Hyperscale (Citus) | Umur Cubukcu
    BRK2065 | Tue 11/5, 1:00 PM | Innovations to boost productivity with Azure-managed MySQL, Postgres, and MariaDB databases | Sunil Kamath
    THR2124 | Tue 11/5, 4:20 PM | Running Postgres at scale on-premises and in the cloud | Lukas Fittl
    BRK2064 | Wed 11/6, 11:45 AM | Why developers love Postgres | Craig Kerstiens, Shyam Pitchaimuthu
    THR2120 | Wed 11/6, 1:50 PM | Deploy an app in Azure Kubernetes and App Services with MySQL | Manish Kumar
    THR2123 | Wed 11/6, 4:20 PM | Why enterprises are moving from Oracle to Azure Postgres | Saurabh Modi
    BRK3019 | Thu 11/7, 2:15-3:00 | Migrate or build internet-scale applications using MySQL | Jan Engelsberg, Sunil Kamath
