Building modern apps that scale to billions of events with Azure Database for Postgres | Ignite 2019 | Umur Cubukcu

Come learn why customers call Hyperscale (Citus) a game changer (their words, not ours.) With the launch of Hyperscale (Citus), you can now scale out real-time analytics workloads horizontally using Azure Database for PostgreSQL. This session explores why so many developers are adopting the open source Postgres database, and highlights what makes our managed Postgres service on Azure unique. Then, we dive into real-world use cases to show how you can build scalable real-time analytics apps using Azure, Postgres, and Hyperscale (Citus). In particular, we show how one team built a petabyte-scale analytics dashboard that supports real-time decision making with response times of 90 ms—even with 6M queries/day across billions of rows.

Azure Database for PostgreSQL

November 05, 2019


  1. PostgreSQL is more popular than ever

    Panels: "loved" and "wanted"; DBMS of the Year; DB-Engines’ ranking of PostgreSQL popularity.
    Sources:
    https://insights.stackoverflow.com/survey/2019?utm_source=so-owned&utm_medium=blog&utm_campaign=dev-survey-2019&utm_content=launch-blog
    https://db-engines.com/en/blog_post/76
    https://db-engines.com/en/ranking_trend/system/PostgreSQL
  2. Why PostgreSQL?

    Open source. Proven resilience & stability. Rich feature set. Enterprise-ready.
    • Zero data loss
    • Rich indexing, high performance
    • Extensibility and tooling
  3. What is Hyperscale (Citus)?

    Making PostgreSQL future-proof, at any scale. Grow to 100s of database nodes without re-architecting your application.
    Diagram: block growth on 1 (monolithic) database vs. 18 total nodes.
  4. Creating the world's best PostgreSQL on Azure

    Uniquely delivering on all pillars of the open database platform: open source, proven resilience & stability, rich feature set, cloud management, highly scalable.

    Highly scalable, with Hyperscale (Citus):
    • Add more nodes anytime
    • Limitless compute & memory
    • Scale cost-effectively

    Cloud management:
    • Built-in high availability
    • Intelligent security & performance
    • Backups, monitoring
  5. Azure Database for PostgreSQL is available in two deployment options

    Enterprise-ready, fully managed community PostgreSQL with built-in HA and multi-layered security.

    Single Server: fully managed, single-node PostgreSQL. Example use cases:
    • Apps with JSON, geospatial support, or full-text search
    • Transactional and operational analytics workloads
    • Cloud-native apps built with modern frameworks

    Hyperscale (Citus): high-performance Postgres for scale out. Example use cases:
    • Scaling PostgreSQL multi-tenant, SaaS apps
    • Real-time operational analytics
    • Building high throughput transactional apps
  6. Microsoft Windows relies on Citus for mission-critical decisions

    6M+ queries per day; 75%
    https://techcommunity.microsoft.com/t5/Azure-Database-for-PostgreSQL/Architecting-petabyte-scale-analytics-by-scaling-out-Postgres-on/ba-p/969685
  7. Customers rely on Hyperscale (Citus) for mission-critical workloads across industries

    Industrial IoT & Insurance
    • Use case: Multi-tenant industrial IoT, storing measurement data from an IoT platform.
    • Value prop: Ability to parallelize data ingest and roll up to aggregated tables on the same database. Keep up with data capture from sensors in the field, with fast read/write access.

    Healthcare
    • Use case: Patient data retention and access through a bi-directional interface engine.
    • Value prop: Scalability: the average customer generates 3-4 TB. Now running 5 (5 node) clusters, each with 10 TB of customer data. Can rapidly expand a cluster to meet customer requirements.

    Retail
    • Use case: Computer vision software to optimize the Flipkart supply chain.
    • Value prop: Efficient supply chain inventory tracking. Leveraging AI to improve product tracking data accuracy that scales. 20x faster queries. Geospatial data with PostGIS.

    ISVs: SaaS applications
    • Use case: Real-time analytics with future multi-tenant and OLTP needs. B2B SaaS platform: AI for sales motions, buying “proclivity” engine.
    • Value prop: Better scale and 10x faster performance. PaaS and Microsoft ecosystem.
  8. Scale horizontally across hundreds of cores with Hyperscale (Citus)

    Shard your Postgres database across multiple nodes to give your application more memory, compute, and disk storage.
    • Easily add worker nodes to achieve horizontal scale
    • Scale up to 100s of nodes
    Diagram (sharding data across multiple nodes): the coordinator holds table metadata; each node runs PostgreSQL with Citus installed; 1 shard = 1 PostgreSQL table.
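As a rough illustration of the sharding idea on this slide, the Python sketch below hashes a distribution-column value to a shard and looks that shard up in coordinator-style metadata. It is a toy model under stated assumptions, not how Hyperscale (Citus) is implemented: the modulo hash, the shard count of 32, the round-robin placement, and every name (`shard_for`, `place_shards`, `route`, `worker-1`, and so on) are hypothetical and chosen only for the example.

```python
import zlib
from collections import defaultdict

SHARD_COUNT = 32  # hypothetical shard count, chosen for illustration


def shard_for(distribution_value: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a distribution-column value to a shard id with a stable hash.

    Assumption: a simple CRC32-modulo scheme stands in for the real hashing.
    """
    return zlib.crc32(distribution_value.encode("utf-8")) % shard_count


def place_shards(shard_count: int, workers: list[str]) -> dict[int, str]:
    """Coordinator-style metadata: shard id -> worker node (round-robin)."""
    return {shard: workers[shard % len(workers)] for shard in range(shard_count)}


def route(distribution_value: str, metadata: dict[int, str]) -> tuple[int, str]:
    """Return the (shard id, worker) that would hold a row with this value."""
    shard = shard_for(distribution_value, len(metadata))
    return shard, metadata[shard]


# Hypothetical three-worker cluster.
workers = ["worker-1", "worker-2", "worker-3"]
metadata = place_shards(SHARD_COUNT, workers)

# Count how many of 10,000 hypothetical device rows land on each worker.
rows_per_worker = defaultdict(int)
for device_id in range(10_000):
    _, worker = route(f"device-{device_id}", metadata)
    rows_per_worker[worker] += 1

print(dict(rows_per_worker))
```

The point of the sketch is the division of labor: the application only supplies the distribution value, while the metadata lookup decides which node holds the row.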
  9. Hyperscale (Citus) effectively manages data scale-out

    • Shard rebalancer redistributes shards across old and new worker nodes for balanced data scale-out
    • Shard rebalancer will recommend a rebalance when shards can be placed more evenly
    • For more control, use tenant isolation to easily allocate dedicated nodes to specific tenants with greater needs
    Diagram: Hyperscale (Citus) Cloud Shard Rebalancer.
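To make the rebalancing idea concrete, here is a minimal Python sketch of a greedy plan that moves shards from the most loaded worker to the least loaded one until shard counts differ by at most one. This is an assumed, simplified strategy for illustration only; the actual shard rebalancer may weigh shard size or other factors, and the names `shard_counts` and `plan_rebalance` are hypothetical.

```python
def shard_counts(placement: dict[int, str], workers: list[str]) -> dict[str, int]:
    """Count shards per worker, including workers that hold none yet."""
    counts = {worker: 0 for worker in workers}
    for worker in placement.values():
        counts[worker] += 1
    return counts


def plan_rebalance(placement: dict[int, str], workers: list[str]) -> list[tuple[int, str, str]]:
    """Return (shard, source, target) moves; the input placement is not mutated.

    Assumption: balance is judged by shard count only.
    """
    current = dict(placement)
    moves = []
    while True:
        counts = shard_counts(current, workers)
        source = max(workers, key=lambda w: counts[w])
        target = min(workers, key=lambda w: counts[w])
        if counts[source] - counts[target] <= 1:
            return moves  # already as even as shard counts allow
        shard = min(s for s, w in current.items() if w == source)
        current[shard] = target
        moves.append((shard, source, target))


# Eight shards on two hypothetical workers, then a third worker is added.
old_workers = ["worker-1", "worker-2"]
placement = {shard: old_workers[shard % 2] for shard in range(8)}
all_workers = old_workers + ["worker-3"]

moves = plan_rebalance(placement, all_workers)
for shard, source, target in moves:
    placement[shard] = target

print(moves)
print(shard_counts(placement, all_workers))
```

An empty move list corresponds to the case where no rebalance would be recommended, because the shards are already placed as evenly as possible.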
  10. Scaling out storage vs. compute for your RDBMS

    Scaling storage: query endpoint; primary with two replicas; shared storage layer.
    Scaling out compute, memory & storage: query endpoint; Data Node 1, Data Node 2, Data Node 3, N#.
  11. Hyperscale (Citus) use cases

    • Scaling PostgreSQL multi-tenant, SaaS applications
    • Real-time operational analytics
    • Building high throughput transactional apps
  12. Primary Use Cases for PostgreSQL Hyperscale (Citus)

    Digital transformations & data estate modernization. Data intensive OSS relational apps: scale from 100 GB to multiple PBs.
    • Multi-tenant & SaaS applications: B2B apps in enterprise, sharding, ISVs building SaaS applications
    • Real-time, operational analytics applications: analytics on JSON data, geospatial, timeseries, in-memory / HTAP workloads
    • Transactional / OLTP applications: strong consistency, relational semantics (foreign keys, joins), limitless data
  13. Real-time operational analytics and reporting

    Sub-second queries on billions of events.
    Diagram components: Devices, Event Hubs, Raw Events, Hyperscale (Citus), Aggregations, Scheduled process, Azure Databricks, App Service, Browser, Notification Hubs.
    1. Stream millions of events per second from devices and sensors into a scalable system that speaks and understands Apache Kafka. Build downstream pipelines to process, manipulate, and ingest your data.
    2. Ingest millions of raw transactional events into Hyperscale (Citus) per second, allowing you to query and alert on granular events.
    3. Perform incremental rollups directly in your database at granularities you define, such as minutely, hourly, or daily, for real-time reporting and dashboarding in downstream applications.
    4. Take advantage of Azure Databricks to clean, transform, and analyze the streaming data, and combine it with structured data from operational databases or data warehouses.
    5. Provide insights to users and operators on current device status.
    6. Push timely notifications directly to your users on their preferred service or medium.
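The incremental rollup in step 3 can be sketched in a few lines of Python. The example keeps a watermark (the last processed event id) and folds only newer raw events into per-minute counts, so re-running the job never double counts. Everything here is an assumption for illustration: the slide describes rollups performed inside the database, and the `Event` fields, the `MinuteRollup` class, and the monotonically increasing `event_id` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    event_id: int       # assumed: monotonically increasing ingest sequence
    device_id: str
    epoch_seconds: int  # event time as seconds since the epoch


@dataclass
class MinuteRollup:
    # (device_id, minute_start_epoch) -> event count
    counts: dict = field(default_factory=lambda: defaultdict(int))
    last_event_id: int = 0  # watermark: highest event id already rolled up

    def apply(self, events) -> int:
        """Fold only events newer than the watermark into per-minute counts."""
        processed = 0
        for event in sorted(events, key=lambda e: e.event_id):
            if event.event_id <= self.last_event_id:
                continue  # already aggregated in an earlier run
            minute = event.epoch_seconds - event.epoch_seconds % 60
            self.counts[(event.device_id, minute)] += 1
            self.last_event_id = event.event_id
            processed += 1
        return processed


raw_events = [
    Event(1, "device-a", 0),
    Event(2, "device-a", 59),
    Event(3, "device-a", 60),
    Event(4, "device-b", 61),
]

rollup = MinuteRollup()
first_run = rollup.apply(raw_events[:2])  # first scheduled run sees two events
second_run = rollup.apply(raw_events)     # later run sees all four, adds only the new ones
print(first_run, second_run, dict(rollup.counts))
```

Dashboards would then read the small aggregate structure instead of scanning raw events, which is the reason rollups help keep queries fast as event volume grows.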
  14. High-throughput transactional / OLTP applications

    Power of PostgreSQL and globally consistent transactions at scale with low latency.

    Azure DB for PostgreSQL Hyperscale (Citus) “scale out”:
    • Lower costs: better use of memory, and storage scaling. Does not need to split into multiple DBs
    • Sub-second responses across many users (high concurrency)
    • Easy to add new nodes
    • Evolve with large, open source ecosystem
    • Faster: load and indexing

    Scale up:
    • Limited compute and memory scaling to single node. Storage scale out limited to 64 TB.
    • Performance deteriorates with increased user concurrency
    • Performance: Resource utilization
    • Proprietary options, with lock-in.

    • Combine the power of relational semantics with horizontal scalability
    • Simplify your application development, without re-architecting your applications just for scale
    • Leverage Postgres: reliability, rich data types, extensions, & expertise.
    • Build with open source. Avoid lock-in.
    • Cut costs and reduce data duplication. Use high-performance JOINs, or combine with JSON data when you want to.
    • Keep millisecond responses while growing to 100s of terabytes of data.
  15. Scaling multi-tenant & SaaS applications

    Build applications that scale simply from one tenant to 1,000s.
    Diagram components: Data from multiple sources, Azure Kubernetes Service, Azure Cache for Redis, Hyperscale (Citus), Tenant (customer), Aggregations, Azure Machine Learning (Training & Predictive Experimentation), PostgreSQL Power BI Connector, Power BI, Notification Hubs, Consumers.
    1. Ingest and sync data from disparate sources in real time with transactional guarantees.
    2. Offload database demands by managing session state and asset caching with Azure Cache for Redis.
    3. Shard by tenant and allow Hyperscale to elastically scale out your data. With co-location and tenant isolation features, don’t worry about the scale limits of your database.
    4. Use scalable machine learning/deep learning techniques to derive deeper insights from this data.
    5. Report and visualize the state of your devices at a granular or aggregated level.
    6. Push timely notifications directly to your users on their preferred service or medium.
  16. Scaling a Monolith: Evolution of our multi-tenant application

    Stages: At inception; Today: Monolithic; Future: Cloud native.
    Requirements:
    - Minimize app changes & disruption: preserve enterprise-grade, relational semantics
    - Add/remove nodes on demand, with zero downtime
    - Ability to isolate tenants for performance & security
    - Worry-free cloud manageability (PaaS)
    - Relational data model: Natural fit
    - Ecosystem: Language frameworks & tooling
    - Enterprise-grade reliability
    - Open source
  17. Scaling a Monolith: Delivering on the promise of the cloud

    Options compared: Migrate to a commercial RDBMS; Scale using NoSQL; Hyperscale (Citus); Manually shard (database or schema level).
    Criteria: Minimize app changes & disruption (migration time and cost); Maintenance costs; Add nodes on demand, with zero downtime; Tenant isolation; PaaS.
    Marks: X ✓ ✓ X X XX ✓ X X ✓ X ✓ ✓ X X ✓ ✓ ✓ ✓ ✓
  18. You might also be interested in…

    Session ID | Day / Time | Title | Speaker(s)
    BRK3018 | Tue 11/5 11:45 AM | Building modern apps that scale to billions of events with Azure Database for PostgreSQL and Hyperscale (Citus) | Umur Cubukcu
    BRK2065 | Tue 11/5 1:00 PM | Innovations to boost productivity with Azure-managed MySQL, Postgres, and MariaDB databases | Sunil Kamath
    THR2124 | Tue 11/5 4:20 PM | Running Postgres at scale on-premises and in the cloud | Lukas Fittl
    BRK2064 | Wed 11/6 11:45 AM | Why developers love Postgres | Craig Kerstiens, Shyam Pitchaimuthu
    THR2120 | Wed 11/6 1:50 PM | Deploy an app in Azure Kubernetes and App Services with MySQL | Manish Kumar
    THR2123 | Wed 11/6 4:20 PM | Why enterprises are moving from Oracle to Azure Postgres | Saurabh Modi
    BRK3019 | Thu 11/7 2:15-3:00 | Migrate or build internet-scale applications using MySQL | Jan Engelsberg, Sunil Kamath