
What Is Azure Arc Enabled PostgreSQL Hyperscale | Data Platform Summit 2020 | Jean-Yves Devant & Nikil Patel

Would you like to modernize to the cloud, but you can’t migrate everything overnight? Do you need to keep some workloads on your premises for regulatory or compliance reasons while you move other applications to the cloud? Do you need a database engine that can scale dynamically, with no downtime, to match the growth of your multi-tenant/SaaS workloads or your real-time analytics applications? Are you already using the Postgres database engine, or planning to migrate to it? Would you like to use the same Postgres-based solution both as a managed service in the cloud and in your data center? If you answered yes to any of these questions, join us in this session to learn about Azure Arc enabled PostgreSQL Hyperscale. This is a new hybrid Azure data service that runs on any physical infrastructure: on premises, at the edge or in the cloud (Azure, AWS, GCP). It is the same technology as the Azure Database for PostgreSQL Hyperscale (Citus) managed service and is now available on the infrastructure of your choice with Azure Arc. Like its sibling in Azure PaaS, Azure Arc enabled PostgreSQL Hyperscale uses the open source Citus extension to scale horizontally, transforming Postgres into a distributed database by distributing your data and your queries across all the nodes in a cluster. In the cloud and on your own infrastructure, we have it for you. Let’s connect.


Azure Postgres

December 04, 2020


  1. What is Azure Arc enabled PostgreSQL Hyperscale?

    Jean-Yves Devant (JY), Principal Program Manager, Microsoft
  2. What is Azure Arc enabled PostgreSQL Hyperscale?

  3. Where Does That Fit? How does it work? Why PostgreSQL?

    What is Azure Database for PostgreSQL Hyperscale (Citus)? What is Azure Arc? What is Azure Arc enabled data services? Where does Azure Arc and Azure PostgreSQL Hyperscale meet? Show it to me!

  5. Why PostgreSQL? Loved / wanted: https://insights.stackoverflow.com/survey/2019 DBMS of the Year: https://db-engines.com/en/blog_post/76

  6. Why PostgreSQL?

    Open source: large developer community. Proven resilience & stability: thousands of mission critical workloads. Rich feature set: solves a multitude of use cases.
  7. Why PostgreSQL?

    High performance; open source; relational; JSON/B support; key/value pairs with hstore; extensions; highly customizable; flexible datatypes; Python, Ruby, R, V8…; frequent releases; rich indexing; geospatial; full text search.

  9. What is Azure Database For PostgreSQL Hyperscale (Citus)?

    Managed service in Azure; runs the Citus extension; cluster of multiple PostgreSQL instances; scales out compute horizontally; distributes data and queries; superior performance.
  10. How Does It Work?

    A cluster of multiple PostgreSQL servers with the Citus extension.
  11. How Does It Work? Distributes tables across the cluster

    APPLICATION: CREATE TABLE campaigns (…); SELECT create_distributed_table('campaigns', 'company_id');
    COORDINATOR NODE (METADATA)
    WORKER NODES W1 W2 W3 … Wn: CREATE TABLE campaigns_101; CREATE TABLE campaigns_102; CREATE TABLE campaigns_103; CREATE TABLE campaigns_104; CREATE TABLE campaigns_105; CREATE TABLE campaigns_106
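
    The table distribution on slide 11 can be pictured with a small toy model. This is only a sketch under stated assumptions: the shard count, the worker names, the CRC32 hash and the round-robin placement are illustrative choices, and the helper names `shard_for`, `worker_for` and `route` are hypothetical. The Citus extension uses its own hash function and shard metadata on the coordinator.

```python
import zlib

# Assumptions for illustration only: six shards named like the slide
# (campaigns_101 .. campaigns_106) spread over three workers.
SHARD_COUNT = 6
FIRST_SHARD_ID = 101
WORKERS = ["W1", "W2", "W3"]


def shard_for(company_id: str) -> int:
    """Hash the distribution column value to a shard index in [0, SHARD_COUNT)."""
    return zlib.crc32(company_id.encode("utf-8")) % SHARD_COUNT


def worker_for(shard_index: int) -> str:
    """Place shards on workers round-robin (a simplification)."""
    return WORKERS[shard_index % len(WORKERS)]


def route(company_id: str) -> tuple[str, str]:
    """Return (shard table name, worker) that would hold rows for company_id."""
    index = shard_for(company_id)
    return f"campaigns_{FIRST_SHARD_ID + index}", worker_for(index)


if __name__ == "__main__":
    for company in ["Pat Co", "Acme", "Contoso"]:
        print(company, "->", route(company))
```

    The point of the sketch is that every row with the same `company_id` lands in the same shard, which is what lets single-tenant queries be routed to one worker.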
  12. How Does It Work? Distributes queries across the cluster

    APPLICATION: SELECT company_id, avg(spend) AS avg_campaign_spend FROM campaigns GROUP BY company_id;
    COORDINATOR NODE (METADATA)
    WORKER NODES W1 W2 W3 … Wn: SELECT company_id sum(spend), count(spend) … FROM campaigns_2009 …; SELECT company_id sum(spend), count(spend) … FROM campaigns_2001 …; SELECT company_id sum(spend), count(spend) … FROM campaigns_2017 …
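
    Slide 12 shows the `avg(spend)` from the application being sent to the workers as `sum(spend)` and `count(spend)`. A minimal sketch of why that rewrite works is below. It is a toy model in plain Python, not the Citus planner: the function names `worker_partial` and `coordinator_merge` and the sample rows are hypothetical.

```python
from collections import defaultdict


def worker_partial(rows):
    """Per-shard step: compute (sum, count) of spend for each company_id."""
    partial = defaultdict(lambda: [0.0, 0])
    for company_id, spend in rows:
        partial[company_id][0] += spend
        partial[company_id][1] += 1
    return {company: tuple(acc) for company, acc in partial.items()}


def coordinator_merge(partials):
    """Coordinator step: add up the partial sums and counts, then divide once."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for company_id, (spend_sum, spend_count) in partial.items():
            totals[company_id][0] += spend_sum
            totals[company_id][1] += spend_count
    return {company: s / c for company, (s, c) in totals.items()}


# Hypothetical shard contents: (company_id, spend) rows held by three workers.
shards = [
    [("A", 10.0), ("A", 20.0), ("B", 5.0)],
    [("A", 60.0), ("C", 7.0)],
    [("B", 15.0)],
]

result = coordinator_merge(worker_partial(rows) for rows in shards)
```

    Averaging the per-shard averages would be wrong whenever a group spans shards of different sizes (for company A above, (15 + 60) / 2 is not the true mean of 30), which is why the partial results carry sums and counts instead.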
  13. How Does It Work? Distributes transactions in the cluster, example 1

    APPLICATION: BEGIN; UPDATE campaigns SET start_date = '2018-03-17' WHERE company_id = 'Pat Co'; COMMIT;
    COORDINATOR NODE (METADATA)
    WORKER NODES W1 W2 W3 … Wn: BEGIN; UPDATE Campaigns_2012 SET …; COMMIT;
  14. How Does It Work? Distributes transactions in the cluster, example 2

    APPLICATION: … feedback = 'relevance' … company_type = 'platinum'; … ads … feedback = 'relevance' … company_type = 'platinum';
    COORDINATOR NODE (METADATA)
    WORKER NODES W1 W2 W3 … Wn: BEGIN … assign_distributed_transaction_id … UPDATE campaigns_2009 … COMMIT PREPARED …; BEGIN … assign_distributed_transaction_id … UPDATE campaigns_2001 … COMMIT PREPARED …; BEGIN … assign_distributed_transaction_id … UPDATE campaigns_2017 … COMMIT PREPARED …
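
    The `COMMIT PREPARED` statements in example 2 are the second phase of PostgreSQL's two-phase commit. The sketch below is a simplified, in-memory illustration of that protocol, assuming a coordinator that prepares every worker before committing any of them. The `Worker` class and `run_distributed_transaction` function are hypothetical names, and real failure handling (timeouts, recovery of prepared transactions) is left out.

```python
class Worker:
    """Toy stand-in for a worker node taking part in a two-phase commit."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.state = "idle"

    def prepare(self):
        """Phase 1: promise that the local changes can be committed."""
        if self.fail_on_prepare:
            self.state = "aborted"
            return False
        self.state = "prepared"
        return True

    def commit_prepared(self):
        """Phase 2: make the prepared changes durable and visible."""
        self.state = "committed"

    def rollback_prepared(self):
        """Undo a prepared transaction after another worker failed."""
        self.state = "aborted"


def run_distributed_transaction(workers):
    """Commit on all workers, or on none if any worker fails to prepare."""
    prepared = []
    for worker in workers:
        if worker.prepare():
            prepared.append(worker)
        else:
            for done in prepared:
                done.rollback_prepared()
            return False
    for worker in workers:
        worker.commit_prepared()
    return True
```

    For example, `run_distributed_transaction([Worker("W1"), Worker("W2", fail_on_prepare=True)])` returns `False` and leaves no worker in the committed state, which is the all-or-nothing behaviour a multi-shard `UPDATE` needs.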
  15. How Far Can Citus Scale?

    Algolia: 5-10B rows ingested per day. Heap: 700+ billion events, 1.4PB data on a 70-node Citus. Chartbeat: >2.6B rows added per month. Mixrank: 1.6PB time series data. Pex: 80B rows updated/day, 20-node Citus, 2.4TB memory, 1280 cores, 80TB of data. Microsoft: petabyte-scale analytics from 800M+ Windows devices. “Distributed PostgreSQL is a game changer. We can support more than 6M queries every day, on 2 PB of data. With Citus, response times for 75% of queries are less than 0.2 seconds.” https://aka.ms/blog-petabyte-scale-analytics Customer stories: https://www.citusdata.com/customers
  16. Citus helps ASB onboard customers 20x faster “After migrating to

    Citus, we can onboard Vonto customers 20X faster, in 2 minutes vs. the 40+ minutes it used to take. And with the launch of Hyperscale (Citus) on Azure Database for PostgreSQL, we are excited to see what we can build next on Azure.”

  18. Hybrid cloud is the norm

  19. Managing the complexity of hybrid cloud is the challenge

  20. Customer Environments Are Increasingly Complex

    10s–1,000s of apps; diverse infrastructure; multi-cloud; IoT devices; edge; datacenters; branch offices; hosters; OEM hardware; VMs; containers; databases; serverless.
  21. Azure Arc Helps You Realize Cloud Benefits Everywhere!

    Elastic scalability; self-service provisioning; built-in monitoring and security; pay for just what you use; management from anywhere; automation at scale.
  22. Azure IoT Any edge device Azure Arc Any datacenter, any

    cloud Integrated systems Azure Stack Microsoft Azure Azure Hybrid Innovation anywhere with Azure Management | Security + Identity | App + Data Services | Dev Tools + DevOps
  23. Azure Arc: Bring Azure services and management to any infrastructure

    Azure Arc is a set of technologies that extends Azure management and enables Azure services to run across on-premises, multi-cloud, and edge. Implement Azure security anywhere; run Azure Data Services anywhere; extend Azure management across your environments; adopt cloud practices on-premises.
  24. Across Any Infrastructure Public cloud On-premises datacenter Edge site

  25. Azure Arc Customer use cases

    At-scale Kubernetes app management; organize and govern across environments (multi-cloud, datacenter & hosted); use cloud services on prem and still meet compliance and regulatory requirements; run data services anywhere. Azure Arc enabled servers: https://aka.ms/arc-serversdocs Azure Arc enabled Kubernetes: https://aka.ms/Azure-Arc-Kubernetes Azure Arc enabled data services: https://aka.ms/azurearcdata All Azure Arc services: https://aka.ms/azurearc
  26. How Do I Get Started With Azure Arc? http://aka.ms/azurearc https://docs.microsoft.com/azure/azure-arc


  28. Azure Arc Enabled Data Services: Azure data services in your datacenter, multi-cloud, and edge

    Elastic scale; PostgreSQL Hyperscale; scale up, scale out on demand; automation at scale; always current; self-service provisioning in seconds; automated updates; evergreen SQL Managed Instance; unified management; single view for on-prem, clouds, and edge; consistent tools and workflows; built-in monitoring and security; connected or disconnected.
  29. Azure Arc Enabled Data Services: In Preview Now!

    Azure Arc enabled PostgreSQL Hyperscale; Azure Arc enabled SQL Managed Instance; Azure Arc enabled SQL Server. Try Azure Arc enabled data services for free and let us know what you think: https://aka.ms/AzureArcData
  30. Azure Data Services Anywhere At A Glance Applications Custom apps

    Analytics BI … Any Kubernetes AKS Any hardware Azure data services OEM hardware Azure data controller Kubernetes OpenShift Microsoft Azure Site Recovery Azure Site Recovery Monitoring Azure Security Provisioning HA/DR Scaling Updates Backup Diagnostics Amazon EC2
  31. Why Kubernetes?

    Leading application containers technology; abstraction layer, runs on any infrastructure; consistent & at-scale deployment and management in seconds; automation and CI/CD at scale with GitOps: https://www.gitops.tech
  32. Connectivity Modes

    Indirectly connected (preview): local provisioning/de-provisioning; local elastic scaling; local monitoring; local log analytics; local backup/restore; upload logs and metrics to Monitor; view inventory in Azure; upload billing data to Azure; use Kubernetes authentication and authorization; Azure DevOps, GitOps operations.
    Directly connected (future): more details to be announced later…
  33. Azure Arc data controller Backup Monitoring and logs Controller API

    Azure Arc integration HA/DR Scaling Patching/updates Provisioning Persistent storage Node Node Node Node Node Node Azure Data Studio Identity Azure RBAC & Policy Advanced Data Security Deployments Resource Inventory Logs & Telemetry Backup Retention Consumption azdata CLI kubectl CLI Microsoft Container Registry Azure Portal Azure Data Studio CLI 3rd Party Kubernetes API Azure Arc enabled PostgreSQL Hyperscale Other Database service Analytics services Azure Arc Data Services Architecture Deeper Dive
  34. Roles And Responsibilities: PaaS Vs. Hybrid

    Who’s in charge of SLAs? Azure Platform As A Service (PaaS): Microsoft. Azure Arc hybrid services: Customer.
    Does Microsoft provide SLAs? PaaS: Yes. Hybrid: No.
    Who does the operations? PaaS: Microsoft. Hybrid: Customer.
    Who provides the software*? PaaS: Microsoft. Hybrid: Microsoft.
    Who provides the infrastructure? PaaS: Microsoft. Hybrid: Customer.
    *Azure services
  35. How Do I Get Started With Azure Arc Enabled Data Services?

    https://docs.microsoft.com/azure/azure-arc/data/

  37. This Is Where It All Comes Together Azure Arc enabled

    PostgreSQL Hyperscale is:
  38. How Do I Get Started With Azure Arc Enabled PostgreSQL Hyperscale?

    Get started: https://aka.ms/arcpostgresqlhyperscale Deploy: https://aka.ms/deployarcpostgresqlhyperscale Accelerated experience with a test deployment: https://github.com/microsoft/azure_arc#azure-arc-enabled-data-services In preview now. Free
  39. SHOW IT TO ME!

  40. Postgres In Azure Vs. Other Clouds? The Choice

    Single Server: Enterprise-ready, fully managed community OSS engines. Flexible Server (Preview) NEW: Maximum control with a simplified developer experience. Hyperscale (Citus): Worry-free PostgreSQL in the cloud with an architecture built to scale out. Azure Arc enabled PostgreSQL Hyperscale NEW: Hybrid, scale out PostgreSQL in environment of your choice. Open source & community: PostgreSQL committers at Microsoft: https://aka.ms/blog-postgres-committers
  41. Q&A Get started https://aka.ms/arcpostgresqlhyperscale Follow us: @AzureDBPostgres, @CitusData

  42. Special Thanks To for supporting DataPlatformGeeks & SQLServerGeeks Community Initiatives

  43. Three Ways to Win Prizes Post your selfie with hash

    tag #DPS2020 Give Session & Conference Feedback Visit our Sponsors & Exhibitors Thank You Follow us on Twitter @TheDataGeeks @DataAISummit
  44. Data Platform Virtual Summit 2020 is a community initiative by

    DataPlatformGeeks RESOURCES
  45. Go Deeper Into Postgres & Hyperscale (Citus)

    • https://www.citusdata.com/
    • http://docs.citusdata.com/en/v9.5/
    • Why Scale Out Postgres? https://youtu.be/g3H4nGsJsl0
    • DEMO - High performance HTAP with Postgres & Hyperscale (Citus) https://youtu.be/W_3e07nGFxY
    • DEMO - Building HTAP Applications with Python & Postgres on Azure https://youtu.be/YDT8_riLLs0