
What Is Azure Arc Enabled PostgreSQL Hyperscale | Data Platform Summit 2020 | Jean-Yves Devant & Nikil Patel


Would you like to modernize to the cloud, but you can’t migrate everything overnight? Do regulatory or compliance reasons require you to keep some workloads on premises while you move other applications to the cloud? Do you need a database engine that scales dynamically, with no downtime, to match the growth of your multi-tenant/SaaS workloads or your real-time analytics applications? Are you already using the Postgres database engine, or planning to migrate to it? Would you like to use the same Postgres-based solution both as a managed service in the cloud and in your own data center? If you answered yes to any of these questions, join us in this session to learn about Azure Arc enabled PostgreSQL Hyperscale. This new hybrid Azure data service runs on any physical infrastructure: on premises, at the edge, or in the cloud (Azure, AWS, GCP). It is the same technology as the Azure Database for PostgreSQL Hyperscale (Citus) managed service, now available on the infrastructure of your choice with Azure Arc. Like its sibling in Azure PaaS, Azure Arc enabled PostgreSQL Hyperscale uses the open source Citus extension to scale horizontally, transforming Postgres into a distributed database by distributing your data and your queries across all the nodes in a cluster. In the cloud or on your own infrastructure, we have it for you. Let’s connect.

Azure Database for PostgreSQL

December 04, 2020


Transcript

  1. What is
    Azure Arc enabled
    PostgreSQL Hyperscale?
    Jean-Yves Devant (JY)
    Principal Program Manager
    Microsoft


  2. What is
    Azure Arc enabled PostgreSQL Hyperscale?


  3. Where Does That Fit?
    How does it work?
    Why PostgreSQL?
    What is Azure Database for PostgreSQL Hyperscale (Citus)?
    What is Azure Arc?
    What is Azure Arc enabled data services?
    Where do Azure Arc and Azure PostgreSQL Hyperscale meet?
    Show it to me!


  4. WHY POSTGRESQL?


  5. loved wanted
    https://insights.stackoverflow.com/survey/2019
    https://db-engines.com/en/blog_post/76
    DBMS of the Year
    Why PostgreSQL?


  6. Open source
    Large developer
    community
    Proven resilience
    & stability
    Thousands of mission
    critical workloads
    Rich feature set
    Solves a multitude of
    use cases
    Why PostgreSQL?


  7. High performance
    Open source
    Relational
    JSON/B support
    Key/value pairs with hstore
    Extensions
    Highly customizable
    Flexible datatypes
    Python, Ruby, R, V8…
    Frequent releases
    Rich indexing
    Geospatial
    Full text search
    Why PostgreSQL?


  8. WHAT IS
    AZURE DATABASE FOR
    POSTGRESQL HYPERSCALE (CITUS)?


  9. What is
    Azure Database For PostgreSQL Hyperscale (Citus)?
    Managed service
    in Azure
    Runs the Citus
    extension
    Cluster of
    multiple
    PostgreSQL
    instances
    Scales out
    compute
    horizontally
    Distributes data
    and queries
    Superior
    performance


  10. APPLICATION
    COORDINATOR
    NODE
    WORKER NODES
    W1
    W2
    W3 …
    Wn
    A cluster of multiple PostgreSQL servers with the Citus extension.
    How Does It Work?


  11. APPLICATION
    CREATE TABLE campaigns (…);
    SELECT create_distributed_table('campaigns', 'company_id');
    METADATA
    COORDINATOR
    NODE
    WORKER NODES
    W1
    W2
    W3 …
    Wn
    CREATE TABLE campaigns_102
    CREATE TABLE campaigns_105
    CREATE TABLE campaigns_101
    CREATE TABLE campaigns_104
    CREATE TABLE campaigns_103
    CREATE TABLE campaigns_106
    Distributes tables across the cluster
    How Does It Work?
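
One way to picture what `create_distributed_table` does with the distribution column: each row's `company_id` is hashed, and the hash decides which shard table holds the row. The sketch below is a simplification under stated assumptions, not the Citus implementation. `zlib.crc32` stands in for the real hash function, the six shard names mirror the slide (`campaigns_101` to `campaigns_106`), the round-robin placement is assumed, and `shard_for` and `place_shards` are hypothetical helper names.

```python
import zlib

SHARD_COUNT = 6          # the slide shows six shards
FIRST_SHARD_ID = 101     # campaigns_101 .. campaigns_106, as on the slide
HASH_SPACE = 2**32       # crc32 yields an unsigned 32-bit value


def shard_for(company_id: str) -> str:
    """Return the shard table that would hold rows for this company_id."""
    # crc32 is only a stand-in for the real hash function (assumption).
    h = zlib.crc32(company_id.encode("utf-8"))
    # Split the hash space into SHARD_COUNT equal ranges, one per shard.
    range_size = HASH_SPACE // SHARD_COUNT
    index = min(h // range_size, SHARD_COUNT - 1)
    return f"campaigns_{FIRST_SHARD_ID + index}"


def place_shards(workers):
    """Assign shards to workers round-robin (an assumed placement policy)."""
    return {
        f"campaigns_{FIRST_SHARD_ID + i}": workers[i % len(workers)]
        for i in range(SHARD_COUNT)
    }


if __name__ == "__main__":
    placement = place_shards(["W1", "W2", "W3"])
    for company in ["Pat Co", "Acme", "Contoso"]:
        shard = shard_for(company)
        print(company, "->", shard, "on", placement[shard])
```

Because the mapping depends only on the distribution column value, every row for a given company lands in the same shard, which is what lets per-tenant queries stay on one worker.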


  12. APPLICATION
    SELECT company_id,
           avg(spend) AS avg_campaign_spend
    FROM campaigns
    GROUP BY company_id;
    METADATA
    COORDINATOR
    NODE
    WORKER NODES
    W1
    W2
    W3 …
    Wn
    SELECT company_id, sum(spend), count(spend) … FROM campaigns_2009 …
    SELECT company_id, sum(spend), count(spend) … FROM campaigns_2001 …
    SELECT company_id, sum(spend), count(spend) … FROM campaigns_2017 …
    Distributes queries across the cluster
    How Does It Work?
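
The slide shows the application asking for `avg(spend)` while each worker returns `sum(spend)` and `count(spend)`. An average cannot be rebuilt from per-shard averages, but it can be rebuilt from per-shard sums and counts. The following is a minimal sketch of that idea in plain Python; `worker_partial` and `coordinator_merge` are hypothetical names and the sample rows are invented for illustration.

```python
from collections import defaultdict


def worker_partial(rows):
    """Per-shard work: sum(spend) and count(spend) grouped by company_id."""
    partial = defaultdict(lambda: [0.0, 0])
    for company_id, spend in rows:
        partial[company_id][0] += spend
        partial[company_id][1] += 1
    return dict(partial)


def coordinator_merge(partials):
    """Coordinator work: add up the partial sums and counts, then divide."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for company_id, (spend_sum, spend_count) in partial.items():
            totals[company_id][0] += spend_sum
            totals[company_id][1] += spend_count
    return {cid: s / c for cid, (s, c) in totals.items()}


if __name__ == "__main__":
    # Invented sample data: three shards holding (company_id, spend) rows.
    shards = [
        [("a", 10.0), ("a", 30.0)],
        [("b", 5.0)],
        [("a", 20.0)],
    ]
    partials = [worker_partial(rows) for rows in shards]
    print(coordinator_merge(partials))
```

In the sample data a company spans two shards on purpose, to show that the merge stays correct in that case. When the table is distributed by `company_id`, as on the slide, each company's rows sit in a single shard and the merge step becomes trivial.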


  13. Distributes transactions in the cluster, example 1
    BEGIN;
    UPDATE campaigns
    SET start_date = '2018-03-17'
    WHERE company_id = 'Pat Co';
    COMMIT;
    METADATA
    W1
    W2
    W3 …
    Wn
    BEGIN; UPDATE campaigns_2012 SET …; COMMIT;
    APPLICATION
    COORDINATOR
    NODE
    WORKER NODES
    How Does It Work?
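
In this example the `WHERE` clause filters on the distribution column, so the whole transaction can be sent to the one worker that holds the matching shard. A rough sketch of that routing decision follows. It is an illustration only: `target_shards` is a hypothetical function, the list of 32 shard names is an assumption chosen to resemble the names on the slides, and `zlib.crc32` with a modulo stands in for the real hash-to-shard mapping.

```python
import zlib

DISTRIBUTION_COLUMN = "company_id"
# Assumed shard list; the slides only show a few names of this form.
SHARDS = [f"campaigns_{n}" for n in range(2001, 2033)]


def target_shards(where: dict) -> list:
    """Return the shards a statement must touch, given equality filters."""
    if DISTRIBUTION_COLUMN in where:
        # Filter on the distribution column: exactly one shard is involved.
        value = str(where[DISTRIBUTION_COLUMN]).encode("utf-8")
        return [SHARDS[zlib.crc32(value) % len(SHARDS)]]
    # No filter on the distribution column: every shard may hold matches.
    return list(SHARDS)


if __name__ == "__main__":
    print(target_shards({"company_id": "Pat Co"}))             # one shard
    print(len(target_shards({"company_type": "platinum"})))    # all shards
```

The second call corresponds to a filter such as `company_type = 'platinum'`, which does not name the distribution column and therefore fans out to all shards.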


  14. APPLICATION
    BEGIN;
    UPDATE campaigns
    SET feedback = 'relevance'
    WHERE company_type = 'platinum';
    UPDATE ads
    SET feedback = 'relevance'
    WHERE company_type = 'platinum';
    COMMIT;
    METADATA
    W1
    W2
    W3 …
    Wn
    BEGIN …
    assign_distributed_transaction_id …
    UPDATE campaigns_2009 …
    COMMIT PREPARED …
    BEGIN …
    assign_distributed_transaction_id …
    UPDATE campaigns_2001 …
    COMMIT PREPARED …
    BEGIN …
    assign_distributed_transaction_id …
    UPDATE campaigns_2017 …
    COMMIT PREPARED …
    COORDINATOR
    NODE
    WORKER NODES
    Distributes transactions in the cluster, example 2
    How Does It Work?
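
The `COMMIT PREPARED` step on each worker points to a two-phase commit: every worker first prepares its part of the transaction, and the coordinator commits only once all of them have prepared successfully. The toy model below sketches that control flow in Python. It is not the Citus code path; the `Worker` class, its states, and `run_distributed_transaction` are hypothetical and exist only to show the all-or-nothing behaviour.

```python
class Worker:
    """Toy worker that tracks the state of one distributed transaction."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare
        self.state = "idle"

    def prepare(self, txn_id):
        # Phase 1: promise that the local changes can be committed.
        if self.fail_on_prepare:
            return False
        self.state = "prepared"
        return True

    def commit_prepared(self, txn_id):
        # Phase 2: make the prepared changes durable and visible.
        if self.state != "prepared":
            raise RuntimeError(f"{self.name} was not prepared for {txn_id}")
        self.state = "committed"

    def rollback(self, txn_id):
        # Undo local work, whether or not it had been prepared.
        self.state = "aborted"


def run_distributed_transaction(workers, txn_id):
    """Commit on every worker, or on none of them."""
    for worker in workers:
        if not worker.prepare(txn_id):
            # One worker could not prepare: abort everywhere.
            for w in workers:
                w.rollback(txn_id)
            return False
    for worker in workers:
        worker.commit_prepared(txn_id)
    return True


if __name__ == "__main__":
    healthy = [Worker("W1"), Worker("W2"), Worker("W3")]
    print(run_distributed_transaction(healthy, "txn-1"))
    print([w.state for w in healthy])
```

A real system also has to recover prepared transactions after a crash between the two phases, which this sketch leaves out.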


  15. How Far Can Citus Scale?
    Algolia
    5-10B rows ingested per day
    Heap
    700+ billion events
    1.4PB data on a 70-node Citus
    Chartbeat
    >2.6B rows added per month
    Mixrank
    1.6PB time series data
    Microsoft
    Petabyte-scale analytics from
    800M+ Windows devices
    “Distributed PostgreSQL is a game changer. We can
    support more than 6M queries every day, on 2 PB of data.
    With Citus, response times for 75% of queries are less than
    0.2 seconds.”
    https://aka.ms/blog-petabyte-scale-analytics
    Pex
    80B rows updated/day
    20-node Citus
    2.4TB memory, 1280 cores, 80TB of data
    Customer stories: https://www.citusdata.com/customers
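
To put these cluster sizes in perspective, the totals can be divided by the node counts quoted on the slide. The figures below are back-of-the-envelope averages only, assuming decimal units (1 TB = 1,000 GB, 1 PB = 1,000 TB), an even spread across nodes, and a uniform query rate over the day; none of these assumptions is stated in the deck.

```python
def per_node(total, nodes):
    """Average share of a cluster-wide total held by one node."""
    return total / nodes


# Pex: 20 nodes, 2.4 TB memory, 1280 cores, 80 TB of data (from the slide).
pex = {
    "memory_gb": per_node(2400, 20),
    "cores": per_node(1280, 20),
    "data_tb": per_node(80, 20),
}

# Heap: 1.4 PB of data on 70 nodes (from the slide).
heap_tb_per_node = per_node(1400, 70)

# Microsoft quote: more than 6M queries every day.
queries_per_second = 6_000_000 / 86_400

if __name__ == "__main__":
    print(pex)
    print(heap_tb_per_node)
    print(round(queries_per_second, 1))
```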


  16. Citus helps ASB
    onboard customers
    20x faster
    “After migrating to Citus, we can onboard Vonto
    customers 20X faster, in 2 minutes vs. the 40+
    minutes it used to take. And with the launch of
    Hyperscale (Citus) on Azure Database for
    PostgreSQL, we are excited to see what we can
    build next on Azure.”


  17. WHAT IS AZURE ARC?


  18. Hybrid cloud
    is the norm


  19. Managing the
    complexity of hybrid
    cloud is the
    challenge


  20. 10s–1,000s of apps Diverse infrastructure Multi-cloud
    IoT devices Edge
    Datacenters
    Branch offices
    Hosters
    OEM hardware
    VMs
    Containers
    Databases
    Serverless
    Customer Environments Are Increasingly Complex


  21. Elastic scalability
    Self-service provisioning
    Built-in monitoring and security
    Pay for just what you use
    Management from anywhere
    Automation at scale
    Azure Arc Helps You Realize Cloud Benefits
    Everywhere!


  22. Azure IoT
    Any edge device
    Azure Arc
    Any datacenter, any cloud
    Integrated systems
    Azure Stack
    Microsoft Azure
    Azure Hybrid
    Innovation anywhere with Azure
    Management | Security + Identity | App + Data Services | Dev Tools + DevOps


  23. Azure Arc
    Bring Azure services and management to any infrastructure
    Azure Arc is a set of technologies that extends Azure management and enables
    Azure services to run across on-premises, multi-cloud, and edge
    Implement Azure
    security anywhere
    Run Azure Data
    Services anywhere
    Extend Azure management
    across your environments
    Adopt cloud
    practices on-premises


  24. Across Any Infrastructure
    Public cloud On-premises datacenter Edge site


  25. At-scale Kubernetes
    app management
    Organize and govern
    across environments
    Multi-cloud
    Datacenter & hosted
    Azure Arc
    Customer use cases
    Use cloud services
    on prem and still
    meet compliance
    and regulatory
    requirements
    Azure Arc enabled servers
    https://aka.ms/arc-serversdocs
    Azure Arc enabled Kubernetes
    https://aka.ms/Azure-Arc-Kubernetes
    Azure Arc enabled data services
    https://aka.ms/azurearcdata
    All Azure Arc services
    https://aka.ms/azurearc
    Run data services
    anywhere


  26. How Do I Get Started
    With
    Azure Arc?
    http://aka.ms/azurearc
    https://docs.microsoft.com/azure/azure-arc


  27. WHAT IS
    AZURE ARC ENABLED DATA SERVICES?


  28. Elastic scale
    PostgreSQL Hyperscale
    Scale up, scale out on demand
    Automation at scale
    Always current
    Self-service provisioning in seconds
    Automated updates
    Evergreen SQL Managed Instance
    Unified management
    Single view for on-prem, clouds, and edge
    Consistent tools and workflows
    Built-in monitoring and security
    Azure Arc Enabled Data Services
    Azure data services in your datacenter, multi-cloud, and edge
    Connected or Disconnected


  29. Azure Arc enabled
    PostgreSQL Hyperscale
    Azure Arc Enabled Data Services In Preview Now!
    Azure Arc enabled
    SQL Managed Instance
    Azure Arc enabled
    SQL Server
    Try Azure Arc enabled data services for free and let us know what you think
    https://aka.ms/AzureArcData


  30. Azure Data Services Anywhere At A Glance
    Applications Custom
    apps Analytics
    BI

    Any Kubernetes
    AKS
    Any hardware
    Azure data services
    OEM hardware
    Azure data controller
    Kubernetes OpenShift
    Microsoft Azure
    Site Recovery
    Azure Site Recovery
    Monitoring
    Azure Security
    Provisioning
    HA/DR
    Scaling
    Updates
    Backup
    Diagnostics
    Amazon EC2


  31. Why Kubernetes?
    Leading application containers technology
    Abstraction layer, runs on any infrastructure
    Consistent & at-scale deployment and management in seconds
    Automation and CI/CD at scale with GitOps
    https://www.gitops.tech


  32. Connectivity Modes
    Indirectly connected (preview)
    Local provisioning/de-provisioning
    Local elastic scaling
    Local monitoring
    Local log analytics
    Local backup/restore
    Upload logs and metrics to Monitor
    View inventory in Azure
    Upload billing data to Azure
    Use Kubernetes authentication and
    authorization
    Azure DevOps, GitOps operations
    Directly connected (future)
    More details to be announced later…


  33. Azure Arc data controller
    Backup
    Monitoring and logs
    Controller API Azure Arc integration HA/DR Scaling
    Patching/updates
    Provisioning
    Persistent storage
    Node Node Node Node Node Node
    Azure Data Studio
    Identity
    Azure RBAC & Policy
    Advanced Data Security
    Deployments
    Resource Inventory
    Logs & Telemetry
    Backup Retention
    Consumption
    azdata CLI
    kubectl CLI
    Microsoft Container
    Registry
    Azure Portal
    Azure Data Studio
    CLI
    3rd Party
    Kubernetes
    API
    Azure Arc enabled PostgreSQL Hyperscale Other Database service Analytics services
    Azure Arc Data Services Architecture Deeper Dive


  34. Roles And Responsibilities: PaaS Vs. Hybrid
                                        Azure Platform As A Service (PaaS)    Azure Arc hybrid services
    Who provides the infrastructure?    Microsoft                             Customer
    Who provides the software*?         Microsoft                             Microsoft
    Who does the operations?            Microsoft                             Customer
    Does Microsoft provide SLAs?        Yes                                   No
    Who’s in charge of SLAs?            Microsoft                             Customer
    *Azure services


  35. How Do I Get Started
    With
    Azure Arc Enabled
    Data Services?
    https://docs.microsoft.com/azure/azure-arc/data/


  36. AZURE ARC ENABLED POSTGRESQL HYPERSCALE
    + =


  37. This Is Where It All Comes Together
    Azure Arc enabled PostgreSQL Hyperscale is:


  38. How Do I Get Started
    With
    Azure Arc Enabled
    PostgreSQL Hyperscale?
    Get started
    https://aka.ms/arcpostgresqlhyperscale
    Deploy
    https://aka.ms/deployarcpostgresqlhyperscale
    Accelerated experience with a test deployment
    https://github.com/microsoft/azure_arc#azure-arc-enabled-data-services
    In preview
    now. Free


  39. SHOW IT TO ME!


  40. Postgres In Azure Vs. Other Clouds?
    The Choice
    Hyperscale (Citus)
    Worry-free PostgreSQL in the
    cloud with an architecture
    built to scale out
    Single Server
    Enterprise-ready, fully
    managed community OSS
    engines
    Azure Arc enabled PostgreSQL Hyperscale NEW
    Hybrid, scale out PostgreSQL in
    environment of your choice
    Flexible Server (Preview) NEW
    Maximum control with a
    simplified developer experience
    Open source & community
    PostgreSQL committers at Microsoft: https://aka.ms/blog-postgres-committers


  41. Q&A
    Get started
    https://aka.ms/arcpostgresqlhyperscale
    Follow us: @AzureDBPostgres, @CitusData


  42. Special Thanks To
    for supporting
    DataPlatformGeeks & SQLServerGeeks
    Community Initiatives


  43. Three Ways to Win Prizes
    Post your selfie with the hashtag #DPS2020
    Give Session & Conference Feedback
    Visit our Sponsors & Exhibitors
    Thank You
    Follow us on Twitter @TheDataGeeks @DataAISummit


  44. Data Platform Virtual Summit 2020 is a community initiative by DataPlatformGeeks
    RESOURCES


  45. Go Deeper Into Postgres & Hyperscale (Citus)
    • https://www.citusdata.com/
    • http://docs.citusdata.com/en/v9.5/
    Why Scale Out Postgres?
    https://youtu.be/g3H4nGsJsl0
    DEMO - High performance HTAP
    with Postgres & Hyperscale (Citus)
    https://youtu.be/W_3e07nGFxY
    DEMO – Building HTAP
    Applications with Python &
    Postgres on
    Azure
    https://youtu.be/YDT8_riLLs0

