
AzureBootcamp2023: Data Clean Rooms & Confidential Computing by David Sturzenegger & Primo Amrein

Bootcamp Switzerland
Sessions 🗓️
We are very happy to once again offer an exciting lineup that includes many new speakers from the local and international community: MVPs, Microsoft employees and industry leads, who will speak about specific industry use cases as well as the latest developments around Azure services. Whether it is real-life use cases from Pax insurance, Axpo, Georg Fischer and Die Mobiliar, a kickstart with FinOps on Azure, Azure Quantum Compute, Power BI, or deep dives on Azure networking, these sessions will give you insights into Azure and the opportunity to connect with peers and speakers.

The schedule is still subject to change.

Tracks: DevOps (Room 3.54), Infrastructure (Room 3.53), Future Tech (Room 3.14)

0800 ⏰ Registration
0900 ⭐ How we Build Data Clean Rooms on Azure Confidential Computing at Decentriq
🙂 DAVID STURZENEGGER
🙂 PRIMO AMREIN
1010 ☕ Coffee Break
1040
DevOps: ⭐ Kubernetes @ PAX - DevOps at a Swiss Insurance 🙂 SASCHA SPREITZER 🙂 ESRA DOERKSEN
Infrastructure: ⭐ Building a Lakehouse Platform on Azure with Databricks 🙂 HANSJÖRG WINGEIER 🙂 MATHIAS HERZOG
Future Tech: ⭐ Azure FinOps: The Quiz 🙂 ROLAND KRUMMENACHER 🙂 STEFAN DENK
1135
DevOps: ⭐ Advanced Analytics with Azure DevOps Dojo 🙂 ARINDAM MITRA 🙂 ADRIAN SENN
Infrastructure: ⭐ The immutable laws of security 🙂 ALAIN SCHNEITER 🙂 MICHAEL RÜEFLI
Future Tech: ⭐ God really plays dice - Introduction to quantum computing with Q# 🙂 FILIP WOJCIESZYN
1220 🍕 Lunch Break
1330
DevOps: ⭐ Pushing Azure (DevOps) @ Georg Fischer 🙂 MARTIN STANEK
Infrastructure: ⭐ Azure Networking vNext - How to build modern connectivity for IaaS, PaaS and SaaS 🙂 ERIC BERG
Future Tech: ⭐ How can Microsoft Azure help with sustainability? Methods to estimate your cloud’s carbon footprint 🙂 WIBKE SUDHOLT
1425
DevOps: ⭐ Eventdriven systems on Azure done right 🙂 ROBIN KONRAD
Infrastructure: ⭐ Azure PaaS, but as private as possible… 🙂 STEPHAN GRABER
Future Tech: ⭐ Use the power of OpenAI to leverage your business application 🙂 DAVID SCHNEIDER
1510 ☕ Coffee Break
1540
DevOps: ⭐ Securing web applications using Azure AD 🙂 DAMIEN BOWDEN
Infrastructure: ⭐ Azure Virtual Network Manager: The future of network management? 🙂 MARCEL ZEHNER
Future Tech: ⭐ Push your Azure tenant to the next level with Power BI 🙂 DENIS SELIMOVIC
1635
DevOps: ⭐ Fully automated & cloud-native data platform 🙂 TIM GIGER
Infrastructure: ⭐ More than a facade - Azure API Management “from zero to hero” 🙂 MICHAEL RÜEFLI
Future Tech: ⭐ Develop for inclusion using cognitive services: an Azure story 🙂 ANDRÉ MELANCIA 🙂 KAY SAUTER
1720 🍻 Networking Apéro sponsored by isolutions
⭐️ How we Build Data Clean Rooms on Azure Confidential Computing at Decentriq
Decentriq is a Zurich-based startup developing leading-edge data privacy products for sensitive industries. Microsoft awarded them its startup of the year 2022 award. This presentation opens with an overview of Microsoft’s confidential computing strategy and Azure’s Confidential Compute offering. Decentriq then presents their confidential computing-based data collaboration platform, takes a deeper dive into the technology, and walks through use-cases of their customers, ranging from banks to pharmaceutical companies.

In many cases it would be desirable to combine sensitive datasets from multiple sources to compute anonymous statistics. Examples range from healthcare research to customer analytics to anti-money laundering and marketing. However, because the data comes from multiple sources, at least one party has to disclose sensitive data to another. In practice this usually means that such use-cases are blocked, either for legal reasons or for lack of trust.

Confidential computing is a CPU-rooted privacy technology that enables the processing of data while keeping the data inaccessible to all parties, including all participants, the SaaS platform and the infrastructure providers. Encryption in use keeps data encrypted even in memory and prevents access by the operating system. Remote attestation lets users remotely verify that a server really runs in a confidential computing environment, and even which code it is running.
🙂 DAVID STURZENEGGER ⚡️ Head of Product @ Decentriq
🙂 PRIMO AMREIN ⚡️ Cloud Lead @ Microsoft Switzerland
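
The remote attestation flow described in the abstract, and spelled out in the transcript's "Verifiability through remote attestation" slides, can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not Decentriq's or Intel's implementation: `CPU_KEY`, `measure`, `launch_enclave` and `verify_report` are hypothetical names, the enclave public key is a random placeholder, and an HMAC stands in for the asymmetric, CPU-specific signature that real SGX attestation has checked through Intel.

```python
import hashlib
import hmac
import secrets

# ASSUMPTION: stand-in for the secret, CPU-specific signing key.
# Real SGX attestation uses asymmetric signatures verified via Intel;
# an HMAC is used here only to keep the sketch stdlib-only.
CPU_KEY = secrets.token_bytes(32)


def measure(binary: bytes) -> str:
    """Hash the enclave binary; the digest plays the role of the measurement."""
    return hashlib.sha256(binary).hexdigest()


def launch_enclave(binary: bytes) -> dict:
    """Step 1: measure the code, create an enclave key, sign both together."""
    measurement = measure(binary)
    enclave_public_key = secrets.token_hex(32)  # placeholder, not a real public key
    payload = f"{measurement}|{enclave_public_key}".encode()
    signature = hmac.new(CPU_KEY, payload, hashlib.sha256).hexdigest()
    return {
        "measurement": measurement,
        "enclave_public_key": enclave_public_key,
        "signature": signature,
    }


def verify_report(report: dict, expected_measurement: str) -> bool:
    """Step 2: check the signature is genuine, then check the code is the expected code."""
    payload = f"{report['measurement']}|{report['enclave_public_key']}".encode()
    expected_sig = hmac.new(CPU_KEY, payload, hashlib.sha256).hexdigest()
    genuine = hmac.compare_digest(expected_sig, report["signature"])
    same_code = hmac.compare_digest(report["measurement"], expected_measurement)
    return genuine and same_code
```

A data owner would only upload data after `verify_report(report, measure(approved_binary))` succeeds, where `approved_binary` is the code they reviewed. In this sketch the verifier holds `CPU_KEY` itself, which collapses the user and the vendor's verification service into one function for brevity.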

Transcript

  1. Upcoming Events
    • 16.05.2023 Azure Zurich: Scale with KEDA and Container Apps
    • 07.06.2023 Azure Bern: Manage and Govern your Hybrid Servers using Azure Arc
    • 29.08.2023 DotNET Day Switzerland 2023
    • 18. – 20.09.2023 Experts Live Europe, Prague
  2. Over a decade of helping our customers on their journey to/in the Microsoft cloud. We are supporting them wherever they are.
  3. [Diagram labels: Modernisation; Change management; Strategy development; Governance/security; Compliance/placement; Solution design; Migration; Compliance; Security; Optimisation/FinOps; Support/reliability; Buy-in/commitment; Migration strategy (7 Rs); Architecture; Costs/benefits; Set-up of landing zone; Governance; Business continuity; Operation/lifecycle; Innovation; Concept; Implementation & transformation; Operation & optimisation; Strategy; Risk management; Strategy approved; Roadmap available; Rollout]
  4. Our presence today
    • 10:40h, Room 3.14: Azure FinOps: The Quiz
    • 16:35h, Room 3.54: Fully automated & cloud-native data platform
    • Whole day: Swisscom / itnetX booth 13
  5. Azure Workloads Launched Since Last Bootcamp (1/2)
    • Azure Container Apps in Switzerland North: Deploy containerized apps without managing complex infrastructure with this fully managed serverless platform service. Build and deploy modern apps and microservices at scale. Applications built on Azure Container Apps can dynamically scale based on the following characteristics: HTTP traffic, event-driven processing, CPU or memory load.
    • Azure IoT Hub in Switzerland North and West: Cloud-hosted solution to connect, monitor and manage IoT assets
    • Azure Ultra Disks in Switzerland North: Highest-performing storage, suited for data-intensive workloads such as SAP HANA, top-tier DBs and transaction-heavy loads
  6. Azure Workloads Launched Since Last Bootcamp (2/2)
    • Azure Spring Cloud Service in Switzerland North: Open-source application framework providing infrastructure support for developing Java applications
    • Azure DNS Private Resolver in Switzerland North: Enables querying Azure DNS private zones from an on-prem environment and vice versa without deploying VM-based DNS servers. Customers no longer need to provision IaaS-based solutions on their Virtual Networks to resolve names registered on Azure Private DNS Zones and can do conditional forwarding of domains back to on-prem, multi-cloud and public DNS servers.
    • Lasv3 VMs in Switzerland North: Storage-optimized VMs that use the local disk on the node attached directly to the VM rather than durable data disks, allowing greater IOPS and throughput for workloads. First AMD-based VMs in Swiss DCs.
  7. New Customer Stories
    [Link labels: YouTube Video 1; YouTube Video 2; YouTube Video 3; German; English; AGEFI Article in French; Inside IT Article in German]
  8. Azure cloud security is best in class
    • More than 650,000 customers and 90 of the Fortune 100 trust Microsoft SCI solutions
    • Microsoft employs more than 8,500 security experts and has committed $20B in security investment over the next 5 years
    • In 2020, 9 billion malware threats were blocked on endpoints by Microsoft 365 Defender
    • Microsoft processes over 24 trillion signals every 24 hours
  9. Cloud customers are increasingly looking for ways to trust as little as possible
    • Full control over the data lifecycle
    • Privacy
    • Untrusted collaboration
    • Regulations and compliance
    • Customer trust
  10. Data protection: EXISTING ENCRYPTION
    • Data at rest: Encrypt inactive data when stored in blob storage, database, etc.
    • Data in transit: Encrypt data that is flowing between untrusted public or private networks
  11. Azure Confidential Computing
    EXISTING ENCRYPTION
    • Data at rest: Encrypt inactive data when stored in blob storage, database, etc.
    • Data in transit: Encrypt data that is flowing between untrusted public or private networks
    CONFIDENTIAL COMPUTING
    • Data in use: Protect/encrypt data that is in use, while in RAM, and during computation
    • Protect against privileged admins or insiders exploiting bugs in the Hypervisor/OS accessing data without customer consent
    In Azure, confidential computing means: a hardware root-of-trust, customer-verifiable remote attestation, and memory encryption
  12. Confidential Cloud
    • Data is fully in the control of the customer at rest, in transit, or in use.
    • The cloud platform provider is outside the trusted compute base.
    • Code running in the cloud is protected and verified by the customer.
    • Activity history is immutable and auditable.
  13. Confidential Computing Market
    [Chart labels: Hardware & software market; Multi-party computing; Financial, healthcare, life sciences and more; Global market]
    Confidential computing TAM (US$ billion): 2021: 1.9-2.0; 2024: 16-18; 2026: 52-54
  14. “Imagine all data silos get unlocked overnight. Every start-up founder, every data scientist, magically gets access to all data ever collected. What'd be the impact? Will this double world GDP in 5 years? 10? What if I told you we can already do this?” (Elad Verbin)
  15. PRIVACY VS UTILITY: Unlocking all data silos raises legitimate privacy concerns
    [Chart with axes PRIVACY and UTILITY] “Unlocking all data silos”: High Utility, Low Privacy. Today’s situation: Low Utility, High Privacy.
  16. PRIVACY VS UTILITY: Confidential Computing can improve this trade-off
    [Chart with axes PRIVACY and UTILITY] CC-based Data Collaboration: Good Utility, High Privacy.
  17. SENSITIVE DATA COLLABORATION CHALLENGES: Cost, Compliance, Security, Control
    Bill Gates (2019) on how to learn more from data while maintaining privacy: “What if we had a way to collect data but not reveal individual records?”
  18. DATA COLLABORATION: INTRODUCTORY EXAMPLE – THE TRADITIONAL WAY
    • External party trusted with all data (or trust each other)
    • Only contractual & organisational measures protect the data at the trusted external party
    • As the external party has access to all data, there is risk for unintentional or intentional misuse
    [Diagram: INSURANCE and BANK each provide person-level customer data and receive aggregated and anonymous customer insights]
  19. DATA COLLABORATION: INTRODUCTORY EXAMPLE – WITH CONFIDENTIAL COMPUTING
    • New confidential computing technology makes it impossible for the third party to access data or modify the allowed processing operations
    • Confidential computing enables technical verification of this fact, also for remote users
    • This renders the risk of data leakage & misuse minimal
    [Diagram: INSURANCE and BANK each provide person-level customer data to the Data Clean Room and receive aggregated and anonymous customer insights. Azure Confidential Computing: data cannot be accessed, not even by admins]
  20. DATA COLLABORATION: This collaboration model covers many use-cases: in medical research, in marketing etc.
    [Diagram: Data Owner A and Data Owner B each provide personal data to the Data Clean Room and receive aggregated statistics. Azure Confidential Computing: data cannot be accessed, not even by admins]
  21. REQUIREMENTS FOR CONFIDENTIAL DATA PROCESSING
    As Data Owner, I want…
    1. When the data is “in the box”, nobody should be able to access it (confidentiality)
    2. Nobody should be able to modify the computations I approved to be run (integrity, purpose-limitation)
    3. To be able to remotely verify that 1 & 2 hold such that I don’t need to trust the third party (verifiability)
    [Diagram: Data Owner A and Data Owner B each provide personal data to the Data Clean Room and receive aggregated statistics. Azure Confidential Computing: data cannot be accessed, not even by admins]
  22. CONFIDENTIAL COMPUTING: Confidential Computing is the protection of data in use by performing computation in a CPU-based Trusted Execution Environment (TEE)
  23. CONFIDENTIAL COMPUTING: Confidential Computing is the protection of data in use by performing computation in a CPU-based Trusted Execution Environment (TEE)
    [Diagram: trust perimeter around Intel SGX, a feature in Intel Ice Lake server CPUs]
  24. CONFIDENTIAL COMPUTING: Confidential Computing is the protection of data in use by performing computation in a CPU-based Trusted Execution Environment (TEE)
    TEEs provide new security primitives
    Run-time Isolation
    • Encryption & isolation of process memory
    • Memory confidentiality and integrity
    Remote Attestation
    • Verify genuine TEE platforms and their healthiness
    • Authenticate remotely running enclave code
    [Diagram: the App Enclave sits inside the Intel SGX trust perimeter; the software stack (Hypervisor, Operating System, Virtual Machine) can’t access / modify data. Intel SGX is a feature in Intel Ice Lake server CPUs]
  25. CONFIDENTIAL COMPUTING: Confidential Computing is the protection of data in use by performing computation in a CPU-based Trusted Execution Environment (TEE)
    • Security model: Hardware manufacturer is trusted
    • There are VM-isolation versions too: AMD SEV-SNP, Intel TDX
    • Azure Confidential Computing is the only cloud offering VMs allowing proper remote attestation
    [Diagram: the App Enclave sits inside the Intel SGX trust perimeter; the software stack (Hypervisor, Operating System, Virtual Machine) can’t access / modify data. Intel SGX is a feature in Intel Ice Lake server CPUs]
  27. CONFIDENTIALITY THROUGH ENCRYPTION OF DATA IN USE
    • Data at rest: Data is encrypted when stored on hard disks (standard for decades)
    • Data in transit: Data is encrypted when transferred (HTTPS/TLS) (standard for decades)
    • Data in use: Now data can be kept encrypted in memory also when computed (impossible before confidential computing)
    Data and code of the enclave process are encrypted and integrity checked when transferred from CPU to memory. Data only ever is decrypted in the CPU and the CPU is virtually impossible to attack.
    [Diagram labels: Data; Storage; Computation; Data; Results]
  29. DEFINE THE ALLOWED COMPUTATIONS AND PERMISSIONS
    [Diagram labels: certificate authority; Enclave measurement; Acceptable hardware configuration; Data Clean Room definition]
  32. VERIFIABILITY THROUGH REMOTE ATTESTATION
    Step 1 – SGX program (enclave) launch
    1. Its binary (compiled code) is hashed. This value is the measurement.
    2. A random enclave key-pair is generated.
    3. { measurement, enclave public key } is signed with a secret, CPU-specific key. Only SGX programs can get this signature. Intel can verify this signature.
    [Diagram: User and Intel SGX trusted execution environment. Initialization: SGX program launch; measurement and enclave key pair generation; signature with CPU key]
  33. VERIFIABILITY THROUGH REMOTE ATTESTATION
    Step 1 – SGX program (enclave) launch
    1. Its binary (compiled code) is hashed. This value is the measurement.
    2. A random enclave key-pair is generated.
    3. { measurement, enclave public key } is signed with a secret, CPU-specific key. Only SGX programs can get this signature. Intel can verify this signature.
    Step 2 – Verification
    1. Check signature with Intel to see if it is genuine and all security patches were applied
    2. Check if the measurement corresponds to the expected value (code authenticity)
    3. User shares own public key to establish secure communication
    4. User checks data room definition (purpose limitation) & uploads data
    [Diagram: User and Intel SGX trusted execution environment. Initialization: SGX program launch; measurement and enclave key pair generation; signature with CPU key. Verification: {measurement, enclave public key, signature} is sent to the User; the User key is sent back]
  34. RECAP – DATA COLLABORATION REQUIREMENTS
    As Data Owner, I want…
    1. When the data is “in the box”, nobody should be able to access it (confidentiality)
    2. Nobody should be able to modify the computations I approved to be run (integrity, purpose-limitation)
    3. To be able to remotely verify that 1 & 2 hold such that I don’t need to trust the third party (verifiability)
    [Diagram: Data Owner A and Data Owner B each provide personal data to the Data Clean Room and receive aggregated statistics. Azure Confidential Computing: data cannot be accessed, not even by admins]
  35. RECAP – CONFIDENTIAL COMPUTING
    • New confidential computing technology makes it impossible for the third party to access data or modify the allowed processing operations
    • Confidential computing enables independent technical verification of this fact, also by remote users
    [Diagram: Data Owner A and Data Owner B each provide personal data to the Data Clean Room and receive aggregated statistics. Azure Confidential Computing: data cannot be accessed, not even by admins]
  36. Enabling secure sensitive data collaborations via Data Clean Rooms
    • Deployed on Azure Confidential Computing
    • Trusted and neutral SaaS platform (Switzerland-as-a-Service)
    • Confidential Computing means no one can see your data
    [Badges: HOTTEST PRIVACY MARTECH SOLUTIONS; FOUNDING MEMBER] [Diagram labels: ORGANIZATION 1; ORGANIZATION 2; ORGANIZATION N]
  37. DECENTRIQ DATA CLEAN ROOMS
    • COLLABORATE on sensitive & federated data
    • PROTECT data confidentiality end-to-end
    • STRICTLY CONTROL how data is processed
    • BE INNOVATIVE: Use-cases never possible before
    [Diagram: ORGANIZATION 1, ORGANIZATION 2 and ORGANIZATION N each contribute sensitive data to the Data Clean Room]
  38. HOW DOES IT WORK?
    [Diagram: DATA OWNER 1, DATA OWNER 2 and DATA OWNER N each supply encrypted data to the DATA CLEAN ROOM. Further labels: Data Clean Room Configuration; Compliant results; DATA ANALYST; DATA CONSUMER]
  39. PLATFORM COMPONENTS
    [Diagram labels] Areas: Confidential Computing Hardware; Input Security; Output Privacy; User Experience. Components: Decentriq UI; Exploratory sandbox; Access Controls; K-anonymity filter; Decentriq API; Synthetic data generator; Differential Privacy*; SQL; Python; R; Docker*; Intel SGX; AMD SEV; Intel TDX*; NVIDIA GPU*
  40. CASE STUDY: Collaborate with an insurer to identify next best product with an ML model keeping customer data confidential
    • Improve product prediction efficiency
    • Predicted new attributes
    [Diagram labels: BANK (customer data); INSURER (customer data); TRAINED PRODUCT ML MODEL]
  41. CASE STUDY: Benchmarking against confidential data without a trusted 3rd party
    • Extended scope to more sensitive data
    • Data validation and governance
    [Diagram labels: PHARMA 1; PHARMA 2; PHARMA 20; Confidential Sales data; Healthcare Consulting; Queries & Reference data; INDIVIDUALIZED BENCHMARKS]
  42. CASE STUDY: Combine confidential phishing emails to improve cyber defence
    • Compute benchmarks and run NLP models on phishing email text data
    • Emails remain confidential
    [Diagram: SIX, SWISS NATIONAL BANK and ZKB each contribute PHISHING DATA]
  43. CASE STUDY: Optimize marketing campaigns without data sharing
    • Target right audience across publishers without data sharing
    • Unveil new audience insights
    • Increased targeting reach based on lookalikes
    [Diagram labels: CAR RESELLER (customer data); PUBLISHER 1; PUBLISHER 2; PUBLISHER 3 (customer data); TOP AFFINITY SEGMENTS FOR ADV CAMPAIGN]
  44. INDIVIDUALIZED CARE FOR CVD
    OBJECTIVE: Improve personalized care delivery by developing treatment recommendation models for cardiovascular disease (CVD) patients
    APPROACH: Partners will validate models on the largest CVD patient data set (>1M patients). The ecosystem will be open to the wider clinical community in the future.
    DECENTRIQ: Decentriq will be the main analysis platform of the collaboration
    [Diagram labels: PUBLIC PARTNERS; Data Clean Room; INDUSTRY PARTNERS]
  45. Take-Aways
    • Confidential Computing is a CPU-based technology enabling encryption-in-use and remote attestation
    • Confidential Computing improves privacy-utility trade-offs in data collaborations
    • The Decentriq Platform on Azure Confidential Computing makes this technology easy to use
  46. Thank You
    [email protected] www.decentriq.com
    PRIVACY BY DESIGN
    NEUTRAL GROUND – DATA CLEAN ROOMS POWERED BY CONFIDENTIAL COMPUTING
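
The platform components slide lists a "K-anonymity filter", and the abstract describes combining sensitive datasets from several sources to compute anonymous statistics. A rough sketch of that idea in Python follows. It assumes one common way of applying such a filter to aggregate output: count records per group across all data owners and suppress every group with fewer than k records. Whether Decentriq's filter works this way is not stated in the talk; `joint_group_counts`, `k_anonymity_filter` and the record fields are hypothetical names chosen for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]


def joint_group_counts(datasets: Iterable[List[Record]], keys: Tuple[str, ...]) -> Counter:
    """Count records per group over the combined datasets of all data owners."""
    counts: Counter = Counter()
    for dataset in datasets:
        for record in dataset:
            counts[tuple(record[key] for key in keys)] += 1
    return counts


def k_anonymity_filter(counts: Counter, k: int) -> Dict[tuple, int]:
    """Release only groups that contain at least k records; suppress the rest."""
    return {group: n for group, n in counts.items() if n >= k}


# Illustrative data only: two owners with person-level records.
owner_a = [
    {"age_band": "30-39", "region": "ZH"},
    {"age_band": "30-39", "region": "ZH"},
    {"age_band": "40-49", "region": "BE"},
]
owner_b = [{"age_band": "30-39", "region": "ZH"}]

counts = joint_group_counts([owner_a, owner_b], ("age_band", "region"))
released = k_anonymity_filter(counts, k=3)
```

In a data clean room, only `released` would leave the enclave; the person-level records and the suppressed small groups would stay inside.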