Slide 1

Slide 1 text

Building an Enterprise Data Platform with Microsoft Fabric
Weili Gao, Dr. Gerald Reif
Azure Bootcamp Switzerland 2024, Bern
May 16, 2024

Slide 2

Slide 2 text

Who We Are
• Principal Architect @ ipt
• Azure, cloud-native development, DevOps
• Principal Architect, Director @ ipt
• Azure, enterprise data platforms
Weili Gao, Gerald Reif
www.ipt.ch

Slide 3

Slide 3 text

Agenda
Experiences of an early adopter
What is Microsoft Fabric?
Our Experience working with Fabric
Would we choose Fabric again?

Slide 4

Slide 4 text

Project Background
Swissmedic aims to become a data-driven authority
Current data integration
Project Goals
● Build a data platform that automates the end-to-end (E2E) process from data ingestion to visualization in dashboards
● Enable users to share and consume data via self-service
● Create the basis for advanced data services (ML)

Slide 5

Slide 5 text

Which Options to consider?
Swissmedic preset: Data platform to be based on Azure-native services
1. Azure Synapse Analytics
- Lack of development on Synapse (last update April 2023; Delta Lake update)
2. Azure Databricks
- More platform engineering effort (SCC, service principals, secure context, user management, IP range, ...)
3. Microsoft Fabric
- Great vision on paper, generally available since Nov. 2023
- Not yet feature complete, but the features important to us are on the roadmap
- Client is OK with waiting for those features

Slide 6

Slide 6 text

Fabric Overview
• A single unified SaaS data platform
• One shared experience for data engineers, analysts, scientists and citizen developers
• OneLake as unified data store, reducing the need to move or duplicate data

Slide 7

Slide 7 text

Think Different: PaaS → SaaS
Less platform engineering (no resource groups, individual resources (storage, key vault, Synapse), private endpoints)
Organization within Fabric in domains and subdomains, not via subscriptions and resource groups

Slide 8

Slide 8 text

SaaS: Built-in Deployment Pipelines
No external deployment pipelines required for dev → test → prod

Slide 9

Slide 9 text

SaaS: Git Support
Git is optional - but recommended!

Slide 10

Slide 10 text

Flat Fee for the Service
Starting from F64 capacity, several relevant enterprise features are included
Free trial license corresponds roughly to F64

Slide 11

Slide 11 text

Many Options to Work with Data
Notebook
Data pipeline
Dataflow Gen 2

Slide 12

Slide 12 text

Notebook
• Spark notebooks supporting PySpark (Python), Spark (Scala), Spark SQL, SparkR
• Application: Data ingestion, transformation & visualization
• “Closer” to software engineering
• Currently the only option that works with private endpoints

Slide 13

Slide 13 text

Data Pipeline
• Similar to an Azure Data Factory pipeline
• Application: Data ingestion & simple transformations
• Orchestration of notebooks, dataflows and other pipelines
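
The orchestration role of a data pipeline can be illustrated with a small dependency sketch. This is plain Python and does not use any Fabric API; the activity names are hypothetical and only show how notebooks, dataflows and other pipelines might be ordered by their dependencies.

```python
from graphlib import TopologicalSorter

# Hypothetical activities (not taken from the Swissmedic platform).
# Each key runs only after all activities in its dependency set have finished.
dependencies = {
    "ingest_source_pipeline": set(),
    "clean_data_notebook": {"ingest_source_pipeline"},
    "reshape_data_dataflow": {"ingest_source_pipeline"},
    "publish_report_pipeline": {"clean_data_notebook", "reshape_data_dataflow"},
}


def execution_order(deps):
    """Return one valid run order; raises graphlib.CycleError on circular dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
```

In this sketch the ingestion step always comes first and the publishing step last, while the notebook and the dataflow may run in either order between them.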

Slide 14

Slide 14 text

Dataflow Gen 2
• Low-code tool for data ingestion & transformation
• Can “code” using the Power Query M formula language
• Experience very similar to Excel or Power BI

Slide 15

Slide 15 text

Many Options to Work with Data
 | Professional Developer | Citizen Developer
Data Ingestion | Spark, Data pipeline | Data pipeline, Dataflow Gen 2
Data Transformation | Spark + Lakehouse, T-SQL + Warehouse | Data pipeline, Dataflow Gen 2
Orchestration | Data pipeline
Reports & visualization | Spark (pandas, R) | Power BI

Slide 16

Slide 16 text

Data Ingestion Architecture @ Swissmedic
● Three network environments (for now)
● Moving targets in the architecture
● Reuse of existing solutions
● Challenge: Data contract with source systems
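
One way to make a data contract with a source system concrete is to check each incoming record against an agreed list of fields. The following is a minimal sketch under stated assumptions: the contract fields (`record_id`, `received_on`, `comment`) and the helper names are hypothetical, and the slides do not describe how Swissmedic actually implements its contracts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractField:
    name: str
    dtype: type
    required: bool = True


# Hypothetical contract for one source system; field names are illustrative only.
CONTRACT = [
    ContractField("record_id", int),
    ContractField("received_on", str),
    ContractField("comment", str, required=False),
]


def violations(record: dict, contract=CONTRACT) -> list:
    """Return a list of human-readable contract violations (empty if the record conforms)."""
    problems = []
    for field in contract:
        value = record.get(field.name)
        if value is None:
            # Missing optional fields are fine; missing required ones are not.
            if field.required:
                problems.append(f"missing required field: {field.name}")
            continue
        if not isinstance(value, field.dtype):
            problems.append(
                f"{field.name}: expected {field.dtype.__name__}, got {type(value).__name__}"
            )
    # Fields the source sends but the contract does not know about.
    unexpected = sorted(set(record) - {f.name for f in contract})
    problems.extend(f"unexpected field: {name}" for name in unexpected)
    return problems
```

A check like this could run at ingestion time, so that a change on the source side shows up as an explicit violation rather than as a broken report downstream.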

Slide 17

Slide 17 text

How do we deal with missing Enterprise Features?
Our Approach
● Analysis of requirements
● Fabric release plan
● Risk analysis
● Mitigation actions
● Microsoft FastTrack
Managed Identity
Managed Virtual Endpoints
Double Encryption
Git Support
GitHub Support
IaC Support (e.g. Terraform)
Data Catalog and OneSecurity
Got the OK from the Enterprise Architecture Board ✅

Slide 18

Slide 18 text

Workspaces
• Place to create collections of items such as warehouses, lakehouses, notebooks, pipelines, …
• Access managed by workspace roles
• Each workspace can be assigned to a
  • capacity
  • folder & Git branch
  • private endpoint
  • managed identity

Slide 19

Slide 19 text

Choosing a Workspace model
Diagram labels: Workspace PROD, Workspace TEST, Workspace DEV; Data Source (PRD, TEST, DEV); Private Workspace; Git (Feature branch, Main branch, PR); Deployment pipeline
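
The workspace model can be sketched as two small lookups. This is one possible reading of the diagram, not a confirmed description of it: the assumption is that feature branches sync with private workspaces, the main branch syncs with Workspace DEV after a PR, and the deployment pipeline promotes content from DEV to TEST to PROD. The function names are hypothetical and no Fabric API is involved.

```python
from typing import Optional

# Assumed promotion path of the deployment pipeline.
STAGES = ["DEV", "TEST", "PROD"]


def workspace_for_branch(branch: str) -> str:
    """Assumption: only the main branch is synced to the shared DEV workspace."""
    return "Workspace DEV" if branch == "main" else "Private Workspace"


def next_stage(stage: str) -> Optional[str]:
    """Return the stage the deployment pipeline promotes to, or None after PROD."""
    index = STAGES.index(stage)
    return STAGES[index + 1] if index + 1 < len(STAGES) else None
```

Under this reading, Git governs what reaches DEV, while TEST and PROD are only ever filled by the deployment pipeline.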

Slide 20

Slide 20 text

Would we choose Fabric again?
Our Experiences so far
Enterprise features were released or are on the release plan
● Features delivered according to release plan
● New features are announced in the blog
All data-processing-related features are available
Fabric is a product that keeps platform engineering low
➢ Important for Swissmedic
Still open but on the roadmap:
● Full integration with Purview
● OneSecurity: great vision, implementation to be seen
Yes, we would! ✅
Microsoft delivered, at least in part so far

Slide 21

Slide 21 text
