Does Not Fit All: Make The Right Data Mesh For You
Prepared for presentation at Big Data Expo NL, 2022
Dr. Jennifer Belissent | Principal Data Strategist | Snowflake
Fernando Raposo | Senior Data Engineer | FindHotel
DATA MESH IS…
socio-technical approach to managing and accessing analytical data at scale.”
• It is distributed, not centralized or monolithic.
• It’s about architecture and organizational principles.
• It’s about data management for analytics, not for operational systems.
• It aims to improve organizational scalability.
PRINCIPLES
Distribute responsibility for data pipelines and data quality to people with domain knowledge. Serve data as a product, using a common self-service IT infrastructure platform to ensure consistency.

Domain-Centric Ownership & Architecture
• Data pipelines are owned by teams with domain knowledge
• Domains own cleansing, refinement, historization, pre-aggregation, etc.
• Domains are responsible for data governance
• Domains deliver data products to “customers”

Data-as-a-Product
• Data products reflect use cases and are easily discoverable
• Data products are well-documented, easy to obtain and use
• Data products define quality requirements
• Domains are accountable to “customers”

Self-Serve Data Platform
• Domain-agnostic, common tool set
• Easy to use and low maintenance to support
• Easy to deploy repeatable patterns for common requirements: cleansing, transformation, automation, storage, security, governance, sharing

Federated Governance
• Data governance is coordinated globally to ensure interoperability across domains
• Access and use policies are coordinated
• Governance is applied within domains
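The Data-as-a-Product bullets above can be pictured as a small descriptor that a domain fills in before publishing. The following is a minimal sketch only: the class `DataProduct`, its fields, and the method `publishing_gaps` are hypothetical names chosen for illustration and do not come from the presentation or from any platform's API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataProduct:
    """Hypothetical descriptor for one data product owned by a domain."""
    name: str
    domain: str            # owning domain, accountable to its "customers"
    description: str       # documentation shown to consumers
    use_cases: List[str] = field(default_factory=list)
    # Assumed shape: metric name -> minimum acceptable value.
    quality_requirements: Dict[str, float] = field(default_factory=dict)

    def publishing_gaps(self) -> List[str]:
        """List which of the product expectations are not yet met."""
        gaps = []
        if not self.description.strip():
            gaps.append("missing documentation")
        if not self.use_cases:
            gaps.append("no use case listed")
        if not self.quality_requirements:
            gaps.append("no quality requirements defined")
        return gaps


draft = DataProduct(name="customer_360", domain="customer", description="")
ready = DataProduct(
    name="customer_360",
    domain="customer",
    description="Unified customer profile",
    use_cases=["churn analysis"],
    quality_requirements={"completeness": 0.99},
)
```

Here `draft.publishing_gaps()` reports all three gaps, while `ready.publishing_gaps()` returns an empty list. A check of this kind is one possible way for federated governance to apply a common standard while each domain keeps ownership of its product.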
FOUR INTERCONNECTED PRINCIPLES
support each other
Source: Zhamak Dehghani
[Diagram: the four principles and the relationships that connect them]
Principles:
• Domain-Centric Ownership & Architecture
• Data-as-a-Product
• Self-Service Data Platform
• Federated Governance
Relationship labels:
• Reduce cost and complexity of building and maintaining data products
• Empower domain teams
• Enforce global standards
• Ensure interoperability
• Reduce domain isolation
• Ensure product quality
• Connect data products
• Prevent data silos
• Ensure data use
The Data Mesh Architecture

Data Domains:
• Can be functional or topical (e.g. customer, item/SKU, etc.)
• Can consume and share data or functions
• Control access policies, data masking, etc. for downstream consumers
• Can share external tables, i.e. provide access to data outside of Snowflake
• Can provide reader accounts for non-Snowflake consumers
• Must deliver products and comply with federated governance

Data Marketplace / Catalog:
• Connects providers to consumers
• Inventory of available assets
• No central storage of shared data
• Providers retain full control over shared assets (data, functions)
• Consumers access live provider data, no copies or ETL required

Data Consumers:
• Discover and use data, services, or applications directly (no copying or moving)

[Diagram labels: Data Sources; Domain: Marketing; Domain: Customer 360; Snowflake Managed Account; Inventory of shared data products; Snowflake Data Marketplace or 3rd-party catalog; Interoperability Standards, Federated Governance, 3rd Party Tools; Consumers: 3rd party marketing agency, Reseller, Sales Analysts, Churn & Retention, Business optimization, Finance & Controlling]
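The catalog role described above (an inventory that connects providers to consumers, with no central storage and with providers keeping control) can be sketched in a few lines. This is an illustrative model only, not Snowflake's API: `Listing`, `Catalog`, and their methods are hypothetical names, and the location string is an invented placeholder.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Listing:
    """Hypothetical catalog entry: metadata only, never the data itself."""
    product: str
    provider_domain: str
    location: str  # reference to provider-owned data, not a copy
    allowed_consumers: Set[str] = field(default_factory=set)


class Catalog:
    """Inventory of shared assets; stores references, not shared data."""

    def __init__(self) -> None:
        self._listings: Dict[str, Listing] = {}

    def publish(self, listing: Listing) -> None:
        self._listings[listing.product] = listing

    def discover(self) -> List[str]:
        """Return the names of all listed products."""
        return sorted(self._listings)

    def resolve(self, product: str, consumer: str) -> str:
        """Return the live location if the provider allows this consumer."""
        listing = self._listings[product]
        if consumer not in listing.allowed_consumers:
            raise PermissionError(f"{consumer} may not read {product}")
        return listing.location


catalog = Catalog()
catalog.publish(
    Listing(
        product="campaign_results",
        provider_domain="marketing",
        location="marketing_db.shared.campaign_results",  # placeholder name
        allowed_consumers={"sales_analysts"},
    )
)
```

Because the catalog returns only a reference, a consumer such as `sales_analysts` reads the provider's live data, and the marketing domain can revoke access by changing `allowed_consumers` on its own listing.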
THE DATA ORGANIZATION
Excellence coordinate data and analytics processes and projects, with a central hub and distributed spokes.

At the hub of a CoE, data insights leaders drive strategy, establish enterprise-wide data governance, and deliver services including talent recruitment, training, vendor management, and data and analytic services.

The spokes of a CoE lie within the lines of business, functional teams, or geographical units. They define use cases, oversee data execution, track performance, and measure outcomes.

Gray areas reflect shared responsibilities depending on spoke-level capabilities.

[Diagram: CoE Hub with spokes labeled Function, Function, Region, LOB, LOB, LOB]
within the online travel (marketplace) industry, offering price comparison and direct booking options
⊲ Headquartered in Amsterdam, with a distributed team of 150+ experienced professionals active across three continents
⊲ We work with one of the largest and most competitive inventories in the world, using it on a CPA basis. This ensures the highest liquidity of supply in the market, better prices for consumers, and a win-win-win formula for publishers, booking engines, and consumers.
⊲ We are a trusted partner to all of the major (travel) marketing networks, such as Google, TripAdvisor, and Bing.
Our Major Suppliers
Confidential ⓒ FindHotel 2022. All rights reserved
⊲ Framework to assess the data maturity of each team
⊲ Tools that automate the deployment of data infrastructure
⊲ Central repository for metadata
⊲ Domain mapping framework (simplified)
⊲ Split the monolith into business domains and delegated ownership
and direction
⊲ Domain mapping should be agreed upon before any deployment
⊲ Focus on Governance very early
⊲ Assemble a Data Council
KEY TAKEAWAYS
HOW TO GET STARTED, TODAY!
often called a listening tour.
2. Determine skill levels and gaps across teams. This will suggest which domains can become autonomous first, and how to support the others.
3. Start with one domain and use case, then iterate to expand domain ownership and self-service infrastructure access.
for implementing Data Mesh"
https://www.capgemini.com/no-no/2021/05/why-snowflake-is-a-good-match-for-implementing-data-mesh/

Data Mesh at Siemens Healthineers
https://resources.snowflake.com/case-study/snowflake-enables-siemens-healthineers-it-to-optimize-business-operations-around-the-globe

The Data Mesh Journey at Roche
https://www.snowflake.com/blog/data-mesh-perspectives-a-qa-with-roche-diagnostics/
https://f.hubspotusercontent30.net/hubfs/5870630/Roche%20and%20DataOps.live%20case%20study.pdf

Building a Data Mesh with Snowflake at Iterable
https://www.snowflake.com/blog/building-iterables-data-mesh-using-snowflake-three-components-of-an-innovative-data-management-strategy/

Data Mesh at Flexport: Driving Buy-in and Social/Org Challenges
https://www.youtube.com/watch?v=-POiudR2_R0
https://docs.google.com/presentation/d/e/2PACX-1vQpZUaVDrBh1ZnMic-2BP1f_N-Zu3QKpaACagA1lrOkXlGJ5s_cMKmx342N2FpB3FAryKMXW46BNDDG/pub#slide=id.gf1a3223b00_0_30

Empower Data Teams with a Data Mesh Built on Snowflake
https://www.snowflake.com/guides/data-mesh-self-service-data
https://www.snowflake.com/blog/empower-data-teams-with-a-data-mesh-built-on-snowflake/