The rise of big data governance: Insight on this emerging trend from active open source initiatives

Each of today’s most forward-thinking enterprises has been forced to face similar data challenges: a reliance on real-time data to better serve its customers and, consequently, the requirement to comply with regulations that protect that data, such as the EU’s General Data Protection Regulation (GDPR).

John Mertic and Maryna Strelchuk detail the benefits of a vendor-neutral approach to data governance, explain the need for an open metadata standard, and share how companies like ING, IBM, Hortonworks, and more are delivering solutions to this challenge as an open source initiative. The solution to this emerging challenge is a tricky one. For companies like ING, this data governance challenge has been met with metadata, a consistent view across a large heterogeneous ecosystem, and collaboration with an active open source community.

John Mertic

March 07, 2018



  1. The rise of big data governance: Insight on this emerging

    trend from active open source initiatives March 7, 2018 – Strata SJC 2018
  2. @ODPiOrg TODAY’S SPEAKERS John Mertic, Director of Program Management,

    Linux Foundation; Maryna Strelchuk, Information Architect and Application Developer at ING
  3. @ODPiOrg IMAGINE … An enterprise data catalogue that lists all

    of your data, where it is located, its origin (lineage), owner, structure, meaning, classification and quality No matter where the data resides Search
  4. @ODPiOrg New tools from any vendor connect to your data

    catalogue out of the box No vendor lock-in and no expensive population of yet another proprietary, siloed metadata repository Search Open Metadata Management & Governance IMAGINE …
  5. @ODPiOrg Metadata is added automatically to the catalogue as new

    data is created Databases Applications Function Function Functions Files It’s possible if data-driven enterprises collaborate to build it Let’s talk about how IMAGINE …
  6. @ODPiOrg • The Metadata Problem • Building an Open Ecosystem

    • Benefits for Data Governance Professionals AGENDA
  7. @ODPiOrg 1. Use data outside the application that created it 2. Find

    the right data sets 3. Automate governance processes WHY DO WE NEED METADATA?
  8. @ODPiOrg • Many data platforms do not have metadata support

    • Proprietary tools support a limited range of data sources and governance actions • Expensive efforts to create an enterprise data catalogue TODAY’S REALITY
  9. @ODPiOrg i. The maintenance of metadata must be automated ii.

    Metadata management must become ubiquitous iii. Metadata access must become open and remotely accessible iv. Metadata should be used to drive the governance of data v. Wherever possible, discovery and maintenance of metadata has to be an integral part of all tools that access, change and move information. METADATA GOVERNANCE MANIFESTO
  10. @ODPiOrg Open and Unified Metadata Atlas Metadata repository IBM Metadata

    repository Custom Metadata repository Open Metadata Repository Service Open Metadata Access Service Open and Unified Metadata WHAT NEEDS TO CHANGE
  11. @ODPiOrg Update to Apache Atlas Connectivity Business Value Automation

    Atlas Metadata Server Metadata Highway Metadata Repository Store Metadata Work with Metadata Exchange Metadata Manage Metadata Capture Metadata Discovery Server Stewardship Server Access Services Data Platforms and Engines Analyse Data Improve Data Repository Services Retrieve Metadata Automation Capture of metadata from data platforms, data movement engines and data protection engines. Exception management and stewardship Business Value Specialized services for key data roles such as CDO, Data Scientist, Developer, DevOps Operator, Asset Owner, Applications Connectivity Metadata Highway offering open metadata exchange, linking and federation between heterogeneous metadata repositories.
  12. @ODPiOrg Open and Unified Metadata Atlas Metadata repository IBM Metadata

    repository Microsoft SSAS Open Metadata Repository Service OMAS Open and Unified Metadata CURRENTLY IN DEVELOPMENT Information View Asset Catalog Subject Area Catalog Search UI Power BI
  13. @ODPiOrg Good metadata enables subject matter experts to collaborate around

    the data Locate the data they need, quickly and efficiently Feeding back their knowledge about the data and the uses they have made about it to help others and support economic evaluation of data CO-CREATION WITH PRACTITIONERS
  14. @ODPiOrg Your governance program is based on established definitions Allow

    a broader range of tools in your organization Automated governance processes protect and manage your data Metadata-driven access control Auditing, metering and monitoring Quality control and exception management Rights management Your metadata offerings will deliver value faster as they tap into metadata collected by other vendors’ tools. ODPi packages extend your metadata system’s and tools’ capabilities Conformance tests minimize your effort in being compliant with key standards and regulations. Customers have increased confidence in your tools and services due to ODPi certification. Data Governance Professionals Vendors HOW THIS HELPS
  15. @ODPiOrg ROADMAP March April May June July August September Data

    Governance PMC meets weekly • The focus of meetings is to develop the open metadata usage guidelines, best practices, connector descriptions • Two threads every other week on the PMC • Thread 1: Compliance tools and packs • Thread 2: Practitioner - Subject matter experts • Learn more at https://lists.odpi.org/g/odpi-pmc-datagovernance Strata, San Jose Dataworks Summit, Berlin IBM Think, Las Vegas Webinar for Offering Managers Webinar for Developers Privacy Pack GA Apache Atlas 1.0 GA Releases upcoming • Privacy pack due in June (https://jira.odpi.org/browse/DG-3) • Apache Atlas 1.0 GA to support work due in late June (https://cwiki.apache.org/confluence/display/ATLAS/Open+Metadata+and+Governance) Future work • Metadata tools and solutions will integrate through the open metadata interfaces • Integrated solutions and products with the open metadata interfaces Dataworks Summit, San Jose Apache Atlas 1.0 beta Strata, NYC
  16. FOUNDATIONS ENABLE TRUSTED INNOVATION Successful Projects depend on members, developers,

    infrastructure to develop technology, which is turned into products that the market will adopt. Ecosystem
  17. GET INVOLVED WITH ODPi DATA GOVERNANCE Have your organization support

    ODPi https://www.odpi.org/about/join Visit ODPi website and join the quarterly newsletter https://www.odpi.org/ Learn more about Data Governance PMC https://www.odpi.org/odpi-compliance-directory/odpi-end-users Join the Data Governance PMC Mailing List https://lists.odpi.org/g/odpi-pmc-datagovernance
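
Slide 3 describes a catalogue entry that records where data is located, its lineage, owner, structure, meaning, classification and quality, and slide 14 lists metadata-driven access control among the automated governance processes. A minimal Python sketch of how those two ideas might fit together follows. Every name in it (`CatalogueEntry`, `CLEARANCE`, `can_access`, `search`, the sample values) is hypothetical and chosen for illustration; it is not the Apache Atlas or ODPi API.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical classification levels, ordered from least to most sensitive.
CLEARANCE = {"public": 0, "internal": 1, "confidential": 2}


@dataclass
class CatalogueEntry:
    """One data set in the catalogue, with the attributes named on slide 3."""
    name: str
    location: str            # where the data resides
    lineage: list[str]       # upstream sources, oldest first
    owner: str
    structure: dict[str, str]  # column name -> type
    meaning: str             # business description
    classification: str      # a key of CLEARANCE
    quality: float           # illustrative score between 0.0 and 1.0


def can_access(entry: CatalogueEntry, user_clearance: str) -> bool:
    """Metadata-driven access control: compare clearance to classification."""
    return CLEARANCE[user_clearance] >= CLEARANCE[entry.classification]


def search(catalogue: list[CatalogueEntry], term: str) -> list[CatalogueEntry]:
    """Case-insensitive search over entry names and business meanings."""
    needle = term.lower()
    return [
        e for e in catalogue
        if needle in e.name.lower() or needle in e.meaning.lower()
    ]


# Illustrative sample data, not taken from the talk.
customers = CatalogueEntry(
    name="customers",
    location="warehouse.crm.customers",
    lineage=["crm_export", "staging.customers"],
    owner="data-office",
    structure={"id": "int", "email": "string"},
    meaning="Active retail customers and their contact details",
    classification="confidential",
    quality=0.97,
)
catalogue = [customers]
```

Because the access decision reads only the entry's metadata, the same check could in principle be applied by any tool that can read the catalogue, which is the point the talk makes about open metadata.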