Streamline Data Governance with Egeria

NancyMcGrory

March 20, 2019

Transcript

  1. Real Data Landscapes
     § Many enterprises have had 40 years of innovation embedded into their IT systems.
     § Data science projects spend a lot of time finding and validating data.
     § There is often a delay moving real-time analytics from POC into production.
     https://github.com/odpi/egeria
  2. Building governance maturity is a gradual process
     § Organizations may operate at different levels of maturity in different parts of their business.
     § Choices are determined by where the most value lies.
     § Many organizations aspire to provide all employees with the data they need (data citizenship*).
     * Source: Forrester Data Governance 2.0, 2015-2016
  3. Observations from the maturity model
     § The number of bespoke integrations needed between tools, so that metadata is consistently available to everyone who needs it, grows steadily with each step up in maturity.
     § With little to no standardization between vendors, the cost and time delay is borne by the organization.
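The observation on slide 3 can be made concrete with simple counting. The sketch below is an illustration only and rests on an assumption that is not stated on the slide: without a shared standard, every pair of tools that exchanges metadata needs its own bespoke integration, whereas with a shared standard each tool needs one connector. The function names are hypothetical.

```python
def point_to_point_integrations(tools: int) -> int:
    """Integrations needed if every pair of tools is wired up directly (assumption)."""
    return tools * (tools - 1) // 2


def shared_standard_connectors(tools: int) -> int:
    """Connectors needed if each tool adapts once to a common standard (assumption)."""
    return tools


if __name__ == "__main__":
    # Compare both counts as the number of tools in the landscape grows.
    for tools in (2, 5, 10, 20):
        print(
            tools,
            point_to_point_integrations(tools),
            shared_standard_connectors(tools),
        )
```

Under this assumption the pairwise count grows quadratically with the number of tools, while the connector count grows linearly, which is one way to read the cost argument on the slide.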
  4. Using Egeria …
     § Eases the cost of metadata integration through:
       § Comprehensive standards and libraries.
       § An active vendor recruitment program.
     § Provides direct support to many governance roles, filling the gaps between the functions offered by commercial tools.
  5. Connecting to multiple cohorts
     Diagram labels: Cohort A; Cohort B; Chief Data Office; Data Lake; Systems of Record; Mobile Apps; Data Lake; Systems of Record; Marketing
  6. Importance of the Graph Model
     Diagram labels: Server 1; Server 2; Database Column; Glossary Term; Glossary Term; Meaning; Reference Copy; Relationship
  7. Importance of the Graph Model
     Diagram labels: Server 1; Server 2; Server 3; Database Column; Glossary Term; Database Column; Glossary Term; Meaning
  8. Importance of the Graph Model: Using Entity Proxies
     Diagram labels: Server 1; Server 2; Server 3; Database Column; Glossary Term; Meaning; Database Column; Glossary Term; Entity Proxy
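The three graph-model slides appear to show a "Meaning" relationship between a Database Column and a Glossary Term whose home servers differ, with a third server holding the relationship through entity proxies. The sketch below is one reading of those diagram labels, not Egeria's API: all class names, GUIDs, and property values are hypothetical and exist only to show how a relationship can point at stubs that identify entities stored elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Full entity, stored on its home server (hypothetical structure)."""
    guid: str
    type_name: str
    home_server: str
    properties: dict = field(default_factory=dict)


@dataclass
class EntityProxy:
    """Stub that identifies an entity whose full copy lives on another server."""
    guid: str
    type_name: str
    home_server: str


@dataclass
class Relationship:
    """Link between two entities, stored with proxies for each end."""
    type_name: str
    home_server: str
    end_one: EntityProxy
    end_two: EntityProxy


def make_proxy(entity: Entity) -> EntityProxy:
    """Keep only what is needed to locate the entity, not its properties."""
    return EntityProxy(entity.guid, entity.type_name, entity.home_server)


def resolve(proxy: EntityProxy, repositories: dict) -> Entity:
    """Fetch the full entity from the repository of its home server."""
    return repositories[proxy.home_server][proxy.guid]


# Illustrative data only: names and GUIDs are invented for this sketch.
column = Entity("guid-1", "Database Column", "Server 1", {"name": "CUST_ID"})
term = Entity("guid-2", "Glossary Term", "Server 2", {"name": "Customer Identifier"})

repositories = {
    "Server 1": {column.guid: column},
    "Server 2": {term.guid: term},
}

# Server 3 stores the relationship, holding proxies rather than full copies.
meaning = Relationship("Meaning", "Server 3", make_proxy(column), make_proxy(term))
```

In this sketch the third server can store and traverse the relationship while the full entities remain on their home servers; `resolve(meaning.end_one, repositories)` retrieves the full column only when it is needed.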
  9. Metadata and governance digital platform
     Diagram labels: Open Metadata and Governance; Reporting Platform; ETL Platform; Analytics Platform; Virtualization Platform; Governance Platform; Data Platform
  10. Design philosophy
     Diagram labels: Search; Open Metadata Access Services; Open Metadata Repository Services; Use cases, personas, practitioners' input; Data integration, availability and integrity best practices
  11. Coco Pharmaceuticals personas
     § Jules Keeper, CDO
     § Tessa Tube, Chief Researcher
     § Erin Overview, Information Architect
     § Faith Broker, Chief Privacy Officer
     § Bob Nitter, Integration Developer
     § Callie Quartile, Data Scientist
     § Nancy Noah, Cloud Specialist
     § Gary Geeke, IT Infrastructure
     https://odpi.github.io/data-governance/coco-pharmaceuticals/personas/
  12. Open metadata type model summary
     Diagram labels: Glossary; Collaboration; Governance; Models and Reference Data; Metadata Discovery; Lineage; Data Assets; Base Types, Systems and Infrastructure
  13. Each area caters for appropriate metadata structures
     Diagram labels (the diagram numbers the areas 0 to 7; the transcript does not preserve which label belongs to which area):
     § Policy Metadata (Principles, Regulations, Standards, Approaches, Rule Specifications, Roles and Metrics)
     § Governance Actions and Processes
     § Augmentation; Mapping; Implementation
     § Business Objects and Relationships, Taxonomies and Ontologies
     § Business Attributes
     § Organization
     § Teaming Metadata (people profiles, communities, projects, notebooks, …)
     § Models and Schemas
     § Physical Asset Descriptions (Data stores, APIs, models and components)
     § Asset Collections (Sets, Typed Sets, Type Organized Sets)
     § Information Views
     § Rights Management
     § Reference Data
     § Feedback Metadata (tags, comments, ratings, …)
     § Classification Schemes
     § Classification Strategy
     § Subject Area Definition
     § Campaigns and Projects
     § Rollout
     § Discovery Metadata (profile data, technical classification, data classification, data quality assessment, …)
     § Augmentation; Instrument; Association
     § Information Process Instrumentation (design lineage)
     § Connectors
     § Basic Types, Infrastructure and Systems
     § Access
  14. Current Open Metadata Access Services (OMASs)
     Project Management; Community Profile; Asset Catalog; Stewardship Action; Information View; Governance Program; Data Process; Subject Area; Connected Asset; Discovery Engine; Governance Engine; Data Protection; Software Developer; Data Platform; Asset Owner; Digital Architecture; Data Science; DevOps; Asset Consumer; Data Infrastructure; Data Privacy; Asset Lineage
  15. Automating governance example
     Diagram labels: IBM Information Governance Catalog; Apache Atlas; Apache Ranger; Gaian; Define Policies; Hadoop Metadata; Manage Data Access; Access Data; Egeria (Open metadata exchange and federated queries); Egeria Open Governance APIs; configure; configure
  16. Realizing open metadata and governance
     § Delivering core technology
     § Recruiting vendors
     § Assisting practitioners
     Diagram labels: Vendors; Practitioners; Core Technology; Compliance Suite; Best Practices; Project Egeria; Project Data Governance
  17. Help wanted
     § Governance practice leaders needed to build out best practices.
     § If you buy data technology, please encourage your vendors to consume the Egeria technology.
     § Looking for developers:
       § UI development
       § Graph repository (e.g. JanusGraph/TinkerPop)
       § Python clients
     § Join the ODPi to help fund our work.
     § Tell everyone about what we do.
  18. Links
     § Open source repositories
       • https://github.com/odpi/data-governance
       • https://github.com/odpi/egeria
     § Press Release and Podcast
       • https://www.linuxfoundation.org/press-release/2018/08/odpi-announces-egeria-for-open-sharing-exchange-and-governance-of-metadata/
       • https://roaringelephant.org/2018/09/25/episode-107-open-metadata-and-governance-masterclass-with-mandy-chessell-part-1/
       • https://roaringelephant.org/2018/10/09/episode-109-open-metadata-and-governance-masterclass-with-mandy-chessell-part-2/