Graph Database makes Data Lineage easy

Data Lineage (DL) has been (re)discovered by many companies because of the GDPR enforcement in May 2018. It is the easiest way to know and keep control of “who can access what, how and why” throughout a corporate analytical platform. The challenge of DL is that most data flows are non-linear, often M-to-N relationships, which makes them difficult to analyze quickly with traditional tools. A Graph Database is the perfect way to store and analyze the metadata collected for DL because of its modular structure of nodes and edges. We will demonstrate an implementation and analysis of DL: generating the graph and loading it into the Oracle Database, then analyzing it via SQL, a notebook (Python/Java) or visually with Cytoscape.
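To make the M-to-N point concrete, here is a minimal sketch in plain Python, not the Oracle implementation shown in the talk. It stores lineage as directed edges and walks them backwards to find everything that feeds a given report field. The object names (`crm.customers.email` and so on) and the `upstream` helper are hypothetical and only serve the illustration.

```python
from collections import deque

# Hypothetical lineage edges: (source, target) means "target is derived from source".
# Two sources feed one warehouse column, which in turn feeds two reports (M-to-N).
EDGES = [
    ("crm.customers.email", "dwh.dim_customer.email"),
    ("web.signups.email", "dwh.dim_customer.email"),
    ("dwh.dim_customer.email", "report.marketing.contact"),
    ("dwh.dim_customer.email", "report.support.contact"),
]


def upstream(edges, node):
    """Return every node that feeds into `node`, directly or indirectly."""
    # Index the edges by target so each step backwards is a dictionary lookup.
    parents = {}
    for src, dst in edges:
        parents.setdefault(dst, set()).add(src)

    # Breadth-first walk against the direction of the edges.
    seen = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen


if __name__ == "__main__":
    for name in sorted(upstream(EDGES, "report.marketing.contact")):
        print(name)
```

The same question asked of relational tables usually needs a recursive query, whereas on a graph it is a plain traversal.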

Gianni Ceresa

May 11, 2018

Transcript

  1. Common Enterprise Information Model: Connecting Data with Self Service Analytic Applications

    Physical Layer: Map Physical Data, Connections, Schemas
    Semantic Object Layer: Business Model, Dimensions & Hierarchies, Measures & Calculations, Time Series & Aggregation
    Presentation Layer: Simplified Business View, Subject Areas, Security and Roles, Preferences
  2. (what) (who) If interested in how to practically get the

    metadata from OAC/OBIEE have a look at
  3. • Doesn’t support loading a graph from DB !!!

    • Will support loading from DB
  4. GraalVM will make this part useless thanks to its polyglot

    feature • Python will have direct access to Java objects and methods
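  5. -- Collect the distinct property keys (k) and type codes (t) of vertices and edges
    WITH properties AS (
      SELECT DISTINCT k, t, 'Vertex' AS kind FROM sa607vt$
      UNION ALL
      SELECT DISTINCT k, t, 'Edge' AS kind FROM sa607ge$
    )
    -- Turn each property into one builder call; only type codes 1 and 5 are mapped
    ,cfg AS (
      SELECT '.add' || kind || 'Property("' || k || '",PropertyTypeClass.'
             || CASE WHEN t = 1 THEN 'STRING' WHEN t = 5 THEN 'DATE' END
             || ')' AS prop
      FROM properties
    )
    -- Concatenate all calls into a single string
    SELECT LISTAGG(prop, '') WITHIN GROUP (ORDER BY prop) FROM cfg;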
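For readers who want to see what the query above produces without a database at hand, here is a rough Python sketch of the same string-building logic. The function name `property_config` and the sample input are hypothetical. The type codes (1 for STRING, 5 for DATE) and the generated call names are taken from the SQL itself. Two behaviours are assumptions about how Oracle evaluates the query: an unmapped type code yields an empty type name (a NULL from the `CASE` concatenates as an empty string), and plain Python string ordering stands in for the database's sort order.

```python
# Type codes handled by the CASE expression in the query.
TYPE_NAMES = {1: "STRING", 5: "DATE"}


def property_config(vertex_props, edge_props):
    """Build the chain of builder calls from (key, type_code) pairs.

    vertex_props and edge_props stand in for the k and t columns of the
    vertex and edge tables queried in the SQL.
    """
    # SELECT DISTINCT per table, then UNION ALL of the two result sets.
    rows = [(k, t, "Vertex") for k, t in set(vertex_props)]
    rows += [(k, t, "Edge") for k, t in set(edge_props)]

    # One '.add<kind>Property("<key>",PropertyTypeClass.<TYPE>)' call per row.
    # Assumption: an unmapped type code gives an empty type name.
    calls = [
        f'.add{kind}Property("{k}",PropertyTypeClass.{TYPE_NAMES.get(t, "")})'
        for k, t, kind in rows
    ]

    # LISTAGG(prop, '') WITHIN GROUP (ORDER BY prop): sort, then join without separator.
    return "".join(sorted(calls))


if __name__ == "__main__":
    print(property_config([("name", 1), ("created", 5)], [("label", 1)]))
```

The resulting string is meant to be pasted into graph configuration code, so any property type other than the two mapped ones would need its own branch first.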