Data Lineage made easy with Graph Databases

Data lineage has always been a topic, at least for auditing, and it came back as a key element with regulations such as GDPR. The problem is that with the proliferation of tools, sources, transformations and movements of data, it is getting harder and harder to keep a clear picture of the whole data lineage in a company, and even harder to use that information for auditing. This is where graph databases come in to make things easier: data lineage is by nature a graph. It is possible to model every single flow and every single component, down to a column in a database or a dashboard. Add the corporate security model, with its abstraction layers of groups and roles on top of users, and your graph is ready for analysis. This talk covers why graph databases are a perfect match for data lineage and uses an analytical enterprise platform as an example, tracking a single column from a database table all the way to dashboards and reports, together with the respective security. (Based on the Oracle Property Graph engine PGX, with Cytoscape for visualization and OAC/OBIEE for the data lineage example.)
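To make the idea of "tracking a single column to dashboards and reports" concrete, here is a minimal sketch in plain Python. It is not the PGX-based approach used in the talk: the object names, the edge labels "used in" and "contained in", and the `downstream` helper are all hypothetical, chosen only to illustrate that lineage questions become graph traversals.

```python
from collections import deque

# Hypothetical lineage edges as (source, label, target) triples.
# All object names and the "used in" / "contained in" labels are illustrative.
EDGES = [
    ("db.SALES.AMOUNT", "mapped to", "rpd.Sales.Amount"),
    ("rpd.Sales.Amount", "used in", "analysis.Revenue by Region"),
    ("analysis.Revenue by Region", "contained in", "dashboard.Executive Overview"),
    ("rpd.Sales.Amount", "used in", "report.Monthly Revenue"),
    ("db.SALES.REGION", "mapped to", "rpd.Sales.Region"),
]


def downstream(start, edges):
    """Return every object reachable from `start` by following directed edges."""
    # Build an adjacency list: source -> list of targets.
    adjacency = {}
    for source, _label, target in edges:
        adjacency.setdefault(source, []).append(target)

    # Breadth-first traversal; `seen` also protects against cycles.
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Everything that depends on the AMOUNT column, directly or indirectly.
impacted = downstream("db.SALES.AMOUNT", EDGES)
for name in sorted(impacted):
    print(name)
```

A graph database would express the same question as a path query over stored vertices and edges; the sketch only shows why the data model fits.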

Gianni Ceresa

May 07, 2019

Transcript

  8. 25 May 2018

  9. Only heard about it a few times, for not much $. Did you?
  10. GDPR compliant

  13. Graph Database: vertex (also called node) and edge

  14. Vertex (node): vertex ID, vertex properties; a vertex can have a label. Edge: edge ID, edge label, edge properties; directed edge.

  21. (Diagram; labels shown: mapped to, reference, page, contains, Catalog, ACL, member of)
  24. A shortcut is like a symbolic link: it makes something accessible via a different path. It is invisible to most users and OOTB auditing tools.
  25. An alias is an alternative way to access data. It is invisible to most users and OOTB auditing tools.
  29. Coming soon...

  30. Coming soon... Free!!! Free!!!

  33. From 45700 nodes with 105406 edges to 85 nodes with 218 edges, in seconds (Catalog, RPD, Security)
  34. Coming soon…

  35. DEMO

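The property graph vocabulary from the slides (vertices and directed edges, each with an ID, a label and properties) and the point about shortcuts being invisible to auditing tools can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the PGX API and not the model shown in the talk: the `Vertex` and `Edge` classes, the sample users, roles and catalog paths, and the "can read" edge label are hypothetical; only "member of" and "reference" are labels that appear on the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: str
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    id: int
    source: str  # ID of the vertex the directed edge starts from
    target: str  # ID of the vertex the directed edge points to
    label: str
    properties: dict = field(default_factory=dict)


# Hypothetical sample data: one user, two nested roles, one catalog object
# and a shortcut pointing at it from another folder.
vertices = {v.id: v for v in [
    Vertex("alice", "User"),
    Vertex("analysts", "Role"),
    Vertex("bi_consumers", "Role"),
    Vertex("/shared/Finance/Revenue", "CatalogObject", {"type": "analysis"}),
    Vertex("/shared/Sandbox/Revenue link", "Shortcut"),
]}
edges = [
    Edge(1, "alice", "analysts", "member of"),
    Edge(2, "analysts", "bi_consumers", "member of"),
    # "can read" is an assumed label standing in for an ACL entry.
    Edge(3, "bi_consumers", "/shared/Sandbox/Revenue link", "can read"),
    Edge(4, "/shared/Sandbox/Revenue link", "/shared/Finance/Revenue", "reference"),
]


def reachable(start, edges, labels):
    """Vertex IDs reachable from `start` using only edges whose label is in `labels`."""
    seen, stack = set(), [start]
    while stack:
        current = stack.pop()
        for edge in edges:
            if edge.source == current and edge.label in labels and edge.target not in seen:
                seen.add(edge.target)
                stack.append(edge.target)
    return seen


# Roles alice holds, directly or through nested membership.
roles = reachable("alice", edges, {"member of"})

# Catalog objects alice can reach once shortcuts are followed as well.
exposed = reachable("alice", edges, {"member of", "can read", "reference"})
catalog_objects = {vid for vid in exposed if vertices[vid].label == "CatalogObject"}
print(sorted(roles), sorted(catalog_objects))
```

In this toy data no permission is granted on the Finance folder itself, yet the traversal still reaches the analysis through the shortcut, which is the kind of path a folder-by-folder audit would miss.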