

Kan Nishida
December 11, 2019

Exploratory Data Catalog - Democratizing Data within Organizations

Kan is presenting Exploratory Data Catalog, a new solution from Exploratory to help you democratize data within your organization.

Kan discusses the common challenges of democratizing data within organizations and demonstrates how Exploratory Data Catalog can address them.

Transcript

  1. Speaker: Kan Nishida (@KanAugust), CEO/co-founder, Exploratory

    Summary: At the beginning of 2016, Kan launched Exploratory, Inc. to democratize Data Science. Prior to Exploratory, Kan was a director of product development at Oracle, leading teams building various Data Science products in areas including Machine Learning, BI, Data Visualization, Mobile Analytics, and Big Data. While at Oracle, Kan also provided training and consulting services to help organizations transform with data.
  2. What you can do with Exploratory

    [Diagram labels: Data Analysis; Questions; Data Access; Data Wrangling; Visualization; Analytics (Statistics / Machine Learning); Communication (Dashboard, Note, Slides)]
  3. Give a Man a Fish, and You Feed Him for

    a Day. Teach a Man To Fish, and You Feed Him for a Lifetime
  4. We build a tool that makes Data Science easier, and we teach how to use

    Data Science to gain deeper insights from data.
  5. Common Problems

    • We don’t have access to data sources, so we need to ask someone to get the data for us.
    • We don’t know which data is the right one; too many spreadsheets are flying around by email.
    • Data wrangling takes up most of our time, so we don’t have enough time left for analyzing data with Statistics and Machine Learning algorithms.
  6. Data Access

    “We want everyone to do customer retention analysis by using data from our payment system, but we can’t expose our customers’ detailed information to everyone.”
  7. Data Access

    • We want to create an environment where everyone can access any data.
    • But, in reality, we can’t let everyone access any data source.
    • It is dangerous to allow anyone to share our customers’ private information with anyone else without any oversight.
  8. Data Governance

    “There are many spreadsheets flying around via email, Slack, Google Docs, or random folders on document sharing servers. But nobody is really sure which ones are the right ones to look at.”
  9. Data Governance

    • There are many similar datasets, and we don’t know which one to analyze.
    • Spreadsheets get copied and updated, but we don’t know who updated them or how.
    • We don’t know the context of the data or the meaning of each column.
  10. Data Readiness

    “Every time we try to analyze data, we end up spending so much time cleaning and transforming it that we often run out of time before getting to the analysis.”
  11. Data Readiness: Data Wrangling

    • Most data is not ready for visualizing and analyzing without cleaning and transforming.
    • We always want to use the latest data.
  12. IT / Data Engineers

    • Slow, expensive, and hard to maintain.
    • Don’t have enough resources.
    • Dependency on IT and Data Engineers.
    • Requirements for data continue to evolve.
  13. Exploratory Data Catalog

    [Diagram labels: DB, Cloud, Files, Web Pages; Exploratory; Exploratory Data Catalog (Schedule, Data Catalog, Web UI, REST API); BI, Excel, R / Python / JS]
  14. Life Cycle of Data Catalog

    1. Prepare Data, 2. Publish, 3. Share, 4. Schedule, 5. Discover, 6. Reproduce & Extend
  15. 1. Prepare Data

    [Life cycle steps: 1. Prepare Data, 2. Publish, 3. Share, 4. Schedule, 5. Discover, 6. Reproduce & Extend]
  16. Life Cycle of Data Catalog

    1. Prepare Data, 2. Publish, 3. Share, 4. Schedule, 5. Discover, 6. Reproduce & Extend
  17. Metadata

    • You can describe your data with Markdown text.
    • With Data Dictionary, you can provide a description for each column.
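The idea of a per-column data dictionary alongside a Markdown description can be illustrated with a minimal sketch. This is not Exploratory's format or API; the column names, descriptions, and the `render_markdown` helper are all hypothetical and only show what such metadata might look like.

```python
# Hypothetical column metadata; none of these names come from Exploratory.
data_dictionary = {
    "customer_id": "Unique identifier for each customer.",
    "signup_date": "Date the customer created the account (YYYY-MM-DD).",
    "monthly_fee": "Subscription fee per month, in USD.",
}


def render_markdown(title, description, dictionary):
    """Render a dataset description and its data dictionary as Markdown text."""
    lines = [
        f"# {title}",
        "",
        description,
        "",
        "| Column | Description |",
        "| --- | --- |",
    ]
    # One table row per column, in the order the dictionary was written.
    for column, text in dictionary.items():
        lines.append(f"| {column} | {text} |")
    return "\n".join(lines)


markdown = render_markdown("Payments", "Monthly payment records.", data_dictionary)
print(markdown)
```

Keeping the description and the column notes next to the data means a reader who discovers the dataset later can see its context without asking the author.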
  18. 3. Share

    [Life cycle steps: 1. Prepare Data, 2. Publish, 3. Share, 4. Schedule, 5. Discover, 6. Reproduce & Extend]
  19. 3. Share

    • Share in Private or Public mode.
    • For privately shared data, an invitation is sent to the people you have shared it with.
    • People you have shared with can create FREE accounts to browse and download the data.
    [Diagram labels: Exploratory; Exploratory Data Catalog (Schedule, Data Catalog, Web UI, REST API); BI, Excel, R / Python / JS]
  20. 4. Schedule

    [Life cycle steps: 1. Prepare Data, 2. Publish, 3. Share, 4. Schedule, 5. Discover, 6. Reproduce & Extend]
  21. Schedule - Automate Data Extraction and Wrangling

    [Diagram labels: DB, Cloud, Files, Web Pages; Exploratory; Exploratory Data Catalog (Schedule, Data Catalog, Web UI, REST API); BI, Excel, R / Python / JS]
  22. 4. Schedule

    Automate extracting data from the data sources and transforming it by scheduling. Your data is always up to date, even without opening Exploratory Desktop.
  23. 5. Discover

    [Life cycle steps: 1. Prepare Data, 2. Publish, 3. Share, 4. Schedule, 5. Discover, 6. Reproduce & Extend]
  24. 5. Discover Data

    • My Insight
    • Insight Page
    • Tag Page
    [Diagram labels: Exploratory; Exploratory Data Catalog (Schedule, Data Catalog, Web UI, REST API); BI, Excel, R / Python / JS]
  25. 5. Discover - My Insight

    All your data, and the data others have shared with you, in one place.
  26. 6. Reproduce & Extend

    [Life cycle steps: 1. Prepare Data, 2. Publish, 3. Share, 4. Schedule, 5. Discover, 6. Reproduce & Extend]
  27. Import Directly from Data Catalog

    [Diagram labels: Exploratory; Exploratory Data Catalog (Schedule, Data Catalog, Web UI, REST API); BI, Excel, R / Python / JS]
  28. 6. Reproduce & Extend

    • Import as EDF (Exploratory Data Format)
    • Import the Final Result as CSV
  29. 6. Reproduce & Extend

    • Import as EDF (Exploratory Data Format)
    • Import the Final Result as CSV
  30. 6. Reproduce & Extend

    • Import as EDF (Exploratory Data Format)
    • Import the Final Result as CSV
  31. Import Directly from Data Catalog

    [Diagram labels: Exploratory; Exploratory Data Catalog (Schedule, Data Catalog, Web UI, REST API); BI, Excel, R / Python / JS]
  32. Data Catalog Data Source

    Access all the data you have access to directly inside Exploratory Desktop.
  33. Re-Import

    Click the Re-Import button to re-import the latest data when the shared data is updated on the Exploratory Server.
  34. Public Data

    Various datasets are shared publicly on the Exploratory Server, such as GDP, Population, and Unemployment.
  35. Manage Data Access

    [Diagram labels: Files; Exploratory Desktop; Exploratory Data Catalog (User Access Management); Exploratory Desktop, BI, Excel, R / Python / JS]
  36. Data Access

    You can decide:
    • Which data to share.
    • Which level of data to share.
    • Whether to share the data source: there is no need to, but you can if you want.
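One way to read "which level of data to share" is sharing an aggregated summary instead of row-level records, so that customer details never leave the source. The sketch below assumes that reading; the records, field names, and the `summarize_by_month` helper are hypothetical and not part of Exploratory.

```python
from collections import defaultdict

# Hypothetical row-level payment records containing private customer details.
payments = [
    {"customer": "Alice", "email": "alice@example.com", "month": "2019-10", "amount": 30},
    {"customer": "Bob", "email": "bob@example.com", "month": "2019-10", "amount": 50},
    {"customer": "Alice", "email": "alice@example.com", "month": "2019-11", "amount": 30},
]


def summarize_by_month(rows):
    """Aggregate to one row per month, dropping names and emails."""
    totals = defaultdict(lambda: {"customers": set(), "revenue": 0})
    for row in rows:
        totals[row["month"]]["customers"].add(row["customer"])
        totals[row["month"]]["revenue"] += row["amount"]
    return [
        {"month": month, "customer_count": len(v["customers"]), "revenue": v["revenue"]}
        for month, v in sorted(totals.items())
    ]


# Only this summary would be published; the row-level data stays private.
shared = summarize_by_month(payments)
print(shared)
```

A monthly summary like this is enough for a basic retention trend while keeping names and emails out of the shared dataset.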
  37. Single Source of Truth / Reproducible Data Sharing

    [Diagram labels: DB, Cloud, Files, Web Pages; Exploratory Desktop; Exploratory Data Catalog (Schedule, Data Catalog, Web UI, REST API); Exploratory Desktop, BI, Excel, R / Python / JS]
  38. Data Governance

    • All the data in a single place.
    • Easier to discover data.
    • Reproduce the data and know how it was prepared.
  39. Data Wrangling as Service

    [Diagram labels: DB, Cloud, Files, Web Pages; Exploratory; Exploratory Data Catalog (Schedule, Connection, Wrangling); Exploratory, BI, Excel]
  40. Data Readiness

    • Easy to prepare data and share.
    • Automate the data extraction and wrangling.
    • Not everyone needs to do the same data wrangling again and again.
  41. Exploratory Cloud: Go to https://exploratory.io/insight

    [Diagram labels: Exploratory Desktop; Exploratory Cloud; Exploratory Data Catalog (Schedule, Data Catalog, Web UI, REST API); Exploratory Desktop, BI, Excel, R / Python / JS]
  42. Exploratory Cloud: https://exploratory.io

    [Diagram labels: Exploratory Desktop; Exploratory Cloud; Exploratory Data Catalog (Schedule, Data Catalog, Web UI, REST API); Exploratory Desktop, BI, Excel, R / Python / JS]
  43. Exploratory Cloud and Exploratory Collaboration Server

    [Diagram labels: Exploratory Desktop; Scheduled Auto Data Wrangling; Exploratory Cloud; Exploratory Data Catalog (Schedule, Data Catalog, Web UI, REST API); Exploratory Collaboration Server (Linux Server / AWS / GCP / Azure / etc.); Firewall; Discover (Data View, Dictionary, API); Exploratory Desktop, BI, Excel, R / Python / JS]
  44. Exploratory Cloud and Exploratory Collaboration Server

    [Diagram labels: Exploratory Desktop; Scheduled Auto Data Wrangling; Exploratory Cloud; Exploratory Data Catalog; Exploratory Collaboration Server (Linux Server / AWS / GCP / Azure / etc.); Firewall; Discover (Data View, Dictionary, API); Exploratory Desktop, BI, Excel, R / Python / JS]
  45. Exploratory Collaboration Server

    [Diagram labels: Exploratory Desktop; Scheduled Auto Data & Insight Generation; Exploratory Collaboration Server (Linux Server / AWS / GCP / Azure / etc.); Data Catalog: Discover (Data View, Dictionary, API); Insight Catalog: Discover (Insight View, Dashboard, Note); User Management; Sharing Management; Exploratory Desktop, BI, Excel, R / Python / JS]
  46. Data Life Cycle of Insights

    1. Create Insights, 2. Publish, 3. Share, 4. Schedule, 5. Discover, 6. Reproduce & Extend