Exploratory Seminar #38 - Why Excel Users Love Exploratory? - Part 2

We interviewed users who moved from Excel to Exploratory about why they made the switch, and compiled a list of the 9 reasons we hear most often.

* Intuitive Understanding of Data at Every Step
* Analytical Capability
* Reproducibility for Reporting
* Easy to Share, But in a Managed way

In this seminar (Part 2), Kan will be discussing the second half of the topics.

Kan Nishida

March 17, 2021

Transcript

  1. Speaker: Kan Nishida (@KanAugust), CEO/co-founder, Exploratory

    Summary: In Spring 2016, Kan launched Exploratory, Inc. to democratize Data Science. Prior to Exploratory, Kan was a director of product development at Oracle, leading teams that built various Data Science products in areas including Machine Learning, BI, Data Visualization, Mobile Analytics, and Big Data. While at Oracle, Kan also provided training and consulting services to help organizations transform with data.
  2. Data Science Workflow

    Questions; Communication; Data Access; Data Wrangling; Visualization; Analytics (Statistics / Machine Learning); Data Analysis
  3. Exploratory: Modern & Simple UI

    Questions; Communication (Dashboard, Note, Slides); Data Access; Data Wrangling; Visualization; Analytics (Statistics / Machine Learning); Data Analysis
  4. 1. Better Performance & Bigger Data Size 2. Lower Learning

    Cost 3. Better Debuggability 4. Data Reproducibility & Automation 5. No Dependency on Excel God 6. Intuitive Understanding with Visualization 7. Better and Quicker Analytical Capability 8. Reproducibility for Reporting 9. Easy to Share, But in a Managed way 9 Reasons Why Excel Users Love Exploratory
  5. 9 Reasons Why Excel Users Love Exploratory 1. Better Performance

    & Bigger Data Size 2. Lower Learning Cost 3. Better Debuggability 4. Data Reproducibility & Automation 5. No Dependency on Excel God 6. Intuitive Understanding with Visualization 7. Better and Quicker Analytical Capability 8. Reproducibility for Reporting 9. Easy to Share, But in a Managed way Part 1
  6. 9 Reasons Why Excel Users Love Exploratory 1. Better Performance

    & Bigger Data Size 2. Lower Learning Cost 3. Better Debuggability 4. Data Reproducibility & Automation 5. No Dependency on Excel God 6. Intuitive Understanding with Visualization 7. Better and Quicker Analytical Capability 8. Reproducibility for Reporting 9. Easy to Share, But in a Managed way Part 2
  7. 9 Reasons Why Excel Users Love Exploratory 1. Better Performance

    & Bigger Data Size 2. Lower Learning Cost 3. Better Debuggability 4. Data Reproducibility & Automation 5. No Dependency on Excel God 6. Intuitive Understanding with Visualization 7. Better and Quicker Analytical Capability 8. Reproducibility for Reporting 9. Easy to Share, But in a Managed way
  8. Without an intuitive understanding of data… • Fail to

    realize the problems with your data. • Not sure whether what you get is what you expected. • Fail to find the patterns in your data.
  9. Without an intuitive understanding of data… • Fail to

    realize the problems with your data. • Not sure whether what you get is what you expected. • Fail to find the patterns in your data.
  10. There might be missing values, but you won’t notice

    unless you scroll and check all the rows.
  11. There might be unexpected outlier values but you won’t notice

    unless you scroll and check all the rows.
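
The check that slides 10 and 11 describe as impractical by scrolling can be approximated in a few lines. This is a minimal sketch in plain Python, not Exploratory's implementation: the `sales` list is hypothetical, `None` stands for a missing value, and outliers are flagged with the common 1.5 × IQR rule, which is an assumption (the slides do not say which rule is used).

```python
from statistics import quantiles

def summarize(values):
    """Count missing values and flag outliers without scrolling through rows."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)
    # Quartiles of the non-missing values (statistics.quantiles, default method).
    q1, _, q3 = quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # assumed 1.5 x IQR fences
    outliers = [v for v in present if v < low or v > high]
    return {"rows": len(values), "missing": missing, "outliers": outliers}

# Hypothetical sales column: one missing value and one suspicious entry.
sales = [120, 135, None, 128, 131, 9999, 125, 140]
summary = summarize(sales)
print(summary)
```

A summary like this surfaces the missing value and the 9999 entry immediately, which is the kind of signal the deck attributes to a visual summary of each column.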
  12. Top 100 Customers based on Sales

    Failing to notice the problems with data at the beginning doesn’t mean that the problems don’t exist. They will just be carried downstream.

    1. Remove/Impute NAs
    2. Remove outlier values
    3. Clean up the customer names
    4. Summarize the Sales by Customer
    5. Extract the first and the last order date
    6. Extract the customer’s country
    7. Filter for North America customers
    8. Calculate the customer lifetime period
    9. Calculate the sales per day
    10. Keep only the top 100 customers

    Is this really the top 100?
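
The ten wrangling steps on this slide can be sketched as one small function to show how an early data problem flows into the final ranking. This is an illustrative sketch in plain Python under stated assumptions: the order records, the field names, the `NORTH_AMERICA` country codes, and the `MAX_PLAUSIBLE_SALES` threshold are all hypothetical and do not come from the deck.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records; field names are assumptions for illustration.
orders = [
    {"customer": " Acme ", "country": "US", "date": date(2020, 1, 1), "sales": 100.0},
    {"customer": "acme", "country": "US", "date": date(2020, 1, 11), "sales": 300.0},
    {"customer": "Birch", "country": "CA", "date": date(2020, 2, 1), "sales": 50.0},
    {"customer": "Birch", "country": "CA", "date": date(2020, 2, 6), "sales": None},
    {"customer": "Cedar", "country": "JP", "date": date(2020, 3, 1), "sales": 500.0},
    {"customer": "Acme", "country": "US", "date": date(2020, 1, 5), "sales": 1e9},
]

NORTH_AMERICA = {"US", "CA", "MX"}  # assumed country codes
MAX_PLAUSIBLE_SALES = 1e6           # assumed outlier threshold

def top_customers(rows, n=100):
    # Steps 1-2: drop missing and implausibly large sales values.
    rows = [r for r in rows
            if r["sales"] is not None and r["sales"] <= MAX_PLAUSIBLE_SALES]
    # Steps 3-6: normalize names, then aggregate per customer.
    stats = defaultdict(
        lambda: {"sales": 0.0, "first": None, "last": None, "country": None})
    for r in rows:
        s = stats[r["customer"].strip().title()]
        s["sales"] += r["sales"]
        s["first"] = r["date"] if s["first"] is None else min(s["first"], r["date"])
        s["last"] = r["date"] if s["last"] is None else max(s["last"], r["date"])
        s["country"] = r["country"]
    result = []
    for name, s in stats.items():
        # Step 7: keep North American customers only.
        if s["country"] not in NORTH_AMERICA:
            continue
        # Steps 8-9: lifetime in days (at least 1) and sales per day.
        lifetime = max((s["last"] - s["first"]).days, 1)
        result.append({"customer": name, "sales": s["sales"],
                       "lifetime_days": lifetime,
                       "sales_per_day": s["sales"] / lifetime})
    # Step 10: rank by total sales and keep the top n.
    result.sort(key=lambda c: c["sales"], reverse=True)
    return result[:n]

top = top_customers(orders)
```

If steps 1 to 3 were skipped, the bogus `1e9` row and the three spellings of the same customer would silently distort the ranking, which is the question the slide raises.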
  13. If you realize the problems after the data wrangling tasks

    in Excel, you will need to fix the problems and adjust all the subsequent tasks accordingly.
  14. The Summary view is automatically generated when you import data.

    It gives you a quick and intuitive way to identify data problems, if there are any.
  15. You can quickly check if it has all the

    data you expect. If it is date and time data, you can check the beginning and the end of the period.
  16. Without an intuitive understanding of data… • Fail to

    realize the problems with your data. • Not sure whether what you get is what you expected. • Fail to find the patterns in your data.
  17. But, if you look closely there are some rows that

    didn’t get joined as expected.
  18. After the Join operation, you can quickly check how the

    data looks under the Summary view. It looks like there is 1 row that didn’t get joined.
  19. You can check if there are any NAs in the

    Summary view and make sure all the rows are joined as expected.
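
The join check described in slides 17 to 19 amounts to counting the rows whose joined column came back empty. This is a minimal sketch in plain Python, assuming hypothetical `orders` and `customers` tables and a hand-written `left_join` helper. It shows the idea only, not how Exploratory performs the join.

```python
# Hypothetical tables; the key and column names are assumptions.
orders = [
    {"order_id": 1, "customer_id": "C1"},
    {"order_id": 2, "customer_id": "C2"},
    {"order_id": 3, "customer_id": "c3"},  # key typo: will not match "C3"
]
customers = {"C1": "Acme", "C2": "Birch", "C3": "Cedar"}

def left_join(rows, lookup, key, new_column):
    """Left join: keep every row, fill None where the key has no match."""
    return [{**row, new_column: lookup.get(row[key])} for row in rows]

joined = left_join(orders, customers, "customer_id", "customer_name")
# Count the rows whose joined column is missing instead of eyeballing them.
unmatched = [row for row in joined if row["customer_name"] is None]
print(f"{len(unmatched)} of {len(joined)} rows did not get joined")
```

Counting the missing values right after the join catches the unmatched row at the step where it is introduced, rather than several steps later.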
  20. In Excel, everything is shown as text and it’s hard

    to notice the problems in your data. With Exploratory, you can quickly notice the problems thanks to the visual presentation of your data at each step of the data wrangling operations.
  21. Without an intuitive understanding of data… • Fail to

    realize the problems with your data. • Not clear if the result of data wrangling is what you have expected. • Fail to find the patterns in your data.

  22. By quickly switching the data aggregation and grouping, you

    can find seasonal patterns in your data.
  23. With Excel, you need to

    summarize the data first, then visualize it (Original Data → Summarize → Visualize).
  24. Switching the column assignments is complicated. Need to assign

    the area that contains the target data…
  25. Configuring the chart data assignment

    differs among chart types (for example, Bar Chart vs. Scatter Chart), which makes it harder to quickly try various ways to find patterns in data.
  26. The experience of data visualization in Excel is not

    designed to explore data and discover patterns and trends. It is rather designed for presenting the data.
  27. For Date and Time data, you can set the aggregation

    level flexibly by rounding the date or extracting only a part of Date and Time.
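
The two aggregation options named on this slide, rounding a date versus extracting a part of it, give different groupings. This is a minimal sketch in plain Python, assuming hypothetical `(date, returns)` records. Rounding to the month keeps the year, while extracting the month number combines the same month across years, which is what exposes seasonal patterns.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily records: (order date, number of returns).
records = [
    (date(2019, 12, 3), 4), (date(2019, 12, 20), 6),
    (date(2020, 1, 15), 1), (date(2020, 12, 9), 7),
]

def aggregate(rows, key):
    """Sum the values after mapping each date to an aggregation key."""
    totals = defaultdict(int)
    for day, value in rows:
        totals[key(day)] += value
    return dict(totals)

# Rounding: floor each date to the first day of its month (keeps the year).
by_month_start = aggregate(records, lambda d: d.replace(day=1))
# Extracting: keep only the month number, so Decembers of all years combine.
by_month_number = aggregate(records, lambda d: d.month)
```

Switching between the two is a one-argument change here; the deck's point is that the same switch should be equally cheap in the charting UI.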
  28. A design based on a data visualization grammar lets you use the

    same framework to configure all the charts. For example, Color is used to create multiple groups.
  29. You can use Repeat By to separate the data into multiple charts.

    It looks like the return rate spike in December is coming from North America.
  30. Summary: Being able to understand the data intuitively and visually

    is critical at every step of data analysis. Recognize the problems. Make sure you get what you expected at each step of Data Wrangling. Find the patterns and trends quickly. (Slide labels: Data Wrangling, Data Visualization)
  31. 9 Reasons Why Excel Users Love Exploratory 1. Better Performance

    & Bigger Data Size 2. Lower Learning Cost 3. Better Debuggability 4. Data Reproducibility & Automation 5. No Dependency on Excel God 6. Intuitive Understanding with Visualization 7. Better and Quicker Analytical Capability 8. Reproducibility for Reporting 9. Easy to Share, But in a Managed way
  32. Well… OK, you can do Analytics in Excel. But

    most of you won’t, because it’s hard to do. Can do ≠ Will do
  33. Correlation: The higher the Age, the

    higher the Monthly Income. (Chart axes: Age, Monthly Income)
  34. Even with the same correlation coefficient the data can look

    very different. Anscombe’s Quartet: Correlation Coefficient: 0.81
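
The claim on this slide can be checked numerically. The sketch below computes the Pearson correlation coefficient for each of the four datasets in Anscombe's quartet, using the published values and a hand-written `pearson` helper. It is plain Python with no charting, so it shows only that the coefficients agree (roughly 0.816 each), not how different the scatter plots look.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Anscombe's quartet: datasets I-III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

coefficients = {name: pearson(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in coefficients.items():
    print(f"Dataset {name}: r = {r:.3f}")
```

Since a single number cannot distinguish a linear trend from a curve or from one extreme point, the coefficient is best read alongside the scatter plot.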
  35. You can quickly visualize the relationship between any two given

    variables and see a set of correlation-related metrics such as the correlation coefficient.
  36. Also, if you have a variable of interest, you

    can use the Correlation Mode under Summary view to …
  37. Now we know that Sales and Sales Comp are correlated,

    but is that because an increase in Sales Comp causes the increase in Sales?
  39. 9 Reasons Why Excel Users Love Exploratory 1. Better Performance

    & Bigger Data Size 2. Lower Learning Cost 3. Better Debuggability 4. Data Reproducibility & Automation 5. No Dependency on Excel God 6. Intuitive Understanding with Visualization 7. Better and Quicker Analytical Capability 8. Reproducibility for Reporting 9. Easy to Share, But in a Managed way
  40. Typically, you create charts, then copy and paste them into

    PowerPoint (or Keynote) to create a report or present to others.
  41. 3 Problems for Reproducibility of Reporting: 1. Need to

    copy the charts, one by one. 2. If there are any problems with the data, you’ll need to do it again. 3. The charts pasted in the report are not guaranteed to be reproducible.
  42. 3 Problems for Reproducibility of Reporting: 1. Need to

    copy the charts, one by one. 2. If there are any problems with the data, you’ll need to do it again. 3. The charts pasted in the report are not guaranteed to be reproducible.
  43. It becomes harder to find and manage the charts when

    they’re created in various sheets.
  44. 3 Problems for Reproducibility of Reporting: 1. Need to

    copy the charts, one by one. 2. If there are any problems with the data, you’ll need to do it again. 3. The charts pasted in the report are not guaranteed to be reproducible.
  45. With Excel, a single problem upstream in the

    data preparation flow will require all the subsequent charts to be adjusted.
  46. When you change or fix the data upstream

    in the data preparation flow, all the subsequent steps will be automatically updated to accommodate the change.
  47. And all the charts that reference the subsequent steps

    will also be automatically updated.
  48. You don’t need to revisit all the charts one by

    one. Just click the Run button, and all the charts will be regenerated to accommodate the changes automatically.
  49. If the data in the original Excel files has been

    updated, you can click the ‘Re-Import’ button, which not only imports the data but also applies all the data wrangling steps and updates the charts automatically.
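
The behavior described in slides 46 to 49 follows from recording the wrangling steps and replaying them whenever the source changes. This is a minimal sketch in plain Python of that idea only. The step functions, the `run` helper, and `chart_spec` are hypothetical stand-ins and say nothing about how Exploratory stores or executes its steps.

```python
# A recorded pipeline: an ordered list of (name, function) steps.
def drop_missing(rows):
    return [r for r in rows if r["sales"] is not None]

def total_by_region(rows):
    totals = {}
    for r in rows:
        totals[r["region"]] = totals.get(r["region"], 0) + r["sales"]
    return totals

STEPS = [("drop missing", drop_missing), ("total by region", total_by_region)]

def run(source, steps=STEPS):
    """Replay every recorded step, in order, on the given source data."""
    data = source
    for _name, step in steps:
        data = step(data)
    return data

def chart_spec(totals):
    """Stand-in for a chart: bars derived from the final step's output."""
    return sorted(totals.items())

original = [{"region": "East", "sales": 10}, {"region": "West", "sales": None}]
updated = original + [{"region": "West", "sales": 5}]  # source file was updated

chart_before = chart_spec(run(original))
chart_after = chart_spec(run(updated))  # same steps, no manual rework
```

Because the chart is derived from the pipeline output rather than pasted by hand, re-running the same steps on the updated source is enough to refresh it.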
  50. 3 Problems for Reproducibility of Reporting: 1. Need to

    copy the charts, one by one. 2. If there are any problems with the data, you’ll need to do it again. 3. The charts pasted in the report are not guaranteed to be reproducible.
  51. I want to customize the report to see how my

    product is doing. (Illustration labels: Report Author, You)
  52. Blackbox of Report Creation: • When you are given the report (Excel and

    PowerPoint), you don’t know which charts come from which Excel sheets. • It’s hard to know how the chart data was prepared since neither the data wrangling steps nor the chart creation is recorded.
  53. EDF: You can share your report as an EDF (Exploratory

    Data Format), which contains everything you need to reproduce the report. (Illustration labels: Report Author, You)
  54. Just by importing the EDF you’ll be able to reproduce

    exactly the same content as the original report.
  55. You need Data, Data Wrangling Steps, and Chart configuration to

    reproduce the Note, and the EDF contains all of them.
  56. Beyond reproducing the original report, you can also

    check how the data and the charts were created by looking at the UI. Sharing reports in a reproducible format makes it easier to share the work needed to produce them, such as data wrangling and analysis, with other members. You can scale the team’s productivity by collaborating with others instead of concentrating all the reporting work on a single person.
  57. 9 Reasons Why Excel Users Love Exploratory 1. Better Performance

    & Bigger Data Size 2. Lower Learning Cost 3. Better Debuggability 4. Data Reproducibility & Automation 5. No Dependency on Excel God 6. Intuitive Understanding with Visualization 7. Better and Quicker Analytical Capability 8. Reproducibility for Reporting 9. Easy to Share, But in a Managed way
  58. “There is a lot of spreadsheet data flying around via email, Slack,

    Google Docs, or random folders on document sharing servers. But nobody is really sure which files are the right ones to look at.” - something we hear very often
  59. • Not sure which data file is the correct one.

    • Not clear what’s in the data. • Nobody knows how the data has been transformed or manipulated. • Someone has to keep updating the data manually.
  60. By downloading the EDF, you can import the data along with the

    data wrangling steps required to reproduce the data in Exploratory.