Seminar #41 - An Introduction to Exploratory v6.5 New Features

We released Exploratory v6.5 on 4/19 with many exciting new features. Kan will introduce some of them, including the following.

- Summary View: New Chart Types
- Analytics: Time Series Clustering
- Analytics: Updates for Correlation
- Multiple Excel / CSV Files Import
- Google Drive / S3 Support
- Contents Search inside Project

Kan Nishida

April 21, 2021

Transcript

  1. EXPLORATORY Online Seminar #41 Introduction to Exploratory v6.5

  2. Speaker: Kan Nishida (@KanAugust), CEO/co-founder, Exploratory. Summary: In Spring 2016, launched Exploratory, Inc. to democratize Data Science. Prior to Exploratory, Kan was a director of product development at Oracle leading teams to build various Data Science products in areas including Machine Learning, BI, Data Visualization, Mobile Analytics, Big Data, etc. While at Oracle, Kan also provided training and consulting services to help organizations transform with data.
  3. The Third Wave: Data Science is not just for Engineers and Statisticians. Exploratory makes it possible for Everyone to do Data Science.
  4. Data Science Workflow: Questions, Communication, Data Access, Data Wrangling, Visualization, Analytics (Statistics / Machine Learning), Data Analysis.
  5. Exploratory: Modern & Simple UI. Questions, Communication (Dashboard, Note, Slides), Data Access, Data Wrangling, Visualization, Analytics (Statistics / Machine Learning), Data Analysis.
  6. EXPLORATORY Online Seminar #41 Introduction to Exploratory v6.5

  7. Areas of New Features / Enhancements: • Project • Summary View • Analytics • Chart (Visualization) • Data Wrangling • Data Source
  8. Project: • Width Adjustment for Data Frame List • Search inside Project • Data Frame Export
  9. Width Adjustment: Now you can adjust the width of the data frame list area!
  10. Search: Type to find data frames, charts, notes, and dashboards quickly!
  11. Search by the chart name and the comment as well.
  12. The pop-up now shows the analytics thumbnails as well.
  13. And you can click on the thumbnail image to directly open the chart!
  14. Now all the charts are shown inside the pop-up, and you can scroll through them when there are too many charts!
  15. Data Frame Export: You can export a data frame including all the charts, analytics, and branches. Note that data frames that are joined or merged with the exported data frame won’t be exported. They need to be exported as separate files.
  16. You can import the exported data frame to reproduce the data frame with its charts, analytics, and branches.
  17. Data Source: • Multiple Excel / CSV Files Import • Google Drive • Amazon S3 • Enhancements for Google BigQuery
  18. It now supports Amazon S3 and Google Drive as data file locations.
  19. You can select multiple files to import.
  20. You can select the files by matching the file name.
  21. Or, simply select all the files inside a selected folder.
  22. Once you select the files, you can click the ‘Import’ button to import them as separate data frames.
  23. You will see a data import dialog to configure how each file is imported.
  24. You can see the index of each file in the import sequence.
  25. If you want to apply the same settings to all the files instead of configuring them one by one, you can click the ‘OK for All’ button.
  26. You can ‘Skip’ to cancel a particular file import or ‘Skip All’ to cancel all the remaining files.
  27. Google Drive: You can now import files (CSV / Excel) that are saved on Google Drive!
  28. Select the ‘Google Drive’ tab.
  29. Select one or more files, or a folder.
  30. You can import the files as separate data frames or as a single data frame by merging them together.
  31. The same import configuration will be applied to all the files.
  32. You will see the original file names under the ‘id’ column.
  33. If new files are added to the same folder…
  34. You can simply click the ‘Re-import’ button to import all the files that match the file selection condition.
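
To make the idea of selecting files by name and merging them concrete, here is a minimal sketch in Python. It is not how Exploratory implements the import. The in-memory FOLDER, the file names, the pattern syntax, and the import_matching helper are all hypothetical; only the ‘id’ column holding the original file name comes from the slides.

```python
import csv
import fnmatch
import io

# Hypothetical in-memory "folder": file name -> file content (illustrative data only).
FOLDER = {
    "sales_2021_01.csv": "country,sales\nUS,100\nJP,80\n",
    "sales_2021_02.csv": "country,sales\nUS,120\n",
    "notes.txt": "not a data file",
}

def import_matching(folder, pattern):
    """Read every file whose name matches `pattern` and stack the rows
    into one table, recording the source file name in an 'id' column."""
    merged = []
    for name in sorted(fnmatch.filter(folder, pattern)):
        for row in csv.DictReader(io.StringIO(folder[name])):
            merged.append({"id": name, **row})
    return merged

rows = import_matching(FOLDER, "sales_*.csv")
for row in rows:
    print(row)
```

Because the selection is a condition rather than a fixed list of files, running import_matching again after a new matching file appears in the folder picks it up automatically, which is the idea behind ‘Re-import’.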
  35. Summary View - Correlation: • Aggregation • Distribution • Uncertainty
  36. The Aggregation type shows you the aggregated values (mean or ratio) for each numeric or categorical value.
  37. The Distribution type shows you the distribution of a target variable for each numeric or categorical value.
  38. The Uncertainty type shows you the mean or the ratio with its confidence interval for each numeric or categorical value, so that you can see whether a given difference is significant.
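
As a rough illustration of what the Uncertainty type conveys, the sketch below computes a mean with a 95% confidence interval for each group. The slides do not say how Exploratory calculates the interval, so the normal approximation, the mean_with_ci helper, and the sample numbers are all assumptions. For samples this small, a t-distribution would give somewhat wider intervals.

```python
from statistics import NormalDist, mean, stdev

def mean_with_ci(values, confidence=0.95):
    """Return (mean, lower, upper) using a normal approximation (an assumption)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    m = mean(values)
    half_width = z * stdev(values) / len(values) ** 0.5
    return m, m - half_width, m + half_width

# Hypothetical target variable grouped by a categorical value.
groups = {
    "Sales": [52, 55, 49, 58, 61, 50],
    "Research": [40, 42, 38, 41, 39, 43],
}

intervals = {name: mean_with_ci(vals) for name, vals in groups.items()}
for name, (m, low, high) in intervals.items():
    print(f"{name}: mean={m:.1f}, 95% CI=({low:.1f}, {high:.1f})")

# Intervals that do not overlap suggest the difference is unlikely to be noise.
sales_low = intervals["Sales"][1]
research_high = intervals["Research"][2]
overlap = sales_low <= research_high
print("intervals overlap:", overlap)
```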
  39. Analytics: • Time Series Clustering under Analytics View • Correlation - Significance Test • Variable Importance with FIRM algorithm
  40. Time Series Clustering
  41. We introduced Time Series Clustering as a Data Wrangling Step in v6.4.
  42. (No text on this slide.)
  43. With v6.5, we added Time Series Clustering under the Analytics view.
  44. When you want to cluster the data based on the similarity of the raw values, set ‘Normalize Value’ to FALSE.
  45. But sometimes you want to cluster the data based not on the similarity of the values but on the similarity of the trend (ups and downs).
  46. In such cases, you can turn on ‘Normalize Value’.
  47. Then, you can see how they are clustered under the ‘Time Series (Normalized)’ tab.
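
The difference between clustering on raw values and clustering on the trend can be sketched as follows. The slides do not state which normalization or distance Exploratory uses, so z-scoring and Euclidean distance here are assumptions, and the three series are made up.

```python
from statistics import mean, pstdev

def normalize(series):
    """Z-score a series so that only its shape (ups and downs) remains."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]

def distance(a, b):
    """Euclidean distance between two equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

# Hypothetical series: A and B share a trend at different scales,
# while C is close to A in raw values but moves the opposite way.
a = [10, 12, 14, 16]
b = [100, 120, 140, 160]
c = [16, 14, 12, 10]

# Raw values: A is nearer to C than to B.
print(distance(a, b), distance(a, c))

# Normalized values: A is nearer to B than to C.
print(distance(normalize(a), normalize(b)),
      distance(normalize(a), normalize(c)))
```

A clustering algorithm that groups by distance would therefore put A with C on raw values and A with B once the values are normalized.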
  48. Correlation: Now you can show the correlation coefficient values on the charts.
  49. The new ‘Significance’ tab shows the P value and colors each combination based on whether it is statistically significant or not.
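
For readers who want to see what a correlation coefficient and its P value represent, here is a small sketch. The slides do not say which significance test Exploratory runs. This example swaps in a permutation test because it needs nothing beyond the standard library. The column names, the data, and the 0.05 threshold are illustrative assumptions.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided P value: how often shuffled data correlates at least as strongly."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical columns.
age = [23, 31, 38, 44, 52, 57, 63, 70]
income = [30, 41, 45, 58, 60, 72, 75, 88]      # rises with age
shoe_size = [27, 24, 26, 25, 28, 24, 26, 25]   # no clear relation to age

for name, column in (("income", income), ("shoe_size", shoe_size)):
    r = pearson(age, column)
    p = permutation_p_value(age, column)
    label = "significant" if p < 0.05 else "not significant"
    print(f"age vs {name}: r={r:.2f}, p={p:.3f} ({label})")
```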
  50. FIRM - Feature Importance Ranking Measure: A new algorithm, ‘FIRM’, has been added for calculating variable importance.
  51. ‘FIRM’ scores the importance of each variable based on the variance of the predicted values.
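
One way to read "variance of the predicted values" is sketched below: fix one variable at each of its observed values, average the model's predictions over the data, and measure how much that average moves. This is only an interpretation and not Exploratory's implementation or a faithful rendering of the published FIRM definition. The model function, the data, and the importance helper are hypothetical.

```python
from statistics import mean, pvariance

def model(row):
    """Hypothetical fitted model: depends strongly on size, weakly on age."""
    return 3.0 * row["size"] - 0.1 * row["age"]

# Hypothetical training data.
data = [
    {"size": 50, "age": 10},
    {"size": 70, "age": 30},
    {"size": 90, "age": 5},
    {"size": 110, "age": 20},
]

def importance(model, data, feature):
    """Variance of the average prediction as `feature` sweeps its observed values."""
    averaged = []
    for value in sorted({row[feature] for row in data}):
        # Fix the feature at `value` for every row, then average the predictions.
        averaged.append(mean(model({**row, feature: value}) for row in data))
    return pvariance(averaged)

scores = {feature: importance(model, data, feature) for feature in ("size", "age")}
for feature, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(feature, round(score, 2))
```

A variable whose value barely changes the average prediction gets a score near zero, so it ranks low.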
  52. Chart (Visualization): • Sorting Support for Stack Bar Chart • Enhancements for Word Cloud • Table: Support Date / Time Formatting • Summarize Table / Pivot Table: Grand Total Calculation Timing • Sorting Support for Cumulative Ratio
  53. Now you can sort the bars based on the first level in the Color group.
  54. Word Cloud: Now you can assign different columns to Color and Size!
  55. You can pick a color palette from various options.
  56. Sorting Support for Cumulative Ratio
  57. Let’s say you want to know ‘Which countries make up the top 80% of your sales?’
  58. First, select the ‘Cumulative’ window calculation type and ‘Sum Ratio’.
  59. Then, select ‘Y1 Axis’ to sort the countries from the highest Sales.
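
The calculation behind the chart steps above can be sketched as follows: sort by sales from the highest, take the running total, and divide by the grand total. The sales figures are made up, and the rule used to answer the "top 80%" question (keep adding countries until the cumulative ratio reaches 0.8) is an assumption about how you would read the chart.

```python
from itertools import accumulate

# Hypothetical sales by country.
sales = {"Germany": 120, "US": 500, "Brazil": 50, "Japan": 250, "France": 80}

# Sort from the highest sales, then compute the cumulative sum ratio.
ranked = sorted(sales.items(), key=lambda kv: kv[1], reverse=True)
total = sum(sales.values())
running = accumulate(amount for _, amount in ranked)
cumulative_ratio = [(country, cum / total)
                    for (country, _), cum in zip(ranked, running)]

# Countries needed to reach 80% of total sales.
top = []
for country, ratio in cumulative_ratio:
    top.append(country)
    if ratio >= 0.8:
        break

for country, ratio in cumulative_ratio:
    print(f"{country}: {ratio:.0%}")
print("top 80%:", top)
```

Without the sort, the cumulative ratio would follow the original row order and the 80% cut-off would not identify the largest contributors.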
  60. Date Format Support for Table

  61. Select a date format from a list of formatting options.

  62. Data Wrangling: Summarize by Row with the ‘summarize_row’ function
  63. Sometimes, you want to summarize each row, for example by calculating the mean.
  64. Now you can use the ‘summarize_row’ function to do just that!
  65. summarize_row(across(where(is.numeric)), mean, na.rm=TRUE): This function lets you select columns very flexibly.
  66. summarize_row(across(where(is.numeric)), mean, na.rm=TRUE): This selects all the columns that are of numeric type.
  67. summarize_row(across(c(columnA, columnB)), mean, na.rm=TRUE): You can select multiple columns by typing the column names inside the ‘c’ function.
  68. summarize_row(across(starts_with('Sales')), mean, na.rm=TRUE): You can select all the columns whose names start with the given letters.
  69. summarize_row(across(starts_with('Sales')), mean, na.rm=TRUE): Here, this selects the columns whose names start with ‘Sales’.
  70. summarize_row(across(where(is.numeric)), mean, na.rm=TRUE): Here, I’m using the ‘mean’ function to calculate the average.
  71. summarize_row(across(where(is.numeric)), mean, na.rm=TRUE): The na.rm=TRUE argument makes it ignore NAs (NA remove) when summarizing the data.
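
To show what a row-wise summary with column selection and NA removal does, here is a minimal Python sketch. It is an analogy, not the ‘summarize_row’ implementation: the rows, the is_numeric and row_mean helpers, and the use of None to stand in for NA are all assumptions made for illustration.

```python
# Hypothetical data frame as a list of rows; None stands in for NA.
rows = [
    {"name": "A", "Sales_Q1": 10, "Sales_Q2": 20, "Cost": 6},
    {"name": "B", "Sales_Q1": 30, "Sales_Q2": None, "Cost": 9},
]

def is_numeric(column):
    """True if every non-missing value in the column is a number."""
    present = [row[column] for row in rows if row[column] is not None]
    return bool(present) and all(isinstance(v, (int, float)) for v in present)

def row_mean(row, columns, na_rm=True):
    """Mean of the selected columns for one row."""
    values = [row[c] for c in columns]
    if na_rm:
        values = [v for v in values if v is not None]
    if not values or None in values:
        return None  # nothing usable, or a missing value that was not removed
    return sum(values) / len(values)

# Column selection, loosely analogous to where(is.numeric) and starts_with('Sales').
numeric_cols = [c for c in rows[0] if is_numeric(c)]
sales_cols = [c for c in rows[0] if c.startswith("Sales")]

numeric_means = [row_mean(row, numeric_cols) for row in rows]
sales_means = [row_mean(row, sales_cols) for row in rows]
sales_means_keep_na = [row_mean(row, sales_cols, na_rm=False) for row in rows]
print(numeric_means, sales_means, sales_means_keep_na)
```

With NA removal turned off, the second row has no usable mean because one of its selected values is missing, which is why the slides pass na.rm=TRUE.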
  72. One More Thing…

  73. Exploratory Server: Subscribe

  74. Exploratory Server: With Exploratory Server, you can Share, Schedule, and Interact with all your Data and Insights (Chart, Analytics, Dashboard, Notes, and Slides).
  75. 1. Publish Data, Chart, Dashboard, Note, & Slides.

  76. 2. Open them in browser.

  77. 3. And Share!

  78. 4. Find all your insights under My Insights page.

  79. 5. Or, find them by Insight Search.

  80. Some of them are scheduled to update the data frequently.

  81. You might have data or insights that you want to be notified about when the data is updated (scheduled). You can now subscribe to the email notification.
  82. When the data is updated, you will receive an email notification.
  83. That’s it for today!

  84. EXPLORATORY Online Seminar #42: Cohort Analysis Part 1 - Layer Cake Chart, 4/28/2021 (Wed) 11AM PT
  85. (No text on this slide.)
  86. Information: Email: kan@exploratory.io; Website: https://exploratory.io; Twitter: @ExploratoryData; Seminar: https://exploratory.io/online-seminar

  87. Q & A

  88. EXPLORATORY