An Introduction to Exploratory v5.3

Kan Nishida
August 07, 2019

Introducing new features and critical enhancements of Exploratory v5.3.

Transcript

  1. Exploratory Seminar: Exploratory v5.3

  2. EXPLORATORY

  3. Speaker: Kan Nishida (@KanAugust), co-founder/CEO, Exploratory. Summary: At the beginning of 2016, Kan launched Exploratory, Inc. to democratize Data Science. Prior to Exploratory, Kan was a director of product development at Oracle, leading teams building various Data Science products in areas including Machine Learning, BI, Data Visualization, Mobile Analytics, and Big Data. While at Oracle, Kan also provided training and consulting services to help organizations transform with data.
  4. Mission Make Data Science Available for Everyone

  5. The Third Wave: Data Science is not just for Engineers and Statisticians. Exploratory makes it possible for Everyone to do Data Science.
  6. Democratization of Data Science: First Wave (1976), Second Wave (2000), Third Wave (2016). Theme: Monetization, Commoditization, Democratization. Users: Statisticians, Data Scientists, Business Users. Algorithms / Experience / Tools: Proprietary, Open Source, UI & Programming, Programming, Open Source, UI & Automation.
  7. Exploratory Data Analysis: Questions, Communication (Dashboard, Note, Slides), Data Access, Data Wrangling, Visualization, Analytics (Statistics / Machine Learning)
  8. Exploratory v5.3

  10. Analytics
    • Test Mode Support for Statistical / Machine Learning Models
    • Cut Point for Binary Classification (TRUE/FALSE)
    • Updated Default Setting - Random Forest, Linear Regression, Logistic Regression
    • Prophet - Extra Regressor, Daily Seasonality, Multiplicative, Change Point Range
    • Power Analysis, Effect Size
    • Support for Wilcoxon Test and Kruskal-Wallis Test
    • GLM - Negative Binomial
  12. Test Mode
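
A minimal sketch of the idea behind a test mode, assuming it means holding out part of the data, fitting the model on the rest, and comparing the metric on both parts. The data, the 30% test ratio, and the helper names (`fit_line`, `rmse`, `train_test_split`) are illustrative assumptions, not Exploratory's implementation.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(xs, ys, intercept, slope):
    """Root mean squared error of the fitted line on (xs, ys)."""
    errors = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return (sum(errors) / len(errors)) ** 0.5


def train_test_split(rows, test_ratio, seed=0):
    """Shuffle rows reproducibly and hold out a test_ratio share as test data."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical data: y is roughly 2x + 1 with some noise.
rng = random.Random(42)
rows = [(x, 2 * x + 1 + rng.gauss(0, 1)) for x in range(100)]

train, test = train_test_split(rows, test_ratio=0.3)
intercept, slope = fit_line([x for x, _ in train], [y for _, y in train])

# The model never saw the test rows, so the test RMSE estimates
# how well it generalizes.
train_rmse = rmse([x for x, _ in train], [y for _, y in train], intercept, slope)
test_rmse = rmse([x for x, _ in test], [y for _, y in test], intercept, slope)
print(f"train RMSE={train_rmse:.3f}, test RMSE={test_rmse:.3f}")
```

A test RMSE that is much worse than the training RMSE would suggest overfitting.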

  17. Cut Point for Binary Classification (TRUE / FALSE)

  18. Cut Point: 0.5

  19. Cut Point: 0.8

  20. Cut Point: 0.3
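
The effect of moving the cut point can be sketched as follows. This is an illustration with made-up probabilities and outcomes; treating a probability equal to the cut point as TRUE is an assumption, and the helper names are hypothetical.

```python
def classify(probabilities, cut_point):
    """Label a prediction TRUE when its probability is >= cut_point."""
    return [p >= cut_point for p in probabilities]


def precision_recall(actual, predicted):
    """Precision and recall for the TRUE class."""
    tp = sum(a and p for a, p in zip(actual, predicted))
    fp = sum((not a) and p for a, p in zip(actual, predicted))
    fn = sum(a and (not p) for a, p in zip(actual, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical predicted probabilities and actual outcomes.
probabilities = [0.95, 0.85, 0.70, 0.60, 0.45, 0.35, 0.20, 0.10]
actual = [True, True, True, False, True, False, False, False]

for cut_point in (0.5, 0.8, 0.3):
    predicted = classify(probabilities, cut_point)
    precision, recall = precision_recall(actual, predicted)
    print(f"cut point {cut_point}: {sum(predicted)} TRUE, "
          f"precision={precision:.2f}, recall={recall:.2f}")
```

In this toy data, raising the cut point to 0.8 yields fewer TRUE predictions with higher precision, while lowering it to 0.3 yields more TRUE predictions with higher recall.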

  22. Random Forest: Boruta

  23. Linear Regression: Variable Importance

  24. Logistic Regression: Average Marginal Effect

  26. Daily Seasonality

  27. Change Point Period

  28. Change Point : 0.8 (Default)

  29. Change Point : 0.5
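
A rough sketch of what the change point range controls, assuming the behavior described in the Prophet documentation: candidate change points are spaced evenly over the first share of the history (0.8 by default). The function name and the simplified index arithmetic are illustrative assumptions, not Prophet's actual code.

```python
def candidate_changepoints(n_points, changepoint_range=0.8, n_changepoints=25):
    """Indexes of evenly spaced candidate change points within the first
    changepoint_range share of a history of n_points observations."""
    hist_size = int(n_points * changepoint_range)
    n = min(n_changepoints, hist_size - 1)
    if n <= 0:
        return []
    step = (hist_size - 1) / n
    # Index 0 is skipped: the first observation is not a candidate.
    return [round(i * step) for i in range(1, n + 1)]


default_range = candidate_changepoints(100, changepoint_range=0.8)
narrow_range = candidate_changepoints(100, changepoint_range=0.5)

print("last candidate with 0.8:", max(default_range))
print("last candidate with 0.5:", max(narrow_range))
```

With a range of 0.5, no trend change is allowed in the second half of the history, so the most recent trend is extrapolated from older data.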

  30. Additive vs. Multiplicative

  31. Additive

  32. Multiplicative

  33. Additive

  34. Multiplicative
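
The difference between the two seasonality modes can be sketched with a toy series. This assumes the usual formulation in which additive seasonality adds a fixed-size swing to the trend, while multiplicative seasonality scales the trend by (1 + seasonal effect); the trend, period, and amplitude below are hypothetical.

```python
import math


def trend(t):
    """Hypothetical linear trend."""
    return 100 + 10 * t


def seasonal(t, period=12, amplitude=0.2):
    """Hypothetical seasonal effect, oscillating between -0.2 and +0.2."""
    return amplitude * math.sin(2 * math.pi * t / period)


def additive(t, scale=100):
    """Seasonal swing has a fixed size regardless of the trend level."""
    return trend(t) + scale * seasonal(t)


def multiplicative(t):
    """Seasonal swing grows in proportion to the trend level."""
    return trend(t) * (1 + seasonal(t))


# Compare the seasonal peak early (t=3) and one period later (t=15).
for t in (3, 15):
    print(f"t={t}: trend={trend(t)}, "
          f"additive swing={additive(t) - trend(t):.1f}, "
          f"multiplicative swing={multiplicative(t) - trend(t):.1f}")
```

The additive swing stays at about 20 in both periods, while the multiplicative swing grows from about 26 to about 50 as the trend rises.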

  36. Effect Size & Power

  37. Power Analysis

  38. Sample Size Estimation
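
How effect size, power, and sample size relate can be sketched for a two-sided, two-sample comparison of means. This uses the normal approximation rather than the exact t-distribution calculation, so the numbers may differ slightly from what a statistics package reports; the function names and sample data are illustrative assumptions.

```python
import math
from statistics import NormalDist, mean, stdev


def cohens_d(group_a, group_b):
    """Effect size: difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * stdev(group_a) ** 2
                  + (n_b - 1) * stdev(group_b) ** 2) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def power_two_sample(d, n_per_group, alpha=0.05):
    """Approximate power to detect effect size d with n_per_group rows per group."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(d) * math.sqrt(n_per_group / 2) - z_alpha)


def sample_size_two_sample(d, power=0.8, alpha=0.05):
    """Approximate rows needed per group to reach the requested power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


# Hypothetical measurements for two groups.
d = cohens_d([5, 6, 7, 8, 9], [3, 4, 5, 6, 7])
print(f"effect size d = {d:.3f}")
print(f"power with 10 rows per group = {power_two_sample(d, 10):.3f}")
print(f"rows per group for d = 0.5 at 80% power = {sample_size_two_sample(0.5)}")
```

Smaller effect sizes need many more rows: halving the effect size roughly quadruples the required sample size.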

  40. Wilcoxon Test & Kruskal-Wallis Test
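
Both tests work on ranks rather than raw values, which is why they suit data that is not normally distributed. As an illustration, the Kruskal-Wallis H statistic can be computed by hand as below. This sketch omits the tie correction factor and the p-value lookup, and the helper names and data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (without the tie correction factor)."""
    pooled = [v for g in groups for v in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical measurements for three groups.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(f"H = {h:.2f}")
```

A larger H means the groups' rank sums differ more than chance alone would suggest; H is compared against a chi-squared distribution with (number of groups - 1) degrees of freedom.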

  42. GLM - Negative Binomial

  43. Chart
    • Categorization of Numeric Values (Binning)
    • Limit Values - Top / Bottom N, Condition
    • Log, Absolute, Weekend
    • Pivot Table - Post Calculation, 30 Columns for Values
    • Re-order - X, Y, Legend
  49. Categorize Numbers with Equal Width

  50. Categorize Numeric Values for X-Axis & Color

  51. Manually Set Cut Points

  52. Add Label Text
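
The categorization options above (equal width, manually set cut points, label text) can be sketched as follows. Making each bin closed on its right edge is an assumption, and the function names and values are hypothetical.

```python
import bisect


def equal_width_cut_points(values, n_bins):
    """Interior cut points that split [min, max] into n_bins equal-width bins."""
    low, high = min(values), max(values)
    width = (high - low) / n_bins
    return [low + width * i for i in range(1, n_bins)]


def categorize(values, cut_points, labels=None):
    """Assign each value to a bin; a value equal to a cut point falls in the lower bin."""
    bins = [bisect.bisect_left(cut_points, v) for v in values]
    if labels is None:
        return bins
    return [labels[b] for b in bins]


values = [0, 10, 25, 40, 55, 70, 100]

# Equal width: four bins of width 25.
auto_cuts = equal_width_cut_points(values, 4)
print(auto_cuts, categorize(values, auto_cuts))

# Manually set cut points, with label text for each bin.
print(categorize(values, [20, 60], labels=["Low", "Mid", "High"]))
```

Equal-width bins are sensitive to extreme values, since a single outlier stretches the range and squeezes most rows into one bin.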

  53. Categorization with Outlier

  54. Outlier by All Setting

  55. Outlier by X-Axis Setting

  57. Limit X-Axis Values

  58. Top / Bottom N

  59. You can use a column that is not included in the chart.
  60. Condition

  61. Slovenia’s Return Rate is 100%! But this could be because there are not many orders. The scale is between 0% and 100%.
  62. Sure enough! Slovenia is gone, and it’s now easier to see the differences among the other countries. The scale is now between 0% and 11%.
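
The Slovenia example can be reproduced with a small sketch: limit the chart by a condition on a column that is not plotted (the number of orders), then optionally keep only the top N. The order records, the threshold of 50, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical order records: (country, returned).
orders = (
    [("Slovenia", True)] * 2
    + [("Germany", True)] * 11 + [("Germany", False)] * 89
    + [("France", True)] * 8 + [("France", False)] * 92
    + [("Japan", True)] * 5 + [("Japan", False)] * 95
)


def summarize(orders):
    """Return {country: (number_of_orders, return_rate)}."""
    counts = defaultdict(lambda: [0, 0])  # [orders, returns]
    for country, returned in orders:
        counts[country][0] += 1
        counts[country][1] += int(returned)
    return {c: (n, r / n) for c, (n, r) in counts.items()}


def limit_by_condition(summary, min_orders):
    """Keep countries with at least min_orders orders."""
    return {c: v for c, v in summary.items() if v[0] >= min_orders}


def top_n(summary, n):
    """Keep the n countries with the highest return rate."""
    ranked = sorted(summary.items(), key=lambda item: item[1][1], reverse=True)
    return dict(ranked[:n])


summary = summarize(orders)
print("Top 1 before the condition:", list(top_n(summary, 1)))

filtered = limit_by_condition(summary, min_orders=50)
for country, (n, rate) in filtered.items():
    print(f"{country}: {n} orders, return rate {rate:.0%}")
```

Without the condition, a country with two orders dominates the chart; with it, the return-rate scale shrinks to the range of the remaining countries.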
  64. Sales

  65. log(Sales)

  66. Pivot Table - More Values!

  67. Pivot Table - Post Calculation for Grand Total

  68. Data Wrangling

  69. Data Wrangling
    • New UI - Summarize, Pivot, Rename
    • Updates for Merge Data Frames
    • Mode function
    • str_logical function
    • str_extract_inside function
  71. Summarize

  72. Custom Function in Summarize

  73. Pivot

  74. Rename

  76. Merge - Bind Rows

  79. Demo

  80. Q & A

  81. Contact
    • Email: kan@exploratory.io
    • Home Page: https://exploratory.io
    • Twitter: @KanAugust
    • Online Seminar: https://exploratory.io/online-seminar