Business Intelligence Seminar

Pacmann AI
August 10, 2019

A brief overview of Business Intelligence, which enables you to access and analyze information so you can improve and optimize your business decisions and performance.

Transcript

  1. BUSINESS INTELLIGENCE

  2. WHAT: A brief overview of Business Intelligence, which enables you to access and analyze information so you can improve and optimize your business decisions and performance.
  3. Challenges of the current situation: why do you need Business Intelligence? Complex business problems.
  4. 01 The number of possible solutions is so large that it precludes a complete search for the best answer.
    02 The problem exists in a time-changing environment.
    03 The problem is heavily constrained.
    04 There are many (possibly conflicting) objectives.
  5. 1 Incomplete Information: The necessary data were not recorded.
    2 Noisy Data: The data contain rounded figures and estimates.
    3 Uncertainty: The data are not reliable.
  6. Academic Research (Social Science): Theory Hypothesis Data Exploration Metrics Insight
    Business Intelligence: Data Exploration Insight Hypothesis Metrics Test Decision
  7. Case of Business Intelligence Overview

  8. Motivation: “People in both fields operate with beliefs and biases. To the extent you can eliminate both and replace them with data, you gain a clear advantage.” (Michael Lewis, Moneyball: The Art of Winning an Unfair Game)
  9. None
  10. None
  11. Moneyball Case: Intelligence Part 1, Collecting Data
    Variables:
    • Team
    • League
    • Year
    • Runs Scored (RS)
    • Runs Allowed (RA)
    • Wins (W)
    • On-Base Percentage (OBP)
    • Slugging Percentage (SLG)
    • Batting Average (BA)
    • Playoffs (binary)
    • RankSeason
    • RankPlayoffs
    • Games Played (G)
    • Opponent On-Base Percentage (OOBP)
    • Opponent Slugging Percentage (OSLG)
  12. Moneyball Case: Intelligence Part 2, Information Processing. How does a team make it to the playoffs? To be exact, how many games does a team need to win to make the playoffs? Target wins: 95
  13. Moneyball Case: Intelligence Part 3, Gaining More Knowledge. How many more runs do we need to score than we allow in order to win 95 games in the regular season?
    Run Differential (RD) = Runs Scored (RS) - Runs Allowed (RA)
  14. Moneyball Case: Intelligence Part 4, Build Metrics. To achieve the goal, we need to measure how much we need of each metric: Runs Scored, Runs Allowed, and Run Differential.
  15. Moneyball Case: Intelligence Part 4, Build Metrics
    W = 80.8814 + 0.1058(RD)
    RD = (W - 80.8814) / 0.1058
    Replace W with 95:
    RD = (95 - 80.8814) / 0.1058
    RD ≈ 133
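
A minimal sketch of the inversion on this slide, using only the coefficients quoted above. The function name `required_run_differential` is hypothetical and chosen for illustration.

```python
# Coefficients as quoted on the slide: W = 80.8814 + 0.1058 * RD.
INTERCEPT = 80.8814
SLOPE = 0.1058


def required_run_differential(target_wins: float) -> float:
    """Invert W = INTERCEPT + SLOPE * RD to get the RD needed for target_wins."""
    return (target_wins - INTERCEPT) / SLOPE


rd_needed = required_run_differential(95)
print(f"Run differential needed for 95 wins: {rd_needed:.1f}")
```

Rounding the result to the nearest whole run gives the +133 target used in the following slides.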
  16. Moneyball Case: Intelligence Part 5, Optimal Decision Support. Drop BA (Batting Average)?
  17. Moneyball Case: Intelligence Part 5, Optimal Decision Support. What OOBP and OSLG do we need to achieve a run differential of +133? The OOBP and OSLG for the A's in 2001 were:
    • OOBP = 0.315
    • OSLG = 0.384
    So the estimated RA ≈ 662. The actual value of runs allowed in the 2002 season was 654.
  18. Moneyball Case: Intelligence Part 5, Optimal Decision Support. What OBP and SLG do we need to achieve a run differential of +133? The OBP and SLG for the A's in 2001 were:
    • OBP = 0.339
    • SLG = 0.432
    So the estimated RS ≈ 808. The actual value of runs scored in the 2002 season was 800.
  19. Moneyball Case: Intelligence Part 5, Optimal Decision Support
    Actual RD = 800 - 654 = 146
    W = 80.8814 + 0.1058(RD)
    W = 80.8814 + 0.1058(146)
    W ≈ 96
    The actual value of games won in the 2002 season was 103.
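
As a rough check of the arithmetic on this slide, the same linear model can be applied in the forward direction to the 2002 figures quoted above (800 runs scored, 654 runs allowed, 103 actual wins). The function and variable names are hypothetical.

```python
# Coefficients as quoted on the slides: W = 80.8814 + 0.1058 * RD.
INTERCEPT = 80.8814
SLOPE = 0.1058


def predicted_wins(runs_scored: int, runs_allowed: int) -> float:
    """Predict season wins from the run differential RD = RS - RA."""
    run_differential = runs_scored - runs_allowed
    return INTERCEPT + SLOPE * run_differential


wins_2002 = predicted_wins(runs_scored=800, runs_allowed=654)
shortfall = 103 - wins_2002  # how far the model under-predicts the actual season

print(f"Predicted wins: {wins_2002:.1f}, actual wins: 103, gap: {shortfall:.1f}")
```

The gap between the prediction and the actual win count is a reminder that the regression is an estimate, not an exact rule.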
  20. None
  21. Data-Driven Decision Making: the ability to use existing data in a new way, or to obtain data, in order to make decisions with confidence that create meaningful change. (Diagram: Problem, Problem Solving, Decision)
  22. Business Intelligence Dimensions

  23. What is Business Intelligence? Business intelligence (BI) is a set of theories, methodologies, architectures, and technologies that transform raw data into meaningful and useful information for business purposes.
    01 Lots and lots of data
    02 Processing and Aggregation
    03 Insight & Visualization
  24. Analytics & Company Maturity
    • Descriptive Analytics: What happened?
    • Diagnostic Analytics: Why did it happen?
    • Predictive Analytics: What will happen?
    • Prescriptive Analytics: How can we make it happen?
    (Chart labels: Information, Optimization; Amount of data; Early Stage of company: Seed, Growth, Beyond)
  25. Customer Development Cycle: A company is a sequence of hypothesis testing.
    https://xweb.stanford.edu/group/e145/cgi-bin/winter/drupal/upload/handouts/Four_Steps.pdf
  26. Data Acquisition

  27. Business Intelligence Data Exploration Insight Hypothesis Metrics A/B Test Decision

  28. Getting Your First Data Survey FGD

  29. Getting Your First Data: Survey. Surveys can provide timely data using specific survey questions. (“Designing and Conducting Business Surveys”, Ger Snijkers, Gustav Haraldsen, Jacqui Jones, 2013)
  30. Getting Your First Data: Survey General Flow
    Business Intelligence:
    • Design: scope, variables
    • Build: sample, questions
    • Test
    • Launch: processing, get insight
  31. Getting Your First Data: FGD. The aim is to identify a range of perspectives on a topic or issue, and to gain an understanding of it from the perspective of the participants themselves.
  32. Getting Your First Data: FGD
    1 Explore
    2 Group Process
    3 Gain Diversity
    4 Explain
    5 Evaluate
    6 Design
  33. Getting Your First Data: FGD. The market research approach typically uses focus group discussions to gain consumer views on new products or marketing campaigns (Kroll et al., 2007; Bloor et al., 2001).
  34. Getting Your First Data: Case. Problems
    • Your opinion/belief is only a hypothesis.
    • Things that look obvious from your own opinion might be a fallacy.
    • The easiest person to fool is yourself.
    • To falsify your opinion or belief, you can gather data and use it as the source of “truth” to compare against.
  35. Getting Your First Data: Case I, Motorola. Motorola’s Iridium satellite-based phone system was an engineering triumph, built to support a customer base of millions. No one asked the customers if they wanted it. Cost: $5 billion. Yes, billion. Satellites are awfully expensive.
  36. Getting Your First Data: Case II, Smokeless Cigarettes. R.J. Reynolds’ Premier and Eclipse smokeless cigarettes. The company understood what the general public (nonsmokers) wanted, but did not understand that its customers didn’t care. Cost: $450 million.
  37. Getting Your First Data: Case III, Toothpaste. A toothpaste company wants to increase its sales. Q: But how? Let’s do an FGD (Focus Group Discussion).
    Person 1: I like the aluminium tube. The toothpaste smells nice.
    Person 2: The price is affordable. I can easily squeeze the aluminium tube.
    Person 3: It tastes good. My child might eat it.
  38. Getting Your First Data: Case III, Toothpaste. Design a new tube, made of plastic, so users can’t squeeze out all the toothpaste and have to buy more.
  39. Getting Your First Data: Conclusion. You need to gather your first data, whether from an FGD, a survey, or a secondary source, and use it to falsify your opinion. Data triumphs over intuition!
  40. Data Exploration and Visualization

  41. Business Intelligence Data Exploration Insight Hypothesis Metrics Test Decision

  42. Data Exploration and Visualization: Problems
    • You already have data from a survey/FGD.
    • You want to understand your users' behavior and gain insight from your data.
    • You want to find patterns in your data.
    • These patterns can be used to form hypotheses on how to optimize your objectives.
  43. Case: NYC Taxi Trip. In this competition, Kaggle is challenging you to build a model that predicts the total ride duration of taxi trips in New York City. Your primary dataset is one released by the NYC Taxi and Limousine Commission, which includes pickup time, geo-coordinates, number of passengers, and several other variables.
  44. Data Exploration and Visualization: How
    • I will follow George Polya’s “How to Solve It” heuristics: understand your problem.
    • Ask questions!
      ◦ What are you asked to find or show?
      ◦ Can you think of a diagram that might help you understand the problem?
      ◦ Is there enough information to enable you to find a solution?
      ◦ Do you understand all the data used in this problem?
    • Do visualizations!
  45. source: https://www.kaggle.com/headsortails/nyc-taxi-eda-update-the-fast-the-curious Data Exploration and Visualization

  48. None
  54. Data Exploration and Visualization: Conclusion. Understanding the data through exploration and visualization lets us know the behavior of our users. This understanding can be turned into hypotheses, and we can test those hypotheses to optimize our decisions/objectives.
  55. Hypothesis Making

  56. Business Intelligence Data Exploration Insight Hypothesis Metrics Test Decision

  57. Hypothesis: A hypothesis is a tentative, provisional, or unconfirmed statement derived from theory or intuition that can be either verified or falsified.
  58. Hypothesis
    • The questions we ask are more important than the things we measure.
    • I will follow Judea Pearl’s hypothesis-making mechanism.
    • We will use the “How to Solve It” heuristics to find a good set of hypotheses.
  59. Hypothesis: Correlation does not imply causation. (Diagram nodes: Alcohol drinker, Smoking, Lung cancer)
  60. Case: NYC Taxi Trip. In this competition, Kaggle is challenging you to build a model that predicts the total ride duration of taxi trips in New York City. Your primary dataset is one released by the NYC Taxi and Limousine Commission, which includes pickup time, geo-coordinates, number of passengers, and several other variables.
  61. Hypothesis: Is there any hypothesis you can form based on the previous visualizations?
  62. None
  63. Hypothesis
    • Vendor 1 average speed > Vendor 0 average speed
    • Vendor 1 pickup > Vendor 0 pickup
  64. source: https://www.kaggle.com/headsortails/nyc-taxi-eda-update-the-fast-the-curious Hypothesis

  65. Hypothesis: Weekdays? Median trip duration on weekdays > median trip duration on weekends
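
One way to check a hypothesis of this shape is to split trips by day of week and compare medians. The sketch below uses a handful of made-up records (not the Kaggle data) and hypothetical variable names; it only illustrates the comparison.

```python
from datetime import datetime
from statistics import median

# Hypothetical toy records: (pickup time, trip duration in seconds).
trips = [
    ("2016-03-14 08:30:00", 900),   # Monday
    ("2016-03-15 18:05:00", 1100),  # Tuesday
    ("2016-03-16 12:00:00", 700),   # Wednesday
    ("2016-03-19 11:15:00", 600),   # Saturday
    ("2016-03-20 22:40:00", 800),   # Sunday
]

weekday_durations = []
weekend_durations = []
for pickup, duration in trips:
    # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
    if datetime.fromisoformat(pickup).weekday() >= 5:
        weekend_durations.append(duration)
    else:
        weekday_durations.append(duration)

weekday_median = median(weekday_durations)
weekend_median = median(weekend_durations)
print("weekday median:", weekday_median, "weekend median:", weekend_median)
print("hypothesis holds on this toy data:", weekday_median > weekend_median)
```

On real data, a difference in medians is only a starting point; it still has to be tested before it drives a decision.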
  66. source: https://www.kaggle.com/headsortails/nyc-taxi-eda-update-the-fast-the-curious Data Exploration and Visualization

  67. Hypothesis Direct Distance Trip Distance
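
Assuming "direct distance" here means the straight-line (great-circle) distance between the pickup and dropoff geo-coordinates, it can be sketched with the haversine formula. The coordinates below are hypothetical points in Manhattan, and the function name is illustrative.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical pickup and dropoff coordinates.
direct_km = haversine_km(40.7580, -73.9855, 40.7484, -73.9857)
print(f"Direct distance: {direct_km:.2f} km")
```

The actual trip distance along streets can never be shorter than this direct distance, which is what makes the pair useful to compare.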

  68. Hypothesis: Is there any hypothesis you can form based on the previous visualizations?
  69. Business Metrics

  70. Business Intelligence Data Exploration Insight Hypothesis Metrics Test Decision

  71. Introduction to Business Metrics
    • Measurement is the act of determining a quantitative indication of the extent, amount, dimension, capacity, or size of some attribute of a product or process (Pressman, 2000).
    • A metric is a quantitative measure of the degree to which a system, component, or process possesses a given attribute (IEEE, 1990).
  72. Introduction to Business Metrics: 1 KPI, 2 Business Metrics

  73. Business Metrics: User Metrics

  74. Business Metrics: User Metrics. Why:
    - To determine if one methodology produces a faster result than another
    - Identify hot leads
    - Improve marketing campaign effectiveness
    - Determine which marketing campaigns lead to the most profitable customers
    - Discover which features are getting the most/least use
    - Reveal technical problems which are hindering your service
  75. Business Metrics: User Activity Metrics. Metrics examples:
    - Churn Rate
    - DAU/MAU Ratio
    - Adoption Rate
    - Lifetime Value
    - Screen Flow
    - Completion Rate
    - Time Based Efficiency
    - Overall Relative Efficiency
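
Two of the metrics listed above can be sketched in a few lines. The definitions used here are common conventions rather than ones stated in the slides: DAU/MAU as average daily active users divided by monthly active users, and churn rate as customers lost in a period divided by customers at the start of it. The activity log and all names are hypothetical.

```python
# Hypothetical activity log: user id -> days of a 30-day month on which the user was active.
activity = {
    "u1": {1, 2, 3, 4, 5, 6},
    "u2": {1, 15, 30},
    "u3": {10},
    "u4": set(),  # registered but never active this month
}
DAYS_IN_MONTH = 30


def dau_mau_ratio(activity: dict, days_in_month: int) -> float:
    """Average daily active users divided by monthly active users."""
    mau = sum(1 for days in activity.values() if days)
    if mau == 0:
        return 0.0
    total_daily_actives = sum(len(days) for days in activity.values())
    average_dau = total_daily_actives / days_in_month
    return average_dau / mau


def churn_rate(customers_at_start: int, customers_lost: int) -> float:
    """Share of the starting customers lost during the period."""
    return customers_lost / customers_at_start


stickiness = dau_mau_ratio(activity, DAYS_IN_MONTH)
monthly_churn = churn_rate(customers_at_start=200, customers_lost=30)
print(f"DAU/MAU: {stickiness:.3f}, churn rate: {monthly_churn:.2%}")
```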
  76. Business Metrics: User Activity Metrics. The single most important user activity metric: Retention. (Chart: % active (monthly) vs. days from acquisition)
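
A retention curve of the kind charted here can be sketched as the share of an acquisition cohort that is still active a given number of days after acquisition. The cohort data and the function name below are hypothetical, and the slide's chart may aggregate activity differently (for example, monthly).

```python
# Hypothetical cohort: user id -> days since acquisition on which the user was active.
cohort = {
    "u1": {0, 1, 7, 30},
    "u2": {0, 1, 7},
    "u3": {0, 1},
    "u4": {0},
}


def retention_curve(cohort: dict, checkpoints: list) -> dict:
    """Return {day: share of the cohort that was active on that day}."""
    size = len(cohort)
    return {
        day: sum(1 for active_days in cohort.values() if day in active_days) / size
        for day in checkpoints
    }


curve = retention_curve(cohort, checkpoints=[0, 1, 7, 30])
for day, share in curve.items():
    print(f"day {day:>2}: {share:.0%} active")
```

Whether the curve flattens out above zero or keeps falling is the pattern the following slides interpret.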
  77. Business Metrics: User Activity Metrics. Low Retention. (Chart: % active (monthly) vs. days from acquisition)
  78. Business Metrics: User Activity Metrics. When this happens:
    - You probably don’t have a product market fit.
    - Revise and build a better product.
    - Your churn rate is a big problem.
    - You have a loyal customer.
    - The retention line running parallel to the X axis shows that it converges to a number.
    - Don’t bias yourself: 20% retention measured daily, in the airline market in Indonesia, might be a big number. People probably fly only twice a year on average.
  79. Business Metrics: User Activity Metrics. High Retention. (Chart: % active (monthly) vs. days from acquisition)
  80. Business Metrics: User Activity Metrics. When this happens:
    - You probably have a product market fit.
    - Your churn rate is not a big problem.
    - You have a loyal customer.
    - The retention line running parallel to the X axis shows that it converges to a number.
    - Don’t bias yourself: 95% retention for Nasi Goreng in NY, with 100 people, is still a small market.
  81. Business Metrics: User Activity Metrics. Death of a product. (Chart: % active (monthly) vs. days from acquisition)
  82. Business Metrics: User Activity Metrics. When this happens:
    - YOU DON’T HAVE A PRODUCT MARKET FIT
  83. Business Metrics: User Activity Metrics. Death of a product. (Chart: % active (monthly) vs. days from acquisition)
  84. Business Metrics: User Activity Metrics. When this happens:
    - Marketing might help you in the short run, but retention will converge to its natural rate once the marketing campaign is gone.
    - Yes, it can prolong your product life cycle (and the suffering).
    - Still, the time you buy with a marketing campaign needs to be translated into a new, better product, so that users might stay.
  85. Business Metrics: Conclusion. It is important to track business metrics to know the current state of the business.
  86. A/B Testing

  87. Business Intelligence Data Exploration Insight Hypothesis Metrics Test Decision

  88. A/B Testing: Problems
    • We want to measure the effect of our product/hypothesis/experiment on some business metrics.
    • If the effect is large enough, it can increase our metrics and achieve our business objectives.
  89. A/B Testing: draw a sample from the population, then separate the sample into two sets, A and B.
  90. A/B Testing Sample A Sample B New Experiment Treatment
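
A minimal sketch of the mechanics: randomly split a sample into two groups, then compare a conversion metric between them. The two-proportion z-test is one common choice of test and is an assumption here, not something the slides specify. The conversion counts and all names are hypothetical.

```python
import math
import random


def split_sample(user_ids, seed=0):
    """Randomly split a sample into two equally sized groups, A and B."""
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


group_a, group_b = split_sample(range(1000))
# Hypothetical outcome: 50 of 500 convert in A (control), 75 of 500 in B (treatment).
z, p_value = two_proportion_z_test(conv_a=50, n_a=500, conv_b=75, n_b=500)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

Random assignment is what lets a difference between the groups be attributed to the treatment rather than to who ended up in each group.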

  91. Optimize Decision Making: A/B Testing Case: Scurvy
    • First controlled experiment / randomized trial for medical purposes
    • Scurvy is a disease that results from vitamin C deficiency
    • It killed over 100,000 people in the 16th-18th centuries, mostly sailors
    • Lord Anson’s circumnavigation voyage from 1740 to 1744 started with 1,800 sailors and only about 200 returned; most died from scurvy
    • Dr. James Lind noticed the lack of scurvy on Mediterranean ships
    • He gave some sailors limes (treatment), while others ate their regular diet (control)
  92. A/B Testing. Source: https://exp-platform.com/Documents/2015-08OnlineControlledExperimentsKDDKeynoteNR.pdf

  93. A/B Testing. Source: https://exp-platform.com/Documents/2015-08OnlineControlledExperimentsKDDKeynoteNR.pdf

  94. Quasi-Experimental Testing: Which side of airplanes do we need to strengthen?
  95. A/B Testing and Metrics

  96. A/B Testing and Metrics. Source: https://web.stanford.edu/class/ee380/Abstracts/140129-slides-Machine-Learning-and-Econometrics.pdf
  97. A/B Testing: Conclusion. It is important to measure the impact of our hypothesis/product on business metrics.
  98. Decision Making

  99. Business Intelligence Data Exploration Insight Hypothesis Metrics Test Decision

  100. Decision Making
    Problem: Our objective is to choose, from a range of values, the value that optimizes our business metrics.
    How:
    Step 1: Get data.
    Step 2: Define a loss function.
    Step 3: Build a model with a tunable parameter.
    Step 4: Optimize the loss function given a range of parameter values.
    Step 5: Choose the best parameter.
  101. Decision Making: Case. Let’s say I am trying to decide on a price at which to list a used phone I want to sell. In this case I may denote my decision space as the entire non-negative real line, such that a ∈ [0, +∞).
  102. Step 1: Get Data Decision Making

  103. Decision Making, Step 2: Define a loss function. How do we figure out the loss associated with individual decisions when we don’t even know the information we want to use to make a decision? The answer is that we turn to probability theory and instead calculate the “Expected Loss” we would incur if we chose a given action, given our beliefs (our probability distribution) about θ.
  104. Decision Making, Step 3: Build a model. Source: http://www.statsathome.com/2017/10/12/bayesian-decision-theory-made-ridiculously-simple/

  105. Decision Making, Step 4: Optimize the loss with Monte Carlo simulation. Source: http://www.statsathome.com/2017/10/12/bayesian-decision-theory-made-ridiculously-simple/
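
The expected-loss idea can be sketched with a small Monte Carlo simulation for the used-phone pricing case. Everything specific below is an assumption made for illustration: θ is taken to be the buyer's maximum willingness to pay, our belief about θ is a normal distribution with mean 80 and standard deviation 15, and the loss is negative revenue. This is not the model from the cited source, and the numbers are not meant to reproduce its figures.

```python
import random


def loss(price: float, willingness_to_pay: float) -> float:
    """Negative revenue: we earn `price` only if the buyer is willing to pay it."""
    return -price if price <= willingness_to_pay else 0.0


def expected_loss(price: float, samples: list) -> float:
    """Monte Carlo estimate of the expected loss for one candidate price."""
    return sum(loss(price, theta) for theta in samples) / len(samples)


# Draw samples from the assumed belief about theta (hypothetical mean and spread).
rng = random.Random(42)
theta_samples = [max(0.0, rng.gauss(80, 15)) for _ in range(10_000)]

# Search a grid of candidate prices and keep the one with the lowest expected loss.
candidate_prices = range(0, 151)
best_price = min(candidate_prices, key=lambda price: expected_loss(price, theta_samples))
print("Price with the lowest expected loss:", best_price)
```

A different belief about θ or a different loss function would move the optimum, which is why both need to be stated explicitly before optimizing.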
  106. Decision Making, Step 5: Get the optimal parameter. For a given phone, the optimal price is $71. Source: http://www.statsathome.com/2017/10/12/bayesian-decision-theory-made-ridiculously-simple/
  107. Decision Making: Conclusion. Bayesian decision making helps us optimize a decision by choosing the best parameter from a large set of candidate values.
  108. Case: Customer Development Cycle

  109. Death Spiral: Product Development Cycle https://web.stanford.edu/group/e145/cgi-bin/winter/drupal/upload/handouts/Four_Steps.pdf

  110. Customer Development Cycle: A company is a sequence of hypothesis testing.
    https://xweb.stanford.edu/group/e145/cgi-bin/winter/drupal/upload/handouts/Four_Steps.pdf
  111. Customer Development Cycle: Case

  112. Customer Development Cycle
    Product Hypothesis: Do they need a checkins-photo?
    1. Do FGD/Customer Survey
    2. Do data exploration
    3. Build a prototype
    4. Test the prototype with users
    Data Exploration Insight Hypothesis Metrics Test Decision
  113. Customer Development Cycle
    Market Hypothesis: Do we need to focus on photo only?
    1. Do FGD/Customer Survey
    2. Do data exploration
    3. Build a prototype
    4. Test the prototype with users
    Data Exploration Insight Hypothesis Metrics Test Decision
  114. Customer Development Cycle
    Feature Question: What kind of features would make users use our product?
    1. Do FGD/Customer Survey
    2. Do data exploration
    3. Build a prototype
    4. Test the prototype with users
    Data Exploration Insight Hypothesis Metrics Test Decision
  115. Customer Development Cycle
    Marketing Question: Do we need to rebrand our app?
    1. Do FGD/Customer Survey
    2. Do data exploration
    3. Build a prototype
    4. Test the prototype with users
    Data Exploration Insight Hypothesis Metrics Test Decision
  116. EOP Contact: business@pacmannai.com

  117. We believe everyone can be a Data Scientist. Contact: business@pacmannai.com

  118. Upcoming ML & BI Class Contact: business@pacmannai.com

  119. None
  120. Check our website: www.pacmann.ai Contact: business@pacmannai.com

  123. Past Classes Contact: business@pacmannai.com

  124. Pacmann AI Classes Quality: State of the Art of Machine Learning Research, Practical Skills, Theoretical Understanding. > 50 institutions, 400++ alumni, 6 classes in the past. business@pacmannai.com https://pacmann.ai
  125. 60 participants 1 week 28 institutions business@pacmannai.com https://pacmann.ai Previous Classes

  126. Previous Classes business@pacmannai.com https://pacmann.ai 96 participants 2 weeks 51 institutions

  127. Previous Classes business@pacmannai.com https://pacmann.ai 61 participants 8 weeks 48 institutions

  128. Previous Classes business@pacmannai.com https://pacmann.ai 59 participants 8 weeks 48 institutions

  129. Previous Classes business@pacmannai.com https://pacmann.ai 44 participants 2 weeks 33 institutions

  130. Previous Classes Facts & Figure business@pacmannai.com https://pacmann.ai Level of Education

  131. Previous Classes Facts & Figure business@pacmannai.com https://pacmann.ai Field of Work

  132. Previous Participants business@pacmannai.com https://pacmann.ai

  133. Contact Email: business@pacmannai.com Whatsapp Business: +62 812-8122-1707