Pacmann AI
August 10, 2019

A brief introduction to Business Intelligence, which enables you to access and analyze information so you can improve and optimize your business decisions and performance.


## Transcript

2. ### A brief introduction to Business Intelligence, which enables you to access

and analyze information so you can improve and optimize your business decisions and performance. WHAT

4. ### The number of possible solutions is so large that it

precludes a complete search for the best answer (01). The problem exists in a time-changing environment (02). The problem is heavily constrained (03). There are many (possibly conflicting) objectives (04).
5. ### Incomplete Information (1): The necessary data were not recorded. Uncertainty (3): The

data are not reliable. Noisy Data (2): The data contain rounded figures and estimates.
6. ### Academic Research (Social Science): Theory, Hypothesis, Data Exploration, Metrics, Insight

Business Intelligence: Data Exploration, Insight, Hypothesis, Metrics, Test, Decision

8. ### Motivation “People in both ﬁelds operate with beliefs and biases.

To the extent you can eliminate both and replace them with data, you gain a clear advantage” Michael Lewis, Moneyball: The Art of Winning an Unfair Game
9. None
10. None
11. ### Variables: • Team • League • Year • Runs Scored

(RS) • Runs Allowed (RA) • Wins (W) • On-Base Percentage (OBP) • Slugging Percentage (SLG) • Batting Average (BA) • Playoffs (binary) • RankSeason • RankPlayoffs • Games Played (G) • Opponent On-Base Percentage (OOBP) • Opponent Slugging Percentage (OSLG) Moneyball Case: Intelligence Part 1 Collecting Data
12. ### Moneyball Case: Intelligence Part 2 Information Processing How does a

team make it to the playoffs? To be exact, how many wins does it take to make the playoffs? Target Wins: 95
13. ### Moneyball Case: Intelligence Part 3 Gaining More Knowledge How many

more runs do we need to score than we allow in order to win 95 games in the regular season? Run Differential (RD) = Runs Scored (RS) - Runs Allowed (RA)
14. ### Moneyball Case: Intelligence Part 4 Build a Metric To achieve

the goal, we need to measure how much [metric] we need for Runs Scored, Runs Allowed, and Run Differential.
15. ### Moneyball Case: Intelligence Part 4 Build a Metric W =

80.8814 + 0.1058(RD). Solving for RD: RD = (W - 80.8814) / 0.1058. Replacing W with 95: RD = (95 - 80.8814) / 0.1058, so RD ≈ 133.
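The inversion on this slide can be checked with a few lines of Python. This is a minimal sketch that uses only the intercept and slope quoted on the slide; the function names are made up for illustration.

```python
# Coefficients as quoted on the slide: W = 80.8814 + 0.1058 * RD
INTERCEPT = 80.8814
SLOPE = 0.1058


def wins_from_run_diff(rd: float) -> float:
    """Predicted regular-season wins for a given run differential."""
    return INTERCEPT + SLOPE * rd


def run_diff_for_wins(target_wins: float) -> float:
    """Invert the linear model: run differential needed for a win target."""
    return (target_wins - INTERCEPT) / SLOPE


required_rd = run_diff_for_wins(95)
print(f"Required run differential: {required_rd:.1f}")  # prints 133.4
```

Rounding the result gives the +133 target used on the following slides.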
16. ### Moneyball Case: Intelligence Part 5 Optimal Decision Support Drop BA

(Batting Average)?
17. ### What OOBP and OSLG do we need to achieve a

run differential of +133? The OOBP and OSLG for the A's in 2001 were: • OOBP = 0.315 • OSLG = 0.384 So the estimated RA ≈ 662. The actual value of runs allowed in the 2002 season was 654. Moneyball Case: Intelligence Part 5 Optimal Decision Support
18. ### Moneyball Case: Intelligence Part 5 Optimal Decision Support What OBP

and SLG do we need to achieve a run differential of +133? The OBP and SLG for the A's in 2001 were: • OBP = 0.339 • SLG = 0.432 So the estimated RS ≈ 808. The actual value of runs scored in the 2002 season was 800.
19. ### Moneyball Case: Intelligence Part 5 Optimal Decision Support Actual RD

= 800 - 654 = 146. W = 80.8814 + 0.1058(RD), so W = 80.8814 + 0.1058(146) and W ≈ 96. The actual value of games won in the 2002 season was 103.
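As a quick check of this slide's arithmetic, the sketch below plugs the actual 2002 runs scored (800) and runs allowed (654) into the same linear model. Note that 800 - 654 evaluates to 146. The function name is hypothetical; the coefficients are the ones quoted in the deck.

```python
def predicted_wins(runs_scored, runs_allowed, intercept=80.8814, slope=0.1058):
    """Return (run differential, predicted wins) under W = intercept + slope * RD."""
    rd = runs_scored - runs_allowed
    return rd, intercept + slope * rd


rd_2002, wins_2002 = predicted_wins(800, 654)
print(rd_2002)               # 146
print(round(wins_2002, 2))   # 96.33

# Gap between the model and the 103 games actually won
print(round(103 - wins_2002, 2))  # 6.67
```

The model under-predicts the 103 actual wins by roughly seven games, which is the gap the slide points out.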
20. None
21. ### Data-Driven for Decision Making The ability to use existing data

in a new way or obtain data to make decisions with conﬁdence that creates meaningful change Problem Decision PROBLEM SOLVING

23. ### What is Business Intelligence? Business intelligence (BI) is a set

of theories, methodologies, architectures, and technologies that transform raw data into meaningful and useful information for business purposes. 01: Lots and lots of data. 02: Processing and Aggregation. 03: Insight & Visualization.
24. ### Descriptive Analytics: What happened? Diagnostic Analytics: Why did it happen? Predictive Analytics: What will happen?

Prescriptive Analytics: How can we make it happen? The scale runs from Information to Optimization. Analytics & Company Maturity (chart): amount of data versus stage of company (Early, Seed, Growth, Beyond).

29. ### Surveys can provide timely data using specific survey questions. (“Designing

and Conducting Business Surveys”, Ger Snijkers, Gustav Haraldsen, Jacqui Jones, 2013) Getting Your First Data: Survey
30. ### Getting Your First Data: Survey General Flow Business Intelligence Design:

scope, variables. Build: sample, questions. Test. Launch: processing, get insight.
31. ### Getting Your First Data: FGD (Focus Group Discussion) To identify a range of

perspectives on some topic/issue, to gain an understanding of the topic/issue from the perspective of participants themselves
32. ### Getting Your First Data: FGD 1. Explore 2. Group Process 3. Gain Diversity

4. Explain 5. Evaluate 6. Design
33. ### The market research approach typically uses focus group discussions to

gain consumer views on new products or marketing campaigns (Kroll et al., 2007; Bloor et al., 2001). Getting Your First Data: FGD
34. ### Problems • Your opinion/belief is only a hypothesis. • The

obvious things based on your opinion might be a fallacy. • The first person you can easily fool is yourself. • To falsify your opinion or belief, you can gather data and compare against it as a source of “truth”. Getting Your First Data: Case
35. ### Case I : Motorola Motorola’s Iridium satellite-based phone system. Engineering

triumph and built to support a customer base of millions. No one asked the customer if they wanted it. Cost \$5 billion. Yes, billion. Satellites are awfully expensive. Getting Your First Data: Case
36. ### Case II : Smokeless Cigarettes R.J. Reynolds’ Premier and Eclipse

smokeless cigarettes. Understood what the general public (nonsmokers) wanted, but did not understand that their customers didn’t care. Cost: \$450 million Getting Your First Data: Case
37. ### Case III : Toothpaste A toothpaste company wants to increase

its sales. Q: But how? Let’s do an FGD. Focus Group Discussion Person 1: I like the aluminium tube. The toothpaste smells nice. Person 2: The price is affordable. I can easily squeeze the aluminium tube. Person 3: It tastes good. My child might eat it. Getting Your First Data: Case
38. ### Case III : Toothpaste Design a new tube, made of

plastic, so users can’t squeeze out all the toothpaste and have to buy more. Getting Your First Data: Case
39. ### Getting Your First Data Conclusion You need to gather your

first data, whether from an FGD, a survey, or a secondary source, and use it to try to falsify your opinion. Data triumphs over intuition!

42. ### Problems • You already have data from survey/FGD. • You

want to understand your users' behavior and gain insight from your data. • You want to find patterns in your data. • These patterns can be used to form hypotheses on how to optimize your objectives. Data Exploration and Visualization
43. ### Case: NYC Taxi Trip In this competition, Kaggle is challenging

you to build a model that predicts the total ride duration of taxi trips in New York City. Your primary dataset is one released by the NYC Taxi and Limousine Commission, which includes pickup time, geo-coordinates, number of passengers, and several other variables.
44. ### How • I will follow George Polya’s “How to Solve

It” heuristics: Understand your problem. • Ask questions! ◦ What are you asked to ﬁnd or show? ◦ Can you think of a diagram that might help you understand the problem? ◦ Is there enough information to enable you to ﬁnd a solution? ◦ Do you understand all the data used in this problem? • Do visualizations! Data Exploration and Visualization

48. None

54. ### Conclusion Understanding the data through exploration and visualization lets us know

the behavior of our users. This understanding can be turned into hypotheses, and we can test those hypotheses to optimize our decisions/objectives. Data Exploration and Visualization

57. ### Hypothesis A hypothesis is a tentative, provisional, or unconﬁrmed statement

derived from theory or intuition that can be either verified or falsified.
58. ### • The questions we ask are more important than the

things we measure. • I will follow Judea Pearl's mechanism for forming hypotheses. • We will use the “How to Solve It” heuristics as the mechanism for finding a good set of hypotheses. Hypothesis

causation
60. ### Case: NYC Taxi Trip In this competition, Kaggle is challenging

you to build a model that predicts the total ride duration of taxi trips in New York City. Your primary dataset is one released by the NYC Taxi and Limousine Commission, which includes pickup time, geo-coordinates, number of passengers, and several other variables.
61. ### Is there any hypothesis you can form based on previous

visualizations? Hypothesis
62. None
63. ### Hypothesis Vendor 1 average speed > Vendor 0 average speed

Vendor 1 pickup > Vendor 0 Pickup
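A hypothesis such as "Vendor 1 average speed > Vendor 0 average speed" only becomes testable once it is written as a computation on the data. The sketch below shows one way to do that. The trip records are invented toy values, not the Kaggle data, and the tuple layout `(vendor_id, distance_km, duration_seconds)` is an assumption made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Toy trips, invented for illustration: (vendor_id, distance_km, duration_seconds)
trips = [
    (0, 2.0, 600),
    (0, 5.0, 1200),
    (1, 3.0, 600),
    (1, 6.0, 1200),
]


def mean_speed_by_vendor(trips):
    """Average trip speed in km/h per vendor, skipping zero-duration trips."""
    speeds = defaultdict(list)
    for vendor_id, distance_km, duration_s in trips:
        if duration_s > 0:
            speeds[vendor_id].append(distance_km * 3600 / duration_s)
    return {vendor: mean(values) for vendor, values in speeds.items()}


speeds = mean_speed_by_vendor(trips)
hypothesis_holds = speeds[1] > speeds[0]
print(speeds, hypothesis_holds)
```

Comparing two sample means only describes the data at hand; whether the difference is more than noise is a question for a statistical test.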

weekends

68. ### Is there any hypothesis you can form based on previous

visualizations? Hypothesis

71. ### Introduction to Business Metrics Measurement is the act of determining

a quantitative indication of the extent, amount, dimension, capacity, or size of some attribute of a product or process (Pressman, 2000). A metric is a quantitative measure of the degree to which a system, component, or process possesses a given attribute (IEEE, 1990).

74. ### Business Metrics: User Metrics Why: - To determine if one

methodology produces a faster result than another - Identify hot leads - Improve marketing campaign effectiveness - Determine which marketing campaigns lead to the most proﬁtable customers - Discover which features are getting the most/least use - Reveal technical problems which are hindering your service
75. ### Business Metrics: User Activity Metrics Metrics Examples: - Churn Rate

- DAU/MAU Ratio - Adoption Rate - Lifetime Value - Screen Flow - Completion Rate - Time Based Efﬁciency - Overall Relative Efﬁciency
76. ### The single most important User Activity Metric: Retention. Business

Metrics: User Activity Metrics. Chart axes: % active (monthly) versus days from acquisition.
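A retention curve of this kind can be computed from two pieces of information per user: the acquisition day and the days on which the user was active. The following is a minimal sketch on made-up data. It measures activity on the exact day offset, which is only one of several common retention definitions, and the names `signup_day` and `active_days` are hypothetical.

```python
def retention_curve(signup_day, active_days, max_offset):
    """Share of the cohort active exactly `offset` days after acquisition."""
    n_users = len(signup_day)
    if n_users == 0:
        raise ValueError("cohort is empty")
    curve = []
    for offset in range(max_offset + 1):
        active = sum(
            1
            for user, start in signup_day.items()
            if start + offset in active_days.get(user, set())
        )
        curve.append(active / n_users)
    return curve


# Toy cohort, invented for illustration (days are plain integers)
signup_day = {"a": 0, "b": 0, "c": 1, "d": 1}
active_days = {"a": {0, 1, 2, 3}, "b": {0, 1}, "c": {1, 2, 3, 4}, "d": {1}}

curve = retention_curve(signup_day, active_days, max_offset=3)
print(curve)  # [1.0, 0.75, 0.5, 0.5]
```

In this toy cohort the curve flattens at 0.5, which is the "converges to a number" shape discussed on the following slides.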
77. ### Business Metrics: User Activity Metrics Retention chart (low retention):

% active (monthly) versus days from acquisition
78. ### Business Metrics: User Activity Metrics When this happens: -

You probably don’t have product-market fit. - Revise and build a better product. - Your churn rate is a big problem. - You have loyal customers. - The retention line running parallel to the X axis shows that it converges to a number. - Don’t let yourself be biased - 20% retention in the Indonesian airline market, measured daily, might be a big number. People probably fly only 2 times a year on average.
79. ### Business Metrics: User Activity Metrics Retention chart (high retention):

% active (monthly) versus days from acquisition
80. ### Business Metrics: User Activity Metrics When this happens: -

You probably have product-market fit. - Your churn rate is not a big problem. - You have loyal customers. - The retention line running parallel to the X axis shows that it converges to a number. - Don’t let yourself be biased - 95% retention for Nasi Goreng in NY, among 100 people, is still a small market.
81. ### Business Metrics: User Activity Metrics Retention chart (death of a product):

% active (monthly) versus days from acquisition
82. ### Business Metrics: User Activity Metrics When this happens: -

YOU DON’T HAVE PRODUCT-MARKET FIT
83. ### Business Metrics: User Activity Metrics Retention chart (death of a product):

% active (monthly) versus days from acquisition
84. ### Business Metrics: User Activity Metrics When this happens: -

Marketing might help you in the short run, but retention will converge to its natural rate once the marketing campaign is gone. - Yes, it can prolong your product life cycle (and the suffering). - Still, the time you buy with a marketing campaign needs to be translated into a new, better product so that users stay.

88. ### Problems • We want to measure the effect of our

product/hypothesis/experiment on some business metrics. • If the effect is large enough, it can increase our metrics and achieve our business objectives. A/B Testing
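One common way to judge whether an effect is "large enough" is a two-proportion z-test on a conversion-style metric. The sketch below is an illustration under stated assumptions: the deck does not prescribe this particular test, the conversion counts are invented, and the normal approximation assumes reasonably large samples.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (lift, z, two-sided p-value) for variant B versus control A."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value


# Invented example: 100/1000 conversions in control, 130/1000 in treatment
lift, z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"lift={lift:.3f}, z={z:.2f}, p={p_value:.3f}")
```

A small p-value suggests the observed lift is unlikely under "no effect", but whether the lift matters for the business objective is a separate judgment.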

two set

91. ### Optimize Decision Making: A/B Testing Case : Scurvy • First

controlled experiment / randomized trial for medical purposes • Scurvy is a disease that results from vitamin C deficiency • It killed over 100,000 people in the 16th-18th centuries, mostly sailors • Lord Anson’s circumnavigation voyage from 1740 to 1744 started with 1,800 sailors and only about 200 returned; most died from scurvy • Dr. James Lind noticed a lack of scurvy on Mediterranean ships • He gave some sailors limes (treatment), while others ate their regular diet (control)

94. ### Quasi Experimental Testing Which side of airplanes do we need

to strengthen?

96. ### A/B Testing and Metrics Sources: https://web.stanford.edu/class/ee380/Abstracts/140129-slides-Machine-Learning-and-Econometrics.pdf

100. ### Problems Our objective is to choose a range of values

to optimize business metrics. How Step 1: Get data. Step 2: Deﬁne a loss function. Step 3: Build a model with a tunable parameter. Step 4: Optimize the loss function given a range of parameter values. Step 5: Choose the best parameter. Decision Making
101. ### Case: Let’s say I am trying to decide a price

at which to list a used phone I want to sell. In this case I may denote my decision space as the entire non-negative real line, such that a ∈ [0, +∞). Decision Making

103. ### Step 2: Define a loss function. How do we figure

out the loss associated with individual decisions when we don’t even know the information we want to use to make a decision? The answer is that we turn to probability theory and instead calculate the “Expected Loss” we would incur if we chose a given action, given our beliefs (our probability distribution) about θ. Decision Making
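The expected-loss idea can be sketched with a small Monte Carlo simulation for the phone-pricing case. Everything specific here is an assumption made for illustration and is not taken from the cited source: θ is treated as the highest price a buyer would accept, beliefs about θ are Normal(100, 20), the loss is the negative sale price if the phone sells and zero otherwise, and the continuous decision space is approximated by a grid of whole-dollar prices.

```python
import random


def expected_loss(price, theta_samples):
    """Monte Carlo estimate of the expected loss of listing at `price`."""
    total = 0.0
    for theta in theta_samples:
        # Hypothetical loss: a sale earns `price` (a negative loss) only if
        # the asking price does not exceed what the buyer will pay (theta).
        total += -price if price <= theta else 0.0
    return total / len(theta_samples)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Step 1: beliefs about theta, drawn from an assumed Normal(100, 20)
theta_samples = [rng.gauss(100, 20) for _ in range(5000)]

# Steps 3-4: evaluate the expected loss over a grid of candidate prices
candidate_prices = range(0, 201)
losses = {p: expected_loss(p, theta_samples) for p in candidate_prices}

# Step 5: choose the price with the lowest expected loss
best_price = min(losses, key=losses.get)
print(best_price, round(losses[best_price], 2))
```

Under these assumptions the best price lands below the mean of the belief distribution: asking for less raises the chance of a sale by more than it costs in revenue, up to a point.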

105. ### Step 4: Optimize Loss, with Monte Carlo Simulation Source: http://www.statsathome.com/2017/10/12/bayesian-decision-theory-made-ridiculously-simple/

Decision Making
106. ### Step 5: Get Optimal Parameter Source: http://www.statsathome.com/2017/10/12/bayesian-decision-theory-made-ridiculously-simple/ For a given

phone, the optimal price is \$71 Decision Making
107. ### Decision Making Conclusion Bayesian Decision Making is important to optimize

our decisions by choosing the best parameter from a large set of candidate values.

112. ### 1. Do FGD/ Customer Survey 2. Do Data exploration 3.

Build a prototype 4. Test the prototype to users Customer Development Cycle Product Hypothesis: Do they need a checkins-photo?? Data Exploration Insight Hypothesis Metrics Test Decision
113. ### 1. Do FGD/ Customer Survey 2. Do Data exploration 3.

Build a prototype 4. Test the prototype to users Customer Development Cycle Market Hypothesis: Do we need to focus on photo only? Data Exploration Insight Hypothesis Metrics Test Decision
114. ### 1. Do FGD/ Customer Survey 2. Do Data exploration 3.

Build a prototype 4. Test the prototype with users Customer Development Cycle Feature Question: What kind of features would make users use our product? Data Exploration Insight Hypothesis Metrics Test Decision
115. ### 1. Do FGD/ Customer Survey 2. Do Data exploration 3.

Build a prototype 4. Test the prototype to users Customer Development Cycle Marketing Question: Do we need to rebrand our app? Data Exploration Insight Hypothesis Metrics Test Decision

119. None

124. ### Pacmann AI Classes Quality State of the Art of Machine

Learning Research Practical Skills Theoretical Understanding > 50 institutions 400++ alumni 6 Classes in the past business@pacmannai.com https://pacmann.ai