
Statistics 101: Everything You Learned but Tried to Forget


Some people in my office wanted me to present some basic concepts in statistics. This is the presentation I gave.

Benjamin Scheich

June 22, 2017


Transcript

  1. Statistics 101: Everything You Learned but Tried to Forget
     Ben Scheich, Associate Director, Data Analytics, AWHONN
     August 1, 2012
  2. What in the world do statisticians do?
     • Decide whom to study
     • Gather data
     • Draw conclusions
     • “Simplify” your data
     Main areas: Estimation, Inference, Study Design, Survey Design
  3. Basic Estimation Methods
     • Mean or “average”: add everything up and divide by the total number of items you have
       • John earns $30, Bill earns $100. Mean is $65
     • Median: the “middle” value when the data are arranged from smallest to largest
       • In 5, 7, 12, 18, 25 the median is 12
  4. When to use Mean vs. Median

     Name   Net Worth
     Jon    $15,000
     Steve  $25,000
     Al     $30,000

     Mean:   $23,333
     Median: $25,000

     Now add one outlier:

     Name            Net Worth
     Jon             $15,000
     Steve           $25,000
     Al              $30,000
     Warren Buffett  $43 Billion

     Mean:   ~$10.75 Billion (the outlier drags it up)
     Median: $27,500
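The outlier effect on the mean vs. the median can be reproduced with Python's standard `statistics` module, using the net worths from this slide:

```python
import statistics

# Net worths from the slide, in dollars
net_worths = [15_000, 25_000, 30_000]

print(statistics.mean(net_worths))    # ~23333 (the slide's $23,333)
print(statistics.median(net_worths))  # 25000

# Add one extreme outlier: Warren Buffett at ~$43 billion
net_worths.append(43_000_000_000)

# The mean is dragged into the billions; the median barely moves
print(statistics.mean(net_worths))    # ~10.75 billion
print(statistics.median(net_worths))  # 27500.0
```

The median's robustness to outliers is exactly why it is preferred for skewed quantities like income or net worth.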
  5. Visualizing the spread of the data: Standard Deviation
     Figure from Mwtoews at http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg
     Standard Deviation:
     • Relates to the mean
     • Small standard deviation: data are close together
     • Large standard deviation: data are far apart

     Condition               Standard Deviation Value
     Before Warren Buffett   $7,637
     After Warren Buffett    ~$21.4 Billion
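The two standard deviations in the table can be checked with `statistics.stdev` (the sample standard deviation, with n − 1 in the denominator):

```python
import statistics

before = [15_000, 25_000, 30_000]
after = before + [43_000_000_000]

# Sample standard deviation: small when the data are close together…
print(statistics.stdev(before))  # ~7638, the slide's $7,637

# …and huge once one extreme value is added
print(statistics.stdev(after))   # ~2.15e10, the slide's ~$21.4 Billion
```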
  6. Visualizing the spread of the data: Interquartile Range
     Figure from http://en.wikipedia.org/wiki/File:Boxplot_vs_PDF.svg
     Interquartile Range (IQR):
     • Relates to the median
     • The data are broken into quartiles, each holding 25% of the data
     • Typically graphed using a boxplot

     Name            Net Worth
     Jon             $15,000
     Steve           $25,000       ← Q1
     (median)        $27,500       ← Q2
     Al              $30,000       ← Q3
     Warren Buffett  $43 Billion
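Quartiles can be computed with `statistics.quantiles`. A sketch using the small data set from the earlier estimation slide; note that several quartile conventions exist, so other tools may give slightly different cut points:

```python
import statistics

data = [5, 7, 12, 18, 25]  # the small data set from the estimation slide

# quantiles(n=4) returns the three cut points Q1, Q2, Q3
# (default "exclusive" method; conventions vary between tools)
q1, q2, q3 = statistics.quantiles(data, n=4)
print(q1, q2, q3)   # 6.0 12.0 21.5

iqr = q3 - q1       # the interquartile range: the middle 50% of the data
print(iqr)          # 15.5
```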
  7. Confidence Intervals
     • Confidence Interval (CI): when we estimate something unknown in a study, we set boundaries around our estimate to help us understand the unknown quantity.
     • Let’s look at another study: what is the mean age of everyone who works at 2000 L Street NW in Washington, DC?
     • We conduct a survey and ask 50 people their age. The sample mean is 35.7.
     • Is this the actual mean of everyone at 2000 L St NW (our population)?
     • No, but we don’t have time to interview every person, so we set up a confidence interval.
     • The CI is based on the standard error, which depends on the standard deviation of your sample and on the number of people you are sampling.
     • The most common choice is a 95% CI.
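A minimal sketch of that calculation, using the normal approximation (z = 1.96) for the 95% CI. The ages here are made up for illustration; a t critical value would give a slightly wider interval for small samples:

```python
import math
import random
import statistics

# Hypothetical ages for 50 surveyed people (made-up data)
random.seed(1)
ages = [random.gauss(36, 10) for _ in range(50)]

mean = statistics.mean(ages)

# Standard error of the mean: sample sd divided by sqrt(n)
se = statistics.stdev(ages) / math.sqrt(len(ages))

# 95% CI via the normal approximation
low, high = mean - 1.96 * se, mean + 1.96 * se
print(f"mean = {mean:.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```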
  8. Confidence Intervals, continued
     • Let’s say our 95% CI for the mean age at 2000 L St NW is between 32.7 and 38.7. What does this mean?
     • Our “best guess” for our study is that the true population mean lies within the 95% confidence interval of 32.7–38.7.
     • If we repeated our survey 100 times, 95 times out of 100 our CI would contain the true population mean.
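That "95 times out of 100" interpretation can be checked by simulation. A sketch with an assumed true population mean and standard deviation (both invented for illustration):

```python
import math
import random
import statistics

random.seed(0)
TRUE_MEAN = 35.7  # pretend we secretly know the population mean

covered = 0
trials = 1000
for _ in range(trials):
    # Repeat the survey: sample 50 ages, build a 95% CI
    sample = [random.gauss(TRUE_MEAN, 8) for _ in range(50)]
    m = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(50)
    if m - 1.96 * se <= TRUE_MEAN <= m + 1.96 * se:
        covered += 1

# Fraction of intervals that captured the true mean
print(covered / trials)  # close to 0.95
```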
  9. Basic Inference Methods
     • Inference: used to draw a conclusion about data
     • Statisticians draw conclusions by testing a hypothesis
     • What is a scientific hypothesis? A guess about what will happen with your data, based upon past observations. It is almost always the “nothing interesting is going on” scenario, and is called the Null Hypothesis.
     • For example:
       • Null Hypothesis: Al and Frank type at the same rate
       • Null Hypothesis: Drug A and Drug B reduce breast cancer tumors by the same amount
       • Null Hypothesis: Al and Joe both run the 100 m dash in the same amount of time
     • Then we test whether the null hypothesis is true
  10. Distributions
     • To test these hypotheses, we compare “distributions”
     • But what is a distribution? Two examples: the “normal” distribution and the Poisson distribution
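One way to get a feel for the two distributions named on the slide is to draw samples from each and check their summary statistics. A sketch using the stdlib only (the Poisson sampler uses Knuth's algorithm, since `random` has no built-in Poisson; the parameters are arbitrary):

```python
import math
import random
import statistics

random.seed(0)

# Normal ("bell curve") samples with mean 0 and standard deviation 1
normal = [random.gauss(0, 1) for _ in range(10_000)]
print(statistics.mean(normal), statistics.stdev(normal))  # near 0 and 1

def poisson(lam):
    """Draw one Poisson sample via Knuth's algorithm."""
    L, k, p = math.exp(-lam), 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

# For a Poisson distribution, the mean and the variance both equal lambda
pois = [poisson(4.0) for _ in range(10_000)]
print(statistics.mean(pois), statistics.variance(pois))   # both near 4
```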
  11. The goal of inference: Al vs. Joe
     • We want to know: is Al’s mean running time for the 100 m dash different from Joe’s mean running time for the same race?
     • In statistical notation:
       • Null Hypothesis (H0): Al and Joe run the 100 m dash in the same amount of time (they have the same mean running time).
       • Alternative Hypothesis (HA): Al and Joe do not run the 100 m dash in the same amount of time (their mean running times are not equal).
  12. Let’s experiment

     Day   Al time (s)   Joe time (s)
     1     12.10         11.02
     2     12.95         11.53
     3     12.49         13.10
     4     13.10         12.79
     5     12.56         11.95
     6     12.54         12.29
     7     12.24         11.29
     8     13.25         11.48
     9     12.69         12.60
     10    13.83         12.04

     Assume every day is independent of every other day
  13. Formalizing our testing
     • We want to test our hypothesis in a way that is standardized, so others can follow and reproduce our results
     • One of the easiest tests is a t test
     • In a t test, you compare whether the mean of one group (Al’s running times) is equal to the mean of another group (Joe’s running times)
     • Remember our Null Hypothesis (H0): Al and Joe run the 100 m dash in the same amount of time (they have the same mean running time)
     • Our significance level α is 0.05
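The pooled two-sample t statistic for the running times on slide 12 can be computed by hand with the stdlib (a library call like `scipy.stats.ttest_ind` would do the same in one line). The critical value 2.101 is t for α = 0.05, two-sided, df = 18, taken from a standard t table:

```python
import math
import statistics

al  = [12.10, 12.95, 12.49, 13.10, 12.56, 12.54, 12.24, 13.25, 12.69, 13.83]
joe = [11.02, 11.53, 13.10, 12.79, 11.95, 12.29, 11.29, 11.48, 12.60, 12.04]

n1, n2 = len(al), len(joe)
m1, m2 = statistics.mean(al), statistics.mean(joe)

# Pooled (equal-variance) estimate of the common variance
sp2 = ((n1 - 1) * statistics.variance(al) +
       (n2 - 1) * statistics.variance(joe)) / (n1 + n2 - 2)

# Two-sample t statistic
t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

print(m1, m2)            # ~12.78 and ~12.01, as reported on the next slide
print(t)                 # ~2.82
print(abs(t) > 2.101)    # True -> reject H0 at alpha = 0.05
```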
  14. Our Results
     • Al mean running time: 12.78 seconds
     • Joe mean running time: 12.01 seconds
     • P-value: 0.01, or 1%
     • Are they the same? To answer this, we have to understand the concept of the p-value:
     • P-value: “The p-value is the probability that the data would be at least as extreme as those observed if the null hypothesis were true.” (Vickers 2010)
  15. P-Value
     • What we are saying here is this: there is a 1% probability that we would see data at least as extreme as the data we collected from Al and Joe if they actually had the same mean 100 m dash running time.
     • In other words: Al and Joe may have the same mean running time, but if that were the case, we would only see data like these 1 in 100 times.
     • So what do we say? Since 1/100 is not very likely, we reject our null hypothesis that the mean running times of Al and Joe are equal and accept our alternative hypothesis that they are not equal.
     • Conclusion: Al and Joe do not have equal mean running times in the 100 m dash
     • But… could we be wrong?
  16. Could we be wrong?
     • Absolutely. When we collected our sample, this could have been the 1% of the time where the data are very extreme… we are not 100% certain.
     • But, based upon our statistical significance level of 0.05, we feel confident enough to reject our null hypothesis.
     • And anyone reading our research will know our p-value is 0.01, so they can draw the same conclusion we did.
  17. Errors in testing
     • Type I Error: rejecting the null hypothesis when we should not have rejected it.
       • Example: let’s say Al and Joe do have the same mean running times for the 100 m dash, but we said that they do not. We made a Type I error (the probability of making a Type I error is called α).
     • Type II Error: failing to reject the null hypothesis when we should have rejected it.
       • Example: let’s say that Al and Joe do not have the same mean running times for the 100 m dash, but we said that they do. We made a Type II error (the probability of making a Type II error is called β).
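The Type I error rate really does match α, which can be checked by simulating many experiments in which the null hypothesis is true. The true mean and spread below are invented; 2.101 is again the t critical value for df = 18:

```python
import math
import random
import statistics

random.seed(42)

def pooled_t(x, y):
    """Pooled two-sample t statistic, as on the earlier slide."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * statistics.variance(x) +
           (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(sp2 * (1/n1 + 1/n2))

# The null hypothesis is TRUE here: both runners share the same true mean
rejections = 0
trials = 2000
for _ in range(trials):
    al  = [random.gauss(12.4, 0.6) for _ in range(10)]
    joe = [random.gauss(12.4, 0.6) for _ in range(10)]
    if abs(pooled_t(al, joe)) > 2.101:  # alpha = 0.05, df = 18
        rejections += 1

# Fraction of false rejections: the Type I error rate
print(rejections / trials)  # close to alpha = 0.05
```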
  18. Power Analysis
     • Power: the probability of rejecting the null hypothesis when the null hypothesis is false
     • Power is 1 − β (also known as sensitivity)
     • Power depends on variation and sample size
     • A “power analysis” is what typically determines how many subjects should be sampled in a study
     • “The findings were not significant because we believe the study was underpowered.”
       • Translation: the researchers didn’t have enough samples in the study, probably because of budget constraints, and that is why they did not get a p-value low enough to reject the null hypothesis.
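The dependence of power on sample size can also be seen by simulation: fix a real difference between the groups and count how often the test detects it. The difference, spread, and sample sizes below are illustrative; the critical values (2.101 for df = 18, 2.024 for df = 38) come from a t table:

```python
import math
import random
import statistics

random.seed(7)

def pooled_t(x, y):
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * statistics.variance(x) +
           (n2 - 1) * statistics.variance(y)) / (n1 + n2 - 2)
    return (statistics.mean(x) - statistics.mean(y)) / math.sqrt(sp2 * (1/n1 + 1/n2))

def power(n, crit, diff=0.5, sd=0.6, trials=2000):
    """Fraction of simulated studies that reject H0 when a real difference exists."""
    hits = 0
    for _ in range(trials):
        a = [random.gauss(12.4 + diff, sd) for _ in range(n)]
        b = [random.gauss(12.4, sd) for _ in range(n)]
        if abs(pooled_t(a, b)) > crit:
            hits += 1
    return hits / trials

# Same true difference; a bigger sample gives higher power
print(power(10, crit=2.101))  # n = 10 per group, df = 18
print(power(20, crit=2.024))  # n = 20 per group, df = 38
```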
  19. Other Inference Techniques
     • T-tests are not the only test you can perform:
     • Chi-squared tests
     • Wilcoxon Rank Sum test
     • ANOVA test
     • Fisher’s Exact test
     • And many more
  20. Data Visualization
     • With the amount of available data and the “democratization” of analysis, being able to manipulate and visualize data is a major competitive advantage.
     • There is a definite trend toward taking the analysis that statisticians perform and working with designers to visualize it.
     • Some of my favorite data visualizers:
       • Edward Tufte (http://www.edwardtufte.com/)
       • Nathan Yau (http://flowingdata.com)
       • Nancy Duarte (http://www.duarte.com/)
       • Hans Rosling (http://www.gapminder.org/)
       • Ben Shneiderman (http://www.cs.umd.edu/~ben/)
  21. Edward Tufte: Sparklines
     “small, high resolution graphics embedded in a context of words, numbers, images”
     Edward Tufte (May 27, 2004). “Sparkline theory and practice”. Edward Tufte forum.
     http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0001OR
     en.wikipedia.org/wiki/Sparkline
  22. Q & A
     • Risk Ratio & % increase in risk
       • Probability of the outcome given exposure vs. probability of the outcome given non-exposure
       • e.g., “20 times as many smokers have cancer compared to non-smokers”
     • Odds Ratio (OR)
       • Odds of the outcome in the experimental group vs. odds of the outcome in the control group
       • Typically used in clinical trials and with logistic regression
       • An OR of 2 means the experimental group has twice the odds of the outcome compared to the control group
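Both ratios fall out of a 2×2 table. A minimal sketch with made-up counts (20 of 100 exposed people have the outcome, 10 of 100 unexposed):

```python
# Hypothetical 2x2 table (made-up counts):
#                outcome   no outcome
# exposed            20          80
# unexposed          10          90
a, b = 20, 80   # exposed: outcome / no outcome
c, d = 10, 90   # unexposed: outcome / no outcome

# Risk ratio: probability of the outcome, exposed vs. unexposed
risk_exposed   = a / (a + b)        # 0.20
risk_unexposed = c / (c + d)        # 0.10
risk_ratio = risk_exposed / risk_unexposed
print(risk_ratio)                   # 2.0 -> "twice the risk"

# Odds ratio: odds of the outcome, exposed vs. unexposed
odds_ratio = (a / b) / (c / d)
print(odds_ratio)                   # 2.25
```

Note the two measures differ even on the same table; for rare outcomes they converge, which is why the OR is often read as an approximate risk ratio in that setting.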
  23. Another Example
     • Say you are on a criminal jury trial for a man who pickpocketed a wallet:
     • Null Hypothesis: the man is innocent
     • Probability of committing a Type I error (α): “beyond a reasonable doubt.” You set your alpha level based upon what gets you beyond a reasonable doubt. If you set it wrong, you could send an innocent man to prison.
     • Probability of committing a Type II error (β): can be reduced as you receive more evidence in the case. For example, if one witness claims to have seen the man pickpocket, that may not be enough to reject the null hypothesis. If 100 witnesses claim to have seen him pickpocket, you have reduced the probability of a Type II error.
     • Why do we care?
  24. Statistics: why it matters to all of us
     done vs. dran
     Grainger, J., Dufau, S., Montant, M., Ziegler, J. C., & Fagot, J. (2012). Orthographic processing in baboons (Papio papio). Science, 336(6078), 245–248. doi:10.1126/science.1218152