up and divide by the total number of items you have • John earns $30, Bill earns $100. Mean is $65 • Median - "the middle" value when the data are arranged from smallest to largest: 5, 7, 12, 18, 25 → the median is 12
Name     Net Worth
Jon      $15,000
Steve    $25,000
Al       $30,000

Measure   Value
Mean      $23,333
Median    $25,000

Now add Warren Buffett:

Name             Net Worth
Jon              $15,000
Steve            $25,000
Al               $30,000
Warren Buffett   $43 Billion

Measure   Value
Mean      ≈ $10.75 Billion
Median    $27,500
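The pull of one extreme value on the mean (but not the median) can be checked directly with Python's standard library, using the slide's net-worth figures:

```python
import statistics

# Net worths from the slide's example
net_worths = [15_000, 25_000, 30_000]

mean_before = statistics.mean(net_worths)      # 23333.33..., the slide's $23,333
median_before = statistics.median(net_worths)  # 25000

# Add one extreme value and the mean jumps into the billions,
# while the median barely moves ($25,000 -> $27,500)
with_buffett = net_worths + [43_000_000_000]
mean_after = statistics.mean(with_buffett)
median_after = statistics.median(with_buffett)  # 27500
```

This is why the median is preferred for skewed quantities like income or net worth.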
Image credit: Mwtoews, http://en.wikipedia.org/wiki/File:Standard_deviation_diagram.svg

Standard Deviation:
• Relates to the mean
• Small standard deviation - data are close together
• Large standard deviation - data are far apart

Condition               Standard Deviation
Before Warren Buffett   $7,637
After Warren Buffett    $21.4 Billion
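The same toy data reproduces the slide's values; note that `statistics.stdev` computes the *sample* standard deviation (dividing by n - 1), which is what the slide's $7,637 comes from:

```python
import statistics

net_worths = [15_000, 25_000, 30_000]
sd_before = statistics.stdev(net_worths)  # ~7637.6: data are close together

# One extreme value drags the standard deviation up by six orders of magnitude
sd_after = statistics.stdev(net_worths + [43_000_000_000])
```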
Image: http://en.wikipedia.org/wiki/File:Boxplot_vs_PDF.svg

Interquartile Range (IQR):
• Relates to the median
• Broken up into quartiles - each quartile holds 25% of the data
• Typically graphed using a boxplot

Name             Net Worth     Quartile
Jon              $15,000
Steve            $25,000       Q1
(median)         $27,500       Q2 (median)
Al               $30,000       Q3
Warren Buffett   $43 Billion

(Boxplot: $15K | Q1 $25K | Q2 $27.5K | Q3 $30K | $43B)
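Quartiles can be computed with `statistics.quantiles`. A caveat: quartile conventions vary, and the "inclusive" method interpolates between observations, so Q1 and Q3 come out slightly different from the slide's direct assignment of data points to quartiles; the median (Q2) matches exactly:

```python
import statistics

net_worths = [15_000, 25_000, 30_000, 43_000_000_000]

# Cut points dividing the data into 4 groups: [Q1, Q2, Q3].
# "inclusive" interpolates between observations; other conventions
# (including the slide's) assign data points to quartiles directly.
q1, q2, q3 = statistics.quantiles(net_worths, n=4, method="inclusive")
iqr = q3 - q1  # interquartile range: spread of the middle 50% of the data
```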
something that is unknown in a study, we like to set boundaries around our estimate to help us understand the unknown item. • Let's look at another study: • What is the mean age of everyone who works at 2000 L Street NW in Washington, DC? • We conduct a survey and ask 50 people their age. • Sample mean is 35.7 • Is this the actual mean of everyone at 2000 L St NW (our population)? • No, but we don't have time to interview every person, so we set up a confidence interval. • The CI is based on the standard error, which depends on the standard deviation of your sample and the number of people you sample. • Confidence intervals are usually reported at the 95% level (95% CI).
our mean age at 2000 L St NW is between 32.7 and 38.7. What does this mean? • Our "best guess" from our study is that the true population mean lies within the 95% Confidence Interval of 32.7-38.7 • If we repeated our survey 100 times, about 95 times out of 100 the interval we compute would contain the true population mean.
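A minimal sketch of the normal-approximation 95% CI described above. The ages list is hypothetical (the real survey had 50 respondents), chosen so the sample mean matches the slide's 35.7:

```python
import math
import statistics

def confidence_interval_95(sample):
    """95% CI for the mean: sample mean +/- 1.96 standard errors.

    Uses the normal approximation; with small samples a t-based
    multiplier would give a slightly wider interval.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    margin = 1.96 * se
    return mean - margin, mean + margin

# Hypothetical survey responses (ages); sample mean is 35.7 as in the slides
ages = [28, 31, 34, 35, 35, 36, 37, 39, 40, 42]
low, high = confidence_interval_95(ages)
```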
about data • Statisticians draw conclusions by testing a hypothesis • What is a scientific hypothesis? • It is a guess about what you think will happen with your data, based upon past observations. It almost always states the "uninteresting" outcome - no difference, no effect - and is called the Null Hypothesis. • For example: • Null Hypothesis: Al and Frank type at the same rate • Null Hypothesis: Drug A and Drug B reduce breast cancer tumors by the same amount • Null Hypothesis: Al and Joe both run the 100 m dash in the same amount of time • Then we test whether the null hypothesis is true
to know, is Al’s mean running time for the 100m dash different than Joe’s mean running time for the same race? • In statistical notation: • Null Hypothesis (H0 ): Al and Joe run the 100 m dash in the same amount of time (they have the same mean running time). • Alternative Hypothesis (HA ): Al and Joe do not run the 100 m dash in the same amount of time (their mean running times are not equal).
in a way that is standardized so others can follow and reproduce our results • One of the easiest tests is the t test • In a t test, you compare whether the mean of one group (Al's running times) is equal to the mean of another group (Joe's running times) • Remember our Null Hypothesis (H0): Al and Joe run the 100 m dash in the same amount of time (they have the same mean running time). • Our significance level is α = 0.05
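A sketch of the t statistic behind such a test, using only the standard library. The running times here are made up; and since Python's stdlib has no t-distribution, a real analysis would get the p-value from something like scipy.stats.ttest_ind:

```python
import math
import statistics

def welch_t(a, b):
    """Welch's two-sample t statistic: the difference in means divided by
    the standard error of that difference (no equal-variance assumption)."""
    se = math.sqrt(statistics.variance(a) / len(a)
                   + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se

# Hypothetical 100 m dash times in seconds
al = [11.2, 11.4, 11.3, 11.5, 11.1, 11.3]
joe = [12.0, 12.1, 11.9, 12.0, 12.1, 11.9]

t = welch_t(al, joe)  # a large |t| is evidence against H0
```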
Joe's mean running time: 12.01 seconds • P-value: 0.01, or 1% • Are they the same? • To determine this, we have to understand the concept of the p-value: • P-value: "The p-value is the probability that the data would be at least as extreme as those observed if the null hypothesis were true." (Vickers 2010) • What does that mean?
There is a 1% probability that we would see data at least as extreme as the data we collected from Al and Joe if they really had the same mean 100 m dash running time. • In other words: Al and Joe may have the same mean running time, but according to our statistical methods, we would only see data like these about 1 time in 100 if that were the case. • So what do we say? • Since 1 in 100 is not very likely, we reject our null hypothesis that the mean running times of Al and Joe are equal and accept our alternative hypothesis that they are not equal. • Conclusion: Al and Joe do not have equal mean running times in the 100 m dash • But… • Could we be wrong?
our sample, this could have been the 1 time in 100 where the data are very extreme…we are not 100% certain. • But, based upon our chosen significance level of 0.05, we feel confident enough to reject our null hypothesis • And anyone reading our research will see our p-value of 0.01, so they can draw the same conclusion we did.
hypothesis when we should not have rejected it. • Example: Let's say Al and Joe do have the same mean running times for the 100 m dash, but we said that they do not. We made a Type I error (the probability of making a Type I error is called α). • Type II Error: Fail to reject the null hypothesis when we should have rejected it. • Example: Let's say that Al and Joe do not have the same mean running times for the 100 m dash, but we said that they do. We made a Type II error (the probability of making a Type II error is called β).
when the null hypothesis is false. • Power is 1-β (also known as sensitivity) • Power depends on variation and sample size • A “Power Analysis” is what typically helps determine how many subjects should be sampled in a study. • “The findings were not significant because we believe the study was underpowered.” • Translation: The researchers didn’t have enough samples in the study, probably because of budget constraints, and that is why they did not get a p-value low enough to reject the null hypothesis.
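The link between sample size and power can be illustrated with a small Monte Carlo sketch (all numbers here are hypothetical, and the critical value of ~2 is a rough stand-in for the two-sided 0.05 cutoff):

```python
import math
import random
import statistics

def estimate_power(n, true_diff, sd, trials=1000):
    """Monte Carlo estimate of power: simulate two groups of size n whose
    true means differ by true_diff, and count how often the t statistic
    clears ~2 (an approximate two-sided 0.05 critical value)."""
    rng = random.Random(42)  # fixed seed for reproducibility
    hits = 0
    for _ in range(trials):
        a = [rng.gauss(0, sd) for _ in range(n)]
        b = [rng.gauss(true_diff, sd) for _ in range(n)]
        se = math.sqrt(statistics.variance(a) / n + statistics.variance(b) / n)
        if abs((statistics.mean(b) - statistics.mean(a)) / se) > 2:
            hits += 1
    return hits / trials

# More subjects -> more power to detect the same true difference
power_small = estimate_power(n=10, true_diff=1.0, sd=1.0)
power_large = estimate_power(n=40, true_diff=1.0, sd=1.0)
```

This is essentially what a power analysis does in reverse: fix the desired power (often 0.8), assume an effect size and variance, and solve for n.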
the “democratization” of analysis, being able to manipulate and visualize data is a major competitive advantage. • There is a definite trend toward taking the analysis that statisticians perform and working with designers to visualize it. • Some of my favorite data visualizers: • Edward Tufte (http://www.edwardtufte.com/) • Nathan Yau (http://flowingdata.com) • Nancy Duarte (http://www.duarte.com/) • Hans Rosling (http://www.gapminder.org/) • Ben Shneiderman (http://www.cs.umd.edu/~ben/)
context of words, numbers, images" - Edward Tufte (May 27, 2004), "Sparkline theory and practice", Edward Tufte forum. en.wikipedia.org/wiki/Sparkline http://www.edwardtufte.com/bboard/q-and-a-fetch-msg?msg_id=0001OR
risk • Probability of the outcome in the exposed group vs. probability of the outcome in the non-exposed group • A relative risk of 20 means smokers are 20 times as likely to have cancer as non-smokers • Odds Ratio (O.R.) • Odds of the outcome in the experimental group vs. odds of the outcome in the control group • Typically used in clinical trials and with logistic regression • An OR of 2 means the experimental group has twice the odds of the outcome compared to the control group.
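Both measures fall out of a 2x2 table of counts. A sketch with hypothetical numbers, where a/b are exposed subjects with/without the outcome and c/d are unexposed subjects with/without:

```python
def relative_risk(a, b, c, d):
    """RR: probability of the outcome among the exposed
    divided by the probability among the unexposed."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """OR: odds of the outcome among the exposed
    divided by the odds among the unexposed."""
    return (a / b) / (c / d)

# Hypothetical trial: 40/100 exposed get the outcome vs. 10/100 unexposed
rr = relative_risk(40, 60, 10, 90)   # 0.40 / 0.10 = 4.0
oratio = odds_ratio(40, 60, 10, 90)  # (40/60) / (10/90) = 6.0
```

Note the OR (6.0) is larger than the RR (4.0); the two only approximate each other when the outcome is rare.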
trial for a man who pickpocketed a wallet: • Null Hypothesis: The man is innocent • Probability of committing a Type I error (α): "Beyond a reasonable doubt." You set your alpha level based upon what gets you beyond a reasonable doubt; if you set it wrong, you could send an innocent man to prison. • Probability of committing a Type II error (β): Can be reduced as you receive more evidence in the case. For example, if one witness claims to have seen the man pickpocket, that may not be enough to reject the null hypothesis; if 100 witnesses claim to have seen it, you have reduced the probability of a Type II error. • Why do we care?