
# Introduction to Statistics with Python

April 16, 2016

## Transcript

3. ### THE STORY OF THIS TALK: 3 DAYS BEFORE THE CONFERENCE

VALERIO: CHRISTIAN, WE HAVE A FREE SLOT AND WE NEED A TALK

CHRISTIAN: I CAN’T IN 3 DAYS…

VALERIO: YOU MUST.

4. ### CONTENT

1. What is STATISTICS?
2. Variable types
3. Univariate distribution
4. Frequencies
5. M^3 (Mean, Median, Mode)
6. Variance and Standard Deviation
7. Multivariate distribution
8. Covariance and Correlation

6. ### Oxford English Dictionary

“…. THE BRANCH OF SCIENCE OR MATHEMATICS CONCERNED WITH THE ANALYSIS AND INTERPRETATION OF NUMERICAL DATA AND APPROPRIATE WAYS OF GATHERING SUCH DATA.”

7. ### American Statistical Association

“STATISTICS IS THE SCIENCE OF LEARNING FROM DATA, AND OF MEASURING, CONTROLLING, AND COMMUNICATING UNCERTAINTY; AND IT THEREBY PROVIDES THE NAVIGATION ESSENTIAL FOR CONTROLLING THE COURSE OF SCIENTIFIC AND SOCIETAL ADVANCES”

8. ### John Tukey, Bell Labs, Princeton University

“THE BEST THING ABOUT BEING A STATISTICIAN IS THAT YOU GET TO PLAY IN EVERYONE ELSE'S BACKYARD.”

9. ### Mark Twain

“THERE ARE THREE KINDS OF LIES: LIES, DAMNED LIES, AND STATISTICS.”

11. ### 4 KINDS OF VARIABLES

- QUANTITATIVE VARIABLES
  - CONTINUOUS
  - DISCRETE
- CATEGORICAL VARIABLES
  - ORDINAL
  - NOMINAL
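As a rough illustration of the four kinds of variables, the sketch below builds a tiny pandas table with one column of each kind. The data and the names `toy`, `price`, `stock`, `size` and `color` are invented for this example and are not the dataset used in the talk.

```python
import pandas as pd

# Toy data invented for illustration; not the dataset used in the talk.
toy = pd.DataFrame({
    "price": [19.99, 5.50, 12.00, 7.25],        # quantitative, continuous
    "stock": [3, 0, 12, 7],                     # quantitative, discrete
    "size": ["S", "L", "M", "S"],               # categorical, ordinal
    "color": ["red", "blue", "red", "green"],   # categorical, nominal
})

# Ordinal: the categories have a meaningful order, so min/max are defined.
toy["size"] = pd.Categorical(toy["size"], categories=["S", "M", "L"], ordered=True)
# Nominal: the categories are plain labels with no order.
toy["color"] = toy["color"].astype("category")

largest_size = toy["size"].max()
is_ordered = {col: toy[col].cat.ordered for col in ["size", "color"]}

print(toy.dtypes)
print(largest_size, is_ordered)
```

Declaring the order explicitly is one possible design choice; without it pandas treats both categorical columns as nominal.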

27 AND 28 ?
16. ### THE TYPE OF A VARIABLE IS SOMETIMES NOT STRICTLY RELATED TO THE VALUE IT ASSUMES

CONFERENCE ?


24. ### DIFFERENT TYPES OF FREQUENCY

- ABSOLUTE FREQUENCY (ni): number of observations for each “OBSERVATIONAL UNIT”
- ABSOLUTE CUMULATIVE FREQUENCY (Ni): Ni = N(i-1) + ni
- RELATIVE FREQUENCY (fi): number of observations for each “OBSERVATIONAL UNIT” divided by the total number of observations (N)
- RELATIVE CUMULATIVE FREQUENCY (Fi): Fi = F(i-1) + fi
- % FREQUENCY: fi * 100
- % CUMULATIVE FREQUENCY: Fi * 100
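The six frequencies can be computed in a few lines of pandas. This is a minimal sketch on invented data: the series `products` and the short column names are assumptions made for the example, not the talk's own table.

```python
import pandas as pd

# Hypothetical observations of a single variable; not the talk's dataset.
products = pd.Series(["A", "B", "A", "C", "A", "B", "C", "A"], name="product")

freq = products.value_counts().sort_index().to_frame("ni")  # absolute frequency
freq["Ni"] = freq["ni"].cumsum()                            # Ni = N(i-1) + ni
freq["fi"] = freq["ni"] / freq["ni"].sum()                  # fi = ni / N
freq["Fi"] = freq["fi"].cumsum()                            # Fi = F(i-1) + fi
freq["% fi"] = freq["fi"] * 100                             # % frequency
freq["% Fi"] = freq["Fi"] * 100                             # % cumulative frequency

print(freq)
```

The last value of `Ni` equals the total number of observations and the last value of `Fi` equals 1, which is a quick sanity check on any frequency table.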

28. ### 3 MAIN CONCEPTS

- OBSERVATIONAL UNITS: entities whose characteristics we measure or observe (ALIAS ROWS)
- VARIABLE: feature or characteristic of the OBSERVATIONAL UNITS (ALIAS COLUMNS)
- FREQUENCY: number of OBSERVATIONAL UNITS with the same value of a VARIABLE
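29.

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    # Jupyter/IPython magic; omit when running as a plain script
    %matplotlib inline

    # df is the talk's dataset; loading it is not shown in this transcript
    univariate = pd.DataFrame(df["Product (X1)"].value_counts())
    univariate.columns = ["Absolute Frequency (ni)"]
    univariate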

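35.

    # numeric_only=True keeps only numeric columns; pandas 2.x raises a
    # TypeError on df.mean() if df has non-numeric columns
    df.mean(numeric_only=True)

Output:

    Price (X3)     28.051205
    Margin (X5)    15.525602
    Stock (X6)     12.293333
    dtype: float64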


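46.

    # numeric_only=True keeps only numeric columns; pandas 2.x raises a
    # TypeError on df.median() if df has non-numeric columns
    df.median(numeric_only=True)

Output:

    Price (X3)     22.652655
    Margin (X5)    12.826328
    Stock (X6)     12.000000
    dtype: float64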
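The reported mean price (28.05) is noticeably higher than the median price (22.65). A gap in that direction often appears when a few large values pull the mean upwards while the median stays put. The sketch below shows the effect on invented numbers; the array `prices` is hypothetical and unrelated to the talk's data.

```python
import numpy as np

# Hypothetical prices, invented for illustration: four ordinary values and one outlier.
prices = np.array([10.0, 12.0, 13.0, 15.0, 100.0])

mean_price = prices.mean()        # sum of the values divided by their count
median_price = np.median(prices)  # middle value of the sorted data

print(mean_price, median_price)
```

Here the single outlier moves the mean to 30.0 while the median remains 13.0, which is why the median is usually described as the more robust of the two.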
48.

    univariate_stocks = pd.DataFrame(df["Stock (X6)"].value_counts())
    univariate_stocks = univariate_stocks.sort_index()
    univariate_stocks.columns = ["Absolute Frequency (ni)"]
    univariate_stocks["Relative Frequency (fi)"] = (
        univariate_stocks["Absolute Frequency (ni)"]
        / univariate_stocks["Absolute Frequency (ni)"].sum()
    )
    univariate_stocks['Relative Cumulative Frequency (Fi)'] = (
        univariate_stocks['Relative Frequency (fi)'].cumsum()
    )
    univariate_stocks
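A frequency table of this shape can also be used to read off the mode and the median. The following is a minimal sketch under stated assumptions: the series `stock` is invented, the short column names are illustrative, and the median rule used here (first value whose cumulative relative frequency reaches 0.5) returns the lower median when the number of observations is even.

```python
import pandas as pd

# Hypothetical stock levels; not the talk's dataset.
stock = pd.Series([10, 12, 12, 12, 13, 15, 15, 11, 12, 13])

table = stock.value_counts().sort_index().to_frame("ni")
table["fi"] = table["ni"] / table["ni"].sum()
table["Fi"] = table["fi"].cumsum()

# Mode: the value with the highest absolute frequency.
mode_value = table["ni"].idxmax()
# Median (lower median): first value whose cumulative relative frequency reaches 0.5.
median_from_table = table.index[table["Fi"] >= 0.5][0]

print(table)
print(mode_value, median_from_table, stock.median())
```

On this toy data both approaches agree with `stock.median()`; when `Fi` lands exactly on 0.5 the conventional median averages two neighbouring values, so the two results can differ.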

STATISTICS

59. ### LIKE YOU ARE 3 STD FROM THE MEAN (NERDY WAY TO SAY YOU ARE UNIQUE)
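Being "3 STD from the mean" can be made concrete with a z-score, the distance from the mean measured in standard deviations. The sketch below is only an illustration: the array `values` is invented, and the population standard deviation (`ddof=0`) is an assumption, since the transcript does not say which variant the talk used.

```python
import numpy as np

# Hypothetical measurements: nineteen identical values and one far-away value.
values = np.array([10.0] * 19 + [50.0])

mean = values.mean()
std = values.std()                 # population standard deviation (ddof=0)
z_scores = (values - mean) / std   # distance from the mean in units of std
unusual = values[np.abs(z_scores) > 3]

print(mean, std, unusual)
```

Only the value 50.0 lies more than three standard deviations from the mean of 12.0, so it is the one "unique" observation in this toy sample.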

71. ### AT THE END WE HAVE 13 CONDITIONED MEANS/VARIANCES AND 2 MARGINAL MEANS/VARIANCES
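Conditioned and marginal statistics can be sketched with a pandas `groupby`. The transcript does not show which variables the talk conditions on, so the table `toy`, its columns `product` and `price`, and the use of population variance (`ddof=0`) are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical bivariate data; column names are illustrative only.
toy = pd.DataFrame({
    "product": ["A", "A", "B", "B", "B", "C"],
    "price": [10.0, 14.0, 20.0, 22.0, 24.0, 30.0],
})

# Conditioned mean/variance: price statistics within each value of product.
conditioned = toy.groupby("product")["price"].agg(
    mean="mean",
    variance=lambda s: s.var(ddof=0),  # population variance
)

# Marginal mean/variance: price statistics ignoring product altogether.
marginal_mean = toy["price"].mean()
marginal_variance = toy["price"].var(ddof=0)

print(conditioned)
print(marginal_mean, marginal_variance)
```

Each row of `conditioned` is one conditioned mean/variance pair, so a conditioning variable with k distinct values yields k such pairs alongside the marginal ones.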


TO 1