Lecture slides for Week 04 of the Saint Louis University Course Quantitative Analysis: Applied Inferential Statistics. These slides cover probability and provide a brief overview of Bayes' Theorem.
Vignettes 3 (if applicable) and 4; please open an Issue in your project repo with an update on your progress next Monday. Lab 03 and Lecture Prep 05 are due before the next lecture. There are no new R functions this week. That does not mean you should take a break: review, ask questions, and get ready for next week! 1. FRONT MATTER ANNOUNCEMENTS
resources on GitHub (including replications) • Link to topic index entries that allow you to see all weeks in which specific topics were covered; package index links to documentation • Link to syllabus and lecture recordings ▸ Make sure you’re checking in with the #_news channel on Slack ▸ Post questions in #helpdesk… channels on Slack and celebrate victories in #weekly-wins • Important threads are being catalogued on the lecture webpages 1. FRONT MATTER
auto017
> exSubset <- select(id, mfr)
Error in UseMethod("select_") : no applicable method for 'select_' applied to an object of class "function"
> exSubset <- select(id)
Error in select(id) : object 'id' not found
2. RECAP AND REVIEW
Rename vars | x <- rename(auto17, ID = id)
arrange() | Reorder obs | x <- arrange(auto17, hwyFE); y <- arrange(auto17, desc(hwyFE))
filter() | Subset obs | x <- filter(auto17, hwyFE > 25)
select() | Subset vars | x <- select(auto17, id, hwyFE, cityFE)
mutate() | Modify vars | x <- mutate(auto17, hiCty = ifelse(cityFE > 25, TRUE, FALSE))
%>%
  arrange(hwyFE) %>%
  filter(hwyFE > 25) %>%
  select(ID, hwyFE, cityFE) %>%
  mutate(hiCty = ifelse(cityFE > 25, TRUE, FALSE)) -> hiHwy
2. RECAP AND REVIEW
We take the auto17 data, then we rename the id variable to “ID”, then we sort the data in ascending order by highway fuel efficiency (FE), then we subset observations, retaining only those where highway FE is greater than 25, then we subset columns, retaining only the renamed “ID” plus highway and city FE, then we create a new logical variable for efficient city vehicles, and assign the result to hiHwy. Inside a pipe the data frame is passed along automatically, so mutate() does not name auto17 again.
20 | hwyFE <= 30)
> autoSubset2 <- filter(auto17, hwyFE >= 30 & cityFE >= 30)
2. RECAP AND REVIEW
With |, we can meet either condition (good for two conditions on the same variable). With &, we must meet both conditions (good for two conditions on different variables).
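The difference between | and & can be seen on a small example. This is a minimal sketch in base R, assuming a hypothetical data frame named fe with made-up values (it is not the auto17 data); the same logical expressions can be passed to filter().

```r
# Hypothetical fuel-efficiency values for illustration only
fe <- data.frame(
  hwyFE  = c(18, 22, 28, 31, 35),
  cityFE = c(15, 19, 24, 30, 33)
)

# | is TRUE for a row when either condition is TRUE
either <- fe$hwyFE <= 20 | fe$hwyFE >= 30

# & is TRUE for a row only when both conditions are TRUE
both <- fe$hwyFE >= 30 & fe$cityFE >= 30

fe[either, ]  # rows with low or high highway FE
fe[both, ]    # rows with high highway FE and high city FE
```

Two conditions on the same variable joined with & can only be met together if their ranges overlap, which is why | is the usual choice for picking out both tails of one variable.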
SENSE REDUCED TO CALCULUS; IT ENABLES US TO APPRECIATE WITH EXACTNESS THAT WHICH ACCURATE MINDS FEEL WITH A SORT OF INSTINCT FOR WHICH OFTTIMES THEY ARE UNABLE TO ACCOUNT. Pierre-Simon Laplace (1749-1827)
“success” and “failure” ▸ Over the long run (law of large numbers), there is a 50% chance of “success” and a 50% chance of “failure” ▸ The more trials we have, the closer we get to the expected value ▸ Demonstrated by John Edmund Kerrich in the 1940s ▸ Jacob Bernoulli was a Swiss mathematician and professor at the University of Basel 3. PROBABILITY BASICS BERNOULLI TRIAL 1654-1705
basic sampling strategies ▸ Along with Bernoulli, was instrumental in identifying the mathematical underpinning of “random” (subject to chance) processes ▸ French mathematician 3. PROBABILITY BASICS ABRAHAM DE MOIVRE 1667-1754
SOME CATASTROPHIC OUTCOME…IF YOU DRIVE HOME FROM A BAR WITH A BLOOD ALCOHOL LEVEL OF .15, THERE IS PROBABLY LESS THAN A 1 PERCENT CHANCE YOU WILL CRASH AND DIE; THAT DOES NOT MAKE IT A SENSIBLE THING TO DO. Charles Wheelan Naked Statistics (p. 98)
If an experiment is repeated n times under essentially identical conditions and the event A occurs m times, then as n gets large the ratio (m/n) approaches the probability of A.
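The definition above can be watched in action with a simulation. This is a minimal sketch, assuming a fair coin where “success” has probability 0.5; the seed, the number of trials, and the object names are arbitrary choices made for illustration.

```r
set.seed(4)  # arbitrary seed so the run is reproducible

n <- 100000

# Each trial is one flip of a fair coin: 1 = "success", 0 = "failure"
flips <- rbinom(n, size = 1, prob = 0.5)

# m = running count of successes, so ratio is m/n after each trial
m <- cumsum(flips)
ratio <- m / seq_len(n)

# The ratio at a few checkpoints; it should settle near 0.5 as n grows
ratio[c(10, 100, 1000, 10000, 100000)]
```

Early checkpoints can sit some distance from 0.5; only the long-run ratio is expected to be close.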
and B are mutually exclusive…
▸ …and we want to know the probability of A or B occurring (A “union” B):
A ∩ B = ∅
P(A ∪ B) = P(A) + P(B)
▸ If events A and B are not mutually exclusive…
▸ …and we want to know the probability of A or B occurring (A “union” B):
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
A = roll an even number with a fair die
B = on a particular roll, the result is ≤ 4
4. PROBABILITY OPERATIONS ADDITIVE LAW
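The additive law can be checked for this example by listing the outcomes. This is a minimal sketch, assuming one roll of a fair six-sided die so that every outcome is equally likely; the helper p() is a hypothetical name introduced here.

```r
S <- 1:6               # sample space for one roll
A <- S[S %% 2 == 0]    # even rolls: 2, 4, 6
B <- S[S <= 4]         # rolls of 4 or less: 1, 2, 3, 4

# With equally likely outcomes, P(E) = outcomes in E / outcomes in S
p <- function(E) length(E) / length(S)

# Additive law: subtract the overlap so 2 and 4 are not counted twice
pUnionLaw <- p(A) + p(B) - p(intersect(A, B))

# Direct count of the union for comparison
pUnionCount <- p(union(A, B))

c(law = pUnionLaw, count = pUnionCount)
```

Both routes give 5/6: the only outcome in neither event is a roll of 5.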
P(A|B)
▸ …means “the probability of A happening if B occurs”
A = roll an even number with a fair die
B = on a particular roll, the result is ≤ 4
P(A|B) = the probability of rolling an even number given the roll is ≤ 4
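For this die example the conditional probability can be worked out by hand. The sketch below assumes a fair six-sided die: B contains the outcomes {1, 2, 3, 4}, and two of those (2 and 4) are even.

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{2/6}{4/6} = \frac{2}{4} = \frac{1}{2}
```

Conditioning on B shrinks the sample space from six outcomes to the four in B, and then we ask what share of those four are also in A.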
▸ If events A and B are not mutually exclusive…
▸ …and we want to know the probability of both A and B occurring (A “intersect” B):
P(A ∩ B) = P(B) * P(A|B)
A = roll an even number with a fair die
B = on a particular roll, the result is ≤ 4
P(A ∩ B) = the probability of rolling an even number and the roll is ≤ 4
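The multiplicative law can be checked against a direct count of outcomes. This is a minimal sketch, assuming one roll of a fair six-sided die; the helper p() and the object names are hypothetical.

```r
S <- 1:6               # sample space for one roll
A <- S[S %% 2 == 0]    # even rolls: 2, 4, 6
B <- S[S <= 4]         # rolls of 4 or less: 1, 2, 3, 4

# With equally likely outcomes, P(E) = outcomes in E / outcomes in S
p <- function(E) length(E) / length(S)

pB <- p(B)                                       # 4/6
pAgivenB <- length(intersect(A, B)) / length(B)  # 2 of the 4 outcomes in B are even

# Multiplicative law
pBothLaw <- pB * pAgivenB

# Direct count of outcomes that are both even and <= 4
pBothCount <- p(intersect(A, B))

c(law = pBothLaw, count = pBothCount)
```

Both routes give 1/3, since only 2 and 4 satisfy both conditions.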
of the intersection of A and B equals the product of the probabilities of A and B
P(A ∩ B) = P(A) * P(B)
This is known as the joint probability
A = roll an even number with a fair die
B = on a particular roll, the result is ≤ 4
Are events A and B independent? I.e., does knowing that A occurred change the probability of B occurring?
4. PROBABILITY OPERATIONS INDEPENDENCE
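One way to answer the question is to compare P(A ∩ B) with P(A) * P(B) directly. This is a minimal sketch, assuming one roll of a fair six-sided die; the event C (a roll of 3 or less) is not from the slides and is added here only as a contrast.

```r
S <- 1:6               # sample space for one roll
A <- S[S %% 2 == 0]    # even rolls: 2, 4, 6
B <- S[S <= 4]         # rolls of 4 or less: 1, 2, 3, 4
C <- S[S <= 3]         # hypothetical contrast event: 1, 2, 3

# With equally likely outcomes, P(E) = outcomes in E / outcomes in S
p <- function(E) length(E) / length(S)

# Independent when the joint probability equals the product
isIndependent <- function(E1, E2) {
  isTRUE(all.equal(p(intersect(E1, E2)), p(E1) * p(E2)))
}

isIndependent(A, B)  # 2/6 compared with (3/6) * (4/6)
isIndependent(A, C)  # 1/6 compared with (3/6) * (3/6)
```

Under these assumptions A and B come out independent (half of all rolls are even, and half of the rolls in B are even), while A and C do not.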
of how likely it is that terrorists would crash planes into Manhattan skyscrapers. x = 0.005%
A Event Occurs: First Plane Strike
Probability of plane hitting if terrorists are attacking Manhattan skyscrapers. y = 100%
Probability of plane hitting if terrorists are not attacking Manhattan skyscrapers. z = 0.008%
Posterior Probability: Revised estimate of probability of terror attack, given first plane hitting the World Trade Center. 38%
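The 38% figure follows from Bayes' Theorem written in terms of the x, y, and z labels used on the slide. The worked equation below is a sketch that converts the percentages to proportions (x = 0.00005, y = 1, z = 0.00008).

```latex
P(\text{attack} \mid \text{strike})
  = \frac{xy}{xy + z(1 - x)}
  = \frac{0.00005 \times 1}{0.00005 \times 1 + 0.00008 \times 0.99995}
  \approx 0.385
```

The numerator is the probability of a strike that is part of an attack; the denominator adds the probability of a strike that is not, so the ratio is the share of all strikes explained by an attack.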
of terrorists crashing planes into skyscrapers given one crash has occurred. x = 38%
A Event Occurs: Second Plane Strike
Probability of plane hitting if terrorists are attacking Manhattan skyscrapers. y = 100%
Probability of plane hitting if terrorists are not attacking Manhattan skyscrapers. z = 0.008%
Posterior Probability: Revised estimate of probability of terror attack, given second plane hitting the World Trade Center. 99.99%
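The two updates can be chained, with the posterior from the first strike becoming the prior for the second. This is a minimal sketch using the x, y, and z values from the slides as proportions; the function name bayesUpdate() is hypothetical and not part of any package.

```r
# Posterior probability from Bayes' Theorem, using the slide's labels:
#   x = prior probability the hypothesis is true
#   y = probability of the event if the hypothesis is true
#   z = probability of the event if the hypothesis is false
bayesUpdate <- function(x, y, z) {
  (x * y) / (x * y + z * (1 - x))
}

prior <- 0.00005  # 0.005%
y     <- 1        # 100%
z     <- 0.00008  # 0.008%

# First plane strike: update the initial prior
afterFirst <- bayesUpdate(prior, y, z)

# Second plane strike: the first posterior is now the prior
afterSecond <- bayesUpdate(afterFirst, y, z)

# Posterior probabilities as percentages
round(100 * c(first = afterFirst, second = afterSecond), 2)
```

Carrying the unrounded first posterior forward rather than the rounded 38% changes the second result only far past the second decimal place.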