
SOC 4930 & SOC 5050 - Week 04

Lecture slides for Week 04 of the Saint Louis University Course Quantitative Analysis: Applied Inferential Statistics. These slides cover probability and provide a brief overview of Bayes' Theorem.

Christopher Prener

September 18, 2017

Transcript

  1. WELCOME! GETTING STARTED There is an entry ticket to complete

    (link posted in Slack’s #_news channel)
  2. AGENDA QUANTITATIVE ANALYSIS / WEEK 04 / LECTURE 04 1.

    Front Matter 2. Recap and Review 3. Probability Basics 4. Probability Operations 5. Bayes’ Theorem 6. Back Matter
  3. ⋆ THEME We want to think systematically about the

    likelihood of observing particular outcomes.
  4. Make sure you have completed Vignette 2, and are starting

    Vignettes 3 (if applicable) and 4; please open an Issue in your project repo with an update on your progress next Monday. Lab 03 and Lecture Prep 05 are due before the next lecture. There are no new R functions this week. That does not mean you should take a break - review, ask questions, and get ready for next week! 1. FRONT MATTER ANNOUNCEMENTS
  5. RESOURCES REMINDERS ▸ Course website pages: • Link to specific

    resources on GitHub (including replications) • Link to topic index entries that allow you to see all weeks in which specific topics were covered; package index links to documentation • Link to syllabus and lecture recordings ▸ Make sure you’re checking in with the #_news channel on Slack ▸ Post questions in #helpdesk… channels on Slack and celebrate victories in #weekly-wins • Important threads are being catalogued on the lecture webpages 1. FRONT MATTER
  6. 2. RECAP AND REVIEW GIT WORKFLOW Local repos can stay

    in sync with a remote repo, making backup and sharing easy
  7. 2. RECAP AND REVIEW GIT WORKFLOW Copying data for the

    first time from GitHub is called making a clone.
  8. REPREX REDUX > library(dplyr) > library(testDriveR) > > ex <-

    auto17 > > exSubset <- select(id, mfr) Error in UseMethod("select_") : no applicable method for 'select_' applied to an object of class "function" 2. RECAP AND REVIEW
  9. REPREX REDUX > library(dplyr) > library(testDriveR) > > ex <-

    auto17 > > exSubset <- select(id, mfr) Error in UseMethod("select_") : no applicable method for 'select_' applied to an object of class "function" > > exSubset <- select(id) Error in select(id) : object 'id' not found 2. RECAP AND REVIEW
  10. REPREX REDUX > library(dplyr) > > ex <- starwars >

    > exSubset <- select(name) Error in select(name) : object 'name' not found 2. RECAP AND REVIEW
  11. 2. RECAP AND REVIEW DPLYR VERBS

    verb()      purpose        example
    rename()    Rename vars    x <- rename(auto17, ID = id)
    arrange()   Reorder obs    x <- arrange(auto17, hwyFE); y <- arrange(auto17, desc(hwyFE))
    filter()    Subset obs     x <- filter(auto17, hwyFE > 25)
    select()    Subset vars    x <- select(auto17, id, hwyFE, cityFE)
    mutate()    Modify vars    x <- mutate(auto17, hiCty = ifelse(cityFE > 25, TRUE, FALSE))
  12. COMBINING WITH PIPES > auto17 %>% > rename(ID = id)

    %>% > arrange(hwyFE) %>% > filter(hwyFE > 25) %>% > select(ID, hwyFE, cityFE) %>% > mutate(hiCty = + ifelse(cityFE > 25, TRUE, FALSE)) -> hiHwy 2. RECAP AND REVIEW
  13. COMBINING WITH PIPES > auto17 %>% > rename(ID = id)

    %>% > arrange(hwyFE) %>% > filter(hwyFE > 25) %>% > select(ID, hwyFE, cityFE) %>% > mutate(hiCty = + ifelse(cityFE > 25, TRUE, FALSE)) -> hiHwy 2. RECAP AND REVIEW
    We take the auto17 data, then
    we rename the id variable to “ID”, then
    we sort the data in ascending order by highway fuel efficiency (FE), then
    we subset observations, retaining only those where highway FE is greater than 25, then
    we subset columns, retaining only the renamed “ID” plus highway and city FE, then
    we create a new logical variable for efficient city vehicles, and assign changes to hiHwy.
  14. LOGICAL & RELATIONAL OPERATORS > autoSubset <- filter(auto17, hwyFE <=

    20 | hwyFE >= 30) > > autoSubset2 <- filter(auto17, hwyFE >= 30 & cityFE >= 30) 2. RECAP AND REVIEW With |, we can meet either condition (good for two conditions on the same variable). With &, we must meet both conditions (good for two conditions on different variables).
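A quick way to see the difference between | and & is to apply both to a small vector before using them inside filter(). This is a minimal sketch in base R; the values in hwyFE are made up for illustration and are not taken from auto17.

```r
# Hypothetical highway fuel economy values (not from auto17)
hwyFE <- c(15, 22, 28, 35)

# | is TRUE when at least one condition holds
either <- hwyFE <= 20 | hwyFE >= 30

# & is TRUE only when both conditions hold
both <- hwyFE >= 20 & hwyFE <= 30

hwyFE[either]   # values at the low or high end
hwyFE[both]     # values inside the 20 to 30 range
```

Note that a pair of conditions that every value satisfies at least one of (for example, "at least 20" or "at most 30") keeps every non-missing observation when joined with |, so it is worth checking the logical vector before subsetting.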
  15. IN THE LONG RUN, WE ARE ALL DEAD. John

    Maynard Keynes (1883-1946)
  16. THE THEORY OF PROBABILITIES IS AT BOTTOM NOTHING BUT COMMON

    SENSE REDUCED TO CALCULUS; IT ENABLES US TO APPRECIATE WITH EXACTNESS THAT WHICH ACCURATE MINDS FEEL WITH A SORT OF INSTINCT FOR WHICH OFTTIMES THEY ARE UNABLE TO ACCOUNT. Pierre-Simon Laplace (1749-1827)
  17. ▸ A trial where there are only two outcomes -

    “success” and “failure” ▸ For a fair coin flip, over the long run (law of large numbers), about 50% of trials are a “success” and about 50% are a “failure” 3. PROBABILITY BASICS BERNOULLI TRIAL
  18. ▸ A trial where there are only two outcomes -

    “success” and “failure” ▸ For a fair coin flip, over the long run (law of large numbers), about 50% of trials are a “success” and about 50% are a “failure” ▸ Jacob Bernoulli was a Swiss mathematician and professor at the University of Basel 3. PROBABILITY BASICS BERNOULLI TRIAL 1654-1705
  19. ▸ A trial where there are only two outcomes -

    “success” and “failure” ▸ For a fair coin flip, over the long run (law of large numbers), about 50% of trials are a “success” and about 50% are a “failure” ▸ The more trials we have, the closer we get to the expected value ▸ Demonstrated by John Edmund Kerrich in the 1940s 3. PROBABILITY BASICS BERNOULLI TRIAL
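The law of large numbers can be seen in a short simulation. This is a minimal sketch, assuming a fair coin; the seed and the number of flips are arbitrary choices for illustration.

```r
# Simulate fair coin flips and track the running share of "successes"
set.seed(4930)                               # arbitrary seed so the run is repeatable
n <- 10000                                   # number of flips (assumed for this sketch)
flips <- rbinom(n, size = 1, prob = 0.5)     # 1 = "success", 0 = "failure"
runningShare <- cumsum(flips) / seq_len(n)   # share of successes after each flip

# Share of successes after 10, 100, 1,000, and 10,000 flips
runningShare[c(10, 100, 1000, n)]
```

Early values can sit well away from 0.5; the later ones should settle close to it.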
  20. ▸ Demonstrated the probabilities of things like rolling dice and

    basic sampling strategies ▸ Along with Bernoulli, was instrumental in identifying the mathematical underpinning of “random” (subject to chance) processes ▸ French mathematician 3. PROBABILITY BASICS ABRAHAM DE MOIVRE 1667-1754
  21. UNLIKELY THINGS HAPPEN. IN FACT, OVER A LONG ENOUGH PERIOD

    OF TIME, THEY ARE NOT EVEN THAT UNLIKELY. PEOPLE GET HIT BY LIGHTNING ALL THE TIME. Charles Wheelan, Naked Statistics (p. 99)
  22. VERY LITTLE ATTENTION WAS DEVOTED TO… THE SMALL RISK OF

    SOME CATASTROPHIC OUTCOME…IF YOU DRIVE HOME FROM A BAR WITH A BLOOD ALCOHOL LEVEL OF .15, THERE IS PROBABLY LESS THAN A 1 PERCENT CHANCE YOU WILL CRASH AND DIE; THAT DOES NOT MAKE IT A SENSIBLE THING TO DO. Charles Wheelan, Naked Statistics (p. 98)
  23. ▸ Assuming events are independent when they aren’t ▸ Example

    - SIDS and Meadow’s Law, see Wheelan pp. 100-102 3. PROBABILITY BASICS COMMON PITFALLS
  24. 3. PROBABILITY BASICS COMMON PITFALLS ▸ Not understanding when events

    are independent ▸ Examples - free throws and the “hot hand” (see Wheelan pp. 102-103)
  25. 3. PROBABILITY BASICS COMMON PITFALLS ▸ Clusters do happen ▸

    Examples - getting multiple heads in a row (see Wheelan pp. 103-104)
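Clusters are easy to produce by chance alone. The sketch below simulates 100 fair coin flips and finds the longest streak of heads; the seed and the number of flips are arbitrary assumptions, and the streak length will differ from run to run if the seed changes.

```r
# Simulate 100 fair coin flips
set.seed(5050)                                   # arbitrary seed
flips <- sample(c("H", "T"), size = 100, replace = TRUE)

# rle() collapses the sequence into runs of identical outcomes
runs <- rle(flips)

# Longest run of consecutive heads (0 if no heads at all)
longestHeads <- max(c(0, runs$lengths[runs$values == "H"]))
longestHeads
```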
  26. 3. PROBABILITY BASICS COMMON PITFALLS ▸ The Prosecutor’s fallacy ▸

    Examples - DNA databases (see Wheelan pp. 104-105)
  27. ▸ “A intersect B” … ▸ …means “both A and

    B” 4. PROBABILITY OPERATIONS INTERSECT A ∩ B A = “roll a 1” B = “roll a 3” A ∩ B = “roll a 1 and a 3”
  28. 4. PROBABILITY OPERATIONS INTERSECT A ∩ B A ∩ B

    A B S ▸ “A intersect B” … ▸ …means “both A and B”
  29. 4. PROBABILITY OPERATIONS UNION A ∪ B A = “roll

    a 2” B = “roll a 10” A ∪ B = “roll a 2 or a 10” ▸ “A union B” … ▸ …means “either A or B or both”
  30. 4. PROBABILITY OPERATIONS UNION A ∪ B A ∪ B

    A B S ▸ “A union B” … ▸ …means “either A or B or both”
  31. 4. PROBABILITY OPERATIONS COMPLEMENT Ac A = “roll a 1”

    Ac = “roll anything else” ▸ “A complement” … ▸ …means “not A”
  32. 4. PROBABILITY OPERATIONS COMPLEMENT Ac A Ac S ▸ “A

    complement” … ▸ …means “not A”
  33. ▸ A null event ▸ “Cannot happen” or “contradiction” 4.

    PROBABILITY OPERATIONS MUTUALLY EXCLUSIVE EVENTS ∅ A = “roll a 1” Ac = “roll anything else” A ∩ Ac = ∅
  34. MUTUALLY EXCLUSIVE EVENTS 4. PROBABILITY OPERATIONS A ∩ B =

    ∅ A B S ▸ A null event ▸ “Cannot happen” or “contradiction” ∅
  35. 4. PROBABILITY OPERATIONS MUTUALLY EXCLUSIVE EVENTS A ∩ B =

    ∅ A B S C A ∩ C = ∅ B ∩ C = ∅
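The set operations on the preceding slides map onto base R functions. This is a minimal sketch for a single roll of a six-sided die; the events A (even number) and B (4 or less) are borrowed from the die example used later in the deck.

```r
S <- 1:6             # sample space for one roll of a six-sided die
A <- c(2, 4, 6)      # event: roll an even number
B <- c(1, 2, 3, 4)   # event: roll a 4 or less

intersect(A, B)      # A ∩ B: outcomes in both A and B
union(A, B)          # A ∪ B: outcomes in A or B or both
Ac <- setdiff(S, A)  # complement of A: outcomes in S that are not in A
Ac
intersect(A, Ac)     # A ∩ Ac is empty, so A and Ac are mutually exclusive
```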
  36. 4. PROBABILITY OPERATIONS DEFINITION OF PROBABILITY P(A) = m/n

    If an experiment is repeated n times under essentially identical conditions and the event A occurs m times, then as n gets large the ratio (m/n) approaches the probability of A.
  37. 4. PROBABILITY OPERATIONS DEFINITION OF PROBABILITY

    1. P(impossible) = 0/n = 0 & P(sure thing) = n/n = 1
    2. For any event A:
      a. m ≤ n
      b. 0 ≤ P(A) ≤ 1
  38. 4. PROBABILITY OPERATIONS DEFINITION OF PROBABILITY

    3. Complement:
      a. P(A) = m/n
      b. P(Ac) = (n − m)/n = 1 − P(A)
      c. P(A) + P(Ac) = 1
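The relative-frequency definition and the complement rule can be checked with simulated die rolls. This is a sketch under stated assumptions: a fair six-sided die, an arbitrary seed, and A = “roll a 1” as on the complement slide.

```r
set.seed(4)                                   # arbitrary seed
n <- 60000                                    # number of rolls (assumed for this sketch)
rolls <- sample(1:6, size = n, replace = TRUE)

m   <- sum(rolls == 1)   # number of times A = "roll a 1" occurs
pA  <- m / n             # P(A) = m/n
pAc <- (n - m) / n       # P(Ac) = (n - m)/n = 1 - P(A)

c(pA = pA, pAc = pAc, exact = 1 / 6)
```

With a large n, pA should land close to 1/6, and pA + pAc equals 1 by construction.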
  39. 4. PROBABILITY OPERATIONS MUTUALLY EXCLUSIVE EVENTS ▸ If events A

    and B are mutually exclusive… ▸ …and we want to know the probability of A or B occurring (A “union” B): A ∩ B = ∅ A B S P(A ∪ B) = P(A) + P(B)
  40. ▸ If events A and B are not mutually exclusive…

    ▸ …and we want to know the probability of A or B occurring (A “union” B): 4. PROBABILITY OPERATIONS ADDITIVE LAW P(A ∪ B) = P(A) + P(B) - P(A ∩ B) A ∪ B A B S
  41. A = roll an even number with a fair die

    B = on a particular roll, the result is ≤ 4 ▸ If events A and B are not mutually exclusive… ▸ …and we want to know the probability of A or B occurring (A “union” B): 4. PROBABILITY OPERATIONS ADDITIVE LAW P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
  42. A = roll an even number with a fair die

    B = on a particular roll, the result is ≤ 4 4. PROBABILITY OPERATIONS ADDITIVE LAW A B S 1 3 5 6 4 2 Solving for A or B
  43. 4. PROBABILITY OPERATIONS ADDITIVE LAW The probability of A (an

    even number) or B (a roll ≤ 4) is 5/6 or .833. A B S 1 3 5 6 4 2 Solving for A or B
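The additive law can be verified by counting outcomes. This is a minimal sketch, assuming a fair six-sided die so that every outcome is equally likely; the helper p() is a hypothetical name introduced only for this example.

```r
S <- 1:6            # sample space for one roll
A <- c(2, 4, 6)     # roll an even number
B <- 1:4            # roll a 4 or less

# Hypothetical helper: probability of an event when outcomes are equally likely
p <- function(event) length(event) / length(S)

# Additive law: subtract the overlap so 2 and 4 are not counted twice
pUnionLaw <- p(A) + p(B) - p(intersect(A, B))

# Direct count of the outcomes in A or B
pUnionCount <- p(union(A, B))

c(law = pUnionLaw, count = pUnionCount)
```

Both routes give 5/6, roughly .833.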
  44. 4. PROBABILITY OPERATIONS CONDITIONAL PROBABILITY ▸ “A given B” …

    ▸ …means “the probability of A happening if B occurs” P(A|B) S B A
  45. 4. PROBABILITY OPERATIONS CONDITIONAL PROBABILITY ▸ “A given B” …

    ▸ …means “the probability of A happening if B occurs” P(A|B) S A B
  46. 4. PROBABILITY OPERATIONS CONDITIONAL PROBABILITY ▸ “A given B” …

    ▸ …means “the probability of A happening if B occurs” P(A|B) P(A | B) = the probability of rolling an even number given the roll is ≤ 4 A = roll an even number with a fair die B = on a particular roll, the result is ≤ 4
  47. 4. PROBABILITY OPERATIONS CONDITIONAL PROBABILITY A B S 1 3

    5 6 4 2 A = roll an even number with a fair die B = on a particular roll, the result is ≤ 4 Solving for the intersect given B
  48. 4. PROBABILITY OPERATIONS CONDITIONAL PROBABILITY A B S 1 3

    5 6 4 2 Solving for the intersect given B
  49. 4. PROBABILITY OPERATIONS CONDITIONAL PROBABILITY A B S 1 3

    5 6 4 2 Solving for the intersect given B
  50. 4. PROBABILITY OPERATIONS CONDITIONAL PROBABILITY A B S 1 3

    5 6 4 2 Solving for the intersect given B
  51. 4. PROBABILITY OPERATIONS CONDITIONAL PROBABILITY P(A | B) = the

    probability of rolling an even number given the roll is ≤ 4 is 2/4 or .5. A B S 1 3 5 6 4 2 Solving for the intersect given B
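Conditional probability restricts attention to the outcomes in B. The sketch below assumes a fair six-sided die and uses the standard definition P(A|B) = P(A ∩ B) / P(B); the helper p() is a hypothetical name introduced for this example.

```r
S <- 1:6            # sample space for one roll
A <- c(2, 4, 6)     # roll an even number
B <- 1:4            # roll a 4 or less

# Hypothetical helper: probability of an event when outcomes are equally likely
p <- function(event) length(event) / length(S)

# Definition: share of B's probability that also belongs to A
pAgivenB <- p(intersect(A, B)) / p(B)

# Same answer by counting only within B: 2 of the 4 outcomes in B are even
pAgivenBCount <- mean(B %in% A)

c(definition = pAgivenB, count = pAgivenBCount)
```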
  52. 4. PROBABILITY OPERATIONS MULTIPLICATIVE LAW A ∩ B A B

    S ▸ If events A and B are not mutually exclusive… ▸ …and we want to know the probability of both A and B occurring (A “intersect” B): P(A ∩ B) = P(B) * P(A|B)
  53. 4. PROBABILITY OPERATIONS MULTIPLICATIVE LAW ▸ If events A and

    B are not mutually exclusive… ▸ …and we want to know the probability of both A and B occurring (A “intersect” B): P(A ∩ B) = P(B) * P(A|B) A = roll an even number with a fair die B = on a particular roll, the result is ≤ 4 P(A ∩ B) = the probability of rolling an even number and the roll is ≤ 4
  54. 4. PROBABILITY OPERATIONS MULTIPLICATIVE LAW A B S 1 3

    5 6 4 2 A = roll an even number with a fair die B = on a particular roll, the result is ≤ 4 Solving for the intersect
  55. 4. PROBABILITY OPERATIONS MULTIPLICATIVE LAW A B S 1 3

    5 6 4 2 Solving for the intersect (a)
  56. 4. PROBABILITY OPERATIONS MULTIPLICATIVE LAW A B S 1 3

    5 6 4 2 Solving for the intersect (a) (b)
  57. 4. PROBABILITY OPERATIONS MULTIPLICATIVE LAW A B S 1 3

    5 6 4 2 Solving for the intersect (a) (b)
  58. 4. PROBABILITY OPERATIONS MULTIPLICATIVE LAW A B S 1 3

    5 6 4 2 Solving for the intersect P(A ∩ B) = the probability of rolling an even number and the roll is ≤ 4 is 2/6 or .333.
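The multiplicative law can be checked against a direct count of the intersection. This is a minimal sketch, assuming a fair six-sided die; p() is a hypothetical helper, and P(A|B) is computed by counting within B so the check is not circular.

```r
S <- 1:6            # sample space for one roll
A <- c(2, 4, 6)     # roll an even number
B <- 1:4            # roll a 4 or less

# Hypothetical helper: probability of an event when outcomes are equally likely
p <- function(event) length(event) / length(S)

pB       <- p(B)            # (a) P(B) = 4/6
pAgivenB <- mean(B %in% A)  # (b) P(A|B) = 2/4, counted within B

# Multiplicative law: P(A ∩ B) = P(B) * P(A|B)
pBothLaw <- pB * pAgivenB

# Direct count of outcomes that are both even and 4 or less
pBothCount <- p(intersect(A, B))

c(law = pBothLaw, count = pBothCount)
```

Both routes give 2/6, roughly .333.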
  59. ▸ If events A and B are independent, the probability

    of the intersection of A and B equals the product of the probabilities of A and B 4. PROBABILITY OPERATIONS INDEPENDENCE P(A ∩ B) = P(A) * P(B) A B S This is known as the joint probability
  60. ▸ If events A and B are independent, the probability

    of the intersection of A and B equals the product of the probabilities of A and B 4. PROBABILITY OPERATIONS INDEPENDENCE P(A ∩ B) = P(A) * P(B) A = roll an even number with a fair die B = on a particular roll, the result is ≤ 4 Are events A and B independent? I.e. does the likelihood of A occurring impact the likelihood of B occurring?
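One way to answer the question is to compare P(A ∩ B) with P(A) * P(B). The sketch below assumes a fair six-sided die; p() and isIndependent() are hypothetical helpers, and event C (a roll of 3 or less) is not in the slides and is added only as a contrast.

```r
S <- 1:6            # sample space for one roll
A <- c(2, 4, 6)     # roll an even number
B <- 1:4            # roll a 4 or less
C <- 1:3            # hypothetical contrast event: roll a 3 or less

# Hypothetical helper: probability of an event when outcomes are equally likely
p <- function(event) length(event) / length(S)

# Hypothetical helper: TRUE when P(X ∩ Y) equals P(X) * P(Y)
isIndependent <- function(X, Y) {
  isTRUE(all.equal(p(intersect(X, Y)), p(X) * p(Y)))
}

isIndependent(A, B)   # 2/6 versus (3/6) * (4/6)
isIndependent(A, C)   # 1/6 versus (3/6) * (3/6)
```

For A and B the two quantities match, so knowing the roll is ≤ 4 does not change the chance of an even number; for A and C they do not match.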
  61. 5. BAYES’ THEOREM BAYES THEOREM ▸ P(A) = your prior

    and P(B) = another event ▸ P(B|A) = probability of event given your prior ▸ P(A|B) = posterior probability - degree of belief accounting for B
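The terms on this slide fit together through the standard statement of Bayes' theorem, P(A|B) = P(B|A) * P(A) / P(B). The sketch below applies it to the die events from the previous section; bayes() is a hypothetical helper name, and the fair die is an assumption.

```r
# Hypothetical helper: posterior P(A|B) from the prior P(A), P(B|A), and P(B)
bayes <- function(pA, pBgivenA, pB) {
  pBgivenA * pA / pB
}

A <- c(2, 4, 6)     # roll an even number
B <- 1:4            # roll a 4 or less

pA       <- length(A) / 6   # prior: 3/6
pB       <- length(B) / 6   # 4/6
pBgivenA <- mean(A %in% B)  # 2 of the 3 even outcomes are 4 or less

posterior <- bayes(pA, pBgivenA, pB)
posterior
```

The result agrees with the conditional probability found by counting: 2 of the 4 outcomes in B are even.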
  62. 5. BAYES’ THEOREM BAYES THEOREM SIMPLIFIED ▸ x = your

    prior ▸ y = probability of event given that the hypothesis is true ▸ z = probability of event given that the hypothesis is false posterior probability
  63. 5. BAYES’ THEOREM BAYES THEOREM SIMPLIFIED Prior Probability Initial estimate

    of how likely it is that terrorists would crash planes into Manhattan skyscrapers. x 0.005% A Event Occurs: First Plane Strike Probability of plane hitting if terrorists are attacking Manhattan skyscrapers. y 100% Probability of plane hitting if terrorists are not attacking Manhattan skyscrapers. z 0.008% Posterior Probability Revised estimate of probability of terror attack, given first plane hitting the World Trade Center. 38%
  64. 5. BAYES’ THEOREM BAYES THEOREM SIMPLIFIED Prior Probability Initial estimate

    of how likely it is that terrorists would crash planes into Manhattan skyscrapers. x 0.005% A Event Occurs: First Plane Strike Probability of plane hitting if terrorists are attacking Manhattan skyscrapers. y 100% Probability of plane hitting if terrorists are not attacking Manhattan skyscrapers. z 0.008% Posterior Probability Revised estimate of probability of terror attack, given first plane hitting the World Trade Center. 38%
  65. 5. BAYES’ THEOREM BAYES THEOREM SIMPLIFIED Prior Probability Initial estimate

    of how likely it is that terrorists would crash planes into Manhattan skyscrapers. x 0.005% A Event Occurs: First Plane Strike Probability of plane hitting if terrorists are attacking Manhattan skyscrapers. y 100% Probability of plane hitting if terrorists are not attacking Manhattan skyscrapers. z 0.008% Posterior Probability Revised estimate of probability of terror attack, given first plane hitting the World Trade Center. 38%
  66. 5. BAYES’ THEOREM BAYES THEOREM SIMPLIFIED Prior Probability Updated estimate

    of terrorists crashing planes into skyscrapers given one crash has occurred. x 38% A Event Occurs: Second Plane Strike Probability of plane hitting if terrorists are attacking Manhattan skyscrapers. y 100% Probability of plane hitting if terrorists are not attacking Manhattan skyscrapers. z 0.008% Posterior Probability Revised estimate of probability of terror attack, given second plane hitting the World Trade Center. 99.99%
  67. 5. BAYES’ THEOREM BAYES THEOREM SIMPLIFIED Prior Probability Updated estimate

    of terrorists crashing planes into skyscrapers given one crash has occurred. x 38% A Event Occurs: Second Plane Strike Probability of plane hitting if terrorists are attacking Manhattan skyscrapers. y 100% Probability of plane hitting if terrorists are not attacking Manhattan skyscrapers. z 0.008% Posterior Probability Revised estimate of probability of terror attack, given second plane hitting the World Trade Center. 99.99%
  68. 5. BAYES’ THEOREM BAYES THEOREM SIMPLIFIED Prior Probability Updated estimate

    of terrorists crashing planes into skyscrapers given one crash has occurred. x 38% A Event Occurs: Second Plane Strike Probability of plane hitting if terrorists are attacking Manhattan skyscrapers. y 100% Probability of plane hitting if terrorists are not attacking Manhattan skyscrapers. z 0.008% Posterior Probability Revised estimate of probability of terror attack, given second plane hitting the World Trade Center. 99.99%
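The two updates can be reproduced in a few lines of R. The transcript does not include the formula itself, so this sketch assumes the simplified form is posterior = xy / (xy + z(1 - x)), which is consistent with the 38% and 99.99% figures on the slides; simplePosterior() is a hypothetical name.

```r
# Assumed simplified form of Bayes' theorem:
#   x = prior, y = P(event | hypothesis true), z = P(event | hypothesis false)
simplePosterior <- function(x, y, z) {
  (x * y) / (x * y + z * (1 - x))
}

# First plane strike: prior of 0.005%, y = 100%, z = 0.008%
first <- simplePosterior(x = 0.00005, y = 1, z = 0.00008)

# Second plane strike: the first posterior becomes the new prior
second <- simplePosterior(x = first, y = 1, z = 0.00008)

round(100 * c(first = first, second = second), 2)   # as percentages
```

Passing the first posterior in as the second prior is the key step: each new event updates the belief left by the previous one.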
  69. AGENDA REVIEW 6. BACK MATTER 2. Recap and Review 3.

    Probability Basics 4. Probability Operations 5. Bayes’ Theorem
  70. REMINDERS 6. BACK MATTER Make sure you have completed Vignette

    2, and are starting Vignettes 3 (if applicable) and 4; please open an Issue in your project repo with an update on your progress next Monday. Lab 03 and Lecture Prep 05 are due before the next lecture. There are no new R functions this week. That does not mean you should take a break - review, ask questions, and get ready for next week!