Class 2: Cancer and Causation

David Evans
January 17, 2019

https://uvammm.github.io/class2

Markets, Mechanisms, and Machines
University of Virginia
cs4501/econ4559 Spring 2019
David Evans and Denis Nekipelov
https://uvammm.github.io/

Transcript

  1. MARKETS, MECHANISMS, MACHINES

    University of Virginia, Spring 2019
    Class 2: Cancer and Causation
    cs4501/econ4559 Spring 2019
    David Evans and Denis Nekipelov
    https://uvammm.github.io
    17 January 2019
  2. Plan

    Course: Why I’m Teaching this Class
    Causation: Definitions, Correlation, Cancer
    Everyone who wants to take the class (including unregistered students) should have a teammate for Project 1. If you don’t, talk to us after class today.
  3. Why I’m Teaching this Class

  4. 1. Learn about Economics

  5. 4

  6. 5

  7. 6

  8. 2. Economics in Security and Privacy

  9. Mirai Botnet

  10. Mirai Botnet

    Mirai-Source-Code/mirai/bot/scanner.c
  11. Software Vulnerabilities as Externalities

    “According to one common view, information security comes down to technical measures. Given better access control policy models, formal proofs of cryptographic protocols, approved firewalls, better ways of detecting intrusions and malicious code, and better tools for system evaluation and assurance, the problems can be solved. In this note, I put forward a contrary view: information insecurity is at least as much due to perverse incentives. Many, if not most, of the problems can be explained more clearly and convincingly using the language of microeconomics: network externalities, asymmetric information, moral hazard, adverse selection, liability dumping and the tragedy of the commons.”
  12. How much should we spend on security?

    $124B: Projected 2019 spending on information security [Gartner]
  13. How much should we spend on security?

    $124B: Projected 2019 worldwide spending on information security [Gartner]
    $265B: Apple’s 2018 revenues
    $1700B: Military spending worldwide (2017); US: $610B
    $3.5B: University of Virginia, 2018 operating budget (~50% Medical)
    “Half the money I spend on advertising is wasted; the trouble is I don't know which half.” John Wanamaker
  14. Value of a Vulnerability?

  15. 3. Experimental Interdisciplinary Course

  16. 15

  17. 16

  18. Course Questions?

  19. Causation

  20. What is the goal of science?

  21. Does smoking cause cancer? (CC: Silberio77)

  22. Federal Cigarette Labeling and Advertising Act of 1966

  23. FDA Proposed Warning (2011), blocked by companies/courts; UK Warnings

  24. (Sir) Ronald A. Fisher, Nature, 30 August 1958

  25. Correlation

    [Chart: US spending on science, space, and technology correlates with suicides by hanging, strangulation and suffocation, 1999-2009. Axes: hanging suicides (4000 to 10000) and US spending on science ($15 billion to $30 billion). Source: tylervigen.com]
  26. Random Variable

    Definition: a random variable (e.g., $X$) is a distribution of values that is the measured outcome of some experiment.
  27. Random Variable

    Definition: a random variable (e.g., $X$) is a distribution of values that is the measured outcome of some experiment.
    Formally: a function from a probability space (the set of possible outcomes) to a measurable space (usually the real numbers), $X: \Omega \to \mathbb{R}$
  28. Example: Playing Card

    Think of a playing card
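
To make the "function from outcomes to real numbers" view concrete, here is a minimal Python sketch using the playing-card example. The sample space is a standard 52-card deck with every card equally likely. The rank numbering (A=1 through K=13) and the helper names `rank_value`, `distribution`, and `expectation` are illustrative assumptions, not part of the slides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space Omega: the 52 cards of a standard deck, assumed equally likely.
RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
OMEGA = list(product(RANKS, SUITS))


def rank_value(card):
    """Random variable X: Omega -> R.

    Maps a card to its rank number (A=1, ..., K=13); this numbering is an
    illustrative assumption.
    """
    rank, _suit = card
    return RANKS.index(rank) + 1


def distribution(rv, omega):
    """Probability of each value of rv under the uniform measure on omega."""
    counts = Counter(rv(outcome) for outcome in omega)
    return {value: Fraction(n, len(omega)) for value, n in counts.items()}


def expectation(rv, omega):
    """E[rv]: sum over outcomes of P(outcome) * rv(outcome)."""
    return sum(Fraction(rv(outcome), len(omega)) for outcome in omega)


print(distribution(rank_value, OMEGA)[13])  # probability of drawing a king
print(expectation(rank_value, OMEGA))       # expected rank of a random card
```

The distribution is derived from the function and the probabilities on the sample space, which is how the two readings of "random variable" on the slide fit together.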

  29. 28

  30. 29

  31. Covariance

    Measure of joint variability of two random variables:
    $\operatorname{covariance}(X, Y) = E[(X - E[X])(Y - E[Y])]$
  32. Independent Variables don’t Covary

    Theorem: If $X$ is independent of $Y$, $\operatorname{covariance}(X, Y) = 0$.
    $\operatorname{covariance}(X, Y) = E[(X - E[X])(Y - E[Y])]$
  33. Independent Variables don’t Covary

    Theorem: If $X$ is independent of $Y$, $\operatorname{covariance}(X, Y) = 0$.
    $\operatorname{covariance}(X, Y) = E[(X - E[X])(Y - E[Y])]$
    $= E[XY - X \cdot E[Y] - Y \cdot E[X] + E[X] \cdot E[Y]]$
    $= E[XY] - E[X] \cdot E[Y] - E[Y] \cdot E[X] + E[X] \cdot E[Y]$
    $= E[XY] - E[X] \cdot E[Y]$
    $= 0$ (independence gives $E[XY] = E[X] \cdot E[Y]$)
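
The theorem can be checked exactly on a small sample space. The sketch below is an illustration, not taken from the slides. It enumerates two fair dice, where the first and second die are independent, and computes covariance from the definition with exact fractions. The names `expect`, `covariance`, `first`, `second`, and `total` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 ordered outcomes of two fair dice, equally likely.
OMEGA = list(product(range(1, 7), repeat=2))
P = Fraction(1, len(OMEGA))


def expect(rv):
    """E[rv] under the uniform distribution on OMEGA."""
    return sum(P * rv(w) for w in OMEGA)


def covariance(rv_a, rv_b):
    """E[(A - E[A]) * (B - E[B])], straight from the definition."""
    mu_a, mu_b = expect(rv_a), expect(rv_b)
    return expect(lambda w: (rv_a(w) - mu_a) * (rv_b(w) - mu_b))


def first(w):
    """Value shown on the first die."""
    return w[0]


def second(w):
    """Value shown on the second die (independent of the first)."""
    return w[1]


def total(w):
    """Sum of both dice (depends on the first die)."""
    return w[0] + w[1]


print(covariance(first, second))  # independent variables
print(covariance(first, total))   # dependent variables
```

The converse does not hold in general: zero covariance does not by itself imply independence.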
  34. Covariance with Itself

    $\operatorname{covariance}(X, X) = ?$
    $\operatorname{covariance}(X, X) = E[(X - E[X])(X - E[X])] = E[(X - \mu)^2] = \operatorname{variance}(X)$, where $\mu = E[X]$
  35. Population Covariance

    $\operatorname{cov}(X, Y) = \sum_{i=1}^{N} p_i (x_i - E[X])(y_i - E[Y])$
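
The population covariance sum on this slide can be computed directly. The following is a minimal sketch, assuming each point i has probability p_i and defaulting to equal weights 1/N when none are given. The function name `population_covariance` and the toy data are hypothetical.

```python
def population_covariance(xs, ys, probs=None):
    """Sum over i of p_i * (x_i - E[X]) * (y_i - E[Y]).

    If probs is omitted, every point gets weight 1/N (an assumption).
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if probs is None:
        probs = [1 / n] * n
    mean_x = sum(p * x for p, x in zip(probs, xs))
    mean_y = sum(p * y for p, y in zip(probs, ys))
    return sum(p * (x - mean_x) * (y - mean_y)
               for p, x, y in zip(probs, xs, ys))


# Toy data, made up for illustration only.
xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]

print(population_covariance(xs, ys))  # covariance of X and Y
print(population_covariance(xs, xs))  # covariance of X with itself (variance)
```

Calling the function with the same list twice gives the variance, matching the "covariance with itself" identity.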
  36. Measuring Correlation

    Pearson correlation coefficient: $\rho_{X,Y} = \frac{\operatorname{covariance}(X, Y)}{\sigma_X \sigma_Y}$
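
As a rough sketch of the Pearson coefficient, the function below divides the population covariance by the product of the two standard deviations. The name `pearson_r` and the toy data are hypothetical. The sketch assumes neither variable is constant, since a zero standard deviation would make the ratio undefined.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance(X, Y) / (sigma_X * sigma_Y)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sigma_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sigma_x * sigma_y)


# Toy data, made up for illustration only.
xs = [1, 2, 3, 4, 5]
print(pearson_r(xs, [2, 4, 5, 4, 5]))    # positive but imperfect association
print(pearson_r(xs, [10, 8, 6, 4, 2]))   # perfectly decreasing line
```

Dividing by the standard deviations rescales covariance to the range from -1 to 1, so the result does not depend on the units of either variable.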
  37. Correlation

    [Chart: US spending on science, space, and technology correlates with suicides by hanging, strangulation and suffocation, 1999-2009. Axes: hanging suicides (4000 to 10000) and US spending on science ($15 billion to $30 billion). Source: tylervigen.com]
    r = 0.997
  38. Explanations of Correlation

    W → X
    X → W
    Y → W, X
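
The third explanation (a common cause Y driving both W and X) can be illustrated with a small simulation. This is a sketch under assumed parameters: Y is standard normal, and W and X are each Y plus independent noise with standard deviation 0.5. Neither W nor X influences the other, yet they come out strongly correlated. All names and numbers here are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def simulate_common_cause(n=5000, noise_sd=0.5, seed=0):
    """Draw Y, then W = Y + noise and X = Y + noise (independent noise)."""
    rng = random.Random(seed)
    ys = [rng.gauss(0, 1) for _ in range(n)]
    ws = [y + rng.gauss(0, noise_sd) for y in ys]
    xs = [y + rng.gauss(0, noise_sd) for y in ys]
    return ys, ws, xs


ys, ws, xs = simulate_common_cause()

# Raw correlation between W and X, induced entirely by the common cause Y.
r_raw = pearson_r(ws, xs)

# After subtracting Y from both, only the independent noise remains.
r_adjusted = pearson_r([w - y for w, y in zip(ws, ys)],
                       [x - y for x, y in zip(xs, ys)])

print(r_raw, r_adjusted)
```

With these parameters the theoretical correlation between W and X is Var(Y) / (Var(Y) + 0.25) = 0.8, while the adjusted correlation should be near zero.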
  39. [Chart: US spending on science, space, and technology correlates with suicides by hanging, strangulation and suffocation, 1999-2009. Axes: hanging suicides (4000 to 10000) and US spending on science ($15 billion to $30 billion). Source: tylervigen.com]

    http://tylervigen.com/spurious-correlations
  40. Smoking and Cancer

  41. In England and Wales the phenomenal increase in the number of deaths attributed to cancer of the lung provides one of the most striking changes in the pattern of mortality recorded by the Registrar-General. For example, in the quarter of a century between 1922 and 1947 the annual number of deaths recorded increased from 612 to 9,287, or roughly fifteenfold. ... The rise seems to have been particularly rapid since the end of the first world war, between 1921-30 and 1940-4 the death rate of men at ages 45 and over increased sixfold and of women of the same ages approximately threefold. This increase is still continuing. It has occurred, too, in Switzerland, Denmark, the U.S.A., Canada, and Australia, and has been reported from Turkey and Japan.

    Sir Richard Doll (1912-2005), Sir Austin Bradford Hill (1897-1991)
  42. 41

  43. Study Design:

    arrange for hospitals to contact investigators when a patient is admitted with lung cancer
  44. Study Design:

    arrange for hospitals to contact investigators when a patient is admitted with lung cancer
    interview patient about smoking
    also interview a non-cancer “control” patient
  45. 44

  46. 45

  47. 46

  48. “How much did you smoke before the onset of your present illness?”

  49. “I gave up smoking two-thirds of the way through the study.” - Richard Doll
  50. Probability Test

    Probability that, if the null hypothesis were true, the measured correlation would be at least as high as the one observed.
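
One way to estimate such a probability is a permutation test. This technique is swapped in here for illustration and is not necessarily the test used in the study discussed in these slides. Shuffling one variable breaks any real association, which simulates the null hypothesis, and the p-value is the fraction of shuffles whose correlation is at least as high as the observed one. The function names and the toy data are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=999, seed=0):
    """One-sided p-value: how often a shuffled sample correlates at least
    as highly as the observed one."""
    rng = random.Random(seed)
    observed = pearson_r(xs, ys)
    shuffled = list(ys)
    at_least_as_high = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # simulate "no association" between xs and ys
        if pearson_r(xs, shuffled) >= observed:
            at_least_as_high += 1
    # Count the observed arrangement itself so the estimate is never zero.
    return (at_least_as_high + 1) / (n_permutations + 1)


# Toy data, made up for illustration: ys rises with xs plus a small wobble.
xs = list(range(20))
ys = [x + (7 * x) % 5 for x in xs]

print(permutation_p_value(xs, ys))
```

A small p-value says the data would be surprising under the null hypothesis. It does not say which of the causal explanations of the correlation is the right one.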
  51. Sir Ronald Fisher (1890-1962)

    "...the null hypothesis is never proved or established, but is possibly disproved, in the course of experimentation. Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis."
  52. More Evidence

  53. Fisher’s Response

  54. Fisher’s Response

    smoking → cancer
    cancer → smoking
    Y → cancer, smoking
  55. 54

  56. Virginia → cancer; smoking → cancer; pipes ↛ cancer

  57. cancer → smoking

  58. genotype → cancer, smoking

  59. Ronald A. Fisher, Galton Professor of Eugenics at University College London
  60. 59 2014

  61. 60 2014

  62. 61 2014

  63. What causes breast cancer?

  64. http://www.cbcrp.org/causes/index.php

  65. 64

  66. 65

  67. 66

  68. Lessons Learned?

  69. Lessons Learned?

    “Statistics has gained a place of modest usefulness in medical research. It can derive and retain this only by complete impartiality, which is not unattainable by rational minds. We should not be content to be ‘not so unfair’, for without fairness the statistician is in danger of scientific errors through his moral fault. ...”
  70. Lessons Learned?

    “Statistics has gained a place of modest usefulness in medical research. It can derive and retain this only by complete impartiality, which is not unattainable by rational minds. We should not be content to be ‘not so unfair’, for without fairness the statistician is in danger of scientific errors through his moral fault. ...”
    Ronald A. Fisher, Alleged Dangers of Cigarette-Smoking, British Medical Journal, 1957
  71. Hill’s Lessons

  72. Hill’s Lessons

    1. Strength of association
    2. Consistency: repeated observation
    3. Specificity
    4. Temporality
    5. Gradient: more smoking → more cancer
    6. Plausibility (not required)
    7. Coherence (no conflict with known facts)
    8. Experiment
    9. Analogy
  73. 72

  74. Charge

    Next class: Statistical Learning Theory
    Project 1: due 9:29am, Tuesday (Jan 22)
    [Image: Australian Cigarette Packaging]