Slide 1

Slide 1 text

MARKETS, MECHANISMS, MACHINES
University of Virginia, Spring 2019
Class 2: Cancer and Causation
cs4501/econ4559
David Evans and Denis Nekipelov
https://uvammm.github.io
17 January 2019

Slide 2

Slide 2 text

Plan
Course
Why I’m Teaching this Class
Causation
Definitions
Correlation
Cancer
Everyone who wants to take the class (including unregistered students) should have a teammate for Project 1. If you don’t, talk to us after class today.

Slide 3

Slide 3 text

Why I’m Teaching this Class 2

Slide 4

Slide 4 text

1. Learn about Economics 3

Slide 5

Slide 5 text

4

Slide 6

Slide 6 text

5

Slide 7

Slide 7 text

6

Slide 8

Slide 8 text

2. Economics in Security and Privacy 7

Slide 9

Slide 9 text

Mirai Botnet 8

Slide 10

Slide 10 text

Mirai Botnet 9 Mirai-Source-Code/mirai/bot/scanner.c

Slide 11

Slide 11 text

Software Vulnerabilities as Externalities
“According to one common view, information security comes down to technical measures. Given better access control policy models, formal proofs of cryptographic protocols, approved firewalls, better ways of detecting intrusions and malicious code, and better tools for system evaluation and assurance, the problems can be solved. In this note, I put forward a contrary view: information insecurity is at least as much due to perverse incentives. Many, if not most, of the problems can be explained more clearly and convincingly using the language of microeconomics: network externalities, asymmetric information, moral hazard, adverse selection, liability dumping and the tragedy of the commons.”

Slide 12

Slide 12 text

How much should we spend on security?
$124B: Projected 2019 spending on information security [Gartner]

Slide 13

Slide 13 text

How much should we spend on security?
$124B: Projected 2019 worldwide spending on information security [Gartner]
$265B: Apple’s 2018 revenues
$1700B: Military spending worldwide (2017); US: $610B
$3.5B: University of Virginia, 2018 operating budget (~50% Medical)
“Half the money I spend on advertising is wasted; the trouble is I don't know which half.” John Wanamaker

Slide 14

Slide 14 text

Value of a Vulnerability?

Slide 15

Slide 15 text

3. Experimental Interdisciplinary Course 14

Slide 16

Slide 16 text

15

Slide 17

Slide 17 text

16

Slide 18

Slide 18 text

Course Questions? 17

Slide 19

Slide 19 text

Causation 18

Slide 20

Slide 20 text

What is the goal of science? 19

Slide 21

Slide 21 text

Does smoking cause cancer? 20 CC: Silberio77

Slide 22

Slide 22 text

21 Federal Cigarette Labeling and Advertising Act of 1966

Slide 23

Slide 23 text

22 FDA Proposed Warning (2011), blocked by companies/courts UK Warnings

Slide 24

Slide 24 text

23 (Sir) Ronald A. Fisher Nature, 30 August 1958

Slide 25

Slide 25 text

Correlation
[Figure: line chart, 1999-2009. US spending on science, space, and technology ($15 billion to $30 billion) correlates with suicides by hanging, strangulation and suffocation (4000 to 10000 suicides). Source: tylervigen.com]

Slide 26

Slide 26 text

Random Variable
Definition: a random variable (e.g., $X$) is a distribution of values that is the measured outcome of some experiment.

Slide 27

Slide 27 text

Random Variable
Definition: a random variable (e.g., $X$) is a distribution of values that is the measured outcome of some experiment.
More formally, it is a function from a probability space (set of possible outcomes) to a measurable space (usually the real numbers):
$X: \Omega \to \mathbb{R}$
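As a minimal sketch of this definition, the Python below treats a standard 52-card deck as a hypothetical sample space Ω with equally likely outcomes and defines a random variable as an ordinary function from cards to numbers. The names `omega`, `rank_value`, and `expectation` are illustrative, and using the card's rank as the measured quantity is an assumption, not something the slides specify.

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]

# Sample space: every (rank, suit) pair, assumed equally likely.
omega = list(product(RANKS, SUITS))


def rank_value(card):
    """Random variable X: maps a card to a number (ace = 1, ..., king = 13)."""
    rank, _suit = card
    return RANKS.index(rank) + 1


def expectation(rv, outcomes):
    """E[rv] under the uniform distribution on `outcomes`."""
    p = Fraction(1, len(outcomes))
    return sum(p * rv(o) for o in outcomes)


expected_rank = expectation(rank_value, omega)
print(expected_rank)  # 7
```

The point of the sketch is that the randomness lives in which outcome is drawn from Ω; the random variable itself is a fixed, deterministic mapping.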

Slide 28

Slide 28 text

Example: Playing Card Think of a playing card 27

Slide 29

Slide 29 text

28

Slide 30

Slide 30 text

29

Slide 31

Slide 31 text

Covariance
Measure of joint variability of two random variables:
$\operatorname{covariance}(X, Y) = E[(X - E[X])(Y - E[Y])]$

Slide 32

Slide 32 text

Independent Variables don’t Covary
Theorem: If $X$ is independent of $Y$, then $\operatorname{covariance}(X, Y) = 0$.
$\operatorname{covariance}(X, Y) = E[(X - E[X])(Y - E[Y])]$

Slide 33

Slide 33 text

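Independent Variables don’t Covary
Theorem: If $X$ is independent of $Y$, then $\operatorname{covariance}(X, Y) = 0$.
$\operatorname{covariance}(X, Y) = E[(X - E[X])(Y - E[Y])]$
$= E[XY - X \cdot E[Y] - Y \cdot E[X] + E[X] \cdot E[Y]]$
$= E[XY] - E[X] \cdot E[Y] - E[Y] \cdot E[X] + E[X] \cdot E[Y]$
$= E[XY] - E[X] \cdot E[Y]$
$= 0$, because independence gives $E[XY] = E[X] \cdot E[Y]$.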
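The theorem can be checked exactly on a small case. The sketch below assumes two independent fair six-sided dice (an example chosen here, not taken from the slides) and computes the covariance straight from the definition, using exact fractions to avoid rounding. The helper names `expect` and `cov` are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Joint sample space of two independent fair dice: 36 equally likely pairs.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))


def expect(f):
    """E[f(outcome)] under the uniform distribution on `outcomes`."""
    return sum(p * f(o) for o in outcomes)


def cov(f, g):
    """Covariance from the definition: E[(f - E[f]) * (g - E[g])]."""
    mf, mg = expect(f), expect(g)
    return expect(lambda o: (f(o) - mf) * (g(o) - mg))


def first_die(o):
    return o[0]


def second_die(o):
    return o[1]  # independent of the first die


def dice_sum(o):
    return o[0] + o[1]  # depends on the first die


cov_xy = cov(first_die, second_die)
cov_xs = cov(first_die, dice_sum)
shortcut = expect(lambda o: o[0] * o[1]) - expect(first_die) * expect(second_die)

print(cov_xy)    # 0
print(shortcut)  # 0
print(cov_xs)    # 35/12
```

The two independent dice have covariance exactly 0, and the shortcut form E[XY] − E[X]·E[Y] agrees with the definition. The first die and the sum, which are not independent, have a nonzero covariance.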

Slide 34

Slide 34 text

Covariance with Itself
$\operatorname{covariance}(X, X) = ?$
$\operatorname{covariance}(X, X) = E[(X - E[X])(X - E[X])] = E[(X - \mu)^2] = \operatorname{variance}(X)$
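A short worked instance may help. Assume (for illustration only) that $X$ is the indicator of heads on one fair coin flip, so $X$ is 0 or 1 with probability 1/2 each:

```latex
\begin{align*}
\mu &= E[X] = \tfrac{1}{2} \cdot 0 + \tfrac{1}{2} \cdot 1 = \tfrac{1}{2} \\
\operatorname{covariance}(X, X) &= E[(X - \mu)^2]
  = \tfrac{1}{2}\left(0 - \tfrac{1}{2}\right)^2 + \tfrac{1}{2}\left(1 - \tfrac{1}{2}\right)^2
  = \tfrac{1}{4} = \operatorname{variance}(X)
\end{align*}
```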

Slide 35

Slide 35 text

Population Covariance
$\operatorname{cov}(X, Y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - E[X])(y_i - E[Y])$

Slide 36

Slide 36 text

Measuring Correlation
Pearson correlation coefficient:
$\rho_{X,Y} = \frac{\operatorname{covariance}(X, Y)}{\sigma_X \sigma_Y}$
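The population covariance and the Pearson coefficient can be sketched directly from their formulas. The code below is a plain-Python illustration with made-up data (not the series plotted on the slides); the function names are chosen here for readability.

```python
import math


def mean(values):
    return sum(values) / len(values)


def population_covariance(xs, ys):
    """(1/N) * sum((x_i - mean_x) * (y_i - mean_y)) over paired observations."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and the same length")
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def pearson(xs, ys):
    """Covariance divided by the product of the population standard deviations."""
    sigma_x = math.sqrt(population_covariance(xs, xs))
    sigma_y = math.sqrt(population_covariance(ys, ys))
    return population_covariance(xs, ys) / (sigma_x * sigma_y)


# Made-up illustrative data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]  # exactly 2 * x
zs = [5.0, 4.0, 3.0, 2.0, 1.0]   # exactly 6 - x

r_up = pearson(xs, ys)
r_down = pearson(xs, zs)
print(round(r_up, 6), round(r_down, 6))  # 1.0 -1.0
```

Dividing by the standard deviations is what makes the coefficient unit-free: doubling every y value changes the covariance but leaves the correlation at 1.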

Slide 37

Slide 37 text

Correlation
[Figure: line chart, 1999-2009. US spending on science, space, and technology ($15 billion to $30 billion) correlates with suicides by hanging, strangulation and suffocation (4000 to 10000 suicides). Source: tylervigen.com]
r = 0.997

Slide 38

Slide 38 text

Explanations of Correlation
W → X (W causes X)
X → W (X causes W)
Y → W, X (a third factor Y causes both W and X)
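The third explanation, a common cause, is easy to simulate. The sketch below is an illustration under assumed conditions: a hidden variable (the slide's Y, called `common` here) feeds two observed variables that never influence each other, with all noise drawn from standard normal distributions. The sample size and seed are arbitrary choices.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation using population (1/N) covariance and deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Hidden common cause; the two observed variables never influence each other.
common = [rng.gauss(0, 1) for _ in range(n)]
w = [c + rng.gauss(0, 1) for c in common]
x = [c + rng.gauss(0, 1) for c in common]

r_observed = pearson(w, x)

# Subtracting the common cause's contribution leaves only independent noise.
w_resid = [wi - c for wi, c in zip(w, common)]
x_resid = [xi - c for xi, c in zip(x, common)]
r_adjusted = pearson(w_resid, x_resid)

print(f"correlation of W and X:         {r_observed:.2f}")
print(f"after subtracting common cause: {r_adjusted:.2f}")
```

For this setup the theoretical correlation between W and X is 0.5, even though neither causes the other; once the common cause is accounted for, the correlation should fall to roughly zero.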

Slide 39

Slide 39 text

[Figure: line chart, 1999-2009. US spending on science, space, and technology ($15 billion to $30 billion) correlates with suicides by hanging, strangulation and suffocation (4000 to 10000 suicides). Source: tylervigen.com]
http://tylervigen.com/spurious-correlations

Slide 40

Slide 40 text

Smoking and Cancer 39

Slide 41

Slide 41 text

“In England and Wales the phenomenal increase in the number of deaths attributed to cancer of the lung provides one of the most striking changes in the pattern of mortality recorded by the Registrar-General. For example, in the quarter of a century between 1922 and 1947 the annual number of deaths recorded increased from 612 to 9,287, or roughly fifteenfold. ... The rise seems to have been particularly rapid since the end of the first world war, between 1921-30 and 1940-4 the death rate of men at ages 45 and over increased sixfold and of women of the same ages approximately threefold. This increase is still continuing. It has occurred, too, in Switzerland, Denmark, the U.S.A., Canada, and Australia, and has been reported from Turkey and Japan.”
Sir Richard Doll (1912-2005) and Sir Austin Bradford Hill (1897-1991)

Slide 42

Slide 42 text

41

Slide 43

Slide 43 text

Study Design:
arrange for hospitals to contact investigators when a patient is admitted with lung cancer

Slide 44

Slide 44 text

Study Design:
arrange for hospitals to contact investigators when a patient is admitted with lung cancer
interview patient about smoking
also interview a non-cancer “control” patient
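A design like this yields two groups (cancer patients and controls), each split by smoking history. One standard way to summarize such a table is the odds ratio, which the slides do not mention; it is shown here only as a hedged illustration. All counts below are invented for the example and are not the study's data.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_cases = exposed_cases / unexposed_cases
    odds_controls = exposed_controls / unexposed_controls
    return odds_cases / odds_controls


# Hypothetical interview tallies, invented for illustration only.
smokers_with_cancer = 90
nonsmokers_with_cancer = 10
smokers_in_controls = 60
nonsmokers_in_controls = 40

ratio = odds_ratio(
    smokers_with_cancer,
    nonsmokers_with_cancer,
    smokers_in_controls,
    nonsmokers_in_controls,
)
print(ratio)  # 6.0
```

An odds ratio of 1 would mean smoking is equally common in both groups; values well above 1 indicate that smoking is more common among the cancer patients. The ratio on its own says nothing about which explanation of the association is correct.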

Slide 45

Slide 45 text

44

Slide 46

Slide 46 text

45

Slide 47

Slide 47 text

46

Slide 48

Slide 48 text

“How much did you smoke before the onset of your present illness?” 47

Slide 49

Slide 49 text

“I gave up smoking two-thirds of the way through the study.” - Richard Doll

Slide 50

Slide 50 text

Probability Test
The probability that, if the null hypothesis were true, the measured correlation would be at least as high as the one observed.
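One way to estimate such a probability without distributional formulas is a permutation test, sketched below. This is a swapped-in illustration, not necessarily the test used in the study: shuffling one variable simulates the null hypothesis of no association, and the p-value is the share of shuffles whose correlation is at least as high as the observed one. The data, trial count, and seed are made up.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation using population (1/N) covariance and deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """One-sided p-value: share of shuffles with correlation >= the observed one."""
    rng = random.Random(seed)
    observed = pearson(xs, ys)
    shuffled = list(ys)
    at_least_as_high = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # breaks the real pairing, simulating the null
        if pearson(xs, shuffled) >= observed:
            at_least_as_high += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_high + 1) / (trials + 1)


# Made-up data: a clear upward trend plus a deterministic wobble.
xs = list(range(20))
ys = [2 * x + (3 if x % 2 else -3) for x in xs]

p_value = permutation_p_value(xs, ys)
print(p_value)
```

A small p-value says the observed correlation would be surprising if there were no association at all. It does not say which of the causal explanations accounts for the association.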

Slide 51

Slide 51 text

50 Sir Ronald Fisher (1890-1962) "...the null hypothesis is never proved or established, but is possibly disproved, in the course of experimentation. Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis."

Slide 52

Slide 52 text

More Evidence 51

Slide 53

Slide 53 text

Fisher’s Response 52

Slide 54

Slide 54 text

Fisher’s Response
smoking → cancer
cancer → smoking
Y → cancer, smoking

Slide 55

Slide 55 text

54

Slide 56

Slide 56 text

Virginia → cancer
smoking → cancer
pipes ↛ cancer

Slide 57

Slide 57 text

56 cancer → smoking

Slide 58

Slide 58 text

57 genotype → cancer, smoking

Slide 59

Slide 59 text

58 Ronald A. Fisher Galton Professor of Eugenics at University College London

Slide 60

Slide 60 text

59 2014

Slide 61

Slide 61 text

60 2014

Slide 62

Slide 62 text

61 2014

Slide 63

Slide 63 text

What causes breast cancer? 62

Slide 64

Slide 64 text

63 http://www.cbcrp.org/causes/index.php

Slide 65

Slide 65 text

64

Slide 66

Slide 66 text

65

Slide 67

Slide 67 text

66

Slide 68

Slide 68 text

Lessons Learned? 67

Slide 69

Slide 69 text

Lessons Learned?
“Statistics has gained a place of modest usefulness in medical research. It can derive and retain this only by complete impartiality, which is not unattainable by rational minds. We should not be content to be ‘not so unfair’, for without fairness the statistician is in danger of scientific errors through his moral fault. ...”

Slide 70

Slide 70 text

Lessons Learned?
“Statistics has gained a place of modest usefulness in medical research. It can derive and retain this only by complete impartiality, which is not unattainable by rational minds. We should not be content to be ‘not so unfair’, for without fairness the statistician is in danger of scientific errors through his moral fault. ...”
Ronald A. Fisher, Alleged Dangers of Cigarette-Smoking, British Medical Journal, 1957

Slide 71

Slide 71 text

Hill’s Lessons 70

Slide 72

Slide 72 text

Hill’s Lessons
1. Strength of association
2. Consistency: repeated observation
3. Specificity
4. Temporality
5. Gradient: more smoking → more cancer
6. Plausibility (not required)
7. Coherence (no conflict with known facts)
8. Experiment
9. Analogy

Slide 73

Slide 73 text

72

Slide 74

Slide 74 text

Charge
Next class: Statistical Learning Theory
Project 1: due 9:29am, Tuesday (Jan 22)
[Image: Australian Cigarette Packaging]