
The Three Forms of (Legal) Prediction - Experts, Crowds + Algorithms

Presentation by Professor Daniel Martin Katz: The Three Forms of (Legal) Prediction - Experts, Crowds + Algorithms, along with a discussion of SCOTUS and associated stock market returns, aka "Law on the Market".

Daniel Martin Katz

February 17, 2021

Transcript

  1. The Three Forms of (Legal) Prediction
    Experts, Crowds + Algorithms Applied to SCOTUS
    daniel martin katz
    blog | ComputationalLegalStudies.com
    edu | illinois tech - chicago kent law
    lab | TheLawLab.com
    page | DanielMartinKatz.com

    View Slide

  2. TODAY I WOULD
    LIKE TO BEGIN
    WITH A PRACTICAL
    SET OF LEGAL AND
    LAW RELATED
    PROBLEMS

    View Slide

  3. AS THESE ARE
    THE CLASSIC
    QUESTIONS THAT
    CLIENTS POSE ON
    AN ONGOING
    BASIS …

    View Slide

  4. View Slide

  5. HOW MUCH IS THIS
    GOING TO COST ?

    View Slide

  6. HOW LONG WILL
    THIS MATTER TAKE ?

    View Slide

  7. ARE WE GOING
    TO WIN ?
    WHAT IS OUR
    EXPECTED
    LIABILITY?

    View Slide

  8. View Slide

  9. IF YOU REFLECT
    UPON IT …

    View Slide

  10. LAW-LAW LAND HAS
    MANY PREDICTION
    PROBLEMS

    View Slide

  11. THE ONLY
    QUESTION IS ON
    WHAT BASIS THOSE
    PREDICTIONS WILL
    BE OFFERED

    View Slide

  12. View Slide

  13. I WOULD LIKE TO
    REMIND EVERYONE
    AT THE OUTSET

    View Slide

  14. BEFORE THERE
    WERE COMPUTERS

    View Slide

  15. HUMANS DID ALL OF
    THE COMPUTING

    View Slide

  16. IF YOU REFLECT
    UPON OUR OWN
    DECISION MAKING …

    View Slide

  17. HERE ARE JUST A
    FEW COGNITIVE
    PROCESSES …

    View Slide

  18. LOOK FOR PATTERNS
    WEIGH VARIABLES
    MAKE CONCEPTUAL LEAPS
    (using analogical reasoning)

    View Slide

  19. ABSTRACTION OF PROJECTING WEIGHTS INTO A DECISION
    [Diagram: INPUTS (dimension 1, dimension 2, dimension 3, ..., dimension n) feed f( ), which yields an OUTPUT (Prediction, Decision, etc.)]

    View Slide
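
The f( ) abstraction on slide 19 can be read as a weighted projection of input dimensions into a single output. The sketch below is only an illustration of that reading: the function name `decide`, the three example dimensions, the weights, and the threshold are all assumptions, not anything stated in the deck.

```python
from typing import Sequence, Tuple


def decide(inputs: Sequence[float], weights: Sequence[float],
           threshold: float = 0.0) -> Tuple[float, bool]:
    """Project weighted inputs into a score and a yes/no decision."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    # f(dimension 1, ..., dimension n): a plain weighted sum.
    score = sum(x * w for x, w in zip(inputs, weights))
    return score, score > threshold


# Hypothetical dimensions: strength of precedent, judge history, evidence quality.
score, favorable = decide([0.8, 0.3, 0.6], [0.5, 0.2, 0.3])
print(score, favorable)
```

A human expert does something similar implicitly; writing the weights down is what makes the process inspectable and testable.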

  20. PATTERN MATCHING
    evolutionary biology is an algorithm
    which privileges good pattern matching

    View Slide

  21. PATTERN MATCHING
    Biology is why you have
    trouble with Pie Charts

    View Slide

  22. PATTERN MATCHING
    But are very good at
    interpreting distances

    View Slide

  23. ANALOGICAL
    REASONING
    Lawyers are particularly
    good at this task

    View Slide

  24. View Slide

  25. IN ALL OF HUMAN
    HISTORY …

    View Slide

  26. WE HAVE
    DEVELOPED THREE
    APPROACHES TO
    PREDICT
    OUTCOMES …

    View Slide

  27. IN LAW OR MORE
    BROADLY …

    View Slide

  28. THOSE THREE
    APPROACHES
    ARE
    EXPERTS
    CROWDS
    ALGORITHMS

    View Slide

  29. View Slide

  30. MORE SIMPLY
    STATED …
    WE CAN ASK
    EXPERT
    CROWD
    ALGORITHM
    TO PREDICT
    SOMETHING

    View Slide

  31. IN LAW, BOTH
    HISTORICALLY AND
    STILL TODAY
    HUMAN DECISION
    MAKING IS THE
    DOMINANT FORM
    OF COMPUTATION

    View Slide

  32. IN THE PAST DECADE
    THERE HAS BEEN AT
    LEAST SOME
    EXPLORATION OF
    ALTERNATIVES

    View Slide

  33. LET'S CALL IT THE
    ROBOT LAWYER
    THESIS …?

    View Slide

  34. View Slide

  35. OR MORE
    REALISTICALLY IT
    SHOULD BE
    CALLED THE USE
    OF ALGORITHMS
    TO INTERROGATE
    PATTERNS IN DATA

    View Slide

  36. REMEMBER AN ‘ALGORITHM’ IS
    JUST A RECIPE FOR A MACHINE
    TO COMPLETE A TASK

    View Slide

  37. A.I.

    View Slide

  38. ARTIFICIAL
    INTELLIGENCE

    View Slide

  39. THIS IS A TOPIC
    FOR WHICH I AM
    OFTEN CALLED
    UPON TO
    COMMENT

    View Slide

  40. AND FOR WHICH
    A FULL FLEDGED
    PRESENTATION
    COULD BE
    OFFERED …

    View Slide

  41. ACCESS MORE HERE
    https://www.slideshare.net/Danielkatz/artificial-intelligence-and-law-a-primer

    View Slide

  42. BUT AT A HIGH
    LEVEL LET ME
    JUST SAY …

    View Slide

  43. ARTIFICIAL INTELLIGENCE IS A BROAD FIELD

    View Slide

  44. LET'S LOOK AT THESE SPECIFIC SUBFIELDS

    View Slide

  45. data driven AI
    rules based AI

    View Slide

  46. View Slide

  47. DATA DRIVEN A.I.

    View Slide

  48. DATA
    DRIVEN
    A.I.
    =
    MACHINE
    LEARNING
    NATURAL
    LANGUAGE
    PROCESSING

    View Slide

  49. View Slide

  50. THE ULTIMATE
    GOAL IS TO PREDICT
    SOME CLASS OF
    LEGAL OUTCOMES

    View Slide

  51. HERE ARE
    JUST A FEW
    USE CASES
    IN LAW

    View Slide

  52. #Predict Relevant Documents
    Data Driven EDiscovery/Due Diligence
    (Predictive Coding)

    View Slide

  53. #Predict Relevant Documents
    Data Driven EDiscovery/Due Diligence
    (Predictive Coding)
    #Predict Contract Terms/Outcomes
    Data Driven Transactional Work

    View Slide

  54. #Predict Relevant Documents
    Data Driven EDiscovery/Due Diligence
    (Predictive Coding)
    #Predict Rogue Behavior
    Data Driven Compliance
    #Predict Contract Terms/Outcomes
    Data Driven Transactional Work

    View Slide

  55. #Predict Relevant Documents
    #Predict Case Outcomes / Costs
    Data Driven Legal Underwriting
    Data Driven EDiscovery/Due Diligence
    (Predictive Coding)
    #Predict Rogue Behavior
    Data Driven Compliance
    #Predict Contract Terms/Outcomes
    Data Driven Transactional Work

    View Slide

  56. #Predict Relevant Documents
    #Predict Case Outcomes / Costs
    Data Driven Legal Underwriting
    Data Driven EDiscovery/Due Diligence
    (Predictive Coding)
    #Predict Rogue Behavior
    Data Driven Compliance
    #Predict Contract Terms/Outcomes
    Data Driven Transactional Work
    #Predict Regulatory Outcomes
    Data Driven Lobbying, etc.

    View Slide

  57. View Slide

  58. TODAY I WANT TO
    TAKE THESE AND
    OTHER IDEAS AND
    DISCUSS THEIR
    APPLICATION TO
    SCOTUS PREDICTION

    View Slide

  59. SCOTUS PREDICTION
    IS A PASTIME

    View Slide

  60. EVERY YEAR, LAW REVIEWS, MAGAZINE AND
    NEWSPAPER ARTICLES, TELEVISION AND
    RADIO TIME, CONFERENCE PANELS, BLOG
    POSTS, AND TWEETS ARE DEVOTED TO
    QUESTIONS SUCH AS:
    HOW WILL THE COURT RULE IN THIS
    PARTICULAR CASE?
    IN THE DIRECTION OF WHICH PARTY WILL AN
    INDIVIDUAL JUSTICE VOTE?

    View Slide

  61. JUDICIAL PREDICTION
    IS ALSO THE LAW +
    POLITICAL SCIENCE
    GRAIL QUEST

    View Slide

  62. YOU COULD START
    WITH HOLMES AND
    THE LEGAL REALISTS
    BUT THESE WERE
    *NOT* REALLY
    SCIENTIFIC EFFORTS

    View Slide

  63. A COUPLE OF EARLY EFFORTS
    1957
    FRED KORT, PREDICTING SUPREME
    COURT DECISIONS MATHEMATICALLY:
    A QUANTITATIVE ANALYSIS OF THE
    “RIGHT TO COUNSEL” CASES, 51
    AMER. POL. SCI. REV. 1 (1957).
    1963
    S. SIDNEY ULMER, QUANTITATIVE
    ANALYSIS OF JUDICIAL PROCESSES:
    SOME PRACTICAL AND THEORETICAL
    APPLICATIONS, 28 LAW & CONTEMP.
    PROBS. 164 (1963).

    View Slide

  64. COMPUTERWORLD
    JULY 1971
    PROGRAM WRITTEN IN FORTRAN
    (THE 91% PREDICTION MARK WAS
    LIMITED TO CERTAIN CASES)
    HAROLD SPAETH

    View Slide

  65. AN IMPORTANT
    LATER EFFORT
    1984
    JEFFREY A. SEGAL, PREDICTING
    SUPREME COURT CASES
    PROBABILISTICALLY: THE SEARCH
    AND SEIZURE CASES, 1962-1981,
    78 AMERICAN POLITICAL SCIENCE
    REVIEW 891 (1984)

    View Slide

  66. BUT THIS WAS THE PAPER
    THAT INSPIRED OUR EFFORTS
    2004
    Theodore W. Ruger, Pauline T. Kim,
    Andrew D. Martin, Kevin M. Quinn
    The Supreme Court
    Forecasting Project:
    Legal and Political Science Approaches
    to Predicting Supreme Court Decision
    Making
    Columbia Law Review (2004)

    View Slide

  67. View Slide

  68. EXPERTS

    View Slide

  69. Theodore W. Ruger, Pauline T. Kim,
    Andrew D. Martin, Kevin M. Quinn
    The Supreme Court
    Forecasting Project:
    Legal and Political
    Science Approaches to
    Predicting Supreme
    Court Decision Making
    Columbia Law Review
    October, 2004

    View Slide

  70. experts

    View Slide

  71. Case Level Prediction
    Justice Level Prediction
    67.4% experts
    58% experts
    From the 68 included cases for the
    2002-2003 Supreme Court Term

    View Slide

  72. these experts probably
    overfit

    View Slide

  73. they fit to the noise
    and
    not the signal

    View Slide

  74. View Slide

  75. if this were
    finance this
    would be
    trading
    worse than
    S&P500

    View Slide

  76. #NoiseTrading

    View Slide

  77. #BuffettChallenge

    View Slide

  78. #BuffettChallenge

    View Slide

  79. like many other forms of
    human endeavor
    law is full of
    noise predictors …

    View Slide

  80. from a pure
    forecasting
    standpoint

    View Slide

  81. the best known
    SCOTUS predictor is

    View Slide

  82. View Slide

  83. the law
    version of
    superforecasting

    View Slide

  84. View Slide

  85. CROWDS

    View Slide

  86. not
    enough
    crowd
    based
    decision
    making in
    institutions

    View Slide

  87. View Slide

  88. “Software developers were asked on two
    separate days to estimate the completion
    time for a given task, the hours they
    projected differed by 71%, on average.
    When pathologists made two assessments of
    the severity of biopsy results, the correlation
    between their ratings was only .61 (out of a
    perfect 1.0), indicating that they made
    inconsistent diagnoses quite frequently.
    Judgments made by different people are
    even more likely to diverge.”

    View Slide

  89. View Slide

  90. View Slide

  91. FANTASYSCOTUS IS AS
    COOL AS IT SOUNDS

    View Slide

  92. FANTASYSCOTUS WAS FOUNDED
    BY JOSH BLACKMAN IN 2009

    View Slide

  93. PRIZES AND
    SPONSORSHIP HAVE
    VARIED BUT THERE
    HAVE BEEN TENS OF
    THOUSANDS OF $$ AND
    PRIZES DISTRIBUTED
    OVER THE YEARS

    View Slide

  94. View Slide

  95. View Slide

  96. USERS
    CREATE
    A LOGIN
    AND
    ACCESS
    THE SITE

    View Slide

  97. ON A CASE BY CASE BASIS,
    USERS CAN ENTER THEIR
    RESPECTIVE PREDICTIONS

    View Slide

  98. USERS ARE FREE
    TO CHANGE THEIR
    PREDICTIONS
    UNTIL THE DATE OF
    FINAL DECISION

    View Slide

  99. USERS ARE FREE
    TO CHANGE THEIR
    PREDICTIONS
    UNTIL THE DATE OF
    FINAL DECISION
    (SOMETIMES IN
    INTERESTING WAYS
    AS SHOWN ABOVE)
    https://fivethirtyeight.com/features/obamacares-chances-of-survival-are-looking-better-and-better/

    View Slide

  100. FOR EACH CASE, WE
    ARE ABLE TO TRACK
    THE PERFORMANCE
    OF PLAYERS AND
    COMPARE IT TO
    THE OUTCOME OF
    THE CASES

    View Slide

  101. We can
    generate
    Crowd
    Sourced
    Predictions

    View Slide

  102. View Slide

  103. SUMMARY
    STATISTICS

    View Slide

  104. 6 SCOTUS TERMS

    View Slide

  105. 425 LISTED CASES
    6 SCOTUS TERMS

    View Slide

  106. 7284 UNIQUE PARTICIPANTS
    425 LISTED CASES
    6 SCOTUS TERMS

    View Slide

  107. 7284 UNIQUE PARTICIPANTS
    425 LISTED CASES
    636859 PREDICTIONS
    6 SCOTUS TERMS

    View Slide

  108. WORKING WITH
    REAL DATA
    WE HAVE A
    SIGNIFICANT AMOUNT
    OF PLAYER TURNOVER
    (~3% OF MAX PARTICIPATION)

    View Slide

  109. WORKING WITH
    REAL DATA
    SOMETIMES FOLKS
    CHANGE THEIR VOTES
    636859 OVERALL PREDICTIONS
    545845 FINAL PREDICTIONS

    View Slide

  110. WORKING WITH
    REAL DATA
    THE NUMBER OF
    PLAYERS HAS DECLINED
    BUT THE ENGAGEMENT
    RATE HAS INCREASED

    View Slide

  111. WE BELIEVE THIS IS *NOT*
    REALLY SURVIVORSHIP
    BIAS BUT RATHER A
    REVELATION MECHANISM
    (I.E. YOU STICK AROUND IF
    YOU THINK YOU ARE GOOD AT THE
    UNDERLYING TASK)

    View Slide

  112. SUMMARY DATA

    View Slide

  113. View Slide

  114. CROWD
    MODELING
    PRINCIPLES

    View Slide

  115. CROWDSOURCING
    CROWDSOURCING DOES
    *NOT* REFER TO A SPECIFIC
    TECHNIQUE OR ALGORITHM

    View Slide

  116. CROWDSOURCING
    GENERALLY REFERS TO A
    PROCESS OF AGGREGATION
    AND/OR SEGMENTATION OF
    INFORMATION SIGNALS

    View Slide

  117. VARIOUS SIGNAL TYPES
    THE INPUT SIGNALS CAN ASSUME
    MANY DIFFERENT FORMS
    INCLUDING FROM MODELS OR
    INDIVIDUALS OR SENSORS
    (OR IN SOME CASES EVEN OTHER CROWDS)

    View Slide

  118. CROWD OF INDIVIDUALS
    The most well known approach involves
    extracting ‘wisdom’ from crowds where crowds
    are built from individual people

    View Slide

  119. CROWD OF SENSORS
    Note crowds need not be composed of humans
    but could be networked IT systems
    Decentralized Distributed Ledgers -or- Oracles
    -or- IOT sensors with Crowdsourcing Validation
    #Blockchain
    #InternetofThings
    #Crypto

    View Slide

  120. #CRYPTO CROWD
    Thanks to Team Augur for the Shoutout

    View Slide

  121. CROWD OF MODELS
    Random Forest Model
    Breiman, L. (2001). Random forests.
    Machine learning, 45(1), 5-32.
    Grow a set of differentiated trees
    through bagging and random
    subspaces (predict using a consensus
    mechanism)

    View Slide
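
To make the "crowd of models" idea on slide 121 concrete, here is a minimal sketch of the two ingredients the slide names: bagging (each tree sees a bootstrap sample) and random feature subsets, combined by majority vote. It uses one-split "stumps" rather than full trees and is written for illustration only; it is not Breiman's reference algorithm and not the authors' code. The toy rows and labels are hypothetical.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature_ids):
    """Pick the (feature, threshold) split with the fewest training errors."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[f] > threshold]
            if not left or not right:
                continue
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = (sum(y != left_label for y in left)
                      + sum(y != right_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, f, threshold, left_label, right_label)
    if best is None:  # no usable split: always predict the majority label
        majority = Counter(labels).most_common(1)[0][0]
        return (feature_ids[0], float("inf"), majority, majority)
    return best[1:]


def fit_forest(rows, labels, n_trees=25, n_features=1, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]       # bagging
        feats = rng.sample(range(len(rows[0])), n_features)  # random features
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feats))
    return forest


def predict(forest, row):
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return Counter(votes).most_common(1)[0][0]  # consensus by majority vote


# Hypothetical toy data: two numeric features per case.
rows = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 1.0), (1.0, 0.9)]
labels = ["affirm", "affirm", "affirm", "reverse", "reverse", "reverse"]
forest = fit_forest(rows, labels)
print(predict(forest, (0.0, 0.0)), predict(forest, (1.0, 1.0)))
```

Each stump is a weak, slightly different "member" of the crowd; the vote across members is what the slide calls the consensus mechanism.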

  122. View Slide

  123. AS WE REVIEWED THE
    CROWDSOURCING
    LITERATURE …

    View Slide

  124. WE OBSERVED THAT IT
    WAS DIFFICULT TO APPLY
    THE PRINCIPLES TO
    CROWDS SUCH AS OURS

    View Slide

  125. CROWD
    SOURCING IS
    ‘UNDISCIPLINED
    ZOO OF MODELS’
    JESSICA FLACK
    PROFESSOR
    SANTA FE INSTITUTE
    DEC. 27, 2017
    (VIA TWITTER)

    View Slide

  126. WE AGREE …
    AND THUS IN THE PAPER
    WE JUST STARTED OVER …

    View Slide

  127. AND ATTEMPTED TO
    BUILD CROWDS FROM
    FIRST PRINCIPLES …

    View Slide

  128. View Slide

  129. CROWD
    CONSTRUCTION
    FRAMEWORK

    View Slide

  130. WE OUTLINE A
    GENERAL FRAMEWORK
    FOR CONSTRUCTING
    CROWDS FROM FIRST
    PRINCIPLES

    View Slide

  131. IN THE CLASSIC
    CONDORCET JURY
    SETTING, MODELS
    TYPICALLY USE
    PREDICTIONS FROM
    ALL PARTICIPANTS

    View Slide
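
For readers unfamiliar with the Condorcet jury setting mentioned on slide 131, the standard result is that a majority vote among independent voters, each correct with probability p > 0.5, becomes more reliable as the group grows. The short calculation below illustrates that textbook result; the function name and the example values of n and p are chosen here for illustration and do not come from the deck.

```python
from math import comb


def majority_correct_probability(n_voters: int, p_correct: float) -> float:
    """P(majority is right) for an odd number of independent voters."""
    if n_voters % 2 == 0:
        raise ValueError("use an odd number of voters to avoid ties")
    needed = n_voters // 2 + 1  # smallest winning majority
    return sum(
        comb(n_voters, k) * p_correct**k * (1 - p_correct) ** (n_voters - k)
        for k in range(needed, n_voters + 1)
    )


for n in (1, 3, 11, 101):
    print(n, round(majority_correct_probability(n, 0.6), 3))
```

The same formula shows the downside: if the typical voter is worse than a coin flip (p < 0.5), adding voters makes the majority less reliable, which is one motivation for selecting a subset of the crowd.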

  132. HOWEVER, MODELS CAN
    ALSO TAKE INTO ACCOUNT
    INFORMATION (SIGNALS)
    FROM SOME SUBSET OF
    PARTICIPANTS
    (DEFINED USING EITHER INCLUSION
    RULES OR EXCLUSION RULES)

    View Slide

  133. View Slide

  134. CROWD CONSTRUCTION RULES
    EXPERIENCE
    PERFORMANCE
    RANK
    STATISTICAL THRESHOLDING
    WEIGHTING

    View Slide

  135. CROWD CONSTRUCTION RULES
    THE INTERACTION OF THESE
    RULES YIELDS *NOT* AN
    INDIVIDUAL MODEL BUT
    RATHER A MODEL SPACE

    View Slide

  136. THE MODEL SPACE
    MODEL SPACE
    FEATURES 277,201
    POTENTIAL MODELS

    View Slide

  137. THE MODEL SPACE
    1 + (28 · 99 · 100) = 277201
    FIRST TERM IS THE SIMPLEST
    CROWDSOURCING MODEL WITH NO SUBSET
    OR WEIGHTING RULES
    SECOND TERM CORRESPONDS TO 28 MODELS
    (COMBINATION OF PERFORMANCE
    THRESHOLD/WEIGHTING RULES)
    FOR EACH COMBINATION OF 99 RANK AND
    100 EXPERIENCE THRESHOLDS

    View Slide
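
The count on slide 137 can be checked by enumerating the grid directly. In the sketch below only the sizes (28, 99, 100, plus one unrestricted model) come from the slide; the concrete ranges used to stand in for each rule family are placeholders.

```python
from itertools import product

performance_rules = range(28)        # 28 performance threshold / weighting combos
rank_thresholds = range(1, 100)      # 99 rank thresholds (placeholder values)
experience_thresholds = range(100)   # 100 experience thresholds (placeholder values)

# The single model with no subset or weighting rules.
model_space = [("simple_majority",)]
# Every combination of the three rule families.
model_space += list(product(performance_rules, rank_thresholds, experience_thresholds))

print(len(model_space))
```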

  138. View Slide

  139. MODEL TESTING
    & RESULTS

    View Slide

  140. MODEL(S)
    ACCURACY
    WE SIMULATE THE
    PERFORMANCE OF
    *EACH* OF THE
    277,201 POTENTIAL
    CROWD MODELS

    View Slide

  141. ALTHOUGH IT IS A
    LARGE MODEL SPACE
    WE DO HIGHLIGHT THE
    PERFORMANCE OF FOUR
    EXAMPLE MODELS
    (AND THE NULL MODEL)

    View Slide

  142. BASELINE
    NULL
    MODEL
    ALWAYS
    GUESS
    REVERSE

    View Slide

  143. MODEL 1
    ALL
    CROWD
    SIMPLE
    MAJORITY

    View Slide

  144. MODEL 2
    FOLLOW THE
    LEADER
    WITH NO
    THRESHOLDING

    View Slide

  145. MODEL 3
    FOLLOW THE
    LEADER
    WITH
    EXPERIENCE
    THRESHOLDING
    (XP = 5)

    View Slide

  146. MODEL 4
    MAXIMUM
    ACCURACY
    (THERE ARE ACTUALLY SEVERAL MODEL
    CONFIGURATIONS WHICH OFFER ROUGHLY
    EQUIVALENT PERFORMANCE)
    EXPERIENCE THRESHOLD OF 5
    CROWD SIZE IS CAPPED AT 22
    EXPONENTIAL WEIGHT WITH ALPHA OF 0.1

    View Slide
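
The four example models on slides 143 to 146 differ only in who gets counted and with what weight. The sketch below applies each rule to one hypothetical case. The player records are invented, and the weighting formula `exp(-alpha * rank)` is an assumption made here for illustration; the deck only states "exponential weight with alpha of 0.1" without giving the formula.

```python
from collections import Counter
from math import exp

# Hypothetical players: (past predictions made, past predictions correct, current vote).
players = {
    "ann": (40, 32, "reverse"),
    "bob": (3, 3, "affirm"),
    "cat": (25, 15, "affirm"),
    "dan": (10, 7, "reverse"),
    "eve": (8, 4, "affirm"),
}


def ranked(players, min_experience=0):
    """Names sorted by past accuracy, keeping only sufficiently experienced players."""
    eligible = [n for n, (made, _, _) in players.items()
                if made > 0 and made >= min_experience]
    return sorted(eligible, key=lambda n: players[n][1] / players[n][0], reverse=True)


def simple_majority(players):                       # Model 1
    return Counter(v for _, _, v in players.values()).most_common(1)[0][0]


def follow_the_leader(players, min_experience=0):   # Models 2 and 3
    return players[ranked(players, min_experience)[0]][2]


def weighted_top_n(players, min_experience=5, max_size=22, alpha=0.1):  # Model 4 style
    totals = Counter()
    for rank, name in enumerate(ranked(players, min_experience)[:max_size]):
        totals[players[name][2]] += exp(-alpha * rank)  # assumed weighting form
    return totals.most_common(1)[0][0]


print(simple_majority(players))
print(follow_the_leader(players))
print(follow_the_leader(players, min_experience=5))
print(weighted_top_n(players))
```

In this toy example the unthresholded leader is a player with a perfect record over only three predictions, which is exactly the kind of noise an experience threshold is meant to screen out.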

  147. CASE
    LEVEL
    CUMULATIVE
    ACCURACY

    View Slide

  148. JUSTICE
    LEVEL
    CUMULATIVE
    ACCURACY

    View Slide

  149. DISTRIBUTION
    OF JUSTICE
    LEVEL MODEL
    ACCURACY

    View Slide

  150. View Slide

  151. ROBUSTNESS
    OF PERFORMANCE
    WE SIMULATE THE
    PERFORMANCE OF
    *EACH* OF THE
    277,201 POTENTIAL
    CROWD MODELS

    View Slide

  152. ROBUSTNESS VISUALIZED
    THIS IS ALL RELATIVE
    TO THE NULL MODEL
    (OF ALWAYS GUESS
    REVERSE)

    View Slide

  153. ROBUSTNESS VISUALIZED
    THE CONTOUR PLOT
    FLATTENS THE
    DIMENSIONALITY
    OF THE SPACE
    (EACH CELL IS THE AVERAGE
    MODEL PERFORMANCE OVER
    ALL OTHER MODEL
    PARAMETERS AT EACH
    EXPERIENCE, RANK COMBO)

    View Slide

  154. ROBUSTNESS VISUALIZED
    JUSTICE LEVEL
    CASE LEVEL

    View Slide

  155. THE KEY IDEA

    View Slide

  156. NOT ALL
    MEMBERS OF
    CROWD ARE
    MADE EQUAL

    View Slide

  157. WE MAINTAIN
    A ‘SUPERCROWD’
    WHICH IS THE TOP N
    OF PREDICTORS
    UP TO TIME T-1

    View Slide

  158. the ‘supercrowd’ outperforms
    the overall crowd
    (and even the best single player)

    View Slide
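
The "up to time T-1" condition on slide 157 matters: the supercrowd for a given case may only be chosen from track records that existed before that case was decided. A minimal walk-forward sketch follows, with invented players and cases; the function name and the fallback to the whole crowd when nobody has a record yet are assumptions made for this illustration.

```python
from collections import Counter, defaultdict


def supercrowd_forecasts(cases, top_n=2):
    """cases: chronological list of (votes_by_player, actual_outcome)."""
    made, correct = defaultdict(int), defaultdict(int)
    forecasts = []
    for votes, outcome in cases:
        # Rank only on results known before this case (time T-1).
        seen = [p for p in votes if made[p] > 0]
        seen.sort(key=lambda p: correct[p] / made[p], reverse=True)
        crowd = seen[:top_n] or list(votes)  # first case: nobody has a record yet
        forecasts.append(Counter(votes[p] for p in crowd).most_common(1)[0][0])
        # Only after forecasting do we learn the outcome and update records.
        for p, v in votes.items():
            made[p] += 1
            correct[p] += (v == outcome)
    return forecasts


# Hypothetical two-case history.
cases = [
    ({"ann": "reverse", "bob": "affirm", "cat": "affirm"}, "reverse"),
    ({"ann": "affirm", "bob": "reverse", "cat": "reverse"}, "affirm"),
]
print(supercrowd_forecasts(cases, top_n=1))
```

In the second case the full crowd's majority would say "reverse", while the top-1 supercrowd follows the only player with a good record so far.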

  159. https://arxiv.org/abs/1712.03846
    https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3085710

    View Slide

  160. View Slide

  161. ALGORITHMS

    View Slide

  162. http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0174698
    Katz DM, Bommarito MJ II, Blackman J (2017), A General
    Approach for Predicting the Behavior of the Supreme Court
    of the United States. PLoS ONE 12(4): e0174698.

    View Slide

  163. Our algorithm is a special version
    of random forest (time evolving)
    available at
    http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0174698
    RESEARCH ARTICLE
    A general approach for predicting the behavior of the Supreme Court of the United States
    Daniel Martin Katz (1,2,*), Michael J. Bommarito II (1,2), Josh Blackman (3)
    1 Illinois Tech - Chicago-Kent College of Law, Chicago, IL, United States of America
    2 CodeX - The Stanford Center for Legal Informatics, Stanford, CA, United States of America
    3 South Texas College of Law Houston, Houston, TX, United States of America
    * [email protected]
    Abstract
    Building on developments in machine learning and prior work in the science of judicial prediction, we construct a model designed to predict the behavior of the Supreme Court of the United States in a generalized, out-of-sample context. To do so, we develop a time-evolving random forest classifier that leverages unique feature engineering to predict more than 240,000 justice votes and 28,000 cases outcomes over nearly two centuries (1816-2015). Using only data available prior to decision, our model outperforms null (baseline) models at both the justice and case level under both parametric and non-parametric tests. Over nearly two centuries, we achieve 70.2% accuracy at the case outcome level and 71.9% at the justice vote level. More recently, over the past century, we outperform an in-sample optimized null model by nearly 5%. Our performance is consistent with, and improves on the general level of prediction demonstrated by prior work; however, our model is distinctive because it can be applied out-of-sample to the entire past and future of the Court, not a single term. Our results represent an important advance for the science of quantitative legal prediction and portend a range of other potential applications.
    Introduction
    As the leaves begin to fall each October, the first Monday marks the beginning of another term for the Supreme Court of the United States. Each term brings with it a series of challenging, important cases that cover legal questions as diverse as tax law, freedom of speech, patent law, administrative law, equal protection, and environmental law. In many instances, the Court’s decisions are meaningful not just for the litigants per se, but for society as a whole.
    Unsurprisingly, predicting the behavior of the Court is one of the great pastimes for legal and political observers. Every year, newspapers, television and radio pundits, academic journals, law reviews, magazines, blogs, and tweets predict how the Court will rule in a particular case. Will the Justices vote based on the political preferences of the President who appointed them or form a coalition along other dimensions? Will the Court counter expectations with an unexpected ruling?
    OPEN ACCESS
    Citation: Katz DM, Bommarito MJ, II, Blackman J (2017) A general approach for predicting the behavior of the Supreme Court of the United States. PLoS ONE 12(4): e0174698. https://doi.org/10.1371/journal.pone.0174698
    Editor: Luís A. Nunes Amaral, Northwestern University, UNITED STATES
    Received: January 17, 2017
    Accepted: March 13, 2017
    Published: April 12, 2017
    Copyright: © 2017 Katz et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
    Data Availability Statement: Data and replication code are available on Github at the following URL: https://github.com/mjbommar/scotus-predict-v2/.
    Funding: The author(s) received no specific funding for this work.
    Competing interests: All Authors are Members of a LexPredict, LLC which provides consulting services to various legal industry stakeholders. We received no financial contributions from LexPredict or anyone else for this paper. This does not alter our adherence to PLOS ONE policies on sharing data and materials.

    View Slide

  164. THE SOURCE CODE FOR
    OUR ALGO PAPER IS
    AVAILABLE ON

    View Slide

  165. View Slide

  166. WE CALL OUR
    ALGO PAPER A
    ‘GENERAL’
    APPROACH

    View Slide

  167. BECAUSE WE ARE
    NOT INTERESTED
    IN A LOCALLY
    TUNED MODEL
    BUT RATHER A
    MODEL THAT CAN
    ‘STAND THE TEST
    OF TIME’

    View Slide

  168. GENERAL SCOTUS
    PREDICTION
    243,882 Justice Votes
    28,009 Case Outcomes
    1816-2015

    View Slide

  169. GENERAL SCOTUS
    PREDICTION
    70.2% accuracy at the
    case outcome level
    71.9% at the justice
    vote level
    1816-2015

    View Slide

  170. WE PREDICT
    *NOT* A SINGLE
    YEAR BUT RATHER
    ~200 YEARS
    (1816-2015)
    OUT OF SAMPLE

    View Slide
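
"Out of sample" on slide 170 means every term is predicted using only information available before it. The sketch below shows that walk-forward evaluation loop with a deliberately trivial stand-in model (guess the most common past outcome). It is not the authors' time-evolving random forest, and the three toy terms are invented; the point is only the ordering of "predict first, then add to the training history".

```python
from collections import Counter


def walk_forward_accuracy(terms):
    """terms: chronological list of lists of outcomes, one list per term."""
    history, hits, total = [], 0, 0
    for outcomes in terms:
        if history:
            # "Fit" on past terms only: guess the most common past outcome.
            guess = Counter(history).most_common(1)[0][0]
            hits += sum(o == guess for o in outcomes)
            total += len(outcomes)
        # Only now does this term become training data for later terms.
        history.extend(outcomes)
    return hits / total if total else float("nan")


# Hypothetical outcomes for three consecutive terms.
terms = [
    ["reverse", "reverse", "affirm"],
    ["reverse", "affirm"],
    ["affirm", "affirm", "reverse"],
]
print(walk_forward_accuracy(terms))
```

A model tuned on the full record and then scored on that same record would look better than this, which is why the deck distinguishes a general approach from a locally tuned one.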

  171. NOW IT IS WORTH NOTING
    THAT PREDICTION
    ORIENTED PAPERS ARE
    CURRENTLY SWIMMING
    AGAINST MAINSTREAM
    SOCIAL SCIENCE (AND LAW)

    View Slide

  172. CAUSAL INFERENCE
    IS THE HALLMARK OF
    MOST QUANT ORIENTED
    LAW + SOCIAL SCIENCE
    SCHOLARSHIP

    View Slide

  173. IT IS BEST SUITED TO
    POLICY EVALUATION
    (SUCH AS DOES THIS
    PARTICULAR POLICY
    INTERVENTION ACHIEVE
    ITS STATED OBJECTIVES)

    View Slide

  174. OR INSTANCES WHERE
    ESTABLISHING LINKS
    BETWEEN CAUSE AND
    EFFECT IS CRITICAL

    View Slide

  175. BUT THERE IS AN
    ALTERNATIVE
    PARADIGM
    #PREDICTION

    View Slide

  176. MACHINE LEARNING
    PREDICTIVE ANALYTICS
    ‘INVERSE’ PROBLEM
    B-SCHOOL
    COMP SCI
    PHYSICS
    PREDICTION

    View Slide

  177. Andrew D. Martin, Kevin M. Quinn, Theodore W. Ruger & Pauline T.
    Kim, Competing Approaches to Predicting Supreme Court Decision
    Making, 2 Perspectives on Politics 761 (2004).
    “the best test of an explanatory theory is its
    ability to predict future events. To the extent
    that scholars in both disciplines (social
    science and law) seek to explain court
    behavior, they ought to test their theories
    not only against cases already decided, but
    against future outcomes as well.”

    View Slide

  178. MORE ON THAT
    TOPIC HERE
    https://www.computationallegalstudies.com/2017/08/28/legal-analytics-versus-empirical-legal-studies-causal-inference-vs-prediction/
    https://www.slideshare.net/Danielkatz/legal-analytics-versus-empirical-legal-studies-or-causal-inference-vs-prediction-redux

    View Slide

  179. View Slide

  180. THERE IS
    GROWING
    INTEREST IN THE
    PREDICTION
    CENTRIC
    APPROACH

    View Slide

  181. “There are two cultures in the use of
    statistical modeling to reach conclusions
    from data. One assumes that the data
    are generated by a given stochastic data
    model. The other uses algorithmic
    models and treats the data mechanism
    as unknown …. If our goal as a field is
    to use data to solve problems, then we
    need to move away from exclusive
    dependence on data models and adopt
    a more diverse set of tools.”
    Leo Breiman, Statistical modeling:
    The two cultures (with comments
    and a rejoinder by the author), 16
    Statistical Science 199 (2001)
    Note: Leo Breiman Invented
    Random Forests

    View Slide

  182. View Slide

  183. View Slide

  184. View Slide

  185. View Slide

  186. View Slide

  187. Susan Athey, The Impact of Machine Learning on Economics
    http://www.nber.org/chapters/c14009.pdf

    View Slide

  188. 355 SCIENCE 6324
    3 FEBRUARY 2017
    SPECIAL ISSUE
    ON PREDICTION

    View Slide

  189. View Slide

  190. EXPERTS,
    CROWDS,
    ALGORITHMS

    View Slide

  191. PREDICTION IS
    NOT NECESSARILY
    #ML ALONE BUT
    RATHER SOME
    ENSEMBLE OF
    EXPERTS,
    CROWDS +
    ALGORITHMS

    View Slide

  192. http://www.sciencemag.org/news/2017/05/artificial-intelligence-prevails-predicting-supreme-court-decisions
    Professor Katz noted
    that in the long term
    …“We believe the
    blend of experts,
    crowds, and algorithms
    is the secret sauce for
    the whole thing.”
    May 2nd 2017

    View Slide

  193. IN MANY INSTANCES
    BLENDS OF
    INTELLIGENCE WILL
    OUTPERFORM A
    SINGLE STREAM OF
    INTELLIGENCE

    View Slide

  194. THE
    PSEUDOCODE OF
    OUR TIMES …

    View Slide

  195. HUMANS
    +
    MACHINES
    >
    HUMANS
    OR
    MACHINES

    View Slide

  196. ENSEMBLE MODEL
    [Diagram: crowd forecast + algorithm forecast -> ensemble method]
    learning problem is to discover how to blend streams of intelligence
    we can use machine learning methods and metadata such as
    case topic, lower court as well as crowd metadata
    to ‘learn’ the conditional weights to apply to
    the input signals

    View Slide
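
One very simple way to "learn conditional weights" in the sense of slide 196 is to weight each stream by how often it was right in past cases that share a piece of metadata. The sketch below conditions on case topic only. It is a stand-in chosen for brevity, not the ensemble method used by the authors; the function names, the add-one smoothing, and the toy history are all assumptions.

```python
from collections import defaultdict


def learn_weights(history):
    """history: list of (topic, crowd_prob, algo_prob, outcome), outcome 1 = reverse."""
    hits = defaultdict(lambda: [1, 1])  # add-one smoothing: [crowd, algorithm]
    for topic, crowd_p, algo_p, outcome in history:
        hits[topic][0] += (crowd_p >= 0.5) == bool(outcome)
        hits[topic][1] += (algo_p >= 0.5) == bool(outcome)
    # Weight placed on the crowd signal for each topic.
    return {t: c / (c + a) for t, (c, a) in hits.items()}


def blend(weights, topic, crowd_p, algo_p):
    w = weights.get(topic, 0.5)  # unseen topic: equal weights
    return w * crowd_p + (1 - w) * algo_p


# Hypothetical past cases.
history = [
    ("patent", 0.7, 0.4, 1),
    ("patent", 0.8, 0.3, 1),
    ("tax", 0.6, 0.2, 0),
]
weights = learn_weights(history)
print(weights)
print(blend(weights, "patent", 0.6, 0.2))
```

A richer version would replace the per-topic hit counts with a fitted model over several metadata fields (lower court, crowd size, and so on), but the structure is the same: the weights depend on context rather than being fixed.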

  197. View Slide

  198. THERE ARE LOTS OF
    INTERESTING
    APPLICATIONS ONCE
    YOU CAN PREDICT
    SOMETHING …

    View Slide

  199. AND BY PREDICT …
    I AM TALKING ABOUT
    PREDICTION AT THE
    PORTFOLIO LEVEL …

    View Slide

  200. I HAVE BEEN VERY
    INTERESTED IN THE
    OVERLAP BETWEEN
    LEGAL TECH
    FIN TECH

    View Slide

  201. Fin(Legal)Tech Conference
    finlegaltechconference.com
    @Illinois Tech - Chicago Kent College of Law

    View Slide

  202. https://computationallegalstudies.com/2017/10/24/10-legal-tech-lessons-dollars-doughnuts-fin-legal-tech-via-aba-journal/

    View Slide

  203. 75+ videos and counting
    TheLawLabChannel.com

    View Slide

  204. View Slide

  205. Law on the Market

    View Slide

  206. When we would
    present this work on
    #SCOTUS Prediction
    folks would ask us
    “why do I care about
    marginal improvements
    in prediction?”

    View Slide

  207. Well, at a very minimum, if you
    could predict the cases you could
    perhaps trade on them in the
    relevant securities market …

    View Slide

  208. In other words, given our
    ability to offer forecasts of
    judicial outcomes, we
    wondered if this information
    could inform an event
    driven trading strategy?

    View Slide

  209. available at
    http://arxiv.org/abs/1508.05751
    http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2649726

    View Slide

  210. We call this idea
    “Law on the Market”
    LOTM

    View Slide

  211. A Motivating Example
    Myriad Genetics
    NASDAQ: MYGN
    Market Cap of ~$3 billion+

    View Slide

  212. Myriad Genetics
    “Myriad employs a number of proprietary
    technologies that permit doctors and patients
    to understand the genetic basis of human
    disease and the role that genes play in the
    onset, progression and treatment of disease.”

    View Slide

  213. Myriad Genetics
    “Myriad was the subject of scrutiny
    after it became involved in a lengthy
    lawsuit over its controversial patenting
    practices,” which included the
    patenting of human gene sequences …

    View Slide

  214. View Slide

  215. June 13, 2013
    Supreme Court
    Offers this
    Decision
    ~10:05am

    View Slide

  216. Initial Media
    Reports and
    Initial Trading
    11:48am

    View Slide

  217. Initial Media
    Reports
    Early
    Afternoon
    “In early afternoon trading
    Thursday, Myriad shares
    were up 5.4 percent, or
    $2.36, at $35.73.”

    View Slide

  218. Final Media
    Reports

    View Slide

  219. Final Media
    Reports

    View Slide

  220. 9:30am
    Open

    View Slide

  221. 10:00am
    SCOTUS

    View Slide

  222. 11:05am
    MYGN Trades UP

    View Slide

  223. 2:15pm
    MYGN is Off its
    Daily Peak but still up

    View Slide

  224. Day 1 Close
    MYGN
    is Off Nearly 10% from Open
    and 20% from Daily High

    View Slide

  225. Day 2 the Sell Off Continues

    View Slide

  226. A Good Time to Buy an Option :)

    View Slide

  227. View Slide

  228. View Slide

  229. SO THESE EXAMPLES
    REPRESENT A FORM OF
    EXISTENCE PROOF …

    View Slide

  230. BUT PERHAPS THEY
    ARE RARE AND
    ANACHRONISTIC
    CASES …?

    View Slide

  231. View Slide

  232. ONE OBVIOUS CHALLENGE
    IS THE PROSPECT THAT
    THIS INFORMATION IS
    ALREADY INCORPORATED
    INTO THE PRICE OF THE
    RELEVANT SECURITY
    #EfficientMarketHypothesis
    #Fama #EMH

    View Slide

  233. IN ALLIED FIELDS OF HUMAN
    ENDEAVOR, THERE ARE FAIRLY RAPID
    MARKET RESPONSES TO CHANGES IN
    THE INFORMATION ENVIRONMENT

    View Slide

  234. THIS ALL PRESUPPOSES A
    RIGOROUS INFORMATION AND
    MODELING ENVIRONMENT —
    THAT IS GENERALLY LACKING
    IN QUESTIONS OF LEGAL
    PREDICTION
    #QuantitativeLegalPrediction
    #LegalAnalytics #FinLegalTech

    View Slide

  235. View Slide

  236. BUT PERHAPS THEY
    ARE RARE AND
    ANACHRONISTIC
    CASES …?

    View Slide

  237. View Slide

  238. Theoretical +
    Empirical Questions

    View Slide

  239. Market Incorporation Hypothesis
    Are judicial decisions already
    reflected in the share price?
    (If this were true, we would rarely
    see the market move post-decision.)

    View Slide

  240. How General Are
    These Specific Examples?
    Theoretical + Empirical Questions
    (In other words, is this a general phenomenon ?)

    View Slide

  241. What is the nature of the signal
    incorporation environment ?
    (In other words, what are the dynamics associated with
    the market response?)
    Theoretical + Empirical Questions

    View Slide

  242. View Slide

  243. METHODS

    View Slide

  244. (1) Coding / PreProcessing
    (2) Candidate LOTM Events
    (3) Formal Evaluation Using
    CAPM (market model of returns)
    (4) Evaluate Speed of Incorporation
    and Related Informational Dynamics

    View Slide

  245. (1) Coding / PreProcessing
    We reviewed and coded 1,363 total
    cases decided over the period in
    question.
    We asked a simple question - could
    this case plausibly impact a publicly
    traded security ?

    View Slide

  246. All Data & Code Are Available Here^
    https://github.com/mjbommar/law-on-the-market
    ^Other than the WRDS data, which is *not* open source but
    can be obtained from Wharton:
    https://wrds-web.wharton.upenn.edu/wrds/

    View Slide

  247. (3) Formal Evaluation Using
    CAPM (market model of returns)

    View Slide

  248. Abnormal
    Returns
    A common approach is
    to use an index as the baseline
    and seek to identify
    statistically significant
    deviations from that
    baseline.
    We want to isolate the
    effect of the event from
    other general market
    movements.
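
A minimal sketch of the abnormal-return idea, assuming a simple market model: regress the stock's returns on index returns over a pre-event estimation window, then treat the event-window residuals as abnormal returns. The return series below are invented for illustration and the function names are hypothetical. The paper's actual specification may differ.

```python
from statistics import mean


def fit_market_model(stock_returns, index_returns):
    """Ordinary least squares fit of stock = alpha + beta * index."""
    mx, my = mean(index_returns), mean(stock_returns)
    sxx = sum((m - mx) ** 2 for m in index_returns)
    sxy = sum((m - mx) * (s - my) for m, s in zip(index_returns, stock_returns))
    beta = sxy / sxx
    alpha = my - beta * mx
    return alpha, beta


def abnormal_returns(stock_returns, index_returns, alpha, beta):
    """Actual return minus the return the market model would have expected."""
    return [s - (alpha + beta * m) for s, m in zip(stock_returns, index_returns)]


def cumulative(values):
    """Running sum, giving cumulative abnormal returns (CAR)."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


# Hypothetical estimation window: the stock tracks the index with beta near 1.2.
est_index = [0.01, -0.02, 0.005, 0.015, -0.01, 0.0, 0.02, -0.005]
noise = [0.001, -0.001, 0.0005, -0.0005, 0.001, -0.001, 0.0005, -0.0005]
est_stock = [1.2 * m + e for m, e in zip(est_index, noise)]
alpha, beta = fit_market_model(est_stock, est_index)

# Hypothetical event window: a jump on the decision, then a partial reversal.
evt_index = [0.002, -0.001]
evt_stock = [1.2 * 0.002 + 0.05, 1.2 * -0.001 - 0.03]
ar = abnormal_returns(evt_stock, evt_index, alpha, beta)
car = cumulative(ar)
```

Judging statistical significance would then mean comparing each abnormal return with the spread of the estimation-window residuals, which this sketch leaves out.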

    View Slide

  249. This Paper Leverages 5-Minute Data
    -5 Days, +5 Days
    A KEY POINT
    much higher frequency than
    most papers in the literature
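
To give a sense of scale for such a window, the sketch below enumerates 5-minute bar timestamps for five trading days on either side of an event date. It assumes the "days" are trading days, treats every weekday as a trading day (holidays are ignored), and assumes regular hours of 9:30am to 4:00pm. These are simplifying assumptions, not details taken from the paper, and the function names are hypothetical.

```python
from datetime import date, datetime, time, timedelta


def trading_days_around(event_day, days_before=5, days_after=5):
    """Weekdays surrounding the event day (holidays are NOT handled)."""
    def walk(direction, count):
        found, day = [], event_day
        while len(found) < count:
            day += timedelta(days=direction)
            if day.weekday() < 5:  # Monday=0 ... Friday=4
                found.append(day)
        return found

    before = walk(-1, days_before)[::-1]
    after = walk(1, days_after)
    return before + [event_day] + after


def five_minute_bars(day, open_time=time(9, 30), close_time=time(16, 0)):
    """Start timestamps of each 5-minute bar in the assumed session."""
    current = datetime.combine(day, open_time)
    end = datetime.combine(day, close_time)
    bars = []
    while current < end:
        bars.append(current)
        current += timedelta(minutes=5)
    return bars


window_days = trading_days_around(date(2013, 6, 13))
window_bars = [bar for day in window_days for bar in five_minute_bars(day)]
```

Under these assumptions each session has 78 bars, so one event window holds 858 observations, compared with 11 for daily data.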

    View Slide

  250. View Slide

  251. SOME
    RESULTS

    View Slide

  252. Summary
    Results

    View Slide

  253. Market
    Cap

    View Slide

  254. Some
    Additional
    Cases

    View Slide

  255. (4) Evaluate Speed of Incorporation
    and Related Informational Dynamics

    View Slide

  256. Speed
    of
    Information
    Incorporation
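
One simple way to put a number on speed of incorporation, offered here only as an assumed illustration and not as the measure used in the paper, is to count how many 5-minute bars pass before the cumulative abnormal return reaches a chosen fraction of its end-of-window value. The path below is made up, and the function name and 90% threshold are hypothetical.

```python
def bars_to_incorporate(car_path, fraction=0.9):
    """Index of the first bar whose cumulative abnormal return has the same
    sign as the final value and at least `fraction` of its magnitude.
    Returns None if the final value is zero or the level is never reached."""
    final = car_path[-1]
    if final == 0:
        return None
    threshold = fraction * abs(final)
    for index, car in enumerate(car_path):
        if car * final > 0 and abs(car) >= threshold:
            return index
    return None


# Hypothetical cumulative abnormal returns, one value per 5-minute bar,
# with index 0 as the bar in which the decision is announced.
example_path = [0.0, 0.01, 0.02, 0.035, 0.04, 0.04]
bars = bars_to_incorporate(example_path)
minutes = None if bars is None else bars * 5
```

This measure is naive when prices overshoot and then reverse, as in the Myriad example, because the end-of-window value is then a poor stand-in for the fully incorporated price.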

    View Slide

  257. This is very
    slow …
    Perhaps the
    real action is
    in the options
    market ?

    View Slide

  258. In conclusion, we believe that
    this research raises many
    questions and justifies a range
    of future work in the area

    View Slide

  259. Future Work
    Real Trading Strategy Analysis
    Other Classes of Litigation Events
    8-Ks and Docket Arbitrage
    Higher and Lower Order Analysis
    Litigation Reserves, etc.

    View Slide

  260. Litigation Funding

    View Slide

  261. Other Classes of Litigation Events
    Litigation Funding and Reserves
    Litigation
    Reserves

    View Slide

  262. FT Big Read
    Feb 7, 2019

    View Slide

  263. View Slide

  264. TODAY I HAVE
    GIVEN YOU AN
    OVERVIEW OF
    THREE
    APPROACHES TO
    LEGAL PREDICTION

    View Slide

  265. AND
    DEMONSTRATED
    THEIR
    APPLICATION TO
    ONE PARTICULAR
    PROBLEM
    (#SCOTUS)

    View Slide

  266. HOWEVER, I
    HOPE IT IS
    CLEAR THAT
    THESE
    TECHNIQUES
    CAN BE APPLIED
    MORE BROADLY

    View Slide

  267. TO A WIDE
    RANGE OF
    PREDICTION
    PROBLEMS
    ACROSS THE
    LEGAL
    INDUSTRY AND
    BEYOND …

    View Slide

  268. View Slide

  269. Daniel Martin Katz
    @computational
    computationallegalstudies.com
    danielmartinkatz.com
    illinois tech - chicago kent college of law
    @
    thelawlab.com

    View Slide