Modeling Football Results Using Match-specific Covariates

A model for the results of football matches is proposed that can take into account match-specific covariates such as the total distance a team runs in a given match. The model extends the Bradley-Terry model in several ways: in addition to including covariates, it handles ordered response values and (possibly team-specific) home effects. Penalty terms are used to reduce the complexity of the model and to find clusters of teams with equal covariate effects.

MunichDataGeeks

October 08, 2016

Transcript

  1. Modelling Football Results Using Match-specific Covariates. Andreas Groll & Gunther Schauberger & Gerhard Tutz. Datageeks Data Day 2016, October 8th, Munich
  2. Bundesliga 2015/2016

    Position  Team                   Goals For  Goals Against  Points
    1         Bayern München         80         17             88
    2         Borussia Dortmund      82         34             78
    3         Bayer 04 Leverkusen    56         40             60
    4         Bor. Mönchengladbach   67         50             55
    5         FC Schalke 04          51         49             52
    6         1. FSV Mainz 05        46         42             50
    7         Hertha BSC             42         42             50
    8         VfL Wolfsburg          47         49             45
    9         1. FC Köln             38         42             43
    10        Hamburger SV           40         46             41
    11        FC Ingolstadt 04       33         42             40
    12        FC Augsburg            42         52             38
    13        Werder Bremen          50         65             38
    14        SV Darmstadt 98        38         53             38
    15        TSG Hoffenheim         39         54             37
    16        Eintracht Frankfurt    34         52             36
    17        VfB Stuttgart          50         75             33
    18        Hannover 96            31         62             25

    Andreas Groll 2 / 29
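  3. Bradley-Terry Model

    Set of objects $\{a_1, \dots, a_m\}$, with

    $$Y_{(r,s)} = \begin{cases} 1 & \text{if } a_r \text{ is preferred over } a_s \\ 0 & \text{if } a_s \text{ is preferred over } a_r \end{cases}$$

    $$P(Y_{(r,s)} = 1) = \frac{\exp(\gamma_r - \gamma_s)}{1 + \exp(\gamma_r - \gamma_s)}, \qquad \sum_{r=1}^{m} \gamma_r = 0$$

    $\gamma_r$: attractivity/strength of object $r$
    $\gamma_s$: attractivity/strength of object $s$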
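As a rough illustration of the Bradley-Terry probability above, the following minimal sketch evaluates $P(Y_{(r,s)} = 1)$ for a pair of objects. The function name `bt_probability` and the three strength values are hypothetical and chosen only so that the strengths sum to zero; they do not come from the slides.

```python
import math


def bt_probability(gamma_r: float, gamma_s: float) -> float:
    """P(Y(r,s) = 1): probability that object r is preferred over object s."""
    # exp(d) / (1 + exp(d)) rewritten as 1 / (1 + exp(-d))
    return 1.0 / (1.0 + math.exp(-(gamma_r - gamma_s)))


# Hypothetical strengths for three objects, chosen to sum to zero.
strengths = {"a1": 0.8, "a2": -0.3, "a3": -0.5}

p_12 = bt_probability(strengths["a1"], strengths["a2"])
p_21 = bt_probability(strengths["a2"], strengths["a1"])
print(round(p_12, 3), round(p_21, 3))
```

Because only the difference of the strengths enters the formula, the two probabilities for a pair always add up to one, and equal strengths give a probability of 0.5.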
  4. Bradley-Terry Model (continued)

    ⇒ We additionally need
    • ordinal response
    • order effects
  5. Basic Model

    A match between teams $r$ and $s$ is treated as a paired comparison with ordinal response $Y_{(r,s)}$, with

    $$Y_{(r,s)} = \begin{cases} 1 & \text{if team } r \text{ wins by at least 2 goals difference} \\ 2 & \text{if team } r \text{ wins by 1 goal difference} \\ 3 & \text{if the match ends with a draw} \\ 4 & \text{if team } s \text{ wins by 1 goal difference} \\ 5 & \text{if team } s \text{ wins by at least 2 goals difference} \end{cases}$$

    $$P(Y_{(r,s)} \le k) = \frac{\exp(\delta + \theta_k + \gamma_r - \gamma_s)}{1 + \exp(\delta + \theta_k + \gamma_r - \gamma_s)}, \qquad k = 1, \dots, 5$$

    $\delta$: home effect
    $\theta_k$: category-specific threshold parameters, $\theta_1 = -\theta_4$, $\theta_2 = -\theta_3$
    $\gamma_r, \gamma_s$: team-specific abilities, $\sum_{r=1}^{18} \gamma_r = 0$
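The five-category response defined above can be sketched as a small encoding function. The name `encode_response` is hypothetical; the three scorelines used below are the ones shown on the Data Example slide (5:0, 2:1 and 0:1, home team listed first).

```python
def encode_response(goals_r: int, goals_s: int) -> int:
    """Map the score of team r (home) vs. team s (away) to the ordinal category 1..5."""
    diff = goals_r - goals_s
    if diff >= 2:
        return 1  # team r wins by at least 2 goals
    if diff == 1:
        return 2  # team r wins by 1 goal
    if diff == 0:
        return 3  # draw
    if diff == -1:
        return 4  # team s wins by 1 goal
    return 5      # team s wins by at least 2 goals


# Scorelines from the Data Example slide: (home goals, away goals).
matches = [(5, 0), (2, 1), (0, 1)]
responses = [encode_response(home, away) for home, away in matches]
print(responses)
```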
  6. Results Basic Model

    $$P(Y_{(r,s)} \le k) = \frac{\exp(\delta + \theta_k + \gamma_r - \gamma_s)}{1 + \exp(\delta + \theta_k + \gamma_r - \gamma_s)}$$

    $\hat{\delta} = 0.265$, $\hat{\theta}_1 = -\hat{\theta}_4 = -1.591$, $\hat{\theta}_2 = -\hat{\theta}_3 = -0.576$

    Rank  Team  Estimated ability  Rank by estimated ability
    1     BAY    1.899             1
    2     DOR    1.598             2
    3     LEV    0.433             4
    4     MGB    0.475             3
    5     S04    0.133             5
    6     MAI    0.088             6
    7     BER   -0.001             7
    8     WOB   -0.142             9
    9     KOE   -0.045             8
    10    HSV   -0.183             10
    11    ING   -0.228             11
    12    AUG   -0.363             13
    13    BRE   -0.361             12
    14    DAR   -0.467             15
    15    HOF   -0.448             14
    16    FRA   -0.623             16
    17    STU   -0.699             17
    18    HAN   -1.068             18
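To make the estimates above more tangible, the following sketch plugs them into the cumulative model and derives category probabilities for one pairing, Bayern München at home against Hannover 96. The parameter values are the ones reported on this slide; the function name, the choice of pairing, and the convention $P(Y \le 5) = 1$ (needed to close the last category) are assumptions made for illustration.

```python
import math

# Estimates reported on the "Results Basic Model" slide.
DELTA = 0.265
THETAS = (-1.591, -0.576, 0.576, 1.591)  # theta_1..theta_4 via theta_3 = -theta_2, theta_4 = -theta_1
GAMMA = {"BAY": 1.899, "HAN": -1.068}    # two of the 18 estimated abilities


def category_probabilities(home: str, away: str) -> list[float]:
    """Return [P(Y=1), ..., P(Y=5)] for a match with `home` as team r."""
    eta = DELTA + GAMMA[home] - GAMMA[away]
    cumulative = [1.0 / (1.0 + math.exp(-(theta + eta))) for theta in THETAS]
    cumulative.append(1.0)  # assumption: P(Y <= 5) = 1
    # Category probabilities are differences of neighbouring cumulative probabilities.
    return [cumulative[0]] + [cumulative[k] - cumulative[k - 1] for k in range(1, 5)]


probs = category_probabilities("BAY", "HAN")
home_win, draw, away_win = probs[0] + probs[1], probs[2], probs[3] + probs[4]
print([round(p, 3) for p in probs])
```

Collapsing categories 1 and 2 gives the probability of a home win, category 3 the draw, and categories 4 and 5 the away win.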
  7. Covariates (per match, per team)

    Distance        Total distance run (km)
    BallPossession  Percentage of ball possession
    TacklingRate    Rate of tackles won
    ShotsonGoal     Total number of shots on goal
    CompletionRate  Percentage of passes reaching teammates
    FoulsSuffered   Number of fouls suffered
    Offside         Number of offsides (in attack)

    Source: German football magazine kicker (http://www.kicker.de/)
  8. Data Example

    Match  Goals  Home  Team                 Distance  Shots on Goal  ...
    1      5      yes   Bayern München       109       23             ...
    1      0      no    Hamburger SV         111       5              ...
    2      2      yes   Bayer 04 Leverkusen  116       25             ...
    2      1      no    TSG Hoffenheim       116       6              ...
    3      0      yes   FC Augsburg          106       20             ...
    3      1      no    Hertha BSC           111       11             ...
    ...    ...    ...   ...                  ...       ...            ...
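  9. Inclusion of covariates

    $$P(Y_{(r,s)} \le k) = \frac{\exp(\delta + \theta_k + \gamma_r - \gamma_s)}{1 + \exp(\delta + \theta_k + \gamma_r - \gamma_s)}$$

    Match-specific response: $Y_{(r,s)} \to Y_{i(r,s)}$
    Team-specific home effects: $\delta \to \delta_r$
    Match-specific team abilities: $\gamma_r \to \gamma_{ir} = \beta_{r0} + z_{ir}^T \alpha_r$

    $$P(Y_{i(r,s)} \le k) = \frac{\exp(\delta_r + \theta_k + \gamma_{ir} - \gamma_{is})}{1 + \exp(\delta_r + \theta_k + \gamma_{ir} - \gamma_{is})} = \frac{\exp(\delta_r + \theta_k + \beta_{r0} - \beta_{s0} + z_{ir}^T \alpha_r - z_{is}^T \alpha_s)}{1 + \exp(\delta_r + \theta_k + \beta_{r0} - \beta_{s0} + z_{ir}^T \alpha_r - z_{is}^T \alpha_s)}$$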
  10. Model Specification

    $$P(Y_{i(r,s)} \le k) = \frac{\exp(\delta_r + \theta_k + \gamma_{ir} - \gamma_{is})}{1 + \exp(\delta_r + \theta_k + \gamma_{ir} - \gamma_{is})} = \frac{\exp(\delta_r + \theta_k + \beta_{r0} - \beta_{s0} + z_{ir}^T \alpha_r - z_{is}^T \alpha_s)}{1 + \exp(\delta_r + \theta_k + \beta_{r0} - \beta_{s0} + z_{ir}^T \alpha_r - z_{is}^T \alpha_s)}$$

    $\delta_r$: team-specific home effect of team $r$
    $\theta_k$: category-specific threshold parameters
    $\beta_{r0}$: team-specific intercepts
    $z_{ir}$: $p$-dimensional covariate vector that varies over teams and matches
    $\alpha_r$: $p$-dimensional parameter vector that varies over teams
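The specification above can be sketched in a few lines: each team's match-specific ability is an intercept plus a covariate term, and the difference of the two abilities enters the cumulative logit together with the home effect of team $r$. All numbers below are hypothetical (two covariates, assumed to be standardized), and the function names are illustrative only.

```python
import math


def ability(beta0: float, alpha: list[float], z: list[float]) -> float:
    """Match-specific ability gamma_ir = beta_r0 + z_ir^T alpha_r."""
    return beta0 + sum(a * x for a, x in zip(alpha, z))


def cumulative_probability(k, delta_r, thetas, gamma_ir, gamma_is):
    """P(Y_i(r,s) <= k) for k = 1..4, with team r playing at home."""
    eta = delta_r + thetas[k - 1] + gamma_ir - gamma_is
    return 1.0 / (1.0 + math.exp(-eta))


# All numbers below are hypothetical and only illustrate the structure.
thetas = (-1.5, -0.5, 0.5, 1.5)
team_r = {"delta": 0.3, "beta0": 0.4, "alpha": [0.20, 0.10]}
team_s = {"beta0": -0.1, "alpha": [0.05, 0.10]}
z_ir = [0.5, 1.2]    # covariates of team r in match i
z_is = [-0.3, -0.4]  # covariates of team s in match i

gamma_ir = ability(team_r["beta0"], team_r["alpha"], z_ir)
gamma_is = ability(team_s["beta0"], team_s["alpha"], z_is)
cumulative = [
    cumulative_probability(k, team_r["delta"], thetas, gamma_ir, gamma_is)
    for k in range(1, 5)
]
print(round(gamma_ir, 3), round(gamma_is, 3))
```

Note that the same covariate can carry a different weight for each team, because the parameter vector is indexed by team.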
  11. Estimation and Penalization

    Maximize the penalized log-likelihood

    $$l_p(\cdot) = l(\cdot) - \lambda J(\cdot), \qquad J(\cdot) = P_\delta(\cdot) + P_\alpha(\cdot),$$

    combining the penalties

    $$P_\delta(\delta_1, \dots, \delta_m) = \sum_{r<s} |\delta_r - \delta_s|$$

    ⇒ Fusion of home effects

    $$P_\alpha(\alpha_1, \dots, \alpha_m) = \sum_{j=1}^{p} \left( \sum_{r<s} |\alpha_{rj} - \alpha_{sj}| + \sum_{r=1}^{m} |\alpha_{rj}| \right)$$

    ⇒ Fusion and selection of covariate effects
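The two penalty terms above can be evaluated directly for given parameter values. The sketch below only computes the penalty and the penalized criterion; it does not perform the maximization. Function names and all numbers are hypothetical.

```python
from itertools import combinations


def fusion_penalty(values):
    """Sum of absolute pairwise differences over all pairs r < s."""
    return sum(abs(a - b) for a, b in combinations(values, 2))


def home_penalty(delta):
    """P_delta: fusion penalty on the team-specific home effects."""
    return fusion_penalty(delta)


def covariate_penalty(alpha):
    """P_alpha for alpha[r][j]: fusion plus absolute-value (selection) term per covariate j."""
    m, p = len(alpha), len(alpha[0])
    total = 0.0
    for j in range(p):
        column = [alpha[r][j] for r in range(m)]
        total += fusion_penalty(column) + sum(abs(a) for a in column)
    return total


def penalized_loglik(loglik, lam, delta, alpha):
    """l_p = l - lambda * (P_delta + P_alpha)."""
    return loglik - lam * (home_penalty(delta) + covariate_penalty(alpha))


# Hypothetical values for m = 3 teams and p = 2 covariates.
delta = [0.3, 0.3, 0.1]
alpha = [[0.5, 0.0], [0.5, 0.0], [0.2, 0.0]]
print(home_penalty(delta), covariate_penalty(alpha))
```

In this toy example the first two teams share the same home effect and the same coefficient for the first covariate, so their pairwise differences contribute nothing, while the second covariate is zero for all teams and adds no penalty at all.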
  12. Coefficient Paths

    [Figure: coefficient paths of all 18 teams plotted against log(λ + 1), one panel each for Home, Intercept, Distance, BallPossession, TacklingRate, ShotsonGoal, CompletionRate, FoulsSuffered and Offside]
  18. Variable Importance

    [Figure: variable importance plotted against log(λ + 1) for FoulsSuffered, ShotsonGoal, Offside, Tackling, CompletionRate, BallPossession and Distance]
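  19. Specification Model II

    $$P(Y_{i(r,s)} \le k) = \frac{\exp(\delta_r + \theta_k + \gamma_{ir} - \gamma_{is})}{1 + \exp(\delta_r + \theta_k + \gamma_{ir} - \gamma_{is})} = \frac{\exp(\delta_r + \theta_k + \beta_{r0} - \beta_{s0} + z_{ir}^T \alpha_r - z_{is}^T \alpha_s)}{1 + \exp(\delta_r + \theta_k + \beta_{r0} - \beta_{s0} + z_{ir}^T \alpha_r - z_{is}^T \alpha_s)}$$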
  20. Specification Model II

    The team-specific intercepts $\beta_{r0} - \beta_{s0}$ are struck out of the linear predictor $\delta_r + \theta_k + \beta_{r0} - \beta_{s0} + z_{ir}^T \alpha_r - z_{is}^T \alpha_s$.
  21. Specification Model II

    Match-specific team abilities: $\gamma_{ir} = z_{ir}^T \alpha_r$
    ⇒ Abilities are solely explained by covariates
  22. Specification Model II

    With $\gamma_{ir} = z_{ir}^T \alpha_r$, the model becomes

    $$P(Y_{i(r,s)} \le k) = \frac{\exp(\delta_r + \theta_k + \gamma_{ir} - \gamma_{is})}{1 + \exp(\delta_r + \theta_k + \gamma_{ir} - \gamma_{is})} = \frac{\exp(\delta_r + \theta_k + z_{ir}^T \alpha_r - z_{is}^T \alpha_s)}{1 + \exp(\delta_r + \theta_k + z_{ir}^T \alpha_r - z_{is}^T \alpha_s)}$$
  23. Coefficient Paths Model II

    [Figure: coefficient paths of all 18 teams plotted against log(λ + 1), one panel each for Home, Distance, BallPossession, TacklingRate, ShotsonGoal, CompletionRate, FoulsSuffered and Offside]
  27. Summary I

    HomeEffect positive and (almost) equal for all teams

    Model I
    • Covariate effects represent additional information compared to team-specific intercepts
    • Distance most influential covariate (positive effect)
    • Negative effect of BallPossession
    • Strong effect of CompletionRate for Bayern München

    Model II
    • Explanatory power of team-specific intercepts is shifted to covariate effects
    • Dominance of Bayern München and Borussia Dortmund mainly explainable by high CompletionRate
    • Only one covariate is excluded
  28. Summary II

    • General paired comparison model for the inclusion of subject-object-specific covariates
    • Penalty automatically distinguishes between
      − clusters of teams with equal covariate effects
      − global covariate effects
      − exclusion of covariates
    • Corresponding R-package BTLLasso under development
    • Technical report is available on the website of the Department of Statistics of the LMU
  29. References

    Agresti, A. (1992). Analysis of ordinal paired comparison data. Applied Statistics 41(2), 287–297.
    Bradley, R. A. and M. E. Terry (1952). Rank analysis of incomplete block designs, I: The method of pair comparisons. Biometrika 39, 324–345.
    Schauberger, G. (2015). BTLLasso: Modelling Heterogeneity in Paired Comparison Data. R package version 0.1-2.
    Schauberger, G., A. Groll, and G. Tutz (2016). Modeling football results in the German Bundesliga using match-specific covariates. Technical Report 197, Department of Statistics, Ludwig-Maximilians-Universität München, Germany.
    Schauberger, G. and G. Tutz (2015). Modelling heterogeneity in paired comparison data - an L1 penalty approach with an application to party preference data. Technical Report 183, Department of Statistics, Ludwig-Maximilians-Universität München, Germany.
    Tutz, G. (1986). Bradley-Terry-Luce models with an ordered response. Journal of Mathematical Psychology 30, 306–316.
  30. Correlations

                    Home    Distance  BallPossession  Tackling  ShotsonGoal  CompletionRate  FoulsSuffered  Offside
    Home            1.000
    Distance        0.035   1.000
    BallPossession  0.102   -0.113    1.000
    Tackling        0.102   -0.082    0.186           1.000
    ShotsonGoal     0.230   0.042     0.519           0.261     1.000
    CompletionRate  0.068   0.103     0.717           0.118     0.422        1.000
    FoulsSuffered   0.067   -0.200    0.089           0.236     0.035        -0.160          1.000
    Offside         0.038   -0.037    0.091           0.088     0.055        0.042           -0.011         1.000
  31. Extensions of the Bradley-Terry Model

    $$P(Y_{i(r,s)} = 1 \mid x_i, z_r, z_s, z_{ir}, z_{is}) = \frac{\exp(\eta_{i(rs)})}{1 + \exp(\eta_{i(rs)})} = \frac{\exp(\gamma_{ir} - \gamma_{is})}{1 + \exp(\gamma_{ir} - \gamma_{is})}$$

    Covariate type                  Effect type   $\gamma_{ir} =$         $\gamma_{is} =$         $\eta_{i(rs)} =$
    intercept                       object-spec.  $\beta_{r0}$            $\beta_{s0}$            $\beta_{r0} - \beta_{s0}$
    subject-spec. $x_i$             object-spec.  $+ x_i^T \beta_r$       $+ x_i^T \beta_s$       $+ x_i^T (\beta_r - \beta_s)$
    object-spec. $z_r$              global        $+ z_r^T \tau$          $+ z_s^T \tau$          $+ (z_r - z_s)^T \tau$
    subject-object-spec. $z_{ir}$   global        $+ z_{ir}^T \tau$       $+ z_{is}^T \tau$       $+ (z_{ir} - z_{is})^T \tau$
    subject-object-spec. $z_{ir}$   object-spec.  $+ z_{ir}^T \alpha_r$   $+ z_{is}^T \alpha_s$   $+ z_{ir}^T \alpha_r - z_{is}^T \alpha_s$
    order effect                    object-spec.  $+ \delta_r$                                    $+ \delta_r$   → incl.
    order effect                    global        $+ \delta$                                      $+ \delta$
  34. Extensions of the Bradley-Terry Model

    • Voters compare different parties with each other, e.g. CDU vs. SPD, CDU vs. Greens, ...
    • Include subject-specific covariates of the voters like
      • age
      • gender
      • ...
  37. Extensions of the Bradley-Terry Model

    • Model football data of a whole Bundesliga season
    • Include object-specific covariates of the teams like
      • budget
      • ...
  40. Extensions of the Bradley-Terry Model

    • Model football data of a whole Bundesliga season
    • Include subject-object-specific covariates of the teams per match like
      • ball possession
      • number of passes
      • ...
  43. "Predictive" Performance

    Boxplots of differences of the predicted probabilities of the correct 3-way match outcome between the respective model and bookmakers' odds.

    [Figure: boxplots for Basic Model, Model I and Model II; axis from −0.4 to 0.6]
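The quantity shown in the boxplots could be computed along the following lines. This is only a sketch under stated assumptions: the slides do not say how bookmakers' odds were turned into probabilities, so the normalization of inverse decimal odds used here is an assumption, as are the function names and all numbers.

```python
def three_way(probs5):
    """Collapse the five ordinal categories into (home win, draw, away win)."""
    return (probs5[0] + probs5[1], probs5[2], probs5[3] + probs5[4])


def odds_to_probabilities(decimal_odds):
    """Assumed conversion: normalize inverse decimal odds so they sum to one."""
    inverse = [1.0 / o for o in decimal_odds]
    total = sum(inverse)
    return tuple(v / total for v in inverse)


def probability_difference(model_probs5, decimal_odds, outcome):
    """Model probability minus bookmaker probability of the observed outcome.

    outcome: 0 = home win, 1 = draw, 2 = away win.
    """
    return three_way(model_probs5)[outcome] - odds_to_probabilities(decimal_odds)[outcome]


# Hypothetical match: model category probabilities and decimal odds.
model_probs5 = [0.3, 0.3, 0.2, 0.1, 0.1]
decimal_odds = (2.0, 4.0, 4.0)
diff = probability_difference(model_probs5, decimal_odds, outcome=0)
print(round(diff, 3))
```

A positive difference means the model assigned more probability to the observed outcome than the bookmaker did for that match.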