Do You Need Python for Data Analysis? And in the Case of Baseball... #spoana / Python, R, SQL And SABRmetrics

Shinichi Nakagawa
February 16, 2020

Sports Analyst Meetup #6 2020/02/16
https://spoana.connpass.com/event/162114/
#Baseball #DataScience #SABRmetrics #Python

Transcript

  1. 3.

    Who am I? (Who are you, anyway?)
    • Shinichi Nakagawa (中川 伸一)
    • JX Press Corporation / Senior Engineer
    • Organizer of Pythonもくもく自習室 (a Python self-study meetup) https://jisyupy.connpass.com/
    • Found on social media and elsewhere under the name @shinyorke (pronounced "shin-yoku")
    • #Python #DataScience #Baseball⚾ #SABRmetrics #Agile
  2. 18.

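    What is a Similarity Score?
    • The player himself is set at 1000 points, and points are deducted based on differences in stats and fielding position. The fewer points deducted, the more similar the two players are.
    • How it is calculated: https://www.baseball-reference.com/about/similarity.shtml
    • Devised by Bill James, the father of sabermetrics. In other words, it has a long history and has been around for ages (it first appeared in 1994). This will be on the test!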
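
The deduction scheme described on the Similarity Score slide can be sketched roughly as follows. This is a simplified illustration for batters, not the exact Baseball-Reference procedure. The per-stat weights are recalled from the page linked above and should be checked against it. The `BattingLine` class, the `position_penalty` parameter, and the sample stat line are hypothetical, and fractional deductions are used for simplicity.

```python
from dataclasses import dataclass, replace


@dataclass
class BattingLine:
    """Career batting totals for one player (hypothetical container)."""
    games: int
    at_bats: int
    runs: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    rbi: int
    walks: int
    strikeouts: int
    stolen_bases: int

    @property
    def total_bases(self) -> int:
        singles = self.hits - self.doubles - self.triples - self.home_runs
        return singles + 2 * self.doubles + 3 * self.triples + 4 * self.home_runs

    @property
    def batting_average(self) -> float:
        return self.hits / self.at_bats

    @property
    def slugging(self) -> float:
        return self.total_bases / self.at_bats


# Assumed weights: one point is deducted per this much difference in a stat.
# Verify against the Baseball-Reference page before relying on them.
COUNTING_WEIGHTS = {
    "games": 20, "at_bats": 75, "runs": 10, "hits": 15, "doubles": 5,
    "triples": 4, "home_runs": 2, "rbi": 10, "walks": 25,
    "strikeouts": 150, "stolen_bases": 20,
}
RATE_WEIGHTS = {"batting_average": 0.001, "slugging": 0.002}


def similarity_score(a: BattingLine, b: BattingLine,
                     position_penalty: float = 0.0) -> float:
    """Start at 1000 and subtract points for every statistical difference.

    position_penalty is a caller-supplied deduction for differing fielding
    positions (hypothetical parameter; the real table is on the linked page).
    """
    score = 1000.0
    for stat, unit in {**COUNTING_WEIGHTS, **RATE_WEIGHTS}.items():
        score -= abs(getattr(a, stat) - getattr(b, stat)) / unit
    return score - position_penalty


if __name__ == "__main__":
    # Made-up stat lines, not real players.
    player_a = BattingLine(1500, 5600, 800, 1600, 300, 30, 200, 750, 500, 900, 100)
    player_b = replace(player_a, games=1540, walks=550, stolen_bases=60)
    print(similarity_score(player_a, player_b))
```

A player compared with himself scores exactly 1000, and every difference pulls the score down, which matches the slide's point that fewer deductions mean more similar players.
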
  3. 19.

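    Players of interest (major leaguers only)
    • Ichiro Suzuki
    • Yoshitomo Tsutsugo
    • Shohei Ohtani
    Note: for Tsutsugo only, the comparison uses his final stats from his time in Japan (strictly a reference value).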
  4. 26.

    Representative blog content
    • Engineering (programming)
      • Introductions to Python and data books https://shinyorke.hatenablog.com/entry/python2020
      • Pandas, data infrastructure, etc… https://shinyorke.hatenablog.com/entry/nyumon-pandas
    • Baseball (data analysis, sabermetrics)
      • Sabermetrics https://shinyorke.hatenablog.com/entry/sabr-metrics-2020
      • Baseball analysis topics (there are plenty)
  5. 43.

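    Don't lean too heavily on domain knowledge (and love)
    • "Domain knowledge = knowledge of the sport in question", and love needs no explanation. Taken together, they amount to "thinking and acting like a fan (otaku)".
    • Once you pick up the biased views and knowledge that come from a fan's perspective and way of thinking, your actions inevitably become subjective.
    • In short: "Being a fan alone won't make it a job!" The ability to work as a business professional matters more (if you want to make it your job).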
  6. 64.

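    Don't be bound by conventional wisdom
    "Reexamine how things fundamentally work, without being bound by conventional wisdom or rules of thumb." (from セイバーメトリクス入門, "Introduction to Sabermetrics")
    • For example, the media, fans, and ordinary people judge a good hitter by batting average, RBI, and home runs.
    • But is that really good enough?
      • Maybe he piles up hits because the opposing defense is poor (thanks to the opposing team).
      • His team puts lots of runners on base to begin with, so he feasts on cheap RBI (thanks to his own team).
      • ???: "The Home Run Terrace is the best!" (thanks to the ballpark)
    • Start by rethinking what has been called "common sense" until now.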
  7. 65.

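    Prioritize objective facts
    "Think on the basis of objective facts rather than relying on subjectivity." (from セイバーメトリクス入門, "Introduction to Sabermetrics")
    • Baseball is fun and amazing, so subjectivity (and the scope of one's claims) tends to balloon.
      • "A well-rounded outfielder who can run, hit, and field, whose selling point is reliable contact hitting"
      • "A fastball with outstanding life, delivered from a fearless, free-throwing motion"
    • …and so on. Everyone loves this stuff (and gets taken in by it), right? So do I (which is why I look at things objectively).
    • Hold the excitement in check, and read and talk with a focus on "objective" facts, such as "a fastball with above-league-average velocity" or "a lot of line drives with high exit velocity".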
  8. 66.

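    Think quantitatively
    "What matters when evaluating objective facts is to think quantitatively." (from セイバーメトリクス入門, "Introduction to Sabermetrics")
    • Thinking objectively = being able to measure it in some way.
    • Quantify by converting things into a "volume" (wins, runs, a score, etc.) or a "rate".
      • Volume: WAR (wins), wRAA (runs), Similarity Score (score), etc.
      • Rate: FIP (xFIP), K/BB, wOBA, OPS, etc.
    • Volume metrics mostly reflect actual contribution; quality (rate) metrics mostly reflect the precision and success rate of plays and events.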
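
Two of the rate metrics named on this slide have compact, widely published definitions, sketched below. K/BB is strikeouts divided by walks. FIP is taken here in its commonly cited form, (13·HR + 3·(BB + HBP) − 2·K) / IP plus a league constant. The function names are hypothetical. The default constant of 3.10 is only a placeholder assumption, since the real value varies by league and season. Innings are assumed to be given as a true decimal (for example 200.333 rather than the box-score notation 200.1).

```python
def k_per_bb(strikeouts: int, walks: int) -> float:
    """Strikeout-to-walk ratio; higher is better for a pitcher."""
    if walks == 0:
        raise ValueError("K/BB is undefined when the pitcher has no walks")
    return strikeouts / walks


def fip(home_runs: int, walks: int, hit_by_pitch: int, strikeouts: int,
        innings_pitched: float, fip_constant: float = 3.10) -> float:
    """Fielding Independent Pitching in its commonly published form.

    fip_constant is an assumed placeholder; look up the value for the
    league and season being analyzed.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    raw = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return raw / innings_pitched + fip_constant


if __name__ == "__main__":
    # Made-up season line, not a real pitcher.
    print(k_per_bb(strikeouts=200, walks=50))
    print(fip(home_runs=20, walks=50, hit_by_pitch=5,
              strikeouts=200, innings_pitched=200.0))
```

Both functions use only events the pitcher largely controls (strikeouts, walks, hit batters, home runs). That makes them quick checks on the precision of play, in the sense the slide describes for rate metrics.
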
  9. 67.

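    When thinking "quantitatively", a sense of "degree" matters
    • Quantify in moderation, matched to the issue you want to solve.
    • Suppose you want to quantify "Who is the most feared active hitter?"
      • RC and wRAA, which are scaled to runs, look like good choices, but both are complicated to calculate.
      • If you just want to list strong hitters, OPS is enough (simply add on-base percentage and slugging percentage).
      • For RC, wRAA, and OPS alike, the key explanatory variable is total bases, so the results do not differ much (at least when picking the top three hitters by batting).
    • Don't make quantification itself the goal. Aim to keep the quantification as simple as possible, in the service of seeing things objectively.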
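
The "OPS is enough" point can be illustrated with a minimal sketch: compute on-base percentage and slugging percentage from counting stats with the standard formulas, add them, and take the top three. The `Batter` class, the helper names, and the sample stat lines are all hypothetical. RC and wRAA are deliberately left out, since the slide's argument is that the simpler metric suffices for this question.

```python
from dataclasses import dataclass


@dataclass
class Batter:
    """Season counting stats for one hitter (hypothetical container)."""
    name: str
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int


def total_bases(b: Batter) -> int:
    singles = b.hits - b.doubles - b.triples - b.home_runs
    return singles + 2 * b.doubles + 3 * b.triples + 4 * b.home_runs


def on_base_percentage(b: Batter) -> float:
    # OBP = (H + BB + HBP) / (AB + BB + HBP + SF)
    reached = b.hits + b.walks + b.hit_by_pitch
    chances = b.at_bats + b.walks + b.hit_by_pitch + b.sacrifice_flies
    return reached / chances


def slugging(b: Batter) -> float:
    # SLG = total bases / at-bats
    return total_bases(b) / b.at_bats


def ops(b: Batter) -> float:
    # OPS is simply on-base percentage plus slugging percentage.
    return on_base_percentage(b) + slugging(b)


def top_batters(batters: list[Batter], n: int = 3) -> list[Batter]:
    """Return the n batters with the highest OPS, best first."""
    return sorted(batters, key=ops, reverse=True)[:n]


if __name__ == "__main__":
    # Made-up stat lines, not real players.
    lineup = [
        Batter("A", 500, 150, 30, 2, 35, 80, 5, 4),
        Batter("B", 520, 160, 25, 5, 10, 40, 2, 3),
        Batter("C", 480, 130, 28, 1, 40, 90, 6, 5),
        Batter("D", 510, 140, 20, 3, 8, 30, 1, 2),
    ]
    for batter in top_batters(lineup):
        print(batter.name, round(ops(batter), 3))
```

Because slugging is total bases per at-bat, total bases drive the ranking directly. This is the same variable the slide identifies as dominant in RC and wRAA.
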
  10. 70.

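    Promising issues and unpromising issues
    • Promising (can be demonstrated with quantitative metrics)
      • Identify which hitters are contributing to (or dragging down) the team, and explore better ways to use them.
      • If 〇〇 (some outfielder) played center field for a full year in place of Shogo Akiyama, would Seibu's results go up or down?
      • If Shohei Ohtani focused solely on pitching, how would the Angels' run differential change?
    • Unpromising (fine as a hobby or for fun, but as data analysis?)
      • Salary prediction (a good issue if combined with performance for an ROI evaluation; prediction "alone" feels like a hobby topic)
      • Performance trends by birthplace (interesting as a media feature, but what kind of issue would this solve?)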