Slide 1

Slide 1 text

Building an Analytics Platform with (Almost) Only Python: PyCon JP 2017 Rehearsal Edition
Shinichi Nakagawa (Retty.Inc Engineer/Baseball Analyst)
kawasaki.rb #051, 5th-Year Kickoff LT Meetup

Slide 2

Slide 2 text

Dress code = baseball uniform! *Please share this, photos included!

Slide 3

Slide 3 text

Who am I?
• The "baseball guy" of the Python community
• Shinichi Nakagawa (@shinyorke)
• Retty.Inc Engineering Manager, also in charge of fish dishes
• Baseball Scientist / baseball data analyst
• #Python #SABRmetrics #野球統計学 (baseball statistics) #Agile #Scrum

Slide 4

Slide 4 text

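Starting member (today's menu)
• Congratulations to Kawasaki.rb on entering its 5th year
• A homemade baseball analytics platform built with Python
• Scrapy (data collection)
• sabr + SQLAlchemy (preprocessing)
• Airflow (task control)
• [Practical example] Using wRAA to show why the Carp are so strong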

Slide 5

Slide 5 text

Congratulations on entering your 5th year!

Slide 6

Slide 6 text

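Me and #kwskrb
• First visit (2014/8): PyCon JP 2014 rehearsal LT
• Second (2015/9): PyCon JP 2015 rehearsal LT (2 years running)
• Third (2016/8): PyCon JP 2016 rehearsal LT (3 years running), given as an LT at Kawasaki Ruby Kaigi 01
• Fourth (2016/12): joined the year-end party because I wanted to drink and talk
• Fifth (2017/8): PyCon JP 2017 rehearsal LT (4 years running) ← you are here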

Slide 7

Slide 7 text

(Could this actually be) #kwskpy?

Slide 8

Slide 8 text

So, today I am talking about Python again

Slide 9

Slide 9 text

Technology for the Science of Baseball: Building a Statistics Library and an Analytics Platform with Python
Fri 9/8, 10:55 a.m.-11:25 a.m.
https://pycon.jp/2017/ja/schedule/presentation/15/

Slide 10

Slide 10 text

Today I will show the analytics platform, plus a little analysis and visualization

Slide 11

Slide 11 text

My homemade baseball analytics platform (overview)

Slide 12

Slide 12 text

My homemade baseball analytics platform (overview)
(1) Scrapy (scraping): find and save player stats

Slide 13

Slide 13 text

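My homemade baseball analytics platform (overview)
(1) Scrapy (scraping): find and save player stats
(2) Preprocessing (SABRmetrics): compute metric values and update the data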

Slide 14

Slide 14 text

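My homemade baseball analytics platform (overview)
(1) Scrapy (scraping): find and save player stats
(2) Preprocessing (SABRmetrics): compute metric values and update the data
(3) Analysis and visualization (Jupyter): massage the data and visualize it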

Slide 15

Slide 15 text

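My homemade baseball analytics platform (overview)
(1) Scrapy (scraping): find and save player stats
(2) Preprocessing (SABRmetrics): compute metric values and update the data
(3) Analysis and visualization (Jupyter): massage the data and visualize it
Airflow (job management): controls execution of scraping and preprocessing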

Slide 16

Slide 16 text

Core technologies (mostly Python)
• Scrapy
• Preprocessing (sabr + SQLAlchemy)
• Airflow
*Jupyter is omitted because I assume everyone already knows it

Slide 17

Slide 17 text

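Scrapy: a crawler framework
• A crawler framework that handles crawling a website, scraping it, and saving the data end to end
• Fair to call it the Django/Ruby on Rails of the crawler world
• Scheduler, User-Agent, HTTP headers, download timing, caching, etc.: what you need is provided out of the box and is easy to configure through settings parameters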
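The parameter-based configuration mentioned on this slide might look like the following settings module. This is a minimal sketch, not the speaker's actual configuration: the setting names are standard Scrapy settings, while the project name, contact URL, and all values are assumptions chosen for illustration.

```python
# settings.py for a hypothetical Scrapy project (all values are illustrative)
BOT_NAME = "baseball_stats"  # hypothetical project name

# Identify the crawler to the target site (hypothetical contact URL)
USER_AGENT = "baseball-stats-bot (+https://example.com/contact)"

# Extra HTTP headers sent with every request
DEFAULT_REQUEST_HEADERS = {
    "Accept": "text/html,application/xhtml+xml",
    "Accept-Language": "ja",
}

# Download timing: pause between requests and limit per-domain concurrency
DOWNLOAD_DELAY = 3.0
CONCURRENT_REQUESTS_PER_DOMAIN = 1
ROBOTSTXT_OBEY = True

# Cache responses on disk so repeated runs do not re-fetch the same pages
HTTPCACHE_ENABLED = True
HTTPCACHE_EXPIRATION_SECS = 60 * 60 * 24  # one day
```

Because these are plain module-level assignments, a polite crawl policy can be tuned without touching spider code.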

Slide 18

Slide 18 text

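Preprocessing (sabr + SQLAlchemy)
• [Problem] Sabermetrics calculations come up again and again, and rewriting the code every time is a chore
• So I packaged and published a class that computes OPS, RC, wOBA, wRAA, Adam Dunn percentage, etc.
• https://github.com/Shinichi-Nakagawa/sabr
• Metrics are now computed from the scraping results; DB access was built quickly with SQLAlchemy (O/R mapper)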
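To illustrate the kind of metric calculation that gets packaged, here is a minimal standalone sketch of OPS (on-base percentage plus slugging) computed from one batter's counting stats. It does not use or reproduce the sabr package's API; the `BattingLine` class, the function names, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Counting stats for one batter (hypothetical container, not sabr's)."""
    at_bats: int
    hits: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int


def obp(b: BattingLine) -> float:
    """On-base percentage: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denom = b.at_bats + b.walks + b.hit_by_pitch + b.sacrifice_flies
    return (b.hits + b.walks + b.hit_by_pitch) / denom if denom else 0.0


def slg(b: BattingLine) -> float:
    """Slugging percentage: total bases / AB."""
    singles = b.hits - b.doubles - b.triples - b.home_runs
    total_bases = singles + 2 * b.doubles + 3 * b.triples + 4 * b.home_runs
    return total_bases / b.at_bats if b.at_bats else 0.0


def ops(b: BattingLine) -> float:
    """OPS is simply OBP + SLG."""
    return obp(b) + slg(b)


# Made-up sample line, not real player data
sample = BattingLine(at_bats=400, hits=120, doubles=25, triples=2,
                     home_runs=20, walks=50, hit_by_pitch=5, sacrifice_flies=5)
print(round(ops(sample), 3))
```

Keeping each metric as a small pure function makes it easy to apply the same calculation to every scraped row before writing it to the database.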

Slide 19

Slide 19 text

SABR (Example)
$ pip install sabr
$ python
>>> import sabr
>>> from sabr.stats import Stats
>>> Stats.hr9(26, 209.7)  # Yu Darvish (2013) HR/9
1.1
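The arithmetic behind that result can be checked by hand: HR/9 is home runs allowed times nine, divided by innings pitched. The function below is a standalone sketch, not the sabr implementation; rounding to one decimal place is an assumption made to match the output shown on the slide.

```python
def hr9(home_runs: int, innings: float) -> float:
    """Home runs allowed per nine innings, rounded to one decimal place."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    # 26 * 9 / 209.7 is roughly 1.116, which rounds to 1.1
    return round(home_runs * 9 / innings, 1)


print(hr9(26, 209.7))  # 1.1
```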

Slide 20

Slide 20 text

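Airflow: job management
• Crawling and scraping data every day calls for job management, right?
• So I use Airflow, made by Airbnb: https://airflow.incubator.apache.org/
• It is supposed to flow smoothly like an air current... but for me it was turbulence (lol)
*For what exactly the turbulence was, ask at PyCon JP itself or at today's after-party!
• Setup and operation are a huge pain, so I turned it into a Docker image (still in development): https://hub.docker.com/r/shinyorke/airflow/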
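The core idea of this job management, running preprocessing only after scraping has finished, can be sketched with the standard library alone. This is not Airflow code and does not use any Airflow API; it swaps in `graphlib.TopologicalSorter` (Python 3.9+) to show dependency-ordered execution, and the task names and functions are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def scrape_player_stats() -> str:
    """Placeholder for the crawl-and-scrape job (hypothetical)."""
    return "scraped"


def update_metrics() -> str:
    """Placeholder for the preprocessing job (hypothetical)."""
    return "metrics updated"


# Task name -> callable
TASKS = {
    "scrape": scrape_player_stats,
    "preprocess": update_metrics,
}

# Task name -> set of tasks that must finish first
DEPENDS_ON = {
    "scrape": set(),
    "preprocess": {"scrape"},
}


def run_pipeline(tasks, depends_on):
    """Run every task once, after all of its dependencies, and return the order."""
    order = list(TopologicalSorter(depends_on).static_order())
    for name in order:
        tasks[name]()
    return order


print(run_pipeline(TASKS, DEPENDS_ON))
```

A real scheduler adds what this sketch leaves out, such as daily triggering, retries, and run history.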

Slide 21

Slide 21 text

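[Example] Comparing the Hiroshima and Giants lineups
• Compare Hiroshima and the Giants using data as of 2017/8/20
• Evaluate by comparing the wRAA of batters at roughly the qualifying number of at-bats
*wRAA (batting contribution): +10 or more is excellent, negative is (you get the idea)
• In honor of #kwskrb meetup #51, also evaluate the performance of Hiroshima's #51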
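For reference, wRAA is commonly defined as (wOBA - league wOBA) / wOBA scale × plate appearances. The sketch below follows that common definition and is not taken from the sabr package; the league wOBA, the wOBA scale, and the sample inputs are made-up values for illustration, not 2017 NPB figures.

```python
def wraa(woba: float, league_woba: float, woba_scale: float,
         plate_appearances: int) -> float:
    """Weighted runs above average: runs contributed relative to a league-average batter."""
    if woba_scale <= 0:
        raise ValueError("woba_scale must be positive")
    return (woba - league_woba) / woba_scale * plate_appearances


# Hypothetical batter: .400 wOBA over 500 PA in a league with .320 wOBA and scale 1.25
print(round(wraa(0.400, 0.320, 1.25, 500), 1))
```

A league-average batter scores 0 by construction, which is why the slide treats +10 or more as excellent and negative values as below average.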

Slide 22

Slide 22 text

Hiroshima vs. Giants (qualified batters)
Hiroshima's numbers are flashy; as for the Giants...

Slide 23

Slide 23 text

Hiroshima vs. Giants: wRAA (batting contribution) as a chart
Left: Hiroshima, right: Giants... the gap is far too wide

Slide 24

Slide 24 text

Seiya Suzuki's wOBA (weighted on-base average) and wRAA (batting contribution)
8/15 through 8/20; the dip on the 19th is a 0-for-4 game

Slide 25

Slide 25 text

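Summary
• As you can see, (you can do all of it in Python)
• Building collection → preprocessing → visualization end to end in a single language is easy
• Scrapy combined with Airflow works quite well for data collection and storage (but Airflow's dark side runs deep)
• Hiroshima's lineup is brutal, hang in there Giants, Seiya Suzuki is amazing
• Above all, congratulations to #kwskrb on its 5th year!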

Slide 26

Slide 26 text

Game set!!!
Thank you for listening, and see you at PyCon JP 2017!
Shinichi Nakagawa (Twitter/Facebook/hatena: @shinyorke)