Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
FukuokaR #7
Search
Hiroki Mizukami
March 25, 2017
Science
0
330
FukuokaR #7
https://www.amazon.co.jp/dp/4774188778
Hiroki Mizukami
March 25, 2017
Tweet
Share
More Decks by Hiroki Mizukami
See All by Hiroki Mizukami
音楽配信サービスにおける 推薦システムの概要と 数理モデルについて
hiroki_mizukami
0
220
CADEDA #6 AWAにおけるデータ利活用の取り組みと今後の展望について
hiroki_mizukami
4
2.4k
オンライン広告の数理モデルと数学ソフトウェア MSFD#23
hiroki_mizukami
6
4.7k
Other Decks in Science
See All in Science
Algorithmic Aspects of Quiver Representations
tasusu
0
180
AI(人工知能)の過去・現在・未来 —AIは人間を超えるのか—
tagtag
PRO
0
140
Hakonwa-Quaternion
hiranabe
1
170
コミュニティサイエンスの実践@日本認知科学会2025
hayataka88
0
120
Rashomon at the Sound: Reconstructing all possible paleoearthquake histories in the Puget Lowland through topological search
cossatot
0
470
論文紹介 音源分離:SCNET SPARSE COMPRESSION NETWORK FOR MUSIC SOURCE SEPARATION
kenmatsu4
0
490
力学系から見た現代的な機械学習
hanbao
3
3.8k
次代のデータサイエンティストへ~スキルチェックリスト、タスクリスト更新~
datascientistsociety
PRO
2
27k
イロレーティングを活用した関東大学サッカーの定量的実力評価 / A quantitative performance evaluation of Kanto University Football Association using Elo rating
konakalab
0
190
My Little Monster
juzishuu
0
540
LayerXにおける業務の完全自動運転化に向けたAI技術活用事例 / layerx-ai-jsai2025
shimacos
2
21k
(メタ)科学コミュニケーターからみたAI for Scienceの同床異夢
rmaruy
0
160
Featured
See All Featured
Git: the NoSQL Database
bkeepers
PRO
432
66k
Introduction to Domain-Driven Design and Collaborative software design
baasie
1
580
For a Future-Friendly Web
brad_frost
182
10k
Jess Joyce - The Pitfalls of Following Frameworks
techseoconnect
PRO
1
62
From Legacy to Launchpad: Building Startup-Ready Communities
dugsong
0
130
First, design no harm
axbom
PRO
2
1.1k
Paper Plane (Part 1)
katiecoart
PRO
0
3.8k
The Art of Programming - Codeland 2020
erikaheidi
57
14k
YesSQL, Process and Tooling at Scale
rocio
174
15k
Writing Fast Ruby
sferik
630
62k
How To Stay Up To Date on Web Technology
chriscoyier
791
250k
Mozcon NYC 2025: Stop Losing SEO Traffic
samtorres
0
130
Transcript
Ӣ Ӣ Ӣ Ӣ Ӣ Ӣ Ӣ Ӣ Ӣ Ӣ
Ӣ ֗ ֗ ֗ ֗ ֗ ֗ ֗ ֗ ֗ ֗ ొ ཽ
ཽ ܭ ౷ ొ ཧ ֬ 2
ཧϞσϦϯάͱɺ ౷ܭϞσϦϯάͱɺ ͦΕ͔Βɺࢲɻ 2
ཽ ܭ ౷ ొ ཧ ֬ 3
@ Fukuoka R Mar 25, 2017 य़ Hiroki Mizukami Destroy 3
ཽ ܭ ౷ ొ ཧ ֬ 4
※ݸਓͷݟղɻɻɻ 4
ཽ ܭ ౷ ొ ཧ ֬ 5
5 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ༧ଌͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά ղऍͱ൚Խ
• Έ͔ͣΈ ͻΖ͖ • LINE_ID: @piroyoung • αΠόʔܥͷAI Labɽ •
αʔόαΠυΤϯδχΞ • σʔλαΠΤϯςΟετ • ౦ژࡏॅʗԬग़ • ֶʗࠂʗWeb • Love έΰύʔΫ • R/Python/Scala/javascript/Spark/ Docker/AWS/Stan/Tableau/AWS/GCP ࣗݾհ ϔϏϝλ
Rݴޠ ʢڱٛʣ Rݴޠʢ͋ʔΔ͛Μ͝ʣΦʔϓϯιʔεɾϑϦʔιϑτΣΞͷ౷ܭղੳ͚ ͷϓϩάϥϛϯάݴޠٴͼͦͷ։ൃ࣮ߦڥͰ͋Δɻ RݴޠχϡʔδʔϥϯυͷΦʔΫϥϯυେֶͷRoss IhakaͱRobert Clifford GentlemanʹΑΓ࡞ΒΕͨɻݱࡏͰR Development Core
TeamʢSݴޠ։ൃऀ Ͱ͋ΔJohn M. Chambersࢀը͍ͯ͠Δ[1]ɻʣʹΑΓϝϯςφϯεͱ֦ு͕ͳ ͞Ε͍ͯΔɻ RݴޠͷιʔείʔυओʹCݴޠɺFORTRANɺͦͯ͠RʹΑͬͯ։ൃ͞Εͨɻ - wikipedia -
Rݴޠ ʢٛʣ σʔλੳΛੜۀͱ͢Δܑ͓͞Μ͓Ͷ͐͞ΜୡͷίϛϡχςΟͷ૯শɾ֓೦ɾε ϥϯάɻདྷΔͷશͯڋ·ͳ͍ελΠϧͰɺ࣮ࡍʹσʔλੳΛ͍ͬͯΔ͔ ͢ΒجຊతʹࣗݾਃࠂɻϢʔϞΞͱϢʔϞΞͱਓฑ͕ΛूΊΔϙΠϯτɽෳ ͷελʔτΞοϓϕϯνϟʔΛੜΈग़͍ͯ͠Δɽ ͱ͋Δ౷ܭʹΑΔͱ࣮ࡍʹRΛ͔ͭͬͯΔͻͱ Α͏͢ΔʹɼࠓRͷίΞͳ͠ͳ͍ͬͯ͜ͱͰ͢͢Έ·ͤΜɽ - mikipedia
-
ཽ ܭ ౷ ొ ཧ ֬ 9
9 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ༧ଌͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά ղऍͱ൚Խ
ཧϞσϦϯά ཧϞσϦϯά ͱσʔλͷதʹ͋ΔߏΛࣜͰهड़͢Δ͜ͱ ྫ͑͜Μͳσʔλ͕༗Δ ͜ͷͱ͖όωAʹؔͯ͠ ʦόωͷ͞ʧʹ 0.2 x [͓Γͷॏ͞] +
3 ͱݱʹؔ͢Δࣜͷදݱ͕ಘΒΕΔɽ
ཧϞσϦϯά Ͳ͏ͬͨʁ όωAʹؔͯ͠ҎԼͷ࿈ཱํఔ͕ࣜͨͯΒΕΔ ͜ΕΛղ͚
ཧϞσϦϯά Կ͕͏Ε͍͠ʁ • ݱ࣮ͷͷߟʹֶͷςΫχοΫͰ͑ΒΕΔɽ • ײ͕ٴͳ͍ʹ͑Δ • ݫີ • ఆྔత
• ʮόωAͷํ͕৳ͼ͍͢ʯ
ཧϞσϦϯά ݫີʻʼײɼఆྔతʻʼఆੑత ʮؾԹ͕ߴ͍ͱδϝδϝ͢ΔͶ͐ʯ ͜Ε͜ΕͰॏཁɽ
ཽ ܭ ౷ ొ ཧ ֬ 14
ʮͱΓ͋͑ͣɺՄࢹԽ͠Αʁʯ 14
ཽ ܭ ౷ ొ ཧ ֬ 15
ʮࣜɺͨͯΐʁʯ 15
ཽ ܭ ౷ ొ ཧ ֬ 16
ʮσʔλΛೖ͠Αʁʯ 16
ཽ ܭ ౷ ొ ཧ ֬ 17
ʮύϥϝλܭࢉͰ͖ͨ͊ʂʂʯ 17
ཽ ܭ ౷ ొ ཧ ֬ 18
18 Click = CTR · Imp
ཽ ܭ ౷ ొ ཧ ֬ 19
19 pV = nRT
ཽ ܭ ౷ ొ ཧ ֬ 20
20
ཽ ܭ ౷ ొ ཧ ֬ 21
21 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ༧ଌͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά ղऍͱ൚Խ
౷ܭϞσϦϯά ౷ܭϞσϦϯά ͱ֬ʹجͮ͘ཧϞσϦϯάɽ ֬มΛؚΉϞσϧࣜΛ༻͍Δɽ ֬มͱϥϯμϜͳৼΔ͍ʹ؍ଌΛରԠ͚ΔΈͷ͜ͱɽ ཁ͢Δʹ ʮ ͕ग़ͨ−ʂʂʯʹʼ 1 ͬͯͳ۩߹ɽ
X : ! 2 ⌦ 7! X(!) 2 R
౷ܭϞσϦϯά ࣄͱߟͷରͱ͢ΔϥϯμϜͳৼΔ͍ͷ͋ͭ·Γɽ ͜Ε؍ଌ͕͇ΛԼճΔͱ͍͏ৼΔ͍ͷू·Γͷ͜ͱ ֶతͳఆٛ X : ! 2 ⌦ 7!
X(!) 2 R X < x [ X < x ] := X 1([ 1 , x )) = { ! 2 ⌦| X ( ! ) < x }
౷ܭϞσϦϯά ֬ͱ؍ଌͷཚࡶ͞ͷֶతදݱ ͜Εਖ਼نͰ͜Μͳײ͡ʹද͢ɽ ʮ֬มX͕ฏۉμɼඪ४ภࠩσͷਖ਼نʹै͏ʯͱಡΉɽ μσͳͲͷΛݸੑ͚ΔύϥϝλΛ ͱ͍͏ɽ X ⇠ N(µ,
2)
౷ܭϞσϦϯά ਪఆͱɼσʔλΛͱʹΛ༧͢Δ͜ͱ ʮΉΉʔʂ͜Ε֬0.5Ͱද͕ग़Δͷ͔͠Εͳ͍ʂʯ ʮͬͺ10ͷ1͘Β͍͔͠Εͳ͍ɽɽɽʯ ͜ͷਪఆͷʢͬͱʣΒ͠͞ͱݺΕ͍ͯΔ ද ཪ ද ཪ ཪ
ཪ ཪ ཪ ཪ ཪ ཪ ཪ
ཽ ܭ ౷ ొ ཧ ֬ 26
26 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ༧ଌͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά ղऍͱ൚Խ
ઢܗճؼϞσϧ ҎԼͷΑ͏ͳσʔλ͕༗Δɽ ͕ɼ࣮෩͕ਧ͍ͯͯਖ਼֬ʹܭଌग़དྷͯͳ͍ͬΆ͍ɽ ࠷ॳͱ͓ͳ͡ઢܗͷϞσϧࣜʹσʔλΛೖͯ͠ΈΔͱ
ઢܗճؼϞσϧ
ཽ ܭ ౷ ొ ཧ ֬ 29
ղ͚ͳ͌ɻɻɻ 29
ཽ ܭ ౷ ొ ཧ ֬ 30
ղͷͳ͌ɺ࿈ཱํఔࣜɻɻɻ 30
ཽ ܭ ౷ ొ ཧ ֬ 31
୳ͯ͠ɺݟ͔ͭΒͳ͌ͬͯίτɻɻɻ 31
ཽ ܭ ౷ ొ ཧ ֬ 32
͏ŵŧƄແཧɻ౷ܭ͠ΐɻɻɻ 32
ઢܗճؼϞσϧ ͜ͷϞσϧ؍ଌޡ͕ࠩߟྀ͞Ε͍ͯͳ͌ɻɻɻ ਖ਼نͷޡࠩԾఆ͢Δ y = ✓0 + ✓1x +✏ y
= ✓0 + ✓1x ✏ ⇠ N(0, 2)
ઢܗճؼϞσϧ ਖ਼نʹै͏ޡࠩΛԾఆͨ͠ϞσϧΛઢܗճؼϞσϧͱ͍͏ ✏ ⇠ N(0, 2) Y (✓0 + ✓1X)
⇠ N(0, 2) Y ⇠ N(✓0 + ✓1X, 2)
ཽ ܭ ౷ ొ ཧ ֬ 35
ਪఆ͠ΐɻɻɻ 35
ઢܗճؼϞσϧ ਖ਼نʹै͏ޡࠩΛԾఆͨ͠ϞσϧΛઢܗճؼϞσϧͱ͍͏ Ұ൪Β͍͠θͱσΛܭࢉ͢Δ ͜͜Ͱ Y ⇠ N(✓0 + ✓1X, 2)
L(✓1, ✓2, ) = Y i 1 p 2⇡ 2 e (yi µi)2 2 2 µi = ✓0 + ✓1xi
ઢܗճؼϞσϧ ରؔ θͷਪఆԼઢ෦Λ࠷খʹ͢Ε͍͍ࣄ͕Θ͔Δ ͜ΕΛ࠷খ2๏ͱ͍͏ɽ = 0
ઢܗճؼϞσϧ σʹؔ͢Δํఔࣜ ͜ΕΛղ͚ ͕ಘΒΕΔɽ͜Εඪຊࢄɽ @ @ log L ( ✓1,
✓2, ) = 0 2 = 1 n X i (yi µi)2
ઢܗճؼϞσϧ Rͩͱ؆୯ʹܭࢉͰ͖Δɽ
ཽ ܭ ౷ ొ ཧ ֬ 40
PythonͩͬͨΒɻɻɻ statsmodels / sklearn.linear_model.*** 40
ཽ ܭ ౷ ొ ཧ ֬ 41
41 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ղऍͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά
ઢܗճؼϞσϧ ղऍλεΫ ؍ଌ͞Εͨσʔλͷੑ࣭ΛௐΔɽ ੑผ༧ଌϞσϧ αΠτAΛݟͯΔͷஉੑ͕ଟ͍ɽ
ઢܗճؼϞσϧ ղऍλεΫ ͜ͷCPAʢ͋ͨΓίετʣࢪࡦͷྑ͞ͷධՁͱͯ͠༗ޮ Ͱɽɽɽ ʮ2ஹԁग़ͨ͠ΔΘ ɼ2ԯCVΖʯʹʼ͑ͬɾɾɾ ͪΖΜແཧ͕͋Δ CV = 1
CPA · Cost
ઢܗճؼϞσϧ ղऍλεΫ ͜ͷCPAʢ͋ͨΓίετʣࢪࡦͷྑ͞ͷධՁͱͯ͠༗ޮ Ͱɽɽɽ ʮ2ஹग़ͨ͠ΔΘʯ ʹʼ 2ԯCVʁʁʁ ͪΖΜແཧ͕͋Δ CV =
1 CPA · Cost y=x/CPA
ઢܗճؼϞσϧ ൚ԽλεΫ ະͷσʔλʹର͢Δ༧ଌੑೳࢸ্ओٛ • Neural Network • Gradient Boosting Decision
Tree • SVM with some kernel • Ridge/Lasso • Feature Hashing ౷ܭతͳͷΈͰಈ͍͍ͯͳ͍͕ଟ͍ Α͘Θ͔ΒΜ͕Կނ͔ͨΔ
ઢܗճؼϞσϧ ൚ԽλεΫ minimize: loss(label, Feature) Feature Label
ཽ ܭ ౷ ొ ཧ ֬ 47
47 ࣗݾհͱ͝ΊΜͳ͍͞ ౷ܭϞσϦϯά ղऍͱ൚Խ ઢܗճؼϞσϧ ·ͱΊ ཧϞσϦϯά
• ཧϞσϦϯάΛ༻͍Εݱ࣮ͷΛֶͷϊ ϋͰղܾͰ͖Δ • ౷ܭతͳςΫχοΫΛ͏͜ͱͰߋʹॊೈʹ • ൚ԽͱղऍϞσϧผͷςΫχοΫ ·ͱΊ
ੈా୩۠ࡏॅ H.M͞Μ ʮ࠷ॳʰ͜Μͳॻ੶Ͱඞཁͳ͕ࣝΈʹͭ͘ͳΜͯɾɾɾʱͱ͍͏ؾ࣋ͪ ͋Γɺ৴ٙͰ͜ͷຊΛखʹऔΓ·ͨ͠ɻ͍͟खʹͱͬͯݟΔͱShell ScriptSQLͷجૅͪΖΜɼPythonʹΑΔ࣮ફతͳΞϓϦέʔγϣϯͷ ࡞Γํ·Ͱஸೡʹղઆ͞Ε͍ͯͯ༧Ҏ্ͷϘϦϡʔϜͰͨ͠ɻͱ͘ʹۤख ͩͬͨ౷ܭϞσϦϯάטΈࡅ͍ͯॻ͔Ε͍ͯͯऔֻ͔ͬΓʹ࠷ߴͩͬͨ ͱࢥ͍·͢ɻ2000ԁऑͱ͍͏Ձֶ֨ੜʹخ͍͠Ͱ͢ɻࠓͰຖ൴ঁͱ ͤʹΒͯ͠ډ·͢ɻʯ ͨͳ͠ΎΜύΫͬͨ͝ΊΜ