Exploratory: Introduction to Decision Trees and How to Use Them
Kan Nishida
June 27, 2019
An introduction to the decision tree, one of the best-known machine learning algorithms, and how to use it in Exploratory.
Transcript
EXPLORATORY
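Speaker: Kan Nishida, CEO, EXPLORATORY
Bio: Founded Exploratory, Inc. in 2016 to democratize data science. Alongside serving as CEO of Exploratory, Inc., he works to spread and teach the cutting-edge data science practiced in Silicon Valley through programs such as the Data Science Boot Camp training. At Oracle's US headquarters he led data science development teams for 16 years and shipped many products in machine learning, big data, business intelligence, and databases. @KanAugust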
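Vision: Using data to make better decisions becomes the norm.
Mission: Democratize data science.
The third wave: Data science, AI, and machine learning are not only for statisticians and developers. Anyone interested in data should be able to analyze business data easily with the world's most advanced algorithms. Exploratory makes that world possible.
[Diagram: the three waves of data science (1976, 2000, 2016), compared by algorithms, user experience, tools, and users. Labels: proprietary (expensive, dated) vs. open source (free, cutting-edge); programming vs. UI & automation; statisticians, data scientists, business users; monetization, commoditization, democratization. Exploratory belongs to the third wave, whose theme is the democratization of data science.]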
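Tasks made easy in Exploratory (all through a UI): asking questions, data access, data wrangling, visualization, machine learning and AI, statistics, communicating results.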
EXPLORATORY Online Seminar
Analytics: Decision Tree
Data analysis means finding correlations and patterns.
Salary   Age  Job Role           Years of Service  Gender
10,000   60   Manager            24                Male
3,000    40   Sales Rep          3                 Female
11,000   50   Research Director  35                Female
4,000    20   HR Rep             4                 Male
5,000    30   HR Rep             5                 Female
10,000   45   Manager            20                Female
Salary is what we want to know; the other columns are attribute data.
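Visualize!
Salary vs. Job Role
Salary vs. Years of Service
Salary vs. Job Level
Data → correlations and patterns: visualize, then inspect each correlation and pattern by eye, one at a time.
Tedious!
Analytics!
Data → analytics (machine learning, statistics) → correlations and patterns: use analytics to find correlations and patterns efficiently.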
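Building a predictive model: data + algorithm → model (decision tree).
Data with known answers:
Monthly Income  Age  Job Role           Department  Gender
10,000          60   Manager            HR          Male
11,000          40   Research Director  R&D         Female
4,000           30   HR Rep             HR          Female
Data without answers, to be predicted by the decision tree model:
Monthly Income  Age  Job Role           Department  Gender
?               60   Manager            Sales       Male
?               40   Sales Rep          R&D         Female
?               30   Research Director  HR          Female
A model is built from the patterns found in the data.
The decision tree model knows which variables are more strongly correlated with the target and what kind of relationship they have.
Visualizing the insights obtained through analytics (machine learning, statistics) makes the correlations and patterns in the data intuitive to understand.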
Decision Tree: a method that predicts an outcome through a series of questions, branching on the answer to each one.
Baby  Fetal weight  Number of fetuses  Premature?
A     5.2           1                  TRUE
B     4.7           2                  TRUE
C     6.8           1                  FALSE
D     7.2           1                  FALSE
E     5.1           2                  TRUE
Z     5.8           1                  ?
Will this baby (Z) be born prematurely?
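Let's build a decision tree that predicts premature birth.
Plot fetal weight against the number of fetuses.
[Scatter plot: fetal weight vs. number of fetuses]
Color the points by whether the birth was premature: red means premature, blue means not premature.
Draw lines so that points of the same color fall into the same group as far as possible, using as few lines as possible.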
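First, the points can be grouped by whether fetal weight is 5.5 or more: fetal weight >= 5.5.
Next, they can be broadly grouped by whether the number of fetuses is more than 1.5: number of fetuses > 1.5.
The resulting groups:
• Premature: Yes. Share premature: 100%. Share of all data: 40%.
• Premature: No. Share premature: 0%. Share of all data: 40%.
• Premature: No. Share premature: 40%. Share of all data: 20%.
[Tree: fetal weight >= 5.5 (TRUE / FALSE), then number of fetuses > 1.5 (TRUE / FALSE); leaves show the probability of premature birth: 0%, 40%, 100%]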
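The two questions above can be read as nested conditions. The following is a minimal sketch rather than anything Exploratory generates: the function name is hypothetical, and because the transcript lists the leaf values (0%, 40%, 100%) without saying which branch each belongs to, the assignment of 100% to the multiple-birth branch is an assumption.

```python
def premature_probability(fetal_weight: float, num_fetuses: int) -> float:
    """Walk the two-question tree and return the leaf's share of premature births."""
    # First question: is fetal weight 5.5 or more?
    if fetal_weight >= 5.5:
        return 0.0
    # Second question: more than 1.5 fetuses?
    # Assumed mapping: the slide gives 40% and 100% without branch labels.
    if num_fetuses > 1.5:
        return 1.0
    return 0.4


# Baby Z from the table: weight 5.8, one fetus.
print(premature_probability(fetal_weight=5.8, num_fetuses=1))
```

Each prediction answers at most two questions, which is why the order of the questions matters for how compact the tree is.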
How is the tree built?
Which question (condition) should come first?
Impurity (Gini Impurity)
• Takes a value between 0 and 1.
• A measure of how mixed the values are in the data at each node.
Impurity (Gini Impurity): if n is the number of distinct values in the node, it is computed as follows, where $p_i$ is the proportion of samples that have the i-th value:
$1 - p_1^2 - p_2^2 - p_3^2 - \dots - p_n^2$
Examples with six samples:
• 6 not premature, 0 premature: 1 - (0/6)^2 - (6/6)^2 = 0
• 0 not premature, 6 premature: 1 - (6/6)^2 - (0/6)^2 = 0
• 4 not premature, 2 premature: 1 - (2/6)^2 - (4/6)^2 = 0.44
• 2 not premature, 4 premature: 1 - (4/6)^2 - (2/6)^2 = 0.44
• 3 not premature, 3 premature: 1 - (3/6)^2 - (3/6)^2 = 0.5
Start: all 10 samples in one node, 5 premature and 5 not premature. Impurity: 0.5
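The impurity values quoted on these slides can be checked with a few lines of code. This is a minimal sketch of the Gini impurity formula; the function name and the way the labels are encoded (True for premature) are assumptions made for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 minus the sum of squared class proportions in `labels`."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


# True = premature, False = not premature.
examples = {
    "6 not premature": [False] * 6,
    "4 not premature, 2 premature": [False] * 4 + [True] * 2,
    "3 not premature, 3 premature": [False] * 3 + [True] * 3,
    "start node (5 and 5)": [False] * 5 + [True] * 5,
}
for name, labels in examples.items():
    print(name, round(gini_impurity(labels), 2))
```

A pure node scores 0, and an even two-class mix scores 0.5, which is the maximum for two classes.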
Case 1: split on fetal weight first
fetal weight >= 5.5 (TRUE / FALSE)
• TRUE branch: impurity 0
• FALSE branch: impurity 1 - (2/7)^2 - (5/7)^2 = 0.41
• Impurity after the split, weighted by branch size: 3/10*0 + 7/10*0.41 = 0.29
• Impurity before the split: 0.5
• Reduction in impurity: 0.21
Case 2: split on the number of fetuses first
number of fetuses > 1.5 (TRUE / FALSE)
• TRUE branch: impurity 1 - (2/5)^2 - (3/5)^2 = 0.48
• FALSE branch: impurity 1 - (3/5)^2 - (2/5)^2 = 0.48
• Impurity after the split, weighted by branch size: 5/10*0.48 + 5/10*0.48 = 0.48
• Impurity before the split: 0.5
• Reduction in impurity: 0.02
Comparing the reductions in impurity: number of fetuses 0.02 < fetal weight 0.21.
Fetal weight reduces impurity more than the number of fetuses does.
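The comparison of the two candidate first questions can be reproduced from the class counts in each branch. This is a sketch under stated assumptions: the helper names are hypothetical, and the counts are inferred from the fractions on the slides (a pure branch of 3 and a 5-to-2 branch for weight, two 2-to-3 branches for the number of fetuses), with the pure branch assumed to be the non-premature one.

```python
def gini_from_counts(counts):
    """Gini impurity of a node given the number of samples in each class."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def impurity_reduction(parent_counts, children_counts):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    parent_total = sum(parent_counts)
    weighted = sum(
        sum(child) / parent_total * gini_from_counts(child)
        for child in children_counts
    )
    return gini_from_counts(parent_counts) - weighted


# Counts are (premature, not premature).
parent = (5, 5)
split_on_weight = [(0, 3), (5, 2)]  # assumed: the pure branch is not premature
split_on_count = [(2, 3), (3, 2)]

print("fetal weight:", round(impurity_reduction(parent, split_on_weight), 2))
print("number of fetuses:", round(impurity_reduction(parent, split_on_count), 2))
```

The question with the larger reduction is asked first, so fetal weight becomes the root of the tree.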
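Build the tree by splitting on fetal weight first and then on the number of fetuses.
[Tree: fetal weight >= 5.5 (TRUE / FALSE), then number of fetuses > 1.5 (TRUE / FALSE); leaves: 100%, 50%, 0%]
This needs fewer questions (branches) than the alternative.
Build the tree by splitting on the number of fetuses first and then on fetal weight.
[Tree: number of fetuses > 1.5 (TRUE / FALSE), then fetal weight >= 5.5 on each branch; leaves: 50%, 100%, 100%, 100%]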
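Classification vs. Regression
• Classification: Will the baby be born premature?
• Regression: When will the baby be born?
[Classification plot: fetal weight vs. number of fetuses]
[Regression plot: Mother Age vs. Father Age]
Predicting the length of pregnancy
[Tree: Is_Plural (TRUE / FALSE), then Over_35 (TRUE / FALSE); leaves: 20 weeks, 30 weeks, 37 weeks]
For regression, branches are split so that the spread of values around each group's mean is as small as possible.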
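For regression, "smallest spread around the mean" can be made concrete as the sum of squared deviations within each branch. The sketch below is illustrative only: the data, column names, and helper functions are hypothetical, and it searches a single numeric column rather than implementing the full algorithm used by Exploratory.

```python
def sum_squared_error(values):
    """Sum of squared deviations of `values` from their own mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, feature, target):
    """Try midpoints between sorted feature values; keep the lowest total error."""
    values = sorted({row[feature] for row in rows})
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [r[target] for r in rows if r[feature] < threshold]
        right = [r[target] for r in rows if r[feature] >= threshold]
        error = sum_squared_error(left) + sum_squared_error(right)
        if best is None or error < best[1]:
            best = (threshold, error)
    return best


# Hypothetical rows, invented for illustration only.
rows = [
    {"mother_age": 24, "gestation_weeks": 40},
    {"mother_age": 29, "gestation_weeks": 39},
    {"mother_age": 33, "gestation_weeks": 40},
    {"mother_age": 37, "gestation_weeks": 36},
    {"mother_age": 41, "gestation_weeks": 35},
]
threshold, error = best_split(rows, "mother_age", "gestation_weeks")
print(threshold, round(error, 2))
```

Each leaf then predicts the mean of the target values that fall into it, which is how a tree arrives at leaf values such as 20, 30, or 37 weeks.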
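Analytics: trying out a decision tree!
Creating the is_premature (premature birth) column
If the data has no is_premature column, create a new logical column from the gestation_weeks (weeks of pregnancy) column that indicates whether the birth was premature (less than 37 weeks): gestation_weeks < 37
From the column header menu of the gestation_weeks column, choose Create Calculation (Mutate).
• Enter is_premature as the column name.
• Enter gestation_weeks<37 as the calculation.
The new column contains TRUE for less than 37 weeks and FALSE for 37 weeks or more.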
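Outside the Exploratory UI, the same derived column can be sketched in a few lines. This is not what Exploratory runs internally (it is a UI for R); it is a plain-Python illustration, and the sample rows and the function name are hypothetical.

```python
# Hypothetical birth records; only the column names come from the slides.
births = [
    {"gestation_weeks": 39, "weight_pounds": 7.4},
    {"gestation_weeks": 35, "weight_pounds": 5.1},
    {"gestation_weeks": 37, "weight_pounds": 6.6},
]


def add_is_premature(rows, threshold_weeks=37):
    """Return copies of the rows with is_premature = gestation_weeks < threshold."""
    return [
        {**row, "is_premature": row["gestation_weeks"] < threshold_weeks}
        for row in rows
    ]


labeled = add_is_premature(births)
for row in labeled:
    print(row["gestation_weeks"], row["is_premature"])
```

A birth at exactly 37 weeks is labeled FALSE, matching the strict "less than 37 weeks" definition.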
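Decision Tree analytics
• Select the column to predict.
• Select the predictor columns: select every column except gestation_weeks.
The decision tree is created.
Start node: majority class FALSE (not premature). Share of TRUE (premature): 12%. Share of the data at this node: 100%.
Condition: is the weight (weight_pounds) 5.3 pounds or more?
• Node with majority class FALSE. Share of TRUE (premature): 8%. Share of the data at this node: 94%.
• Node with majority class TRUE. Share of TRUE (premature): 72%. Share of the data at this node: 6%.
Confirm by visualizing the data
Is_Premature vs. Weight
Example using employee data
Build a decision tree model
Confirm by visualizing the data
Attrition vs. Overtime
Attrition vs. Monthly Income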
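Features
• No programming: because the training uses Exploratory, a UI for the R language, as the analysis tool, participants can focus 100% on learning the data science methods needed to solve business problems.
• No vendor lock-in to an analysis tool: all work done in Exploratory can be reproduced in a standalone open-source R environment.
• Thinking skills as well as technical skills: participants acquire not only data science skills but also the way of thinking that data analysis requires.
Q & A
Contact
Email: [email protected]
Website: https://ja.exploratory.io
Boot camp training: https://ja.exploratory.io/training-jp
Twitter: @KanAugust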