Exploratory: Introduction to Decision Trees and How to Use Them
Kan Nishida
June 27, 2019
An introduction to the decision tree, one of the best-known machine learning algorithms, and how to use it in Exploratory.
Transcript
EXPLORATORY
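Speaker: Kan Nishida, CEO, Exploratory (@KanAugust)
Background: Founded Exploratory, Inc. in 2016 to democratize data science. Alongside serving as CEO of Exploratory, Inc., he works to spread and teach the cutting-edge data science practiced in Silicon Valley through programs such as the Data Science Boot Camp training. At Oracle's US headquarters he led data science development teams for 16 years and shipped many products related to machine learning, big data, business intelligence, and databases.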
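Vision: Using data to make better decisions becomes the norm.
Mission: Democratize data science.
The third wave: Data science, AI, and machine learning are not only for statisticians and developers. Anyone with an interest in data should be able to analyze business data easily with the world's most advanced algorithms. Exploratory makes that world possible.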
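Democratizing data science: the three waves
Wave    Year  Theme            Users            Algorithms                        User experience
First   1976  Monetization     Statisticians    Proprietary (expensive, dated)    UI & programming
Second  2000  Commoditization  Data scientists  Open source (free, cutting-edge)  Programming
Third   2016  Democratization  Business users   Open source (free, cutting-edge)  UI & automation
Tool shown for the third wave: Exploratory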
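Tasks that Exploratory makes easy through its UI: asking questions, data access, data wrangling, visualization, machine learning / AI, statistics, and communicating results.
EXPLORATORY Online Seminar
Analytics: Decision Tree
Data analysis means finding correlations and patterns.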
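What we want to know: Salary. Attribute data: Age, Job Role, Years of Service, Gender.
Salary  Age  Job Role           Years of Service  Gender
10,000  60   Manager            24                Male
3,000   40   Sales Rep          3                 Female
11,000  50   Research Director  35                Female
4,000   20   HR Rep             4                 Male
5,000   30   HR Rep             5                 Female
10,000  45   Manager            20                Female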
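Visualize!
Salary vs. Job Role
Salary vs. Years of Service
Salary vs. Job Level
From data to correlations and patterns: visualize each candidate correlation or pattern and check it by eye, one at a time.
Tedious!
Analytics!
From data to correlations and patterns through analytics (machine learning, statistics): use analytics to find correlations and patterns efficiently.
Building a predictive model: data plus an algorithm produce a decision tree model.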
Prediction: the decision tree model predicts Monthly Income for data that has no answer yet.
Data without answers:
Monthly Income  Age  Job Role           Department  Gender
?               60   Manager            Sales       Male
?               40   Sales Rep          R&D         Female
?               30   Research Director  HR          Female
Data with answers:
Monthly Income  Age  Job Role           Department  Gender
10,000          60   Manager            HR          Male
11,000          40   Research Director  R&D         Female
4,000           30   HR Rep             HR          Female
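A model is built from the patterns found in the data.
The decision tree model knows which variables are more strongly correlated with the outcome and what kind of relationship they have.
Visualizing the insights that analytics (machine learning, statistics) extracts from the data, as correlations and patterns, makes them intuitive to understand.
Decision Tree: a decision tree is a method that predicts an outcome through a series of questions and the branches determined by their answers.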
Baby  Baby weight  Number of babies  Premature?
A     5.2          1                 TRUE
B     4.7          2                 TRUE
C     6.8          1                 FALSE
D     7.2          1                 FALSE
E     5.1          2                 TRUE
Z     5.8          1                 ?
Will baby Z be born prematurely?
Let's build a decision tree that predicts premature birth.
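Plot the relationship between baby weight and number of babies.
[Scatter plot: number of babies vs. baby weight]
Color the points by whether the birth was premature. Blue means not premature; the other color means premature.
Draw lines so that points of the same color fall into the same group as far as possible, using as few lines as possible.
First, the points can be grouped by whether baby weight is 5.5 or more (baby weight >= 5.5).
Next, they can be broadly grouped by whether the number of babies is greater than 1.5 (number of babies > 1.5).
The two lines produce three groups:
• Premature: Yes. Share premature: 100%. Share of all data: 40%.
• Premature: No. Share premature: 0%. Share of all data: 40%.
• Premature: No. Share premature: 40%. Share of all data: 20%.
The same splits drawn as a tree: one node asks "baby weight >= 5.5?" (TRUE / FALSE), another asks "number of babies > 1.5?" (TRUE / FALSE), and the leaves give the probability of premature birth: 0%, 40%, and 100%.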
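A two-question tree like this one can be read as nested conditions. The sketch below is a hypothetical illustration, not Exploratory's implementation. The function name is invented, and the transcript does not say which low-weight leaf carries 100% and which carries 40%, so that assignment is an assumption. Only the 0% leaf for weight >= 5.5 follows directly from the sample table.

```python
def premature_probability(baby_weight: float, number_of_babies: int) -> float:
    """Walk the two-question tree and return the leaf's probability of premature birth."""
    if baby_weight >= 5.5:
        # Heavier babies in the sample table (C and D) were not premature.
        return 0.0
    # Assumption: which of the two low-weight leaves is 100% and which is 40%
    # is not recoverable from the transcript.
    if number_of_babies > 1.5:
        return 1.0
    return 0.4


# Baby Z from the table: weight 5.8, one baby.
print(premature_probability(5.8, 1))  # 0.0
```

Under these assumptions baby Z lands in the first leaf, so the tree predicts a non-premature birth.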
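How is the tree built?
Which question (condition) should come first?
Gini impurity
• Takes a value between 0 and 1.
• An indicator of how mixed the values are within the data of each node.
If a node contains n distinct values, its Gini impurity can be calculated as follows, where $p_i$ is the proportion of samples that have the i-th value:
$1 - p_1^2 - p_2^2 - p_3^2 - \dots - p_n^2$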
Impurity = 0: six samples, all not premature. $1 - (0/6)^2 - (6/6)^2 = 0$
Impurity = 0: six samples, all premature. $1 - (6/6)^2 - (0/6)^2 = 0$
Impurity = 0.44: four not premature, two premature. $1 - (2/6)^2 - (4/6)^2 = 0.44$
Impurity = 0.44: two not premature, four premature. $1 - (4/6)^2 - (2/6)^2 = 0.44$
Impurity = 0.5: three not premature, three premature. $1 - (3/6)^2 - (3/6)^2 = 0.5$
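The node examples above can be reproduced with a few lines of Python. This is a minimal sketch of the Gini impurity formula; the function name and the label strings are illustrative choices, not anything taken from Exploratory.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 minus the sum of squared class proportions for one node."""
    if not labels:
        raise ValueError("a node needs at least one sample")
    total = len(labels)
    counts = Counter(labels)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


pure_node = ["not premature"] * 6
mixed_node = ["not premature"] * 4 + ["premature"] * 2
even_node = ["not premature"] * 3 + ["premature"] * 3

print(round(gini_impurity(pure_node), 2))   # 0.0
print(round(gini_impurity(mixed_node), 2))  # 0.44
print(round(gini_impurity(even_node), 2))   # 0.5
```

A node with a single class scores 0, and an even two-class mix scores 0.5, which is the maximum for two classes.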
Start: 10 samples, 5 premature and 5 not premature. Impurity: 0.5
Case 1: split first on baby weight
Split: baby weight >= 5.5 (TRUE / FALSE)
TRUE branch (3 of 10 samples): impurity 0
FALSE branch (7 of 10 samples): impurity $1 - (2/7)^2 - (5/7)^2 = 0.41$
Weighted impurity after the split: $3/10 \times 0 + 7/10 \times 0.41 = 0.29$
Impurity before the split: 0.5
Decrease in impurity: $0.5 - 0.29 = 0.21$
Case 2: split first on number of babies
Split: number of babies > 1.5 (TRUE / FALSE)
TRUE branch (5 of 10 samples): impurity $1 - (2/5)^2 - (3/5)^2 = 0.48$
FALSE branch (5 of 10 samples): impurity $1 - (3/5)^2 - (2/5)^2 = 0.48$
Weighted impurity after the split: $5/10 \times 0.48 + 5/10 \times 0.48 = 0.48$
Impurity before the split: 0.5
Decrease in impurity: $0.5 - 0.48 = 0.02$
Comparing the decreases in impurity: number of babies 0.02 < baby weight 0.21. Baby weight reduces impurity more than number of babies does.
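The comparison of the two candidate splits can be sketched as follows. The class counts are inferred from the fractions on the slides (for example, a pure 3-sample branch and a 7-sample branch split 5 to 2), and the helper names are hypothetical. The sketch illustrates the arithmetic and makes no claim about how Exploratory or its underlying R packages implement it.

```python
def gini_from_counts(counts):
    """Gini impurity of a node given its per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((count / total) ** 2 for count in counts)


def impurity_decrease(parent_counts, children_counts):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    total = sum(parent_counts)
    weighted = sum(
        sum(child) / total * gini_from_counts(child) for child in children_counts
    )
    return gini_from_counts(parent_counts) - weighted


# Counts are (premature, not premature); values inferred from the slides.
parent = (5, 5)
splits = {
    "baby weight >= 5.5": [(0, 3), (5, 2)],
    "number of babies > 1.5": [(2, 3), (3, 2)],
}

decreases = {name: impurity_decrease(parent, kids) for name, kids in splits.items()}
best_split = max(decreases, key=decreases.get)

for name, value in decreases.items():
    print(name, round(value, 2))
print("first question:", best_split)
```

The split with the largest decrease becomes the first question, which here is baby weight (about 0.21 against 0.02).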
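Build the tree that splits on baby weight first and on number of babies second.
[Tree: baby weight >= 5.5 (TRUE / FALSE), then number of babies > 1.5 (TRUE / FALSE); leaves: 100%, 50%, 0%]
Doing so needs fewer questions (branches) than the alternative.
Build the tree that splits on number of babies first and on baby weight second.
[Tree: number of babies > 1.5 (TRUE / FALSE), then baby weight >= 5.5 on each branch (TRUE / FALSE); leaves: 50%, 100%, 100%, 100%]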
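Classification vs. Regression
Classification: will the baby be born premature? Regression: when will the baby be born?
[Classification plot: number of babies vs. baby weight]
[Regression plot: Mother Age vs. Father Age]
Predicting the gestation period with a tree: Is_Plural (TRUE / FALSE), then Over_35 (TRUE / FALSE), on a scale from 20 to 40 weeks.
Leaf predictions: 20 weeks, 30 weeks, 37 weeks. Each leaf predicts the mean of its group, and the branches are chosen so that the spread around the mean is as small as possible.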
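For regression, the same idea can be sketched by measuring the spread around each group's mean instead of Gini impurity. The rows below are invented purely for illustration, and the helper names are hypothetical. The sketch uses the sum of squared deviations from the group mean, which is one common way to express "spread around the mean".

```python
from statistics import mean


def squared_error(values):
    """Sum of squared deviations from the group mean."""
    center = mean(values)
    return sum((value - center) ** 2 for value in values)


def split_error(rows, feature):
    """Total squared error after splitting the rows on a TRUE/FALSE feature."""
    true_side = [row["weeks"] for row in rows if row[feature]]
    false_side = [row["weeks"] for row in rows if not row[feature]]
    return squared_error(true_side) + squared_error(false_side)


# Hypothetical data: gestation period in weeks for six births.
rows = [
    {"is_plural": True, "over_35": True, "weeks": 30},
    {"is_plural": True, "over_35": False, "weeks": 32},
    {"is_plural": False, "over_35": True, "weeks": 38},
    {"is_plural": False, "over_35": False, "weeks": 40},
    {"is_plural": False, "over_35": False, "weeks": 39},
    {"is_plural": False, "over_35": True, "weeks": 37},
]

errors = {feature: split_error(rows, feature) for feature in ("is_plural", "over_35")}
best_feature = min(errors, key=errors.get)
print(errors, best_feature)
```

With this made-up data the split on is_plural leaves far less spread than the split on over_35, so it would be chosen first, and each resulting group would predict its own mean.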
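Analytics: try out the decision tree!
Create the is_premature column: if the data has no is_premature column, create a new logical column from the gestation_weeks column that indicates whether the birth was premature (under 37 weeks): gestation_weeks < 37
From the column header menu of the gestation_weeks column, choose Create Calculation (Mutate).
• Enter is_premature as the column name.
• Enter gestation_weeks < 37 as the calculation.
The new column holds TRUE for births under 37 weeks and FALSE for births at 37 weeks or more.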
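Outside the Exploratory UI, the same derived column can be sketched in plain Python. The sample records are hypothetical and the helper name is invented; only the column names gestation_weeks and is_premature and the 37-week threshold come from the slides.

```python
def add_is_premature(rows):
    """Return copies of the rows with a logical is_premature column added."""
    return [{**row, "is_premature": row["gestation_weeks"] < 37} for row in rows]


# Hypothetical records for illustration.
births = [
    {"gestation_weeks": 39},
    {"gestation_weeks": 35},
    {"gestation_weeks": 37},
]

labeled = add_is_premature(births)
print([row["is_premature"] for row in labeled])  # [False, True, False]
```

Note that exactly 37 weeks evaluates to FALSE, matching the "under 37 weeks" definition.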
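Decision tree analytics
Select the column to predict.
Select the predictor columns.
Select every column except gestation_weeks.
The decision tree has been created.
Start (root node): majority class FALSE (not premature). Share of TRUE (Premature): 12%. Share of the data at this point: 100%.
Condition: is the weight (weight_pounds) 5.3 pounds or more?
Node with majority class FALSE: share of TRUE (Premature) 8%. Share of the data at this point: 94%.
Node with majority class TRUE: share of TRUE (Premature) 72%. Share of the data at this point: 6%.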
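The percentages on the two child nodes can be checked against the root node: weighting each child's premature rate by its share of the data should give back roughly the overall 12%. A minimal sketch of that arithmetic, using only the figures quoted above:

```python
# (share of all data, share of TRUE within the node) for the two child nodes
child_nodes = [(0.94, 0.08), (0.06, 0.72)]

overall_rate = sum(share * rate for share, rate in child_nodes)
print(round(overall_rate * 100))  # 12
```

The weighted rate is about 11.8%, which rounds to the 12% shown at the root.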
Confirm by visualizing the data
Is_Premature vs. Weight
An example using employee data
Build a decision tree model
Confirm by visualizing the data
Attrition vs. Overtime
Attrition vs. Monthly Income
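Features
• No programming: the course uses Exploratory, a UI for the R language, as its analysis tool, so attendees can focus 100% on learning the data science methods needed to solve business problems.
• No vendor lock-in to an analysis tool: all work done in Exploratory can be reproduced in an independent, open source R environment.
• Thinking skills as well as technical skills: attendees acquire the way of thinking that data analysis requires, beyond the data science techniques themselves.
Q & A
Contact
Email: [email protected]
Website: https://ja.exploratory.io
Boot camp training: https://ja.exploratory.io/training-jp
Twitter: @KanAugust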