Exploratory: An Introduction to Decision Trees and How to Use Them
Kan Nishida
June 27, 2019
An introduction to the decision tree, one of the best-known machine learning algorithms, and to how it is used in Exploratory.
Transcript
EXPLORATORY
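Speaker: Kan Nishida, CEO, Exploratory (@KanAugust)
Bio: Founded Exploratory, Inc. in 2016 to democratize data science. While serving as CEO of Exploratory, Inc., he works to spread and teach the cutting-edge data science practiced in Silicon Valley through programs such as the Data Science Boot Camp training. At Oracle's US headquarters, he led data science development teams for 16 years and brought to market many products related to machine learning, big data, business intelligence, and databases.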
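Vision: Using data to make better decisions becomes the norm.
Mission: Democratizing data science
The third wave: Data science, AI, and machine learning are not only for statisticians and developers. Anyone with an interest in data should be able to analyze business data easily using the world's most advanced algorithms. Exploratory makes that world possible.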
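Theme: Democratizing data science
1st wave (1976): proprietary algorithms (expensive / old); UI & programming; statisticians; monetization
2nd wave (2000): open-source algorithms (free / cutting-edge); programming; data scientists; commoditization
3rd wave (2016): Exploratory; open-source algorithms (free / cutting-edge); UI & automation; business users; democratization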
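Tasks made easy in Exploratory: questions, data access, data wrangling, visualization, machine learning and AI, statistics, and communication, all through a UI
EXPLORATORY Online Seminar
Analytics: Decision Tree
Data analysis means finding correlations and patterns.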
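Salary   Age  Job Role           Years of Service  Gender
10,000   60   Manager            24                Male
3,000    40   Sales Rep          3                 Female
11,000   50   Research Director  35                Female
4,000    20   HR Rep             4                 Male
5,000    30   HR Rep             5                 Female
10,000   45   Manager            20                Female
Labels on the slide: what we want to know (salary); attribute data (the other columns)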
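Visualize!
Salary vs. Job Role
Salary vs. Years of Service
Salary vs. Job Level
From data to correlations and patterns: visualize, then check each correlation and pattern one by one by eye.
That is tedious!
Analytics!
From data to correlations and patterns through analytics (machine learning and statistics): use analytics to find correlations and patterns effectively.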
Building a predictive model: data and an algorithm produce a decision tree model.
Data used to build the model:
Monthly Income  Age  Job Role           Department  Gender
10,000          60   Manager            HR          Male
11,000          40   Research Director  R&D         Female
4,000           30   HR Rep             HR          Female
Data without answers, which the decision tree model predicts:
Monthly Income  Age  Job Role           Department  Gender
?               60   Manager            Sales       Male
?               40   Sales Rep          R&D         Female
?               30   Research Director  HR          Female
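A model is built from the patterns found in the data.
The model knows which variables are more strongly correlated and what kind of relationship they have.
Visualizing the insights (correlations and patterns) obtained through analytics (machine learning, statistics) makes them intuitive to understand.
Decision Tree: a method that predicts an outcome through a series of questions and the branches that follow from their answers.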
Baby  Baby's weight  Number of babies  Premature?
A     5.2            1                 TRUE
B     4.7            2                 TRUE
C     6.8            1                 FALSE
D     7.2            1                 FALSE
E     5.1            2                 TRUE
Z     5.8            1                 ?
Will this baby be born prematurely?
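Let's build a decision tree that predicts premature birth.
Visualize the relationship between the baby's weight and the number of babies. [Scatter plot: number of babies vs. baby's weight]
Color the points by whether the birth was premature: red means premature, blue means not premature.
Draw lines to separate the points into groups of the same color as far as possible, while keeping the number of lines to a minimum.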
First, the points can be grouped by whether the baby's weight is 5.5 or more: baby's weight >= 5.5
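As a minimal sketch of this first split, the Python snippet below applies the single rule "baby's weight >= 5.5" to the five labeled babies from the table earlier in the deck and to the unlabeled baby Z. The names `babies` and `predict_premature` are illustrative assumptions and are not part of Exploratory.

```python
# Labeled babies from the slide table: (name, weight, number_of_babies, premature)
babies = [
    ("A", 5.2, 1, True),
    ("B", 4.7, 2, True),
    ("C", 6.8, 1, False),
    ("D", 7.2, 1, False),
    ("E", 5.1, 2, True),
]


def predict_premature(weight, threshold=5.5):
    """One-question tree: predict premature when the weight is below the threshold."""
    return weight < threshold


# Count how many labeled babies the single rule classifies correctly.
correct = sum(predict_premature(w) == label for _, w, _, label in babies)

# Baby Z (weight 5.8, one baby) has no label yet.
z_prediction = predict_premature(5.8)

print(f"{correct}/{len(babies)} labeled babies classified correctly")
print("Baby Z predicted premature:", z_prediction)
```

On this small table the one rule happens to separate the two classes. The larger scatter plot in the deck still needs a second question.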
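Next, the points can be broadly grouped by whether the number of babies is greater than 1.5: number of babies > 1.5
The two splits produce three groups:
Premature: Yes. Share premature: 100%. Share of all data: 40%
Premature: No. Share premature: 0%. Share of all data: 40%
Premature: No. Share premature: 40%. Share of all data: 20%
[Tree diagram: baby's weight >= 5.5 (TRUE / FALSE), then number of babies > 1.5 (TRUE / FALSE); the leaves show the probability of premature birth: 0%, 40%, 100%]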
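How is the tree built?
Which question (condition) should come first?
Gini Impurity
• Takes a value between 0 and 1.
• An indicator of how mixed the data in each node is.
If n is the number of distinct values in the node, the Gini impurity can be calculated as follows, where $p_i$ is the share of samples that have the i-th value:
$$1 - p_1^2 - p_2^2 - p_3^2 - \dots - p_n^2$$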
Impurity = 0: six samples, all not premature. $1 - (0/6)^2 - (6/6)^2 = 0$
Impurity = 0: six samples, all premature. $1 - (6/6)^2 - (0/6)^2 = 0$
Impurity = 0.44: two premature, four not premature. $1 - (2/6)^2 - (4/6)^2 = 0.44$
Impurity = 0.44: four premature, two not premature. $1 - (4/6)^2 - (2/6)^2 = 0.44$
Impurity = 0.5: three premature, three not premature. $1 - (3/6)^2 - (3/6)^2 = 0.5$
Start: ten samples, five premature and five not premature. Impurity: 0.5
Case 1: grouping by the baby's weight first
Split: baby's weight >= 5.5 (TRUE / FALSE)
Impurity of the TRUE branch (3 of 10 samples): 0
Impurity of the FALSE branch (7 of 10 samples): $1 - (2/7)^2 - (5/7)^2 = 0.41$
Impurity after the split: $3/10 \times 0 + 7/10 \times 0.41 = 0.29$
Impurity before the split: 0.5
Decrease in impurity: 0.21
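The arithmetic for the weight split can be written out step by step. This is a sketch only. The sample counts are inferred from the fractions on the slides, assuming ten samples with five premature and five not premature at the start node, and the variable names are illustrative.

```python
# Sample counts inferred from the fractions on the slides (an assumption).
n_total = 10
n_true, n_false = 3, 7  # samples with weight >= 5.5 and samples below 5.5

# Impurity of each branch after the split.
impurity_true = 1 - (0 / 3) ** 2 - (3 / 3) ** 2
impurity_false = 1 - (2 / 7) ** 2 - (5 / 7) ** 2

# Weight each branch by its share of the samples.
impurity_after = (n_true / n_total) * impurity_true + (n_false / n_total) * impurity_false

# Impurity of the start node, then the decrease achieved by the split.
impurity_before = 1 - (5 / 10) ** 2 - (5 / 10) ** 2
decrease = impurity_before - impurity_after

print(round(impurity_false, 2), round(impurity_after, 2), round(decrease, 2))
```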
Case 2: grouping by the number of babies first
Split: number of babies > 1.5 (TRUE / FALSE)
Impurity of one branch (5 of 10 samples): $1 - (2/5)^2 - (3/5)^2 = 0.48$
Impurity of the other branch (5 of 10 samples): $1 - (3/5)^2 - (2/5)^2 = 0.48$
Impurity after the split: $5/10 \times 0.48 + 5/10 \times 0.48 = 0.48$
Impurity before the split: 0.5
Decrease in impurity: 0.02
Comparing the decrease in impurity: number of babies 0.02 < 0.21 baby's weight. The baby's weight reduces impurity more than the number of babies does.
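Choosing the first question then amounts to picking the candidate split with the largest decrease in impurity. The sketch below assumes class counts of (premature, not premature) inferred from the fractions on the slides. The helper names `gini` and `decrease` are hypothetical.

```python
def gini(counts):
    """Gini impurity from per-class sample counts."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def decrease(parent, branches):
    """Parent impurity minus the size-weighted impurity of the branches."""
    total = sum(parent)
    after = sum(sum(b) / total * gini(b) for b in branches)
    return gini(parent) - after


# Counts are (premature, not premature); inferred from the slides, not given explicitly.
parent = (5, 5)
candidate_splits = {
    "weight >= 5.5": [(0, 3), (5, 2)],
    "number of babies > 1.5": [(2, 3), (3, 2)],
}

decreases = {name: decrease(parent, b) for name, b in candidate_splits.items()}
best_split = max(decreases, key=decreases.get)

for name, value in decreases.items():
    print(f"{name}: {value:.2f}")
print("Ask first:", best_split)
```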
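Build a tree that groups by the baby's weight first and by the number of babies second.
[Tree diagram: baby's weight >= 5.5 (TRUE / FALSE), then number of babies > 1.5 (TRUE / FALSE); leaf values: 100%, 50%, 0%]
This needs fewer questions (branches) than the alternative.
Build a tree that groups by the number of babies first and by the baby's weight second.
[Tree diagram: number of babies > 1.5 at the root, then baby's weight >= 5.5 on both branches; leaf values as extracted: 50%, 100%, 100%, 100%; the node labels Over_35 and Is_Plural also appear on this slide]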
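Classification vs. Regression
Classification: Will the baby be born premature? Regression: When will the baby be born?
Classification [scatter plot: number of babies vs. baby's weight]
Regression [plot: Mother Age vs. Father Age]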
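Predicting the gestation period (axis from 20 to 40 weeks, with the mean marked): [tree diagram: Is_Plural (TRUE / FALSE), then Over_35 (TRUE / FALSE)]
Leaf predictions: 20 weeks, 30 weeks, 37 weeks. Branches are split so that the variation from the mean is minimized.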
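For regression, the same idea can be sketched with squared deviation from the mean in place of Gini impurity. The gestation values below are hypothetical numbers made up for illustration, and the function name is an assumption. Taking each group's mean as its prediction is the usual convention for regression trees.

```python
from statistics import mean


def sum_squared_deviation(values):
    """Total squared distance of the values from their own mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


# Hypothetical gestation periods in weeks, grouped by a candidate question.
plural_births = [30, 32, 34]
single_births = [38, 39, 40, 41]

# Variation before the split, and the summed variation inside each group after it.
before = sum_squared_deviation(plural_births + single_births)
after = sum_squared_deviation(plural_births) + sum_squared_deviation(single_births)
reduction = before - after

# Each group predicts its own mean.
leaf_predictions = {"plural": mean(plural_births), "single": mean(single_births)}

print(round(before, 2), after, round(reduction, 2))
print(leaf_predictions)
```

A question is a good split when `after` is much smaller than `before`, meaning the values within each group sit close to that group's mean.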
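Analytics: Let's try a decision tree!
Create the is_premature column. If the data has no is_premature column, create a new logical column from the gestation_weeks column that indicates whether the birth was premature (less than 37 weeks): gestation_weeks < 37
From the column header menu of the gestation_weeks column, select Create Calculation (Mutate).
• Enter is_premature as the column name.
• Enter gestation_weeks<37 as the calculation.
The new column holds TRUE when gestation is less than 37 weeks and FALSE when it is 37 weeks or more.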
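Outside Exploratory, the same derived column can be sketched in plain Python. Only the column names gestation_weeks and is_premature come from the slides. The rows are hypothetical, and this is not how Exploratory computes the column internally.

```python
# Hypothetical rows; only the column names come from the slides.
rows = [
    {"gestation_weeks": 40},
    {"gestation_weeks": 36},
    {"gestation_weeks": 37},
]

# Same logic as the calculation gestation_weeks < 37.
for row in rows:
    row["is_premature"] = row["gestation_weeks"] < 37

flags = [row["is_premature"] for row in rows]
print(flags)
```

Exactly 37 weeks counts as not premature, matching the strict less-than comparison.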
Decision Tree Analytics
Select the column to predict.
Select the predictor variable columns.
Select all columns except gestation_weeks.
The decision tree has been created.
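Start node: majority class FALSE (not premature). Share of TRUE (premature): 12%. Share of data in this branch: 100%
Condition: is the weight (weight_pounds) 5.3 pounds or more?
Branch with majority class FALSE: share of TRUE (premature): 8%. Share of data in this branch: 94%
Branch with majority class TRUE: share of TRUE (premature): 72%. Share of data in this branch: 6%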
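As a quick sanity check on how to read these nodes, the two branch shares should combine back to the share at the start node. The sketch below uses the rounded percentages shown in the tree, so the result only matches approximately.

```python
# (share of all data, share of TRUE within the branch), as read from the tree.
branches = [
    (0.94, 0.08),  # branch whose majority class is FALSE
    (0.06, 0.72),  # branch whose majority class is TRUE
]

# Weighted average of the branch-level shares of premature births.
overall_premature = sum(data_share * true_share for data_share, true_share in branches)

print(round(overall_premature, 2))
```

The weighted average comes to about 12%, which agrees with the start node.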
Verify by visualizing the data
Is_Premature vs. Weight
An example using employee data
Build a decision tree model
Verify by visualizing the data
Attrition vs. Overtime
Attrition vs. Monthly Income
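Features
• No programming: because the course uses Exploratory, a UI for the R language, as the analysis tool, attendees can focus 100% on learning the data science methods needed to solve business problems.
• No vendor lock-in for the analysis tool: all work done in Exploratory can be reproduced in an independent, open-source R environment.
• Thinking and skills: attendees acquire not only data science skills but also the way of thinking that data analysis requires.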
Q & A
Contact
Email: [email protected]
Website: https://ja.exploratory.io
Boot Camp Training: https://ja.exploratory.io/training-jp
Twitter: @KanAugust