Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
The Hardest Problem in Data
Search
Ronnie Chen
August 24, 2017
Technology
0
200
The Hardest Problem in Data
Ronnie Chen
August 24, 2017
Tweet
Share
More Decks by Ronnie Chen
See All by Ronnie Chen
ChaosConf 2018
ronnieftw
4
1.8k
devopsdays MSP 2018: Staying Alive
ronnieftw
1
520
Luck Driven Development: Building for Serendipity in Slack's Data Platform
ronnieftw
1
360
Staying Alive: Patterns for Failure Management From the Bottom of the Ocean
ronnieftw
0
210
Scaling Data at Slack: A Series of Unfortunate Events
ronnieftw
0
1.3k
Other Decks in Technology
See All in Technology
コンテナセキュリティの基本と脅威への対策
kyohmizu
3
640
Four keys改善の取り組み事例紹介
sansantech
PRO
2
220
インシデントレスポンスのライフサイクルを廻すポイントってなに / Pinpoints of Incidentresponse Lifecycle for Operation
sakaitakeshi
0
280
AWS を使う上で知っておきたいオンプレミス知識/aws-on-premise-essentials
emiki
1
4.1k
4年前、あるじゃん老害エンジニアLT合戦に登壇、米国西海岸コンピュータ歴史博物館体験記の続編
toshi_atsumi
0
180
Amplify Gen2を 拡張してみよう JAWS-UG北陸新幹線 ( 福井開催 ) 2024-04-06/Let's extend Amplify Gen2
fossamagna
0
280
TransitGatewayの基礎
toru_kubota
0
230
A (short) History of AI
harishpillay
0
100
o11y入門_外形監視を利用したWebアプリケーションへの最適なモニタリング_TechBrew
k5k
2
100
最近たまに見かけるTiDBってなんだ? - Findy
pingcap0315
1
100
入社後初めてのタスクでk8sアップグレードした話.pdf
kkato1
0
370
元インフラエンジニアに成る / Human Resources to Human Relations
bobtani
1
310
Featured
See All Featured
Typedesign – Prime Four
hannesfritz
36
2k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
74
41k
The Success of Rails: Ensuring Growth for the Next 100 Years
eileencodes
29
6k
Debugging Ruby Performance
tmm1
69
11k
Clear Off the Table
cherdarchuk
82
310k
The Invisible Customer
myddelton
114
12k
The Cost Of JavaScript in 2023
addyosmani
13
3.8k
Music & Morning Musume
bryan
40
5.5k
Building Applications with DynamoDB
mza
88
5.6k
Designing for humans not robots
tammielis
247
25k
Fashionably flexible responsive web design (full day workshop)
malarkey
397
65k
Build The Right Thing And Hit Your Dates
maggiecrowley
23
2k
Transcript
The Hardest Problem in Data Ronnie Chen @rondoftw Data Engineering
Slack 1 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
2 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
→ Machine learning → Predictive modeling → Neural networks →
Artificial intelligence 3 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting ?! 4 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
5 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
A simple counting problem 6 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
The Rules: 1. Only one number 2. Convince me it's
correct 7 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
How many friends do you have? 8 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
Will I get the same number if... !"#$ I ask
every person you know if they consider you their friend? 9 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Will I get the same number if... ! " I
ask every person that knows you if they think you would consider them a friend? 10 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Is this the number of people that you'd tell a
secret to? 11 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
But it depends!! 12 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
How many users do we have? 13 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users 14 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
user_id name email deleted 1 Alice alice@*** 2 Bob bob@***
true 3 Carol 15 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE deleted != true AND email
!= null 16 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE last_active > 2017-07-24 17 —
WriteSpeakCode 2017 | Ronnie Chen @rondoftw
user_id email 12334
[email protected]
38602
[email protected]
52981
[email protected]
67640
[email protected]
18 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
¯\_(ϑ)_/¯ 19 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What are you not even aware of? 20 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw
Okay, I get it. But what's the big deal? 21
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
26% of professional computing jobs were held by women in
2016 22 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
23 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Numbers give you authority and the appearance of objectivity 24
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting is power. 25 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
26 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counts can determine funding, set agendas, and shift priorities 27
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Machine learning is like money laundering for bias — Maciej
Cegłowski, founder of @Pinboard 28 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
29 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
30 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What you count determines what is important. 31 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw