Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
The Hardest Problem in Data
Search
Ronnie Chen
August 24, 2017
Technology
0
230
The Hardest Problem in Data
Ronnie Chen
August 24, 2017
Tweet
Share
More Decks by Ronnie Chen
See All by Ronnie Chen
ChaosConf 2018
ronnieftw
4
1.8k
devopsdays MSP 2018: Staying Alive
ronnieftw
1
610
Luck Driven Development: Building for Serendipity in Slack's Data Platform
ronnieftw
1
480
Staying Alive: Patterns for Failure Management From the Bottom of the Ocean
ronnieftw
0
240
Scaling Data at Slack: A Series of Unfortunate Events
ronnieftw
0
1.5k
Other Decks in Technology
See All in Technology
Introduction to Sansan, inc / Sansan Global Development Center, Inc.
sansan33
PRO
0
2.7k
大量配信システムにおけるSLOの実践:「見えない」信頼性をSLOで可視化
plaidtech
PRO
0
390
対話型音声AIアプリケーションの信頼性向上の取り組み
ivry_presentationmaterials
3
1.1k
アクセスピークを制するオートスケール再設計: 障害を乗り越えKEDAで実現したリソース管理の最適化
myamashii
1
680
ビジネス職が分析も担う事業部制組織でのデータ活用の仕組みづくり / Enabling Data Analytics in Business-Led Divisional Organizations
zaimy
1
400
An introduction to Claude Code SDK
choplin
2
1.3k
研究開発部メンバーの働き⽅ / Sansan R&D Profile
sansan33
PRO
3
18k
推し書籍📚 / Books and a QA Engineer
ak1210
0
140
ABEMAの本番環境負荷試験への挑戦
mk2taiga
5
1.3k
LLM拡張解体新書/llm-extension-deep-dive
oracle4engineer
PRO
24
6.4k
“日本一のM&A企業”を支える、少人数SREの効率化戦略 / SRE NEXT 2025
genda
1
280
全部AI、全員Cursor、ドキュメント駆動開発 〜DevinやGeminiも添えて〜
rinchsan
10
5.1k
Featured
See All Featured
Building Flexible Design Systems
yeseniaperezcruz
328
39k
Let's Do A Bunch of Simple Stuff to Make Websites Faster
chriscoyier
507
140k
Why Our Code Smells
bkeepers
PRO
337
57k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
251
21k
Facilitating Awesome Meetings
lara
54
6.5k
Designing for humans not robots
tammielis
253
25k
Into the Great Unknown - MozCon
thekraken
40
1.9k
BBQ
matthewcrist
89
9.7k
How to Create Impact in a Changing Tech Landscape [PerfNow 2023]
tammyeverts
53
2.9k
Producing Creativity
orderedlist
PRO
346
40k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
282
13k
Agile that works and the tools we love
rasmusluckow
329
21k
Transcript
The Hardest Problem in Data Ronnie Chen @rondoftw Data Engineering
Slack 1 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
2 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
→ Machine learning → Predictive modeling → Neural networks →
Artificial intelligence 3 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting ?! 4 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
5 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
A simple counting problem 6 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
The Rules: 1. Only one number 2. Convince me it's
correct 7 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
How many friends do you have? 8 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
Will I get the same number if... !"#$ I ask
every person you know if they consider you their friend? 9 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Will I get the same number if... ! " I
ask every person that knows you if they think you would consider them a friend? 10 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Is this the number of people that you'd tell a
secret to? 11 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
But it depends!! 12 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
How many users do we have? 13 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users 14 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
user_id name email deleted 1 Alice alice@*** 2 Bob bob@***
true 3 Carol 15 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE deleted != true AND email
!= null 16 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE last_active > 2017-07-24 17 —
WriteSpeakCode 2017 | Ronnie Chen @rondoftw
user_id email 12334
[email protected]
38602
[email protected]
52981
[email protected]
67640
[email protected]
18 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
¯\_(ϑ)_/¯ 19 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What are you not even aware of? 20 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw
Okay, I get it. But what's the big deal? 21
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
26% of professional computing jobs were held by women in
2016 22 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
23 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Numbers give you authority and the appearance of objectivity 24
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting is power. 25 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
26 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counts can determine funding, set agendas, and shift priorities 27
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Machine learning is like money laundering for bias — Maciej
Cegłowski, founder of @Pinboard 28 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
29 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
30 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What you count determines what is important. 31 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw