Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
The Hardest Problem in Data
Search
Ronnie Chen
August 24, 2017
Technology
0
230
The Hardest Problem in Data
Ronnie Chen
August 24, 2017
Tweet
Share
More Decks by Ronnie Chen
See All by Ronnie Chen
ChaosConf 2018
ronnieftw
4
1.8k
devopsdays MSP 2018: Staying Alive
ronnieftw
1
660
Luck Driven Development: Building for Serendipity in Slack's Data Platform
ronnieftw
1
500
Staying Alive: Patterns for Failure Management From the Bottom of the Ocean
ronnieftw
0
270
Scaling Data at Slack: A Series of Unfortunate Events
ronnieftw
0
1.6k
Other Decks in Technology
See All in Technology
Bill One 開発エンジニア 紹介資料
sansan33
PRO
4
17k
AI アクセラレータチップ AWS Trainium/Inferentia に 今こそ入門
yoshimi0227
1
200
名刺メーカーDevグループ 紹介資料
sansan33
PRO
0
1k
Master Dataグループ紹介資料
sansan33
PRO
1
4.2k
RALGO : AIを組織に組み込む方法 -アルゴリズム中心組織設計- #RSGT2026 / RALGO: How to Integrate AI into an Organization – Algorithm-Centric Organizational Design
kyonmm
PRO
3
1.3k
Java 25に至る道
skrb
3
220
AWSと生成AIで学ぶ!実行計画の読み解き方とSQLチューニングの実践
yakumo
2
510
戰略轉變:從建構 AI 代理人到發展可擴展的技能生態系統
appleboy
0
190
I tried making a solo advent calendar!
zzzzico
0
150
純粋なイミュータブルモデルを設計してからイベントソーシングと組み合わせるDeciderの実践方法の紹介 /Introducing Decider Pattern with Event Sourcing
tomohisa
1
1.1k
「駆動」って言葉、なんかカッコイイ_Mitz
comucal
PRO
0
140
kintone開発のプラットフォームエンジニアの紹介
cybozuinsideout
PRO
0
520
Featured
See All Featured
The Illustrated Guide to Node.js - THAT Conference 2024
reverentgeek
0
230
How to build an LLM SEO readiness audit: a practical framework
nmsamuel
1
600
Gemini Prompt Engineering: Practical Techniques for Tangible AI Outcomes
mfonobong
2
250
[RailsConf 2023 Opening Keynote] The Magic of Rails
eileencodes
31
9.8k
Reflections from 52 weeks, 52 projects
jeffersonlam
355
21k
Designing for humans not robots
tammielis
254
26k
Connecting the Dots Between Site Speed, User Experience & Your Business [WebExpo 2025]
tammyeverts
11
790
The innovator’s Mindset - Leading Through an Era of Exponential Change - McGill University 2025
jdejongh
PRO
1
78
Money Talks: Using Revenue to Get Sh*t Done
nikkihalliwell
0
130
Building Applications with DynamoDB
mza
96
6.9k
Discover your Explorer Soul
emna__ayadi
2
1k
Evolution of real-time – Irina Nazarova, EuRuKo, 2024
irinanazarova
9
1.1k
Transcript
The Hardest Problem in Data Ronnie Chen @rondoftw Data Engineering
Slack 1 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
2 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
→ Machine learning → Predictive modeling → Neural networks →
Artificial intelligence 3 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting ?! 4 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
5 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
A simple counting problem 6 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
The Rules: 1. Only one number 2. Convince me it's
correct 7 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
How many friends do you have? 8 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
Will I get the same number if... !"#$ I ask
every person you know if they consider you their friend? 9 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Will I get the same number if... ! " I
ask every person that knows you if they think you would consider them a friend? 10 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Is this the number of people that you'd tell a
secret to? 11 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
But it depends!! 12 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
How many users do we have? 13 — WriteSpeakCode 2017
| Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users 14 — WriteSpeakCode 2017 | Ronnie
Chen @rondoftw
user_id name email deleted 1 Alice alice@*** 2 Bob bob@***
true 3 Carol 15 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE deleted != true AND email
!= null 16 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
SELECT COUNT(*) FROM prod.users WHERE last_active > 2017-07-24 17 —
WriteSpeakCode 2017 | Ronnie Chen @rondoftw
user_id email 12334
[email protected]
38602
[email protected]
52981
[email protected]
67640
[email protected]
18 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
¯\_(ϑ)_/¯ 19 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What are you not even aware of? 20 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw
Okay, I get it. But what's the big deal? 21
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
26% of professional computing jobs were held by women in
2016 22 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
23 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Numbers give you authority and the appearance of objectivity 24
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counting is power. 25 — WriteSpeakCode 2017 | Ronnie Chen
@rondoftw
26 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Counts can determine funding, set agendas, and shift priorities 27
— WriteSpeakCode 2017 | Ronnie Chen @rondoftw
Machine learning is like money laundering for bias — Maciej
Cegłowski, founder of @Pinboard 28 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
29 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
30 — WriteSpeakCode 2017 | Ronnie Chen @rondoftw
What you count determines what is important. 31 — WriteSpeakCode
2017 | Ronnie Chen @rondoftw