Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Human Cloning: The Data Scientist Bottleneck Re...
Search
Data Science London
July 03, 2012
Technology
2
140
Human Cloning: The Data Scientist Bottleneck Resolved
Presentation by Dr. Alex Farquhar, Data Scientist @ForwardTek at Data Science London 22/02/12
Data Science London
July 03, 2012
Tweet
Share
More Decks by Data Science London
See All by Data Science London
Semi-Supervised Anomaly Detection
datasciencelondon
0
990
Hacking the Rail: Ingesting, analysing & visualising realtime streaming data
datasciencelondon
1
47k
Stateful Data-Parallel Processing
datasciencelondon
0
47k
Semantic web warmed up: Ontologies for the IoT
datasciencelondon
0
130
IoT data ingestion pipelines and Clojure transducers
datasciencelondon
0
280
TrendCalculus: A data science for trends
datasciencelondon
1
48k
Data Science in Mobile Health
datasciencelondon
1
8.3k
Large-scale Recommender Systems on Just a PC (with GraphChi)
datasciencelondon
1
17k
Taming Graph Dynamics at Scale
datasciencelondon
0
8.1k
Other Decks in Technology
See All in Technology
今だから言えるセキュリティLT_Wordpress5.7.2未満を一斉アップデートせよ
cuebic9bic
2
170
三視点LLMによる複数観点レビュー
mhlyc
0
240
組織内、組織間の資産保護に必要なアイデンティティ基盤と関連技術の最新動向
fujie
0
370
助けて! XからWaylandに移行しないと新しいGNOMEが使えなくなっちゃう 2025-07-12
nobutomurata
2
210
20250719_JAWS_kobe
takuyay0ne
1
110
データ駆動経営の道しるべ:プロダクト開発指標の戦略的活用法
ham0215
2
150
CDK Vibe Coding Fes
tomoki10
1
650
RapidPen: AIエージェントによる高度なペネトレーションテスト自動化の研究開発
laysakura
1
240
エンジニアリングマネージャー“お悩み相談”パネルセッション
ar_tama
1
240
Talk to Someone At Delta Airlines™️ USA Contact Numbers
travelcarecenter
0
160
OpenTelemetryセマンティック規約の恩恵とMackerel APMにおける活用例 / SRE NEXT 2025
mackerelio
3
2k
「現場で活躍するAIエージェント」を実現するチームと開発プロセス
tkikuchi1002
4
600
Featured
See All Featured
Intergalactic Javascript Robots from Outer Space
tanoku
271
27k
Learning to Love Humans: Emotional Interface Design
aarron
273
40k
How to train your dragon (web standard)
notwaldorf
96
6.1k
The Power of CSS Pseudo Elements
geoffreycrofte
77
5.9k
Building a Scalable Design System with Sketch
lauravandoore
462
33k
The Language of Interfaces
destraynor
158
25k
ReactJS: Keep Simple. Everything can be a component!
pedronauck
667
120k
GraphQLとの向き合い方2022年版
quramy
49
14k
Testing 201, or: Great Expectations
jmmastey
43
7.6k
Unsuck your backbone
ammeep
671
58k
Cheating the UX When There Is Nothing More to Optimize - PixelPioneers
stephaniewalter
282
13k
Build your cross-platform service in a week with App Engine
jlugia
231
18k
Transcript
HUMAN CLONING The Data Scientist bottleneck resolved Dr Alex Farquhar
Friday, 24 February 2012
0 5,000 10,000 15,000 20,000 2008 2009 2010 2011 2012
2013 2014 2015 2016 2017 exabytes data (IDC/EMC report 2008) Friday, 24 February 2012
By 2018, the United States alone could face a shortage
of 140,000 to 190,000 data people... Friday, 24 February 2012
WE’RE ALL DOOMED Friday, 24 February 2012
DATA PEOPLE? © Drew Conway Friday, 24 February 2012
MAYBE WE CAN JUST.... • 1 statistician + 1 developer
≈ 1 data scientist? Friday, 24 February 2012
HOW ABOUT.... • 4 statisticians + 4 developers ≈ 4
Data Scientists? Friday, 24 February 2012
Friday, 24 February 2012
Friday, 24 February 2012
WHAT CAN WE DO? • Train more new data scientists
(not fast enough) • Cross-train people • Cobble together different skills in teams (see above) Friday, 24 February 2012
WHAT CAN WE DO? • Do more work Friday, 24
February 2012
DOING MORE • simplify (fob the work off) • automate
(fob even more work off) • choose/build the right tools • parallelise • iterate Friday, 24 February 2012
SIMPLIFY & AUTOMATE • Counting stuff is not much fun
Friday, 24 February 2012
Hive Hadoop TSV files SIMPLIFY & AUTOMATE Friday, 24 February
2012
AUTOMATE / PARALLELISE Hadoop Job magic Friday, 24 February 2012
AUTOMATE / PARALLELISE Lots of jobs at once Job 1
Job 2 Job 3 Job 4 Hadoop magic Friday, 24 February 2012
TOOLS • something thats allows fast iteration i.e. not java
• R, ruby, python Friday, 24 February 2012
PARALLELISE Friday, 24 February 2012
ITERATE • try different things • improve what works •
dump what doesn’t • constant improvement & learning → get faster Friday, 24 February 2012
WE’RE NOT ALL DOOMED Friday, 24 February 2012