Data Science London
July 03, 2012
Technology
Human Cloning: The Data Scientist Bottleneck Resolved
Presentation by Dr. Alex Farquhar, Data Scientist @ForwardTek at Data Science London 22/02/12
Transcript
HUMAN CLONING The Data Scientist bottleneck resolved Dr Alex Farquhar
Friday, 24 February 2012
[Chart: data volume in exabytes by year, 2008 to 2017; y-axis from 0 to 20,000. Source: IDC/EMC report 2008]
By 2018, the United States alone could face a shortage of 140,000 to 190,000 data people...
WE’RE ALL DOOMED
DATA PEOPLE?
[Image © Drew Conway]
MAYBE WE CAN JUST....
• 1 statistician + 1 developer ≈ 1 data scientist?
HOW ABOUT....
• 4 statisticians + 4 developers ≈ 4 data scientists?
WHAT CAN WE DO?
• Train more new data scientists (not fast enough)
• Cross-train people
• Cobble together different skills in teams (see above)
WHAT CAN WE DO?
• Do more work
DOING MORE
• simplify (fob the work off)
• automate (fob even more work off)
• choose/build the right tools
• parallelise
• iterate
SIMPLIFY & AUTOMATE
• Counting stuff is not much fun
SIMPLIFY & AUTOMATE
[Diagram labels: Hive, Hadoop, TSV files]
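As a rough illustration of the "counting stuff" that the slides suggest handing off to tooling, here is a minimal Python sketch that counts the values in one column of tab-separated text. It is not the speaker's setup, which the slide labels only as Hive, Hadoop and TSV files. The function name `count_column`, the column names and the sample rows are all made up for this example.

```python
import csv
import io
from collections import Counter


def count_column(tsv_text: str, column: str) -> Counter:
    """Count how often each value appears in one column of TSV text.

    Hypothetical helper for illustration; assumes the first row is a header.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row[column] for row in reader)


# Made-up sample data: one header row, then three records.
sample = "user\tpage\nann\thome\nbob\thome\nann\tpricing\n"

counts = count_column(sample, "page")
print(counts.most_common())
```

Once a count like this is a one-line call, it can be scripted and scheduled rather than done by hand, which is the point the slide makes.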
AUTOMATE / PARALLELISE
[Diagram labels: Job, Hadoop magic]
AUTOMATE / PARALLELISE
Lots of jobs at once
[Diagram labels: Job 1, Job 2, Job 3, Job 4, Hadoop magic]
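The "lots of jobs at once" idea can be sketched on a single machine with Python's standard library. This is only an analogy for submitting several jobs to a cluster, not the Hadoop setup from the talk. `run_job` is a hypothetical stand-in that does a trivial sum, and a thread pool is assumed here purely for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def run_job(job_id: int) -> Tuple[int, int]:
    """Hypothetical stand-in for one job: sums a range sized by the job id."""
    return job_id, sum(range(job_id * 1000))


job_ids = [1, 2, 3, 4]

# Submit all four jobs to the pool together instead of running them in sequence.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = dict(pool.map(run_job, job_ids))

for job_id in job_ids:
    print(f"Job {job_id}: {results[job_id]}")
```

Threads in CPython mainly help when jobs spend their time waiting on something external, such as a cluster; for CPU-bound work on one machine a process pool would be the closer fit.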
TOOLS
• something that allows fast iteration, i.e. not Java
• R, Ruby, Python
PARALLELISE
ITERATE
• try different things
• improve what works
• dump what doesn’t
• constant improvement & learning → get faster
WE’RE NOT ALL DOOMED