$30 off During Our Annual Pro Sale. View Details »
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Platforms for Data Science
Search
Deepak Singh
October 01, 2011
Technology
3
200
Platforms for Data Science
Talk given at the "Computing on the Brink" series
Deepak Singh
October 01, 2011
Tweet
Share
More Decks by Deepak Singh
See All by Deepak Singh
Changing the Calculus of Containers (Datadog Dash)
mndoci
2
110
Platforms for scientific data analysis
mndoci
3
110
FGED Keynote
mndoci
3
94
Open Mic Science - May 7, 2012
mndoci
4
1.3k
Talk at "Genome Informatics Alliance 2012" meeting
mndoci
1
260
A Platform for Data Science
mndoci
6
14k
Intel Theater Presentation @ SC11
mndoci
6
190
Talk at West Coast Association of Shared Directors meeting
mndoci
3
150
A platform for data science - Systems Bioinformatics Workshop
mndoci
3
120
Other Decks in Technology
See All in Technology
プロンプトやエージェントを自動的に作る方法
shibuiwilliam
13
11k
会社紹介資料 / Sansan Company Profile
sansan33
PRO
11
390k
AIの長期記憶と短期記憶の違いについてAgentCoreを例に深掘ってみた
yakumo
4
440
たまに起きる外部サービスの障害に備えたり備えなかったりする話
egmc
0
260
AI駆動開発の実践とその未来
eltociear
1
230
2025年 開発生産「可能」性向上報告 サイロ解消からチームが能動性を獲得するまで/ 20251216 Naoki Takahashi
shift_evolve
PRO
1
200
生成AIを利用するだけでなく、投資できる組織へ / Becoming an Organization That Invests in GenAI
kaminashi
0
110
年間40件以上の登壇を続けて見えた「本当の発信力」/ 20251213 Masaki Okuda
shift_evolve
PRO
1
140
Strands AgentsとNova 2 SonicでS2Sを実践してみた
yama3133
0
260
MLflowダイエット大作戦
lycorptech_jp
PRO
1
140
AWS Security Agentの紹介/introducing-aws-security-agent
tomoki10
0
310
CARTAのAI CoE が挑む「事業を進化させる AI エンジニアリング」 / carta ai coe evolution business ai engineering
carta_engineering
0
1.9k
Featured
See All Featured
Dealing with People You Can't Stand - Big Design 2015
cassininazir
367
27k
Rails Girls Zürich Keynote
gr2m
95
14k
Optimising Largest Contentful Paint
csswizardry
37
3.5k
Facilitating Awesome Meetings
lara
57
6.7k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.6k
実際に使うSQLの書き方 徹底解説 / pgcon21j-tutorial
soudai
PRO
196
70k
Side Projects
sachag
455
43k
Unsuck your backbone
ammeep
671
58k
Why Our Code Smells
bkeepers
PRO
340
57k
Raft: Consensus for Rubyists
vanstee
141
7.2k
Save Time (by Creating Custom Rails Generators)
garrettdimon
PRO
32
1.8k
Bootstrapping a Software Product
garrettdimon
PRO
307
120k
Transcript
There is no magic There is only awesome D e
e p a k S i n g h Platforms for data science
bioinformatics image: Ethan Hein
3
collection
curation
analysis
what’s the big deal?
None
Source: http://www.nature.com/news/specials/bigdata/index.html
Image: Yael Fitzpatrick (AAAS)
Image: Yael Fitzpatrick (AAAS)
lots of data
lots of people
lots of places
constant change
we want to make our data more effective
versioning
provenance
filter
aggregate
extend
mashup
human interfaces
None
image: Leo Reynolds
hard problem
really hard problem
so how do get there?
information platforms
Image: Drew Conway
dataspaces Further reading: Jeff Hammerbacher, Information Platforms and the rise
of the data scientist, Beautiful Data
the unreasonable effectiveness of data Halevy, et al. IEEE Intelligent
Systems, 24, 8-12 (2009)
accept all data formats
evolve APIs
beyond databases and the data warehouse
data as a programmable resource
data is a royal garden
compute is a fungible commodity
optimizing the most valuable resource
compute, storage, workflows, memory, transmission, algorithms, cost, …
people Credit: Pieter Musterd a CC-BY-NC-ND license
Image: Chris Dagdigian
my bias
cloud services
distributed systems
scale
global
consumption models
on-demand
what is the value of your data?
None
None
Credit: Angel Pizzaro, U. Penn
mapreduce for genomics http://bowtie-bio.sourceforge.net/crossbow/index.shtml http://contrail-bio.sourceforge.net http://bowtie-bio.sourceforge.net/myrna/index.shtml
None
Bioproximity http://aws.amazon.com/solutions/case-studies/bioproximity/
None
None
30,472 cores
$1279/hr
http://cloudbiolinux.org/
http://usegalaxy.org/cloud
in summary
large scale data requires a rethink
data architecture
compute architecture
distributed, programmable infrastructure
cloud services
remove constraints
can we build data science platforms?
there is no magic there is only awesome
[email protected]
Twitter:@mndoci http://slideshare.net/mndoci http://mndoci.com Inspiration and ideas from Matt Wood&
Larry Lessig Credit” Oberazzi under a CC-BY-NC-SA license