Web Scale Collaborations
Arfon Smith
December 10, 2013
Transcript
Web Scale Collaborations Arfon Smith @arfon
Citizen Science
Distributed Computing
Distributed Data Collection
Distributed Analysis
http://www.novacelestia.com
Figure: scale of 0 to 1,000,000; labels: Professor, Paper, PhD, SDSS
Figure: classifications per hour (0 to 70,000) against hours (0 to 48); labels: "1 Kevin months", Fukugita et al. 2007
Figure labels: SDSS, HST; star-forming pea, narrow-line Seyfert pea
Motivations
1,000,000,000,000 hours / year
Spectrum of cognitive surplus
Begins with open data
Open Source
Not treating code and data as first-class research objects
GitHub
What is a GitHub?
Easier to work together than alone
Open Source collaboration
Open ≠ Public
Open (within your team, department or institution)
Electronic
Available
Asynchronous
Lock-free
Low friction collaboration
“open source is… reproducible by necessity” Fernando Perez http://blog.fperez.org/2013/11/an-ambitious-experiment-in-data-science.html
Better at collaborating because they have to be
Towards Collaborative Versioned Science
How do we make this behaviour the norm?
Incentive model (it’s broken)
Credit
http://dx.doi.org/10.6084/m9.figshare.828487
Derive meaningful metrics from open contributions
“Academic environments of today do not reward tool builders” Ed Lazowska, OSTP event http://lazowska.cs.washington.edu/MS/MS.OSTP.pdf
A VISION AND STRATEGY FOR SOFTWARE FOR SCIENCE, ENGINEERING, AND EDUCATION
What can we do today?
Take data management plans seriously
Try versioning your research
Share more than just data
If you’re going to share it then you better put a licence on it
Thanks.
@arfon