Web Scale Collaborations
Arfon Smith
December 10, 2013
Science
Transcript
Web Scale Collaborations Arfon Smith @arfon
Citizen Science
Distributed Computing
Distributed Data Collection
Distributed Analysis
http://www.novacelestia.com
[Chart: Professor, Paper, PhD and SDSS compared on a scale of 0 to 1,000,000]
[Chart: classifications per hour (0 to 70,000) against hours (0 to 48); annotations: "1 Kevin months", Fukugita et al. 2007]
SDSS HST Starforming pea Narrow-line Seyfert pea
Motivations
1,000,000,000,000 hours / year
Spectrum of cognitive surplus
Begins with open data
Open Source
Not treating code and data as first class research objects
GitHub
What is a GitHub?
Easier to work together than alone
Open Source collaboration
Open ≠ Public
Open (within your team, department or institution)
Electronic
Available
Asynchronous
Lock-free
Low friction collaboration
“open source is… reproducible by necessity” Fernando Perez http://blog.fperez.org/2013/11/an-ambitious-experiment-in-data-science.html
Better at collaborating because they have to be
Towards Collaborative Versioned Science
How do we make this behaviour the norm?
Incentive model (it’s broken)
Credit
http://dx.doi.org/10.6084/m9.figshare.828487
Derive meaningful metrics from open contributions
“Academic environments of today do not reward tool builders” Ed Lazowska, OSTP event http://lazowska.cs.washington.edu/MS/MS.OSTP.pdf
A VISION AND STRATEGY FOR SOFTWARE FOR SCIENCE, ENGINEERING, AND EDUCATION
What can we do today?
Take data management plans seriously
Try versioning your research
Share more than just data
If you’re going to share it then you better put a licence on it
Thanks.
[email protected]
@arfon