Web Scale Collaborations
Arfon Smith
December 10, 2013
Science
Transcript
Web Scale Collaborations Arfon Smith @arfon
Citizen Science
Distributed Computing
Distributed Data Collection
Distributed Analysis
http://www.novacelestia.com
Chart: scale from 0 to 1,000,000; labels: Professor, Paper, PhD, SDSS
Chart: classifications per hour (0 to 70,000) against hours (0 to 48); annotations: "1 Kevin months", Fukugita et al. 2007
SDSS HST Starforming pea Narrow-line Seyfert pea
Motivations
1,000,000,000,000 hours / year
Spectrum of cognitive surplus
Begins with open data
Open Source
Not treating code and data as first class research objects
GitHub
What is a GitHub?
Easier to work together than alone
Open Source collaboration
Open ≠ Public
Open (within your team, department or institution)
Electronic
Available
Asynchronous
Lock-free
Low friction collaboration
“open source is… reproducible by necessity” Fernando Perez http://blog.fperez.org/2013/11/an-ambitious-experiment-in-data-science.html
Better at collaborating because they have to be
Towards Collaborative Versioned Science
How do we make this behaviour the norm?
Incentive model (it’s broken)
Credit
http://dx.doi.org/10.6084/m9.figshare.828487
Derive meaningful metrics from open contributions
“Academic environments of today do not reward tool builders” Ed Lazowska, OSTP event http://lazowska.cs.washington.edu/MS/MS.OSTP.pdf
A VISION AND STRATEGY FOR SOFTWARE FOR SCIENCE, ENGINEERING, AND EDUCATION
What can we do today?
Take data management plans seriously
Try versioning your research
Share more than just data
If you’re going to share it then you better put a licence on it
Thanks.
@arfon