Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
Web Scale Collaborations
Search
Sponsored
·
Your Podcast. Everywhere. Effortlessly.
Share. Educate. Inspire. Entertain. You do you. We'll handle the rest.
→
Arfon Smith
December 10, 2013
Science
0
83
Web Scale Collaborations
Arfon Smith
December 10, 2013
Tweet
Share
More Decks by Arfon Smith
See All by Arfon Smith
Why Generative AI makes collaborative, versioned science more important than ever
arfon
0
44
Generative AI is here: What are we going to do about it?
arfon
0
140
Five principles for building generative AI products
arfon
0
130
Five principles for building generative AI products
arfon
0
210
Learning from NASA's commitment to open
arfon
0
96
JOSS rOpenSci presentation
arfon
0
290
Five ways to use GitHub to automate scholarly work
arfon
0
140
Journal of Open Source Software: Bot-assisted community peer-review
arfon
0
130
A vision for the future of astronomical archives
arfon
0
160
Other Decks in Science
See All in Science
Kaggle: NeurIPS - Open Polymer Prediction 2025 コンペ 反省会
calpis10000
0
370
Accelerated Computing for Climate forecast
inureyes
PRO
0
150
ド文系だった私が、 KaggleのNCAAコンペでソロ金取れるまで
wakamatsu_takumu
2
1.9k
学術講演会中央大学学員会府中支部
tagtag
PRO
0
350
HDC tutorial
michielstock
1
360
主成分分析に基づく教師なし特徴抽出法を用いたコラーゲン-グリコサミノグリカンメッシュの遺伝子発現への影響
tagtag
PRO
0
180
データマイニング - グラフ構造の諸指標
trycycle
PRO
0
250
AI(人工知能)の過去・現在・未来 —AIは人間を超えるのか—
tagtag
PRO
0
140
データマイニング - ウェブとグラフ
trycycle
PRO
0
230
俺たちは本当に分かり合えるのか? ~ PdMとスクラムチームの “ずれ” を科学する
bonotake
2
1.6k
あなたに水耕栽培を愛していないとは言わせない
mutsumix
1
250
サイコロで理解する原子核崩壊と拡散現象 〜単純化されたモデルで本質を理解する〜
syotasasaki593876
0
150
Featured
See All Featured
JavaScript: Past, Present, and Future - NDC Porto 2020
reverentgeek
52
5.8k
The State of eCommerce SEO: How to Win in Today's Products SERPs - #SEOweek
aleyda
2
9.5k
Color Theory Basics | Prateek | Gurzu
gurzu
0
190
Sharpening the Axe: The Primacy of Toolmaking
bcantrill
46
2.7k
Speed Design
sergeychernyshev
33
1.5k
Music & Morning Musume
bryan
47
7.1k
The innovator’s Mindset - Leading Through an Era of Exponential Change - McGill University 2025
jdejongh
PRO
1
88
Making the Leap to Tech Lead
cromwellryan
135
9.7k
Beyond borders and beyond the search box: How to win the global "messy middle" with AI-driven SEO
davidcarrasco
1
49
Producing Creativity
orderedlist
PRO
348
40k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
31
3.1k
Responsive Adventures: Dirty Tricks From The Dark Corners of Front-End
smashingmag
254
22k
Transcript
Web Scale Collaborations Arfon Smith @arfon
Citizen Science
Distributed Computing
None
Distributed Data Collection
None
None
Distributed Analysis
None
None
None
None
http://www.novacelestia.com
None
None
None
None
None
0 250,000 500,000 750,000 1,000,000 Professor Paper PhD SDSS
Classifications per hour 0 10,000 20,000 30,000 40,000 50,000 60,000
70,000 Hours 0 6 12 18 24 30 36 42 48 1 Kevin months Fukugita et al. 2007
None
None
None
None
None
None
None
None
None
SDSS HST Starforming pea Narrow-line Seyfert pea
None
None
None
None
None
None
None
Motivations
None
None
None
1,000,000,000,000 hours / year
Spectrum of cognitive surplus
None
None
Begins with open data
Open Source
Not treating code and data as first class research objects
GitHub
What is a GitHub?
None
None
None
None
Easier to work together than alone
Open Source collaboration
None
None
None
None
None
None
None
None
None
None
None
None
None
None
None
None
Open Public ≠
Open (within your team, department or institution)
Electronic
Available
Asynchronous
Lock-free
None
None
None
Low friction collaboration
“open source is… reproducible by necessity” Fernando Perez http://blog.fperez.org/2013/11/an-ambitious-experiment-in-data-science.html
Better at collaborating because they have to be
Towards Collaborative Versioned Science
How do we make this behaviour the norm?
Incentive model (it’s broken)
Credit
http://dx.doi.org/10.6084/m9.figshare.828487
http://dx.doi.org/10.6084/m9.figshare.828487
None
None
Derive meaningful metrics from open contributions
“Academic environments of today do not reward tool builders” Ed
Lazowska, OSTP event http://lazowska.cs.washington.edu/MS/MS.OSTP.pdf
A VISION AND STRATEGY FOR SOFTWARE FOR SCIENCE, ENGINEERING, AND
EDUCATION
What can we do today?
Take data management plans seriously
Try versioning your research
Share more than just data
If you’re going to share it then you better put
a licence on it
Thanks.
[email protected]
@arfon "