Collaborative platforms for streamlining
workflows in Open Science
Konrad U. Förstner, Gregor Hagedorn, Claudia Koltzenburg,
M. Fabiana Kubke, Daniel Mietchen
July 30th, 2011 – OKCon 2011, Berlin
Slide 2
About this work
Wiki-based version of the manuscript:
http://is.gd/openworkflows
Slide 3
Problem: There are many gaps in the scientific process
Time-consuming and often annoying
Loss of information
http://www.flickr.com/photos/eirikref/403363597 – CC-BY by flickr user eirikref
Slide 4
A proposal for improved scientific workflow
Seamless transition from bench to publication
Based on Virtual Research Environments (VRE)
Transparency, reproducibility & reusability
Formalization
Reputation system included
http://commons.wikimedia.org/wiki/File:Future73nb.jpg – PD
Slide 5
Conception and project planning
Utilizing collective intelligence
Management tools can help to handle complex projects
http://www.flickr.com/photos/marksurman/3604105727/ – CC-BY by flickr user marksurman
Slide 6
Experiments and data generation
More automation needed (ideally via Open Hardware)
Formal language to design/program experiments
http://www.flickr.com/photos/kaibara/2072160194/ – CC-BY by flickr user kaibara
Slide 7
Data storage = data release
Publish data immediately in a machine-readable form
Every entity gets a unique identifier (⇒ referable)
http://www.flickr.com/photos/wilhei/109404222/ – CC-BY by flickr user wilhei
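A minimal sketch of what "storage = release" could look like in practice, assuming a JSON record format and UUID-based identifiers. Neither is prescribed by the slides; the function name `release_record` and the example fields are hypothetical.

```python
import json
import uuid


def release_record(measurement: dict) -> str:
    """Attach a fresh identifier to a record and serialize it as JSON.

    Assumption: a random UUID, written as a URN, stands in for whatever
    identifier scheme a real platform would use.
    """
    record = {"id": f"urn:uuid:{uuid.uuid4()}", **measurement}
    return json.dumps(record, sort_keys=True)


# Hypothetical measurement; the field names are illustrative only.
published = release_record({"sample": "A1", "od600": 0.42})
print(published)
```

Because the identifier is assigned at the moment the record is written, later analysis steps and publications can refer to this exact record.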
Slide 8
Data analysis
Scripting / programming or recording of GUI-tool actions
Good examples: Taverna or Galaxy
Grid computing if needed / possible
http://commons.wikimedia.org/wiki/File:Plastic_tape_measure.jpg – CC-BY by Wikimedia Commons user Pastorius
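One way to picture "recording" analysis actions is a small provenance log that captures every step as it runs. The sketch below is an illustration only and does not show how Taverna or Galaxy work internally; the `recorded` decorator, the `provenance` list, and the two analysis functions are hypothetical.

```python
import functools
import json

provenance = []  # hypothetical in-memory log of executed analysis steps


def recorded(func):
    """Wrap an analysis step so each call is appended to the log."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        provenance.append({
            "step": func.__name__,
            "args": list(args),
            "kwargs": kwargs,
            "result": result,
        })
        return result
    return wrapper


@recorded
def normalize(values):
    """Scale the values so that they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


@recorded
def mean(values):
    """Arithmetic mean of the values."""
    return sum(values) / len(values)


shares = normalize([2, 3, 5])
average = mean(shares)
print(json.dumps(provenance, indent=2))
```

The printed log lists each step with its inputs and output, so the analysis can be replayed or inspected by others.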
Slide 9
Knowledge generation
Again: Collaborative - increase the number of brains involved
Again: Formalization - e.g. argument maps which link to
results and literature
http://www.flickr.com/photos/diana_blackwell/2597258115/ – CC-BY by flickr user diana blackwell
Slide 10
Final publication
Little effort: linking to the major outcomes and putting them
into the scientific context
http://www.flickr.com/photos/yorkjason/3265889476/ – CC-BY by flickr user yorkjason
Slide 11
Implementation - Technology
Most needed building blocks are available (as FLOSS) –
“just” need to be connected
Open standards needed
Domain-specific solutions should be created by the
communities
http://www.flickr.com/photos/tallkev/256810217/ – CC-BY by flickr user tallkev
Slide 12
Implementation - Reputation
Microcontribution ⇒ Microattribution (e.g. ORCID based)
http://www.flickr.com/photos/tallkev/256810217/ – CC-BY by flickr user tallkev
Slide 13
Implementation - Licenses
Ideally: Public domain / CC0 (see Panton Principles)
http://www.flickr.com/photos/subcircle/500995147 – CC-BY by flickr user subcircle
Slide 14
Implementation - Funding
Long term aim: Funding agencies require the usage of such
Open Science workflows (worked for OA)
– CC-BY by flickr user therichbrooks
Slide 15
Dealing with the cultural clash via a gradual approach
The proposed infrastructure, but with fine-grained access control and a
smaller number of participants.
http://www.everystockphoto.com/photo.php?imageId=1761122 – source: The Library of Congress
Slide 16
Take-home messages in a nutshell
All steps of the research process can be represented in /
connected to VREs
Gaps between the steps are minimized
Gain of transparency, reproducibility & reusability
Main problems are not technical but cultural/political
http://www.flickr.com/photos/marcoarment/3129076932 – CC-BY by flickr user marcoarment
Slide 17
http://www.flickr.com/photos/nateone/3768979925/ – CC-BY by flickr user nateone