Wf4Ever: Scientific Workflows and Research Objects as tools for scientific insight and methodology curation

Astronomers are drowning in data: facilities like ALMA currently deliver datasets in the gigabyte range, and growing, while facilities like the LSST and the SKA will generate datasets so large that downloading them, even in reduced form, will not be feasible. In this talk we introduce Scientific Workflows, software tools that allow easy exploration of both local and remote datasets and processing services, and Research Objects, which encapsulate all relevant aspects of a scientific experiment, allow its quantitative and qualitative assessment, and enable reuse with proper attribution and linkage to publications, among other things. The AstroTaverna plugin, which provides astronomy-specific tools for workflow creation, will also be presented in this ALMA weekly seminar.

Juande Santander-Vela

July 04, 2013

Transcript

  1. Wf4Ever: Scientific Workflows and Research Objects as tools for scientific insight and methodology curation

    Juande Santander-Vela, jdsant@iaa.es, Instituto de Astrofísica de Andalucía-CSIC
  2. Talk Outline

    Introduction
    Current challenges for radio astronomy and science
    Potential e-Science solutions: Workflows and Research Objects
    Final points
  3. Introduction

  4. Who am I?

    Member of the AMIGA international collaboration, based at IAA-CSIC
    Ph.D. on bringing radio astronomical data archives and tools into the VO
    Applied Scientist at the ESO VLT archive; Software Engineer/Astronomy Specialist at the ALMA archive (May 2009-Dec 2011)
    Back at IAA-CSIC as VIA-SKA Project Manager, Radio Astroinformatician
    GROUP INTEREST IN TECH DEVELOPMENTS FOR BETTER SCIENCE
  5. Why am I here?

    Collaboration with Stephane Leon and the ALMA Data Management Group
    Helping bring the ALMA Science Archive to the VO
    Modelling radio data cubes
    Finding use cases for workflow technology (see later)
  6. AMIGA: Analysis of the interstellar Medium of Isolated GAlaxies

    Multi-wavelength, multi-object study on isolated galaxies with strict isolation criteria
    Careful curation of data
    Very careful processing of new parameters from the Group’s own observation programs and data reduction
    Literature table scanning
    Virtual Observatory table harvesting and parsing
    Emphasis on marrying astronomy and computer science, and buy-in of the VO
    E-SCIENCE USERS
  7. AMIGA: Analysis of the interstellar Medium of Isolated GAlaxies

    Multi-wavelength, multi-object study on isolated galaxies with strict isolation criteria
    Careful curation of data
    Very careful processing of new parameters from the Group’s own observation programs and data reduction
    Literature table scanning
    Virtual Observatory table harvesting and parsing
    Emphasis on marrying astronomy and computer science, and buy-in of the VO
    E-SCIENCE DEVELOPERS!
  8. AMIGA

    Project goal: providing a baseline for galaxy properties to compare with other environments
    Interaction-free sample, ideal for tracing HI infall: we can use CIG galaxies to detect the cosmic web
    Need for very sensitive telescopes able to resolve faint HI ➡ Square Kilometre Array & pathfinders
    PARTICIPATING IN SKA.TEL.SDP CONSORTIUM
    WE NEED TOOLS FOR OUR OWN SCIENCE ANALYSIS ⤷
  9. Current challenges for radio astronomy and science

  10. Data over-abundance

    Moore’s Law for Detectors ➡ Exponential increase of individual and accumulated data sets
    We have more data than ever… but we can’t use it:
    Because we can’t:
      Difficult to set up (for sharing)
      Difficult to find (for using)
      Difficult to document (both using and sharing)
      Difficult to deal with (because of size, formatting, purpose…)
    Because it is not in our best interest
    FULLY ?
  11. Courtesy J.E. Ruiz (AMIGA, Wf4Ever)

  12. Courtesy J.E. Ruiz (AMIGA, Wf4Ever) Tools!

  13. Data sharing

    DATA SHARING (Nature special): Sharing data is good. But sharing your own data? That can get complicated. As two research communities who held meetings in May on the issue report their proposals to promote data sharing in biology, a special issue of Nature examines the cultural and technical hurdles that can get in the way of good intentions.
    DATA FLIRTING DATA HOARDING IRREPRODUCIBLE RESEARCH ?
  14. Irreproducible research

    CHALLENGES IN IRREPRODUCIBLE RESEARCH (Nature special): No research paper can ever be considered to be the final word, and the replication and corroboration of research results is key to the scientific process. In studying complex entities, especially animals and human beings, the complexity of the system and of the techniques can all too easily lead to results that seem robust in the lab, and valid to editors and referees of journals, but which do not stand the test of further studies. Nature has published a series of articles about the worrying extent to which research results have been found wanting in this respect. The editors of Nature and the Nature life sciences research journals have also taken substantive steps to put our own houses in order, in improving the transparency and robustness of what we publish. Journals, research laboratories and institutions and funders all have an interest in tackling issues of irreproducibility. We hope that the articles contained in this collection will help.
  15. Irreproducible research

  16. Tool over-abundance ++

  17. Starship Asterisk*: APOD and General Astronomy Discussion Forum

    The Engineering Deck: Astrophysics Source Code Library (671 topics)
    Guide to the Astrophysics Source Code Library
    Papers of Possible Interest to Astronomical Software Users
    The Astrophysics Source Code Library: New codes welcome
    *Web Resources and Tools for Astrophysicists/Astronomers*
    2011 and 2012 Additions to the ASCL
    21cmFAST: Simulation of the High-Redshift 21-cm Signal
    2LPTIC: 2nd-order Lagrangian Perturbation Theory Initial Con
    2MASS Kit: 2MASS Catalog Server Kit
    3DEX: Fast Fourier-Bessel Decomposition of Spherical 3D Surv
    AAOGlimpse: Three-dimensional Data Viewer
    ACORNS-ADI: Calibration, Registration and Nulling in Imaging
    ACS: ALMA Common Software
  18. Services too!

  19. How to deal with all this?

    ++ All of this compounds the problems of reproducibility, methodology assessment, result dissemination…
  20. How to deal with all this?

    AND THE CODE? WHAT SOFTWARE DOES IT DEPEND ON? WHICH CODE DID WHAT?
    NOT A GOOD SOLUTION TRADITIONALLY…
  21. How to deal with all this?

    ++ ORCHESTRATION, ENCAPSULATION, DATA ACCESS, PROVENANCE, ANNOTATION…
  22. Why Workflows? SCIENTIFIC

  23. Workflows define computations

    Events & Processes Dependencies Resources Local & Remote Processes Sequences Concurrences Triggers
    FORMALLY, OR AT LEAST MACHINE READABLE ➡ WORKFLOW DEFINITION LANGUAGES
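To make "processes, dependencies, sequences and concurrences" concrete, here is a minimal Python sketch of a machine-readable workflow definition. It is not a Taverna workflow and not any real workflow definition language: the step names and the `execution_stages` helper are hypothetical, and it only assumes that each step declares which other steps it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "fetch_catalogue": set(),
    "fetch_image": set(),
    "crossmatch": {"fetch_catalogue"},
    "overplot": {"crossmatch", "fetch_image"},
}


def execution_stages(dependencies):
    """Group steps into stages; steps within one stage could run concurrently."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every step whose inputs are done
        stages.append(ready)
        sorter.done(*ready)
    return stages


stages = execution_stages(workflow)
for number, steps in enumerate(stages, start=1):
    print(f"stage {number}: {', '.join(steps)}")
```

Because the dependencies are data rather than code, the same definition can be inspected, drawn, or handed to a different execution engine.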
  24. Workflows enable distributed computing

    Distributed computing paradigm
    Move computation to the data
    Computing services
    Collaborative environments
    Linked data
    FOR SCIENTIFIC DISCUSSION & SCIENCE EXTRACTION ➡ Science-computing
  25. Workflows enable distributed computing

    Data can be anywhere
    Workflows can be constructed hierarchically
    Each workflow does useful work on its own
    The data flow can be easily followed
  26. Workflows enable interactive computing

    Each workflow run records its inputs, outputs, and intermediate results
    You can build and run workflows incrementally
    You can get (almost) immediate feedback on changes
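As an illustration of a run that records its inputs, outputs and intermediate results, here is a small Python sketch. It does not reproduce how Taverna stores provenance: the `run_workflow` function, the record layout and the two example steps are assumptions made for this example only.

```python
import json
from datetime import datetime, timezone


def run_workflow(steps, inputs):
    """Run steps in order, recording inputs, intermediate results and outputs."""
    record = {
        "started": datetime.now(timezone.utc).isoformat(),
        "inputs": dict(inputs),
        "intermediate": [],
    }
    value = inputs
    for name, function in steps:
        value = function(value)
        record["intermediate"].append({"step": name, "result": value})
    record["outputs"] = value
    return record


# Hypothetical two-step workflow operating on a list of fluxes.
steps = [
    ("select_positive", lambda data: {"fluxes": [f for f in data["fluxes"] if f > 0]}),
    ("total_flux", lambda data: {"total": sum(data["fluxes"])}),
]
record = run_workflow(steps, {"fluxes": [1.5, -0.2, 2.5]})
print(json.dumps(record["intermediate"], indent=2))
```

With such a record, adding a third step and rerunning shows immediately which intermediate result changed.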
  27. Tools for workflow storage and discovery

    myExperiment workflow listing: 2207 results, filterable by type (Taverna 2, Taverna 1, RapidMiner, Kepler, Bioclipse, Scri…, LONI Pipeline, GWorkflowDL, KNIME, BioExtract Ser…, Galaxy) and by tag
    Example entry: Pathways and Gene annotations for QTL region (Taverna 2, v7)
    Created: 19/11/09 @ 18:18:52 | Last updated: 07/09/12 @ 18:23:36 | Credits: Paul Fisher | License: Creative Commons Attribution-Share Alike 3.0 Unported License
    This workflow searches for genes which reside in a QTL (Quantitative Trait Loci) region in the mouse, Mus musculus. The workflow requires an input of: a chromosome name or number; a QTL start base pair position; QTL end base pair position. Data is then extracted from BioMart to annotate each of the genes found in this region. The Entrez and UniProt identifiers are then sent to KEGG to obtain KEGG gene identifiers. The KEGG gene identifiers are then used to search for pathways in the KEGG path...
  28. Tools for workflow storage and discovery

    myExperiment search for “Astrotaverna”: 44 results, filtered by type (Taverna 2) and by tag (astronomy, astrotaverna, votable, virtual observ…, starter pack, local processes, taverna, workflow, galfit, sextractor)
    Example entry: Cocatenates several VOTables into one (Taverna 2, v3)
    Created: 30/08/12 @ 10:05:29 | Last updated: 22/04/13 @ 16:52:00 | Credits: Julian Garrido | License: Creative Commons Attribution-Share Alike 3.0 Unported License
    Snippet showing how to use the AstroTaverna tool for concatenating several VOTables. The input is four VOTables with the same number of columns. With the sample values provided, the result is a VOTable duplicated vertically four times.
    Rating: 0.0 / 5 (0 ratings) | Versions: 3 | Reviews: 0 | Comments: 0 | Citations: 0
  29. myExperiment search results for “Astrotaverna”

    44 results; filters by type (Taverna 2), tag (astronomy, astrotaverna, votable, virtual observ…, starter pack, local processes, taverna, workflow, galfit, sextractor), user (Jose Enrique …, Julian Garrido), licence (by-sa, BSD) and group (AMIGA, Wf4Ever)

    Cocatenates several VOTables into one (Taverna 2, v3)
    Created: 30/08/12 @ 10:05:29 | Last updated: 22/04/13 @ 16:52:00 | Credits: Julian Garrido | License: Creative Commons Attribution-Share Alike 3.0 Unported License
    Snippet showing how to use the AstroTaverna tool for concatenating several VOTables. The input is four VOTables with the same number of columns. With the sample values provided, the result is a VOTable duplicated vertically four times.
    Rating: 0.0 / 5 (0 ratings) | Versions: 3 | Reviews: 0 | Comments: 0 | Citations: 0 | Viewed: 26 times | Downloaded: 12 times | Tags (4): astronomy | astrotaverna | cat | votable

    Create configuration files from a template... (Taverna 2, v1)
    Created: 26/07/12 @ 10:56:46 | Last updated: 04/09/12 @ 07:30:55 | Credits: Julian Garrido | License: Creative Commons Attribution-Share Alike 3.0 Unported License
    This workflow uses astrotaverna artifacts. It creates files by using a template whose keys are replaced by data from a votable. A configuration file is created for every row in the votable. The keys must appear also in the vocabulary file and match column names in the votable. A column in the votable must contain the name of the result configuration file.
    Rating: 0.0 / 5 (0 ratings) | Versions: 1 | Reviews: 0 | Comments: 0 | Citations: 0 | Viewed: 14 times | Downloaded: 15 times | Tags (4): astronomy | astrotaverna | local processes | votable

    Simulates the physical, dynamical, and che... (Taverna 2, v1)
    Created: 17/05/13 @ 08:03:13 | Credits: Julian Garrido
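The template-based workflow described above (one configuration file per table row, with template keys matching column names) can be sketched in plain Python as follows. This is not the AstroTaverna implementation: it uses the standard library's `string.Template`, represents each VOTable row as a dictionary, and leaves out the vocabulary file; the `render_configs` helper, column names and values are all hypothetical.

```python
from string import Template


def render_configs(template_text, rows, filename_column):
    """Return {filename: text}, with one rendered configuration per table row."""
    template = Template(template_text)
    configs = {}
    for row in rows:
        # substitute() raises KeyError if a template key has no matching column.
        configs[row[filename_column]] = template.substitute(row)
    return configs


# Hypothetical template and table rows; keys in the template match column names.
template_text = "object = $name\nradius_arcsec = $radius\n"
rows = [
    {"name": "galaxy_a", "radius": 30, "config_file": "galaxy_a.cfg"},
    {"name": "galaxy_b", "radius": 45, "config_file": "galaxy_b.cfg"},
]
configs = render_configs(template_text, rows, "config_file")
for filename, text in configs.items():
    print(f"--- {filename}\n{text}", end="")
```

Failing loudly on an unmatched key mirrors the requirement that every template key must correspond to a column in the table.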
  30. myExperiment search results for “Astrotaverna”

  31. myExperiment workflow entry: Cocatenates several VOTables into one

    Created at: 30/08/12 @ 10:05:29 | Last updated: 22/04/13 @ 16:52:00
    Version 3 (latest of 3), created on 22/04/13 @ 16:52:00 by Julian Garrido
    Type: Taverna 2 | Original Uploader: Julian Garrido | Credits (1): Julian Garrido | Attributions (0): None
    Tags (4) | Featured in Packs (1) | Ratings (0) | Attributed By (0) | Favourited By (0) | Citations (0) | Reviews (0) | Comments (0)
  32. Version 3 (latest of 3): Cocatenates several VOTables into one

    Version created on: 22/04/13 @ 16:52:00 by Julian Garrido | Type: Taverna 2 | Original Uploader: Julian Garrido
    Description: Snippet showing how to use the AstroTaverna tool for concatenating several VOTables. The input is four VOTables with the same number of columns. With the sample values provided, the result is a VOTable duplicated vertically four times.
    Download: Scalable Diagram (SVG); Workflow File/Package (T2FLOW)
    Credits (1): Julian Garrido | Attributions (0): None
    Tags (4): astronomy | astrotaverna | cat | votable
    Shared with Groups (1): AMIGA | Featured In Packs (1): AstroTaverna Starter Pack | Ratings (0)
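The concatenation this workflow performs can be illustrated with a short Python sketch. It does not call AstroTaverna or any VOTable library: tables are represented as plain dictionaries, the `concatenate_tables` helper and the sample values are hypothetical, and the check is stricter than the description (it requires identical column names, not just the same number of columns).

```python
def concatenate_tables(tables):
    """Stack tables vertically; every table must have the same columns."""
    if not tables:
        raise ValueError("need at least one table")
    columns = tables[0]["columns"]
    rows = []
    for index, table in enumerate(tables):
        if table["columns"] != columns:
            raise ValueError(f"table {index} has different columns")
        rows.extend(table["rows"])
    return {"columns": list(columns), "rows": rows}


# Hypothetical sample values: the same two-row table supplied four times.
sample = {
    "columns": ["name", "ra_deg", "dec_deg"],
    "rows": [["source_a", 10.0, 20.0], ["source_b", 50.0, -5.0]],
}
merged = concatenate_tables([sample] * 4)
print(len(merged["rows"]), "rows in the concatenated table")
```

Feeding the same sample table to all four inputs yields the "four times vertically duplicated" result the description mentions.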
  33. Workflow entry: download and run options

    Download: Workflow File/Package (T2FLOW); Workflow as a Galaxy tool
    Run this Workflow in the Taverna Workbench, Option 1: copy and paste this link into File > 'Open workflow location...': http://www.myexperiment.org/workflows/3130/download?version=3
    Workflow Components: Authors (1), Titles (1), Descriptions (1), Dependencies (0), Inputs (4), Processors (1), Beanshells (0), Outputs (1), Datalinks (5), Coordinations (0)
    Featured In Packs (1): AstroTaverna Starter Pack | Ratings: 0.0 / 5 (0 ratings) | Attributed By (0): None | Favourited By (0): No one
    Statistics: 53 viewings, 75 downloads
  34. That’s not enough! FOR ASTRONOMERS FOR REPRODUCIBILITY AND REUSE

  35. EU FUNDED FP7 STREP PROJECT, DECEMBER 2010 – DECEMBER 2013

    1. Intelligent Software Components (iSOCO, Spain)
    2. University of Manchester (UNIMAN, UK)
    3. Universidad Politécnica de Madrid (UPM, Spain)
    4. Poznan Supercomputing and Networking Centre (PSNC, Poland)
    5. University of Oxford (OXF, UK)
    6. Instituto de Astrofísica de Andalucía (IAA, Spain)
    7. Leiden University Medical Centre (LUMC, NL)
  36. Technological infrastructure for the preservation and efficient retrieval and reuse of scientific workflows in a range of disciplines

    Case Studies: Astronomy (IAA-CSIC); Genome-wide Analysis and Biobanking
    Goals:
      Archival, classification, and indexing of scientific workflows and their associated materials in scalable semantic repositories, providing advanced access and recommendation capabilities
      Creation of scientific communities to collaboratively share, reuse, and evolve workflows and their parts, stimulating the development of new scientific knowledge
    Core Competencies (Tech): Digital Libraries; Workflow Management; Semantic Web; Integrity & Authenticity; Provenance; Information Quality
    Partners: One SME; Six public organisations
    TARGETING ALREADY ESTABLISHED COMMUNITIES: MYEXPERIMENT, VIRTUAL OBSERVATORY
  37. What is a Scientific Workflow?

    » A mechanism for coordinating the execution of services and codes, and linking together resources.
    » The combination of data and processes into a configurable, modular, structured set of steps that implement semi-automated computational solutions in scientific problem-solving.
    » The implementation of a scientific method.
    NOT A PIPELINE!
    COURTESY J.E. RUIZ, Workflows to Access and Massage VO Data
  38. AMIGA4GAS 3D KINEMATICAL MODELING

    INPUT FILES ROTCUR 12 RUNS POSSIBLE COMBINATIONS IN INPUT PARAMETERS 12 ASCII FILES GALMOD 12 CUBES 4 APPROACHING 4 RECEDING 4 BOTH COPY 8 CUBES 4 APPROACHING + RECEDING 4 BOTH MOMENTS 8 VELOCITY MAPS 1 DATACUBE 1 VELOCITY MAP 1 CONFIG FILE ROTCUR 1 CONFIG FILE GALMOD SUB 8 RESIDUAL CUBES 8 RESIDUAL MAPS SUB MNMX 8 VALUES FOR PEAKS IN CUBES 8 VALUES FOR PEAKS IN MAPS VARIABLE PARAMS INSET RADII, WIDTHS WEIGHT TOLERANCE DENS NV Z0 VDISP
  39. How do we build workflows?

  40. AstroTaverna

    Taverna plugin for retrieving and manipulating VO Data + Catalogs on HTML Pages
    VO Services: ConeSearch, SIA, SSA, TAP coming soon
    Tabular Data (VOTables, converters from other formats)
    Crossmatching, Filtering, NameResolving, Coordinates and reference system transformation, Data massage… (STILTS)
    Source catalog overplotting on Images and filtering, overplot circles, ellipses, etc. as a function of physical magnitude
    Resampling, crops, blinks, mosaics, movies, RGBs, fusion, diff… (through Aladin)
    VO Table rendering, SAMP for final inspection
    Image support, Spectra not yet
    PLUS ADDITIONAL ANALYSIS USING SCRIPTS
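To show what a crossmatching step does, here is a self-contained Python sketch that pairs sources from two catalogues by angular separation. It is a brute-force illustration, not the algorithm STILTS or AstroTaverna uses; the function names, catalogue entries and matching radius are hypothetical.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))


def crossmatch(catalogue_a, catalogue_b, radius_arcsec):
    """Pair each source in A with its nearest source in B within the radius."""
    pairs = []
    for name_a, ra_a, dec_a in catalogue_a:
        best = None
        for name_b, ra_b, dec_b in catalogue_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b) * 3600
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (name_b, sep)
        if best is not None:
            pairs.append((name_a, best[0], round(best[1], 2)))
    return pairs


# Hypothetical catalogues: (name, RA in degrees, Dec in degrees).
optical = [("opt-1", 10.0, 20.0), ("opt-2", 50.0, -5.0)]
radio = [("rad-1", 10.0, 20.0005), ("rad-2", 120.0, 30.0)]
matches = crossmatch(optical, radio, radius_arcsec=5.0)
print(matches)
```

The nested loop scales as the product of the catalogue sizes, which is fine for a demonstration but is one reason real tools rely on spatial indexing.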
  41. Service discovery

  42. Data massaging

  43. Data massaging X-Matching Calculation Additions Filtering Access

  44. Data curation X-Matching Calculation Additions Filtering Access

  45. Data curation X-Matching Calculation Additions Filtering Access

  46. Data curation X-Matching Calculation Additions Filtering Access

  47. Aladin scripting

  48. Interactive data inspection

  49. Interactive data inspection

  50. Learning examples

  51. Not yet enough! FOR REPRODUCIBILITY AND REUSE

  52. Home RO at 5000 feet Examples Ontologies Tools Collaboration Publications

    History About Search Research Objects
  53. Research Objects

    Content:
      Process (workflows), data, external resources and bibliography
      Execution environment set-up and local software dependencies
      Experimental protocol followed
      Roles, types and relationships among all digital components
      Provenance of intermediate and final results
      Decomposable attribution and authoring
      Fine-grained access control and permissions
      Example datasets for demonstration, reproducibility, monitoring, etc.
    Templates:
      Placeholders to ease the aggregation process
      Completeness checking/quality assessment
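A Research Object can be pictured as a typed aggregation of resources plus checks over it. The Python sketch below illustrates two of the ideas listed above, completeness checking and decomposable attribution. It does not follow the Wf4Ever Research Object model or its ontologies: the class names, the role names in `REQUIRED_ROLES`, and the example resources are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical checklist of roles a complete Research Object should aggregate.
REQUIRED_ROLES = {"workflow", "input_data", "provenance", "bibliography"}


@dataclass
class Resource:
    uri: str
    role: str
    author: str


@dataclass
class ResearchObject:
    title: str
    resources: list = field(default_factory=list)

    def add(self, uri, role, author):
        self.resources.append(Resource(uri, role, author))

    def missing_roles(self):
        """Completeness check: required roles with no aggregated resource."""
        return REQUIRED_ROLES - {r.role for r in self.resources}

    def attribution(self):
        """Decomposable attribution: which author contributed which resources."""
        credit = {}
        for r in self.resources:
            credit.setdefault(r.author, []).append(r.uri)
        return credit


ro = ResearchObject("Example kinematical modelling experiment")
ro.add("workflows/model.t2flow", "workflow", "alice")
ro.add("data/cube.fits", "input_data", "bob")
print("missing:", sorted(ro.missing_roles()))
print("credit:", ro.attribution())
```

A template in this picture would be a Research Object pre-filled with placeholders for each required role, so the check reports what is still to be aggregated.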
  54. Research Objects: Target Audiences

    Scientists [producers] who want to share their research outcomes so that they are more reusable and reproducible, with ease of sharing and citation.
    Scientists [consumers] who want to understand, reuse, validate and further extend existing ROs.
    Publishers, who can adopt the concept and principles of Research Objects to enable the sharing of and access to the actual data and methods.
    Librarians who want to support research preservation.
  55. Semantic annotations

    Author of an annotation
    Author and co-authors of a workflow; reference link to a re-used workflow and its author
    Who has performed the execution of a workflow leading to the results provided in the RO
    Computing execution environment of the RO and local software dependencies
    Special access requirements to web services
    Datasets provider: person, webpage, survey, data release, etc.
    How long it takes to run a workflow on the full data and on the provided subsample
    The number of elements of the sample dataset over which a workflow and/or RO iterates
    Previous and subsequent workflows to be executed, as in the experimental protocol
    Research institution, country, and scientific domain of the RO
    The actual size of the RO and/or a folder
  56. Semantic model DataLink MULTI DISCIPLINARY

  57. RO data organisation

    Recommended organisation provides automatic semantics for some items
    It makes it easier for both people and machines to understand the RO
  58. ROs in Astronomy

    ADSLabs Research Objects Authors Publications Journals Objects SIMBAD Tabular data behind the plots CDS ASCL reference of used software Observing time Proposals Used facilities, surveys or missions
    NOT JUST FROM WORKFLOWS
    POTENTIAL FOR RESEARCH OBJECT INDEXING IN ADS
  59. RO Incentive

    PAPERS WITH DATA LINKS ARE CITED MORE THAN THOSE WITHOUT
    Effect of E-printing on Citation Rates in Astronomy and Physics. Edwin A. Henneken et al., 2006.
  60. RO Incentive

    PAPERS WITH DATA LINKS ARE CITED MORE THAN THOSE WITHOUT
    Effect of E-printing on Citation Rates in Astronomy and Physics. Edwin A. Henneken et al., 2006.
    NOW YOU CAN CITE DATA AND PROCESSES, TOO
  61. Roadmap

    AstroTaverna, mostly ready: you can publish workflows and packs to myExperiment from Taverna
    myExperiment, building support for ROs
    ADS will populate myExperiment with literature-ROs
    Taverna will be able to publish ROs to myExperiment
  62. Final points

    We need something like workflows to describe computations in a distributed environment
    Workflows are not enough for supporting reuse and methodology preservation
    Research Objects are meaningful associations of data, operations, and provenance, which can also be cited
    CAN EMBED COMPUTATIONS IN SCIENCE ARCHIVES
  63. Useful Links

    http://www.wf4ever-project.org
    http://www.myexperiment.org
    http://www.researchobject.org
    http://wf4ever.github.io/astrotaverna/
    http://amiga.iaa.es

  64. Thank you!