
Yevis: System to support building a workflow registry with automated quality control

ELIXIR-Cloud AAI meeting 2022-08-31

Tazro Inutano Ohta

August 31, 2022



  1. Yevis: System to support building a workflow registry with automated quality control
    Tazro Ohta, Ph.D.
    Database Center for Life Science, Japan
    2022/08/31
  2. Challenges in sharing workflows
    "Sharing resources helps science"
    Have you ever reused a public workflow? Had no issues? Really??
    Issues in reusing workflows:
      Missing license
      Language syntax error
      Missing dependent materials
      Missing example input/output
      Missing test
    github.com/sapporo-wes/yevis-cli
  3. Workflows as processes challenge the FAIR principles by their structure, forms, versioning, executability, and reuse.
    Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober (2020): FAIR Computational Workflows. Data Intelligence 2(1):108–121. https://doi.org/10.1162/dint_a_00033
  4. Case: WorkflowHub
    A FAIR workflow registry (in BETA)
    Asking submitters to maintain the workflow metadata
    https://workflowhub.eu
  5. Case: nf-core
    A Nextflow pipeline collection curated by community
    Ask submitter to join the community and maintain the uploads
    Not for 'bespoke' pipelines
    https://nf-co.re/
  6. Trade-off: reliability vs diversity
    When published workflows are maintained by:
      Submitter
        Lower registry operation cost, expect higher diversity
        Reliability depends on developers' expertise
      Registry
        Expect better reliability
        With limitation on the coverage of workflow types
  7. Minor workflows matter
    Researchers need workflows specific to:
      A biological species
      An experimental method
      A laboratory instrument
      A computing environment
    Smaller communities often have more difficulty building common resources
    How can we make and keep minor but valuable resources reusable?
  8. Solution: lowering the cost of workflow maintenance
    Workflows are best maintained by their developers
      Domain knowledge is essential for managing contents
    BUT: Developers often lack extra time, skill, and knowledge
    Provide an assistant for developers to lower the maintenance cost
  9. Yevis: A system to support building a registry
    Easily build one's own registry to share workflows in a reliable manner.
      Yevis metadata: assist developers to satisfy the requirements of 'reusable with confidence'
      Automatic validation and testing with CLI client and GitHub Actions
      Fully based on GitHub/Zenodo: no dedicated computing required
      Adapted the GA4GH TRS spec: promoting a distributed registry model
  10. Proposing criteria for workflows 'reusable with confidence'
    Perspective: Requirements
    Availability: Main workflow description, Dependent materials, Testing materials, Open source license
    Validity: Language type, Language version, Language syntax
    Traceability: Authors and maintainers, Documentation, Workflow ID, Workflow metadata version
  11. Resources
    Examples
      Registry: https://github.com/ddbj/workflow-registry
      WebUI: https://ddbj.github.io/workflow-registry-browser/
    Repos
      CLI to deploy Yevis registry: https://github.com/sapporo-wes/yevis-cli
      WebUI for a Yevis registry: https://github.com/sapporo-wes/yevis-web
    Document
      https://sapporo-wes.github.io/yevis-cli/getting_started
    Publication
      https://www.biorxiv.org/content/10.1101/2022.07.08.499265v1
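
The criteria proposed on slide 10 can be read as a checklist over a workflow's metadata. The following is a minimal sketch of such a check, not the Yevis implementation: the snake_case field names, the `CRITERIA` table, and the `missing_requirements` helper are hypothetical names chosen for illustration, and the real Yevis metadata schema may differ.

```python
# Requirements from slide 10, grouped by perspective.
# The key names are assumptions made for this sketch, not Yevis field names.
CRITERIA = {
    "availability": [
        "main_workflow",
        "dependent_materials",
        "testing_materials",
        "license",
    ],
    "validity": ["language_type", "language_version", "language_syntax_ok"],
    "traceability": [
        "authors",
        "documentation",
        "workflow_id",
        "metadata_version",
    ],
}


def missing_requirements(metadata: dict) -> dict:
    """Return, per perspective, the requirements that are absent or empty."""
    report = {}
    for perspective, keys in CRITERIA.items():
        # A key counts as missing when it is absent or holds a falsy value
        # (empty string, empty list, None, or False).
        missing = [key for key in keys if not metadata.get(key)]
        if missing:
            report[perspective] = missing
    return report


# Hypothetical metadata for a workflow with no test and no documentation.
example = {
    "main_workflow": "workflow.cwl",
    "dependent_materials": ["tool.cwl"],
    "testing_materials": [],
    "license": "Apache-2.0",
    "language_type": "CWL",
    "language_version": "v1.2",
    "language_syntax_ok": True,
    "authors": ["Example Author"],
    "workflow_id": "example-id",
    "metadata_version": "1.0.0",
}

print(missing_requirements(example))
# {'availability': ['testing_materials'], 'traceability': ['documentation']}
```

An empty report means every listed requirement is present. Checking presence is only part of the picture: the slides describe automatic validation and testing with the CLI client and GitHub Actions, which this sketch does not attempt.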