Slide 1

Slide 1 text

Yevis: System to support building a workflow registry with automated quality control
Tazro Ohta, Ph.D.
Database Center for Life Science, Japan
2022/08/31

Slide 2

Slide 2 text

Challenges in sharing workflows
"Sharing resources helps science"
Have you ever reused a public workflow? Had no issues? Really??
Issues in reusing workflows:
- Missing license
- Language syntax error
- Missing dependent materials
- Missing example input/output
- Missing test
github.com/sapporo-wes/yevis-cli

Slide 3

Slide 3 text

Workflows as processes challenge the FAIR principles by their structure, forms, versioning, executability, and reuse.
Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober (2020): FAIR Computational Workflows. Data Intelligence 2(1):108–121. https://doi.org/10.1162/dint_a_00033

Slide 4

Slide 4 text

Question: Who should take care?

Slide 5

Slide 5 text

Case: WorkflowHub
- A FAIR workflow registry (in beta)
- Asks submitters to maintain the workflow metadata
- https://workflowhub.eu

Slide 6

Slide 6 text

Case: nf-core
- A Nextflow pipeline collection curated by the community
- Asks submitters to join the community and maintain their uploads
- Not for 'bespoke' pipelines
- https://nf-co.re/

Slide 7

Slide 7 text

Trade-off: reliability vs diversity
When published workflows are maintained by:
- Submitter
  - Lower registry operating cost; higher diversity expected
  - Reliability depends on developers' expertise
- Registry
  - Better reliability expected
  - Limited coverage of workflow types

Slide 8

Slide 8 text

Minor workflows matter
Researchers need workflows specific to:
- A biological species
- An experimental method
- A laboratory instrument
- A computing environment
Smaller communities often have more difficulty building common resources.
How can we make and keep minor but valuable resources reusable?

Slide 9

Slide 9 text

Solution: lowering the cost of workflow maintenance
- Workflows are best maintained by their developers
- Domain knowledge is essential for managing contents
- BUT: developers often lack the extra time, skills, and knowledge
- Provide an assistant for developers to lower the maintenance cost

Slide 10

Slide 10 text

Yevis: A system to support building a registry
Easily build one's own registry to share workflows in a reliable manner.
- Yevis metadata: assists developers in satisfying the requirements of 'reusable with confidence'
- Automatic validation and testing with a CLI client and GitHub Actions
- Fully based on GitHub/Zenodo: no dedicated computing resources required
- Adopts the GA4GH TRS spec: promoting a distributed registry model
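Because the registry follows the GA4GH TRS spec, a client can locate a workflow descriptor by building a URL from the TRS v2 path layout. The sketch below shows only that URL construction, under stated assumptions: the base URL, workflow ID, and version are placeholders, and the helper name `trs_descriptor_url` is hypothetical, not part of yevis-cli.

```python
from urllib.parse import quote


def trs_descriptor_url(base_url, tool_id, version_id, descriptor_type="CWL"):
    """Build a TRS v2 descriptor URL.

    Follows the TRS v2 path layout:
    /tools/{id}/versions/{version_id}/{type}/descriptor
    """
    base = base_url.rstrip("/")
    # Percent-encode the IDs so that characters such as spaces or slashes
    # do not break the path.
    tool = quote(tool_id, safe="")
    version = quote(version_id, safe="")
    return f"{base}/tools/{tool}/versions/{version}/{descriptor_type}/descriptor"


# Placeholder values for illustration only; not a real registry endpoint.
url = trs_descriptor_url(
    "https://registry.example.org/ga4gh/trs/v2/", "wf 1", "1.0.0"
)
print(url)
```

Since every TRS-compliant registry shares this path layout, the same client code could in principle be pointed at any of them by changing only the base URL, which is what makes a distributed registry model practical.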

Slide 11

Slide 11 text

Proposing criteria for workflows 'reusable with confidence'

Perspective    Requirements
Availability   Main workflow description, Dependent materials, Testing materials, Open source license
Validity       Language type, Language version, Language syntax
Traceability   Authors and maintainers, Documentation, Workflow ID, Workflow metadata version
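The criteria above can be read as a checklist applied to a workflow's metadata. The following is a minimal sketch of such a check, not the validation that yevis-cli performs: the field names in `CRITERIA` and the `example` record are hypothetical and do not reflect the actual Yevis metadata schema.

```python
# Hypothetical field names grouped by the three perspectives on the slide.
# These are illustrative only; they are NOT the real Yevis metadata schema.
CRITERIA = {
    "Availability": [
        "main_workflow", "dependent_materials", "testing_materials", "license",
    ],
    "Validity": ["language_type", "language_version", "syntax_validated"],
    "Traceability": [
        "authors", "documentation", "workflow_id", "metadata_version",
    ],
}


def is_missing(value):
    """Treat None, False, and the empty string as 'not provided'.

    An empty list is accepted, since a workflow may have no dependent materials.
    """
    return value is None or value is False or value == ""


def missing_requirements(metadata):
    """Return {perspective: [missing fields]} for unmet requirements."""
    report = {}
    for perspective, fields in CRITERIA.items():
        missing = [f for f in fields if is_missing(metadata.get(f))]
        if missing:
            report[perspective] = missing
    return report


# Made-up example record that lacks a license and an author list.
example = {
    "main_workflow": "workflow.cwl",
    "dependent_materials": [],
    "testing_materials": ["test_input.json"],
    "language_type": "CWL",
    "language_version": "v1.2",
    "syntax_validated": True,
    "documentation": "README.md",
    "workflow_id": "example-id",
    "metadata_version": "1.0.0",
}

print(missing_requirements(example))
```

Grouping the missing fields by perspective keeps the report aligned with the table, so a submitter can see which of the three perspectives still has unmet requirements.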

Slide 12

Slide 12 text

The Yevis way

Slide 13

Slide 13 text

yevis-web: registry browser

Slide 14

Slide 14 text

Resources
Examples
- Registry: https://github.com/ddbj/workflow-registry
- WebUI: https://ddbj.github.io/workflow-registry-browser/
Repos
- CLI to deploy Yevis registry: https://github.com/sapporo-wes/yevis-cli
- WebUI for a Yevis registry: https://github.com/sapporo-wes/yevis-web
Document
- https://sapporo-wes.github.io/yevis-cli/getting_started
Publication
- https://www.biorxiv.org/content/10.1101/2022.07.08.499265v1