Yevis: System to support building a workflow registry with automated quality control
Tazro Ohta, Ph.D.
Database Center for Life Science, Japan
2022/08/31
Challenges in sharing workflows
"Sharing resources helps science"
Have you ever reused a public workflow? Had no issues? Really??
Issues in reusing workflows:
- Missing license
- Language syntax error
- Missing dependent materials
- Missing example input/output
- Missing test
github.com/sapporo-wes/yevis-cli
Workflows as processes challenge the FAIR principles by their structure, forms, versioning, executability, and reuse.
Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober (2020): FAIR Computational Workflows. Data Intelligence 2(1):108–121 https://doi.org/10.1162/dint_a_00033
Case: nf-core
- A Nextflow pipeline collection curated by the community
- Asks submitters to join the community and maintain their uploads
- Not for 'bespoke' pipelines
https://nf-co.re/
Trade-off: reliability vs diversity
When published workflows are maintained by:
- The submitter
  - Lower registry operation cost; higher diversity can be expected
  - Reliability depends on the developers' expertise
- The registry
  - Better reliability can be expected
  - Coverage of workflow types is limited
Minor workflows matter
Researchers need workflows specific to:
- A biological species
- An experimental method
- A laboratory instrument
- A computing environment
Smaller communities often have more difficulty building common resources.
How can we make and keep minor but valuable resources reusable?
Solution: lowering the cost of workflow maintenance
- Workflows are best maintained by their developers
  - Domain knowledge is essential for managing the contents
- BUT: developers often lack the extra time, skills, or knowledge
- Provide an assistant for developers to lower the maintenance cost
Yevis: A system to support building a registry
Easily build one's own registry to share workflows in a reliable manner.
- Yevis metadata: assists developers in satisfying the requirements of 'reusable with confidence'
- Automatic validation and testing with a CLI client and GitHub Actions
- Fully based on GitHub/Zenodo: no dedicated computing resources required
- Adopts the GA4GH TRS spec: promotes a distributed registry model
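To illustrate what a TRS-based registry offers a client, the sketch below builds the URL of a workflow descriptor following the path layout of the GA4GH Tool Registry Service v2 API (`/tools/{id}/versions/{version_id}/{type}/descriptor`). The base URL, workflow ID, and version shown here are hypothetical placeholders, not values from an actual Yevis registry.

```python
from urllib.parse import quote


def trs_descriptor_url(base_url: str, tool_id: str, version_id: str,
                       descriptor_type: str) -> str:
    """Build a TRS v2 descriptor URL for one version of a workflow.

    base_url is assumed to point at the root of a TRS-compatible API.
    Each path segment is percent-encoded so that IDs containing '/'
    do not break the path.
    """
    segments = [quote(s, safe="") for s in (tool_id, version_id, descriptor_type)]
    return "{}/tools/{}/versions/{}/{}/descriptor".format(
        base_url.rstrip("/"), *segments
    )


# Hypothetical registry location and workflow identifiers, for illustration only.
url = trs_descriptor_url("https://example.org/trs/", "wf-1", "1.0.0", "CWL")
print(url)  # https://example.org/trs/tools/wf-1/versions/1.0.0/CWL/descriptor
```

Because every registry that follows the same specification exposes the same paths, a client written this way could query several independently hosted registries, which is the point of the distributed model.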
Proposing criteria for workflows 'reusable with confidence'

Perspective  | Requirements
-------------|-------------------------------------------------------------------------------------
Availability | Main workflow description, Dependent materials, Testing materials, Open source license
Validity     | Language type, Language version, Language syntax
Traceability | Authors and maintainers, Documentation, Workflow ID, Workflow metadata version
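The criteria above can be read as a checklist over a workflow's metadata. The following is a minimal sketch of such a check, assuming a flat dictionary of metadata; the field names are hypothetical and chosen only to mirror the table, and they are not the actual Yevis metadata schema.

```python
# Hypothetical field names mirroring the criteria table; not the real Yevis schema.
CRITERIA = {
    "Availability": [
        "main_workflow_description",
        "dependent_materials",
        "testing_materials",
        "open_source_license",
    ],
    "Validity": ["language_type", "language_version", "language_syntax_checked"],
    "Traceability": [
        "authors_and_maintainers",
        "documentation",
        "workflow_id",
        "metadata_version",
    ],
}


def missing_requirements(metadata: dict) -> dict:
    """Return, per perspective, the required fields that are absent or empty."""
    missing = {}
    for perspective, fields in CRITERIA.items():
        absent = [name for name in fields if not metadata.get(name)]
        if absent:
            missing[perspective] = absent
    return missing


# Illustrative metadata with no license and no test materials.
example = {
    "main_workflow_description": "workflow.cwl",
    "dependent_materials": ["tool.cwl"],
    "language_type": "CWL",
    "language_version": "v1.2",
    "language_syntax_checked": True,
    "authors_and_maintainers": ["A. Developer"],
    "documentation": "README.md",
    "workflow_id": "wf-1",
    "metadata_version": "1.0.0",
}
print(missing_requirements(example))
# {'Availability': ['testing_materials', 'open_source_license']}
```

An empty result means every requirement in the table has a value; it says nothing about whether those values are correct, which is what automated testing is for.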