ELIXIR-Cloud AAI meeting 2022-08-31
Yevis: System to support building a workflow registry
with automated quality control
Tazro Ohta, Ph.D.
Database Center for Life Science, Japan
Challenges in sharing workflows
"Sharing resources helps science"
Have you ever reused a public workflow?
Did it run without issues?
Issues in reusing workflows
Language syntax error
Missing dependent materials
Missing example input/output
Workflows as processes challenge the FAIR principles by their structure, forms,
versioning, executability, and reuse.
Carole Goble, Sarah Cohen-Boulakia, Stian Soiland-Reyes, Daniel Garijo, Yolanda Gil, Michael R. Crusoe, Kristian Peters, Daniel Schober (2020): FAIR
Computational Workflows. Data Intelligence 2(1):108–121 https://doi.org/10.1162/dint_a_00033
Question: Who should take care?
A FAIR workflow registry (in BETA)
Asking submitters to maintain the workflow metadata
A Nextflow pipeline collection curated by community
Asks submitters to join the community and maintain their uploads
Not for 'bespoke' pipelines
Trade-off: reliability vs diversity
When published workflows are maintained by:
Their developers: lower registry operation cost, expect higher diversity
Reliability depends on developers' expertise
A curating community: expect better reliability
But with limited coverage of workflow types
Minor workflows matter
Researchers need workflows specific to
A biological species
An experimental method
A laboratory instrument
A computing environment
Smaller communities often have more difficulty building common resources
How can we make and keep minor but valuable resources reusable?
Solution: lowering the cost of workflow maintenance
Workflows are best maintained by their developers
Domain knowledge is essential for managing contents
BUT: Developers often lack the spare time, skills, or knowledge
Provide an assistant for developers to lower the maintenance cost
Yevis: A system to support building a registry
Easily build one's own registry to share workflows in a reliable manner.
Yevis metadata: assists developers in satisfying the requirements of 'reusable with
Automatic validation and testing with CLI client and GitHub Actions
Fully based on GitHub/Zenodo: no dedicated computing required
Adapted the GA4GH TRS spec: promoting a distributed registry model
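To illustrate why a shared API spec supports a distributed registry model, here is a minimal sketch that builds GA4GH TRS v2 request URLs for several independently hosted registries. It assumes each registry serves the API under the conventional `/ga4gh/trs/v2` prefix, which a given deployment may not do. The helper names and the example registry URLs are hypothetical and not part of Yevis.

```python
from urllib.parse import quote

# Assumption: the registry exposes TRS under the conventional prefix.
TRS_PREFIX = "/ga4gh/trs/v2"


def trs_url(base_url: str, *segments: str) -> str:
    """Join percent-encoded path segments onto a registry's TRS base URL."""
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"{base_url.rstrip('/')}{TRS_PREFIX}/{path}"


def tool_url(base_url: str, tool_id: str) -> str:
    """URL of the TRS 'tools/{id}' endpoint for one workflow."""
    return trs_url(base_url, "tools", tool_id)


def descriptor_url(base_url: str, tool_id: str, version_id: str,
                   descriptor_type: str) -> str:
    """URL of the workflow descriptor for a given version and language type."""
    return trs_url(base_url, "tools", tool_id, "versions", version_id,
                   descriptor_type, "descriptor")


def fan_out(registries: dict[str, str], tool_id: str) -> dict[str, str]:
    """Build the same lookup against every registry (hypothetical helper)."""
    return {name: tool_url(base, tool_id) for name, base in registries.items()}


if __name__ == "__main__":
    # Hypothetical registry base URLs, for illustration only.
    registries = {
        "lab-a": "https://registry-a.example.org",
        "lab-b": "https://registry-b.example.org/",
    }
    for name, url in fan_out(registries, "my-workflow").items():
        print(name, url)
```

Because every registry answers the same paths, a client only needs a list of base URLs to search across small, independently maintained registries.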
Proposing criteria for workflows 'reusable with
Availability: Main workflow description, Dependent materials, Testing materials, Open
Validity: Language type, Language version, Language syntax
Traceability: Authors and maintainers, Documentation, Workflow ID, Workflow
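The criteria above can be read as a checklist over a workflow's metadata. The following is a minimal sketch of such a check, not the actual Yevis schema or implementation: the field names are hypothetical, only the criteria fully legible on the slide are covered, and the language syntax check is omitted because it needs a language-specific tool.

```python
# Hypothetical field names grouped by the three criteria categories.
CRITERIA = {
    "Availability": ["main_workflow", "dependent_materials", "testing_materials"],
    "Validity": ["language_type", "language_version"],
    "Traceability": ["authors", "documentation", "workflow_id"],
}


def missing_criteria(metadata: dict) -> dict[str, list[str]]:
    """Return, per category, the fields that are absent, None, or an empty string.

    An empty list counts as present, since a workflow may have no
    dependent materials at all.
    """
    report = {}
    for category, fields in CRITERIA.items():
        missing = [
            field for field in fields
            if metadata.get(field) is None or metadata.get(field) == ""
        ]
        if missing:
            report[category] = missing
    return report


if __name__ == "__main__":
    # Illustrative metadata only; values are made up.
    draft = {
        "main_workflow": "workflow.cwl",
        "dependent_materials": [],
        "language_type": "CWL",
        "authors": ["A. Developer"],
    }
    for category, fields in missing_criteria(draft).items():
        print(f"{category}: missing {', '.join(fields)}")
```

A check like this is cheap enough to run on every submission, which is the point of automating it rather than relying on a reviewer to spot the gaps.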
The Yevis way
yevis-web: registry browser
CLI to deploy Yevis registry: https://github.com/sapporo-wes/yevis-cli
WebUI for a Yevis registry: https://github.com/sapporo-wes/yevis-web