Design, develop, deploy, and expand a national cyberinfrastructure for life science research, and train scientists
Funding: National Science Foundation
Usage: More than 38K users, PBs of data, and hundreds of publications, courses, and discoveries
http://www.cyverse.org/
Work in on-demand Linux environments
• Collaborate with students and colleagues on the same instance
• Overcome usability challenges of cloud platforms
• Multicore, high-memory images to run multithreaded applications
Move your analyses from your laptop to the cloud
• Make data, workflows, and analyses available in a public image
• Access previous software versions and images
Jetstream Background
• Jetstream funded as NSF's first production cloud facility
Jetstream: A Distributed Cloud Infrastructure for Under-resourced Higher Education Communities. Fischer, Jeremy; Tuecke, Steven; Foster, Ian; Stewart, Craig A.
• Part of the NSF eXtreme Digital (XD) program and supported by XSEDE
– Small and under-resourced colleges and universities
– Provide on-demand interactive computing and analysis
– Increase effort efficiency: perceived and real ease of use
and scalable genome annotation pipeline
– De novo genome annotation
– Updating existing genome annotation
– Combining evidence with genome
• Limitations of MAKER
– Installation of MAKER is challenging and complex
– MAKER runs are not time efficient
• WQ-MAKER is a modified MAKER annotation pipeline capable of being run on distributed computing resources using Work Queue
• WQ-MAKER is configured to run on Jetstream
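The idea of distributing annotation work can be illustrated with a minimal sketch. It assumes that the work is divided into independent chunks of contigs that a master hands to workers and then merges. It uses Python's standard `concurrent.futures` thread pool as a stand-in for Work Queue, not the Work Queue API itself. The functions `split_contigs`, `annotate_chunk`, and `run_master`, as well as the contig names, are hypothetical and only illustrate the master/worker pattern.

```python
from concurrent.futures import ThreadPoolExecutor


def split_contigs(contigs, per_task):
    """Group contig IDs into chunks; each chunk is one unit of work."""
    return [contigs[i:i + per_task] for i in range(0, len(contigs), per_task)]


def annotate_chunk(chunk):
    """Hypothetical stand-in for annotating one chunk of contigs.

    A real worker would run the annotation tools here; this stub only
    maps each contig ID to a made-up output file name.
    """
    return {contig: f"{contig}.gff" for contig in chunk}


def run_master(contigs, per_task=2, workers=3):
    """Hand chunks to a pool of workers and merge their results."""
    tasks = split_contigs(contigs, per_task)
    merged = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for result in pool.map(annotate_chunk, tasks):
            merged.update(result)
    return merged


# Assumed toy input: seven contigs, two contigs per task, three workers.
contigs = [f"contig_{i}" for i in range(1, 8)]
annotations = run_master(contigs)
print(len(annotations), "contigs annotated")
```

Because each chunk is independent in this sketch, adding workers shortens the wall-clock time until the number of workers exceeds the number of chunks.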
WQ-MAKER image: Augustus, SNAP, Exonerate, BLAST, RepeatMasker, cctools, icommands, MAKER, Ansible
Scaling up genome annotation using MAKER and work queue. Andrew Thrasher, Zachary Musgrave, Brian Kachmarck, Douglas Thain, and Scott Emrich. International Journal of Bioinformatics Research and Applications 2014 10:4-5, 447-460
WQ-MAKER run times on Jetstream*

Organism | Type | Assembly | Run time | Workers | MPI | Cores
Sporobolus species A | Plant | 11789 contigs | 144 hours | 22 | Y | 6
Sporobolus species B | Plant | 6615 contigs | 108 hours | 21-35 | Y | 6
Sclerotinia homoeocarpa isolate 10 | Fungi | 231 contigs | 6 hours | 10 | N | 1
Sclerotinia homoeocarpa isolate 11 | Fungi | 257 contigs | 6 hours | 10 | N | 1
Calypte anna | Hummingbird | 265 super scaffolds | 8 hours | 10 | N | 1
Brassica rapa | Plant | 10 + 44,000 scaffolds | 4 hours | 10 | N | 1
Kochia | Plant | 19,671 scaffolds | 72 hours | 21 | N | 1

*WQ-MAKER achieves a speed-up of 45x using 50 workers on a 180 MB Caenorhabditis japonica test case on Amazon AWS
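The 45x speed-up with 50 workers quoted for the Amazon AWS test case can be read as a parallel efficiency. The short sketch below works through that arithmetic using the standard definition, efficiency = speed-up / number of workers. The function name `parallel_efficiency` is illustrative and not part of WQ-MAKER.

```python
def parallel_efficiency(speedup, workers):
    """Fraction of ideal linear scaling achieved: speed-up / workers."""
    return speedup / workers


# Figures quoted for the Caenorhabditis japonica test case on AWS.
efficiency = parallel_efficiency(speedup=45, workers=50)
print(f"Parallel efficiency: {efficiency:.0%}")
```

By this measure, each of the 50 workers contributed on average about nine tenths of what it would under perfect linear scaling.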
instances/workers
• Ansible to automate Jetstream VM/instance management
• R Shiny app integration to estimate the amount of computational resources needed
• JBrowse to visualize the genome annotations
• Thorough testing of WQ-MAKER with the MPI option
• Manuscript for WQ-MAKER on Jetstream