in scale
Massive increases in throughput
Massive increases in performance
Researchers need it all, and all need it now!
Computational growth is not going away
We need new systems, teams and methods to effectively support our scholarly and scientific research
Bioinformatics Institute
Inpharmatica
Wellcome Trust Sanger Institute
Whitehead Genome Center
Broad Institute of MIT and Harvard
Harvard University
Cycle Computing
From centralized to decentralized, collaborative to independent and right back again!
[Timeline figure]
The 10’s
Mainframes; VAX; The PC; Beowulf Clusters; Central Clusters
Centers provide access to compute
The supercomputing famine, funding gap
Individual computing
Computing is too big to fit under desk, Linux explodes
Clouds/VMware: IaaS, SaaS, PaaS
SHARING: 100%, 60%, 0%, 40%, ???%
~0 Mbit, ~1 Mbit, ~10 Mbit, ~1000 Mbit, ~10,000 Mbit
Bigger, better but further and further away from the scientist’s lab
data access across Utility HPC
NIMBUS Discovery: 12 years of compute in 3 hours, $20M of infrastructure for < $3,000
Big 10 Pharma: Built 10,600 server cluster ($44M) in 2 hours, 40 years of compute in 11 hours for $4,372
Genomics Research Institute: 1 million hours or 115 years of compute in 1 week for $19,555
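The figures quoted for the Genomics Research Institute run can be sanity-checked with some quick arithmetic. The sketch below is illustrative only. It assumes that "hours" means core-hours and that a year is 365 days, neither of which the slide states, and the function names are made up for this example.

```python
# Rough arithmetic on the quoted Genomics Research Institute run.
# Assumptions (not stated on the slide): "hours" are core-hours,
# and one compute-year is 24 * 365 hours.
HOURS_PER_YEAR = 24 * 365


def compute_years(core_hours: float) -> float:
    """Convert core-hours into compute-years."""
    return core_hours / HOURS_PER_YEAR


def cost_per_core_hour(total_cost_usd: float, core_hours: float) -> float:
    """Average price paid for one core-hour."""
    return total_cost_usd / core_hours


if __name__ == "__main__":
    core_hours = 1_000_000   # "1 million hours" from the slide
    total_cost = 19_555      # "$19,555" from the slide
    print(f"{compute_years(core_hours):.1f} compute-years")
    print(f"${cost_per_core_hour(total_cost, core_hours):.4f} per core-hour")
```

Under these assumptions the run comes to about 114 compute-years, close to the 115 quoted, at roughly two cents per core-hour.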
for molecular modeling
Solution
30,000 CPU run across US/EU Cloud (AWS)
10 years of compute in 8 hours for $10,000
Found 3 compounds now in the wetlab!
Run a comparison of 78TB stem cell RNA samples to build a unique gene expression database
Make it easier to replicate disease in petri dishes w/induced stem cells
Solution
Enable massive RNAseq run using BowTie that was impossible before
GPU-Years of computing in 1.5 months for $150,000 vs. 5 months of CPU for $450,000
3x the science, ¼ the cost
[Diagram labels: Local Data; Corporate Firewall; Secure HPC Cluster; 8 TB FS; External Cloud; 128 GPU cluster; Scheduled Data; Drug designer]