[Diagram labels: University; Locally Developed Software; Publicly Available Software; Local storage and compute resources; Network Download; Public Data] Source: https://www.genome.gov/multimedia/slides/tcga4/23_davidsen.pdf
[Diagram labels: Access; Core Data (TCGA); User Data; Computational Capacity; Standard tools; User uploaded tools] Source: https://www.genome.gov/multimedia/slides/tcga4/23_davidsen.pdf
Downloading the data is no longer necessary, so anyone can access large-scale genome data!
The goals of the NCI Cloud Pilots are to democratize access to NCI-generated genomic and related data, and to create a cost-effective way to provide scalable computational capacity to the cancer research community.

Institute for Systems Biology (www.isb-cgc.org): The Institute for Systems Biology (ISB) Cloud provides interactive and programmatic access to data, leveraging many aspects of the Google Cloud Platform. The interactive ISB-CGC web-app allows scientists to define and compare cohorts, examine underlying molecular data for specific genes or pathways of interest, and share insights with collaborators. For computational users, programmatic interfaces and GCP tools such as BigQuery, Genomics, and Compute Engine allow users to perform complex queries from R or Python scripts, or run Dockerized workflows on sequence data available in cloud storage.

Seven Bridges Genomics (www.cancergenomicscloud.org): [description cropped in the slide; legible fragments mention secure, reproducible analysis, a rich query system, private data, and "Common Wo…"]

Broad Institute (www.firecloud.org): [description cropped in the slide; legible fragments mention Firehose, a scalable platform, workspaces, and "Google Clou…"]
Peer review in the cloud

The migration of cancer genomics data to cloud computing is a great encouragement for data reuse and integration by bioinformaticians and other data symbionts. Because the cloud allows rapid, transparent and reproducible research on large data sets, we are keen to consider articles and analyses submitted to the journal that provide peer referee access to their constituent cloud projects.

Recent funding initiatives to improve cancer diagnosis and treatment have been likened to a ‘moonshot’ (Nat. Biotechnol. 34, 119, 2016). Although we do not think that the metaphor of a single engineering feat to achieve a defined goal is entirely appropriate to the aim of controlling cancers, the cloud computing infrastructure for the upcoming Genomic Data Commons (https://gdc.nci.nih.gov/index.html) and the three recently launched cancer cloud pilots (https://cbiit.nci.nih.gov/ncip/nci-cancer-genomics-cloud-pilots) is very much equivalent to building Mission Control to coordinate multifaceted and coherent programs. Not only does a cloud commons give broad access to petabyte data sets, which are beyond the capacity of many research institutes to even download, […]

[…] the cloud, as we see enormous potential for cloud commons research to improve the precision, transparency and reproducibility of research publications that provide periodic key results from and updated guides to the continuous knowledge production within the data commons. The publications also provide incentive and credit within the wider scientific community, above and beyond the reputation researchers can gain for coding and data deposition within their own commons. In the interest of refining the idea of a publishable unit and using expert review judiciously, some new peer refereeing conventions, tools and cloud pilots are therefore a priority. Unlike supplementary data summaries and disparate data resources, […]
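Peer review in the cloud era
• In a cloud environment, the analysis in a paper can be fully packaged, including its hardware, OS, software, and version dependencies. Reviewers can then verify reproducibility themselves.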
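As a rough illustration of the packaging idea above, the sketch below records the machine type, OS, interpreter, and package versions of an analysis environment so that a reviewer could compare them against their own. It is only an assumption about how such a check might look: the function names `environment_manifest` and `differing_keys` are hypothetical, and a real cloud workflow would more likely ship a full container image (for example the Dockerized workflows mentioned earlier) than a manifest.

```python
import platform
from importlib import metadata


def environment_manifest(packages):
    """Collect machine, OS, interpreter, and package versions into a dict.

    `packages` is a list of distribution names chosen by the author of the
    analysis (a hypothetical input, not something the slides specify).
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return {
        "machine": platform.machine(),
        "os": platform.system(),
        "os_release": platform.release(),
        "python": platform.python_version(),
        "packages": versions,
    }


def differing_keys(author, reviewer):
    """Return the sorted top-level keys on which two manifests disagree."""
    keys = author.keys() | reviewer.keys()
    return sorted(k for k in keys if author.get(k) != reviewer.get(k))


if __name__ == "__main__":
    manifest = environment_manifest(["pip"])
    print(manifest)
    # An identical environment yields no differences.
    print(differing_keys(manifest, manifest))
```

An author could publish the manifest alongside the analysis, and a reviewer running the same function in the shared cloud project would expect `differing_keys` to return an empty list.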