Package and automate
Amazon Machine Images,
VM import
Slide 67
Slide 67 text
Package and automate
Amazon Machine Images,
VM import
Deployment scripts,
CloudFormation, Chef, Puppet
Slide 68
Slide 68 text
Expert-as-a-service
Slide 69
Slide 69 text
No content
Slide 70
Slide 70 text
No content
Slide 71
Slide 71 text
1000 Genomes
Cloud BioLinux
Slide 72
Slide 72 text
No content
Slide 73
Slide 73 text
Your HiSeq data
Illumina BaseSpace
Slide 74
Slide 74 text
Architectural freedom
Slide 75
Slide 75 text
Freedom of abstraction
Slide 76
Slide 76 text
3. Reuse is as important as reproduction
5 PRINCIPLES OF REPRODUCIBILITY
Slide 77
Slide 77 text
Seven Deadly Sins of Bioinformatics: http://www.slideshare.net/dullhunk/the-seven-deadly-sins-of-bioinformatics
Slide 78
Slide 78 text
Seven Deadly Sins of Bioinformatics: http://www.slideshare.net/dullhunk/the-seven-deadly-sins-of-bioinformatics
Slide 79
Slide 79 text
Infonauts are hackers
Slide 80
Slide 80 text
They have their own way of
working
Slide 81
Slide 81 text
The ‘Big Red Button’
Slide 82
Slide 82 text
Fire-and-forget reproduction
is a good first step, but it limits
longer-term value.
Slide 83
Slide 83 text
Monolithic, one-stop-shop
Slide 84
Slide 84 text
Work well for intended purpose
Slide 85
Slide 85 text
Challenging to install,
dependency heavy
Slide 86
Slide 86 text
Difficult to grok
Slide 87
Slide 87 text
Inflexible
Slide 88
Slide 88 text
Infonauts are hackers:
embrace it.
Slide 89
Slide 89 text
Small things. Loosely coupled.
Slide 90
Slide 90 text
Easier to grok
Slide 91
Slide 91 text
Easier to reuse
Slide 92
Slide 92 text
Easier to integrate
Slide 93
Slide 93 text
Lower barrier to entry
Slide 94
Slide 94 text
Scale out
Slide 95
Slide 95 text
Build for reuse.
Be remix friendly.
Maximize value.
Slide 96
Slide 96 text
4. Build for collaboration
5 PRINCIPLES OF REPRODUCIBILITY
Slide 97
Slide 97 text
Workflows are memes
Slide 98
Slide 98 text
Reproduction is just the first step
Slide 99
Slide 99 text
Bill of materials:
code, data, configuration,
infrastructure
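The bill of materials above can be sketched as a small data structure. This is only an illustration, assuming Python; the class name, field layout, and every identifier value below are hypothetical placeholders, not taken from the talk.

```python
from dataclasses import dataclass, asdict
from typing import Dict, List
import hashlib
import json


@dataclass(frozen=True)
class BillOfMaterials:
    """Hypothetical record of everything needed to reproduce one analysis."""
    code: str                      # e.g. a source revision identifier
    data: List[str]                # identifiers of the input datasets
    configuration: Dict[str, str]  # parameters the analysis was run with
    infrastructure: str            # e.g. a machine image identifier

    def fingerprint(self) -> str:
        # Serialize with sorted keys so equal contents give equal digests.
        payload = json.dumps(asdict(self), sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# All values are made-up placeholders for illustration.
bom = BillOfMaterials(
    code="rev-abc123",
    data=["custom-dataset-1", "public-dataset-1"],
    configuration={"threads": "8", "reference": "placeholder-build"},
    infrastructure="image-placeholder",
)
print(bom.fingerprint())
```

Two runs that share a fingerprint were described by the same code, data, configuration, and infrastructure; a changed fingerprint signals that at least one of the four differs.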
Slide 100
Slide 100 text
Full definition for reproduction
Slide 101
Slide 101 text
Utility computing provides a
playground for bioinformatics
Slide 102
Slide 102 text
Code + AMI +
custom datasets + public datasets +
databases + compute + result data
Slide 103
Slide 103 text
Code + AMI +
custom datasets + public datasets +
databases + compute + result data
Slide 104
Slide 104 text
Code + AMI +
custom datasets + public datasets +
databases + compute + result data
Slide 105
Slide 105 text
Code + AMI +
custom datasets + public datasets +
databases + compute + result data
Slide 106
Slide 106 text
Package, automate, contribute.
Slide 107
Slide 107 text
Utility platform provides
scale for production runs
Slide 108
Slide 108 text
Drug discovery on 50k cores:
Less than $1000
Slide 109
Slide 109 text
5. Provenance is a first class object
5 PRINCIPLES OF REPRODUCIBILITY
Slide 110
Slide 110 text
Versioning becomes really important
Slide 111
Slide 111 text
Especially in an active community
Slide 112
Slide 112 text
Doubly so with loosely coupled tools
Slide 113
Slide 113 text
Provenance metadata is a
first class entity
Slide 114
Slide 114 text
Distributed provenance
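One way to read "provenance as a first class entity" with loosely coupled, versioned tools is that every result carries its own history. The sketch below is a minimal illustration under that assumption, written in Python; the names `ProvenanceRecord`, `TrackedResult`, and `run_step`, and the toy tools and version strings, are hypothetical and not from the talk.

```python
from dataclasses import dataclass, field
from typing import Callable, List
import hashlib


def digest(content: bytes) -> str:
    """Content hash used to identify inputs and outputs."""
    return hashlib.sha256(content).hexdigest()


@dataclass
class ProvenanceRecord:
    tool: str                 # name of the tool that ran
    version: str              # version of that tool
    input_digests: List[str]  # hashes of the inputs it consumed
    output_digest: str        # hash of the output it produced


@dataclass
class TrackedResult:
    content: bytes
    history: List[ProvenanceRecord] = field(default_factory=list)


def run_step(tool: str, version: str,
             func: Callable[..., bytes],
             *inputs: TrackedResult) -> TrackedResult:
    """Run one tool and attach a provenance record to its output."""
    output = func(*[item.content for item in inputs])
    record = ProvenanceRecord(
        tool=tool,
        version=version,
        input_digests=[digest(item.content) for item in inputs],
        output_digest=digest(output),
    )
    # The output inherits the history of its inputs, plus this step.
    inherited = [rec for item in inputs for rec in item.history]
    return TrackedResult(output, inherited + [record])


# Toy pipeline: two small, independent tools chained together.
raw = TrackedResult(b"acgt")
upper = run_step("uppercase", "1.0", lambda b: b.upper(), raw)
final = run_step("reverse", "0.2", lambda b: b[::-1], upper)

for rec in final.history:
    print(rec.tool, rec.version, rec.output_digest[:12])
```

Because the history travels with the data rather than living in one central log, each small tool only needs to record its own step.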
Slide 115
Slide 115 text
1. Data has gravity
2. Ease of use is a prerequisite
3. Reuse is as important as reproduction
4. Build for collaboration
5. Provenance is a first class object
5 PRINCIPLES OF REPRODUCIBILITY