The Data Analysis Gap
[Chart: data volume, 1990 to 2020. Series: Enterprise Data (generated data) and Data in Warehouse (available for analysis).]
Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011
IDC: Worldwide Business Analytics Software 2012–2016 Forecast and 2011 Vendor Shares
READS, WRITES, UPDATES
AMAZON DYNAMODB
Item-level transactions only.
Conditional and atomic updates.
Counts. Top/bottom n values.
Results paged to 1 MB in size.
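Because results are paged at 1 MB, a single Query or Scan call may return only part of the result set. The following is a minimal sketch of the paging loop, assuming a callable whose response has the shape of the DynamoDB Query API: an `Items` list plus an optional `LastEvaluatedKey`, continued by passing `ExclusiveStartKey` on the next call. The names `fetch_all` and `fake_query` are hypothetical and exist only for illustration; `fake_query` is an in-memory stand-in, not a real client.

```python
from typing import Any, Callable, Dict, Iterator


def fetch_all(query: Callable[..., Dict[str, Any]], **kwargs: Any) -> Iterator[Dict[str, Any]]:
    """Yield every item, following LastEvaluatedKey across pages."""
    while True:
        page = query(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if last_key is None:
            return  # no more pages
        # Resume the next call where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


def fake_query(ExclusiveStartKey=None, page_size=2, **_):
    """In-memory stand-in for a paged query over ids 0..4 (illustration only)."""
    data = [{"id": i} for i in range(5)]
    start = 0 if ExclusiveStartKey is None else ExclusiveStartKey["id"] + 1
    chunk = data[start:start + page_size]
    response = {"Items": chunk}
    if start + page_size < len(data):
        # More data remains, so report where this page ended.
        response["LastEvaluatedKey"] = chunk[-1]
    return response


if __name__ == "__main__":
    print([item["id"] for item in fetch_all(fake_query)])
```

With a real client, the same loop would presumably be driven by the client's query method in place of `fake_query`; that substitution is an assumption and is not shown in the slides.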
Slide 40
THROUGHPUT
Provisioned
Slide 41
PROVISIONED THROUGHPUT
AMAZON DYNAMODB
Provision the IO your application needs.
Pay per unit of provisioned capacity.
Consistent, predictable performance, irrespective of scale.
Designed for uniform workloads.
Slide 42
YOUR APP
DYNAMODB
Slide 43
YOUR APP
DYNAMODB
READ THROUGHPUT
Slide 44
READ THROUGHPUT
AMAZON DYNAMODB
IO per 4 KB item.
Strong and eventual consistency.
Mix and match consistency.
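The read-throughput points above lend themselves to a small sizing calculation. This is a rough sketch, assuming reads are metered in 4 KB blocks (as the slide states) and that an eventually consistent read costs half as much as a strongly consistent one; the function name `read_capacity_units` is hypothetical and the result is an estimate for a uniform workload, not a billing formula taken from the slides.

```python
import math


def read_capacity_units(item_size_bytes: int, reads_per_second: int,
                        strongly_consistent: bool = True) -> int:
    """Estimate the read capacity to provision for a uniform workload.

    Assumptions: each read consumes one unit per 4 KB block (rounded up),
    and eventually consistent reads cost half of strongly consistent ones.
    """
    blocks_per_read = max(1, math.ceil(item_size_bytes / 4096))
    units = blocks_per_read * reads_per_second
    if not strongly_consistent:
        units = math.ceil(units / 2)
    return units


if __name__ == "__main__":
    # A 6,000-byte item spans two 4 KB blocks.
    print(read_capacity_units(6000, 10))
    print(read_capacity_units(6000, 10, strongly_consistent=False))
```

Mixing consistency levels then amounts to summing the two estimates for the strongly and eventually consistent portions of the read traffic.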
Slide 45
YOUR APP
DYNAMODB
READ THROUGHPUT WRITE THROUGHPUT
Slide 46
WRITE THROUGHPUT
AMAZON DYNAMODB
IO per 1 KB item.
Atomic increment and decrement.
Optimistic concurrency control.
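Optimistic concurrency control can be illustrated without a database: read an item together with its version, modify it, and write it back only if the stored version is unchanged. The sketch below uses a toy in-memory store; `VersionedStore`, `put_if_version`, `increment_with_retry`, and the `version` attribute are all hypothetical names chosen for illustration and are not DynamoDB APIs. In DynamoDB the equivalent check would be expressed as a conditional write.

```python
class ConditionFailed(Exception):
    """Raised when the stored version no longer matches the expected one."""


class VersionedStore:
    """Toy in-memory table; each item carries a 'version' attribute."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        # Return a copy so callers cannot mutate stored state directly.
        return dict(self._items.get(key, {"version": 0}))

    def put_if_version(self, key, item, expected_version):
        current = self._items.get(key, {"version": 0})["version"]
        if current != expected_version:
            raise ConditionFailed(key)
        # The write succeeds and bumps the version for the next writer.
        self._items[key] = {**item, "version": expected_version + 1}


def increment_with_retry(store, key, attempts=3):
    """Read, modify, and conditionally write; retry if another writer won."""
    for _ in range(attempts):
        item = store.get(key)
        updated = {**item, "count": item.get("count", 0) + 1}
        try:
            store.put_if_version(key, updated, item["version"])
            return updated["count"]
        except ConditionFailed:
            continue  # someone else wrote first; re-read and try again
    raise ConditionFailed(key)


if __name__ == "__main__":
    store = VersionedStore()
    print(increment_with_retry(store, "page-views"))
    print(increment_with_retry(store, "page-views"))
```

A writer holding a stale version is rejected rather than silently overwriting newer data, which is the property the slide refers to.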
Slide 47
YOUR APP
DYNAMODB
READ THROUGHPUT WRITE THROUGHPUT
COLUMNAR STORE
REDSHIFT
Designed for columnar access.
Automatic data compression.
Large block size.
Best practices for data loading.
Continual incremental backup to S3.
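To make "columnar access" and "data compression" concrete, here is a small illustrative sketch in plain Python. It is not how Redshift is implemented; the sample rows, the column names, and the choice of run-length encoding as the compression scheme are assumptions made purely for the example.

```python
from itertools import groupby

# Hypothetical sample data, stored row by row.
rows = [
    {"region": "EU", "status": "ok", "amount": 10},
    {"region": "EU", "status": "ok", "amount": 20},
    {"region": "US", "status": "ok", "amount": 5},
    {"region": "US", "status": "fail", "amount": 7},
]

# Column layout: one contiguous list per column instead of one record per row.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# An aggregate over one column reads only that column's values.
total_amount = sum(columns["amount"])

# Values within a column are often repetitive, so they compress well.
encoded_region = run_length_encode(columns["region"])

if __name__ == "__main__":
    print(total_amount)
    print(encoded_region)
```

The point of the layout is that a query such as a sum over `amount` never has to read `region` or `status`, and low-cardinality columns shrink considerably once encoded.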
Amazon S3
http://www.youtube.com/watch?v=oGcZ7WVx6EI
Legacy data warehousing
Cassandra Aegisthus Hadoop, Hive, Pig
MicroStrategy
Sting
R
Slide 115
98% time saved for clinical trial simulations

                                                       Internal System   AWS
Individual clinical trial simulation run time (min)    56                56
Total number of clinical trial simulations             2000              2000
No. of servers                                         2                 256
No. of CPUs                                            32                2048
Total analysis run time (hr)                           60                1.2
Cost                                                   ??                $336
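The 98% figure follows directly from the two reported run times. The sketch below reproduces it and, as a sanity check, compares the reported times with an idealized estimate. The idealized model (each CPU runs one simulation at a time, with no scheduling or startup overhead) is an assumption of this example, not something stated on the slide.

```python
simulations = 2000
minutes_per_simulation = 56


def ideal_hours(cpus):
    """Wall-clock hours if work divides evenly across CPUs with no overhead."""
    return simulations * minutes_per_simulation / cpus / 60


# Run times as reported on the slide, in hours.
internal_reported_hr = 60
aws_reported_hr = 1.2

time_saved = 1 - aws_reported_hr / internal_reported_hr

if __name__ == "__main__":
    print(f"ideal internal: {ideal_hours(32):.1f} h")
    print(f"ideal AWS: {ideal_hours(2048):.2f} h")
    print(f"time saved: {time_saved:.0%}")
```

Under that assumption the idealized estimates come to roughly 58.3 hours on 32 CPUs and 0.91 hours on 2048 CPUs, close to the reported 60 and 1.2 hours, and the saving works out to 98%.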
Slide 116
Reduced burden on pediatric subjects

                                   Traditional Design   Design Optimized Using Clinical Trial Simulation
# of subjects                      60                   40
# of blood samples per subject     12                   5
Length of stay per subject         72 hours             26 hours
Length of study                    2.5 years            1.7 years
Total study cost                   $700K                $250K

Length and cost projected based on historical data in pediatric subjects.