Time series data is the worst and
best use case in distributed
databases
Paul Dix
CEO @InfluxDB
@pauldix
paul@influxdb.com
Slide 2
What is time series data?
Slide 3
Stock trades and quotes
Slide 4
Metrics
Slide 5
Analytics
Slide 6
Events
Slide 7
Sensor data
Slide 8
Two kinds of time series data…
Slide 9
Regular time series
t0 t1 t2 t3 t4 t5 t6 t7
Samples at regular intervals
Slide 10
Irregular time series
t0 t1 t2 t3 t4 t6 t7
Events whenever they come in
Slide 11
Inducing a regular time series
from an irregular one
query: select count(customer_id) from events
where time > now() - 1h
group by time(1m), customer_id
Slide 12
Data that you ask questions
about over time
Slide 13
No content
Slide 14
1. Databases
Slide 15
2. Distributed Systems
Slide 16
Access properties suck for
databases
Slide 17
High write throughput
Slide 18
Example from DevOps
• 2,000 servers, VMs, containers, or sensor units
• 200 measurements per server/unit
• every 10 seconds
• = 3,456,000,000 distinct points per day
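The arithmetic behind that total, as a small Go sketch (the function names are made up for illustration): 86,400 seconds per day divided by a 10-second interval gives 8,640 samples per measurement per day, and 2,000 × 200 × 8,640 = 3,456,000,000. The same inputs work out to a sustained 40,000 points per second.

```go
package main

// PointsPerDay returns how many distinct points a fleet produces per day
// when every unit reports every measurement once per interval.
func PointsPerDay(units, measurementsPerUnit, intervalSeconds int64) int64 {
	const secondsPerDay = 24 * 60 * 60
	samplesPerDay := secondsPerDay / intervalSeconds
	return units * measurementsPerUnit * samplesPerDay
}

// PointsPerSecond returns the sustained write rate for the same inputs.
func PointsPerSecond(units, measurementsPerUnit, intervalSeconds int64) int64 {
	return units * measurementsPerUnit / intervalSeconds
}
```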
Slide 19
Use LSM Tree, optimized for
writes!
Slide 20
Even higher read throughput
Slide 21
Aggregation and downsampling
Slide 22
Queries for dashboards
Slide 23
Queries for monitoring systems
Slide 24
LSM Tree optimized for writes
Slide 25
Use COW B+Tree, it’s optimized
for reads!
Slide 26
Write throughput goes to hell
Slide 27
No compression
Slide 28
Large scale deletes
Slide 29
Aggregate, down-sample and
phase out raw data
Slide 30
If clearing out point-by-point
# of deletes = # of writes
Slide 31
LSM Tree deletes are wildly
expensive
Slide 32
COW B+Tree deletes expensive
if we want to reclaim disk
Slide 33
No perfect storage engine for
these properties
Slide 34
Time series data + databases =
great sadness
Slide 35
Access properties suck for
distributed systems
Slide 36
Range scans of many keys
Slide 37
series: cpu region=uswest, host=serverA
Slide 38
series: cpu region=uswest, host=serverA
query: select max(value) from cpu
where time > now() - 6h
group by time(5m)
Slide 39
series: cpu region=uswest, host=serverA
query: select max(value) from cpu
where region = 'uswest'
AND time > now() - 6h
group by time(5m)
Series from all hosts from uswest merged into one
Slide 40
How to distribute the data?
Slide 41
By measurement?
cpu
Slide 42
By measurement?
cpu
BOTTLENECK
Slide 43
By measurement + tags?
cpu region=uswest, host=serverA
Slide 44
By measurement + tags?
cpu region=uswest, host=serverA
SERIES GROWS INDEFINITELY
Slide 45
By measurement + tags, time?
cpu region=uswest, host=serverA, time
Slide 46
By measurement + tags, time?
cpu region=uswest, host=serverA, time
WHICH TIMES/KEYS EXIST?
Slide 47
By measurement + tags, time?
cpu region=uswest, host=serverA, time
NO DATA LOCALITY
Slide 48
High throughput
Slide 49
CAP Theorem
Slide 50
CAP Theorem
C: Consistency
Slide 51
CAP Theorem
C: Consistency
A: Availability
Slide 52
CAP Theorem
C: Consistency
A: Availability
P: In the face of Partitions
Slide 53
Pick either C or A
Slide 54
P is happening whether you have
perfect network hardware or not
Slide 55
Pauses under load look like
partitions
Slide 56
High throughput = load
Slide 57
Consistency under high write
throughput
Slide 58
Time series queries do range scans
of recent data that is always moving
Slide 59
Some sensors sample many
times per second
Slide 60
Event streams can be even
more frequent
Slide 61
Consistent view?
Slide 62
No content
Slide 63
Time series data + distributed
systems = great sadness
Slide 64
but…
Slide 65
Time series data has great
properties for databases
Slide 66
No updates
Slide 67
Large ranges cold for writes
Slide 68
Immutable data structures and
files
Slide 69
Like LSM, but more specific
Slide 70
Deletes mostly against ranges
of old data
Slide 71
We partition data by ranges of
time
e.g. all data for a day or hour together
Slide 72
Drop entire files
Slide 73
Tombstone the one-offs
Slide 74
New storage engine
Slide 75
Great properties for distributed
systems
Slide 76
No updates
Slide 77
Large scale deletes on cold
areas of keyspace
Slide 78
Perfect for an AP system
Slide 79
Conflict resolution made easy
i.e. no updates = no contention
Slide 80
Partition key space by ranges of
time
i.e. old data vs. new
Slide 81
Old data generally doesn’t
change
Slide 82
Consistent view on new data is
the union
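One way to read "the union": because points are never updated, two replicas can only disagree by missing points, never by holding different versions of one. Merging them is then a set union keyed on series and timestamp. The sketch below assumes exactly that; `Point`, `pointKey`, and `Union` are hypothetical names, not an actual replication protocol.

```go
package main

import "sort"

// Point is a hypothetical immutable data point.
type Point struct {
	Series    string
	Timestamp int64
	Value     float64
}

// pointKey identifies a point. With no updates, a series and timestamp
// pair maps to at most one value, so it is enough to deduplicate on.
type pointKey struct {
	Series    string
	Timestamp int64
}

// Union merges the points held by several replicas into one view,
// keeping each (series, timestamp) pair once, ordered by time.
func Union(replicas ...[]Point) []Point {
	seen := make(map[pointKey]bool)
	var out []Point
	for _, replica := range replicas {
		for _, p := range replica {
			k := pointKey{Series: p.Series, Timestamp: p.Timestamp}
			if seen[k] {
				continue
			}
			seen[k] = true
			out = append(out, p)
		}
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Timestamp != out[j].Timestamp {
			return out[i].Timestamp < out[j].Timestamp
		}
		return out[i].Series < out[j].Series
	})
	return out
}
```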
Slide 83
Deletes against ranges that are
cold for writes and queries
Slide 84
Cluster growth to increase storage
capacity doesn’t require rebalancing
Slide 85
Data locality
i.e. how we ship the code to where the data lives when scanning large ranges of
data
Slide 86
Evenly distribute across cluster, per day
cpu region=uswest, host=serverA -> Shard 1
cpu region=uswest, host=serverB -> Shard 1
cpu region=useast, host=serverC -> Shard 2
cpu region=useast, host=serverD -> Shard 2
Slide 87
Each shard lives on a server
plus some number of replicas
Slide 88
Hits one shard
query: select mean(value) from cpu
where region = 'uswest'
AND host = 'serverB'
AND time > now() - 6h
group by time(5m)
Slide 89
Decompose into map/reduce job
query: select mean(value) from cpu
where region = 'uswest'
AND time > now() - 6h
group by time(5m)
Many series match these criteria,
many shards to query
Slide 90
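// MapMean runs next to the data and folds the values it iterates over
// into a count and a running mean.
func MapMean(itr Iterator) interface{} {
    out := &meanMapOutput{}
    // A zero key k from Next ends the loop.
    for _, k, v := itr.Next(); k != 0; _, k, v = itr.Next() {
        out.Count++
        // Incremental mean: no large running sum to keep around.
        out.Mean += (v.(float64) - out.Mean) / float64(out.Count)
    }
    if out.Count > 0 {
        return out
    }
    return nil
}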
Slide 91
// ReduceMean merges the outputs of MapMean into a single mean.
func ReduceMean(values []interface{}) interface{} {
    out := &meanMapOutput{}
    var countSum int
    for _, v := range values {
        if v == nil {
            continue // a MapMean call that saw no points returned nil
        }
        val := v.(*meanMapOutput)
        countSum = out.Count + val.Count
        // Weight each partial mean by its share of the combined count.
        out.Mean = val.Mean*(float64(val.Count)/float64(countSum)) +
            out.Mean*(float64(out.Count)/float64(countSum))
        out.Count = countSum
    }
    if out.Count > 0 {
        return out.Mean
    }
    return nil
}
Slide 92
We only transmit the summary
ticks across the cluster
one per 5 minute interval
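To make the point concrete, here is a self-contained sketch in which each shard reduces its raw points to one (count, mean) summary per interval, and only those summaries are merged. The names `Partial`, `MapShard`, and `Reduce` are hypothetical, timestamps are assumed to be non-negative Unix seconds, and the merge uses the same count-weighted formula as the mean functions shown earlier in the deck.

```go
package main

// Partial is the summary one shard sends for one interval.
type Partial struct {
	Count int
	Mean  float64
}

// MapShard runs where the data lives. It folds a shard's raw points into
// one Partial per interval, keyed by the interval's start time.
func MapShard(timestamps []int64, values []float64, intervalSec int64) map[int64]Partial {
	out := make(map[int64]Partial)
	for i, ts := range timestamps {
		start := ts - ts%intervalSec // round down to the interval start
		p := out[start]
		p.Count++
		p.Mean += (values[i] - p.Mean) / float64(p.Count)
		out[start] = p
	}
	return out
}

// Reduce merges the per-shard summaries into one mean per interval,
// weighting each partial mean by its point count.
func Reduce(shards ...map[int64]Partial) map[int64]float64 {
	merged := make(map[int64]Partial)
	for _, shard := range shards {
		for start, p := range shard {
			m := merged[start]
			total := m.Count + p.Count
			m.Mean = m.Mean*float64(m.Count)/float64(total) +
				p.Mean*float64(p.Count)/float64(total)
			m.Count = total
			merged[start] = m
		}
	}
	means := make(map[int64]float64, len(merged))
	for start, m := range merged {
		means[start] = m.Mean
	}
	return means
}
```

For a 6-hour query grouped by 5 minutes, each shard sends at most 72 summaries, however many raw points it scanned.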