OpenTSDB
A Distributed, Scalable, Time Series Database
“Monitoring at an unprecedented level of granularity”
Benoît “tsuna” Sigoure
tsuna@stumbleupon.com
Slide 2
Old Plow
Where’s my Paradigm Shift?
Slide 3
Get Real-Time Data from your Infrastructure
Backbone #2 by AndiH
Slide 4
Working at Scale
Finding needle in haystack by Bindaas Madhavi
Slide 5
No SPoF
Slide 6
HBase
Bright Ideas by purplemattfish
Distributed, Scalable, Reliable, Efficient
Slide 7
Design Goals
• Distributed storage of monitoring data
• No Single Point of Failure
• Pulling custom graphs must be trivial & fast
• Scale to:
  • Thousands of machines
  • Many billions of data points
Slide 8
Key concepts
• Data Points: (time, value)
• Metrics: proc.loadavg.1m
• Tags: host=web42 pool=static
• Metric + Tags = Time Series
put proc.loadavg.1m 1234567890 0.42 host=web42 pool=static
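To make the relationship between data points, metrics, tags and time series concrete, here is a minimal Python sketch that builds and parses a line in the `put` format shown above. The names `DataPoint`, `format_put`, `parse_put` and `series_key` are hypothetical helpers invented for illustration and are not part of OpenTSDB. The sketch also assumes every line carries at least one tag.

```python
from typing import NamedTuple


class DataPoint(NamedTuple):
    """One (time, value) sample of a metric, qualified by tags (hypothetical type)."""
    metric: str
    timestamp: int   # seconds since the Unix epoch
    value: float
    tags: dict


def format_put(dp):
    """Render a data point as a 'put' line in the format shown on the slide."""
    tag_str = " ".join(f"{k}={v}" for k, v in sorted(dp.tags.items()))
    return f"put {dp.metric} {dp.timestamp} {dp.value} {tag_str}"


def parse_put(line):
    """Parse a 'put' line back into a DataPoint."""
    parts = line.split()
    # Assumption of this sketch: command, metric, timestamp, value and >= 1 tag.
    if len(parts) < 5 or parts[0] != "put":
        raise ValueError(f"not a put line: {line!r}")
    tags = dict(p.split("=", 1) for p in parts[4:])
    return DataPoint(parts[1], int(parts[2]), float(parts[3]), tags)


def series_key(dp):
    """Metric + Tags = Time Series: the key ignores the timestamp and the value."""
    return (dp.metric, tuple(sorted(dp.tags.items())))


dp = DataPoint("proc.loadavg.1m", 1234567890, 0.42,
               {"host": "web42", "pool": "static"})
line = format_put(dp)
print(line)
```

Two samples taken at different times with the same metric and the same tags yield the same `series_key`, which is what places them in the same time series.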
Slide 12
The Big Picture™
• Applications, tcollector (periodic polling): sources of metrics
• Data points are pushed to the TSDs (telnet-style proto, RPC soon)
• TSD ↔ HBase: Put / Scan (Hadoop RPC)
• Browser → TSD: graph request (HTTP)
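As an illustration of the "telnet-style" push, the sketch below sends newline-terminated `put` lines to a TSD over a plain TCP connection. The function names `encode_batch` and `push` are hypothetical, and the host and port defaults (`localhost`, `4242`) are assumptions that should be replaced with the address of an actual TSD. This is a sketch under those assumptions and not the way tcollector itself is implemented.

```python
import socket


def encode_batch(lines):
    """Join put lines into one payload, one newline-terminated command per line."""
    return "".join(line + "\n" for line in lines).encode("ascii")


def push(lines, host="localhost", port=4242, timeout=5.0):
    """Open a TCP connection to a TSD and send the lines.

    host and port are placeholder assumptions; nothing is read back here,
    so any error text the server may return is ignored by this sketch.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(encode_batch(lines))


payload = encode_batch([
    "put proc.loadavg.1m 1234567890 0.42 host=web42 pool=static",
])
print(payload)
```

Calling `push([...])` against a reachable TSD would send the same bytes that `encode_batch` returns. Because the protocol is line-based text, the same thing can be done by hand from a telnet session.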
Slide 13
12 Bytes Per Datapoint
4TB per year for 1000 machines
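The two figures above imply a collection rate per machine, which the following back-of-the-envelope Python sketch works out. It assumes decimal terabytes (10^12 bytes), a 365-day year, and that the 4TB figure counts only the 12 bytes per data point with no other overhead. The function name `implied_rate` is hypothetical.

```python
BYTES_PER_POINT = 12
YEARLY_BYTES = 4 * 10**12            # 4 TB, assuming decimal terabytes
MACHINES = 1000
SECONDS_PER_YEAR = 365 * 24 * 3600   # 31,536,000


def implied_rate(yearly_bytes, bytes_per_point, machines):
    """Data points per second per machine implied by a yearly storage budget."""
    points_per_year = yearly_bytes / bytes_per_point
    return points_per_year / machines / SECONDS_PER_YEAR


rate = implied_rate(YEARLY_BYTES, BYTES_PER_POINT, MACHINES)
print(f"{rate:.1f} data points / sec / machine")
```

Under these assumptions the slide's numbers correspond to roughly ten data points per second from each machine, sustained for the whole year.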
Slide 14
OpenTSDB @
150 Million Datapoints/Day
in a typical datacenter
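For a sense of scale, this short Python sketch converts the daily figure into an average ingest rate and a raw storage volume. It reuses the 12 bytes per data point and the ">2000 data points / sec / core" write throughput quoted elsewhere in this deck. It assumes load is spread evenly over the day, which real traffic will not do exactly.

```python
POINTS_PER_DAY = 150_000_000
SECONDS_PER_DAY = 24 * 3600
BYTES_PER_POINT = 12            # figure from the storage slide
POINTS_PER_SEC_PER_CORE = 2000  # lower bound quoted for the write path

# Average ingest rate, assuming an even spread over the day.
avg_rate = POINTS_PER_DAY / SECONDS_PER_DAY

# Raw storage per day at 12 bytes per data point.
daily_bytes = POINTS_PER_DAY * BYTES_PER_POINT

# Fraction of one core's quoted write throughput used by the average rate.
core_fraction = avg_rate / POINTS_PER_SEC_PER_CORE

print(f"{avg_rate:.0f} points/sec, {daily_bytes / 1e9:.1f} GB/day, "
      f"{core_fraction:.2f} of one core")
```

On these figures the average write load stays below the quoted throughput of a single core, so running several TSDs is about redundancy and peak load more than average throughput.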
Slide 15
Demo Time!
Slide 16
Set it up in 15 minutes
• JDK + Gnuplot: 1 minute (1 command)
• Single-node HBase: 4 minutes (3 commands)
• OpenTSDB: 5 minutes (5 commands)
• Deploy tcollector: 5 minutes
With zero prior experience
Slide 18
Under the Hood
Netty
su async
hbase async
TSD
core
Local Disk
(cache)
Write Path
put proc.loadavg.1m 1234567890 0.42 host=web42 pool=static
HBase
1s delay max.
>2000 data points / sec / core
Slide 21
Under the Hood
Netty
su async
hbase async
TSD
core
Local Disk
(cache)
HBase
Read Path
scan
write to cache
GET /q?...
Gnuplot
Slide 22
100% Natural, Organic
Free & Open-Source
Danger in the Corn by Roger Smith