Slide 1

Slide 1 text

OpenTSDB
A Distributed, Scalable, Time Series Database
“Monitoring at an unprecedented level of granularity”
Benoît “tsuna” Sigoure
tsuna@stumbleupon.com

Slide 2

Slide 2 text

Old Plow Where’s my Paradigm Shift?

Slide 3

Slide 3 text

Get Real-Time Data from your Infrastructure
Backbone #2 by AndiH

Slide 4

Slide 4 text

Working at Scale
Finding needle in haystack by Bindaas Madhavi

Slide 5

Slide 5 text

No SPoF

Slide 6

Slide 6 text

HBase
• Distributed
• Scalable
• Reliable
• Efficient
Bright Ideas by purplemattfish

Slide 7

Slide 7 text

Design Goals
• Distributed storage of monitoring data
• No Single Point of Failure
• Pulling custom graphs must be trivial & fast
• Scale to:
  • Thousands of machines
  • Many billions of data points

Slide 8

Slide 8 text

Key concepts
• Data Points: (time, value)
• Metrics: proc.loadavg.1m
• Tags: host=web42 pool=static
• Metric + Tags = Time Series

put proc.loadavg.1m 1234567890 0.42 host=web42 pool=static
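To make the key concepts concrete, here is a minimal Python sketch that splits the slide's `put` line into its parts and derives the time series identity (metric plus tags). The names `DataPoint`, `parse_put` and `series_key` are hypothetical helpers invented for this illustration; they are not part of OpenTSDB.

```python
from typing import Dict, NamedTuple, Tuple


class DataPoint(NamedTuple):
    metric: str
    timestamp: int          # seconds since the Unix epoch, as in the slide's example
    value: float
    tags: Dict[str, str]


def parse_put(line: str) -> DataPoint:
    """Parse 'put <metric> <timestamp> <value> [tagk=tagv ...]' into a DataPoint."""
    fields = line.split()
    if len(fields) < 4 or fields[0] != "put":
        raise ValueError(f"not a put line: {line!r}")
    metric, timestamp, value = fields[1], int(fields[2]), float(fields[3])
    # Each remaining field is expected to look like tagk=tagv.
    tags = dict(field.split("=", 1) for field in fields[4:])
    return DataPoint(metric, timestamp, value, tags)


def series_key(point: DataPoint) -> Tuple[str, Tuple[Tuple[str, str], ...]]:
    """Metric + Tags = Time Series: the identity ignores time and value."""
    return (point.metric, tuple(sorted(point.tags.items())))


point = parse_put("put proc.loadavg.1m 1234567890 0.42 host=web42 pool=static")
later = parse_put("put proc.loadavg.1m 1234567905 0.51 host=web42 pool=static")
same_series = series_key(point) == series_key(later)
```

Two data points with the same metric and tags land in the same series even though their timestamps and values differ, which is what `same_series` checks.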

Slide 12

Slide 12 text

The Big Picture™
• Components: Applications, tcollector, TSD, TSD, TSD, HBase, Browser
• Metrics: periodic polling (tcollector)
• Push data points to the TSDs (telnet-style proto, RPC soon)
• TSD and HBase: Put / Scan (Hadoop RPC)
• Browser: graph request (HTTP)
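The "push data points (telnet-style proto)" link can be sketched as a client that writes one `put` line per data point over a plain TCP connection. This is an illustration under assumptions, not tcollector's actual code: `format_put`, `push_points` and `send_to_tsd` are hypothetical names, and the host and port are whatever your TSD listens on.

```python
import socket
from typing import Dict, Iterable, Tuple

# (metric, timestamp, value, tags)
Point = Tuple[str, int, float, Dict[str, str]]


def format_put(metric: str, timestamp: int, value: float, tags: Dict[str, str]) -> str:
    """Render one data point as a newline-terminated put line."""
    tag_text = " ".join(f"{k}={v}" for k, v in sorted(tags.items()))
    return f"put {metric} {timestamp} {value} {tag_text}\n"


def push_points(sock, points: Iterable[Point]) -> int:
    """Write all points to an already-connected socket-like object; return bytes sent."""
    payload = "".join(format_put(*p) for p in points).encode("ascii")
    sock.sendall(payload)
    return len(payload)


def send_to_tsd(host: str, port: int, points: Iterable[Point]) -> int:
    """Open a TCP connection to a TSD (host and port are assumptions) and push points."""
    with socket.create_connection((host, port), timeout=5) as sock:
        return push_points(sock, points)
```

Keeping `push_points` separate from the connection setup makes the formatting easy to check without a running TSD, for example by passing any object that has a `sendall` method.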

Slide 13

Slide 13 text

12 Bytes Per Datapoint 4TB per year for 1000 machines
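As a back-of-the-envelope check on these two figures, the sketch below works out what collection rate they imply. It assumes a decimal terabyte (10^12 bytes) and a 365-day year; neither assumption is stated on the slide.

```python
BYTES_PER_POINT = 12
BYTES_PER_YEAR = 4 * 10**12          # assumption: 4TB means 4 * 10^12 bytes
MACHINES = 1000
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: non-leap year

points_per_year = BYTES_PER_YEAR / BYTES_PER_POINT
points_per_machine_per_second = points_per_year / MACHINES / SECONDS_PER_YEAR

print(f"{points_per_year:.3g} data points per year in total")
print(f"{points_per_machine_per_second:.1f} data points per second per machine")
```

Under those assumptions the slide's numbers correspond to roughly ten data points per second from each machine.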

Slide 14

Slide 14 text

OpenTSDB @ 150 Million Datapoints/Day in a typical datacenter
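For a sense of scale, the daily figure can be converted into an average ingest rate and a daily storage volume. The 12 bytes per data point comes from the earlier slide; treating the load as evenly spread over the day is an assumption made only for this estimate.

```python
POINTS_PER_DAY = 150_000_000
SECONDS_PER_DAY = 24 * 3600
BYTES_PER_POINT = 12                 # figure quoted on the earlier slide

average_points_per_second = POINTS_PER_DAY / SECONDS_PER_DAY
bytes_per_day = POINTS_PER_DAY * BYTES_PER_POINT

print(f"about {average_points_per_second:.0f} data points per second on average")
print(f"about {bytes_per_day / 10**9:.1f} GB of data points per day")
```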

Slide 15

Slide 15 text

Demo Time!

Slide 16

Slide 16 text

Set it up in 15 minutes
• JDK + Gnuplot: 1 minute (1 command)
• Single-node HBase: 4 minutes (3 commands)
• OpenTSDB: 5 minutes (5 commands)
• Deploy tcollector: 5 minutes
With zero prior experience

Slide 17

Slide 17 text

Under the Hood
• Components: Netty, su async, hbase async, TSD core, Local Disk (cache)

Slide 18

Slide 18 text

Under the Hood: Write Path
• Components: Netty, su async, hbase async, TSD core, Local Disk (cache)
• Input: put proc.loadavg.1m 1234567890 0.42 host=web42 pool=static
• To HBase: 1s delay max.
• >2000 data points / sec / core

Slide 19

Slide 19 text

Under the Hood: Write Path
• Components: Netty, su async, hbase async, TSD core, Local Disk (cache)
• Input: put proc.loadavg.1m 1234567890 0.42 host=web42 pool=static
• To HBase
• >2000 data points / sec / core

Slide 20

Slide 20 text

Under the Hood: Read Path
• Components: Netty, su async, hbase async, TSD core, Local Disk (cache)
• Input: GET /q?...
• HBase: scan

Slide 21

Slide 21 text

Under the Hood: Read Path
• Components: Netty, su async, hbase async, TSD core, Local Disk (cache)
• Input: GET /q?...
• HBase: scan
• Local Disk: write to cache
• Gnuplot

Slide 22

Slide 22 text

100% Natural, Organic
Free & Open-Source
Danger in the Corn by Roger Smith

Slide 23

Slide 23 text

¿ Questions ?
Benoît “tsuna” Sigoure
tsuna@stumbleupon.com
opentsdb.net

Slide 24

Slide 24 text

Inside HBase
put proc.loadavg.1m 1234567890 0.42 host=web42 pool=static
UIDs: proc.loadavg.1m = 0 5 2, host = 0 0 1, web42 = 0 2 8, pool = 0 4 7, static = 0 0 1

Table: tsdb-uid
Row Key         | name:metrics    | name:tagk | name:tagv | id:metrics | id:tagk | id:tagv
0 5 2           | proc.loadavg.1m |           |           |            |         |
0 0 1           |                 | host      | static    |            |         |
host            |                 |           |           |            | 0 0 1   |
proc.loadavg.1m |                 |           |           | 0 5 2      |         |
(Column Family: name holds the metrics, tagk and tagv columns keyed by UID; Column Family: id holds the same three columns keyed by name.)
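The tsdb-uid table is a two-way dictionary: names map to short fixed-width IDs and IDs map back to names, with separate ID spaces for metrics, tag keys (tagk) and tag values (tagv). The in-memory sketch below illustrates that idea only. `UidTable` is a hypothetical class, the sequential allocation is an assumption, and the 3-byte width is read off the slide's "0 5 2" style IDs.

```python
from typing import Dict, Tuple

UID_WIDTH = 3  # bytes per UID, assuming the slide's "0 5 2" is three bytes


class UidTable:
    """Toy stand-in for tsdb-uid: name -> id and id -> name, per kind."""

    def __init__(self) -> None:
        self._ids: Dict[Tuple[str, str], bytes] = {}     # like column family "id"
        self._names: Dict[Tuple[str, bytes], str] = {}   # like column family "name"
        self._next: Dict[str, int] = {}                  # one counter per kind

    def get_or_assign(self, kind: str, name: str) -> bytes:
        """Return the UID for (kind, name), allocating the next one if unseen."""
        key = (kind, name)
        if key not in self._ids:
            number = self._next.get(kind, 0) + 1
            self._next[kind] = number
            uid = number.to_bytes(UID_WIDTH, "big")
            self._ids[key] = uid
            self._names[(kind, uid)] = name
        return self._ids[key]

    def name_of(self, kind: str, uid: bytes) -> str:
        """Reverse lookup: UID back to the original string."""
        return self._names[(kind, uid)]


uids = UidTable()
host_uid = uids.get_or_assign("tagk", "host")
static_uid = uids.get_or_assign("tagv", "static")
```

Because each kind has its own counter, a tag key and a tag value can share the same bytes, which is consistent with the slide showing 0 0 1 for both host and static.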

Slide 25

Slide 25 text

Inside HBase
put proc.loadavg.1m 1234567890 0.42 host=web42 pool=static

Table: tsdb
Row Key: 0 5 2 | 73 -106 2 120 | 0 0 1 | 0 2 8 | 0 4 7 | 0 0 1
Timestamp: 1234567890 = 1234567800 + 90; the bytes 73 -106 2 120 encode 1234567800
Column Family: t
Columns: +0 +15 +20 ... +90 +600
Values: 0.69 0.51 0.42 0.99 0.72
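The row key on this slide can be reproduced in a few lines: metric UID, then a 4-byte big-endian base timestamp, then the tagk/tagv UID pairs, with the remaining seconds (+90) becoming the column. The sketch below is an illustration under assumptions: `ROW_SPAN = 600` is chosen because it reproduces the slide's 1234567800 + 90 split, sorting the tag pairs by tag-key UID is an assumption that happens to match the order drawn, and the function names are hypothetical.

```python
import struct
from typing import Dict, List, Tuple

ROW_SPAN = 600  # seconds covered by one row; assumed, it matches the slide's example


def split_timestamp(timestamp: int, span: int = ROW_SPAN) -> Tuple[int, int]:
    """Return (base time stored in the row key, offset used as the column)."""
    base = timestamp - timestamp % span
    return base, timestamp - base


def row_key(metric_uid: bytes, base_time: int, tag_uids: Dict[bytes, bytes]) -> bytes:
    """Concatenate metric UID + 4-byte big-endian base time + tagk/tagv UID pairs."""
    key = metric_uid + struct.pack(">I", base_time)
    for tagk in sorted(tag_uids):        # assumption: pairs ordered by tag-key UID
        key += tagk + tag_uids[tagk]
    return key


def as_signed(data: bytes) -> List[int]:
    """Show bytes the way the slide does, as signed 8-bit integers."""
    return [b - 256 if b > 127 else b for b in data]


base, offset = split_timestamp(1234567890)
key = row_key(
    bytes([0, 5, 2]),                       # proc.loadavg.1m
    base,
    {
        bytes([0, 0, 1]): bytes([0, 2, 8]),  # host=web42
        bytes([0, 4, 7]): bytes([0, 0, 1]),  # pool=static
    },
)
signed_key = as_signed(key)
```

With these inputs `base` is 1234567800, `offset` is 90, and the timestamp portion of `signed_key` is 73, -106, 2, 120, matching the bytes shown on the slide.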
