Everything You Never Knew You Wanted to Ask about Time Series Databases

This talk covers the Graphite ecosystem. It is intended to serve as an introduction to core concepts. I also highlight some gotchas I see people run into when they start using time series databases.

Brad Lhotsky

October 17, 2015

Transcript

  1. Everything you never knew you wanted to ask about Time Series Data
    Presented by: Brad Lhotsky
  2. You gotta know where I’ve been to know where we’re going.
  3. None
  4. http://oss.oetiker.ch/rrdtool/

  5. Reverse Polish Notation!!!

  6. Where are we now?

  7. Open Source Solutions: Graphite, OpenTSDB, InfluxDB
    Hosted Solutions: Circonus, Librato, Datadog, New Relic
  8. What do you use?

  9. None
  10. InfluxDB
    • Pros
      • Easy to send metrics
      • Support for “Metrics 2.0”
      • SQL-ish interface to the data
    • Cons
      • Read scalability is lacking
      • Still quite young, good things to come here!
  11. OpenTSDB
    • Pros
      • Easy to send metrics
      • Support for “Metrics 2.0”
      • HBase backend
      • No “roll up”: all points stored for eternity!
    • Cons
      • Read scalability is lacking
      • HBase backend?
  12. YMMV, and that’s O.K.

  13. Why do you use Graphite?

  14. It’s Open Source.

  15. It’s scalable.

  16. It’s easy.

  17. It’s composable.

  18. It’s FUN!

  19. What is Time Series Data?

  20. Time Series Data
    Measurements at fixed, regular intervals; it is impossible to have two values for the same metric at the same point in time. Graphite’s rule is that the last value for an interval wins.
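To make the "last value wins" rule concrete, here is a minimal Python sketch. It is an illustration only, not Graphite's code: the function name `bucket_last_wins` and the 10-second step are assumptions made for the example.

```python
def bucket_last_wins(points, step):
    """Align (timestamp, value) pairs to `step`-second intervals.

    A later write to the same interval overwrites the earlier one.
    """
    slots = {}
    for timestamp, value in points:
        # Round the timestamp down to the start of its interval.
        interval_start = timestamp - (timestamp % step)
        slots[interval_start] = value  # last write wins
    return sorted(slots.items())


# Two of these three points fall into the interval starting at 1000.
points = [(1000, 1.0), (1004, 2.0), (1012, 3.0)]
print(bucket_last_wins(points, step=10))
```

With a 10-second step, the value sent at 1004 replaces the one sent at 1000, so only two points survive.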
  21. Gauge v. Counter
    Graphite does not care; it just stores a value at a point in time. It’s up to you to store what you want and understand how to retrieve it.
  22. Gauge v. Counter
    Gauges usually fit within a fixed range, but only represent state at the time of reading, meaning you can miss spikes. Counters allow more complete history, but can overflow. Use nonNegativeDerivative() to view the changes between points.
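The idea behind nonNegativeDerivative() can be sketched in a few lines of Python. This is a simplified approximation of the behavior (it ignores options such as a maximum counter value), and the function name `non_negative_derivative` is made up for the example.

```python
def non_negative_derivative(values):
    """Return the change between consecutive points of a counter.

    The first point has no predecessor, and a negative change (the
    counter reset or overflowed) is reported as None, not as a dip.
    """
    result = []
    previous = None
    for current in values:
        if previous is None or current is None:
            result.append(None)
        else:
            delta = current - previous
            result.append(delta if delta >= 0 else None)
        previous = current
    return result


# The counter resets between 190 and 20.
print(non_negative_derivative([100, 130, 190, 20, 50]))
```

The reset shows up as a gap rather than a huge negative spike, which is usually what you want on a graph of a counter.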
  23. How does Graphite work?

  24. Metrics
    • Dot-separated namespaces:
      • sys.datacenter.zone.host.class.metric
    • Created automatically the first time it’s updated
    • All storage pre-allocated
    • Multiple storage engines
      • Whisper (Flat Files)
      • Ceres
      • Cyanite (based on Cassandra)
  25. Queries / API
    • Ask for a metric
      • sys.datacenter.zone.host.class.metric
    • Ask for all the metrics
      • sys.datacenter.zone.*.class.metric
    • Ask for a combination, mutation, or selection
      • sumSeries(sys.datacenter.zone.*.class.metric)
    • Returns PNG, SVG, JSON, CSV, …
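As a rough sketch of what such a query looks like over HTTP, the snippet below builds a URL for graphite-web's render endpoint. The host name `graphite.example.com` and the helper `render_url` are assumptions for illustration; `target`, `format`, and `from` are the render parameters being exercised.

```python
from urllib.parse import urlencode


def render_url(base_url, target, output_format="json", time_from="-1h"):
    """Build a render URL for one target expression."""
    query = urlencode({
        "target": target,         # metric path or function expression
        "format": output_format,  # e.g. json, csv, png, svg
        "from": time_from,        # relative start time
    })
    return f"{base_url}/render?{query}"


# Hypothetical host; the target is the expression from the slide.
url = render_url(
    "http://graphite.example.com",
    "sumSeries(sys.datacenter.zone.*.class.metric)",
)
print(url)
```

Fetching that URL with any HTTP client should return the summed series as JSON, assuming the metrics exist on the server.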
  26. Components
    • carbon
      • Route and store metrics
    • whisper
      • Storage file format and utilities
    • graphite-web
      • User-facing interface to Graphite
  27. How does Graphite scale?

  28. With Knowledge, Well.
    • Cluster using relays
    • Use SSDs for fast writes
    • Use redundancy because SSDs fail
    • Read Jason Dixon’s book
  29. With Help, WebScale 2.0!!
    • https://github.com/grobian/carbon-c-relay
      • Pass and route metrics to storage
    • https://github.com/dgryski/carbonzipper
      • Map/Reduce metric queries
    • https://github.com/dgryski/carbonapi
      • Intelligent caching layer for JSON/CSV/Raw outputs
  30. Would you deploy Graphite today?

  31. Yes.

  32. How does Graphite store data?

  33. Whisper Files
    Every metric is a flat file on disk that’s pre-allocated at creation time to a fixed size. Size is based on the defined retention periods, which we’ll discuss shortly.
  34. Whisper Files
    The dots in the metric names are directory separators on the file system.
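A small Python sketch of that mapping follows. The helper name `whisper_path` is made up, and the storage root shown is only an assumption (a common default install location); the `.wsp` suffix is the Whisper file extension.

```python
import posixpath


def whisper_path(metric, root="/opt/graphite/storage/whisper"):
    """Map a dotted metric name to the Whisper file that would hold it."""
    # Each dot-separated component becomes one directory level.
    return posixpath.join(root, *metric.split(".")) + ".wsp"


print(whisper_path("sys.datacenter.zone.host.class.metric"))
```

This is also why a wildcard in a metric query behaves much like a glob on the directory tree.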
  35. Data Compression
    Time series databases allow for prolonged storage. It’s common for metrics to remain for two or more years. To cut costs, aggregations are performed as the data ages.
  36. What do I need to know about roll up?

  37. Storage: Schema
    Backend configuration for allocating on-disk storage for metrics. Can only be set at metric creation. Define retentions.

    [mysql]
    pattern = ^mysql\.
    retentions = 10s:2d,60s:14d,30m:2y

    [default]
    pattern = .*
    retentions = 60s:14d,30m:2y
  38. Storage: AGGREGATIONS
    Configuration for handling how metrics traverse the retention boundaries.

    [default]
    pattern = .*
    xFilesFactor = 0.5
    aggregationMethod = average
  39. xFilesFactor
    Float between 0 and 1 representing the fraction of non-null points required to roll up to a non-null value.

    [default]
    pattern = .*
    xFilesFactor = 0.5
    aggregationMethod = average
  40. Aggregators
    Functions available to turn multiple values into a single value for retention roll ups.
    ‣ average - average of all values
    ‣ min - minimum of set
    ‣ max - maximum of set
    ‣ sum - sum of set
    ‣ last - take the last value
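Here is a minimal Python sketch of how xFilesFactor and the aggregation method interact when several fine-grained points are rolled up into one coarser point. It approximates the behavior described above and is not Whisper's code; `roll_up` and `AGGREGATORS` are names invented for the example.

```python
AGGREGATORS = {
    "average": lambda values: sum(values) / len(values),
    "min": min,
    "max": max,
    "sum": sum,
    "last": lambda values: values[-1],
}


def roll_up(points, x_files_factor=0.5, method="average"):
    """Collapse a window of points (None = missing) into one value."""
    known = [p for p in points if p is not None]
    if not known:
        return None  # nothing to aggregate
    # Require a minimum fraction of non-null points in the window.
    if len(known) / len(points) < x_files_factor:
        return None
    return AGGREGATORS[method](known)


window = [1.0, None, 3.0, None, None]   # 2 of 5 points are known
print(roll_up(window))                  # too sparse for 0.5
print(roll_up(window, 0, "sum"))        # xFilesFactor 0 keeps sparse data
```

This is why sparse metrics such as alert counts are often paired with xFilesFactor = 0 and sum: with the default 0.5 and average, they can roll up to nothing at all.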
  41. Storage: AGGREGATIONS

    [alerts]
    pattern = ^alerts\.
    xFilesFactor = 0
    aggregationMethod = sum
  42. Rolling Data
    We lose resolution as data rolls up.
    [Figure panels: 1 Minute, 5 Minutes, 25 Minutes]
  43. How do I get data in?

  44. Sending Data
    Getting your data into Graphite is as simple as sending the metric string to the relevant carbon host and port!

    echo "metric.name.as.dotted.path value epoch" | nc graphite 2003
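The same thing can be sketched in Python with nothing but the standard library. The host name `graphite` and port 2003 are taken from the slide; the helper names `format_metric` and `send_metric` are invented for the example, and error handling is left out.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one plaintext line: 'path value epoch' plus a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(path, value, host="graphite", port=2003, timestamp=None):
    """Open a TCP connection to carbon and send a single metric line."""
    line = format_metric(path, value, timestamp)
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Example call (needs a reachable carbon listener):
# send_metric("metric.name.as.dotted.path", 42)
print(format_metric("metric.name.as.dotted.path", 42, 1445040000), end="")
```

For more than a handful of metrics it is usually better to keep one connection open and send many lines over it than to reconnect for each point.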
  45. There are a lot of libraries that encapsulate most to all of this incredibly complicated task for every web-scale programming language.
  46. Pretty pictures, please?

  47. None
  48. http://obfuscurity.com/2012/04/Unhelpful-Graphite-Tip-2

  49. Interface
    Autocomplete can’t be disabled; use the Esc key to close it.
  50. Metrics: Wildcards
    security.logging.indexer.*.total
  51. Metrics: Character Classes
    security.logging.indexer.logproc-[12]01.total
  52. Metrics: Word Groups
    security.logging.indexer.logproc-{202,102}.total
  53. Metrics: Aliases
    aliasByNode(security.logging.indexer.*.total,3)
  54. Combining Metrics
    sumSeries(security.logging.indexer.*.total)
  55. Combining & Grouping Metrics
    sumSeriesWithWildcards(security.logging.indexer.*.*,3)
  56. Combining Metrics
    averageSeries(security.logging.indexer.*.total)
  57. Combining & Grouping Metrics
    averageSeriesWithWildcards(security.logging.indexer.*.{total,ignore},3)
  58. Combining Metrics
    averageSeriesWithWildcards(security.logging.indexer.*.*,3)
  59. Combining Metrics
    nPercentile(security.logging.indexer.*.total,95)
  60. Combining Metrics
    percentileOfSeries(security.logging.indexer.*.total,95)
  61. Selecting Metrics
    mostDeviant(2,security.logging.indexer.*.total)
  62. Selecting Metrics
    highestCurrent(security.logging.indexer.*.total,2)
  63. Selecting Metrics
    highestAverage(security.logging.indexer.*.total,2)

  64. Transforming Metrics
    general.es.logsearch-208.jvm.gc.collectors.old.collection_ms
    general.es.logsearch-201.indices.indexing.index_ms
  65. Side Bar: Y-Minimum as 0
    general.es.logsearch-208.jvm.gc.collectors.old.collection_ms
  66. Transforming Metrics
    nonNegativeDerivative(general.es.logsearch-208.jvm.gc.collectors.old.collection_ms)
    nonNegativeDerivative(general.es.logsearch-201.indices.indexing.index_ms)
  67. Selecting Data
    removeAbovePercentile(nonNegativeDerivative(general.es.logsearch-201.indices.indexing.index_ms),95)
  68. Per Second
    scaleToSeconds(removeAbovePercentile(nonNegativeDerivative(general.es.logsearch-201.indices.indexing.index_ms),95),1)
  69. Transforming Metrics
    nonNegativeDerivative(general.es.logsearch-208.jvm.gc.collectors.old.collection_ms)
  70. Transforming Metrics
    offset(nonNegativeDerivative(general.es.logsearch-208.jvm.gc.collectors.old.collection_ms),-30)
  71. Transforming Metrics
    drawAsInfinite(offset(nonNegativeDerivative(general.es.logsearch-208.jvm.gc.collectors.old.collection_ms),-30))

  72. Comparing Metrics
    alias(sumSeries(security.logging.indexer.*.total),"Today")
    alias(timeShift(sumSeries(security.logging.indexer.*.total),"7d"),"Last Week")
  73. Comparing Metrics
    color(constantLine(0),"red")
    diffSeries(sumSeries(security.logging.indexer.*.total),timeShift(sumSeries(security.logging.indexer.*.total),"7d"))
  74. Advanced Tricks
    alias(alpha(color(areaBetween(holtWintersConfidenceBands(maxSeries(general.es.logsearch-20*.jvm.mem.heap_used_bytes))),"gray"),0.1),"Hot Winter Confidence Bands")
    color(alias(maxSeries(general.es.logsearch-20*.jvm.mem.heap_used_bytes),"Max Heap Size"),"red")
  75. Advanced Tricks
    # Same as last slide & set Y-Minimum to 0
    color(alias(secondYAxis(maxSeries(general.es.logsearch-20*.indices.docs.count)),"Max Docs per Node"),"green")
  76. Advanced Tricks
    drawAsInfinite(removeBelowValue(offset(nonNegativeDerivative(general.es.logsearch-*.jvm.gc.collectors.old.collection_ms),-250),0))
  77. Do you even Dashboard?

  78. Grafana http://grafana.org/

  79. GraphExplorer https://vimeo.github.io/graph-explorer/

  80. Cubism https://square.github.io/cubism/

  81. Rubics Cubism https://github.com/reyjrar/rubics-cubism

  82. Thank you! brad.lhotsky@gmail.com https://twitter.com/reyjrar https://github.com/reyjrar https://speakerdeck.com/reyjrar