$30 off During Our Annual Pro Sale. View Details »

Time Series Data Models and the future of InfluxDB's Query Language

Paul Dix
April 20, 2017

Time Series Data Models and the future of InfluxDB's Query Language

A talk I gave at the Austin Monitoring Meetup about different open source projects' time series data models and some ideas for a new InfluxDB query language.

Paul Dix

April 20, 2017
Tweet

More Decks by Paul Dix

Other Decks in Technology

Transcript

  1. Time Series Data Models & the
    Query Languages That Love Them
    Paul Dix
    paul@influxdb.com
    @pauldix

    View Slide

  2. Future of InfluxDB’s Data Model
    & Query Language

    View Slide

  3. CTO & co-founder
    makers of

    View Slide

  4. Founder of NYC Machine
    Learning Meetup

    View Slide

  5. View Slide

  6. Editor Addison Wesley’s Data & Analytics

    View Slide

  7. Recovering Rubyist

    View Slide

  8. Data Models

    View Slide

  9. Graphite
    apps.backend.server_01.counters.requests.count

    View Slide

  10. Graphite
    apps.backend.server_01.counters.requests.count
    Hierarchy

    View Slide

  11. Graphite
    apps.backend.server_01.counters.requests.count
    Hierarchy

    View Slide

  12. Graphite
    apps.backend.server_01.counters.requests.count
    Value - double precision float
    Time - second precision epoch

    View Slide

  13. Graphite
    apps.backend.server_01.counters.requests.count
    Value - double precision float
    Time - second precision epoch
    regular series only!

    View Slide

  14. Regular time series
    t0 t1 t2 t3 t4 t6 t7
    Samples at regular intervals

    View Slide

  15. Irregular time series
    t0 t1 t2 t3 t4 t6 t7
    Events whenever they come in

    View Slide

  16. OpenTSDB
    sys.cpu.user host=webserver01,cpu=3 1356998400 1.2

    View Slide

  17. OpenTSDB
    sys.cpu.user host=webserver01,cpu=3 1356998400 1.2
    Metric

    View Slide

  18. OpenTSDB
    sys.cpu.user host=webserver01,cpu=3 1356998400 1.2
    Tags
    (string key/value pairs)

    View Slide

  19. OpenTSDB
    sys.cpu.user host=webserver01,cpu=3 1356998400 1.2
    millisecond precision epoch

    View Slide

  20. OpenTSDB
    sys.cpu.user host=webserver01,cpu=3 1356998400 1.2
    Value
    int64 or float64 (2.4)

    View Slide

  21. Prometheus
    http_requests_total{method="post",code="200"} 1027 1395066363000

    View Slide

  22. Prometheus
    http_requests_total{method="post",code="200"} 1027 1395066363000
    Metric

    View Slide

  23. Prometheus
    http_requests_total{method="post",code="200"} 1027 1395066363000
    Labels
    (string key/value pairs)

    View Slide

  24. Prometheus
    http_requests_total{method="post",code="200"} 1027 1395066363000
    Value - float64

    View Slide

  25. Prometheus
    http_requests_total{method="post",code="200"} 1027 1395066363000
    millisecond precision epoch

    View Slide

  26. InfluxDB 1.x
    cpu,host=serverA,region=west user=23.2,system=54.1 1465839830100400200

    View Slide

  27. InfluxDB 1.x
    cpu,host=serverA,region=west user=23.2,system=54.1 1465839830100400200
    Measurement

    View Slide

  28. InfluxDB 1.x
    cpu,host=serverA,region=west user=23.2,system=54.1 1465839830100400200
    Tags
    (string key/value pairs)

    View Slide

  29. InfluxDB 1.x
    cpu,host=serverA,region=west user=23.2,system=54.1 1465839830100400200
    Fields
    (key/value pairs)

    View Slide

  30. InfluxDB 1.x
    cpu,host=serverA,region=west user=23.2,system=54.1 1465839830100400200
    float64 value

    View Slide

  31. InfluxDB 1.x
    cpu,host=serverA,region=west foo=23i 1465839830100400200
    int64 value

    View Slide

  32. InfluxDB 1.x
    cpu,host=serverA,region=west bar=t 1465839830100400200
    bool value

    View Slide

  33. InfluxDB 1.x
    cpu,host=serverA,region=west line=“some text here” 1465839830100400200
    string value

    View Slide

  34. InfluxDB 1.x
    cpu,host=serverA,region=west user=23.2,system=54.1 1465839830100400200
    nanosecond precision epoch

    View Slide

  35. Differences
    organization data types precision
    Graphite hierarchical float64 seconds
    OpenTSDB metric, tags float64 milliseconds
    Prometheus metric, tags float64 milliseconds
    InfluxDB 1.x metric, tags, fields
    float64, int64, bool,
    string
    nanoseconds

    View Slide

  36. Querying

    View Slide

  37. Data Exploration
    what series do I have

    View Slide

  38. Retrieval & Computation
    raw data, transforms, materialized series, aggregates, samples

    View Slide

  39. Organization
    matters with thousands of series or more

    View Slide

  40. Hierarchy
    Tree!

    View Slide

  41. Lookup metrics/measurements
    OpenTSDB
    /api/search/lookup?query=
    (all series)
    Prometheus
    {__name__=~“.+”}
    (all series?)
    InfluxDB SHOW MEASUREMENTS

    View Slide

  42. Lookup tag/label keys
    OpenTSDB
    /api/search/lookup?query=
    (all series)
    Prometheus
    {__name__=~“.+”}
    (all series?)
    InfluxDB SHOW TAG KEYS

    View Slide

  43. Lookup tag/label values
    OpenTSDB
    /api/search/lookup?query={host=*}
    (all series with host?)
    Prometheus
    {host=~“.+”}
    (all series with host?)
    InfluxDB
    SHOW TAG VALUES WITH KEY = “host”
    SHOW TAG VAVUES FROM “cpu” with KEY = “host”

    View Slide

  44. Drill Down
    OpenTSDB
    /api/search/lookup?query={host=*,service=mysql}
    (all metrics on the hosts)
    Prometheus
    {host=~“.+”,service=mysql}
    (all metrics on the hosts)
    InfluxDB SHOW TAG VALUES WITH KEY = “host” WHERE “service” = ‘mysql’

    View Slide

  45. Why to care
    about drill down
    (faceted search)

    View Slide

  46. Why to care
    about drill down
    (faceted search)

    View Slide

  47. Why to care
    about drill down
    (faceted search)
    select nonstop,
    LGA, JFK

    View Slide

  48. Facets
    __name__
    host
    service
    region
    group

    View Slide

  49. Facets
    __name__
    host
    service
    region
    group

    go_goroutines
    go_memstats_alloc_bytes
    go_memstats_alloc_bytes_total
    go_memstats_gc_sys_bytes
    go_memstats_other_sys_bytes

    View Slide

  50. Facets
    __name__
    host
    service
    region
    group

    go_goroutines
    go_memstats_alloc_bytes
    go_memstats_alloc_bytes_total
    go_memstats_gc_sys_bytes
    go_memstats_other_sys_bytes

    host
    service
    region
    group

    View Slide

  51. Facets
    __name__
    host
    service
    region
    group

    go_goroutines
    go_memstats_alloc_bytes
    go_memstats_alloc_bytes_total
    go_memstats_gc_sys_bytes
    go_memstats_other_sys_bytes

    host
    service
    region
    group

    dynamic hierarchy!
    (name already selected)

    View Slide

  52. Labels/Tags > Hierarchy

    View Slide

  53. Up Front Design

    View Slide

  54. Powerful Discovery

    View Slide

  55. Slicing, dicing, grouping

    View Slide

  56. Query Languages
    Query Language Example
    Graphite functional
    target, from, until
    sumSeries(summarize(water.level.h2o.feet.*, '1hour', 'max'))
    OpenTSDB http params
    startTime, endTime, metric, aggregationFunction, filter,
    functions, expressions
    Prometheus functional-ish increase(http_requests_total{job=“prometheus”}[5m])
    InfluxDB 1.x SQL-ish
    select mean(system) from cpu
    where time > now() - 6h
    group by time(10m)

    View Slide

  57. Functional > SQL or API

    View Slide

  58. Time series are streams

    View Slide

  59. Apply Functions!

    View Slide

  60. Selection
    what series (streams) are we working with?

    View Slide

  61. Timing
    what time range are we interested in?

    View Slide

  62. Merging
    multiple streams into 1

    View Slide

  63. Joining
    /, *, +, -, &, |, ^, filter

    View Slide

  64. Partitioning
    do we slice the stream into blocks of time?

    View Slide

  65. Sampling
    first, last, min, max, filters

    View Slide

  66. Transforming
    time shift, derivative, rate, interpolate

    View Slide

  67. Summarizing
    count, percentile, mean, median, mode, histogram

    View Slide

  68. Future InfluxDB!

    View Slide

  69. Subject to Change!
    *disclaimer

    View Slide

  70. Requirements
    • Support InfluxDB 1.x Data Model
    • Support InfluxDB 1.x QL
    • Support Prometheus Data Model
    • Functional Query Language
    • Rich Query Builder UI
    • Query Completion CLI
    • PromQL?

    View Slide

  71. InfluxDB 2.0 Data Model
    • Tags
    • non-string values?
    • Value
    • int64
    • uint64
    • float64
    • bool
    • string
    • bytes
    • Timestamp (nanosecond)

    View Slide

  72. No More Measurement!

    View Slide

  73. No more fields?!
    yep, but remember joining and merging!

    View Slide

  74. SIMPLE

    View Slide

  75. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23.2 1491675816

    View Slide

  76. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23.2 1491675816
    Tags

    View Slide

  77. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23.2 1491675816
    Key

    View Slide

  78. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23.2 1491675816
    Value

    View Slide

  79. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23.2 1491675816
    Separators

    View Slide

  80. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23.2 1491675816
    spaces, /, and : must be escaped

    View Slide

  81. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23.2 1491675816
    float64 value

    View Slide

  82. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23 1491675816
    float64 value

    View Slide

  83. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23 1491675816
    time
    (precision assumed closest to now)

    View Slide

  84. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23 1491675816000000us
    time
    (precision specified)

    View Slide

  85. InfluxDB 2.0 Line Protocol
    system:cpu,region:cpu,host:a,metric:user_idle 23.2 2017-04-08T14:23:54Z
    time
    (RFC3339Nano)

    View Slide

  86. InfluxDB 2.0 Line Protocol
    name:foo 2i 1491675816
    int64 value

    View Slide

  87. InfluxDB 2.0 Line Protocol
    name:foo 2u 1491675816
    uint64 value

    View Slide

  88. InfluxDB 2.0 Line Protocol
    name:foo [234, 21, 9, 23, 87, 90, 11, 54] 1491675816
    bytes value

    View Slide

  89. InfluxDB 2.0 Line Protocol
    name:foo “it’s a string, yo!” 2017-04-18T14:58:00Z
    string value

    View Slide

  90. InfluxDB 2.0 Line Protocol
    name:foo f 2017-04-18T14:58:00Z
    bool value

    View Slide

  91. InfluxQL 2.0
    functional!

    View Slide

  92. f1(f2(f3(f4(streams))))
    Lisp?

    View Slide

  93. View Slide

  94. View Slide

  95. Paul Graham, Rich Hickey

    View Slide

  96. D3
    d3.select("body")
    .selectAll("p")
    .data([4, 8, 15, 16, 23, 42])
    .enter().append("p")
    .text(function(d) { return "I’m number " + d + "!"; });

    View Slide

  97. Function Chaining!

    View Slide

  98. Series
    {
    "id": 24,
    "meta": {
    "dataType": "float64",
    "metricType": "gauge"
    },
    "tagset": {
    "host": "A",
    "region": "B"
    },
    "vector": [
    {"value":23.2, "epoch":1491499253},
    {"value":78.1, "epoch":1491499263, "tagset":{"host":"B"}}
    ]
    }

    View Slide

  99. Matrix
    [
    {
    "tagset": {
    "host": "A",
    },
    "vector": [{"value":23.1, "epoch":1491499253}, {"value":56.2, "epoch":1491499263}]
    },
    {
    "tagset": {
    "host": "B"
    },
    "vector": [{"value":23.1, "epoch":1491499253}, {"value":56.2, "epoch":1491499263}]
    }
    ]

    View Slide

  100. Example
    database(name:"testdb")
    .select(criteria:`"host" = 'A' and "system" = 'cpu'`)
    .range(startOffset:"-1h")

    View Slide

  101. Named Parameters
    database(name:"testdb")
    .select(criteria:`"host" = 'A' and "system" = 'cpu'`)
    .range(startOffset:"-1h")
    Named parameters!

    View Slide

  102. Example Queries
    database(name:"testdb")
    .select(criteria:`"host" = 'A' and "system" = 'cpu'`)
    .range(startOffset:"-1h")
    Wrap strings in back ticks to avoid pesky escaping

    View Slide

  103. Database returns Matrix
    database(name:"testdb")
    [
    {
    "tagset": {
    "host": "A",
    },
    "vector": [
    {"value":23.1, “epoch":1491499253},
    {"value":56.2, "epoch":1491499263}]
    },
    {
    "tagset": {
    "host": "B"
    },
    "vector": [
    {"value":23.1, “epoch":1491499253},
    {"value":56.2, "epoch":1491499263}]
    }
    ]

    View Slide

  104. Select filters vectors
    database(name:"testdb")
    .select(criteria:`"host" = 'A' and "system" = 'cpu'`)

    View Slide

  105. Select filters vectors
    database(name:"testdb")
    .select(criteria:`"host" = 'A' and "system" = 'cpu'`)
    Tag keys

    View Slide

  106. Select filters vectors
    database(name:"testdb")
    .select(criteria:`"host" = 'A' and "system" = 'cpu'`)
    Tag values

    View Slide

  107. Complex Criteria
    database(name:"testdb")
    .select(criteria:`”t1” = ‘foo’ AND (“t2” = ‘bar’ OR “t3” = ‘asdf’)`)

    View Slide

  108. Criteria Operators
    • =
    • !=
    • =~
    • !~
    • <
    • >
    • startsWith
    • in
    • notIn

    View Slide

  109. What hosts do we have?
    database(name:"testdb")
    .values(key:"host")
    .sort()
    .limit(n:20)

    View Slide

  110. How many hosts?
    database(name:"testdb")
    .values(key:”host")
    .count()

    View Slide

  111. // get the cpu load of hosts that have mysql running
    var db = database(name:"testdb")
    db.select(
    criteria:`"system" = 'cpu' and "metric" = 'load' and "host" in #{
    db.select(`"service" = 'mysql'`).values(key:"host")
    }`)
    .range(startOffset:"-4h")

    View Slide

  112. // get the cpu load of hosts that have mysql running
    var db = database(name:"testdb")
    db.select(
    criteria:`"system" = 'cpu' and "metric" = 'load' and "host" in #{
    db.select(`"service" = 'mysql'`).values(key:"host")
    }`)
    .range(startOffset:"-4h")
    Variables

    View Slide

  113. // get the cpu load of hosts that have mysql running
    var db = database(name:"testdb")
    db.select(
    criteria:`"system" = 'cpu' and "metric" = 'load' and "host" in #{
    db.select(`"service" = 'mysql'`).values(key:"host")
    }`)
    .range(startOffset:"-4h")
    string interpolation

    View Slide

  114. // get the count in 10m periods in the last 24h from an event stream
    // and filter that to only include those periods that were 2 sigma above the average
    var m = database(name:”testdb”).select(criteria:"\"event\" = 'pageview'")
    .range(startOffset:"-24h")
    .merge()
    .window(func:count(),duration:"10m")

    View Slide

  115. // get the count in 10m periods in the last 24h from an event stream
    // and filter that to only include those periods that were 2 sigma above the average
    var m = database(name:”testdb”).select(criteria:”\”event\” = 'pageview'")
    .range(startOffset:"-24h")
    .merge()
    .window(func:count(),duration:"10m")
    // this is shorthand for m.stddev.join(op:"*", right:2)
    var sigma = m.stddev() * 2

    View Slide

  116. // get the count in 10m periods in the last 24h from an event stream
    // and filter that to only include those periods that were 2 sigma above the average
    var m = database(name:”testdb”).select(criteria:”\”event\” = 'pageview'")
    .range(startOffset:"-24h")
    .merge()
    .window(func:count(),duration:"10m")
    // this is shorthand for m.stddev.join(op:"*", right:2)
    var sigma = m.stddev() * 2
    // return only the counts 1 sigma above
    m.filter(exp:"$ > #{sigma}")

    View Slide

  117. // return the last hour of time series of the top 10 host cpu utilizations by
    // their average load over last 10 minutes
    var topTen = db.select(criteria:”\”metric\” = 'load' and system = 'cpu'")
    .range(startOffset:"-10m")
    .mean()
    .sort(func:first())
    .slice(end:10)
    .values(key:"host")
    db.select(criteria:”\"metric\" = 'load' and system = 'cpu' and host in #{topTen}")
    .range(startOffset:"-1h")

    View Slide

  118. Functions
    • interpolate
    • join
    • merge
    • timeShift
    • window
    • rate
    • first, last, min, max, mean, percentile, etc.

    View Slide

  119. Public docs PR in two weeks!
    please to give feedback :)

    View Slide

  120. Thank you.
    Paul Dix
    paul@influxdb.com
    @pauldix

    View Slide