InfluxDB 2.0 and Flux

Paul Dix
March 28, 2019

Presented at CloudConf 2019, this talk is an introduction to InfluxDB 2.0 and the new programming and query language, Flux (#fluxlang).

Transcript

  1. InfluxDB 2.0 and #fluxlang Paul Dix paul@influxdata.com @pauldix

  2. an open source time series database

  3. What is time series data?

  4. Stock trades and quotes

  5. Metrics

  6. Analytics

  7. Events

  8. Sensor data

  9. Two kinds of time series data…

  10. Regular time series t0 t1 t2 t3 t4 t6 t7

    Samples at regular intervals
  11. Irregular time series t0 t1 t2 t3 t4 t6 t7

    Events whenever they come in
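
The difference between the two kinds of series can be sketched in a few lines of Python. This is only an illustration: the function name `is_regular` and the sample timestamps are made up for the example, and timestamps are plain integers (seconds).

```python
def is_regular(timestamps):
    """Return True when consecutive timestamps are all the same distance apart."""
    gaps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(gaps) <= 1


# Samples taken every 10 seconds (a regular series).
samples = [0, 10, 20, 30, 40]
# Events recorded whenever they happened (an irregular series).
events = [0, 3, 4, 19, 40]

samples_regular = is_regular(samples)
events_regular = is_regular(events)
```
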
  12. Data that you ask questions about over time

  13. Solve common problems

  14. data collector

  15. processing, ETL, monitoring, alerting

  16. UI, visualization, management

  17. TICK for time series data

  18. Common Schema

  19. Line Protocol
    cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000

  20. Line Protocol
    cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
    Callout: Measurement (cpu)

  21. Line Protocol
    cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
    Callout: Tags (host=serverA,num=1,region=west)

  22. Line Protocol
    cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
    Callout: Fields (idle=1.667,system=2342.2)

  23. float64, int64, bool, string

  24. Line Protocol
    cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000
    Callout: nanosecond epoch (1492214400000000000)
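
The four parts called out on the line protocol slides can be pulled apart with a short Python sketch. It is not the InfluxDB parser: the name `parse_line` is hypothetical, and the code assumes a line with no escaped characters, no quoted string fields, and only float field values, which is enough for the example point on the slides.

```python
def parse_line(line):
    """Split a simple line protocol point into measurement, tags, fields, timestamp.

    Assumes no escaping and no quoted string fields (a simplification).
    """
    head, field_part, timestamp = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {}
    for pair in field_part.split(","):
        key, value = pair.split("=", 1)
        fields[key] = float(value)  # the example point only carries floats
    return measurement, tags, fields, int(timestamp)


point = parse_line(
    "cpu,host=serverA,num=1,region=west idle=1.667,system=2342.2 1492214400000000000"
)
measurement, tags, fields, timestamp = point
```

Tag values stay strings, while field values are typed, which mirrors the split between tags and fields on the slides.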

  25. Query Language

  26. SQL-ish
    select percentile(value, 90) from cpu where time > now() - 1d group by time(10m)
  27. No Common API

  28. Different Languages for Query & Monitoring

  29. 2.0

  30. • MIT Licensed
    • TSDB (write, query)
    • UI & Visualizations, Dashboards
    • Pull Metrics (Prometheus & OpenMetrics)
    • Tasks (background processing, ETL, monitoring/alerting)
  31. > DB

  32. None
  33. None
  34. None
  35. Officially Supported Client Libraries
    Go, Node.js, Ruby, Python, PHP, Java, C#, C, Kotlin
  36. Visualization Libraries

  37. Data Model
    • Organization
    • Dashboards
    • Tasks
    • Buckets
    • Scrapers & Telegraf configs
    • Labels
    • Users
  38. None
  39. • Query planner
    • Query optimizer
    • Turing complete language, VM, and query engine
    • Multi-language support in Engine
    • Multi-data source support
    • InfluxDB, CLI, REPL, Go library
  40. Flux Language Elements

  41. // get all data from the telegraf db
    from(bucket: "telegraf/autogen")
      // filter that by the last hour
      |> range(start: -1h)
      // filter further by series with a specific measurement and field
      |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

  42. // get all data from the telegraf db
    from(bucket: "telegraf/autogen")
      // filter that by the last hour
      |> range(start: -1h)
      // filter further by series with a specific measurement and field
      |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Callout: Comments

  43. // get all data from the telegraf db
    from(bucket: "telegraf/autogen")
      // filter that by the last hour
      |> range(start: -1h)
      // filter further by series with a specific measurement and field
      |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Callout: Named Arguments

  44. // get all data from the telegraf db
    from(bucket: "telegraf/autogen")
      // filter that by the last hour
      |> range(start: -1h)
      // filter further by series with a specific measurement and field
      |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Callout: String Literals

  45. // get all data from the telegraf db
    from(bucket: "telegraf/autogen")
      // filter that by the last hour
      |> range(start: -1h)
      // filter further by series with a specific measurement and field
      |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Callout: Buckets, not DBs

  46. // get all data from the telegraf db
    from(bucket: "telegraf/autogen")
      // filter that by the last hour
      |> range(start: -1h)
      // filter further by series with a specific measurement and field
      |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Callout: Duration Literal

  47. // get all data from the telegraf db
    from(bucket: "telegraf/autogen")
      // filter that by the last hour
      |> range(start: 2018-11-07T00:00:00Z)
      // filter further by series with a specific measurement and field
      |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Callout: Time Literal

  48. // get all data from the telegraf db
    from(bucket: "telegraf/autogen")
      // filter that by the last hour
      |> range(start: -1h)
      // filter further by series with a specific measurement and field
      |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Callout: Pipe forward operator

  49. // get all data from the telegraf db
    from(bucket: "telegraf/autogen")
      // filter that by the last hour
      |> range(start: -1h)
      // filter further by series with a specific measurement and field
      |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Callout: Anonymous Function

  50. // get all data from the telegraf db
    from(bucket: "telegraf/autogen")
      // filter that by the last hour
      |> range(start: -1h)
      // filter further by series with a specific measurement and field
      |> filter(fn: (r) => (r._measurement == "cpu" or r._measurement == "cpu") and r.host == "serverA")
    Callout: Predicate Function
  51. // variables
    some_int = 23

  52. // variables
    some_int = 23
    some_float = 23.2

  53. // variables
    some_int = 23
    some_float = 23.2
    some_string = "cpu"

  54. // variables
    some_int = 23
    some_float = 23.2
    some_string = "cpu"
    some_duration = 1h

  55. // variables
    some_int = 23
    some_float = 23.2
    some_string = "cpu"
    some_duration = 1h
    some_time = 2018-10-10T19:00:00Z

  56. // variables
    some_int = 23
    some_float = 23.2
    some_string = "cpu"
    some_duration = 1h
    some_time = 2018-10-10T19:00:00Z
    some_array = [1, 6, 20, 22]

  57. // variables
    some_int = 23
    some_float = 23.2
    some_string = "cpu"
    some_duration = 1h
    some_time = 2018-10-10T19:00:00Z
    some_array = [1, 6, 20, 22]
    some_object = {foo: "hello", bar: 22}
  58. // defining a pipe forwardable function
    square = (tables=<-) =>
      tables
        |> map(fn: (r) => ({r with _value: r._value * r._value}))

  59. // defining a pipe forwardable function
    square = (tables=<-) =>
      tables
        |> map(fn: (r) => ({r with _value: r._value * r._value}))
    Callout: accepts pipe-forwarded input and assigns it to the tables variable

  60. // defining a pipe forwardable function
    square = (tables=<-) =>
      tables
        |> map(fn: (r) => ({r with _value: r._value * r._value}))
    from(bucket: "foo")
      |> range(start: -1h)
      |> filter(fn: (r) => r._measurement == "samples")
      |> square()
      |> filter(fn: (r) => r._value > 23.2)

  61. // defining a pipe forwardable function
    square = (tables=<-) =>
      tables
        |> map(fn: (r) => ({r with _value: r._value * r._value}))
    from(bucket: "foo")
      |> range(start: -1h)
      |> filter(fn: (r) => r._measurement == "samples")
      |> square()
      |> filter(fn: (r) => r._value > 23.2)
    Callout: Calling the function
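
For readers more at home in a general-purpose language, the pipe-forward pattern on these slides can be approximated in Python as a chain of functions that each take rows and return rows. This is a loose analogy rather than how Flux is implemented: `pipe`, `square`, and `keep_greater_than` are names invented for the sketch, and a "table" is just a list of dictionaries.

```python
from functools import reduce


def pipe(value, *steps):
    """Feed value through each step in order, like a chain of |> operators."""
    return reduce(lambda acc, step: step(acc), steps, value)


def square(rows):
    """Return new rows with _value squared (the rows themselves are not mutated)."""
    return [{**row, "_value": row["_value"] * row["_value"]} for row in rows]


def keep_greater_than(threshold):
    """Build a step that keeps rows whose _value exceeds threshold."""
    return lambda rows: [row for row in rows if row["_value"] > threshold]


rows = [
    {"_measurement": "samples", "_value": 4.0},
    {"_measurement": "samples", "_value": 5.0},
]
result = pipe(rows, square, keep_greater_than(23.2))
```

Only the second row survives, because 4.0 squared is 16.0 and 5.0 squared is 25.0.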
  62. Data Model & Working with Tables

  63. Example Series
    _measurement=mem,host=A,region=west,_field=free
    _measurement=mem,host=B,region=west,_field=free
    _measurement=cpu,host=A,region=west,_field=usage_system
    _measurement=cpu,host=A,region=west,_field=usage_user

  64. Example Series
    _measurement=mem,host=A,region=west,_field=free
    _measurement=mem,host=B,region=west,_field=free
    _measurement=cpu,host=A,region=west,_field=usage_system
    _measurement=cpu,host=A,region=west,_field=usage_user
    Callout: Measurement

  65. Example Series
    _measurement=mem,host=A,region=west,_field=free
    _measurement=mem,host=B,region=west,_field=free
    _measurement=cpu,host=A,region=west,_field=usage_system
    _measurement=cpu,host=A,region=west,_field=usage_user
    Callout: Field

  66. Table
    _measurement  host  region  _field  _time                _value
    mem           A     west    free    2018-06-14T09:15:00  10
    mem           A     west    free    2018-06-14T09:14:50  10

  67. _measurement  host  region  _field  _time                _value
    mem           A     west    free    2018-06-14T09:15:00  10
    mem           A     west    free    2018-06-14T09:14:50  10
    Callout: Column

  68. _measurement  host  region  _field  _time                _value
    mem           A     west    free    2018-06-14T09:15:00  10
    mem           A     west    free    2018-06-14T09:14:50  10
    Callout: Record

  69. _measurement  host  region  _field  _time                _value
    mem           A     west    free    2018-06-14T09:15:00  10
    mem           A     west    free    2018-06-14T09:14:50  10
    Callout: Group Key
    _measurement=mem,host=A,region=west,_field=free

  70. _measurement  host  region  _field  _time                _value
    mem           A     west    free    2018-06-14T09:15:00  10
    mem           A     west    free    2018-06-14T09:14:50  10
    Callout: Every record has the same value for the group key columns!
    _measurement=mem,host=A,region=west,_field=free

  71. Table Per Series
    _measurement  host  region  _field        _time                _value
    mem           A     west    free          2018-06-14T09:15:00  10
    mem           A     west    free          2018-06-14T09:14:50  11
    _measurement  host  region  _field        _time                _value
    mem           B     west    free          2018-06-14T09:15:00  20
    mem           B     west    free          2018-06-14T09:14:50  22
    _measurement  host  region  _field        _time                _value
    cpu           A     west    usage_user    2018-06-14T09:15:00  45
    cpu           A     west    usage_user    2018-06-14T09:14:50  49
    _measurement  host  region  _field        _time                _value
    cpu           A     west    usage_system  2018-06-14T09:15:00  35
    cpu           A     west    usage_system  2018-06-14T09:14:50  38
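
The "table per series" idea can be modeled in Python by bucketing rows on their group key. This is a sketch of the concept only, not of the Flux engine: `tables_by_series`, `make_row`, and `KEY_COLUMNS` are names chosen for the example, and the rows are the eight records from the slide above.

```python
from collections import defaultdict

# Columns that make up the group key in the slides' examples.
KEY_COLUMNS = ("_measurement", "host", "region", "_field")


def tables_by_series(rows, key_columns=KEY_COLUMNS):
    """Bucket rows into one table (list of rows) per distinct group key."""
    tables = defaultdict(list)
    for row in rows:
        group_key = tuple(row[column] for column in key_columns)
        tables[group_key].append(row)
    return dict(tables)


def make_row(measurement, host, field, time, value):
    """Build one record; every example row is in region west."""
    return {"_measurement": measurement, "host": host, "region": "west",
            "_field": field, "_time": time, "_value": value}


rows = [
    make_row("mem", "A", "free", "2018-06-14T09:15:00", 10),
    make_row("mem", "A", "free", "2018-06-14T09:14:50", 11),
    make_row("mem", "B", "free", "2018-06-14T09:15:00", 20),
    make_row("mem", "B", "free", "2018-06-14T09:14:50", 22),
    make_row("cpu", "A", "usage_user", "2018-06-14T09:15:00", 45),
    make_row("cpu", "A", "usage_user", "2018-06-14T09:14:50", 49),
    make_row("cpu", "A", "usage_system", "2018-06-14T09:15:00", 35),
    make_row("cpu", "A", "usage_system", "2018-06-14T09:14:50", 38),
]
tables = tables_by_series(rows)
```

Eight rows fall into four tables of two rows each, one per series.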
  72. input tables -> function -> output tables

  73. input tables -> function -> output tables
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:50Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> sum()

  74. input tables -> function -> output tables
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:50Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> sum()
    Callout: What to sum on?

  75. input tables -> function -> output tables
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:50Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> sum(columns: ["_value"])
    Callout: Default columns argument

  76. input tables -> function -> output tables
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:50Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> sum()
    Input in table form:
    _measurement  host  region  _field  _time                _value
    mem           A     west    free    2018-06-14T09:15:00  10
    mem           A     west    free    2018-06-14T09:14:50  11
    _measurement  host  region  _field  _time                _value
    mem           B     west    free    2018-06-14T09:15:00  20
    mem           B     west    free    2018-06-14T09:14:50  22

  77. input tables -> function -> output tables
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:50Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> sum()
    Input tables:
    _measurement  host  region  _field  _time                _value
    mem           A     west    free    2018-06-14T09:15:00  10
    mem           A     west    free    2018-06-14T09:14:50  11
    _measurement  host  region  _field  _time                _value
    mem           B     west    free    2018-06-14T09:15:00  20
    mem           B     west    free    2018-06-14T09:14:50  22
    Function: sum()

  78. input tables -> function -> output tables
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:50Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> sum()
    Input tables:
    _measurement  host  region  _field  _time                _value
    mem           A     west    free    2018-06-14T09:15:00  10
    mem           A     west    free    2018-06-14T09:14:50  11
    _measurement  host  region  _field  _time                _value
    mem           B     west    free    2018-06-14T09:15:00  20
    mem           B     west    free    2018-06-14T09:14:50  22
    Function: sum()
    Output tables:
    _measurement  host  region  _field  _time                _value
    mem           A     west    free    2018-06-14T09:1…     21
    _measurement  host  region  _field  _time                _value
    mem           B     west    free    2018-06-14T09:15…    42
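
The "input tables -> function -> output tables" flow for sum() can be imitated in Python: the aggregate runs once per input table and yields one result per table. This is a rough model, not Flux's implementation. The name `sum_tables` is hypothetical, tables are keyed by their group key, and the default `column` stands in for the default columns argument shown on the slide.

```python
def sum_tables(tables, column="_value"):
    """Apply sum to each table separately, giving one total per group key."""
    return {
        group_key: sum(row[column] for row in rows)
        for group_key, rows in tables.items()
    }


# The two input tables from the slides, keyed by their group key.
tables = {
    ("mem", "A", "west", "free"): [
        {"_time": "2018-06-14T09:15:00", "_value": 10},
        {"_time": "2018-06-14T09:14:50", "_value": 11},
    ],
    ("mem", "B", "west", "free"): [
        {"_time": "2018-06-14T09:15:00", "_value": 20},
        {"_time": "2018-06-14T09:14:50", "_value": 22},
    ],
}
totals = sum_tables(tables)
```

Two tables go in and two totals come out (21 for host A, 42 for host B), matching the output tables on the slide.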
  79. N to N table mapping (1 to 1 mapping)

  80. N to M table mapping

  81. window
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:30Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> window(every: 20s)
    Callout: 30s of data (4 samples)

  82. window
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:30Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> window(every: 20s)
    Callout: split into 20s windows

  83. window
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:30Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> window(every: 20s)
    Input:
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:30  10
    mem           A     west    free    …14:40  11
    mem           A     west    free    …14:50  12
    mem           A     west    free    …15:00  13
    _measurement  host  region  _field  _time   _value
    mem           B     west    free    …14:30  20
    mem           B     west    free    …14:40  22
    mem           B     west    free    …14:50  23
    mem           B     west    free    …15:00  24

  84. window
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:30Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> window(every: 20s)
    Input:
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:30  10
    mem           A     west    free    …14:40  11
    mem           A     west    free    …14:50  12
    mem           A     west    free    …15:00  13
    _measurement  host  region  _field  _time   _value
    mem           B     west    free    …14:30  20
    mem           B     west    free    …14:40  22
    mem           B     west    free    …14:50  23
    mem           B     west    free    …15:00  24
    Function: window(every: 20s)

  85. window
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:30Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> window(every: 20s)
    Input:
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:30  10
    mem           A     west    free    …14:40  11
    mem           A     west    free    …14:50  12
    mem           A     west    free    …15:00  13
    _measurement  host  region  _field  _time   _value
    mem           B     west    free    …14:30  20
    mem           B     west    free    …14:40  22
    mem           B     west    free    …14:50  23
    mem           B     west    free    …15:00  24
    Function: window(every: 20s)
    Output:
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:30  10
    mem           A     west    free    …14:40  11
    _measurement  host  region  _field  _time   _value
    mem           B     west    free    …14:50  23
    mem           B     west    free    …15:00  24
    _measurement  host  region  _field  _time   _value
    mem           B     west    free    …14:30  20
    mem           B     west    free    …14:40  22
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:50  12
    mem           A     west    free    …15:00  13

  86. window
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:30Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> window(every: 20s)
    Input:
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:30  10
    mem           A     west    free    …14:40  11
    mem           A     west    free    …14:50  12
    mem           A     west    free    …15:00  13
    _measurement  host  region  _field  _time   _value
    mem           B     west    free    …14:30  20
    mem           B     west    free    …14:40  22
    mem           B     west    free    …14:50  23
    mem           B     west    free    …15:00  24
    Function: window(every: 20s)
    Output:
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:30  10
    mem           A     west    free    …14:40  11
    _measurement  host  region  _field  _time   _value
    mem           B     west    free    …14:50  23
    mem           B     west    free    …15:00  24
    _measurement  host  region  _field  _time   _value
    mem           B     west    free    …14:30  20
    mem           B     west    free    …14:40  22
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:50  12
    mem           A     west    free    …15:00  13
    Callout: N to M tables
  87. Window based on time _start and _stop columns
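
The N-to-M split performed by window can be sketched in Python. The sketch makes two assumptions that are not stated on the slides: windows are counted from the start of the queried range (which reproduces the tables drawn above, while Flux itself may place window boundaries differently), and the window start is appended to each table's group key, in the spirit of the _start column. `window` and `series` are names invented for the example.

```python
from datetime import datetime, timedelta


def window(tables, start, every):
    """Split every table into one sub-table per window of width `every`."""
    out = {}
    for group_key, rows in tables.items():
        for row in rows:
            index = (row["_time"] - start) // every  # which window the row falls in
            window_start = start + index * every
            out.setdefault(group_key + (window_start,), []).append(row)
    return out


start = datetime(2018, 6, 14, 9, 14, 30)


def series(values):
    """Four samples, 10 seconds apart, beginning at the range start."""
    return [{"_time": start + timedelta(seconds=10 * i), "_value": v}
            for i, v in enumerate(values)]


tables = {("mem", "A"): series([10, 11, 12, 13]),
          ("mem", "B"): series([20, 22, 23, 24])}
windowed = window(tables, start, timedelta(seconds=20))
```

Two input tables become four output tables, two per host.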

  88. group
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:30Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> group(keys: ["region"])

  89. group
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:30Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> group(keys: ["region"])
    Callout: new group key

  90. group
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:30Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> group(keys: ["region"])
    Input:
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:30  10
    mem           A     west    free    …14:40  11
    mem           A     west    free    …14:50  12
    mem           A     west    free    …15:00  13
    _measurement  host  region  _field  _time   _value
    mem           B     west    free    …14:30  20
    mem           B     west    free    …14:40  22
    mem           B     west    free    …14:50  23
    mem           B     west    free    …15:00  24
  91. group
    // example query
    from(bucket: "telegraf")
      |> range(start: 2018-06-14T09:14:30Z, stop: 2018-06-14T09:15:01Z)
      |> filter(fn: (r) => r._measurement == "mem" and r._field == "free")
      |> group(keys: ["region"])
    Input:
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:30  10
    mem           A     west    free    …14:40  11
    mem           A     west    free    …14:50  12
    mem           A     west    free    …15:00  13
    _measurement  host  region  _field  _time   _value
    mem           B     west    free    …14:30  20
    mem           B     west    free    …14:40  22
    mem           B     west    free    …14:50  23
    mem           B     west    free    …15:00  24
    Function: group(keys: ["region"])
    Output:
    _measurement  host  region  _field  _time   _value
    mem           A     west    free    …14:30  10
    mem           B     west    free    …14:30  20
    mem           A     west    free    …14:40  11
    mem           B     west    free    …14:40  22
    mem           A     west    free    …14:50  12
    mem           B     west    free    …14:50  23
    mem           A     west    free    …15:00  13
    mem           B     west    free    …15:00  24
    Callout: N to M tables, M == cardinality(group keys)
  92. Group based on columns
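
Regrouping on arbitrary columns can be modeled the same way in Python: pour every row from every input table into a new table chosen by the values of the requested columns. This is a conceptual sketch with invented names (`group`, `series`). Unlike the slide's output table, it does not interleave rows by time; rows stay in input order.

```python
def group(tables, columns):
    """Regroup all rows so each output table shares the same values for `columns`."""
    regrouped = {}
    for rows in tables:
        for row in rows:
            group_key = tuple(row[column] for column in columns)
            regrouped.setdefault(group_key, []).append(row)
    return regrouped


def series(host, values):
    """One input table: four samples for a host in region west."""
    times = ["14:30", "14:40", "14:50", "15:00"]
    return [{"host": host, "region": "west", "_time": t, "_value": v}
            for t, v in zip(times, values)]


input_tables = [series("A", [10, 11, 12, 13]), series("B", [20, 22, 23, 24])]
by_region = group(input_tables, ["region"])
by_host = group(input_tables, ["host"])
```

Grouping by region merges the two input tables into one table of eight rows, because both hosts share the single region value; grouping by host gives two tables again. The number of output tables equals the number of distinct group key values.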

  93. New Language?

  94. 4GL

  95. Domain Specific Languages

  96. JavaScript?

  97. GUI

  98. Many Data Sources

  99. Optimize for each

  100. Cross compilation

  101. None
  102. AST = API

  103. Distributed Engine

  104. None
  105. Tables Everywhere

  106. from(bucket: "foo")
      |> range(start: -10m)
      |> filter(fn: (r) => r._measurement == "cpu")
      |> group(columns: ["_measurement"])
      |> sort(columns: ["_value"])
    Callout: Sorting by value!
  107. Group by anything

  108. Measurements, tags, fields don’t matter

  109. Beyond Queries

  110. option task = {
      name: "email alert digest",
      cron: "0 5 * * 0"
    }
    import "smtp"
    body = ""
    from(bucket: "alerts")
      |> range(start: -24h)
      |> filter(fn: (r) => (r.level == "warn" or r.level == "critical") and r._field == "message")
      |> group(columns: ["alert"])
      |> count()
      |> group()
      |> map(fn: (r) => body = body + "Alert {r.alert} triggered {r._value} times\n")
    smtp.to(
      config: loadSecret(name: "smtp_digest"),
      to: "alerts@influxdata.com",
      title: "Alert digest for {now()}",
      body: message)

  111. option task = {
      name: "email alert digest",
      cron: "0 5 * * 0"
    }
    import "smtp"
    body = ""
    from(bucket: "alerts")
      |> range(start: -24h)
      |> filter(fn: (r) => (r.level == "warn" or r.level == "critical") and r._field == "message")
      |> group(columns: ["alert"])
      |> count()
      |> group()
      |> map(fn: (r) => body = body + "Alert {r.alert} triggered {r._value} times\n")
    smtp.to(
      config: loadSecret(name: "smtp_digest"),
      to: "alerts@influxdata.com",
      title: "Alert digest for {now()}",
      body: message)
    Callout: tasks

  112. option task = {
      name: "email alert digest",
      cron: "0 5 * * 0"
    }
    import "smtp"
    body = ""
    from(bucket: "alerts")
      |> range(start: -24h)
      |> filter(fn: (r) => (r.level == "warn" or r.level == "critical") and r._field == "message")
      |> group(columns: ["alert"])
      |> count()
      |> group()
      |> map(fn: (r) => body = body + "Alert {r.alert} triggered {r._value} times\n")
    smtp.to(
      config: loadSecret(name: "smtp_digest"),
      to: "alerts@influxdata.com",
      title: "Alert digest for {now()}",
      body: message)
    Callout: cron scheduling

  113. option task = {
      name: "email alert digest",
      cron: "0 5 * * 0"
    }
    import "smtp"
    body = ""
    from(bucket: "alerts")
      |> range(start: -24h)
      |> filter(fn: (r) => (r.level == "warn" or r.level == "critical") and r._field == "message")
      |> group(columns: ["alert"])
      |> count()
      |> group()
      |> map(fn: (r) => body = body + "Alert {r.alert} triggered {r._value} times\n")
    smtp.to(
      config: loadSecret(name: "smtp_digest"),
      to: "alerts@influxdata.com",
      title: "Alert digest for {now()}",
      body: message)
    Callout: packages & imports

  114. option task = {
      name: "email alert digest",
      cron: "0 5 * * 0"
    }
    import "smtp"
    body = ""
    from(bucket: "alerts")
      |> range(start: -24h)
      |> filter(fn: (r) => (r.level == "warn" or r.level == "critical") and r._field == "message")
      |> group(columns: ["alert"])
      |> count()
      |> group()
      |> map(fn: (r) => body = body + "Alert {r.alert} triggered {r._value} times\n")
    smtp.to(
      config: loadSecret(name: "smtp_digest"),
      to: "alerts@influxdata.com",
      title: "Alert digest for {now()}",
      body: message)
    Callout: String interpolation

  115. option task = {
      name: "email alert digest",
      cron: "0 5 * * 0"
    }
    import "smtp"
    body = ""
    from(bucket: "alerts")
      |> range(start: -24h)
      |> filter(fn: (r) => (r.level == "warn" or r.level == "critical") and r._field == "message")
      |> group(columns: ["alert"])
      |> count()
      |> group()
      |> map(fn: (r) => body = body + "Alert {r.alert} triggered {r._value} times\n")
    smtp.to(
      config: loadSecret(name: "smtp_digest"),
      to: "alerts@influxdata.com",
      title: "Alert digest for {now()}",
      body: message)
    Callout: Ship data elsewhere

  116. option task = {
      name: "email alert digest",
      cron: "0 5 * * 0"
    }
    import "smtp"
    body = ""
    from(bucket: "alerts")
      |> range(start: -24h)
      |> filter(fn: (r) => (r.level == "warn" or r.level == "critical") and r._field == "message")
      |> group(columns: ["alert"])
      |> count()
      |> group()
      |> map(fn: (r) => body = body + "Alert {r.alert} triggered {r._value} times\n")
    smtp.to(
      config: loadSecret(name: "smtp_digest"),
      to: "alerts@influxdata.com",
      title: "Alert digest for {now()}",
      body: message)
    Callout: Store secrets in a store like Vault
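
The data flow inside the digest task (filter, count per alert, build one message body) can be traced with a small Python sketch. It only mirrors the logic on the slide and sends no email. The function name `digest_body` and the sample records are hypothetical, and alerts are listed in alphabetical order, which the slide does not specify.

```python
from collections import Counter


def digest_body(records):
    """Count warn/critical message records per alert and format one line each."""
    wanted = [
        record for record in records
        if record["level"] in ("warn", "critical") and record["_field"] == "message"
    ]
    counts = Counter(record["alert"] for record in wanted)
    return "".join(
        f"Alert {name} triggered {count} times\n"
        for name, count in sorted(counts.items())
    )


# Made-up records standing in for the last 24h of the "alerts" bucket.
records = [
    {"alert": "cpu_high", "level": "warn", "_field": "message"},
    {"alert": "cpu_high", "level": "critical", "_field": "message"},
    {"alert": "disk_full", "level": "warn", "_field": "message"},
    {"alert": "disk_full", "level": "info", "_field": "message"},  # filtered out by level
    {"alert": "cpu_high", "level": "warn", "_field": "count"},     # filtered out by field
]
body = digest_body(records)
```
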
  117. Monitoring as Code

  118. None
  119. Status
    • Finalizing Spec
    • Error Handling
    • Test Runner & CLI
    • User Packages
    • Flow Control (if/else)
  120. Status
    • Alpha 7 this week
    • API, Tasks, Dashboards
    • Client Libraries (soon)
    • Monitoring & Alerting (soon)
  121. https://influxdata.com/download 2.0

  122. Thank you Paul Dix @pauldix paul@influxdata.com