Querying Prometheus with Flux

Paul Dix
August 09, 2018

Talk given at PromCon 2018 where I introduce Flux (#fluxlang) and show how it can be used to query Prometheus servers.

Transcript

  1. Querying Prometheus with Flux (#fluxlang) Paul Dix @pauldix paul@influxdata.com

  2. None
  3. • Data-scripting language • Functional • MIT Licensed • Language & Runtime/Engine
  4. Prometheus users: so what?

  5. High availability?

  6. Sharded Data?

  7. Federation?

  8. None
  9. None
  10. None
  11. None
  12. subqueries

  13. None
  14. subqueries recording rules

  15. Ad hoc exploration

  16. None
  17. Focus is Strength

  18. Saying No is an Asset

  19. None
  20. Liberate the silo!

  21. None
  22. Language Elements

  23. // get all data from the telegraf db
    from(db:"telegraf")
    // filter that by the last hour
    |> range(start:-1h)
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

  24. Same query, highlighting Comments

  25. Same query, highlighting Functions

  26. Same query, highlighting the Pipe forward operator

  27. Same query, highlighting Named Arguments

  28. Same query, highlighting a String Literal

  29. Same query, highlighting a Duration Literal (relative time)

  30. Same query with |> range(start:"2018-08-09T14:00:00Z"), highlighting a Time Literal

  31. Same query, highlighting an Anonymous Function
  32. Operators + == != ( ) - < !~ [ ] * > =~ { } / <= = , : % >= <- . |>
  33. Types • int • uint • float64 • string • duration • time • regex • array • object • function • namespace • table • table stream
  34. Ways to run Flux - (interpreter, fluxd api server, InfluxDB 1.7 & 2.0)
  35. Flux builder in Chronograf

  36. Flux builder in Grafana

  37. Flux is about:

  38. Time series in Prometheus

  39. None
  40. None
  41. // get data from Prometheus on http://localhost:9090
    fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    // filter that by the last minute
    |> range(start:-1m)
  42. None
  43. None
  44. None
  45. None
  46. None
  47. None
  48. None
  49. Multiple time series in Prometheus

  50. fromProm(query: `node_cpu_seconds_total{cpu="0",mode=~"idle|user"}`)
    |> range(start:-1m)
    |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])
  51. None
  52. None
  53. None
  54. None
  55. None
  56. Tables are the base unit

  57. Not tied to a specific data model/schema

  58. Filter function

  59. fromProm()
    |> range(start:-1m)
    |> filter(fn: (r) => r.__name__ == "node_cpu_seconds_total" and r.mode == "idle" and r.cpu == "0")
    |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])
  60. None
  61. None
  62. None
  63. None
  64. fromProm()
    |> range(start:-1m)
    |> filter(fn: (r) => r.__name__ == "node_cpu_seconds_total" and r.mode in ["idle", "user"] and r.cpu == "0")
    |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])
  65. None
  66. None
  67. None
  68. Aggregate functions

  69. fromProm()
    |> range(start:-30s)
    |> filter(fn: (r) => r.__name__ == "node_cpu_seconds_total" and r.mode == "idle" and r.cpu =~ /0|1/)
    |> count()
    |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])
  70. None
  71. None
  72. None
  73. None
  74. None
  75. None
  76. _start and _stop are about windows of data

  77. fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`) |> range(start: -1m)

  78. None
  79. fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`) |> range(start: -1m) |> window(every: 20s)

  80. None
  81. fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`) |> range(start: -1m) |> window(every: 20s) |> min()

  82. None
  83. fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`) |> range(start: -1m) |> window(every: 20s) |> min() |> window(every:inf)
  84. None
  85. Window converts N tables to M tables based on time boundaries
  86. Group converts N tables to M tables based on values
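
    A minimal sketch (not on the slides) of that window/group contrast, reusing only functions shown elsewhere in this deck; the query string and the 20s interval are illustrative:

    data = fromProm(query: `node_cpu_seconds_total{mode="idle"}`)
        |> range(start: -1m)

    // window: split each input table into new tables at 20s time boundaries
    data |> window(every: 20s)

    // group: regroup the same rows into one table per distinct cpu value, ignoring time
    data |> group(columns: ["cpu"])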

  87. fromProm(query: `node_cpu_seconds_total{cpu=~"0|1",mode="idle"}`) |> range(start: -1m)

  88. None
  89. fromProm(query: `node_cpu_seconds_total{cpu=~"0|1",mode="idle"}`) |> range(start: -1m) |> group(columns: ["__name__", "mode"])

  90. None
  91. None
  92. None
  93. Nested range vectors
    fromProm(host:"http://localhost:9090")
    |> filter(fn: (r) => r.__name__ == "node_disk_written_bytes_total")
    |> range(start:-1h)
    // transform into non-negative derivative values
    |> derivative()
    // break those out into tables for each 10 minute block of time
    |> window(every:10m)
    // get the max rate of change in each 10 minute window
    |> max()
    // and put everything back into a single table
    |> window(every:inf)
    // and now let's convert to KB
    |> map(fn: (r) => r._value / 1024.0)
  94. Multiple Servers
    dc1 = fromProm(host:"http://prom.dc1.local:9090")
        |> filter(fn: (r) => r.__name__ == "node_network_receive_bytes_total")
        |> range(start:-1h)
        |> insertGroupKey(key: "dc", value: "1")
    dc2 = fromProm(host:"http://prom.dc2.local:9090")
        |> filter(fn: (r) => r.__name__ == "node_network_receive_bytes_total")
        |> range(start:-1h)
        |> insertGroupKey(key: "dc", value: "2")
    dc1 |> union(streams: [dc2])
        |> limit(n: 2)
        |> derivative()
        |> group(columns: ["dc"])
        |> sum()
  95. Work with data from many sources • from() // influx • fromProm() • fromMySQL() • fromCSV() • fromS3() • …
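
    A minimal sketch (not on the slides) of combining two of those sources in one pipeline, reusing only from(), fromProm(), and union() as they appear elsewhere in this deck; the database, query, and field names are illustrative:

    influx = from(db: "telegraf")
        |> range(start: -1h)
        |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

    prom = fromProm(query: `node_cpu_seconds_total{mode="idle"}`)
        |> range(start: -1h)

    influx |> union(streams: [prom])
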
  96. Defining Functions
    fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    |> range(start: -1m)
    |> window(every: 20s)
    |> min()
    |> window(every:inf)
  97. Defining Functions
    windowAgg = (every, fn, <-stream) => {
        return stream |> window(every: every) |> fn() |> window(every:inf)
    }
    fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    |> range(start: -1m)
    |> windowAgg(every:20s, fn: min)
  98. Packages & Namespaces
    package "flux-helpers"
    windowAgg = (every, fn, <-stream) => {
        return stream |> window(every: every) |> fn() |> window(every:inf)
    }
    // in a new script
    import helpers "github.com/pauldix/flux-helpers"
    fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    |> range(start: -1m)
    |> helpers.windowAgg(every:20s, fn: min)
  99. Project Status • Everything in this talk is prototype (as of 2018-08-09) • Proposed Final Language Spec • Release flux, fluxd, InfluxDB 1.7, InfluxDB 2.0 alpha • Iterate with community to finalize spec • Optimizations! • https://github.com/influxdata/flux
  100. Future work

  101. More complex Flux compilations to PromQL?

  102. PromQL parser for Flux engine?

  103. Add Flux into Prometheus?

  104. Arrow API for Prometheus

  105. Apache Arrow

  106. Stream from Prometheus

  107. Pushdown matcher and range

  108. Later pushdown more?

  109. Standardized Remote Read API?

  110. Arrow is becoming the lingua franca in data science and big data
  111. fromProm(query: `{__name__=~/node_.*/}`) |> range(start:-1h) |> toCSV(file: "node-data.csv") |> toFeather(file: "node-data.feather")

  112. Much more work to be done…

  113. Prometheus + Flux = Possibilities

  114. Thank you Paul Dix @pauldix paul@influxdata.com