Querying Prometheus with Flux

Paul Dix
August 09, 2018

Talk given at PromCon 2018 where I introduce Flux (#fluxlang) and show how it can be used to query Prometheus servers.

Transcript

  1. Querying Prometheus with Flux
    (#fluxlang)
    Paul Dix

    @pauldix

    paul@influxdata.com

  2. • Data-scripting language

    • Functional

    • MIT Licensed

    • Language & Runtime/Engine

  3. Prometheus users: so what?

  4. High availability?

  5. Sharded Data?

  6. subqueries
    recording rules

  7. Ad hoc exploration

  8. Focus is Strength

  9. Saying No is an Asset

  10. Liberate the silo!

  11. Language Elements

  12. // get all data from the telegraf db
    from(db:"telegraf")
    // filter that by the last hour
    |> range(start:-1h)
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")

  13. // get all data from the telegraf db
    from(db:"telegraf")
    // filter that by the last hour
    |> range(start:-1h)
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Comments

  14. // get all data from the telegraf db
    from(db:"telegraf")
    // filter that by the last hour
    |> range(start:-1h)
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Functions

  15. // get all data from the telegraf db
    from(db:"telegraf")
    // filter that by the last hour
    |> range(start:-1h)
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Pipe forward operator

  16. // get all data from the telegraf db
    from(db:"telegraf")
    // filter that by the last hour
    |> range(start:-1h)
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Named Arguments

  17. // get all data from the telegraf db
    from(db:"telegraf")
    // filter that by the last hour
    |> range(start:-1h)
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    String Literal

  18. // get all data from the telegraf db
    from(db:"telegraf")
    // filter that by the last hour
    |> range(start:-1h)
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Duration Literal (relative time)

  19. // get all data from the telegraf db
    from(db:"telegraf")
    // filter that by everything since an absolute start time
    |> range(start:"2018-08-09T14:00:00Z")
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Time Literal

  20. // get all data from the telegraf db
    from(db:"telegraf")
    // filter that by the last hour
    |> range(start:-1h)
    // filter further by series with a specific measurement and field
    |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
    Anonymous Function

  21. Operators
    + == != ( )
    - < !~ [ ]
    * > =~ { }
    / <= = , :
    % >= <- . |>

  22. Types
    • int

    • uint

    • float64

    • string

    • duration

    • time

    • regex

    • array

    • object

    • function

    • namespace

    • table

    • table stream

  23. Ways to run Flux - (interpreter,
    fluxd api server, InfluxDB 1.7 & 2.0)

  24. Flux builder in Chronograf

  25. Flux builder in Grafana

  26. Flux is about:

  27. Time series in Prometheus

  28. // get data from Prometheus on http://localhost:9090
    fromProm(query:`node_cpu_seconds_total{cpu="0",mode="idle"}`)
    // filter that by the last minute
    |> range(start:-1m)

  29. Multiple time series in
    Prometheus

  30. fromProm(query: `node_cpu_seconds_total{cpu="0",mode=~"idle|user"}`)
    |> range(start:-1m)
    |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])

  31. Tables are the base unit

  32. Not tied to a specific data
    model/schema

  33. Filter function

  34. fromProm()
    |> range(start:-1m)
    |> filter(fn: (r) => r.__name__ == "node_cpu_seconds_total" and
    r.mode == "idle" and
    r.cpu == "0")
    |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])

  35. fromProm()
    |> range(start:-1m)
    |> filter(fn: (r) => r.__name__ == "node_cpu_seconds_total" and
    r.mode in ["idle", "user"] and
    r.cpu == "0")
    |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])
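
The filter step above can be read as a predicate applied to every row. The following is a minimal Python sketch of that idea, not Flux itself: it assumes rows are plain dicts, and the sample rows and the `flux_filter` helper are made up for illustration.

```python
# Hypothetical sample rows standing in for Prometheus samples.
rows = [
    {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "idle", "_value": 10.0},
    {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "user", "_value": 2.0},
    {"__name__": "node_cpu_seconds_total", "cpu": "1", "mode": "idle", "_value": 11.0},
    {"__name__": "node_load1", "cpu": "", "mode": "", "_value": 0.5},
]


def flux_filter(rows, fn):
    """Keep only the rows for which the predicate fn returns True."""
    return [r for r in rows if fn(r)]


# Same predicate as the slide: metric name, mode in a list, and one CPU.
selected = flux_filter(
    rows,
    lambda r: r["__name__"] == "node_cpu_seconds_total"
    and r["mode"] in ["idle", "user"]
    and r["cpu"] == "0",
)
print([r["mode"] for r in selected])
```

Because the predicate is an ordinary function, any boolean expression over a row's columns can be used, which is what lets the slide swap `==` for `in` without changing the rest of the pipeline.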

  36. Aggregate functions

  37. fromProm()
    |> range(start:-30s)
    |> filter(fn: (r) => r.__name__ == "node_cpu_seconds_total" and
    r.mode == "idle" and
    r.cpu =~ /0|1/)
    |> count()
    |> keep(columns: ["name", "cpu", "host", "mode", "_value", "_time"])

  38. _start and _stop are about
    windows of data

  39. fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    |> range(start: -1m)

  40. fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    |> range(start: -1m)
    |> window(every: 20s)

  41. fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    |> range(start: -1m)
    |> window(every: 20s)
    |> min()

  42. fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    |> range(start: -1m)
    |> window(every: 20s)
    |> min()
    |> window(every:inf)

  43. Window converts N tables to M
    tables based on time boundaries
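
As a rough illustration of windowing, here is a Python sketch of the window, min, window(every: inf) sequence from the preceding slides. It assumes a table is a list of `(time, value)` rows with times in seconds; the sample data and the `window` helper are hypothetical and only mimic the behaviour described in the talk.

```python
from collections import defaultdict


def window(table, every):
    """Split one table of (time, value) rows into tables keyed by window start."""
    buckets = defaultdict(list)
    for t, v in table:
        buckets[t - t % every].append((t, v))
    return dict(sorted(buckets.items()))


# Hypothetical one-minute table: one sample every 10 seconds.
table = [(0, 5.0), (10, 4.0), (20, 7.0), (30, 6.0), (40, 9.0), (50, 8.0)]

# window(every: 20s): 1 table becomes 3 tables.
windows = window(table, every=20)

# min(): keep the row with the smallest value in each window.
minima = {start: min(rows, key=lambda r: r[1]) for start, rows in windows.items()}

# window(every: inf): put the per-window results back into a single table.
merged = [minima[start] for start in sorted(minima)]
print(merged)
```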

  44. Group converts N tables to M
    tables based on values

  45. fromProm(query: `node_cpu_seconds_total{cpu=~"0|1",mode="idle"}`)
    |> range(start: -1m)

  46. fromProm(query: `node_cpu_seconds_total{cpu=~"0|1",mode="idle"}`)
    |> range(start: -1m)
    |> group(columns: ["__name__", "mode"])
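
Grouping can be sketched the same way: rows from all input tables are redistributed into new tables according to their values in the chosen columns. This Python sketch assumes rows are dicts and uses a hypothetical `group` helper; it is an illustration of the idea, not the Flux implementation.

```python
from collections import defaultdict


def group(tables, columns):
    """Regroup rows from all input tables by their values in `columns`."""
    out = defaultdict(list)
    for rows in tables:
        for r in rows:
            out[tuple(r[c] for c in columns)].append(r)
    return dict(out)


# Two input tables, one per series (cpu 0 and cpu 1); values are made up.
tables = [
    [{"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "idle", "_value": 10.0}],
    [{"__name__": "node_cpu_seconds_total", "cpu": "1", "mode": "idle", "_value": 11.0}],
]

# Both series share the same name and mode, so 2 tables collapse into 1.
grouped = group(tables, columns=["__name__", "mode"])
print(len(tables), "->", len(grouped))
```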

  47. Nested range vectors
    fromProm(host:"http://localhost:9090")
    |> filter(fn: (r) => r.__name__ == "node_disk_written_bytes_total")
    |> range(start:-1h)
    // transform into non-negative derivative values
    |> derivative()
    // break those out into tables for each 10 minute block of time
    |> window(every:10m)
    // get the max rate of change in each 10 minute window
    |> max()
    // and put everything back into a single table
    |> window(every:inf)
    // and now let’s convert to KB
    |> map(fn: (r) => r._value / 1024.0)
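
To make the steps in this pipeline concrete, here is a hedged Python sketch of the same sequence: rate of change, maximum per window, merge, unit conversion. The counter samples are invented, both helpers are hypothetical, and dropping negative rates is only one possible reading of "non-negative derivative".

```python
def non_negative_derivative(rows):
    """Per-second rate between consecutive (time, value) rows; negative rates are dropped."""
    out = []
    for (t0, v0), (t1, v1) in zip(rows, rows[1:]):
        rate = (v1 - v0) / (t1 - t0)
        if rate >= 0:
            out.append((t1, rate))
    return out


def window_max(rows, every):
    """Largest value per `every`-second window, returned as one merged list."""
    best = {}
    for t, v in rows:
        start = t - t % every
        best[start] = max(best.get(start, v), v)
    return [(start, best[start]) for start in sorted(best)]


# Hypothetical byte counter sampled every 300 s, with a counter reset at t=900.
counter = [(0, 0), (300, 307200), (600, 921600), (900, 0), (1200, 614400)]

rates = non_negative_derivative(counter)           # bytes per second
peaks = window_max(rates, every=600)               # max rate per 10-minute window
kb = [(start, v / 1024.0) for start, v in peaks]   # convert to KB per second
print(kb)
```

Note that `window_max` labels each result with the window start, a simplification chosen here for readability.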

  48. Multiple Servers
    dc1 = fromProm(host:"http://prom.dc1.local:9090")
    |> filter(fn: (r) => r.__name__ == "node_network_receive_bytes_total")
    |> range(start:-1h)
    |> insertGroupKey(key: "dc", value: "1")
    dc2 = fromProm(host:"http://prom.dc2.local:9090")
    |> filter(fn: (r) => r.__name__ == "node_network_receive_bytes_total")
    |> range(start:-1h)
    |> insertGroupKey(key: "dc", value: "2")
    dc1 |> union(streams: [dc2])
    |> limit(n: 2)
    |> derivative()
    |> group(columns: ["dc"])
    |> sum()
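
The core of this multi-server query is tag, union, group, sum. The Python sketch below shows only those four steps and leaves out the limit and derivative stages; the rows, host names, and the `tag` and `union` helpers are all assumptions made for illustration.

```python
def tag(rows, key, value):
    """Return copies of the rows with an extra column, like adding a group key."""
    return [{**r, key: value} for r in rows]


def union(*streams):
    """Concatenate several row streams into one."""
    return [r for s in streams for r in s]


# Hypothetical per-host values fetched from two Prometheus servers.
dc1 = tag([{"host": "a", "_value": 100.0}, {"host": "b", "_value": 50.0}], "dc", "1")
dc2 = tag([{"host": "c", "_value": 70.0}], "dc", "2")

# Group by the "dc" column and sum the values within each group.
totals = {}
for r in union(dc1, dc2):
    totals[r["dc"]] = totals.get(r["dc"], 0.0) + r["_value"]
print(totals)
```

Tagging each stream before the union is what keeps the two data centers distinguishable once their rows are mixed together.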

  49. Work with data from many sources
    • from() // influx

    • fromProm()

    • fromMySQL()

    • fromCSV()

    • fromS3()

    • …

  50. Defining Functions
    fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    |> range(start: -1m)
    |> window(every: 20s)
    |> min()
    |> window(every:inf)

  51. Defining Functions
    windowAgg = (every, fn, <-stream) => {
    return stream |> window(every: every) |> fn() |> window(every:inf)
    }
    fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    |> range(start: -1m)
    |> windowAgg(every:20s, fn: min)
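
The point of `windowAgg` is that the aggregate is passed in as a function. A comparable higher-order function in Python might look like the sketch below, which assumes `(time, value)` rows and a hypothetical `window_agg` helper rather than anything from Flux.

```python
def window_agg(stream, every, fn):
    """Apply fn to the values in each `every`-wide window and return one merged list."""
    windows = {}
    for t, v in stream:
        windows.setdefault(t - t % every, []).append(v)
    return [(start, fn(values)) for start, values in sorted(windows.items())]


# Hypothetical one-minute stream: one sample every 10 seconds.
stream = [(0, 5.0), (10, 4.0), (20, 7.0), (30, 6.0), (40, 9.0), (50, 8.0)]

# The same helper works with any aggregate that takes a list of values.
lows = window_agg(stream, every=20, fn=min)
highs = window_agg(stream, every=20, fn=max)
print(lows)
print(highs)
```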

  52. Packages & Namespaces
    package "flux-helpers"
    windowAgg = (every, fn, <-stream) => {
    return stream |> window(every: every) |> fn() |> window(every:inf)
    }
    // in a new script
    import helpers "github.com/pauldix/flux-helpers"
    fromProm(query: `node_cpu_seconds_total{cpu="0",mode="idle"}`)
    |> range(start: -1m)
    |> helpers.windowAgg(every:20s, fn: min)

  53. Project Status
    • Everything in this talk is prototype (as of 2018-08-09)

    • Proposed Final Language Spec

    • Release flux, fluxd, InfluxDB 1.7, InfluxDB 2.0 alpha

    • Iterate with community to finalize spec

    • Optimizations!

    • https://github.com/influxdata/flux

  54. More complex Flux
    compilations to PromQL?

  55. PromQL parser for Flux
    engine?

  56. Add Flux into Prometheus?

  57. Arrow API for Prometheus

  58. Apache Arrow

  59. Stream from Prometheus

  60. Pushdown matcher and range

  61. Later pushdown more?

  62. Standardized Remote Read
    API?

  63. Arrow is becoming the lingua
    franca in data science and big data

  64. fromProm(query: `{__name__=~"node_.*"}`)
    |> range(start:-1h)
    |> toCSV(file: "node-data.csv")
    |> toFeather(file: "node-data.feather")

  65. Much more work to be done…

  66. Prometheus + Flux =
    Possibilities

  67. Thank you
    Paul Dix

    @pauldix

    paul@influxdata.com
