Slide 1

Slide 1 text

Flux (#fluxlang): a new (time series) data scripting language
Paul Dix
@pauldix
paul@influxdata.com

Slide 2

Slide 2 text

IFQL -> Flux

Slide 3

Slide 3 text

Data scripting language?

Slide 4

Slide 4 text

MIT License Language & Engine written in Go

Slide 5

Slide 5 text

Talk Structure
• Why Flux?
• Design & Structure
• Motivating Examples

Slide 6

Slide 6 text

Why not SQL?

Slide 7

Slide 7 text

Relational Algebra

Slide 8

Slide 8 text

SQL isn’t the only interpretation!

Slide 9

Slide 9 text

QUEL & POSTGRESQUEL

range of E is EMPLOYEE
retrieve into W (COMP = E.Salary / (E.Age - 18))
where E.Name = "Jones"

SQL

select (e.salary / (e.age - 18)) as comp
from employee as e
where e.name = "Jones"

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

Inertia

Slide 12

Slide 12 text

Additions & Semantics

Slide 13

Slide 13 text

Functional FTW!

Slide 14

Slide 14 text

Rethink Programmer Productivity

Slide 15

Slide 15 text

Language > Query

Slide 16

Slide 16 text

Change Reality

Slide 17

Slide 17 text

Existing Language?

Slide 18

Slide 18 text

Haskell or Lisp!

Slide 19

Slide 19 text

Flux Design Principles

Slide 20

Slide 20 text

Useable

Slide 21

Slide 21 text

Make Everyone a Data Programmer!

Slide 22

Slide 22 text

Readable

Slide 23

Slide 23 text

Flexible

Slide 24

Slide 24 text

Composable

Slide 25

Slide 25 text

Testable

Slide 26

Slide 26 text

Contributable

Slide 27

Slide 27 text

Shareable

Slide 28

Slide 28 text

Beginning Examples

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

showMeasurements(db: "telegraf")

Slide 32

Slide 32 text

showMeasurements(db: "telegraf")
Callout: Function

Slide 33

Slide 33 text

showMeasurements(db: "telegraf")
Callout: Named Argument

Slide 34

Slide 34 text

showMeasurements(db: "telegraf")
Callout: String Literal

Slide 35

Slide 35 text

showTagKeys(db: "telegraf", measurement: "cpu")

Slide 36

Slide 36 text

showTagKeys(db: "telegraf", measurement: "cpu")
Callout: Named Arguments

Slide 37

Slide 37 text

showTagKeys(db: "telegraf", measurements: ["redis", "mysql"])

Slide 38

Slide 38 text

showTagKeys(db: "telegraf", measurements: ["redis", "mysql"])
Callout: Passing an array

Slide 39

Slide 39 text

showTagValues(db: "telegraf", tag: "host")

Slide 40

Slide 40 text

showFieldKeys(db:"telegraf", measurement:"cpu")

Slide 41

Slide 41 text

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: r => r._measurement == "cpu" and r._field == "usage_system")

Slide 42

Slide 42 text

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: r => r._measurement == "cpu" and r._field == "usage_system")
Callout: Comments

Slide 43

Slide 43 text

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: r => r._measurement == "cpu" and r._field == "usage_system")
Callout: Duration Literal

Slide 44

Slide 44 text

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: r => r._measurement == "cpu" and r._field == "usage_system")
Callout: Pipe forward operator

Slide 45

Slide 45 text

// get all data from the telegraf db
from(db:"telegraf")
  // filter that by the last hour
  |> range(start:-1h)
  // filter further by series with a specific measurement and field
  |> filter(fn: r => r._measurement == "cpu" and r._field == "usage_system")
Callout: Anonymous Function

Slide 46

Slide 46 text

Operators
+   ==   !=   (   )
-   <    !~   [   ]
*   >    =~   {   }
/   <=   =    ,   :
%   >=   <-   .   |>

Slide 47

Slide 47 text

Types
• int
• uint
• float64
• string
• duration
• time
• regex
• array
• object
• function
• namespace

Slide 48

Slide 48 text

Functions Overview

Slide 49

Slide 49 text

Inputs
from, fromKafka, fromFile, fromS3, fromPrometheus, fromMySQL, etc.

Slide 50

Slide 50 text

Outputs
to, toKafka, toFile, toS3, toPrometheus, toMySQL, etc.

Slide 51

Slide 51 text

Functions • count • covariance • cumulativeSum • derivative • difference • distinct • filter • first • from • group • integral • mean • min • percentile • range • sample • set • shift • skew • sort • spread • stateTracking • limit • map • max • window • yield • cov • highestMax • highestAverage • highestCurrent • lowestMin • join • last • stddev • sum • lowestAverage • lowestCurrent • pearsonR • stateCount • stateDuration • top • bottom

Slide 52

Slide 52 text

Flux ⊇ Graphite

Slide 53

Slide 53 text

Data Model

Slide 54

Slide 54 text

Example Series
_measurement=mem,host=A,region=west,_field=free
_measurement=mem,host=B,region=west,_field=free
_measurement=cpu,host=A,region=west,_field=usage_system
_measurement=cpu,host=A,region=west,_field=usage_user

Slide 55

Slide 55 text

Example Series
_measurement=mem,host=A,region=west,_field=free
_measurement=mem,host=B,region=west,_field=free
_measurement=cpu,host=A,region=west,_field=usage_system
_measurement=cpu,host=A,region=west,_field=usage_user
Callout: Measurement

Slide 56

Slide 56 text

Example Series
_measurement=mem,host=A,region=west,_field=free
_measurement=mem,host=B,region=west,_field=free
_measurement=cpu,host=A,region=west,_field=usage_system
_measurement=cpu,host=A,region=west,_field=usage_user
Callout: Field

Slide 57

Slide 57 text

Table
_measurement  host  region  _field  _time                _value
mem           A     west    free    2018-06-14T09:15:00  10
mem           A     west    free    2018-06-14T09:14:50  10

Slide 58

Slide 58 text

_measurement  host  region  _field  _time                _value
mem           A     west    free    2018-06-14T09:15:00  10
mem           A     west    free    2018-06-14T09:14:50  10
Callout: Column

Slide 59

Slide 59 text

_measurement  host  region  _field  _time                _value
mem           A     west    free    2018-06-14T09:15:00  10
mem           A     west    free    2018-06-14T09:14:50  10
Callout: Record

Slide 60

Slide 60 text

_measurement  host  region  _field  _time                _value
mem           A     west    free    2018-06-14T09:15:00  10
mem           A     west    free    2018-06-14T09:14:50  10
Group Key: _measurement=mem,host=A,region=west,_field=free

Slide 61

Slide 61 text

_measurement  host  region  _field  _time                _value
mem           A     west    free    2018-06-14T09:15:00  10
mem           A     west    free    2018-06-14T09:14:50  10
Every record has the same value for the group key columns!
_measurement=mem,host=A,region=west,_field=free

Slide 62

Slide 62 text

Table Per Series

_measurement  host  region  _field  _time                _value
mem           A     west    free    2018-06-14T09:15:00  10
mem           A     west    free    2018-06-14T09:14:50  11

_measurement  host  region  _field  _time                _value
mem           B     west    free    2018-06-14T09:15:00  20
mem           B     west    free    2018-06-14T09:14:50  22

_measurement  host  region  _field      _time                _value
cpu           A     west    usage_user  2018-06-14T09:15:00  45
cpu           A     west    usage_user  2018-06-14T09:14:50  49

_measurement  host  region  _field        _time                _value
cpu           A     west    usage_system  2018-06-14T09:15:00  35
cpu           A     west    usage_system  2018-06-14T09:14:50  38
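
The table-per-series layout can be modelled in a few lines of Python. This is a minimal sketch for illustration, not how the Flux engine stores data: the names ROWS, RECORDS and tables_by_group_key are hypothetical, and the rows are copied from the slide.

```python
from collections import defaultdict

# Columns that make up the group key on the slide.
GROUP_KEY = ("_measurement", "host", "region", "_field")

# Rows copied from the slide: (measurement, host, region, field, time, value).
ROWS = [
    ("mem", "A", "west", "free", "2018-06-14T09:15:00", 10),
    ("mem", "A", "west", "free", "2018-06-14T09:14:50", 11),
    ("mem", "B", "west", "free", "2018-06-14T09:15:00", 20),
    ("mem", "B", "west", "free", "2018-06-14T09:14:50", 22),
    ("cpu", "A", "west", "usage_user", "2018-06-14T09:15:00", 45),
    ("cpu", "A", "west", "usage_user", "2018-06-14T09:14:50", 49),
    ("cpu", "A", "west", "usage_system", "2018-06-14T09:15:00", 35),
    ("cpu", "A", "west", "usage_system", "2018-06-14T09:14:50", 38),
]
COLUMNS = GROUP_KEY + ("_time", "_value")
RECORDS = [dict(zip(COLUMNS, row)) for row in ROWS]


def tables_by_group_key(records, key_columns=GROUP_KEY):
    """Split flat records into one table (list of records) per group key."""
    tables = defaultdict(list)
    for record in records:
        key = tuple(record[column] for column in key_columns)
        tables[key].append(record)
    return dict(tables)


tables = tables_by_group_key(RECORDS)
for key, rows in tables.items():
    print(key, [r["_value"] for r in rows])
```

Eight records with four distinct group keys yield four tables, one per series.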

Slide 63

Slide 63 text

input tables -> function -> output tables

Slide 64

Slide 64 text

input tables -> function -> output tables

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:50, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> sum()

Slide 65

Slide 65 text

input tables -> function -> output tables

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:50, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> sum()
Callout: DateTime Literal

Slide 66

Slide 66 text

input tables -> function -> output tables

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:50, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> sum()
Callout: What to sum on?

Slide 67

Slide 67 text

input tables -> function -> output tables

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:50, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> sum(columns: ["_value"])
Callout: Default columns argument

Slide 68

Slide 68 text

input tables -> function -> output tables

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:50, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> sum()

Input in table form:

_measurement  host  region  _field  _time              _value
mem           A     west    free    2018-06-14T09:1…   10
mem           A     west    free    2018-06-14T09:1…   11

_measurement  host  region  _field  _time              _value
mem           B     west    free    2018-06-14T09:15…  20
mem           B     west    free    2018-06-14T09:14…  22

Slide 69

Slide 69 text

input tables -> function -> output tables

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:50, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> sum()

_measurement  host  region  _field  _time              _value
mem           A     west    free    2018-06-14T09:1…   10
mem           A     west    free    2018-06-14T09:1…   11

_measurement  host  region  _field  _time              _value
mem           B     west    free    2018-06-14T09:15…  20
mem           B     west    free    2018-06-14T09:14…  22

-> sum()

Slide 70

Slide 70 text

input tables -> function -> output tables

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:50, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> sum()

Input tables:

_measurement  host  region  _field  _time              _value
mem           A     west    free    2018-06-14T09:1…   10
mem           A     west    free    2018-06-14T09:1…   11

_measurement  host  region  _field  _time              _value
mem           B     west    free    2018-06-14T09:15…  20
mem           B     west    free    2018-06-14T09:14…  22

-> sum()

Output tables:

_measurement  host  region  _field  _time              _value
mem           A     west    free    2018-06-14T09:1…   21

_measurement  host  region  _field  _time              _value
mem           B     west    free    2018-06-14T09:15…  42

Slide 71

Slide 71 text

N to N table mapping (1 to 1 mapping)
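
The N to N mapping of an aggregate such as sum() can be sketched in Python as follows. This is a toy model under stated assumptions: the sum_tables helper and the dictionary layout (group key mapped to the _value column) are hypothetical, and only the values shown on the earlier slide are used.

```python
# Hypothetical input: one table per series, keyed by (measurement, host),
# holding the _value column from the slide.
input_tables = {
    ("mem", "A"): [10, 11],
    ("mem", "B"): [20, 22],
}


def sum_tables(tables):
    """Apply sum() to each table independently: N tables in, N tables out."""
    return {key: [sum(values)] for key, values in tables.items()}


output_tables = sum_tables(input_tables)
print(output_tables)
```

Each output table keeps the group key of the input table it came from and holds a single aggregated row.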

Slide 72

Slide 72 text

N to M table mapping

Slide 73

Slide 73 text

window

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:30, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> window(every:20s)
Callout (range): 30s of data (4 samples)

Slide 74

Slide 74 text

window

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:30, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> window(every:20s)
Callout (window): split into 20s windows

Slide 75

Slide 75 text

window

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:30, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> window(every:20s)

Input:

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:30  10
mem           A     west    free    …14:40  11
mem           A     west    free    …14:50  12
mem           A     west    free    …15:00  13

_measurement  host  region  _field  _time   _value
mem           B     west    free    …14:30  20
mem           B     west    free    …14:40  22
mem           B     west    free    …14:50  23
mem           B     west    free    …15:00  24

Slide 76

Slide 76 text

window

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:30, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> window(every:20s)

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:30  10
mem           A     west    free    …14:40  11
mem           A     west    free    …14:50  12
mem           A     west    free    …15:00  13

_measurement  host  region  _field  _time   _value
mem           B     west    free    …14:30  20
mem           B     west    free    …14:40  22
mem           B     west    free    …14:50  23
mem           B     west    free    …15:00  24

-> window(every:20s)

Slide 77

Slide 77 text

window

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:30, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> window(every:20s)

Input tables:

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:30  10
mem           A     west    free    …14:40  11
mem           A     west    free    …14:50  12
mem           A     west    free    …15:00  13

_measurement  host  region  _field  _time   _value
mem           B     west    free    …14:30  20
mem           B     west    free    …14:40  22
mem           B     west    free    …14:50  23
mem           B     west    free    …15:00  24

-> window(every:20s)

Output tables:

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:30  10
mem           A     west    free    …14:40  11

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:50  12
mem           A     west    free    …15:00  13

_measurement  host  region  _field  _time   _value
mem           B     west    free    …14:30  20
mem           B     west    free    …14:40  22

_measurement  host  region  _field  _time   _value
mem           B     west    free    …14:50  23
mem           B     west    free    …15:00  24

Slide 78

Slide 78 text

window

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:30, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> window(every:20s)

Input tables:

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:30  10
mem           A     west    free    …14:40  11
mem           A     west    free    …14:50  12
mem           A     west    free    …15:00  13

_measurement  host  region  _field  _time   _value
mem           B     west    free    …14:30  20
mem           B     west    free    …14:40  22
mem           B     west    free    …14:50  23
mem           B     west    free    …15:00  24

-> window(every:20s)

Output tables:

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:30  10
mem           A     west    free    …14:40  11

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:50  12
mem           A     west    free    …15:00  13

_measurement  host  region  _field  _time   _value
mem           B     west    free    …14:30  20
mem           B     west    free    …14:40  22

_measurement  host  region  _field  _time   _value
mem           B     west    free    …14:50  23
mem           B     west    free    …15:00  24

N to M tables

Slide 79

Slide 79 text

Window based on time _start and _stop columns
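
A rough Python model of windowing, using the samples from the preceding slides. It is a sketch under assumptions, not the engine's algorithm: the window helper is hypothetical, and windows are anchored at the start of the queried range so that the result matches the slide (a real engine may align window boundaries differently). Each output table is identified by its original key plus the window's _start and _stop.

```python
from datetime import datetime, timedelta

START = datetime(2018, 6, 14, 9, 14, 30)


def at(seconds):
    """Timestamp `seconds` after the start of the queried range."""
    return START + timedelta(seconds=seconds)


# Samples from the slide: one table per host, rows of (_time, _value).
input_tables = {
    "A": [(at(0), 10), (at(10), 11), (at(20), 12), (at(30), 13)],
    "B": [(at(0), 20), (at(10), 22), (at(20), 23), (at(30), 24)],
}


def window(tables, start, every):
    """Split each table into one table per (group key, _start, _stop)."""
    out = {}
    for key, rows in tables.items():
        for time, value in rows:
            index = (time - start) // every  # which window the row falls in
            w_start = start + index * every
            w_stop = w_start + every
            out.setdefault((key, w_start, w_stop), []).append((time, value))
    return out


windows = window(input_tables, START, timedelta(seconds=20))
for (host, w_start, w_stop), rows in windows.items():
    print(host, w_start.time(), w_stop.time(), [v for _, v in rows])
```

Two input tables become four output tables, which is the N to M mapping shown on the slides.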

Slide 80

Slide 80 text

group

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:30, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> group(keys:["region"])

Slide 81

Slide 81 text

group

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:30, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> group(keys:["region"])
Callout: new partition key

Slide 82

Slide 82 text

group

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:30, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> group(keys:["region"])

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:30  10
mem           A     west    free    …14:40  11
mem           A     west    free    …14:50  12
mem           A     west    free    …15:00  13

_measurement  host  region  _field  _time   _value
mem           B     west    free    …14:30  20
mem           B     west    free    …14:40  22
mem           B     west    free    …14:50  23
mem           B     west    free    …15:00  24

Slide 83

Slide 83 text

group

// example query
from(db:"telegraf")
  |> range(start:2018-06-14T09:14:30, end:2018-06-14T09:15:01)
  |> filter(fn: r => r._measurement == "mem" and r._field == "free")
  |> group(keys:["region"])

Input tables:

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:30  10
mem           A     west    free    …14:40  11
mem           A     west    free    …14:50  12
mem           A     west    free    …15:00  13

_measurement  host  region  _field  _time   _value
mem           B     west    free    …14:30  20
mem           B     west    free    …14:40  22
mem           B     west    free    …14:50  23
mem           B     west    free    …15:00  24

-> group(keys: ["region"])

Output table:

_measurement  host  region  _field  _time   _value
mem           A     west    free    …14:30  10
mem           B     west    free    …14:30  20
mem           A     west    free    …14:40  11
mem           B     west    free    …14:40  22
mem           A     west    free    …14:50  12
mem           B     west    free    …14:50  23
mem           A     west    free    …15:00  13
mem           B     west    free    …15:00  24

N to M tables
M == cardinality(group keys)

Slide 84

Slide 84 text

Group based on columns
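
Regrouping can be modelled the same way. The following Python sketch is illustrative only: the group and to_records helpers are hypothetical, rows are sorted by time purely to mirror the slide's output, and the data is the two host tables from the preceding slides.

```python
# Input tables from the slide, keyed by host; rows are (_time, _value).
input_tables = {
    "A": [("14:30", 10), ("14:40", 11), ("14:50", 12), ("15:00", 13)],
    "B": [("14:30", 20), ("14:40", 22), ("14:50", 23), ("15:00", 24)],
}


def to_records(tables):
    """Flatten the per-host tables into records with explicit columns."""
    return [
        {"host": host, "region": "west", "_time": time, "_value": value}
        for host, rows in tables.items()
        for time, value in rows
    ]


def group(records, keys):
    """Regroup records so each distinct combination of `keys` is one table."""
    out = {}
    for record in records:
        out.setdefault(tuple(record[k] for k in keys), []).append(record)
    for rows in out.values():
        rows.sort(key=lambda r: r["_time"])  # stable: ties keep input order
    return out


by_region = group(to_records(input_tables), keys=["region"])
by_host = group(to_records(input_tables), keys=["host"])
print(len(by_region), len(by_host))
```

Grouping by region collapses both hosts into one table because every record has region=west, while grouping by host gives two tables: the number of output tables equals the number of distinct group key values.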

Slide 85

Slide 85 text

Composable & Flexible

Slide 86

Slide 86 text

showTagValues(db: "telegraf", tag: "host")

Slide 87

Slide 87 text

New argument, same function definition

showTagValues(db: "telegraf", tag: "host", startTime: 2018-06-14T09:15:00)

Slide 88

Slide 88 text

showTagValues = (db, tag, start=-1h, stop=now(), predicate=(r) => true) =>
  from(db:db)
    |> range(start:start, stop:stop)
    |> filter(fn: predicate)
    |> group(by:[tag])
    // get the distinct values for the tag
    |> distinct(column:tag)
    // collapse all tables into one
    |> group(none:true)
    // drop all columns except _value
    |> keep(columns: ["_value"])

Slide 89

Slide 89 text

showTagValues = (db, tag, start=-1h, stop=now(), predicate=(r) => true) =>
  from(db:db)
    |> range(start:start, stop:stop)
    |> filter(fn: predicate)
    |> group(by:[tag])
    // get the distinct values for the tag
    |> distinct(column:tag)
    // collapse all tables into one
    |> group(none:true)
    // drop all columns except _value
    |> keep(columns: ["_value"])
Callout: Assign function to variable

Slide 90

Slide 90 text

showTagValues = (db, tag, start=-1h, stop=now(), predicate=(r) => true) =>
  from(db:db)
    |> range(start:start, stop:stop)
    |> filter(fn: predicate)
    |> group(by:[tag])
    // get the distinct values for the tag
    |> distinct(column:tag)
    // collapse all tables into one
    |> group(none:true)
    // drop all columns except _value
    |> keep(columns: ["_value"])
Callout: Specify default argument value to make optional

Slide 91

Slide 91 text

showTagValues = (db, tag, start=-1h, stop=now(), predicate=(r) => true) =>
  from(db:db)
    |> range(start:start, stop:stop)
    |> filter(fn: predicate)
    |> group(by:[tag])
    // get the distinct values for the tag
    |> distinct(column:tag)
    // collapse all tables into one
    |> group(none:true)
    // drop all columns except _value
    |> keep(columns: ["_value"])
Callout: now function

Slide 92

Slide 92 text

showTagValues = (db, tag, start=-1h, stop=now(), predicate=(r) => true) =>
  from(db:db)
    |> range(start:start, stop:stop)
    |> filter(fn: predicate)
    |> group(by:[tag])
    // get the distinct values for the tag
    |> distinct(column:tag)
    // collapse all tables into one
    |> group(none:true)
    // drop all columns except _value
    |> keep(columns: ["_value"])
Callout: pass function as argument

Slide 93

Slide 93 text

showTagValues = (db, tag, start=-1h, stop=now(), predicate=(r) => true) =>
  from(db:db)
    |> range(start:start, stop:stop)
    |> filter(fn: predicate)
    |> group(by:[tag])
    // get the distinct values for the tag
    |> distinct(column:tag)
    // collapse all tables into one
    |> group(none:true)
    // drop all columns except _value
    |> keep(columns: ["_value"])
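
The two ideas in this definition, optional arguments via default values and a predicate passed in as a function, can be illustrated with a small Python analogue. This is a sketch, not a port: show_tag_values and the RECORDS data are hypothetical, and the time range arguments are left out to keep it short.

```python
# Hypothetical records; only the columns needed for the example.
RECORDS = [
    {"_measurement": "cpu", "host": "A"},
    {"_measurement": "cpu", "host": "B"},
    {"_measurement": "redis", "host": "B"},
    {"_measurement": "redis", "host": "C"},
]


def show_tag_values(records, tag, predicate=lambda r: True):
    """Return the distinct values of `tag` among records passing `predicate`."""
    return sorted({r[tag] for r in records if tag in r and predicate(r)})


# The default predicate accepts every record.
all_hosts = show_tag_values(RECORDS, tag="host")
# Passing a predicate narrows the search without changing the function.
redis_hosts = show_tag_values(
    RECORDS, tag="host", predicate=lambda r: r["_measurement"] == "redis"
)
print(all_hosts, redis_hosts)
```

Callers that omit the predicate get every host; callers that supply one reuse the same definition with a narrower filter.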

Slide 94

Slide 94 text

showTagValues(
  db:"telegraf",
  tag:"host",
  predicate: (r) => r._measurement == "redis")

Slide 95

Slide 95 text

Defining functions that take inputs

// convert all values into floats
from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> map(fn: (r) => float(v:r._value))

Slide 96

Slide 96 text

Defining functions that take inputs

// convert all values into floats
from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> map(fn: (r) => float(v:r._value))
Callout: map function

Slide 97

Slide 97 text

Defining functions that take inputs

// convert all values into floats
from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> map(fn: (r) => float(v:r._value))
Callout: float function

Slide 98

Slide 98 text

Defining functions that take inputs

// convert all values into floats
from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> map(fn: (r) => float(v:r._value))
Callout: only named arguments!

Slide 99

Slide 99 text

Defining functions that take inputs

// convert all values into floats
from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> map(fn: (r) => float(v:r._value))
Callout: make this a function?

Slide 100

Slide 100 text

Defining functions that take inputs

castToFloat = (table=<-) => {
  return table
    |> map(fn: (r) => float(v:r._value))
}
Callout: user defined pipe forwardable function

Slide 101

Slide 101 text

Defining functions that take inputs

// calling it
from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> castToFloat()

Slide 102

Slide 102 text

Defining functions that take inputs

// convert all values into floats
from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> map(fn: (r) => float(v:r._value))

castToFloat = (table=<-) => {
  return table
    |> map(fn: (r) => float(v:r._value))
}

from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> castToFloat()

Slide 103

Slide 103 text

Any pipe forward function can use arguments

min(table:
  range(start: -1h, table:
    filter(fn: (r) => r.host == "server0", table:
      from(db: "mydb"))))
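
The equivalence between a pipe-forward chain and nested calls that receive the table as an explicit argument can be shown with a small Python analogue. This is a sketch only: pipe, filter_rows and min_value are hypothetical stand-ins, not Flux functions.

```python
from functools import reduce


def pipe(source, *steps):
    """Feed `source` through each step in order, like a chain of |>."""
    return reduce(lambda table, step: step(table), steps, source)


# Hypothetical stand-ins for filter() and min() that take the table
# as an explicit argument.
def filter_rows(table, fn):
    return [row for row in table if fn(row)]


def min_value(table):
    return min(row["_value"] for row in table)


rows = [
    {"host": "server0", "_value": 7},
    {"host": "server0", "_value": 3},
    {"host": "server1", "_value": 1},
]

# Nested form: the innermost call runs first.
nested = min_value(filter_rows(rows, fn=lambda r: r["host"] == "server0"))

# Piped form: the same steps, written in the order they run.
piped = pipe(
    rows,
    lambda t: filter_rows(t, fn=lambda r: r["host"] == "server0"),
    min_value,
)
print(nested, piped)
```

Both forms compute the same value; the piped form simply reads top to bottom in execution order.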

Slide 104

Slide 104 text

Make you a Lisp

Slide 105

Slide 105 text

New Query Functionality finally getting to those feature requests!

Slide 106

Slide 106 text

No content

Slide 107

Slide 107 text

Math across measurements

foo = from(db: "mydb")
  |> filter(fn: (r) => r._measurement == "foo")
  |> range(start: -1h)

bar = from(db: "mydb")
  |> filter(fn: (r) => r._measurement == "bar")
  |> range(start: -1h)

join(
  tables: {foo:foo, bar:bar},
  on: ["foobar", "_time"],
  fn: (t) => t.foo._value + t.bar._value,
)
  |> yield(name: "foobar")
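
What the join does with matching rows can be approximated in Python. The sketch below makes assumptions the slide does not state: it joins on _time only, it behaves as an inner join (timestamps missing from either side are dropped), and the join_on_time helper and sample values are hypothetical.

```python
# Hypothetical series: rows of (_time, _value) for two measurements.
foo = [("09:00", 1.0), ("09:10", 2.0), ("09:20", 3.0)]
bar = [("09:00", 10.0), ("09:10", 20.0), ("09:30", 40.0)]


def join_on_time(left, right, fn):
    """Inner-join two series on _time and combine matching values with fn."""
    right_by_time = dict(right)
    return [
        (time, fn(value, right_by_time[time]))
        for time, value in left
        if time in right_by_time
    ]


foobar = join_on_time(foo, bar, fn=lambda f, b: f + b)
print(foobar)
```

Only the timestamps present in both series survive, and each surviving row carries the combined value.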

Slide 108

Slide 108 text

No content

Slide 109

Slide 109 text

Having Query

from(db:"mydb")
  |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system")
  |> range(start:-1h)
  |> window(every:10m)
  |> mean()
  // this is the having part
  |> filter(fn: (r) => r._value > 90)
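
The "having" pattern is just a filter applied after an aggregate rather than before it. A minimal Python sketch, with hypothetical window data and a hypothetical having helper:

```python
from statistics import mean

# Hypothetical 10-minute windows of usage_system samples, keyed by window start.
windows = {
    "09:00": [85.0, 88.0, 91.0],
    "09:10": [93.0, 95.0, 97.0],
    "09:20": [40.0, 42.0, 41.0],
}


def having(windows, aggregate, predicate):
    """Aggregate each window, then keep only aggregates passing predicate."""
    aggregated = {start: aggregate(values) for start, values in windows.items()}
    return {start: value for start, value in aggregated.items() if predicate(value)}


hot = having(windows, aggregate=mean, predicate=lambda v: v > 90)
print(hot)
```

The first window contains a sample above 90 but its mean is 88, so it is dropped: the predicate sees the aggregated value, not the raw samples.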

Slide 110

Slide 110 text

Shareable

Slide 111

Slide 111 text

Imports and Namespaces

import "math"

from(db:"mydb")
  |> filter(fn: (r) => r.host == "server0")
  |> range(start: -1h)
  |> math.square()

Slide 112

Slide 112 text

Imports and Namespaces

import "math"

from(db:"mydb")
  |> filter(fn: (r) => r.host == "server0")
  |> range(start: -1h)
  |> math.square()
Callout: namespace

Slide 113

Slide 113 text

Package Manager

Slide 114

Slide 114 text

Imports and Namespaces

import "pauldix/math"

from(db:"mydb")
  |> filter(fn: (r) => r.host == "server0")
  |> range(start: -1h)
  |> math.square()
Callout: Username like RubyGems

Slide 115

Slide 115 text

Public Package Repository (like RubyGems, npm, etc.)

Slide 116

Slide 116 text

Imports and Namespaces

import "github.com/pauldix/math"

from(db:"mydb")
  |> filter(fn: (r) => r.host == "server0")
  |> range(start: -1h)
  |> math.square()
Callout: Or from GitHub

Slide 117

Slide 117 text

Difficult SQL Queries

Slide 118

Slide 118 text

Exponential Moving Average

from(db:"telegraf")
  |> range(start:-1h)
  |> filter(fn: (r) => r._measurement == "foo")
  |> exponentialMovingAverage(size:-10s)

Slide 119

Slide 119 text

SQL rolling average

select id,
       temp,
       avg(temp) over (partition by group_nr order by time_read) as rolling_avg
from (
  select id,
         temp,
         time_read,
         interval_group,
         id - row_number() over (partition by interval_group order by time_read) as group_nr
  from (
    select id,
           time_read,
           'epoch'::timestamp + '900 seconds'::interval * (extract(epoch from time_read)::int4 / 900) as interval_group,
           temp
    from readings
  ) t1
) t2
order by time_read;

Slide 120

Slide 120 text

Alerting

import "alert"

from(db:"telegraf")
  |> range(start: -1m)
  |> filter(fn: (r) => r._measurement == "work_queue" and r._field == "depth")
  |> mean()
  |> alert.track(
      warn: (r) => r._value > 200,
      crit: (r) => r._value > 500)
  |> alert.limit(duration:1m)
  |> toSlack(config: loadConfig(key: "slack"))
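
The slides do not define the alert package, so the following Python sketch shows only one plausible reading of the warn and crit predicates: each value is classified into a level, with the critical check taking precedence. The track function and the level names are assumptions made for illustration.

```python
def track(value, warn, crit):
    """Classify a value; the critical check takes precedence over warning."""
    if crit(value):
        return "crit"
    if warn(value):
        return "warn"
    return "ok"


# Thresholds taken from the slide; the queue depths are made up.
levels = [
    track(depth, warn=lambda v: v > 200, crit=lambda v: v > 500)
    for depth in (150, 350, 800)
]
print(levels)
```

Checking crit before warn matters because any value above 500 also satisfies the warning predicate.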

Slide 121

Slide 121 text

Wrap up

Slide 122

Slide 122 text

Get the nightlies! InfluxDB, Flux, Chronograf http://influxdata.com/download

Slide 123

Slide 123 text

Get the code, file issues! https://github.com/influxdata/platform

Slide 124

Slide 124 text

SQL is a great thing

Slide 125

Slide 125 text

But it’s not the only thing

Slide 126

Slide 126 text

Thank you
Paul Dix
@pauldix
paul@influxdata.com