Slide 1

Slide 1 text

IFQL and the future of InfluxData Paul Dix Founder & CTO @pauldix paul@influxdata.com

Slide 2

Slide 2 text

Evolution of a query language…

Slide 3

Slide 3 text

REST API

Slide 4

Slide 4 text

SQL-ish

Slide 5

Slide 5 text

Vaguely Familiar select percentile(90, value) from cpu where time > now() - 1d and “host” = ‘serverA’ group by time(10m)

Slide 6

Slide 6 text

0.8 -> 0.9 Breaking API change, addition of tags

Slide 7

Slide 7 text

Functional or SQL?

Slide 8

Slide 8 text

Afraid to switch…

Slide 9

Slide 9 text

No content

Slide 10

Slide 10 text

No content

Slide 11

Slide 11 text

No content

Slide 12

Slide 12 text

No content

Slide 13

Slide 13 text

No content

Slide 14

Slide 14 text

No content

Slide 15

Slide 15 text

No content

Slide 16

Slide 16 text

Difficult to improve & change

Slide 17

Slide 17 text

It’s not SQL!

Slide 18

Slide 18 text

Kapacitor Fall of 2015

Slide 19

Slide 19 text

Kapacitor’s TICKscript stream |from() .database('telegraf') .measurement('cpu') .groupBy(*) |window() .period(5m) .every(5m) .align() |mean('usage_idle') .as('usage_idle') |influxDBOut() .database('telegraf') .retentionPolicy('autogen') .measurement('mean_cpu_idle') .precision('s')

Slide 20

Slide 20 text

Hard to debug

Slide 21

Slide 21 text

Steep learning curve

Slide 22

Slide 22 text

Not Recomposable

Slide 23

Slide 23 text

Second Language

Slide 24

Slide 24 text

Rethinking Everything

Slide 25

Slide 25 text

Kapacitor is Background Processing Stream or Batch

Slide 26

Slide 26 text

InfluxDB is batch interactive

Slide 27

Slide 27 text

IFQL and unified API Building towards 2.0

Slide 28

Slide 28 text

Project Goals Photo by Glen Carrie on Unsplash

Slide 29

Slide 29 text

One Language to Unite!

Slide 30

Slide 30 text

Feature Velocity

Slide 31

Slide 31 text

Decouple storage from compute

Slide 32

Slide 32 text

Iterate & deploy more frequently

Slide 33

Slide 33 text

Scale independently

Slide 34

Slide 34 text

Workload Isolation

Slide 35

Slide 35 text

No content

Slide 36

Slide 36 text

Decouple language from engine

Slide 37

Slide 37 text

{ "operations": [ { "id": "select0", "kind": "select", "spec": { "database": "foo", "hosts": null } }, { "id": "where1", "kind": "where", "spec": { "expression": { "root": { "type": "binary", "operator": "and", "left": { "type": "binary", "operator": "and", "left": { "type": "binary", "operator": "==", "left": { "type": "reference", "name": "_measurement", "kind": "tag" }, "right": { "type": "stringLiteral", "value": "cpu" } }, Query represented as DAG in JSON

Slide 38

Slide 38 text

No content

Slide 39

Slide 39 text

A Data Language

Slide 40

Slide 40 text

Design Philosophy

Slide 41

Slide 41 text

UI for Many because no one wants to actually write a query

Slide 42

Slide 42 text

Readability over terseness

Slide 43

Slide 43 text

Flexible add to language easily

Slide 44

Slide 44 text

Testable new functions and user queries

Slide 45

Slide 45 text

Easy to Contribute inspiration from Telegraf

Slide 46

Slide 46 text

Code Sharing & Reuse no code > code

Slide 47

Slide 47 text

A few examples

Slide 48

Slide 48 text

// get the last value written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> last()

Slide 49

Slide 49 text

// get the last value written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> last() Result: _result Block: keys: [_field, _measurement, host, region] bounds: [1677-09-21T00:12:43.145224192Z, 2018-02-12T15:53:04.361902250Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:53:00.000000000Z usage_system cpu server0 east 60.6284 Block: keys: [_field, _measurement, host, region] bounds: [1677-09-21T00:12:43.145224192Z, 2018-02-12T15:53:04.361902250Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:53:00.000000000Z usage_user cpu server0 east 39.3716

Slide 50

Slide 50 text

// get the last minute of data from a specific // measurement & field & host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m)

Slide 51

Slide 51 text

// get the last minute of data from a specific // measurement & field & host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) Result: _result Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T16:01:45.677502014Z, 2018-02-12T16:02:45.677502014Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T16:01:50.000000000Z usage_user cpu server0 east 50.549 2018-02-12T16:02:00.000000000Z usage_user cpu server0 east 35.4458 2018-02-12T16:02:10.000000000Z usage_user cpu server0 east 30.0493 2018-02-12T16:02:20.000000000Z usage_user cpu server0 east 44.3378 2018-02-12T16:02:30.000000000Z usage_user cpu server0 east 11.1584 2018-02-12T16:02:40.000000000Z usage_user cpu server0 east 46.712

Slide 52

Slide 52 text

// get the mean in 10m intervals of last hour from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu") |> range(start:-1h) |> window(every:15m) |> mean() Result: _result Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T15:05:06.708945484Z, 2018-02-12T16:05:06.708945484Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:28:41.128654848Z usage_user cpu server0 east 50.72841444444444 2018-02-12T15:43:41.128654848Z usage_user cpu server0 east 51.19163333333333 2018-02-12T15:13:41.128654848Z usage_user cpu server0 east 45.5091088235294 2018-02-12T15:58:41.128654848Z usage_user cpu server0 east 49.65145555555555 2018-02-12T16:05:06.708945484Z usage_user cpu server0 east 46.41292368421052 Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T15:05:06.708945484Z, 2018-02-12T16:05:06.708945484Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:28:41.128654848Z usage_system cpu server0 east 49.27158555555556 2018-02-12T15:58:41.128654848Z usage_system cpu server0 east 50.34854444444444 2018-02-12T16:05:06.708945484Z usage_system cpu server0 east 53.58707631578949 2018-02-12T15:13:41.128654848Z usage_system cpu server0 east 54.49089117647058 2018-02-12T15:43:41.128654848Z usage_system cpu server0 east 48.808366666666664

Slide 53

Slide 53 text

Elements of IFQL

Slide 54

Slide 54 text

Functional // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m)

Slide 55

Slide 55 text

Functional // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) built in functions

Slide 56

Slide 56 text

Functional // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) anonymous functions

Slide 57

Slide 57 text

Functional // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) pipe forward operator

Slide 58

Slide 58 text

Named Parameters // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) named parameters only!

Slide 59

Slide 59 text

Readability

Slide 60

Slide 60 text

Flexibility

Slide 61

Slide 61 text

Functions have inputs & outputs

Slide 62

Slide 62 text

Testability

Slide 63

Slide 63 text

Builder

Slide 64

Slide 64 text

Inputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) no input

Slide 65

Slide 65 text

Outputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) output is entire db

Slide 66

Slide 66 text

Outputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) pipe that output to filter

Slide 67

Slide 67 text

Filter function input // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) anonymous filter function input is a single record {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}

Slide 68

Slide 68 text

Filter function input // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) A record looks like a flat object or row in a table {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}

Slide 69

Slide 69 text

Record Properties // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) tag key {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}

Slide 70

Slide 70 text

Record Properties // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start:-1m) same as before {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}

Slide 71

Slide 71 text

Special Properties starts with _ reserved for system attributes from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}

Slide 72

Slide 72 text

Special Properties works other way from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r._measurement == "cpu" and r._field == "usage_user") |> range(start:-1m) |> max() {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}

Slide 73

Slide 73 text

Special Properties _measurement and _field present for all InfluxDB data from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}

Slide 74

Slide 74 text

Special Properties _value exists in all series from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == “usage_user" and r[“_value"] > 50.0) |> range(start:-1m) |> max() {“_measurement”:”cpu”, ”_field”:”usage_user", “host":"server0", “region":"west", "_value":23.2}

Slide 75

Slide 75 text

Filter function output // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) filter function output is a boolean to determine if record is in set

Slide 76

Slide 76 text

Filter Operators // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) != =~ !~ in

Slide 77

Slide 77 text

Filter Boolean Logic // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => (r[“host"] == “server0" or r[“host"] == “server1") and r[“_measurement”] == “cpu") |> range(start:-1m) parens for precedence

Slide 78

Slide 78 text

Function with explicit return // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => {return r[“host"] == “server0"}) |> range(start:-1m) long hand function definition

Slide 79

Slide 79 text

Outputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) filter output is set of data matching filter function

Slide 80

Slide 80 text

Outputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) piped to range which further filters by a time range

Slide 81

Slide 81 text

Outputs // get the last 1 hour written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) range output is the final query result

Slide 82

Slide 82 text

Function Isolation (but the planner may do otherwise)

Slide 83

Slide 83 text

Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max()

Slide 84

Slide 84 text

Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() range and filter switched

Slide 85

Slide 85 text

Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() results the same Result: _result Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T17:52:02.322301856Z, 2018-02-12T17:53:02.322301856Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T17:53:02.322301856Z usage_user cpu server0 east 97.3174

Slide 86

Slide 86 text

Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() is this the same as the top two? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)

Slide 87

Slide 87 text

Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() moving max to here changes semantics from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)

Slide 88

Slide 88 text

Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() here it operates on only the last minute of data from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)

Slide 89

Slide 89 text

Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() here it operates on data for all time from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)

Slide 90

Slide 90 text

Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() then that result is filtered down to the last minute (which will likely be empty) from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)

Slide 91

Slide 91 text

Planner Optimizes maintains query semantics

Slide 92

Slide 92 text

Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max()

Slide 93

Slide 93 text

Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() this is more efficient

Slide 94

Slide 94 text

Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() query DAG different plan DAG same as one on left

Slide 95

Slide 95 text

Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == “usage_user” r[“_value"] > 22.0) |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == “usage_user" r[“_value"] > 22.0) |> max() this does a full table scan

Slide 96

Slide 96 text

Variables & Closures db = "mydb" measurement = "cpu" from(db:db) |> filter(fn: (r) => r._measurement == measurement and r.host == "server0") |> last()

Slide 97

Slide 97 text

Variables & Closures db = "mydb" measurement = "cpu" from(db:db) |> filter(fn: (r) => r._measurement == measurement and r.host == "server0") |> last() anonymous filter function closure over surrounding context

Slide 98

Slide 98 text

User Defined Functions db = "mydb" measurement = “cpu" fn = (r) => r._measurement == measurement and r.host == "server0" from(db:db) |> filter(fn: fn) |> last() assign function to variable fn

Slide 99

Slide 99 text

User Defined Functions from(db:"mydb") |> filter(fn: (r) => r["_measurement"] == "cpu" and r["_field"] == "usage_user" and r["host"] == "server0") |> range(start:-1h)

Slide 100

Slide 100 text

User Defined Functions from(db:"mydb") |> filter(fn: (r) => r["_measurement"] == "cpu" and r["_field"] == "usage_user" and r["host"] == "server0") |> range(start:-1h) get rid of some common boilerplate?

Slide 101

Slide 101 text

User Defined Functions select = (db, m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) }

Slide 102

Slide 102 text

User Defined Functions select = (db, m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(db: "mydb", m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h)

Slide 103

Slide 103 text

User Defined Functions select = (db, m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h) throws error error calling function "select": missing required keyword argument "db"

Slide 104

Slide 104 text

Default Arguments select = (db="mydb", m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h)

Slide 105

Slide 105 text

Default Arguments select = (db="mydb", m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h)

Slide 106

Slide 106 text

Multiple Results to Client data = from(db:"mydb") |> filter(fn: (r) r._measurement == "cpu" and r._field == "usage_user") |> range(start: -4h) |> window(every: 5m) data |> min() |> yield(name: "min") data |> max() |> yield(name: "max") data |> mean() |> yield(name: "mean")

Slide 107

Slide 107 text

Multiple Results to Client data = from(db:"mydb") |> filter(fn: (r) r._measurement == "cpu" and r._field == "usage_user") |> range(start: -4h) |> window(every: 5m) data |> min() |> yield(name: "min") data |> max() |> yield(name: "max") data |> mean() |> yield(name: "mean") Result: min Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T16:55:55.487457216Z, 2018-02-12T20:55:55.487457216Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- name

Slide 108

Slide 108 text

User Defined Pipe Forwardable Functions mf = (m, f, table=<-) => { return table |> filter(fn: (r) => r._measurement == m and r._field == f) } from(db:"mydb") |> mf(m: "cpu", f: "usage_user") |> filter(fn: (r) => r.host == "server0") |> last()

Slide 109

Slide 109 text

User Defined Pipe Forwardable Functions mf = (m, f, table=<-) => { return table |> filter(fn: (r) => r._measurement == m and r._field == f) } from(db:"mydb") |> mf(m: "cpu", f: "usage_user") |> filter(fn: (r) => r.host == "server0") |> last() takes a table from a pipe forward by default

Slide 110

Slide 110 text

User Defined Pipe Forwardable Functions mf = (m, f, table=<-) => { return table |> filter(fn: (r) => r._measurement == m and r._field == f) } from(db:"mydb") |> mf(m: "cpu", f: "usage_user") |> filter(fn: (r) => r.host == "server0") |> last() calling it, then chaining

Slide 111

Slide 111 text

Passing as Argument mf = (m, f, table=<-) => { return table |> filter(fn: (r) => r._measurement == m and r._field == f) } sending the from as argument mf(m: "cpu", f: "usage_user", table: from(db:"mydb")) |> filter(fn: (r) => r.host == "server0") |> last()

Slide 112

Slide 112 text

Passing as Argument mf = (m, f, table=<-) => filter(fn: (r) => r._measurement == m and r._field == f, table: table) rewrite the function to use argument mf(m: "cpu", f: "usage_user", table: from(db:"mydb")) |> filter(fn: (r) => r.host == "server0") |> last()

Slide 113

Slide 113 text

Any pipe forward function can use arguments min(table: range(start: -1h, table: filter(fn: (r) => r.host == "server0", table: from(db: "mydb"))))

Slide 114

Slide 114 text

Make you a Lisp

Slide 115

Slide 115 text

Easy to add Functions like plugins in Telegraf

Slide 116

Slide 116 text

code file

Slide 117

Slide 117 text

test file

Slide 118

Slide 118 text

package functions import ( "fmt" "github.com/influxdata/ifql/ifql" "github.com/influxdata/ifql/query" "github.com/influxdata/ifql/query/execute" "github.com/influxdata/ifql/query/plan" ) const CountKind = "count" type CountOpSpec struct { } func init() { ifql.RegisterFunction(CountKind, createCountOpSpec) query.RegisterOpSpec(CountKind, newCountOp) plan.RegisterProcedureSpec(CountKind, newCountProcedure, CountKind) execute.RegisterTransformation(CountKind, createCountTransformation) } func createCountOpSpec(args map[string]ifql.Value, ctx ifql.Context) (query.OperationSpec, error) { if len(args) != 0 { return nil, fmt.Errorf(`count function requires no arguments`) } return new(CountOpSpec), nil } func newCountOp() query.OperationSpec { return new(CountOpSpec) } func (s *CountOpSpec) Kind() query.OperationKind { return CountKind }

Slide 119

Slide 119 text

type CountProcedureSpec struct { } func newCountProcedure(query.OperationSpec) (plan.ProcedureSpec, error) { return new(CountProcedureSpec), nil } func (s *CountProcedureSpec) Kind() plan.ProcedureKind { return CountKind } func (s *CountProcedureSpec) Copy() plan.ProcedureSpec { return new(CountProcedureSpec) } func (s *CountProcedureSpec) PushDownRule() plan.PushDownRule { return plan.PushDownRule{ Root: SelectKind, Through: nil, } } func (s *CountProcedureSpec) PushDown(root *plan.Procedure, dup func() *plan.Procedure) { selectSpec := root.Spec.(*SelectProcedureSpec) if selectSpec.AggregateSet { root = dup() selectSpec = root.Spec.(*SelectProcedureSpec) selectSpec.AggregateSet = false selectSpec.AggregateType = "" return } selectSpec.AggregateSet = true selectSpec.AggregateType = CountKind }

Slide 120

Slide 120 text

type CountAgg struct { count int64 } func createCountTransformation(id execute.DatasetID, mode execute.AccumulationMode, spec plan.ProcedureSpec, ctx execute.Context (execute.Transformation, execute.Dataset, error) { t, d := execute.NewAggregateTransformationAndDataset(id, mode, ctx.Bounds(), new(CountAgg)) return t, d, nil } func (a *CountAgg) DoBool(vs []bool) { a.count += int64(len(vs)) } func (a *CountAgg) DoUInt(vs []uint64) { a.count += int64(len(vs)) } func (a *CountAgg) DoInt(vs []int64) { a.count += int64(len(vs)) } func (a *CountAgg) DoFloat(vs []float64) { a.count += int64(len(vs)) } func (a *CountAgg) DoString(vs []string) { a.count += int64(len(vs)) } func (a *CountAgg) Type() execute.DataType { return execute.TInt } func (a *CountAgg) ValueInt() int64 { return a.count }

Slide 121

Slide 121 text

Defines parser, validation, execution

Slide 122

Slide 122 text

Imports and Namespaces from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start: -1h) // square the value |> map(fn: (r) => r._value * r._value) shortcut for this?

Slide 123

Slide 123 text

Imports and Namespaces from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start: -1h) // square the value |> map(fn: (r) => r._value * r._value) square = (table=<-) { table |> map(fn: (r) => r._value * r._value) }

Slide 124

Slide 124 text

Imports and Namespaces import "github.com/pauldix/ifqlmath" from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start: -1h) |> ifqlmath.square()

Slide 125

Slide 125 text

Imports and Namespaces import "github.com/pauldix/ifqlmath" from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start: -1h) |> ifqlmath.square() namespace

Slide 126

Slide 126 text

MOAR EXAMPLES!

Slide 127

Slide 127 text

Math across measurements foo = from(db: "mydb") |> filter(fn: (r) => r._measurement == "foo") |> range(start: -1h) bar = from(db: "mydb") |> filter(fn: (r) => r._measurement == "bar") |> range(start: -1h) join( tables: {foo:foo, bar:bar}, fn: (t) => t.foo._value + t.bar._value) |> yield(name: "foobar")

Slide 128

Slide 128 text

Having Query from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu") |> range(start:-1h) |> window(every:10m) |> mean() // this is the having part |> filter(fn: (r) => r._value > 90)

Slide 129

Slide 129 text

Grouping // group - average utilization across regions from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system") |> range(start: -1h) |> group(by: ["region"]) |> window(every:10m) |> mean()

Slide 130

Slide 130 text

Get Metadata from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu") |> range(start: -48h, stop: -47h) |> tagValues(key: "host")

Slide 131

Slide 131 text

Get Metadata from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu") |> range(start: -48h, stop: -47h) |> group(by: ["measurement"], keep: ["host"]) |> distinct(column: "host")

Slide 132

Slide 132 text

Get Metadata tagValues = (table=<-) => table |> group(by: ["measurement"], keep: ["host"]) |> distinct(column: "host")

Slide 133

Slide 133 text

Get Metadata from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu") |> range(start: -48h, stop: -47h) |> tagValues(key: “host") |> count()

Slide 134

Slide 134 text

Functions Implemented as IFQL // _sortLimit is a helper function, which sorts // and limits a table. _sortLimit = (n, desc, cols=["_value"], table=<-) => table |> sort(cols:cols, desc:desc) |> limit(n:n) // top sorts a table by cols and keeps only the top n records. top = (n, cols=["_value"], table=<-) => _sortLimit(table:table, n:n, cols:cols, desc:true)

Slide 135

Slide 135 text

Project Status and Timeline

Slide 136

Slide 136 text

API 2.0 Work Lock down query request/response format

Slide 137

Slide 137 text

Apache Arrow

Slide 138

Slide 138 text

We’re contributing the Go implementation! https://github.com/influxdata/arrow

Slide 139

Slide 139 text

Finalize Language (a few months or so)

Slide 140

Slide 140 text

Ship with Enterprise 1.6 (summertime)

Slide 141

Slide 141 text

Hack & workshop day tomorrow! Ask the registration desk today

Slide 142

Slide 142 text

Thank you! Paul Dix paul@influxdata.com @pauldix