Slide 1

Slide 1 text

The future of InfluxDB: Clustering & API
Paul Dix, CEO & cofounder of InfluxDB
paul@influxdb.com
@pauldix

Slide 2

Slide 2 text

No content

Slide 3

Slide 3 text

How many of you have looked at InfluxDB?

Slide 4

Slide 4 text

What it’s for…

Slide 5

Slide 5 text

Metrics

Slide 6

Slide 6 text

Time Series

Slide 7

Slide 7 text

Analytics

Slide 8

Slide 8 text

Events

Slide 9

Slide 9 text

Use Cases

Slide 10

Slide 10 text

DevOps

Slide 11

Slide 11 text

Real-time analytics (user & business)

Slide 12

Slide 12 text

Sensor Data

Slide 13

Slide 13 text

End Intro

Slide 14

Slide 14 text

Influx API v1

Slide 15

Slide 15 text

SQL-ish

Slide 16

Slide 16 text

A query language is an API

Slide 17

Slide 17 text

select percentile(90, value) from response_times group by time(10m) where time > now() - 1d

Slide 18

Slide 18 text

select percentile(90, value) from response_times group by time(10m) where time > now() - 1d
Aggregate functions require a group by time.
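As a rough illustration of what an aggregate with group by time(10m) computes, here is a minimal Python sketch. The function names, the nearest-rank percentile definition, and the sample points are assumptions made for this example, not InfluxDB's implementation.

```python
import math
from collections import defaultdict

WINDOW_SECONDS = 10 * 60  # corresponds to time(10m)

def percentile(values, p):
    """Nearest-rank percentile: the smallest value with at least p% of points at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

def percentile_by_window(points, p, window=WINDOW_SECONDS):
    """points: iterable of (unix_seconds, value). Returns {window_start: percentile}."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % window].append(value)  # align each point to its window start
    return {start: percentile(vals, p) for start, vals in sorted(buckets.items())}

# Hypothetical response times in milliseconds.
points = [(0, 100), (60, 200), (120, 300), (600, 50), (660, 1500)]
result = percentile_by_window(points, 90)
```

Each window collapses to a single value, which is why an aggregate needs a window to be meaningful.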

Slide 19

Slide 19 text

What about transformations? Derivative, moving average, sampling

Slide 20

Slide 20 text

select derivative(value) from redis_keys_count where time > now() - 1d

Slide 21

Slide 21 text

select derivative(value) from redis_keys where time > now() - 1d
This function doesn’t require a group by. It’s not an aggregate.
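To make the distinction concrete, a derivative can be sketched as a point-to-point transformation. This is a minimal Python sketch under the assumption that the derivative is the rate of change per second between consecutive points; the function name and sample data are hypothetical.

```python
def derivative(points):
    """points: list of (unix_seconds, value) sorted by time.

    Returns the rate of change per second between consecutive points.
    """
    out = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t1 == t0:
            continue  # skip duplicate timestamps to avoid dividing by zero
        out.append((t1, (v1 - v0) / (t1 - t0)))
    return out

# Hypothetical key counts sampled every 10 seconds.
points = [(0, 100), (10, 150), (20, 130)]
rates = derivative(points)
```

N input points yield up to N-1 output points, so nothing is collapsed into windows.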

Slide 22

Slide 22 text

Or where clauses on results?

Slide 23

Slide 23 text

select percentile(90, value) from response_times group by time(10m) having percentile > 1000 where time > now() - 1d

Slide 24

Slide 24 text

Filling in missing measurements?

Slide 25

Slide 25 text

Doesn’t feel right

Slide 26

Slide 26 text

Influx API v2

Slide 27

Slide 27 text

Totally unfinished ideas.

We need your feedback!

Slide 28

Slide 28 text

Databases
{
  "name": "paulDB",
  "retentionPolicies": [
    "..."
  ],
  "defaultRetentionPolicy": "1_week"
}

Slide 29

Slide 29 text

Retention Policies
[
  {
    "name": "1_week",
    "duration": "7d",
    "replicationFactor": 1
  },
  {
    "name": "6_months",
    "duration": "182d",
    "replicationFactor": 3
  },
  {
    "name": "2_years",
    "duration": "730d",
    "replicationFactor": 3
  }
]
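One way to read these policies: a point is kept only while it is younger than the policy's duration. The sketch below is a minimal Python illustration of that reading. The duration grammar (digits followed by s, m, h, or d) and the helper names are assumptions, not a documented format.

```python
import re

UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_duration(text):
    """Convert a duration such as "7d" into seconds (assumed grammar: digits + unit)."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]

policies = [
    {"name": "1_week", "duration": "7d", "replicationFactor": 1},
    {"name": "6_months", "duration": "182d", "replicationFactor": 3},
    {"name": "2_years", "duration": "730d", "replicationFactor": 3},
]

def is_expired(policy, point_time, now):
    """True when the point (unix seconds) is older than the policy's duration."""
    return now - point_time > parse_duration(policy["duration"])

eight_days = 8 * 86400
expired_in_week = is_expired(policies[0], point_time=0, now=eight_days)
expired_in_six_months = is_expired(policies[1], point_time=0, now=eight_days)
```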

Slide 30

Slide 30 text

Data Structure
Top level series name
[
  {
    "name": "cpu_load",
    "values": [
      {
        "double": 89.0,
        "tags": ["dataCenter/USWest/host/serverA"],
        "time": 1412689241000
      }
    ]
  }
]

Slide 31

Slide 31 text

Data Structure
[
  {
    "name": "cpu_load",
    "values": [
      {
        "double": 89.0,
        "tags": ["dataCenter/USWest/host/serverA"],
        "time": 1412689241000
      }
    ]
  }
]
Built in column
Built in column

Slide 32

Slide 32 text

Tags are hierarchical
[
  {
    "name": "cpu_load",
    "values": [
      {
        "double": 89.0,
        "tags": ["dataCenter/USWest/host/serverA"],
        "time": 1412689241000
      }
    ]
  }
]
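The tag string above looks like alternating key/value segments (dataCenter/USWest/host/serverA). Assuming that reading, which the slides do not spell out, a hierarchical tag could be unpacked as in this minimal Python sketch. Both helper names are hypothetical.

```python
def parse_tag_path(path):
    """Split "key/value/key/value" into ordered (key, value) pairs."""
    parts = path.split("/")
    if len(parts) % 2 != 0:
        raise ValueError(f"expected alternating key/value segments: {path!r}")
    return list(zip(parts[0::2], parts[1::2]))

def tag_prefixes(path):
    """Every level of the hierarchy the tag belongs to, shortest first."""
    pairs = parse_tag_path(path)
    return [
        "/".join(f"{key}/{value}" for key, value in pairs[:depth])
        for depth in range(1, len(pairs) + 1)
    ]

pairs = parse_tag_path("dataCenter/USWest/host/serverA")
levels = tag_prefixes("dataCenter/USWest/host/serverA")
```

The prefixes show why the ordering matters: a host sits underneath a data center, so a lookup on the shorter prefix covers every host in it.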

Slide 33

Slide 33 text

Write multiple series
[
  {
    "values": [
      {
        "name": "cpu_load",
        "double": 89.0
      },
      {
        "name": "cpu_wait",
        "double": 5
      }
    ],
    "time": 1412689241000,
    "tags": ["dataCenter/USWest/host/serverA"]
  }
]
Pull common values outside
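Pulling the common values outside means a server would have to fan them back out to each series on write. The following is a minimal Python sketch of that expansion; the function name and the flat output shape are assumptions for illustration only.

```python
def expand_write(batch):
    """Copy the shared time and tags of each group onto every value in it."""
    points = []
    for group in batch:
        for value in group["values"]:
            point = {"time": group["time"], "tags": list(group["tags"])}
            point.update(value)  # adds the per-series "name" and typed value
            points.append(point)
    return points

batch = [
    {
        "values": [
            {"name": "cpu_load", "double": 89.0},
            {"name": "cpu_wait", "double": 5},
        ],
        "time": 1412689241000,
        "tags": ["dataCenter/USWest/host/serverA"],
    }
]
points = expand_write(batch)
```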

Slide 34

Slide 34 text

More closely mirrors the sensor and DevOps use case

Slide 35

Slide 35 text

Sequence number optional
[
  {
    "name": "events",
    "values": [
      {
        "bool": true,
        "tags": ["type/click/userId/1"]
      }
    ],
    "setSequenceNumber": true
  }
]
Server assigns

Slide 36

Slide 36 text

Can still have irregular events that occur at the same time in the same series
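A server-assigned sequence number is what keeps two events with the same timestamp from overwriting each other. The sketch below illustrates the idea in Python with an in-memory dictionary keyed by (time, sequence). The class and its storage layout are hypothetical, not how InfluxDB stores points.

```python
import itertools

class Series:
    """Toy series that keys points by (time, server-assigned sequence number)."""

    def __init__(self):
        self._next_seq = itertools.count(1)
        self.points = {}

    def write(self, time, value):
        key = (time, next(self._next_seq))  # unique even when times collide
        self.points[key] = value
        return key

events = Series()
first = events.write(1412689241000, True)
second = events.write(1412689241000, True)  # same timestamp, different event
```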

Slide 37

Slide 37 text

Data types
Double, bool, string, bytes

Slide 38

Slide 38 text

Series now only have one value and tags

Slide 39

Slide 39 text

Tags are indexed!

Slide 40

Slide 40 text

Get defined tags
select tags(cpu_load)

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["tag"],
    "values": [
      ["host"],
      ["region"]
    ]
  }
]

Slide 41

Slide 41 text

Get defined tags by time
select tags(cpu_load)
  where time > now() - 1h

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["tag"],
    "values": [
      ["host"],
      ["region"]
    ]
  }
]

Slide 42

Slide 42 text

Get tags for multiple series
select tags(cpu_load), tags(cpu_wait)

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["tag"],
    "values": [
      ["host"],
      ["region"]
    ]
  },
  {
    "name": "cpu_wait",
    "columns": ["tag"],
    "values": [
      ["host"],
      ["region"]
    ]
  }
]

Slide 43

Slide 43 text

Get tag values
select tag_values(cpu_load, host)

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["host"],
    "values": [
      ["serverA"]
    ]
  }
]

Slide 44

Slide 44 text

Get compound tag values
select tag_values(cpu_load, region, host)

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["region", "host"],
    "values": [
      ["us-east", "serverA"],
      ["us-east", "serverB"],
      ["us-west", "serverC"]
    ]
  }
]
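Conceptually, a compound tag_values query returns the distinct combinations of the requested tags. Here is a minimal Python sketch of that behavior over an in-memory list of points. The function signature and the points are made up for illustration, and a real server would presumably answer this from the tag index rather than by scanning points.

```python
def tag_values(name, points, *tags):
    """Return the distinct combinations of the given tags, in sorted order."""
    seen = {tuple(point["tags"][tag] for tag in tags) for point in points}
    return {
        "name": name,
        "columns": list(tags),
        "values": [list(combo) for combo in sorted(seen)],
    }

# Hypothetical points; serverA reports twice but appears once in the result.
points = [
    {"tags": {"region": "us-east", "host": "serverA"}, "double": 1.0},
    {"tags": {"region": "us-east", "host": "serverA"}, "double": 2.0},
    {"tags": {"region": "us-east", "host": "serverB"}, "double": 3.0},
    {"tags": {"region": "us-west", "host": "serverC"}, "double": 4.0},
]
result = tag_values("cpu_load", points, "region", "host")
```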

Slide 45

Slide 45 text

Filter by time and tag
select tag_values(cpu_load, host)
  where time > now() - 1h
  and region = 'USWest'

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["region", "host"],
    "values": [
      ["us-west", "serverC"]
    ]
  }
]

Slide 46

Slide 46 text

How many unique series
select count(tag_values(tags(cpu_load)))

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["count"],
    "values": [
      [10157]
    ]
  }
]

Slide 47

Slide 47 text

How many unique series
select count(tag_values(tags(cpu_load)))
  where time > now() - 1h
  and region = 'USWest'

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["count"],
    "values": [
      [2241]
    ]
  }
]

Slide 48

Slide 48 text

Queries always scoped by retention
select tags("6_months"."cpu_load")

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["tag"],
    "values": [
      ["host"],
      ["region"]
    ]
  }
]

Slide 49

Slide 49 text

Database default
{
  "name": "paulDB",
  "retentionPolicies": [
    "..."
  ],
  "defaultRetentionPolicy": "1_week"
}

Slide 50

Slide 50 text

Example Queries

Slide 51

Slide 51 text

List names
list names

-- or see the names for a given retention policy
list names for 6_months

Slide 52

Slide 52 text

Query raw data
select cpu_load
  where data_center = 'us-west'
  and host = 'serverA'
  and time > now() - 1h

Query Result
Always default column
[
  {
    "name": "cpu_load",
    "columns": ["double", "time"],
    "values": [
      [34.2, 1412805662],
      ...
    ]
  }
]

Slide 53

Slide 53 text

Query raw data
select cpu_load
  where data_center = 'us-west'
  and host = 'serverA'
  and time > now() - 1h

Query Result
Always time
[
  {
    "name": "cpu_load",
    "columns": ["double", "time"],
    "values": [
      [34.2, 1412805662],
      ...
    ]
  }
]

Slide 54

Slide 54 text

Query raw data
select log_lines.string
  where time > now() - 10m

Query Result
[
  {
    "name": "log_lines",
    "columns": ["string", "time"],
    "values": [
      ["INFO: stuff here", 1412805662],
      ...
    ]
  }
]

Slide 55

Slide 55 text

Query raw data from other retention policy
select 6_months.cpu_load
  where data_center = 'us-west'
  and host = 'serverA'
  and time > now() - 7d

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["double", "time"],
    "values": [
      [34.2, 1412805600],
      ...
    ]
  }
]

Slide 56

Slide 56 text

Query raw data
select log_lines.string
  where time > now() - 10m

Query Result
[
  {
    "name": "log_lines",
    "columns": ["string", "time"],
    "values": [
      ["INFO: stuff here", 1412805600],
      ...
    ]
  }
]

Slide 57

Slide 57 text

Down sample on the fly
select mean(cpu_load)
  where data_center = 'us-west'
  and host = 'serverA'
  and time > now() - 24h
  group by time(10m)

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["double", "time"],
    "values": [
      [21.1, 1412805662]
    ]
  }
]

Slide 58

Slide 58 text

Down sample on the fly (merge all hosts)
select mean(cpu_load)
  where data_center = 'us-west'
  and time > now() - 24h
  group by time(10m)

Query Result
[
  {
    "name": "cpu_load",
    "columns": ["double", "time"],
    "values": [
      [21.1, 1412805662]
    ]
  }
]

Slide 59

Slide 59 text

Down sample on the fly (expanding into many series)

Query
select mean(cpu_load)
  where data_center = 'us-west'
  and time > now() - 24h
  expand by time(10m), host

Slide 60

Slide 60 text

Result
[
  {
    "name": "cpu_load",
    "tags": {
      "data_center": "us-west",
      "host": "serverA"
    },
    "columns": ["double", "time"],
    "values": [
      [21.1, 1412805662]
    ]
  },
  {
    "name": "cpu_load",
    "tags": {
      "data_center": "us-west",
      "host": "serverB"
    },
    "columns": ["double", "time"],
    "values": [
      [21.1, 1412805662]
    ]
  }
]
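The difference between group by and expand by appears to be that expand by emits one result series per distinct tag value instead of merging them. A minimal Python sketch of that reading follows. The function name, argument order, and sample points are assumptions, and the output carries only the expanded tag rather than every tag shown in the slide.

```python
from collections import defaultdict

def mean_expand_by(name, points, window, tag):
    """Mean per time window, emitted as one series per distinct value of `tag`."""
    sums = defaultdict(lambda: [0.0, 0])  # (tag value, window start) -> [sum, count]
    for point in points:
        key = (point["tags"][tag], point["time"] - point["time"] % window)
        sums[key][0] += point["double"]
        sums[key][1] += 1

    series = defaultdict(list)
    for (tag_value, start), (total, count) in sorted(sums.items()):
        series[tag_value].append([total / count, start])

    return [
        {"name": name, "tags": {tag: value}, "columns": ["double", "time"], "values": rows}
        for value, rows in series.items()
    ]

# Hypothetical points, times in seconds.
points = [
    {"time": 0, "double": 10.0, "tags": {"host": "serverA"}},
    {"time": 60, "double": 30.0, "tags": {"host": "serverA"}},
    {"time": 0, "double": 50.0, "tags": {"host": "serverB"}},
]
result = mean_expand_by("cpu_load", points, window=600, tag="host")
```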

Slide 61

Slide 61 text

Down sample on the fly

Query
select mean(cpu_load)
  where data_center = 'us-west'
  and host in ["serverA", "serverB"]
  and time > now() - 24h
  expand by time(10m), host

Slide 62

Slide 62 text

Down sample from multiple series at the same time

Query
select mean(cpu_load), max(cpu_wait)
  where data_center = 'us-west'
  and host = 'serverA'
  and time > now() - 24h
  group by time(10m)

Slide 63

Slide 63 text

Down sample from multiple series at the same time

Result
[
  {
    "name": "cpu_load",
    "columns": ["mean", "time"],
    "values": [...]
  },
  {
    "name": "cpu_wait",
    "columns": ["max", "time"],
    "values": [...]
  }
]

Slide 64

Slide 64 text

Query by regex
select log_lines.string
  where string =~ /error/i
  and time > now() - 4h

Slide 65

Slide 65 text

Query by regex against tag
select log_lines.string
  where application =~ /^ruby.*/
  and time > now() - 1h

Slide 66

Slide 66 text

Get the top 10 hosts
select mean(cpu_load)
  where data_center = 'us-west'
  and time > now() - 30m
  order by double desc
  limit 10

Slide 67

Slide 67 text

These ideas are already outdated…

Slide 68

Slide 68 text

Pulling back to concepts

Slide 69

Slide 69 text

Selecting series

Slide 70

Slide 70 text

filtering

Slide 71

Slide 71 text

transforming

Slide 72

Slide 72 text

merging

Slide 73

Slide 73 text

splitting

Slide 74

Slide 74 text

functional approach seems better

Slide 75

Slide 75 text

like jQuery chaining

Slide 76

Slide 76 text

split("cpu_load", on:"host", where: "dataCenter"="USWest")
  .percentile(90, window: "10m")
  .filter(>95)
  .limit(1)
  .merge()
  .limit(10)
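To get a feel for the chaining style, here is a minimal Python sketch in which every step returns a new set of series. The class, the method semantics, and the sample points are guesses made for illustration, since the slide only names the operations. The percentile step is left out to keep the sketch short.

```python
class SeriesSet:
    """Mapping of series key -> list of (time, value); every method returns a new SeriesSet."""

    def __init__(self, series):
        self.series = {key: list(points) for key, points in series.items()}

    def filter(self, predicate):
        return SeriesSet({
            key: [(t, v) for t, v in points if predicate(v)]
            for key, points in self.series.items()
        })

    def limit(self, n):
        # Applies to each series independently.
        return SeriesSet({key: points[:n] for key, points in self.series.items()})

    def merge(self, name="merged"):
        merged = sorted(point for points in self.series.values() for point in points)
        return SeriesSet({name: merged})

def split(points, on):
    """Split raw points into one series per distinct value of the tag `on`."""
    series = {}
    for point in points:
        series.setdefault(point["tags"][on], []).append((point["time"], point["double"]))
    return SeriesSet(series)

# Hypothetical points.
points = [
    {"time": 0, "double": 97.0, "tags": {"host": "serverA"}},
    {"time": 60, "double": 99.0, "tags": {"host": "serverA"}},
    {"time": 0, "double": 50.0, "tags": {"host": "serverB"}},
    {"time": 30, "double": 96.0, "tags": {"host": "serverC"}},
]
result = split(points, on="host").filter(lambda v: v > 95).limit(1).merge().limit(10)
```

Under these assumed semantics, the chain keeps the first reading above 95 from each host and then returns at most ten of them as a single series.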

Slide 77

Slide 77 text

Just ideas

Slide 78

Slide 78 text

Clustering

Slide 79

Slide 79 text

Two Parts

Slide 80

Slide 80 text

Broker

Slide 81

Slide 81 text

Data Node

Slide 82

Slide 82 text

How writes work

Slide 83

Slide 83 text

[Diagram labels: Write, Any server]

Slide 84

Slide 84 text

[Diagram labels: Write, Any server, Broker ×3, Streaming Raft Cluster]

Slide 85

Slide 85 text

Writes are CP

Slide 86

Slide 86 text

[Diagram labels: Write, Any server, Broker ×3, Data Node]

Slide 87

Slide 87 text

[Diagram labels: Write, Any server, Broker ×3, Data Node ×2]
If replication factor = 2

Slide 88

Slide 88 text

[Diagram labels: Write, Any server, Broker ×3, Data Node ×6]

Slide 89

Slide 89 text

How Queries Work

Slide 90

Slide 90 text

[Diagram labels: Any server, Data Node ×6]
select mean(cpu_load)
  where data_center = 'us-west'
  and host = 'serverA'
  and time > now() - 24h
  group by time(10m)

Slide 91

Slide 91 text

[Diagram labels: Any server, Data Node ×6]
Compute Locally
select mean(cpu_load)
  where data_center = 'us-west'
  and host = 'serverA'
  and time > now() - 24h
  group by time(10m)

Slide 92

Slide 92 text

[Diagram labels: Any server, Data Node ×6]
Send Summary Ticks
select mean(cpu_load)
  where data_center = 'us-west'
  and host = 'serverA'
  and time > now() - 24h
  group by time(10m)
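"Compute locally, send summary ticks" can be pictured as each data node reducing its own points to a small per-window summary, which the server that received the query then combines. Below is a minimal Python sketch for mean. Representing the summary as a (sum, count) pair is an assumption, since the slides do not say what a summary tick contains.

```python
from collections import defaultdict

def local_summary(points, window):
    """Runs on each data node: per-window (sum, count) for the node's own points."""
    summary = defaultdict(lambda: [0.0, 0])
    for ts, value in points:
        bucket = summary[ts - ts % window]
        bucket[0] += value
        bucket[1] += 1
    return {start: tuple(pair) for start, pair in summary.items()}

def merge_summaries(summaries):
    """Runs on the server that received the query: combine summaries into means."""
    totals = defaultdict(lambda: [0.0, 0])
    for summary in summaries:
        for start, (total, count) in summary.items():
            totals[start][0] += total
            totals[start][1] += count
    return [[total / count, start] for start, (total, count) in sorted(totals.items())]

# Hypothetical points held by two data nodes, times in seconds.
node_a = [(0, 10.0), (60, 20.0)]
node_b = [(120, 60.0), (600, 5.0)]
values = merge_summaries([local_summary(node_a, 600), local_summary(node_b, 600)])
```

Sending sums and counts instead of per-node means keeps the result exact: averaging the two node means for the first window would give 37.5 rather than 30.0.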

Slide 93

Slide 93 text

Clustering Goal: 1-2M values per second

Slide 94

Slide 94 text

Potential Cluster Size: 3-5 Brokers, 50 Data Nodes

Slide 95

Slide 95 text

We’re working on these now, feedback welcome!

Slide 96

Slide 96 text

Thanks
Paul Dix
paul@influxdb.com
@pauldix