Future of InfluxDB: API and Clustering

Paul Dix
October 15, 2014

Talk I gave at the London Go Meetup


Transcript

  1. The future of InfluxDB: Clustering & API

    Paul Dix, CEO & cofounder of InfluxDB, paul@influxdb.com, @pauldix
  2. select percentile(90, value) from response_times group by time(10m) where time > now() - 1d

    Aggregate functions require a group by time.
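
A minimal Python sketch of what an aggregate grouped by time computes: one value per 10-minute bucket. The nearest-rank percentile definition and the helper names are assumptions for illustration; the slides do not say which percentile method InfluxDB uses.

```python
from collections import defaultdict

def percentile(values, p):
    """Nearest-rank percentile for an integer p between 1 and 100 (assumed definition)."""
    ordered = sorted(values)
    rank = (p * len(ordered) + 99) // 100  # integer form of ceil(p * n / 100)
    return ordered[rank - 1]

def percentile_by_time(points, p, bucket_seconds):
    """Group (time_seconds, value) points into fixed buckets, one result per bucket."""
    buckets = defaultdict(list)
    for t, value in points:
        buckets[t - t % bucket_seconds].append(value)
    return [(start, percentile(vals, p)) for start, vals in sorted(buckets.items())]

# Twenty made-up samples, one per minute, with values 1.0 through 20.0.
response_times = [(i * 60, float(i + 1)) for i in range(20)]
result = percentile_by_time(response_times, 90, 600)
```

Without the time bucket there would be a single percentile for the whole range, which is why the aggregate needs the group by.
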
  3. select derivative(value) from redis_keys where time > now() - 1d

    This function doesn’t require group by. It’s not an aggregate.
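
A rough Python sketch of why a derivative needs no grouping: it maps consecutive points to rates rather than collapsing many points into one. The per-second rate between neighbouring points is an assumed definition; the slide does not spell out the exact formula.

```python
def derivative(points):
    """Rate of change per second between consecutive (time_seconds, value) points."""
    rates = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t1 == t0:
            continue  # skip duplicate timestamps to avoid dividing by zero
        rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates

# Made-up key counts sampled every 10 seconds.
redis_keys = [(0, 100), (10, 150), (20, 130)]
rates = derivative(redis_keys)
```
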
  4. Retention Policies

    [
      {"name": "1_week", "duration": "7d", "replicationFactor": 1},
      {"name": "6_months", "duration": "182d", "replicationFactor": 3},
      {"name": "2_years", "duration": "730d", "replicationFactor": 3}
    ]
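
One way to read these policies, sketched in Python: parse the duration string and treat a point as expired once it is older than that duration. The unit letters other than `d` and the `is_expired` helper are assumptions, not part of the proposed API.

```python
import re
from datetime import datetime, timedelta

# Assumed unit suffixes; the slide only shows "d".
UNIT_NAMES = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}

def parse_duration(text):
    """Turn a string such as "7d" into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    amount, unit = match.groups()
    return timedelta(**{UNIT_NAMES[unit]: int(amount)})

def is_expired(policy, point_time, now):
    """True when the point is older than the policy's duration (hypothetical rule)."""
    return now - point_time > parse_duration(policy["duration"])

one_week = {"name": "1_week", "duration": "7d", "replicationFactor": 1}
now = datetime(2014, 10, 15)
```
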
  5. Data Structure

    Callout: Top level series name

    [
      {
        "name": "cpu_load",
        "values": [
          {"double": 89.0, "tags": ["dataCenter/USWest/host/serverA"], "time": 1412689241000}
        ]
      }
    ]
  6. Data Structure

    Callouts: Built in column (shown twice on the slide)

    [
      {
        "name": "cpu_load",
        "values": [
          {"double": 89.0, "tags": ["dataCenter/USWest/host/serverA"], "time": 1412689241000}
        ]
      }
    ]
  7. Tags are hierarchical

    [
      {
        "name": "cpu_load",
        "values": [
          {"double": 89.0, "tags": ["dataCenter/USWest/host/serverA"], "time": 1412689241000}
        ]
      }
    ]
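
A small Python sketch of how a hierarchical tag string like the one above could be split. It assumes the path alternates key and value segments, which the example suggests but the slides do not state; both helper names are made up.

```python
def parse_tag_path(path):
    """Split "key/value/key/value" into ordered (key, value) pairs."""
    parts = path.split("/")
    if len(parts) % 2 != 0:
        raise ValueError(f"expected alternating key/value segments: {path!r}")
    return list(zip(parts[0::2], parts[1::2]))

def ancestor_paths(path):
    """Every level of the hierarchy, from the outermost tag to the full path."""
    pairs = parse_tag_path(path)
    return [
        "/".join(f"{key}/{value}" for key, value in pairs[:depth])
        for depth in range(1, len(pairs) + 1)
    ]

pairs = parse_tag_path("dataCenter/USWest/host/serverA")
levels = ancestor_paths("dataCenter/USWest/host/serverA")
```
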
  8. Write multiple series

    Callout: Pull common values outside

    [
      {
        "values": [
          {"name": "cpu_load", "double": 89.0},
          {"name": "cpu_wait", "double": 5}
        ],
        "time": 1412689241000,
        "tags": ["dataCenter/USWest/host/serverA"]
      }
    ]
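
A hedged Python sketch of how a server might expand such a batch: every key outside `values` is copied onto each value. The `expand_batch` helper and the rule that per-value keys win over shared ones are assumptions.

```python
def expand_batch(batch):
    """Flatten a write payload into one dict per point."""
    points = []
    for entry in batch:
        shared = {key: val for key, val in entry.items() if key != "values"}
        for value in entry["values"]:
            points.append({**shared, **value})  # per-value keys override shared ones
    return points

batch = [
    {
        "values": [
            {"name": "cpu_load", "double": 89.0},
            {"name": "cpu_wait", "double": 5},
        ],
        "time": 1412689241000,
        "tags": ["dataCenter/USWest/host/serverA"],
    }
]
points = expand_batch(batch)
```

The same function also handles the earlier layout, where `name` sits at the top level and `time` and `tags` sit on each value.
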
  9. Sequence number optional

    Callout: Server assigns

    [
      {
        "name": "events",
        "values": [
          {"bool": true, "tags": ["type/click/userId/1"]}
        ],
        "setSequenceNumber": true
      }
    ]
  10. Get defined tags

    select tags(cpu_load)

    Query Result

    [
      {"name": "cpu_load", "columns": ["tag"], "values": [["host"], ["region"]]}
    ]
  11. Get defined tags by time

    select tags(cpu_load)
    where time > now() - 1h

    Query Result

    [
      {"name": "cpu_load", "columns": ["tag"], "values": [["host"], ["region"]]}
    ]
  12. Get tags for multiple series

    select tags(cpu_load), tags(cpu_wait)

    Query Result

    [
      {"name": "cpu_load", "columns": ["tag"], "values": [["host"], ["region"]]},
      {"name": "cpu_wait", "columns": ["tag"], "values": [["host"], ["region"]]}
    ]
  13. Get tag values

    select tag_values(cpu_load, host)

    Query Result

    [
      {"name": "cpu_load", "columns": ["host"], "values": [["serverA"]]}
    ]
  14. Get compound tag values

    select tag_values(cpu_load, region, host)

    Query Result

    [
      {
        "name": "cpu_load",
        "columns": ["region", "host"],
        "values": [
          ["us-east", "serverA"],
          ["us-east", "serverB"],
          ["us-west", "serverC"]
        ]
      }
    ]
  15. Filter by time and tag

    select tag_values(cpu_load, host)
    where time > now() - 1h
    and region = 'USWest'

    Query Result

    [
      {"name": "cpu_load", "columns": ["region", "host"], "values": [["us-west", "serverC"]]}
    ]
  16. How many unique series

    select count(tag_values(tags(cpu_load)))

    Query Result

    [
      {"name": "cpu_load", "columns": ["count"], "values": [[10157]]}
    ]
  17. How many unique series

    select count(tag_values(tags(cpu_load)))
    where time > now() - 1h
    and region = 'USWest'

    Query Result

    [
      {"name": "cpu_load", "columns": ["count"], "values": [[2241]]}
    ]
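
The tag queries on the preceding slides can be pictured with a small in-memory Python sketch. It assumes each point carries its tags as a dict, and that a unique series is a distinct combination of all tag values; both helpers are hypothetical.

```python
def tag_values(points, *tag_names):
    """Distinct combinations of the requested tags, like tag_values(cpu_load, region, host)."""
    combos = {tuple(point["tags"][name] for name in tag_names) for point in points}
    return sorted(combos)

def count_unique_series(points):
    """Number of distinct full tag sets, like count(tag_values(tags(cpu_load)))."""
    return len({tuple(sorted(point["tags"].items())) for point in points})

# Made-up points; the tag values mirror the compound tag example.
points = [
    {"tags": {"region": "us-east", "host": "serverA"}, "double": 12.0},
    {"tags": {"region": "us-east", "host": "serverB"}, "double": 40.5},
    {"tags": {"region": "us-west", "host": "serverC"}, "double": 7.3},
    {"tags": {"region": "us-east", "host": "serverA"}, "double": 13.1},
]
```

Filtering by time or by a tag, as in the last two queries, would amount to narrowing `points` before calling either helper.
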
  18. Queries always scoped by retention

    select tags("6_months"."cpu_load")

    Query Result

    [
      {"name": "cpu_load", "columns": ["tag"], "values": [["host"], ["region"]]}
    ]
  19. List names

    list names

    -- or see the names for a given retention policy
    list names for 6_months
  20. Query raw data

    select cpu_load
    where data_center = 'us-west'
    and host = 'serverA'
    and time > now() - 1h

    Query Result (callout: Always default column)

    [
      {"name": "cpu_load", "columns": ["double", "time"], "values": [[34.2, 1412805662], ...]}
    ]
  21. Query raw data

    select cpu_load
    where data_center = 'us-west'
    and host = 'serverA'
    and time > now() - 1h

    Query Result (callout: Always time)

    [
      {"name": "cpu_load", "columns": ["double", "time"], "values": [[34.2, 1412805662], ...]}
    ]
  22. Query raw data

    select log_lines.string
    where time > now() - 10m

    Query Result

    [
      {"name": "log_lines", "columns": ["string", "time"], "values": [["INFO: stuff here", 1412805662], ...]}
    ]
  23. Query raw data from other retention policy

    select 6_months.cpu_load
    where data_center = 'us-west'
    and host = 'serverA'
    and time > now() - 7d

    Query Result

    [
      {"name": "cpu_load", "columns": ["double", "time"], "values": [[34.2, 1412805600], ...]}
    ]
  24. Query raw data

    select log_lines.string
    where time > now() - 10m

    Query Result

    [
      {"name": "log_lines", "columns": ["string", "time"], "values": [["INFO: stuff here", 1412805600], ...]}
    ]
  25. Down sample on the fly

    select mean(cpu_load)
    where data_center = 'us-west'
    and host = 'serverA'
    and time > now() - 24h
    group by time(10m)

    Query Result

    [
      {"name": "cpu_load", "columns": ["double", "time"], "values": [[21.1, 1412805662]]}
    ]
  26. Down sample on the fly (merge all hosts)

    select mean(cpu_load)
    where data_center = 'us-west'
    and time > now() - 24h
    group by time(10m)

    Query Result

    [
      {"name": "cpu_load", "columns": ["double", "time"], "values": [[21.1, 1412805662]]}
    ]
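
A minimal Python sketch of down sampling with `group by time(10m)`: points from every host fall into the same 10-minute bucket and are averaged together. Aligning buckets to multiples of the interval is an assumption, and the sample values are made up.

```python
from collections import defaultdict

def mean_by_time(points, bucket_seconds):
    """Average (time_seconds, value) points per fixed-width time bucket."""
    buckets = defaultdict(list)
    for t, value in points:
        buckets[t - t % bucket_seconds].append(value)
    return [(start, sum(vals) / len(vals)) for start, vals in sorted(buckets.items())]

# Points from any number of hosts, already filtered to one data center.
cpu_load = [(1412805600, 20.0), (1412805662, 22.0), (1412806200, 30.0)]
downsampled = mean_by_time(cpu_load, 600)
```
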
  27. Down sample on the fly (expanding into many series)

    Query

    select mean(cpu_load)
    where data_center = 'us-west'
    and time > now() - 24h
    expand by time(10m), host
  28. Result

    [
      {
        "name": "cpu_load",
        "tags": {"data_center": "us-west", "host": "serverA"},
        "columns": ["double", "time"],
        "values": [[21.1, 1412805662]]
      },
      {
        "name": "cpu_load",
        "tags": {"data_center": "us-west", "host": "serverB"},
        "columns": ["double", "time"],
        "values": [[21.1, 1412805662]]
      }
    ]
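
A Python sketch of the difference `expand by` makes: instead of merging hosts into one series, the mean is kept per tag value and each value becomes its own result series. The helper name and input shape are assumptions, and the sketch reports only the expanded tag, whereas the result above also carries `data_center`.

```python
from collections import defaultdict

def mean_expand_by(name, points, bucket_seconds, tag):
    """Mean per (tag value, time bucket), returned as one series per tag value."""
    totals = defaultdict(lambda: [0.0, 0])  # (tag value, bucket start) -> [sum, count]
    for point in points:
        start = point["time"] - point["time"] % bucket_seconds
        cell = totals[(point["tags"][tag], start)]
        cell[0] += point["double"]
        cell[1] += 1
    values_by_tag = defaultdict(list)
    for (tag_value, start), (total, count) in sorted(totals.items()):
        values_by_tag[tag_value].append([total / count, start])
    return [
        {"name": name, "tags": {tag: tag_value}, "columns": ["double", "time"], "values": values}
        for tag_value, values in values_by_tag.items()
    ]

# Made-up points for two hosts inside one 10-minute bucket.
points = [
    {"time": 1412805600, "double": 20.0, "tags": {"host": "serverA"}},
    {"time": 1412805662, "double": 22.0, "tags": {"host": "serverA"}},
    {"time": 1412805630, "double": 50.0, "tags": {"host": "serverB"}},
]
series = mean_expand_by("cpu_load", points, 600, "host")
```
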
  29. Down sample on the fly

    Query

    select mean(cpu_load)
    where data_center = 'us-west'
    and host in ["serverA", "serverB"]
    and time > now() - 24h
    expand by time(10m), host
  30. Down sample from multiple series at the same time

    Query

    select mean(cpu_load), max(cpu_wait)
    where data_center = 'us-west'
    and host = 'serverA'
    and time > now() - 24h
    group by time(10m)
  31. Down sample from multiple series at the same time

    Result

    [
      {"name": "cpu_load", "columns": ["mean", "time"], "values": [...]},
      {"name": "cpu_wait", "columns": ["max", "time"], "values": [...]}
    ]
  32. Get the top 10 hosts

    select mean(cpu_load)
    where data_center = 'us-west'
    and time > now() - 30m
    order by double desc
    limit 10
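
A hedged Python sketch of the top-N idea: compute a mean per host, then keep the largest values. It assumes the mean is taken per host before ordering, which the slide title implies but the query text does not spell out; `top_hosts` is a made-up helper.

```python
import heapq
from collections import defaultdict

def top_hosts(points, limit):
    """Mean value per host, highest first, truncated to `limit` entries."""
    totals = defaultdict(lambda: [0.0, 0])  # host -> [sum, count]
    for host, value in points:
        totals[host][0] += value
        totals[host][1] += 1
    means = [(host, total / count) for host, (total, count) in totals.items()]
    return heapq.nlargest(limit, means, key=lambda pair: pair[1])

# Made-up (host, cpu_load) samples from the last 30 minutes.
samples = [("serverA", 10.0), ("serverB", 90.0), ("serverA", 30.0), ("serverC", 50.0)]
top_two = top_hosts(samples, 2)
```
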
  33. [Diagram: three Broker boxes and six Data Node boxes, with the labels "Any server" and "Write"]
  34. [Diagram: six Data Node boxes, with the label "Any server"]

    select mean(cpu_load)
    where data_center = 'us-west'
    and host = 'serverA'
    and time > now() - 24h
    group by time(10m)
  35. [Diagram: six Data Node boxes, with the labels "Any server" and "Compute Locally"]

    select mean(cpu_load)
    where data_center = 'us-west'
    and host = 'serverA'
    and time > now() - 24h
    group by time(10m)
  36. [Diagram: six Data Node boxes, with the labels "Any server" and "Send Summary Ticks"]

    select mean(cpu_load)
    where data_center = 'us-west'
    and host = 'serverA'
    and time > now() - 24h
    group by time(10m)
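
The "compute locally, send summary ticks" flow on the last slides might look like the Python sketch below. It assumes a summary tick for `mean` is a (sum, count) pair per time bucket, which the slides do not specify; function names are hypothetical.

```python
def local_summary(points, bucket_seconds):
    """Run on each data node: partial (sum, count) per time bucket."""
    summary = {}
    for t, value in points:
        start = t - t % bucket_seconds
        total, count = summary.get(start, (0.0, 0))
        summary[start] = (total + value, count + 1)
    return summary

def merge_summaries(summaries):
    """Run on the server that received the query: combine partials into means."""
    merged = {}
    for summary in summaries:
        for start, (total, count) in summary.items():
            prev_total, prev_count = merged.get(start, (0.0, 0))
            merged[start] = (prev_total + total, prev_count + count)
    return [(start, total / count) for start, (total, count) in sorted(merged.items())]

# Made-up raw points held by two different data nodes.
node_a = [(0, 10.0), (30, 20.0)]
node_b = [(45, 60.0), (600, 5.0)]
means = merge_summaries([local_summary(node_a, 600), local_summary(node_b, 600)])
```

Sending sums and counts rather than per-node means keeps the merged result exact when nodes hold different numbers of points for the same bucket.
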