Slide 1

Slide 1 text

Building Custom Analytics with InfluxDB and D3 Paul Dix [email protected] @pauldix

Slide 2

Slide 2 text

InfluxDB An open source distributed analytics database.

Slide 3

Slide 3 text

Analytics Database?
● Metrics, events, time series
● Infrastructure metrics
● Application performance
● User events/analytics
● Business analytics

Slide 4

Slide 4 text

Events
● Measurements
● Exceptions
● Page Views
● User actions
● Commits
● Deploys

Slide 5

Slide 5 text

Metrics or measurements
● CPU load
● Memory usage
● Average response times
● Counts over fixed intervals

Slide 6

Slide 6 text

Things you want to know about over time

Slide 7

Slide 7 text

InfluxDB
● Written in Go
● No external dependencies
● 3 full-time developers (YC W13)

Slide 8

Slide 8 text

How data is organized
● Databases (like in MySQL, Postgres, etc.)
● Time series
  ○ like tables, but you can have millions of them
  ○ no need to define ahead of time
● Points or events
  ○ like rows, but schemaless like Mongo
  ○ single level hash

Slide 9

Slide 9 text

As JSON
[{
  "name": "foo",
  "columns": ["time", "sequence_number", "val1", "val2"],
  "points": [
    [1384295094, 3, "paul", 23],
    [1384295094, 2, "john", 92],
    [1384295094, 1, "todd", 61]
  ]
}, {...}]

Slide 10

Slide 10 text

Write points
curl -X POST \
  'http://localhost:8086/db/mydb/series?u=paul&p=pass' \
  -d '[{"name":"foo", "columns":["val"], "points": [[3]]}]'
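The same write request can be assembled in code. This is a minimal sketch, assuming the endpoint shape shown on the slide (`/db/<database>/series` with `u` and `p` query parameters). `buildWriteRequest` is a hypothetical helper, not part of any InfluxDB client library, and it only builds the URL and body without sending anything.

```javascript
// Hypothetical helper (not an InfluxDB API): builds the URL and JSON body
// for the write call shown on the slide. Nothing is sent over the network.
function buildWriteRequest(baseUrl, database, user, password, series) {
  const query = "u=" + encodeURIComponent(user) +
                "&p=" + encodeURIComponent(password);
  const url = baseUrl + "/db/" + encodeURIComponent(database) + "/series?" + query;
  return { url: url, body: JSON.stringify(series) };
}

// Same values as the curl example.
const writeRequest = buildWriteRequest("http://localhost:8086", "mydb", "paul", "pass", [
  { name: "foo", columns: ["val"], points: [[3]] }
]);
console.log(writeRequest.url);
console.log(writeRequest.body);
```

The returned `url` and `body` could then be handed to whatever HTTP client is available, using POST as in the curl call.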

Slide 11

Slide 11 text

Querying
curl \
  'http://...:8086/db/mydb/series?u=paul&p=pass&q=...'

Slide 12

Slide 12 text

SQL(ish) Query Language
select * from foo where time > now() - 4h
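Because the query travels in the `q` URL parameter, it has to be URL-encoded before it is sent. This is a minimal sketch, assuming the query endpoint from the earlier slide. `buildQueryUrl` is a hypothetical helper name, and the host and credentials are the example values from the slides.

```javascript
// Hypothetical helper: URL-encodes a query into the `q` parameter of the
// series endpoint used on the earlier slides.
function buildQueryUrl(baseUrl, database, user, password, query) {
  return baseUrl + "/db/" + encodeURIComponent(database) + "/series" +
    "?u=" + encodeURIComponent(user) +
    "&p=" + encodeURIComponent(password) +
    "&q=" + encodeURIComponent(query);
}

const queryUrl = buildQueryUrl("http://localhost:8086", "mydb", "paul", "pass",
  "select * from foo where time > now() - 4h");
console.log(queryUrl);
```

Spaces and the `>` sign are percent-encoded, while `*`, `(`, `)`, and `-` pass through unchanged.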

Slide 13

Slide 13 text

JSON data returned
[{
  "name": "foo",
  "columns": ["time", "sequence_number", "val1", "val2"],
  "points": [
    [1384295094, 3, "paul", 23],
    [1384295094, 2, "john", 92],
    [1384295094, 1, "todd", 61]
  ]
}, {...}]
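The columns/points layout is compact, but code often wants one object per point. The later slides read fields such as `point.type` and `point.time`, which suggests the client hands back objects. Whether the actual client library performs this conversion is an assumption. The sketch below shows one way it could be done, and `seriesToObjects` is a hypothetical name.

```javascript
// Turns one series in the columns/points layout into an array of objects,
// one per point, keyed by column name.
function seriesToObjects(series) {
  return series.points.map(function (point) {
    const row = {};
    series.columns.forEach(function (column, i) {
      row[column] = point[i];
    });
    return row;
  });
}

// The series from the slide.
const fooSeries = {
  name: "foo",
  columns: ["time", "sequence_number", "val1", "val2"],
  points: [
    [1384295094, 3, "paul", 23],
    [1384295094, 2, "john", 92],
    [1384295094, 1, "todd", 61]
  ]
};
const fooRows = seriesToObjects(fooSeries);
console.log(fooRows[0].val1, fooRows[0].val2);
```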

Slide 14

Slide 14 text

select count(state) from user_events group by time(5m) where time > now() - 7d and state = 'MA'

Slide 15

Slide 15 text

JSON data returned
[{
  "name": "user_events",
  "columns": ["time", "count"],
  "points": [
    [1389729000, 3],
    [1389728700, 5],
    [1389728400, 1]
  ]
}, {...}]

Slide 16

Slide 16 text

select count(state), state from user_events group by time(5m), state where time > now() - 7d

Slide 17

Slide 17 text

JSON data returned
[{
  "name": "user_events",
  "columns": ["time", "count", "state"],
  "points": [
    [1389729000, 3, "NY"],
    [1389729000, 5, "CO"],
    [1389728700, 1, "NY"]
  ]
}, {...}]

Slide 18

Slide 18 text

Other functions
● Percentiles
● Min, max, first, last, sum, count, stddev
● Histogram
● Lag, lead, funnel (next release)

Slide 19

Slide 19 text

Built in UI

Slide 20

Slide 20 text

Building Custom UIs
# install influxdb
brew update && brew install influxdb
influxdb -config=/usr/local/etc/influxdb.conf
# set up the admin dashboard
git clone https://github.com/influxdb/influxdb-admin
cd influxdb-admin
bundle install
middleman server
open http://localhost:4567

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

Starting the Interface
# make the dir in the local admin interface
# dir so it shows up on the drop down
mkdir /usr/local/opt/influxdb/share/admin/interfaces/paul_owns
# make the dir and file in the influxdb-admin repo
mkdir source/interfaces/paul_owns
touch source/interfaces/paul_owns/index.html # or slim, etc.

Slide 23

Slide 23 text

Kill Live Reload
# line 36 in config.rb
# activate :livereload

Slide 24

Slide 24 text

Paul's Custom Dash

Slide 25

Slide 25 text

No content

Slide 26

Slide 26 text

Slide 27

Slide 27 text

Write some data
[
  {
    "name": "events",
    "columns": ["type", "email"],
    "points": [
      ["signup", "[email protected]"],
      ["paid", "[email protected]"]
    ]
  }
]

Slide 28

Slide 28 text

...
Graph it!

Slide 29

Slide 29 text

q = "SELECT COUNT(type) FROM events GROUP BY time(1h), type fill(0)"
parent.influxdb.query(q, (points) =>
  typeToSeries = {}
  points.forEach((point) =>
    series = typeToSeries[point.type]
    if !series
      typeToSeries[point.type] = []
      series = typeToSeries[point.type]
    series.push({x: point.time / 1000, y: point.count})
  )

Slide 30

Slide 30 text

lastColorUsed = -1
colors = d3.scale.category10()
data = for type, series of typeToSeries
  lastColorUsed += 1
  {
    data: series.reverse(),
    color: colors(lastColorUsed),
    name: type
  }

Slide 31

Slide 31 text

graph = new Rickshaw.Graph(
  height: 200,
  element: document.querySelector("#events_graph"),
  renderer: 'area',
  stroke: true,
  series: data
)
graph.render()
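The data shaping on slides 29 and 30 can be tried without a browser. This is a minimal sketch in plain JavaScript, assuming query results arrive as objects with `time` (milliseconds), `type`, and `count` fields, as the slide code implies. `groupByType`, `toGraphSeries`, and the hard-coded palette are illustrative stand-ins (the slides use `d3.scale.category10()`), and the sample points are made up.

```javascript
// Stand-in palette (assumption); the slides use d3.scale.category10().
const palette = ["#1f77b4", "#ff7f0e", "#2ca02c"];

// Groups points by type, converting millisecond timestamps to seconds.
function groupByType(points) {
  const typeToSeries = {};
  points.forEach(function (point) {
    if (!typeToSeries[point.type]) {
      typeToSeries[point.type] = [];
    }
    typeToSeries[point.type].push({ x: point.time / 1000, y: point.count });
  });
  return typeToSeries;
}

// Builds one entry per type in the shape passed as `series` to the graph.
// Results are assumed to arrive newest first, so each series is reversed
// (on a copy) to put its points in ascending time order.
function toGraphSeries(typeToSeries) {
  return Object.keys(typeToSeries).map(function (type, i) {
    return {
      data: typeToSeries[type].slice().reverse(),
      color: palette[i % palette.length],
      name: type
    };
  });
}

// Made-up sample points, newest first.
const samplePoints = [
  { time: 7200000, type: "signup", count: 2 },
  { time: 7200000, type: "paid", count: 1 },
  { time: 3600000, type: "signup", count: 5 }
];
const graphSeries = toGraphSeries(groupByType(samplePoints));
console.log(JSON.stringify(graphSeries));
```

Reversing a copy rather than the original array keeps `groupByType`'s output unchanged if it is reused elsewhere.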

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

...

Detail

Add some hover detail

Slide 34

Slide 34 text

hoverDetail = new Rickshaw.Graph.HoverDetail
  graph: graph,
  formatter: (series, x, y) ->
    date = '' + new Date(x * 1000).toUTCString() + ''
    swatch = ''
    content = swatch + series.name + ": " + parseInt(y) + '<br>' + date
    renderDetail(series.name, x)
    content

Slide 35

Slide 35 text

renderDetail = (series, x) =>
  startTime = x
  endTime = startTime + 3600
  q = "select * from events where time > " + startTime + "s and time < " + endTime + "s and type = '" + series + "'"
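The drill-down query covers the one-hour bucket that starts at the hovered point. This is a minimal sketch of the same string building as a standalone function. `buildDetailQuery` is a hypothetical name, and the `s` suffix on the timestamps is copied from the slide rather than checked against the query language.

```javascript
const SECONDS_PER_HOUR = 3600;

// Builds the drill-down query for the hour starting at startSeconds.
// Mirrors the string concatenation on the slide; the type value is
// inserted as-is, so it should come from a trusted source.
function buildDetailQuery(type, startSeconds) {
  const endSeconds = startSeconds + SECONDS_PER_HOUR;
  return "select * from events where time > " + startSeconds +
    "s and time < " + endSeconds + "s and type = '" + type + "'";
}

const detailQuery = buildDetailQuery("signup", 1389729000);
console.log(detailQuery);
```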

Slide 36

Slide 36 text

parent.influxdb.query(q, (points) =>
  ul = "<ul>"
  points.forEach (point) =>
    li = "<li>"
    li += point.email
    li += "</li>"
    ul += li
  ul += "</ul>"
  $("#events_detail").html(ul)
)

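The detail list on slide 36 is built by concatenating markup around each email. This is a minimal sketch of the same idea as a pure function, assuming the slide builds a `<ul>` of `<li>` items (the variable names `ul` and `li` suggest so). The escaping step is an addition not shown on the slide, and `escapeHtml` and `renderEmailList` are hypothetical names.

```javascript
// Escapes characters that would otherwise be read as markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Builds an unordered list with one item per point's email.
function renderEmailList(points) {
  const items = points.map(function (point) {
    return "<li>" + escapeHtml(point.email) + "</li>";
  });
  return "<ul>" + items.join("") + "</ul>";
}

// Made-up sample data; the second entry shows the escaping.
const listHtml = renderEmailList([
  { email: "a@example.com" },
  { email: "<b>@example.com" }
]);
console.log(listHtml);
```

The resulting string could be passed to `$("#events_detail").html(...)` as on the slide.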
Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

Many possibilities
● Histograms over time
● Pulling in context
● Drilling down

Slide 39

Slide 39 text

We’re looking for help
● Dashboard builder
● Explorer

Slide 40

Slide 40 text

Thanks! [email protected] @pauldix