Slide 1

Slide 1 text

Getting Started with Time Series Data Sebastian Cohnen, @tisba
 stormforger.com, @StormForgerApp
 NoSQL Matters 2014, Dublin, Sep. 2014 with InfluxDB

Slide 2

Slide 2 text

About me • Sebastian Cohnen, M. Sc. • Developer & Founder from Cologne, Germany • (Web-) Architectures, Performance & Scalability • Founder of StormForger.com; load test HTTP-based systems

Slide 3

Slide 3 text

Time Series Data • Data, which has an inherent relation to time, e.g… • Sensor Data, Log Data • Stock Prices • "Events" in general

Slide 4

Slide 4 text

Use Cases • (statistical) Analysis • Summaries and Aggregations • Visualization • Dashboards,
 Monitoring & Alerting

Slide 5

Slide 5 text

Time Series Database (TSDB) • Make use of the inherent relation to time • (almost) everything builds around “ORDER BY time” • Examples: RRDTool, carbon (Graphite), OpenTSDB, … and InfluxDB

Slide 6

Slide 6 text

Wish List for TSDBs • Flexible Query Language; e.g. SQL-like • Stream Processing • Full control over Downsampling & Retention Policies

Slide 7

Slide 7 text

No content

Slide 8

Slide 8 text

No content

Slide 9

Slide 9 text

InfluxDB • written in Go, Open Source since 04/2013 • HTTP (binary protocol planned) • SQL-like Query Language • Clustering Support

Slide 10

Slide 10 text

Data Model • Databases • Time Series (“Tables”) • Points/Events (“Rows”): schema-less, key-value pairs

Slide 11

Slide 11 text

Let's take a look…

Slide 12

Slide 12 text

Example Data time host value mount 1409665622000 a.example.com 42 /mnt 1409137873000 b.example.com 23 /mnt

Slide 13

Slide 13 text

Writing Data [{ "name" : "hd_used", ! "columns" : ["time", "value", "host", "mount"], "points" : [ [1409665622000, 42, "a.example.com", "/mnt"], [1409137873000, 23, "b.example.com", "/mnt"] ] ! }] POST "http://localhost:8086/db/matters-dub-2014/series" database time series points

Slide 14

Slide 14 text

[{ "name": "hd_used", "columns": [ "time", "sequence_number", "value", "host", "mount" ], "points": [ [ 1409137878266, 10001, 42, "a.example.com", "/mnt" ], … ] }] GET "http://localhost:8086/db/ matters-dub-2014/series?q=$QUERY" Querying Data SELECT * FROM hd_used

Slide 15

Slide 15 text

InfluxDB Query Language

Slide 16

Slide 16 text

SQL-like SELECT status, duration, path FROM production.web.server01.requests WHERE time > now() - 1h

Slide 17

Slide 17 text

Aggregation SELECT COUNT(duration) AS request_count, MEDIAN(duration) AS median, PERCENTILE(duration, 95.0) AS p95, PERCENTILE(duration, 99.0) AS p99 FROM production.web.server01.requests WHERE time > "2014-09-03 00:00:00.000" AND time < "2014-09-05" GROUP BY time(1m)

Slide 18

Slide 18 text

Regular Expressions SELECT COUNT(duration) AS request_count FROM production.web.server01.requests WHERE time > "2014-09-03 00:00:00.000" AND time < "2014-09-05" ! AND user_agent =~ /.*Chrome.*/ ! GROUP BY time(1m)

Slide 19

Slide 19 text

Selecting Multiple Series SELECT COUNT(duration) AS request_count ! FROM production.web.server01.requests, production.web.server02.requests ! WHERE time > "2014-09-03 00:00:00.000" AND time < "2014-09-05"

Slide 20

Slide 20 text

…or via Regular Expressions SELECT COUNT(duration) AS request_count ! FROM /production\.web\.server\d{2}\.requests/ ! ! WHERE time > "2014-09-03 00:00:00.000" AND time < "2014-09-05"

Slide 21

Slide 21 text

Join SELECT errors_per_minute.value /
 page_views_per_minute.value ! FROM errors_per_minute INNER JOIN page_views_per_minute a1 b1 a2 b2 a3 a3 a1 b2 a2 b1 b3 a3 t1 t2 t3 t4

Slide 22

Slide 22 text

Merge SELECT COUNT(duration) FROM production.web.server01.requests MERGE production.web.server02.requests a1 b1 a2 b2 a3 a3 a1 b2 a2 b1 b3 a3

Slide 23

Slide 23 text

Continuous Queries “Stream Processing”

Slide 24

Slide 24 text

Continuous Queries • Idea: Process data as it arrives, not (only) at query time • Continuous Queries can be defined via "SQL" and run continuously • Support for backfilling (e.g. when creating new queries on existing data)

Slide 25

Slide 25 text

Continuous Queries: Use Cases • Organizing, Fanout • Downsampling & Aggregation

Slide 26

Slide 26 text

Organizing Data

Slide 27

Slide 27 text

time: 1409137878266 path: /example.html duration: 0.23 status: 200 time: 1409137878266 path: /example.html duration: 0.23 status: 200 time: 1409137878266 path: /example.html duration: 0.23 status: 200 time: 1409137878266 path: /example.html duration: 0.23 status: 200 time: 1409137878266 path: /example.html duration: 0.23 status: 200 SELECT path, duration FROM http_requests GROUP BY http_status response_times.500 … response_times.200 Input Events http_requests INTO response_times.[http_status]

Slide 28

Slide 28 text

SELECT Across Series SELECT PERCENTILE(duration, 95.0) FROM /response_times\..*/ { "response_times.200": [{ "percentile": 5.058, "time": 0 }], "response_times.500": [{ "percentile": 63.761, "time": 0 }] }

Slide 29

Slide 29 text

Downsample & Aggregate Data

Slide 30

Slide 30 text

Goals • group by status code • keep 1 data point per minute • calculate 95th percentile & mean duration • put results into response_times.1m.$status_code

Slide 31

Slide 31 text

time: 1409137878266 path: /example.html duration: 0.23 status: 200 time: 1409137878266 path: /example.html duration: 0.23 status: 200 time: 1409137878266 path: /example.html duration: 0.23 status: 200 time: 1409137878266 path: /example.html duration: 0.23 status: 200 time: 1409137878266 path: /example.html duration: 0.23 status: 200 SELECT MEAN(duration) AS mean, PERCENTILE(duration, 95.0) AS p95 FROM http_requests GROUP BY time(1m), http_status INTO response_times.1m.[http_status] http_requests response_times.1m.500 … response_times.1m.200 1min 1min 1min

Slide 32

Slide 32 text

SELECT time, p95, mean FROM /response_times\.1m\..*/ SELECT time, p95, mean FROM response_times.1m.200 WHERE time > "2014-09-03 00:00:00.000" AND time < "2014-09-05"

Slide 33

Slide 33 text

InfluxDB UIs • build-in (Simple Management) • http://grafana.org/ • https://github.com/ hakobera/influga

Slide 34

Slide 34 text

:-)

Slide 35

Slide 35 text

Q&A Sebastian Cohnen, @tisba
 stormforger.com, @StormForgerApp
 NoSQL Matters 2014, Dublin, Sep. 2014