InfluxDB - a distributed
time series, metrics, and
events database
Paul Dix
paul@influxdb.com
@pauldix
@influxdb
Slide 2
Slide 2 text
YC (W13), 3 people full time:
Todd Persen
John Shahid
Paul Dix (me)
Slide 3
Slide 3 text
What it’s for…
Slide 4
Slide 4 text
Metrics
Slide 5
Slide 5 text
Time Series
Slide 6
Slide 6 text
Analytics
Slide 7
Slide 7 text
Events
Slide 8
Slide 8 text
Can’t you just use a
regular DB?
Slide 9
Slide 9 text
order by time?
Slide 10
Slide 10 text
Doesn’t Scale
Slide 11
Slide 11 text
Example from metrics:
100 measurements per host *
10 hosts *
8640 per day (once every 10s) *
365 days
= 3,153,600,000 records per year
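The arithmetic on this slide can be checked with a short sketch. The figures come from the slide; the variable names are illustrative only.

```python
# Figures taken from the slide; names are illustrative.
measurements_per_host = 100
hosts = 10
sample_interval_s = 10                       # one sample every 10 seconds
seconds_per_day = 24 * 60 * 60
samples_per_day = seconds_per_day // sample_interval_s
days_per_year = 365

records_per_year = (
    measurements_per_host * hosts * samples_per_day * days_per_year
)

print(f"{samples_per_day:,} samples per day per measurement")
print(f"{records_per_year:,} records per year")
```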
Slide 12
Slide 12 text
Have fun with that
table…
Slide 13
Slide 13 text
But wait, we’ll just keep
the summaries!
Slide 14
Slide 14 text
1h averages =
8,760,000 per year
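As a rough check, the hourly figure follows from the same 100 measurements and 10 hosts used in the raw-data example. The sketch below assumes one averaged row per measurement per host per hour; the names are illustrative.

```python
# Same host and measurement counts as the raw-data example on the earlier slide.
measurements_per_host = 100
hosts = 10
hours_per_year = 24 * 365

# Assumption: one averaged row per measurement, per host, per hour.
hourly_rows_per_year = measurements_per_host * hosts * hours_per_year

# Raw rows at one sample every 10 seconds, for comparison.
raw_rows_per_year = measurements_per_host * hosts * 8640 * 365
reduction_factor = raw_rows_per_year // hourly_rows_per_year

print(f"{hourly_rows_per_year:,} hourly rows per year")
print(f"{reduction_factor}x fewer rows than the raw data")
```

Each hourly row stands in for 360 raw samples, which is where the loss of detail on the next slide comes from.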
Slide 15
Slide 15 text
Lose Detail and
Ad Hoc Queryability
Slide 16
Slide 16 text
So let’s use Cassandra,
HBase, or Scaleasaurus!
Slide 17
Slide 17 text
Too much application
code and complexity
Slide 18
Slide 18 text
Application logic and
scripts to compute
summaries
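A minimal sketch of the kind of rollup script this slide refers to, assuming raw points arrive as (unix_seconds, value) pairs. The function name and data layout are hypothetical and not taken from any particular system.

```python
from collections import defaultdict


def hourly_averages(points):
    """Average (unix_seconds, value) pairs into one value per hour bucket."""
    sums = defaultdict(lambda: [0.0, 0])      # bucket start -> [total, count]
    for ts, value in points:
        bucket = ts - ts % 3600               # floor the timestamp to the hour
        acc = sums[bucket]
        acc[0] += value
        acc[1] += 1
    return {
        bucket: total / count
        for bucket, (total, count) in sorted(sums.items())
    }


raw = [(0, 1.0), (10, 3.0), (3600, 10.0), (3610, 20.0)]
summary = hourly_averages(raw)
print(summary)
```

Code like this has to be written, scheduled, and kept correct by the application team, separately from the database that stores the raw points.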
Slide 19
Slide 19 text
Application level logic
for balancing
Slide 20
Slide 20 text
No data locality for
ad hoc queries
Slide 21
Slide 21 text
And then there’s
more…
Slide 22
Slide 22 text
Web services
Slide 23
Slide 23 text
Libraries for web
services
Slide 24
Slide 24 text
Data collection
Slide 25
Slide 25 text
Visualization
Slide 26
Slide 26 text
“Building an application with an analytics
component today is like building a web
application in 1998. You spend months building
infrastructure before getting to the actual thing
you want to build.”
– Paul Dix
Slide 27
Slide 27 text
Analytics should be about
analyzing and interpreting data,
not the infrastructure to store and
process it.
Slide 29
Slide 29 text
HTTP API
Web services built in
Slide 30
Slide 30 text
HTTP API (writes)
curl -X POST \
'http://localhost:8086/db/mydb/series?u=paul&p=pass' \
-d '[{"name":"foo", "columns":["val"], "points": [[3]]}]'
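The same write can be sketched in Python with only the standard library. The URL, credentials, and payload mirror the curl command above; the helper name `build_write_request` is hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_write_request(host, db, user, password, series):
    """Build (but do not send) a POST mirroring the curl example."""
    query = urllib.parse.urlencode({"u": user, "p": password})
    url = f"http://{host}/db/{db}/series?{query}"
    body = json.dumps(series).encode("utf-8")
    return urllib.request.Request(url, data=body, method="POST")


# One series named "foo" with a single column "val" and a single point, 3.
series = [{"name": "foo", "columns": ["val"], "points": [[3]]}]
req = build_write_request("localhost:8086", "mydb", "paul", "pass", series)

print(req.get_method(), req.full_url)
# To send it, a server must be listening on localhost:8086:
# urllib.request.urlopen(req)
```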
JavaScript library + D3,
HighCharts, Rickshaw,
NVD3, etc.
Definitely more to do here!
Slide 59
Slide 59 text
Data Collection
CollectD Proxy, StatsD backend, Carbon ingestion,
OpenTSDB (soon)
Slide 60
Slide 60 text
Coming Soon
Slide 61
Slide 61 text
ugh, Documentation
Slide 62
Slide 62 text
Series Metadata
Slide 63
Slide 63 text
Binary Protocol
Slide 64
Slide 64 text
Pubsub
select * from some_series
where host = "serverA"
into subscription()
select percentile(90, value) from some_series
group by time(1m)
into subscription()
Slide 65
Slide 65 text
Custom Functions
select myFunc(value) from some_series
Slide 66
Slide 66 text
Rack aware sharding
and querying
Slide 67
Slide 67 text
Multi-datacenter
replication
Push and bi-directional
Slide 68
Slide 68 text
Indexes?
Slide 69
Slide 69 text
Ponies?
Tell @jvshahid that you want your pony ;)
Slide 70
Slide 70 text
But it’s ready to go now.
Production deployments
already running.
Slide 71
Slide 71 text
Need help?
support@influxdb.com
Thanks!
paul@influxdb.com
@pauldix