Time Series Data Models and the future of InfluxDB's Query Language
A talk I gave at the Austin Monitoring Meetup about different open source projects' time series data models and some ideas for a new InfluxDB query language.
Lookup tag/label values OpenTSDB /api/search/lookup?query={host=*} (all series with host?) Prometheus {host=~“.+”} (all series with host?) InfluxDB SHOW TAG VALUES WITH KEY = “host” SHOW TAG VAVUES FROM “cpu” with KEY = “host”
Drill Down OpenTSDB /api/search/lookup?query={host=*,service=mysql} (all metrics on the hosts) Prometheus {host=~“.+”,service=mysql} (all metrics on the hosts) InfluxDB SHOW TAG VALUES WITH KEY = “host” WHERE “service” = ‘mysql’
Facets __name__ host service region group … go_goroutines go_memstats_alloc_bytes go_memstats_alloc_bytes_total go_memstats_gc_sys_bytes go_memstats_other_sys_bytes …
Facets __name__ host service region group … go_goroutines go_memstats_alloc_bytes go_memstats_alloc_bytes_total go_memstats_gc_sys_bytes go_memstats_other_sys_bytes … host service region group …
Facets __name__ host service region group … go_goroutines go_memstats_alloc_bytes go_memstats_alloc_bytes_total go_memstats_gc_sys_bytes go_memstats_other_sys_bytes … host service region group … dynamic hierarchy! (name already selected)
Query Languages Query Language Example Graphite functional target, from, until sumSeries(summarize(water.level.h2o.feet.*, '1hour', 'max')) OpenTSDB http params startTime, endTime, metric, aggregationFunction, filter, functions, expressions Prometheus functional-ish increase(http_requests_total{job=“prometheus”}[5m]) InfluxDB 1.x SQL-ish select mean(system) from cpu where time > now() - 6h group by time(10m)
Requirements • Support InfluxDB 1.x Data Model • Support InfluxDB 1.x QL • Support Prometheus Data Model • Functional Query Language • Rich Query Builder UI • Query Completion CLI • PromQL?
Example Queries database(name:"testdb") .select(criteria:`"host" = 'A' and "system" = 'cpu'`) .range(startOffset:"-1h") Wrap strings in back ticks to avoid pesky escaping
// get the cpu load of hosts that have mysql running var db = database(name:"testdb") db.select( criteria:`"system" = 'cpu' and "metric" = 'load' and "host" in #{ db.select(`"service" = 'mysql'`).values(key:"host") }`) .range(startOffset:"-4h")
// get the cpu load of hosts that have mysql running var db = database(name:"testdb") db.select( criteria:`"system" = 'cpu' and "metric" = 'load' and "host" in #{ db.select(`"service" = 'mysql'`).values(key:"host") }`) .range(startOffset:"-4h") Variables
// get the cpu load of hosts that have mysql running var db = database(name:"testdb") db.select( criteria:`"system" = 'cpu' and "metric" = 'load' and "host" in #{ db.select(`"service" = 'mysql'`).values(key:"host") }`) .range(startOffset:"-4h") string interpolation
// get the count in 10m periods in the last 24h from an event stream // and filter that to only include those periods that were 2 sigma above the average var m = database(name:”testdb”).select(criteria:"\"event\" = 'pageview'") .range(startOffset:"-24h") .merge() .window(func:count(),duration:"10m")
// get the count in 10m periods in the last 24h from an event stream // and filter that to only include those periods that were 2 sigma above the average var m = database(name:”testdb”).select(criteria:”\”event\” = 'pageview'") .range(startOffset:"-24h") .merge() .window(func:count(),duration:"10m") // this is shorthand for m.stddev.join(op:"*", right:2) var sigma = m.stddev() * 2
// get the count in 10m periods in the last 24h from an event stream // and filter that to only include those periods that were 2 sigma above the average var m = database(name:”testdb”).select(criteria:”\”event\” = 'pageview'") .range(startOffset:"-24h") .merge() .window(func:count(),duration:"10m") // this is shorthand for m.stddev.join(op:"*", right:2) var sigma = m.stddev() * 2 // return only the counts 1 sigma above m.filter(exp:"$ > #{sigma}")
// return the last hour of time series of the top 10 host cpu utilizations by // their average load over last 10 minutes var topTen = db.select(criteria:”\”metric\” = 'load' and system = 'cpu'") .range(startOffset:"-10m") .mean() .sort(func:first()) .slice(end:10) .values(key:"host") db.select(criteria:”\"metric\" = 'load' and system = 'cpu' and host in #{topTen}") .range(startOffset:"-1h")