Upgrade to Pro — share decks privately, control downloads, hide ads and more …

Taming InfluxDB

Taming InfluxDB

Story about taming InfluxDB from its very beginning in Teachbase

Makar Ermokhin

September 28, 2019
Tweet

More Decks by Makar Ermokhin

Other Decks in Programming

Transcript

  1. + =

  2. !6

  3. !7

  4. !8

  5. !9

  6. Timeline 2013 — InfluxDB initial release 2014 — Teachbase started

    using InfluxDB (2015 — Prometheus public release) Present days — InfluxDB 2.0 alpha !10
  7. Collect users «active time» in the app Read stats grouped

    by time (day, month) Filter stats by resource (course, quiz, etc.) Generate charts! Challenge !12
  8. PostgreSQL: grouping by time SELECT DATE_TRUNC('minute', created_at) AS time, COUNT(DISTINCT(visits.user_id))

    AS users_count FROM visits WHERE visits.created_at > '2019-09-25 21:46:35 +0300'"::timestamp AND visits.created_at < '2019-09-25 23:46:35 +0300'"::timestamp GROUP BY 1 !15
  9. PostgreSQL: grouping without gaps WITH minutes AS ( SELECT generate_series(

    DATE_TRUNC('minute', now()) - '6 hours'"::interval, DATE_TRUNC('minute', now()), '1 minute'"::interval ) as minute ) SELECT minutes.minute AS time, COUNT(DISTINCT(visits.user_id)) AS users_count FROM minutes LEFT JOIN users ON DATE_TRUNC('minute', visits.created_at) = minutes.minute GROUP BY 1 !17
  10. Downsampling data SELECT COUNT(DISTINCT(user_id)) AS users_count FROM visits WHERE time

    > '2019-09-25T03:00:00Z' AND time < '2019-09-26T03:00:00Z' GROUP BY time(1h) !20
  11. Relational DBMS: comparison with the previous record "-- price changing

    "-- of course if we have window functions SELECT price - lag(price) OVER my_window FROM stock_prices WINDOW my_window AS (ORDER BY created_at ASC) WHERE series_id = 'some' !21
  12. SELECT one_host.cpu_load AS cpu_load_first, other_host.cpu_load AS cpu_load_second FROM one_host INNER

    JOIN other_host ON ABS((one_host.time - other_host.time)"::float) < 0.01 !23 Relational DBMS: joining by time
  13. InfluxDB SQL-like query language "=> easier to learn HTTP API

    "=> easier to build clients New and shiny (in 2014) !29
  14. InfluxDB fields Key-value pairs that stores your data Values can

    be strings, floats, integers, or Booleans Not indexed !32
  15. Query example SELECT mean(time_spent) FROM visits WHERE time > '2019-09-23

    03:00:00' AND time < '2019-09-24 03:00:00’ AND account_id = 1985 GROUP BY time(1h), user_id, account_id !34
  16. Query example SELECT mean(time_spent) FROM visits WHERE time > '2019-09-23

    03:00:00' AND time < '2019-09-24 03:00:00’ AND account_id = 1985 GROUP BY time(1h), user_id, account_id Select aggregated stats !35
  17. Query example SELECT mean(time_spent) FROM visits WHERE time > '2019-09-23

    03:00:00' AND time < '2019-09-24 03:00:00’ AND account_id = 1985 GROUP BY time(1h), user_id, account_id Specify measurement !36
  18. Query example SELECT mean(time_spent) FROM visits WHERE time > '2019-09-23

    03:00:00' AND time < '2019-09-24 03:00:00’ AND account_id = 1985 GROUP BY time(1h), user_id, account_id Filtering by timestamp and tag !37
  19. Query example SELECT mean(time_spent) FROM visits WHERE time > '2019-09-23

    03:00:00' AND time < '2019-09-24 03:00:00’ AND account_id = 1985 GROUP BY time(1h), user_id, account_id Group by time interval and tag !38
  20. Continuous Queries Continuous queries (CQ) are InfluxQL queries that run

    automatically and periodically on realtime data and store query results in a specified measurement. !39
  21. CQ example CREATE CONTINUOUS QUERY cq_course_visits_by_day ON teachbase BEGIN SELECT

    sum(time_spent) AS total_spent, count(distinct(user_id)) AS user_count, count(distinct(account_id)) AS account_count INTO course_visits_by_day FROM visits GROUP BY time(1d), course_id END !40
  22. Writing activity Open WebSocket connection when user visits page Send

    context (tags) via socket Commit time into Influx when user left page or once in X seconds !42
  23. class VisitMetrics < Influxer"::Metrics set_series :visits tags :course_session_id, :course_id, :account_id,

    :user_id attributes :time_spent, :field_user_id, :field_account_id validates :user_id, :account_id, :time_spent, presence: true before_write :tags_to_fields scope :by_course, "->(id) { where(course_id: id) } scope :today, "-> { where(time: Time.current.all_day) } private def tags_to_fields self.field_account_id = account_id self.field_user_id = user_id end end !46
  24. !48

  25. class CourseSession < ApplicationRecord has_metrics :visits, foreign_key: :course_session_id end course_session

    = CourseSession.find(100500) course_session.visits # "=> Influxer"::Relation course_session.visits.today # "=> only today !49 Associations
  26. Caching course_session = CourseSession.find(100500) course_session.visits.past(:hour) # performing query and caching

    result for 1 minute # some magic VisitMetrics.where(course_session_id: 100500).past(:hour) # fetch from cache course_session.visits # performing and caching with default expiring time !50
  27. Dumping to PostgreSQL Once per day Just fetch data from

    Influx and write it into Postgres !55
  28. InfluxDB out of memory CQ causes Influx to consume all

    memory Influx unable to restart Days of analytics lost !58
  29. TICK stack Telegraf — collects metrics InfluxDB — stores metrics

    Chronograf — shows metrics Kapacitor — processes metrics (sends alerts) !63
  30. TICK stack Easy to setup Flexible metrics & views Many

    plugins for Telegraf There is no Ruby plugins !66
  31. Prometheus: learning curve Web UI out of box ✅ No

    CLI by default ❌ Custom, non-SQL-like, query language ❌ !75
  32. Prometheus: integration complexity HTTP API ✅ Official Ruby client ✅

    No obvious way to insert data ⁉ No SQL-like querying support ❌ !76
  33. TimescaleDB: infrastructure overhead If you use PostgreSQL you really don’t

    need any new infrastructure ✅ Otherwise you need only PostgreSQL (and maybe migrate your DB to PostgreSQL) ✅ Cloud version ✅ !82
  34. TimescaleDB: popularity !83 Db-engines score: 1.40 Db-engines overall rank: #126

    Db-engines TSDB rank: #8 PostgreSQL score/overall rank: 482.25/#4
  35. TimescaleDB: other ACID transactions ✅ You able to store time

    series and other data in one DB ✅ !84
  36. Clickhouse: integration complexity HTTP API ✅ Ruby client ✅ No

    ActiveRecord-like client ❌ Batch insert strong preferred ❌ !86
  37. Clickhouse: other !89 Clickhouse is not for key-value requests Didn’t

    found built-in way for gapfilling Clickhouse is good for «wide» tables Clickhouse is good in big data
  38. Summary Prometheus is good for monitoring TimescaleDB is good when

    you want store everything in one DB Clickhouse is good for BigData InfluxDB is still good for us !94