Building Distributed Timeseries Database in Go

Vulcan is an open source distributed timeseries database based on Prometheus. This talk covers its origins and how we built it.

Matthew Campbell

February 25, 2017

Transcript

  1. Distributed Timeseries Database in Go

  2. Go is the future of NoSQL/NewSQL

  3. Databases written in Go
     - Prometheus
     - CockroachDB
     - InfluxDB
     - Dgraph
     - etcd
     - Consul
  4. Talk about architecture

  5. What is a timeseries

  6. Use cases for timeseries
     - Stocks
     - Monitoring
     - IoT

  7. About timeseries
     - Timeseries can be lossy
     - Timeseries compress uniquely on data sets
     - Write heavy
     - Key, Time, DataPoint
     - CNX:IND, June 15 12:23, $23.40
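The (Key, Time, DataPoint) shape from slide 7 can be sketched as a small Go type. This is only an illustration, not Vulcan's code: the `Series`, `Sample`, and `Append` names are hypothetical, and the rule that appends must arrive in time order is an assumption that suits a write-heavy, append-only workload.

```go
package main

import (
	"errors"
	"fmt"
)

// Sample is one data point: a time and a value.
type Sample struct {
	TimestampMs int64   // Unix time in milliseconds
	Value       float64 // the data point itself
}

// Series groups the samples that share one key.
type Series struct {
	Key     string
	Samples []Sample
}

// ErrOutOfOrder is returned when a sample is not newer than the last one.
var ErrOutOfOrder = errors.New("sample is not newer than the last appended sample")

// Append adds a sample to the end of the series.
// Assumption: writes are append-only and strictly ordered by time.
func (s *Series) Append(tsMs int64, v float64) error {
	if n := len(s.Samples); n > 0 && tsMs <= s.Samples[n-1].TimestampMs {
		return ErrOutOfOrder
	}
	s.Samples = append(s.Samples, Sample{TimestampMs: tsMs, Value: v})
	return nil
}

func main() {
	// Timestamps here are arbitrary illustrative values.
	s := &Series{Key: "CNX:IND"}
	_ = s.Append(1000, 23.40)
	_ = s.Append(2000, 23.45)
	err := s.Append(1500, 23.10) // rejected: older than the last sample
	fmt.Println(len(s.Samples), err)
}
```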
  8. Dark days
     - Graphite
     - InfluxDB
     - MySQL storing metrics
     - OpenTSDB (UGGHHHHHH)
  9. Prometheus

  10. None
  11. rate(http_request_latency[1m])
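To give a feel for what a query like the one on slide 11 computes, here is a simplified Go sketch of a per-second rate over the samples in a window. It is not Prometheus's implementation: it assumes the series is a monotonically increasing counter, treats any drop in value as a counter reset, and skips the extrapolation to the window edges that Prometheus performs. `Point` and `Rate` are hypothetical names.

```go
package main

import "fmt"

// Point is one sample inside the query window.
type Point struct {
	T float64 // time in seconds
	V float64 // counter value
}

// Rate returns the average per-second increase across the points.
// The boolean is false when fewer than two points are available
// or when the points span no time.
func Rate(points []Point) (float64, bool) {
	if len(points) < 2 {
		return 0, false
	}
	increase := 0.0
	for i := 1; i < len(points); i++ {
		delta := points[i].V - points[i-1].V
		if delta < 0 {
			// Counter reset: assume it restarted from zero.
			delta = points[i].V
		}
		increase += delta
	}
	elapsed := points[len(points)-1].T - points[0].T
	if elapsed <= 0 {
		return 0, false
	}
	return increase / elapsed, true
}

func main() {
	// Three samples spread over a one-minute window.
	window := []Point{{T: 0, V: 0}, {T: 30, V: 60}, {T: 60, V: 120}}
	r, ok := Rate(window)
	fmt.Println(r, ok)
}
```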

  12. None
  13. Initial architecture Beta for 3000 customers

  14. Hash sharded Prometheus 3-4 per datacenter
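Hash sharding, as named on slide 14, can be sketched as mapping each series key to one of a fixed number of Prometheus instances. The slides do not say which hash function or routing key was used, so FNV-1a from the Go standard library and the `ShardFor` name are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// ShardFor maps a series key to a shard index in [0, shards).
// It returns -1 when shards is not positive.
// Assumption: FNV-1a is used only as an example hash.
func ShardFor(key string, shards int) int {
	if shards <= 0 {
		return -1
	}
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on an FNV hash never returns an error
	return int(h.Sum32() % uint32(shards))
}

func main() {
	// Four shards, matching the "3-4 per datacenter" figure on the slide.
	for _, key := range []string{"customer-1/cpu", "customer-2/cpu", "customer-3/disk"} {
		fmt.Printf("%s -> shard %d\n", key, ShardFor(key, 4))
	}
}
```

One consequence of plain modulo sharding is that changing the number of shards remaps most keys, which is worth keeping in mind when reading the later architecture changes.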

  15. None
  16. None
  17. Performance requirements
      - 3 Gbits/sec of traffic
      - 100k writes a second
      - 50ms reads
      - 100,000 customers to start
      - 20 TB of storage
  18. Introducing Vulcan https://github.com/digitalocean/vulcan

  19. Strange PRs

  20. A fateful meeting at Soundcloud...

  21. Architecture changes
      - Split to microservices
      - Containerization
      - Message queues
  22. Pipelining data

  23. None
  24. Scaling storage

  25. None
  26. Metrics format

  27. Timeseries Schema
      - V1 Timeseries Table
        - key (Combined Key)
        - timestamp (Combined Key)
        - datapoint (float64)
  28. V2 Chunks (1 KB)

  29. V2 Timeseries Table
      - key (Combined Key)
      - timestamp range (2 hours) (Combined Key)
      - raw data (1 KB blob)
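The V2 layout keys each row by series key plus a two-hour time range, so a write first has to find the range its timestamp falls into. A minimal sketch of that bucketing follows; the `ChunkKey` type and `ChunkKeyFor` function are hypothetical, and aligning ranges to multiples of two hours since the Unix epoch is an assumption the slides do not confirm.

```go
package main

import (
	"fmt"
	"time"
)

// chunkWindow is the time range covered by one row, per the slide.
const chunkWindow = 2 * time.Hour

// ChunkKey is the combined key of a V2 row: series key plus range start.
type ChunkKey struct {
	Key   string
	Start int64 // start of the two-hour range, Unix seconds
}

// ChunkKeyFor returns the row key that a sample at unixSec belongs to.
// Assumption: timestamps are non-negative and ranges are epoch-aligned.
func ChunkKeyFor(key string, unixSec int64) ChunkKey {
	window := int64(chunkWindow / time.Second) // 7200 seconds
	return ChunkKey{Key: key, Start: unixSec - unixSec%window}
}

func main() {
	// Two samples five minutes apart land in the same row.
	a := ChunkKeyFor("cpu", 7200+60)
	b := ChunkKeyFor("cpu", 7200+360)
	fmt.Println(a, b, a == b)
}
```

Compared with the V1 layout of one row per data point, all samples in the same range share one row key, which is what lets them be stored together as a single blob.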
  30. Index Table
      - Customer (Combined key)
      - keyPrefix (Combined key)
      - time
      - key
  31. In memory query engine

  32. Downsampling
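Downsampling reduces the number of stored points by replacing each fixed-width time bucket with a summary value. The slides do not say which aggregation Vulcan used, so the sketch below assumes a simple average per bucket and input already sorted by time; `Point` and `Downsample` are hypothetical names.

```go
package main

import "fmt"

// Point is one sample: a Unix timestamp in seconds and a value.
type Point struct {
	T int64
	V float64
}

// Downsample averages points into buckets of `step` seconds.
// Each output point carries the start time of its bucket.
// Assumption: input is sorted by time with non-negative timestamps.
func Downsample(points []Point, step int64) []Point {
	if step <= 0 {
		return nil
	}
	var out []Point
	var sum float64
	var count int
	var bucket int64
	for _, p := range points {
		b := p.T - p.T%step
		if count > 0 && b != bucket {
			// Bucket boundary crossed: emit the finished bucket.
			out = append(out, Point{T: bucket, V: sum / float64(count)})
			sum, count = 0, 0
		}
		bucket = b
		sum += p.V
		count++
	}
	if count > 0 {
		out = append(out, Point{T: bucket, V: sum / float64(count)})
	}
	return out
}

func main() {
	raw := []Point{{0, 1}, {10, 3}, {20, 5}, {60, 10}, {90, 20}}
	// One-minute buckets: five raw points become two.
	fmt.Println(Downsample(raw, 60))
}
```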

  33. Final Architecture

  34. None
  35. Questions?