Vulcan is an open source distributed timeseries database based on Prometheus. This talk covers its origins and how we built it.
Distributed Timeseries Database in Go
Go is the future of NoSQL/NewSQL
Databases written in Go
- Prometheus
- CockroachDB
- InfluxDB
- Dgraph
- etcd
- Consul
Talk about architecture
What is a timeseries
Use cases for timeseries
- Stocks
- Monitoring
- IoT
About timeseries
- Timeseries can be lossy
- Timeseries compress uniquely on data sets
- Write heavy
- Key, Time, DataPoint
- CNX:IND, June 15 12:23, $23.40
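The "Key, Time, DataPoint" shape above can be sketched as a small Go type. This is only an illustration, not Vulcan's actual data model: the type name `Sample`, its field names, and the year used in the example timestamp are assumptions, since the slide gives only "June 15 12:23".

```go
package main

import (
	"fmt"
	"time"
)

// Sample is a hypothetical name for one timeseries data point:
// a key, a timestamp, and a value.
type Sample struct {
	Key   string
	Time  time.Time
	Value float64
}

// String renders the sample in the "Key, Time, DataPoint" order used on the slide.
func (s Sample) String() string {
	return fmt.Sprintf("%s, %s, %.2f", s.Key, s.Time.Format("Jan 2 15:04"), s.Value)
}

func main() {
	s := Sample{
		Key: "CNX:IND",
		// The year is an assumption; the slide does not state one.
		Time:  time.Date(2016, time.June, 15, 12, 23, 0, 0, time.UTC),
		Value: 23.40,
	}
	fmt.Println(s)
}
```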
Dark days
- Graphite
- InfluxDB
- MySQL storing metrics
- OpenTSDB (UGGHHHHHH)
Prometheus
rate(http_request_latency[1m])
Initial architecture
Beta for 3000 customers
Hash sharded Prometheus
3-4 per datacenter
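Hash sharding can be sketched roughly as follows. The talk does not say which hash function was used or what exactly was hashed, so FNV-1a, the modulo step, the function name `shardFor`, and the example keys are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a key to one of n shards by hashing the key.
// FNV-1a plus modulo is an assumed scheme, not necessarily the one used in the talk.
func shardFor(key string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(n))
}

func main() {
	const shards = 4 // the slide mentions 3-4 Prometheus instances per datacenter
	// Hypothetical keys, for illustration only.
	for _, key := range []string{"customer-1", "customer-2", "customer-3"} {
		fmt.Printf("%s -> shard %d\n", key, shardFor(key, shards))
	}
}
```

The same key always lands on the same shard, which is what lets a reader find its data again. A plain modulo scheme like this one remaps most keys whenever the shard count changes.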
Performance requirements
- 3 Gbits/sec of traffic
- 100k writes a second
- 50 ms reads
- 100,000 customers to start
- 20 TB of storage
Introducing Vulcan
https://github.com/digitalocean/vulcan
Strange PRs
A fateful meeting at SoundCloud...
Architecture changes
- Split to microservices
- Containerization
- Message queues
Pipelining data
Scaling storage
Metrics format
Timeseries Schema
- V1 Timeseries Table
  - key (Combined Key)
  - timestamp (Combined Key)
  - datapoint (float64)
- V2 Chunks (1 KB)
- V2 Timeseries Table
  - key (Combined Key)
  - timestamp range (2 hours) (Combined Key)
  - raw data (1 KB blob)
- Index Table
  - Customer (Combined Key)
  - keyPrefix (Combined Key)
  - time
  - key
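One way to picture the V2 row is as a key plus the start of its 2-hour window, with the samples packed into a blob. The sketch below shows only how a timestamp might be mapped to its window. The names `ChunkRow`, `windowStart`, and `rowFor`, the alignment of windows to even UTC hours, and the year in the example are assumptions; the slides do not specify them.

```go
package main

import (
	"fmt"
	"time"
)

// chunkWindow is the 2-hour range named in the V2 schema.
const chunkWindow = 2 * time.Hour

// ChunkRow is a hypothetical Go view of one V2 timeseries row.
type ChunkRow struct {
	Key         string    // first part of the combined key
	WindowStart time.Time // start of the 2-hour range, second part of the combined key
	Raw         []byte    // packed samples (about 1 KB per the slide); encoding not shown
}

// windowStart returns the start of the 2-hour window containing t,
// assuming windows are aligned to even hours in UTC.
func windowStart(t time.Time) time.Time {
	return t.UTC().Truncate(chunkWindow)
}

// rowFor builds an empty row for the window containing the given Unix timestamp.
func rowFor(key string, unixSec int64) ChunkRow {
	return ChunkRow{Key: key, WindowStart: windowStart(time.Unix(unixSec, 0))}
}

func main() {
	// The year is an assumption; the talk's example gives only "June 15 12:23".
	ts := time.Date(2016, time.June, 15, 12, 23, 0, 0, time.UTC)
	row := rowFor("CNX:IND", ts.Unix())
	fmt.Println(row.Key, row.WindowStart.Format(time.RFC3339))
}
```

Compared with the V1 layout of one row per data point, every sample that falls in the same window shares a single row, so a range read touches far fewer rows.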
In memory query engine
Downsampling
Final Architecture
Questions?