InfluxDB's new storage engine: The Time Structured Merge Tree

Paul Dix
October 14, 2015

Transcript

  1. InfluxDB’s new storage engine: The Time Structured Merge Tree

    Paul Dix, CEO at InfluxDB, @pauldix, paul@influxdb.com
  2. Shards

    Data organized into shards of time (10/10/2015, 10/11/2015, 10/12/2015, 10/13/2015). Each is an underlying DB, so it is efficient to drop old data.
  3. Arranging in Key/Value Stores

        Key             Value
        1,1443782126    80
        2,1443782126    18
        1,1443782127    81
        2,1443782256    15
        2,1443782130    17
        3,1443700126    18
  4. Components (similar to LSM Trees)

    WAL: same
    In-memory cache: like MemTables
    Index files: like SSTables
  5. In Memory Cache

        // cache and flush variables
        cacheLock  sync.RWMutex
        cache      map[string]Values
        flushCache map[string]Values

    Example key: temperature,device=dev1,building=b1#internal
  6. In Memory Cache

        // cache and flush variables
        cacheLock  sync.RWMutex
        cache      map[string]Values
        flushCache map[string]Values

    Writes can come in while the WAL flushes.
  7. In Memory Cache

        // cache and flush variables
        cacheLock  sync.RWMutex
        cache      map[string]Values
        flushCache map[string]Values
        dirtySort  map[string]bool

    Values can come in out of order. Mark if so, sort at query time.
  8. awesome time series data -> WAL (an append-only file) -> in-memory index -> on-disk index (periodic flushes)
  9. The Index

        Data File   Min Time: 10000   Max Time: 29999
        Data File   Min Time: 30000   Max Time: 39999
        Data File   Min Time: 70000   Max Time: 99999

    Contiguous blocks of time
  10. The Index

        Data File   Min Time: 10000   Max Time: 29999
        Data File   Min Time: 15000   Max Time: 39999
        Data File   Min Time: 70000   Max Time: 99999

    can overlap
  11. The Index

        cpu,host=A  Min Time: 10000   Max Time: 20000
        cpu,host=A  Min Time: 21000   Max Time: 39999
        Data File   Min Time: 70000   Max Time: 99999

    but a specific series must not overlap
  12. The Index

    Data files arranged in ascending time order. A file will never overlap with more than 2 others.
  13. The Index

        Data File   Min Time: 10000   Max Time: 29999
        Data File   Min Time: 30000   Max Time: 39999
        Data File   Min Time: 70000   Max Time: 99999
        Data File   Min Time: 10000   Max Time: 99999

    They periodically get compacted (like LSM).
  14. Compacting while appending new data

        func (w *WriteLock) LockRange(min, max int64) {
            // sweet code here
        }

        func (w *WriteLock) UnlockRange(min, max int64) {
            // sweet code here
        }
  15. Compacting while appending new data

        func (w *WriteLock) LockRange(min, max int64) {
            // sweet code here
        }

        func (w *WriteLock) UnlockRange(min, max int64) {
            // sweet code here
        }

    LockRange should block until we get it.
  16. Back to the data files…

        Data File   Min Time: 10000   Max Time: 29999
        Data File   Min Time: 30000   Max Time: 39999
        Data File   Min Time: 70000   Max Time: 99999
  17. Access file like a byte slice

        func (d *dataFile) MinTime() int64 {
            // the min time is stored at a fixed offset from the end of the file
            minTimePosition := d.size - minTimeOffset
            timeBytes := d.mmap[minTimePosition : minTimePosition+timeSize]
            return int64(btou64(timeBytes))
        }
  18. Binary Search for ID

        func (d *dataFile) StartingPositionForID(id uint64) uint32 {
            seriesCount := d.SeriesCount()
            indexStart := d.indexPosition()

            min := uint32(0)
            max := uint32(seriesCount)

            for min < max {
                mid := (max-min)/2 + min

                offset := mid*seriesHeaderSize + indexStart
                checkID := btou64(d.mmap[offset : offset+timeSize])

                if checkID == id {
                    return btou32(d.mmap[offset+timeSize : offset+timeSize+posSize])
                } else if checkID < id {
                    min = mid + 1
                } else {
                    max = mid
                }
            }

            return uint32(0)
        }

    The Index: IDs are sorted
  19. Test last night:

    100,000 series
    100,000 points per series
    10,000,000,000 total points
    5,000 points per request
    c3.8xlarge, writes from 4 other systems
    ~390,000 points/sec
    ~3 bytes/point (random floats, could be better)