Writing a High Performance Database in Go

My talk from GopherCon 2014.

benbjohnson

April 24, 2014

Transcript

  1. Writing a High Performance Database in Go

  2. Two Meanings of “Database”

  3. Database Server

  4. Database Library

  5. You may never write a database but...

  6. HOW WE ACCESS DATA AFFECTS US ALL!

  7. Why write a database in Go?

  8. Things that need to be really f*cking fast | Things that need to be pretty fast

  9. Things that need to be really f*cking fast | Things that need to be pretty fast. User Management, Schema Management, Query Parsing, Backup / Recovery, Bulk Data Insertion, etc...

  10. Things that need to be really f*cking fast | Things that need to be pretty fast. Query Execution, User Management, Schema Management, Query Parsing, Backup / Recovery, Bulk Data Insertion, etc...

  11. There’s more to databases than speed

  12. There’s more to databases than speed: Easy Deployment

  13. There’s more to databases than speed: Easy Deployment, User friendly API

  14. There’s more to databases than speed: Easy Deployment, User friendly API, Simple debugging

  15. How do you make the fast parts fast?

  16. Option #1: CGO

  17. Pro: Integrate with tons of existing libraries

  18. Con: Overhead incurred with each C function call

  19. LuaJIT: Easy to integrate, good community. Half the speed of C, weird caveats.

  20. LLVM: Really, really fast. Really, really complicated.

  21. The point isn’t to just use C

  22. The point is that C is an option

  23. Option #2: Pure Go

  24. Bolt

  25. Basics of Bolt: Pure Go port of LMDB; Memory-mapped B+tree; MVCC, ACID transactions; Zero copy reads

  26. Batch Work Together

  27. Bolt Batch Benchmarks

      Batch Size   Performance
      1            Baseline
      10           9x Baseline
      100          45x Baseline
      1000         90x Baseline

      Disclaimer: YMMV

  28. Transaction Coalescing: Use a channel to stream changes. Group changes into single transaction. Either all changes commit or rollback.

  29. Encoding Matters!

  30. Encoding Performance

      JSON           Baseline
      gogoprotobuf   20x JSON
      Cap’n Proto    60x JSON

      Disclaimer: YMMV

  31. Encoding Performance. See also: Albert Strasheim’s “Serialization in Go” talk

      http://www.slideshare.net/albertstrasheim/serialization-in-go
      https://github.com/cloudflare/goser

  32. Here’s a crazy idea...

  33. Direct map to your data file

  34. Map a struct to a []byte

      // Create a byte slice with the same size as type T.
      var value = make([]byte, unsafe.Sizeof(T{}))

      // Map a typed pointer from the byte slice and update it.
      var t = (*T)(unsafe.Pointer(&value[0]))
      t.ID = 123
      t.MyIntValue = 20

      // Insert value into database.
      db.Update(func(tx *bolt.Tx) error {
          return tx.Bucket("T").Put([]byte("123"), value)
      })

  35. Map a []byte to a struct

      // Start a read transaction.
      db.View(func(tx *bolt.Tx) error {
          c := tx.Bucket("T").Cursor()

          // Iterate over each value in the bucket.
          for k, v := c.First(); k != nil; k, v = c.Next() {
              // Map the stored bytes back to a typed pointer.
              var t = (*T)(unsafe.Pointer(&v[0]))
              // ... do something with "t" ...
          }
          return nil
      })

  36. Pros: No encoding/decoding; Insert 100k values/sec; Read 20M values/sec

  37. Cons: Fixed struct layout; Machine specific endianness; People will think you’re crazy

  38. Your CPU can do 3 billion operations per second so USE IT!

  39. How to think about performance optimization

  40. Hierarchy of Need: Self-actualization, Esteem, Love/Belonging, Safety, Physiological

  41. Hierarchy of Need: Self-actualization, Esteem, Love/Belonging, Safety, Physiological

  42. Hierarchy of SPEED: Memory Access, Mutexes, Memory Allocation, Disk I/O, Network I/O

  43. Go can be extremely fast... if you know how to optimize it!

  44. Questions @benbjohnson
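
The transaction coalescing idea from slide 28 (stream changes over a channel, group them into a single transaction, and have every change in the group commit or roll back together) might look roughly like the sketch below. It is not taken from the talk or from Bolt: `Change`, `Coalescer`, `NewCoalescer`, `Submit`, and the `commit` callback are hypothetical names, and `commit` only stands in for whatever runs one real transaction (in a Bolt program, presumably a single `db.Update` call). The batch limit and wait time are arbitrary assumptions.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Change is a hypothetical key/value write; it is not part of any real library.
type Change struct {
	Key, Value string
	done       chan error // receives the outcome of the batch this change joined
}

// Coalescer groups changes arriving on a channel into batches.
type Coalescer struct {
	changes chan Change
	stopped chan struct{}
}

// NewCoalescer starts a worker that collects up to maxBatch changes, or
// whatever arrives within maxWait of the first one, and hands them to commit
// in a single call. commit is an assumed stand-in for one database transaction.
func NewCoalescer(maxBatch int, maxWait time.Duration, commit func([]Change) error) *Coalescer {
	c := &Coalescer{changes: make(chan Change), stopped: make(chan struct{})}
	go c.run(maxBatch, maxWait, commit)
	return c
}

func (c *Coalescer) run(maxBatch int, maxWait time.Duration, commit func([]Change) error) {
	defer close(c.stopped)
	for first := range c.changes {
		batch := []Change{first}
		timer := time.NewTimer(maxWait)
	collect:
		for len(batch) < maxBatch {
			select {
			case ch, ok := <-c.changes:
				if !ok {
					break collect // channel closed: flush what we have
				}
				batch = append(batch, ch)
			case <-timer.C:
				break collect // waited long enough: flush a partial batch
			}
		}
		timer.Stop()

		// One commit per batch; every caller in the batch sees the same result,
		// so the whole group succeeds or fails together.
		err := commit(batch)
		for _, ch := range batch {
			ch.done <- err
		}
	}
}

// Submit blocks until the batch containing the change has committed or failed.
// It must not be called after Close.
func (c *Coalescer) Submit(key, value string) error {
	done := make(chan error, 1)
	c.changes <- Change{Key: key, Value: value, done: done}
	return <-done
}

// Close stops accepting changes and waits for the worker to finish.
func (c *Coalescer) Close() {
	close(c.changes)
	<-c.stopped
}

func main() {
	commits := 0 // written only by the worker; read after Close returns
	c := NewCoalescer(100, 10*time.Millisecond, func(batch []Change) error {
		commits++
		return nil
	})

	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if err := c.Submit(fmt.Sprintf("key-%d", i), "value"); err != nil {
				fmt.Println("change failed:", err)
			}
		}(i)
	}
	wg.Wait()
	c.Close()
	fmt.Printf("1000 changes committed in %d transactions\n", commits)
}
```

How many transactions the 1000 changes collapse into depends on scheduling and on the chosen batch limit and wait time, so the printed count will vary from run to run. The trade-off is latency: a lone change waits up to `maxWait` before it is committed.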