Writing a High Performance Database in Go

My talk from GopherCon 2014.

benbjohnson

April 24, 2014

Transcript

  1. Writing a
    High Performance
    Database in Go

  2. Two Meanings of
    “Database”

  3. Database
    Server

  4. Database
    Library

  5. You may never write
    a database but...

  6. HOW WE
    ACCESS DATA
    AFFECTS US ALL!

  7. Why write a
    database in Go?

  8. Things that need to
    be really f*cking fast
    Things that need to
    be pretty fast

  9. Things that need to be really f*cking fast
    Things that need to be pretty fast
      User Management
      Schema Management
      Query Parsing
      Backup / Recovery
      Bulk Data Insertion
      etc...

  10. Things that need to be really f*cking fast
      Query Execution
    Things that need to be pretty fast
      User Management
      Schema Management
      Query Parsing
      Backup / Recovery
      Bulk Data Insertion
      etc...

  11. There’s more to databases than speed

  12. There’s more to databases than speed
    Easy Deployment

  13. There’s more to databases than speed
    Easy Deployment
    User friendly API

  14. There’s more to databases than speed
    Easy Deployment
    User friendly API
    Simple debugging

  15. How do you make
    the fast parts fast?

  16. Option #1: CGO

  17. Pro: Integrate with
    tons of existing libraries

  18. Con: Overhead incurred
    with each C function call

  19. LuaJIT
    Easy to integrate, good community
    Half the speed of C, weird caveats

  20. LLVM
    Really, really fast
    Really, really complicated

  21. The point isn’t to just use C

  22. The point is that C is an option

  23. Option #2: Pure Go

  24. Basics of Bolt
    Pure Go port of LMDB
    Memory-mapped B+tree
    MVCC, ACID transactions
    Zero copy reads

  25. Batch Work
    Together

  26. Bolt Batch Benchmarks
    Batch Size    Performance
    1             Baseline
    10            9x Baseline
    100           45x Baseline
    1000          90x Baseline
    Disclaimer: YMMV

  27. Transaction Coalescing
    Use a channel to stream changes
    Group changes into a single transaction
    Either all changes commit or all roll back
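
The coalescing idea on this slide can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not Bolt's API: the `change`, `store`, `applyBatch`, and `coalesce` names are hypothetical, and the in-memory map stands in for a real transactional database. Each batch is staged in a copy and swapped in only if every change is valid, which mimics all-or-nothing commit.

```go
package main

import (
	"errors"
	"fmt"
)

// change is a hypothetical unit of work: one key/value write.
type change struct {
	key, value string
}

// store is a stand-in for a transactional database; it is not Bolt's API.
type store struct {
	data    map[string]string
	commits int // number of batches committed so far
}

// applyBatch applies every change in one all-or-nothing "transaction".
// Changes are staged in a copy, so an invalid change discards the whole batch.
func (s *store) applyBatch(batch []change) error {
	staged := make(map[string]string, len(s.data)+len(batch))
	for k, v := range s.data {
		staged[k] = v
	}
	for _, c := range batch {
		if c.key == "" {
			return errors.New("empty key: batch rolled back")
		}
		staged[c.key] = c.value
	}
	s.data = staged
	s.commits++
	return nil
}

// coalesce drains changes from the channel and commits them in groups of up
// to batchSize, flushing any remainder once the channel is closed.
func coalesce(s *store, changes <-chan change, batchSize int) error {
	batch := make([]change, 0, batchSize)
	for c := range changes {
		batch = append(batch, c)
		if len(batch) == batchSize {
			if err := s.applyBatch(batch); err != nil {
				return err
			}
			batch = batch[:0]
		}
	}
	if len(batch) > 0 {
		return s.applyBatch(batch)
	}
	return nil
}

func main() {
	s := &store{data: map[string]string{}}
	changes := make(chan change)
	go func() {
		defer close(changes)
		for i := 0; i < 250; i++ {
			changes <- change{key: fmt.Sprintf("k%03d", i), value: "v"}
		}
	}()
	if err := coalesce(s, changes, 100); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(s.data), "keys in", s.commits, "commits")
}
```

With a real database the body of `applyBatch` would presumably become a single write transaction, so the per-commit cost is paid once per batch rather than once per change. A production version would also want a timeout so a partial batch does not wait indefinitely for more changes.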

  28. Encoding
    Matters!

  29. Encoding Performance
    JSON            Baseline
    gogoprotobuf    20x JSON
    Cap’n Proto     60x JSON
    Disclaimer: YMMV
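
To make the contrast between text and binary encodings concrete, here is a small sketch that encodes the same value with `encoding/json` and with a hand-written fixed-width layout from `encoding/binary`. The fixed-width layout is a stand-in chosen so the example needs only the standard library; it is not gogoprotobuf or Cap'n Proto, and the `record` type and its fields are hypothetical. The sketch compares encoded size and checks the round trip; it does not reproduce the speed ratios on the slide.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"errors"
	"fmt"
	"math"
)

// record is a hypothetical fixed-size value; the fields are illustrative.
type record struct {
	ID    uint64
	Count uint32
	Score float64
}

// recordSize is the encoded width: 8 + 4 + 8 bytes.
const recordSize = 20

// encodeJSON serializes r as self-describing text.
func encodeJSON(r record) ([]byte, error) {
	return json.Marshal(r)
}

// encodeBinary writes each field at a fixed offset in little-endian order.
func encodeBinary(r record) []byte {
	b := make([]byte, recordSize)
	binary.LittleEndian.PutUint64(b[0:8], r.ID)
	binary.LittleEndian.PutUint32(b[8:12], r.Count)
	binary.LittleEndian.PutUint64(b[12:20], math.Float64bits(r.Score))
	return b
}

// decodeBinary reads the fields back from their fixed offsets.
func decodeBinary(b []byte) (record, error) {
	if len(b) != recordSize {
		return record{}, errors.New("decodeBinary: unexpected length")
	}
	return record{
		ID:    binary.LittleEndian.Uint64(b[0:8]),
		Count: binary.LittleEndian.Uint32(b[8:12]),
		Score: math.Float64frombits(binary.LittleEndian.Uint64(b[12:20])),
	}, nil
}

func main() {
	r := record{ID: 123, Count: 20, Score: 1.5}
	js, err := encodeJSON(r)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	bin := encodeBinary(r)
	back, err := decodeBinary(bin)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("JSON bytes:", len(js))
	fmt.Println("binary bytes:", len(bin))
	fmt.Println("round trip ok:", back == r)
}
```

To compare speed rather than size, the two encode functions could be wrapped in `testing.B` benchmarks; results will vary by machine and payload.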

  30. Encoding Performance
    See also: Albert Strasheim’s “Serialization in Go” talk
    http://www.slideshare.net/albertstrasheim/serialization-in-go
    https://github.com/cloudflare/goser

  31. Here’s a crazy
    idea...

  32. Direct map to
    your data file

  33. Map a struct to a []byte
    // Create a byte slice with the same size as type T.
    var value = make([]byte, unsafe.Sizeof(T{}))
    // Map a typed pointer onto the byte slice and update it.
    var t = (*T)(unsafe.Pointer(&value[0]))
    t.ID = 123
    t.MyIntValue = 20
    // Insert value into database.
    db.Update(func(tx *bolt.Tx) error {
        return tx.Bucket([]byte("T")).Put([]byte("123"), value)
    })

  34. Map a []byte to a struct
    // Start a read transaction.
    db.View(func(tx *bolt.Tx) error {
        c := tx.Bucket([]byte("T")).Cursor()
        // Iterate over each value in the bucket.
        for k, v := c.First(); k != nil; k, v = c.Next() {
            // Reinterpret the stored bytes as a *T without copying.
            var t = (*T)(unsafe.Pointer(&v[0]))
            // ... do something with "t" ...
        }
        return nil
    })
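
The two mapping slides can be tried without a database. The sketch below is a self-contained round trip under stated assumptions: `T` here is a hypothetical pointer-free struct with two 8-byte fields, and `toBytes` and `fromBytes` are illustrative helpers, not part of Bolt. The length and alignment checks are additions that the slides omit.

```go
package main

import (
	"errors"
	"fmt"
	"unsafe"
)

// T is a hypothetical fixed-size record: no pointers, slices, or strings,
// so its in-memory bytes are its complete value.
type T struct {
	ID         uint64
	MyIntValue int64
}

// sizeOfT is the in-memory size of T in bytes.
const sizeOfT = int(unsafe.Sizeof(T{}))

// toBytes returns a copy of t's in-memory representation.
func toBytes(t *T) []byte {
	b := make([]byte, sizeOfT)
	copy(b, (*[sizeOfT]byte)(unsafe.Pointer(t))[:])
	return b
}

// fromBytes reinterprets b as a *T without copying. The returned pointer
// aliases b, so it is only valid while b is.
func fromBytes(b []byte) (*T, error) {
	if len(b) < sizeOfT {
		return nil, errors.New("fromBytes: slice too short")
	}
	p := unsafe.Pointer(&b[0])
	if uintptr(p)%unsafe.Alignof(T{}) != 0 {
		return nil, errors.New("fromBytes: misaligned slice")
	}
	return (*T)(p), nil
}

func main() {
	in := T{ID: 123, MyIntValue: 20}
	b := toBytes(&in)
	out, err := fromBytes(b)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(b), out.ID, out.MyIntValue)
}
```

Because `fromBytes` aliases the slice, writing through the returned pointer changes the underlying bytes. With a memory-mapped store the mapped pages are typically read-only and valid only for the life of the transaction, so a pointer obtained this way should be treated as read-only and not retained afterwards.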

  35. Pros:
    No encoding/decoding
    Insert 100k values/sec
    Read 20M values/sec

  36. Cons:
    Fixed struct layout
    Machine-specific endianness
    People will think you’re crazy
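
The endianness drawback can be observed directly. The following sketch is illustrative only, and `nativeOrder` and `rawBytes` are hypothetical helper names: it detects the host byte order and shows that the raw in-memory bytes of an integer, which is what a direct struct mapping would write to the data file, match that order.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"unsafe"
)

// nativeOrder reports the byte order this machine uses for in-memory integers.
func nativeOrder() binary.ByteOrder {
	var x uint16 = 0x0102
	// Inspect the first byte of x as it sits in memory.
	if *(*byte)(unsafe.Pointer(&x)) == 0x02 {
		return binary.LittleEndian
	}
	return binary.BigEndian
}

// rawBytes returns a copy of the in-memory bytes of v, which is what an
// unsafe struct mapping would store for a uint32 field.
func rawBytes(v uint32) []byte {
	return append([]byte(nil), (*[4]byte)(unsafe.Pointer(&v))[:]...)
}

func main() {
	fmt.Println("native byte order:", nativeOrder())
	fmt.Printf("raw bytes of 0x01020304: % x\n", rawBytes(0x01020304))
}
```

A data file written this way on a little-endian machine would be misread on a big-endian one, so one option is to record the byte order in a file header and refuse to open files written with a different order.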

  37. Your CPU can do 3 billion
    operations per second so
    USE IT!

  38. How to think about
    performance optimization

  39. Hierarchy of Need
    Self-actualization
    Esteem
    Love/Belonging
    Safety
    Physiological

  41. Hierarchy of SPEED
    Memory Access
    Mutexes
    Memory Allocation
    Disk I/O
    Network I/O

  42. Go can be extremely
    fast... if you know
    how to optimize it!

  43. Questions
    @benbjohnson
