FlatBuffers for Go

FlatBuffers for Go: Fast and Fun Serialization
21 January 2015
Robert Winslow, Programmer

What we'll cover today
- Serialization basics
- Why we need another serialization format
- What makes FlatBuffers special
- Example code and usage

What is a serialization format?
A standard way to store structured data, then read it back.
Examples:
- JSON
- Protocol Buffers
- Thrift
- XML

Serialization
Who here spends a lot of time interacting with serialized data?

Serialization
Trick question: everybody.
Serialization standards are what let us make sense of sequences of bytes.

Why a new serialization format?
Android game developers at Google needed a better way to store data.
- Games are demanding applications.
- Memory bottlenecks are bad.
- Using too much CPU wastes battery life.

Why a new serialization format?
The primary alternative was Protocol Buffers.
Protocol Buffers is a major open source serialization project from Google.
Who here has used Protocol Buffers?

Why a new serialization format?
Good things about Protocol Buffers:
- Robust.
- Secure.
- Popular.
"Nobody ever got fired for choosing Protocol Buffers."

Why a new serialization format?
Bad things about Protocol Buffers:
- Allocates temporary objects to unpack data.
- No direct random access.
- Poor data locality.
- Large codebase (3.8MB of code).
- Slow.
- Rumors of cost at scale...

Why a new serialization format?
The Fun Propulsion Lab is a tooling group inside Android.
They decided to try a new approach.

Why a new serialization format?
They asked, could we build a serialization format that:
- Is simple,
- Versions your data with a schema,
- Enables random access,
- And is ridiculously fast?
They tried and succeeded. The result is FlatBuffers.

How fast? By the numbers
Read-only microbenchmarks on a small dataset:

    Library                 ops/sec      nanoseconds/op
    FlatBuffers             12,500,000   80
    Protocol Buffers LITE   3,311        302,000
    Rapid JSON              1,718        583,000
    pugixml                 5,102        196,000

Over 1000 times faster than Protocol Buffers.
This uses the C++ version; we hope to make the Go version comparably fast.
Source: https://google.github.io/flatbuffers/md__benchmarks.html
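
The two benchmark columns are the same measurement in different units, and the "over 1000 times" claim can be checked from them. A minimal sketch of that arithmetic, using only the FlatBuffers and Protocol Buffers LITE figures quoted above (the helper names opsPerSec and speedup are made up for this illustration):

```go
package main

import "fmt"

// opsPerSec converts a per-operation latency in nanoseconds
// into whole operations per second.
func opsPerSec(nsPerOp int64) int64 {
	return 1_000_000_000 / nsPerOp
}

// speedup reports how many times slower slowNs is than fastNs.
func speedup(fastNs, slowNs int64) float64 {
	return float64(slowNs) / float64(fastNs)
}

func main() {
	const flatBuffersNs, protobufLiteNs = 80, 302_000

	fmt.Println(opsPerSec(flatBuffersNs))                // 12500000
	fmt.Println(opsPerSec(protobufLiteNs))               // 3311
	fmt.Println(speedup(flatBuffersNs, protobufLiteNs))  // 3775
}
```

By these figures the gap is closer to 3,775x, so "over 1000 times" is a conservative reading.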

What is FlatBuffers?
- New serialization standard that is both featureful and fast.
- Open-source project from Google, released in 2014.
- Created and maintained by Wouter van Oortmerssen, a programming language creator and game developer.
- Licensed under the permissive Apache v2 license.

The big idea (TL;DR)
FlatBuffer files are statically-typed, schema-versioned, portable structs.

Wisdom
"Rule 5. Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming."
- Rob Pike, Rob Pike's 5 Rules of Programming

Speed: Part 1

Why care about speed?
Computers are "fast enough".
Orders of magnitude still matter.
What if you could use 100 computers instead of 1,000 to do a task?

What makes FlatBuffers so fast at read operations?
A different approach:
- No memory allocations.
- Tight packing of data is friendly to CPU caches (L1, L2, L3).
- Minimal code on hot execution paths (CPU instruction cache).

The philosophy is different
FlatBuffers relies on pointer arithmetic to read data without allocating intermediate objects.
No calls to make / malloc!
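
To make the "no allocations" point concrete, here is a rough sketch that counts heap allocations for two ways of getting at the same data: reading one value in place versus unpacking everything into a new slice first. It uses the standard library's testing.AllocsPerRun. The function names (readInPlace, decodeCopy, measure) are invented for this illustration and are not part of FlatBuffers.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"testing"
)

var (
	sinkValue int32   // keeps the compiler from discarding the in-place read
	sinkSlice []int32 // forces the decoded copy to escape to the heap
)

// readInPlace reads element i straight out of the buffer.
func readInPlace(buf []byte, i int) int32 {
	return int32(binary.LittleEndian.Uint32(buf[4*i:]))
}

// decodeCopy unpacks the whole buffer into a freshly allocated slice.
func decodeCopy(buf []byte) []int32 {
	out := make([]int32, len(buf)/4)
	for i := range out {
		out[i] = int32(binary.LittleEndian.Uint32(buf[4*i:]))
	}
	return out
}

// measure returns the average heap allocations per call for each strategy.
func measure(buf []byte) (inPlace, copied float64) {
	inPlace = testing.AllocsPerRun(1000, func() { sinkValue = readInPlace(buf, 2) })
	copied = testing.AllocsPerRun(1000, func() { sinkSlice = decodeCopy(buf) })
	return
}

func main() {
	buf := []byte{2, 0, 0, 0, 3, 0, 0, 0, 5, 0, 0, 0, 7, 0, 0, 0}
	inPlace, copied := measure(buf)
	fmt.Println("in-place allocs/op:", inPlace)
	fmt.Println("copying allocs/op: ", copied)
}
```

On a typical Go toolchain the in-place read should report zero allocations, while the copying decode pays for at least one per call. Exact counts can vary with compiler version.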

Minimal serialization of an array
An array is a sequence of fixed-width elements.
Use pointer arithmetic to find the data you want.

Minimal serialization of an array
A small array of int32:

    2 3 5 7 (first four primes)

Bytes for representing four numbers, each 4 bytes wide:

    2 0 0 0   3 0 0 0   5 0 0 0   7 0 0 0 (little-endian)
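
A minimal sketch of this layout using the standard encoding/binary package: pack the four primes into little-endian bytes, then find any element by computing its byte offset. Encode is a helper name invented for this illustration, and this Get is a portable stand-in rather than the code from the talk.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// Encode packs each int32 into 4 little-endian bytes, back to back.
func Encode(values []int32) []byte {
	buf := make([]byte, 4*len(values))
	for i, v := range values {
		binary.LittleEndian.PutUint32(buf[4*i:], uint32(v))
	}
	return buf
}

// Get reads the ith int32 by computing its byte offset.
func Get(i int, buf []byte) int32 {
	offset := i * 4
	return int32(binary.LittleEndian.Uint32(buf[offset : offset+4]))
}

func main() {
	buf := Encode([]int32{2, 3, 5, 7})
	fmt.Println(buf)         // [2 0 0 0 3 0 0 0 5 0 0 0 7 0 0 0]
	fmt.Println(Get(2, buf)) // 5
}
```

Because the byte order is stated explicitly, this version gives the same answer on big-endian hosts too.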

Minimal serialization of an array
Given the buffer:

    buf := []byte{2, 0, 0, 0, 3, 0, 0, 0, 5, 0, 0, 0, 7, 0, 0, 0}

Get the ith value:

    import "unsafe"

    // Get reads the ith little-endian int32 from buf.
    // The unsafe cast reads in host byte order, so this assumes a little-endian machine.
    func Get(i int, buf []byte) int32 {
        offset := i * 4
        data := buf[offset : offset+4]
        var n int32 = *(*int32)(unsafe.Pointer(&data[0]))
        return n
    }

    Get(0, buf) // 2
    Get(1, buf) // 3
    Get(2, buf) // 5
    Get(3, buf) // 7

Minimal serialization of a struct
A struct is just a heterogeneous group of fixed-width elements.
Use pointer arithmetic to find the data you want.

Minimal serialization of a struct
Given the struct type Particle:

    type Particle struct {
        X   int16   // bytes: 0, 1
        Y   int16   // bytes: 2, 3
        Z   int16   // bytes: 4, 5
        RGB [3]byte // bytes: 6, 7, 8
    }

Stored in this buffer:

    buf := []byte{1, 0, 2, 0, 3, 0, 128, 0, 192}

Get the X value:

    func GetX(buf []byte) (n int16) {
        data := buf[:2]
        n = *(*int16)(unsafe.Pointer(&data[0]))
        return
    }

Get the Y value:

    func GetY(buf []byte) (n int16) {
        data := buf[2:4]
        n = *(*int16)(unsafe.Pointer(&data[0]))
        return
    }

Minimal serialization of a struct
With the same Particle type and buffer, get the Z value:

    func GetZ(buf []byte) (n int16) {
        data := buf[4:6]
        n = *(*int16)(unsafe.Pointer(&data[0]))
        return
    }

Get the RGB value:

    func GetRGB(buf []byte) (rgb [3]byte) {
        data := buf[6:9]
        rgb = *(*[3]byte)(unsafe.Pointer(&data[0]))
        return
    }
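
For comparison, a sketch of the same Particle reads without unsafe, using encoding/binary and an explicit length check. DecodeParticle and particleSize are names invented for this illustration. Unlike the per-field getters above, it copies all four fields out of the buffer at once.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Particle mirrors the 9-byte layout from the slides.
type Particle struct {
	X, Y, Z int16
	RGB     [3]byte
}

// particleSize is the serialized width: three int16 fields plus three bytes.
const particleSize = 9

// DecodeParticle reads a Particle from the first 9 bytes of buf.
func DecodeParticle(buf []byte) (Particle, error) {
	if len(buf) < particleSize {
		return Particle{}, errors.New("buffer too short for Particle")
	}
	var p Particle
	p.X = int16(binary.LittleEndian.Uint16(buf[0:2]))
	p.Y = int16(binary.LittleEndian.Uint16(buf[2:4]))
	p.Z = int16(binary.LittleEndian.Uint16(buf[4:6]))
	copy(p.RGB[:], buf[6:9])
	return p, nil
}

func main() {
	buf := []byte{1, 0, 2, 0, 3, 0, 128, 0, 192}
	p, err := DecodeParticle(buf)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(p.X, p.Y, p.Z, p.RGB) // 1 2 3 [128 0 192]
}
```

The trade-off: this is safe on any host byte order and rejects short buffers, but it copies the data instead of reading each field in place.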

The philosophy is simple
Just like arrays and structs, the FlatBuffers library uses pointer arithmetic.
Every FlatBuffer is read in-place with pointer arithmetic operations.

Show me code

Using FlatBuffers
Here's an example schema:

    // player.fbs
    namespace Game;

    table Player {
      name:string (id: 0, required);
      health:short = 100 (id: 1);
      armor:short = 100 (id: 2);
    }

    root_type Player;
    file_identifier "PLAY";
    file_extension "player";

Using FlatBuffers
Feed the schema to the FlatBuffers generator to create accessor code:

    flatc -g player.fbs

It creates one Go file used to work with data:

    Game
    └── Player.go

It's a relatively small file:

    $ wc -c Game/Player.go
    1441 Game/Player.go

Using FlatBuffers
To create data, use the generated builder functions:

    builder := flatbuffers.NewBuilder(0)
    name := builder.CreateString("Robert")

    game.PlayerStart(builder)
    game.PlayerAddName(builder, name)
    game.PlayerAddHealth(builder, 60)

    // Tell FlatBuffers we are finished writing this object:
    player := game.PlayerEnd(builder)
    builder.Finish(player)

    // Save the backing byte buffer to a file:
    buf := builder.Bytes[builder.Head():]
    err := ioutil.WriteFile("robert.player", buf, 0666)
    if err != nil {
        log.Fatal(err)
    }

It generates a tiny file:

    $ wc -c robert.player
    40 robert.player

Using FlatBuffers
To read data, use generated getter functions:

    // Load the buffer from the file:
    buf, err := ioutil.ReadFile("robert.player")
    if err != nil {
        log.Fatal(err)
    }

    // Initialize FlatBuffers code to use the data:
    player := game.GetRootAsPlayer(buf, 0)

    // Print the data we saved:
    fmt.Printf("Name: %s\n", player.Name())
    fmt.Printf("Health: %3d\n", player.Health())
    fmt.Printf("Armor: %3d\n", player.Armor())

Prints the data we saved:

    Name: Robert
    Health:  60
    Armor: 100

Using FlatBuffers
Here's how the object looks on disk:

       8 bytes: Offset of the Player object
      12 bytes: Player object metadata
       4 bytes: Name string metadata
       4 bytes: Health
       4 bytes: Armor
    +  8 bytes: "Robert" and padding
    --------------------------------------------
      40 bytes

The philosophy is simple
FlatBuffers generates code that uses just a few jumps to get any element you want.
We use pointer arithmetic to skip the parsing step completely.
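
To show what "a few jumps" can look like, here is a simplified sketch of a table field lookup, written by hand against the FlatBuffers binary layout: the root offset leads to the table, the table points back to its vtable, and the vtable says where each field lives (or that it is absent, in which case the schema default applies). This is not the code flatc generates. The buffer is hand-built, it leaves out the name string to stay short, the function readInt16Field is invented for this illustration, and it assumes a trusted, well-formed buffer.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readInt16Field follows the jumps: root offset -> table -> vtable -> field.
// slot is the field's index in the schema; def is the schema default.
func readInt16Field(buf []byte, slot int, def int16) int16 {
	le := binary.LittleEndian

	// Jump 1: the first 4 bytes hold the position of the root table.
	table := int(le.Uint32(buf[0:]))

	// Jump 2: the table starts with a signed offset that is subtracted
	// from the table position to find the vtable.
	vtable := table - int(int32(le.Uint32(buf[table:])))

	// The vtable starts with two uint16 sizes, then one uint16 per field.
	vtableSize := int(le.Uint16(buf[vtable:]))
	entry := 4 + 2*slot
	if entry >= vtableSize {
		return def // the writer's schema version did not know this field
	}
	fieldOffset := int(le.Uint16(buf[vtable+entry:]))
	if fieldOffset == 0 {
		return def // field was not stored; use the default
	}

	// Jump 3: the field lives fieldOffset bytes into the table.
	return int16(le.Uint16(buf[table+fieldOffset:]))
}

func main() {
	buf := []byte{
		12, 0, 0, 0, // root offset: the table starts at byte 12
		8, 0, // vtable: size of the vtable in bytes
		6, 0, // vtable: size of the table's inline data in bytes
		4, 0, // vtable: slot 0 (health) is 4 bytes into the table
		0, 0, // vtable: slot 1 (armor) is absent
		8, 0, 0, 0, // table: offset back to the vtable (12 - 8 = 4)
		60, 0, // table: health = 60
		0, 0, // padding
	}
	fmt.Println("Health:", readInt16Field(buf, 0, 100)) // Health: 60
	fmt.Println("Armor:", readInt16Field(buf, 1, 100))  // Armor: 100
}
```

Nothing is parsed up front: each read is a handful of offset lookups, and a field that was never written costs no bytes beyond its vtable entry.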

Speed: Part 2

Speed: The big picture
We're not just talking about mobile anymore.
Every computer is resource-constrained.
At scale, inefficiencies add up to tremendous amounts of energy, time, and money.

FlatBuffers can help

Orders of magnitude faster
Read-only microbenchmarks on a small dataset:

    Library                 ops/sec      nanoseconds/op
    FlatBuffers             12,500,000   80
    Protocol Buffers LITE   3,311        302,000
    Rapid JSON              1,718        583,000
    pugixml                 5,102        196,000

Over 1000 times faster than Protocol Buffers.
This uses the C++ version; we hope to make the Go version comparably fast.
Source: https://google.github.io/flatbuffers/md__benchmarks.html

More reasons to use FlatBuffers
Not only is it fast, it also supports
- Schema versioning
- Union fields
- Default values
- Inline structs
- Variable-length vectors

Available in
- C++
- C#
- Java
- Go

Hackable
- 2,200+ GitHub stars
- Clearly written, easy to comprehend
- Stable wire format
- Unit test suite
- Fuzz test suites

Few lines of code

    C++            3109
    C/C++ Header   1179
    Go              724
    ---------------------
    Total          5012

Use FlatBuffers today!
Documentation: https://google.github.io/flatbuffers
Source code: https://github.com/google/flatbuffers
Go runtime library: go get github.com/google/flatbuffers/go
Schema compiler: git clone https://github.com/google/flatbuffers.git

Thank you
Robert Winslow, Programmer
http://rwinslow.com/
@robert_winslow (http://twitter.com/robert_winslow)
