Slide 1

Slide 1 text

Protobuf in Elixir @tony612

Slide 2

Slide 2 text

Overview
• Intro to Protocol Buffers
• How I implemented it in Elixir
• What I learned by writing protobuf-elixir

Slide 3

Slide 3 text

“Protocol buffers are a language-neutral, platform-neutral extensible mechanism for serializing structured data.” — by Google

Slide 4

Slide 4 text

Protobuf
• A data format (like JSON)
• Structured data with a schema
• Encoded as binary
• Written in .proto files, with code generated for any language

Slide 5

Slide 5 text

Protobuf JSON

Slide 6

Slide 6 text

Encoding
message Test1 { required int32 a = 1; }
%Test1{a: 150}
Protobuf: 08 150 01 (bytes in decimal; 08 96 01 in hex), 3 bytes
JSON: {"a":150}, 9 bytes

Slide 7

Slide 7 text

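Base 128 Varints
Variable-length int: stores an arbitrarily large integer in a small number of bytes.
Example: 1000 0001 0000 0001 (= 129)
• Most significant bit of each byte: 1 = more bytes follow, 0 = last byte
• The remaining 7 bits of each byte form a group; groups are stored least significant first, so the second byte holds the more significant group
• Negative numbers use 2's complement (64 bits in pb)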

Slide 8

Slide 8 text

Base 128 Varints
Base128 varint | Decimal | Calculate
0000 0001 | 1 | 1
0111 1111 | 127 | 111 1111 = 127
1000 0000 0000 0001 | 128 | 000 0001 000 0000
1001 0110 0000 0001 (150 01) | 150 | 000 0001 001 0110
1111 1111 … (9 bytes in total) 0000 0001 | -1 | 1111 1111 111 1111 … (64 ones)
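
The table above can be reproduced with a few lines of Elixir binary pattern matching. This is a minimal sketch, not the code from protobuf-elixir: the module name VarintSketch and its function names are made up for illustration, and it only handles non-negative integers.

```elixir
defmodule VarintSketch do
  import Bitwise

  # Encode a non-negative integer as a base 128 varint.
  # The lowest 7 bits go out first; the top bit of each byte
  # is 1 while more bytes follow and 0 on the last byte.
  def encode(n) when n >= 0 and n < 128, do: <<n>>
  def encode(n) when n >= 128, do: <<1::1, n::7, encode(n >>> 7)::binary>>

  # Decode one varint from the front of a binary.
  # Returns {integer, rest_of_binary}.
  def decode(binary), do: decode(binary, 0, 0)

  # Continuation bit set: add this group and keep reading.
  defp decode(<<1::1, group::7, rest::binary>>, shift, acc),
    do: decode(rest, shift + 7, acc ||| (group <<< shift))

  # Continuation bit clear: this is the last (most significant) group.
  defp decode(<<0::1, group::7, rest::binary>>, shift, acc),
    do: {acc ||| (group <<< shift), rest}
end
```

For example, VarintSketch.encode(150) gives the bytes 150 and 1, and VarintSketch.decode(<<150, 1>>) gives 150 back together with the empty remainder.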

Slide 9

Slide 9 text

Wire type

Slide 10

Slide 10 text

Encoding
message Test1 { required int32 a = 1; }
08 150 01
• 08: key (itself a varint) = (field_number << 3) | wire_type = (1 << 3) | 0 = 1x8+0
• 150 01: value, a varint
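
The key arithmetic is small enough to check by hand. The following is a sketch only; KeySketch and its functions are hypothetical names, and it assumes the key fits in one byte (field numbers below 16).

```elixir
defmodule KeySketch do
  import Bitwise

  # Build a key from a field number and a wire type:
  # the wire type sits in the lowest 3 bits.
  def key(field_number, wire_type), do: (field_number <<< 3) ||| wire_type

  # Split a key back into {field_number, wire_type}.
  def split_key(key), do: {key >>> 3, key &&& 0b111}
end
```

KeySketch.key(1, 0) is 8, which is the leading 08 byte of the encoded Test1 message, and KeySketch.split_key(8) returns {1, 0}.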

Slide 11

Slide 11 text

Decode logic
1. Decode a varint to get the field number and wire type
2. Get the bit string (value) based on the wire type (varint, length-delimited)
3. Decode the bit string into the right value based on metadata
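
The three steps above can be sketched as a small recursive decoder. This is an illustration under assumptions, not the library's implementation: DecodeSketch is a made-up module, the metadata is simplified to a map of field number => {name, type}, and only the varint and length-delimited wire types with the :int32 and :string types are handled.

```elixir
defmodule DecodeSketch do
  import Bitwise

  # props is a simplified stand-in for message metadata:
  # %{field_number => {field_name, type}}
  def decode(binary, props), do: do_decode(binary, props, %{})

  defp do_decode(<<>>, _props, acc), do: acc

  defp do_decode(binary, props, acc) do
    # Step 1: the key is a varint holding field number and wire type.
    {key, rest} = varint(binary, 0, 0)
    field_number = key >>> 3
    wire_type = key &&& 0b111

    # Step 2: cut out the raw value based on the wire type.
    # Other wire types are not handled and raise CaseClauseError.
    {raw, rest} =
      case wire_type do
        0 ->
          varint(rest, 0, 0)

        2 ->
          {len, rest} = varint(rest, 0, 0)
          <<bytes::binary-size(len), rest::binary>> = rest
          {bytes, rest}
      end

    # Step 3: use the metadata to turn the raw value into the right type.
    case props do
      %{^field_number => {name, type}} ->
        do_decode(rest, props, Map.put(acc, name, cast(raw, type)))

      _ ->
        # Unknown field number: skip it.
        do_decode(rest, props, acc)
    end
  end

  # int32 arrives as an unsigned varint; keep the low 32 bits as signed.
  defp cast(raw, :int32) do
    <<value::signed-32>> = <<raw::32>>
    value
  end

  defp cast(raw, :string), do: raw

  defp varint(<<1::1, group::7, rest::binary>>, shift, acc),
    do: varint(rest, shift + 7, acc ||| (group <<< shift))

  defp varint(<<0::1, group::7, rest::binary>>, shift, acc),
    do: {acc ||| (group <<< shift), rest}
end
```

With props = %{1 => {:a, :int32}}, decoding <<8, 150, 1>> yields %{a: 150}.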

Slide 12

Slide 12 text

Protobuf VS JSON
Protobuf | JSON
Binary (smaller sometimes) | Text
Pre-defined schema | Free schema
No schema in data | Schema included in data
Better backward compatibility (but with proper usage) | Not easy to break things
Typed | -
Computer readable | Human readable

Slide 13

Slide 13 text

JSON can be smaller
message Test1 { required int32 a = 1; }
%Test1{a: -1}
Protobuf: 08 255 … (9 times) 01, 11 bytes
JSON: {"a":-1}, 8 bytes
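
The 11 bytes come from the 64-bit two's complement rule: -1 becomes 64 one-bits, which need ten 7-bit groups, plus one byte for the key. A minimal sketch, with NegativeVarintSketch as a hypothetical module name:

```elixir
defmodule NegativeVarintSketch do
  import Bitwise

  # Reinterpret a signed integer as an unsigned 64-bit value
  # (two's complement), using binary construction and matching.
  def to_unsigned64(n) do
    <<unsigned::unsigned-64>> = <<n::signed-64>>
    unsigned
  end

  # Negative numbers are first mapped to their unsigned 64-bit form,
  # then encoded like any other varint.
  def encode_varint(n) when n < 0, do: n |> to_unsigned64() |> encode_varint()
  def encode_varint(n) when n < 128, do: <<n>>
  def encode_varint(n), do: <<1::1, n::7, encode_varint(n >>> 7)::binary>>
end
```

Prepending the key byte, <<8>> <> NegativeVarintSketch.encode_varint(-1) is 11 bytes long, while the JSON text {"a":-1} is 8 bytes.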

Slide 14

Slide 14 text

Must-know for Protobuf
• Only add new fields
• Don't change old fields (unless they're not used anywhere or the types are compatible, like int32 and int64). See: updating
• There's no way to distinguish a zero value from an unset field (Protobuf 3)
• Always reserve the 0 value of an Enum for an unused value (like UNKNOWN)

Slide 15

Slide 15 text

Generate code
$ protoc -I=$SRC_DIR --somelang_out=$DST_DIR --plugin=./protoc-gen-somelang $SRC_DIR/demo.proto
The plugin can be inferred from --somelang_out: protoc looks for protoc-gen-somelang in $PATH

Slide 16

Slide 16 text

Generate code
1. protoc (C++) parses your protobuf files, then encodes the result as binary using plugin.proto (plugin.proto is defined by protoc)
2. protoc runs your executable plugin and sends the encoded binary to it via STDIN
3. Your plugin generates the code, encodes it using plugin.proto, and writes the binary to STDOUT

Slide 17

Slide 17 text

Generate code: *.proto → protoc → binary (plugin.proto) → plugin → binary (plugin.proto) → protoc → *.pb.ex

Slide 18

Slide 18 text

How I implement Protobuf in Elixir

Slide 19

Slide 19 text

Components
• Express pb in Elixir
• Decode
• Encode
• protoc plugin

Slide 20

Slide 20 text

DSL for a message

Slide 21

Slide 21 text

Where’s field?

Slide 22

Slide 22 text

Where the magic happens

Slide 23

Slide 23 text

import DSL

Slide 24

Slide 24 text

define __message_props__ function

Slide 25

Slide 25 text

Express pb in Elixir

Slide 26

Slide 26 text

By now, we can store protobuf info in a function of a module, which we can then use to decode and encode pb.
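
To make the idea concrete, here is a minimal sketch of how a DSL can collect field definitions and expose them through a __message_props__ function. It is not the protobuf-elixir source: Sketch.DSL, the field/3 macro signature, and the shape of the returned map are all assumptions made for illustration.

```elixir
defmodule Sketch.DSL do
  # Hypothetical DSL module; names are illustrative only.
  defmacro __using__(_opts) do
    quote do
      import Sketch.DSL, only: [field: 3]
      Module.register_attribute(__MODULE__, :fields, accumulate: true)
      @before_compile Sketch.DSL
    end
  end

  # Each `field` call only records metadata in a module attribute.
  defmacro field(name, number, opts) do
    quote do
      @fields {unquote(name), unquote(number), unquote(opts)}
    end
  end

  # Just before the module is compiled, turn the collected fields into
  # a struct definition and a __message_props__/0 function.
  defmacro __before_compile__(env) do
    fields = env.module |> Module.get_attribute(:fields) |> Enum.reverse()
    names = Enum.map(fields, fn {name, _number, _opts} -> name end)

    props =
      Map.new(fields, fn {name, number, opts} ->
        {number, %{name: name, type: Keyword.fetch!(opts, :type)}}
      end)

    quote do
      defstruct unquote(names)
      def __message_props__, do: unquote(Macro.escape(props))
    end
  end
end

defmodule Sketch.Test1 do
  use Sketch.DSL

  field :a, 1, type: :int32
end
```

Sketch.Test1.__message_props__() then returns %{1 => %{name: :a, type: :int32}}, which is the kind of metadata a decoder or encoder can look up by field number. Keeping the macros this thin, with the real work done in ordinary functions, leaves the generated modules easy to inspect.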

Slide 27

Slide 27 text

Decoding logic

Slide 28

Slide 28 text

escript for building plugin

Slide 29

Slide 29 text

protoc plugin

Slide 30

Slide 30 text

A trick for generator
plugin.pb.ex (for plugin.proto) is needed for decoding STDIN and encoding to STDOUT when generating Elixir code.
But how do we generate Elixir code for plugin.proto?

Slide 31

Slide 31 text

A trick for generator
Write plugin.pb.ex by hand at first

Slide 32

Slide 32 text

What I learned
• Elixir macros are powerful. Elixir is powerful
• Binary handling in Elixir is easy
• Keep macros simple
• Creating a DSL is hard
• Encapsulate your structured data in structs (like MessageProps, FieldProps)
• Use functions and modules to keep your logic clear