Slide 1

Slide 1 text

Protocol Buffers implementation using Elixir

niku

Slide 2

Slide 2 text

Agenda

How to parse a Protocol Buffers binary
How to do it using Elixir

Slide 3

Slide 3 text

Me

Living in Sapporo, Hokkaido
Work at Farmnote
  A software and hardware company solving agricultural issues, particularly herd management
Enjoy sapporo-beam
  A community to talk about the Erlang VM
  Almost every Thursday
  Since 2014
https://github.com/niku

Slide 4

Slide 4 text

BEAM bitstring and binary basics

Slide 5

Slide 5 text

Bitstring and binary in BEAM

type       description                               a kind of bitstring?  a kind of binary?
bitstring  A sequence of zero or more bits           Yes                   No
binary     A bitstring whose size is divisible by 8  Yes                   Yes
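The distinction in the table can be checked with the built-in guards. This is a minimal sketch; the two sample values are illustrative assumptions, not taken from the slide.

```elixir
# Two sample values (assumed for illustration).
byte_aligned = <<1, 2>>       # 16 bits: divisible by 8, so it is a binary
seven_bits = <<22::size(7)>>  # 7 bits: a bitstring, but not a binary

IO.inspect(is_bitstring(byte_aligned)) # true
IO.inspect(is_binary(byte_aligned))    # true
IO.inspect(is_bitstring(seven_bits))   # true
IO.inspect(is_binary(seven_bits))      # false
IO.inspect(bit_size(seven_bits))       # 7
```

Every binary passes `is_bitstring/1`, while a bitstring passes `is_binary/1` only when its bit size is a multiple of 8.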

Slide 6

Slide 6 text

Manipulations

Each manipulation is shown for both types (binary and bitstring):
  Making a binary as you read
  Pattern matching
  Concatenation
  Conversion to integer (the binary example yields 511, the bitstring example yields 2)
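The four manipulations can be sketched as follows. The concrete values and variable names here are assumptions chosen for illustration and are not necessarily the ones used on the slide.

```elixir
# Making a value as you read it
binary = <<1, 255>>                # two whole bytes
bits = <<1::size(1), 1::size(2)>>  # three bits: 1, 0, 1

# Pattern matching
<<first, second>> = binary                 # first is 1, second is 255
<<flag::size(1), rest::bitstring>> = bits  # flag is 1, rest is <<1::size(2)>>

# Concatenation: <> works on binaries only, so bitstrings are
# joined inside the <<>> constructor instead.
joined_binary = <<1>> <> <<255>>
joined_bits = <<flag::size(1), rest::bitstring>>

# Conversion to integer
int_from_binary = :binary.decode_unsigned(binary)  # 511
size = bit_size(bits)
<<int_from_bits::size(size)>> = bits               # int_from_bits is 5

IO.inspect({first, second, flag, rest})
IO.inspect({joined_binary == binary, joined_bits == bits})
IO.inspect({int_from_binary, int_from_bits})
```

`:binary.decode_unsigned/1` accepts only byte-aligned input, which is why the bitstring case falls back to a sized integer match.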

Slide 7

Slide 7 text

Protocol Buffers (PB)

A mechanism for serializing structured data
  Like XML, but
    simpler
    3 to 10 times smaller
    20 to 100 times faster
By default, gRPC uses Protocol Buffers

Slide 8

Slide 8 text

Overview

StructuredData: %MyMessage{a: 150}
Proto:          message MyMessage { optional int32 a = 1; }
Binary:         00001000_10010110_00000001

[Serialize]                   [Deserialize]

StructuredData                Binary
              \__Binary             \__StructuredData
              /                     /
       +-----+               +-----+
       |Proto|               |Proto|
       +-----+               +-----+

Slide 9

Slide 9 text

Binary

            00001000_10010110_00000001_......
msb 0       ^||||||| |        |
key number   ^^^^||| |        |
value type       ^^^ |        |
msb 1                ^        |
msb 0                         ^
drop msb              0010110  0000001
reverse               0000001  0010110
concatenate           000000_10010110
part        +-key--+ +----value------+
kv pair     +------------------------+
kv pair                                +--------+
message     +-----------------------------------+

Slide 10

Slide 10 text

Value part

                     00001000_10010110_00000001_..........
         msb 0       ^||||||| |        |
         key number   ^^^^||| |        |
+---+    value type       ^^^ |        |
| H | => msb 1                ^        |
| E | => msb 0                         ^
|   | => drop msb              0010110  0000001
| R | => reverse               0000001  0010110
| E | => concatenate           000000_10010110
+---+    part        +-key--+ +----value------+
         kv pair     +------------------------+
         kv pair                                +--------+
         message     +-----------------------------------+

Slide 11

Slide 11 text

Specification

Slide 12

Slide 12 text

Varint

A type of a PB value
Represents an integer
Has variable length (1 to 10 bytes)

Slide 13

Slide 13 text

MSB (Most Significant Bit)

Used to divide the input into a varint chunk
  If msb is 1, the next byte is also a part of the varint
  If msb is 0, this byte is the last byte of the varint

                                                      10010110_00000001_...
                                                      |        |
1. MSB is 1. So next byte is also a part of varint   -+        |
2. MSB is 0. So this byte is the last byte of varint ----------+
Varint chunk: 10010110 00000001

                                                      10010110_10000001_00000011_...
                                                      |        |        |
1. MSB is 1. So next byte is also a part of varint   -+        |        |
2. MSB is 1. So next byte is also a part of varint   ----------+        |
3. MSB is 0. So this byte is the last byte of varint -------------------+
Varint chunk: 10010110 10000001 00000011
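The chunking rule can be sketched as a small recursive function that keeps whole bytes while the MSB is 1 and stops at the first byte whose MSB is 0. The module name `VarintChunkSketch` is hypothetical, and the sketch assumes the input contains a complete varint (a truncated input raises `FunctionClauseError`).

```elixir
defmodule VarintChunkSketch do
  # Returns {varint_chunk, rest} where varint_chunk still contains the MSBs.
  def split(binary) when is_binary(binary), do: do_split(binary, <<>>)

  # MSB is 1: this byte belongs to the varint and more bytes follow.
  defp do_split(<<1::size(1), low::size(7), rest::binary>>, acc) do
    do_split(rest, <<acc::binary, 1::size(1), low::size(7)>>)
  end

  # MSB is 0: this byte is the last byte of the varint.
  defp do_split(<<0::size(1), low::size(7), rest::binary>>, acc) do
    {<<acc::binary, 0::size(1), low::size(7)>>, rest}
  end
end

IO.inspect(VarintChunkSketch.split(<<0b10010110, 0b00000001, 0b11111111>>))
# => {<<150, 1>>, <<255>>}
IO.inspect(VarintChunkSketch.split(<<0b10010110, 0b10000001, 0b00000011>>))
# => {<<150, 129, 3>>, ""}
```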

Slide 14

Slide 14 text

Converting varint binary to integer

10010110_10000001_00000011_..........

10010110_10000001_00000011      1. Divide to a varint chunk by using msb
 0010110  0000001  0000011      2. Drop msbs
 0000011  0000001  0010110      3. Reverse order by each 7 bits
   00000_11000000_10010110      4. Concatenate bitstrings
         ||       |  | ||       5. Cast as an integer
         ||       |  | |+ 2
         ||       |  | +- 4
         ||       |  +--- 16
         ||       +------ 128
         |+-------------- 16384
         +--------------- 32768

32768 + 16384 + 128 + 16 + 4 + 2 = 49302
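The same conversion can also be done arithmetically. This is an alternative to the bitstring approach the slides use, not the author's method. Each 7-bit group is shifted left by 7 times its position and OR-ed into an accumulator, which gives the same result as reversing and concatenating the groups. The module name `VarintArithmeticSketch` is hypothetical.

```elixir
defmodule VarintArithmeticSketch do
  import Bitwise

  # Returns {integer, rest}. Assumes the input contains a complete varint.
  def decode(binary) when is_binary(binary), do: decode(binary, 0, 0)

  # MSB is 1: add this 7-bit group and keep reading.
  defp decode(<<1::size(1), low::size(7), rest::binary>>, shift, acc) do
    decode(rest, shift + 7, acc ||| (low <<< shift))
  end

  # MSB is 0: add the last 7-bit group and stop.
  defp decode(<<0::size(1), low::size(7), rest::binary>>, shift, acc) do
    {acc ||| (low <<< shift), rest}
  end
end

IO.inspect(VarintArithmeticSketch.decode(<<0b10010110, 0b10000001, 0b00000011>>))
# => {49302, ""}
IO.inspect(VarintArithmeticSketch.decode(<<0b10010110, 0b00000001>>))
# => {150, ""}
```

For the three-byte example: 22 + (1 <<< 7) + (3 <<< 14) = 22 + 128 + 49152 = 49302.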

Slide 15

Slide 15 text

Elixir

Slide 16

Slide 16 text

Dividing to a varint chunk - Pattern matching is awesome

    b = :binary.encode_unsigned(0b10010110_10000001_00000011)
    <<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>> = b
    msb         # => 1
    rest_7bits  # => <<22::size(7)>>
    rest_binary # => <<129, 3>>

    10010110_10000001_00000011
    +                            msb         1
     +-----+                     rest_7bits  0010110
             +---------------+   rest_binary 10000001_00000011

Slide 17

Slide 17 text

    defmodule MsbBinary do
      # Split off the MSB, the remaining 7 bits of the first byte, and the rest.
      def scan(<<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>) do
        IO.inspect(msb: msb, rest_7bits: rest_7bits, rest_binary: rest_binary)
      end
    end

    binary = :binary.encode_unsigned(0b10010110_10000001_00000011)
    MsbBinary.scan(binary)

    [msb: 1, rest_7bits: <<22::size(7)>>, rest_binary: <<129, 3>>]

Slide 18

Slide 18 text

Dividing to a varint chunk - Recursion is also awesome

    defmodule MsbBinary do
      # MSB is 0: this is the last byte of the varint, so stop.
      def scan(<<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>)
          when msb == 0 do
        IO.inspect(msb: 0, rest_7bits: rest_7bits, rest_binary: rest_binary)
      end

      # MSB is 1: more bytes follow, so recurse on the rest.
      def scan(<<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>)
          when msb == 1 do
        IO.inspect(msb: 1, rest_7bits: rest_7bits, rest_binary: rest_binary)
        scan(rest_binary) # Do recursive
      end
    end

    binary = :binary.encode_unsigned(0b10010110_10000001_00000011)
    MsbBinary.scan(binary)

    [msb: 1, rest_7bits: <<22::size(7)>>, rest_binary: <<129, 3>>]
    [msb: 1, rest_7bits: <<1::size(7)>>, rest_binary: <<3>>]
    [msb: 0, rest_7bits: <<3::size(7)>>, rest_binary: ""]

Slide 19

Slide 19 text

Scanning varint

10010110_10000001_00000011_..........

10010110_10000001_00000011      1. Divide to a varint chunk by using msb
 0010110  0000001  0000011      2. Drop msbs
 0000011  0000001  0010110      3. Reverse order by each 7 bits
   00000_11000000_10010110      4. Concatenate bitstrings
                                5. Cast as an integer

Slide 20

Slide 20 text

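    defmodule MsbBinary do
      def scan(binary) when is_binary(binary) do
        do_scan(<<>>, binary)
      end

      # MSB is 0: prepend the last 7 bits and return the chunk with the rest.
      defp do_scan(progress, <<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>)
           when msb == 0 do
        IO.inspect(progress: progress, msb: 0, rest_7bits: rest_7bits, rest_binary: rest_binary)
        {<<rest_7bits::bitstring, progress::bitstring>>, rest_binary}
      end

      # MSB is 1: prepend these 7 bits (this reverses the order) and continue.
      defp do_scan(progress, <<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>)
           when msb == 1 do
        IO.inspect(progress: progress, msb: 1, rest_7bits: rest_7bits, rest_binary: rest_binary)
        do_scan(<<rest_7bits::bitstring, progress::bitstring>>, rest_binary)
      end
    end

    # Two trailing bytes (5 and 64) follow the varint so that the rest is not empty.
    binary = :binary.encode_unsigned(0b10010110_10000001_00000011_00000101_01000000)
    MsbBinary.scan(binary)

    [progress: "", msb: 1, rest_7bits: <<22::size(7)>>, rest_binary: <<129, 3, 5, 64>>]
    [progress: <<22::size(7)>>, msb: 1, rest_7bits: <<1::size(7)>>, rest_binary: <<3, 5, 64>>]
    [progress: <<2, 22::size(6)>>, msb: 0, rest_7bits: <<3::size(7)>>, rest_binary: <<5, 64>>]
    {<<6, 4, 22::size(5)>>, <<5, 64>>} # Result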

Slide 21

Slide 21 text

Converting varint

    defmodule MsbBinary do
      def scan(binary) when is_binary(binary) do
        do_scan(<<>>, binary)
      end

      defp do_scan(progress, <<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>)
           when msb == 0 do
        {<<rest_7bits::bitstring, progress::bitstring>>, rest_binary}
      end

      defp do_scan(progress, <<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>)
           when msb == 1 do
        do_scan(<<rest_7bits::bitstring, progress::bitstring>>, rest_binary)
      end
    end

    defmodule Varint do
      # Read the whole bitstring as one unsigned integer.
      def as_int32(b) when is_bitstring(b) do
        size = bit_size(b)
        <<i::size(size)>> = b
        i
      end
    end

    binary = :binary.encode_unsigned(0b10010110_00000001)
    {chunk, _rest} = MsbBinary.scan(binary) # => {<<2, 22::size(6)>>, ""}
    Varint.as_int32(chunk) # => 150

Slide 22

Slide 22 text

You've been able to understand this part of the figure

                     00001000_10010110_00000001_..........
         msb 0       ^||||||| |        |
         key number   ^^^^||| |        |
+---+    value type       ^^^ |        |
| H | => msb 1                ^        |
| E | => msb 0                         ^
|   | => drop msb              0010110  0000001
| R | => reverse               0000001  0010110
| E | => concatenate           000000_10010110
+---+    part        +-key--+ +----value------+
         kv pair     +------------------------+
         kv pair                                +--------+
         message     +-----------------------------------+

Slide 23

Slide 23 text

Key part

+---+                00001000_10010110_00000001_..........
| H | => msb 0       ^||||||| |        |
| E | => key number   ^^^^||| |        |
| R | => value type       ^^^ |        |
| E |    msb 1                ^        |
+---+    msb 0                         ^
         drop msb              0010110  0000001
         reverse               0000001  0010110
         concatenate           000000_10010110
         part        +-key--+ +----value------+
         kv pair     +------------------------+
         kv pair                                +--------+
         message     +-----------------------------------+

Slide 24

Slide 24 text

Specification

Slide 25

Slide 25 text

Scanning key part

Same as varint
  Take bytes while msb is 1
  The msb of the last byte of the key part is 0
Parsing the value is a little bit different

Slide 26

Slide 26 text

Key part carries two pieces of information

A type of the value
  An integer from 0 to 5
  In the example, the type of the value bound to key a is int32, due to the int32 declaration in the proto
A value of the key
  An integer greater than 0
  In the example, when key a is serialized it is represented by 1, due to a = 1 in the proto

message MyMessage { optional int32 a = 1; }

Slide 27

Slide 27 text

Getting a type of the value

The last 3 bits of the key part
Numbers 3 and 4 are deprecated

number  Meaning           Used For
0       Varint            int32, int64, uint32, uint64, sint32, sint64, bool, enum
1       64-bit            fixed64, sfixed64, double
2       Length-delimited  string, bytes, embedded messages, packed repeated fields
5       32-bit            fixed32, sfixed32, float

Slide 28

Slide 28 text

Getting a value of the key

The last 3 bits are the "type of the value"
Parse everything except those last 3 bits as a varint

Slide 29

Slide 29 text

Parsing key part

type                 bitstring  integer  meaning
A value of the key   0001       1        The value of the key is 1
A type of the value  000        0        Type of the value is Varint

            00001000_10010110_........
msb 0       ^|||||||
key number   ^^^^|||
value type       ^^^
part        +-key--+ +----value-------
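Once the key part has been decoded to an integer, the same split can be written with shifts and masks. This arithmetic form is an alternative to the bitstring pattern match used in the slides, and the module name `KeyArithmeticSketch` is hypothetical.

```elixir
defmodule KeyArithmeticSketch do
  import Bitwise

  # key is the key part already decoded as a varint integer.
  # Returns {value_of_the_key, type_of_the_value}.
  def split(key) when is_integer(key) and key >= 0 do
    {key >>> 3, key &&& 0b111}
  end
end

IO.inspect(KeyArithmeticSketch.split(0b00001000))
# => {1, 0}   key 1, type 0 (Varint)
IO.inspect(KeyArithmeticSketch.split(0b00010010))
# => {2, 2}   key 2, type 2 (Length-delimited)
```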

Slide 30

Slide 30 text

Elixir

    defmodule MsbBinary do
      def scan(binary) when is_binary(binary) do
        do_scan(<<>>, binary)
      end

      defp do_scan(progress, <<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>)
           when msb == 0 do
        {<<rest_7bits::bitstring, progress::bitstring>>, rest_binary}
      end

      defp do_scan(progress, <<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>)
           when msb == 1 do
        do_scan(<<rest_7bits::bitstring, progress::bitstring>>, rest_binary)
      end
    end

    defmodule Key do
      @wire_type_size 3
      @wire_types %{
        0 => :varint,
        1 => :"64bit",
        2 => :length_delimited,
        3 => :start_group,
        4 => :end_group,
        5 => :"32bit"
      }

      # The last 3 bits are the wire type; the bits before them are the key number.
      def parse(b) when is_bitstring(b) do
        total_size = bit_size(b)
        key_size = total_size - @wire_type_size
        <<key_no::size(key_size), wire_type_no::size(@wire_type_size)>> = b
        {key_no, @wire_types[wire_type_no]}
      end
    end

    binary = :binary.encode_unsigned(0b00001000_10010110_00000001)
    {chunk, _rest} = MsbBinary.scan(binary) # => {<<8::size(7)>>, <<150, 1>>}
    Key.parse(chunk) # => {1, :varint}

Slide 31

Slide 31 text

You've been able to understand this part of the figure

+---+                00001000_10010110_00000001_..........
| H | => msb 0       ^||||||| |        |
| E | => key number   ^^^^||| |        |
| R | => value type       ^^^ |        |
| E |    msb 1                ^        |
+---+    msb 0                         ^
         drop msb              0010110  0000001
         reverse               0000001  0010110
         concatenate           000000_10010110
         part        +-key--+ +----value------+
         kv pair     +------------------------+
         kv pair                                +--------+
         message     +-----------------------------------+

Slide 32

Slide 32 text

Conclusion

Slide 33

Slide 33 text

You know the BEAM basics of bitstring and binary

Difference between bitstring and binary
Make a binary as you read
Bitstring pattern matching
Concatenate bitstrings
Convert bitstring to integer

Slide 34

Slide 34 text

You almost know what this figure represents

            00001000_10010110_00000001_......
msb 0       ^||||||| |        |
key number   ^^^^||| |        |
value type       ^^^ |        |
msb 1                ^        |
msb 0                         ^
drop msb              0010110  0000001
reverse               0000001  0010110
concatenate           000000_10010110
part        +-key--+ +----value------+
kv pair     +------------------------+
kv pair                                +--------+
message     +-----------------------------------+

Slide 35

Slide 35 text

You can explain how the binary is parsed to the structured data

Slide 36

Slide 36 text

    defmodule MsbBinary do
      def scan(binary) when is_binary(binary) do
        do_scan(<<>>, binary)
      end

      defp do_scan(progress, <<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>)
           when msb == 0 do
        {<<rest_7bits::bitstring, progress::bitstring>>, rest_binary}
      end

      defp do_scan(progress, <<msb::size(1), rest_7bits::bitstring-size(7), rest_binary::binary>>)
           when msb == 1 do
        do_scan(<<rest_7bits::bitstring, progress::bitstring>>, rest_binary)
      end
    end

    defmodule Varint do
      def as_int32(b) when is_bitstring(b) do
        size = bit_size(b)
        <<i::size(size)>> = b
        i
      end
    end

    defmodule Key do
      @wire_type_size 3
      @wire_types %{
        0 => :varint,
        1 => :"64bit",
        2 => :length_delimited,
        3 => :start_group,
        4 => :end_group,
        5 => :"32bit"
      }

      def parse(b) when is_bitstring(b) do
        total_size = bit_size(b)
        key_size = total_size - @wire_type_size
        <<key_no::size(key_size), wire_type_no::size(@wire_type_size)>> = b
        {key_no, @wire_types[wire_type_no]}
      end
    end

    binary = :binary.encode_unsigned(0b00001000_10010110_00000001)
    {key_part, rest} = MsbBinary.scan(binary)
    {key_no, wire_type} = Key.parse(key_part) # => {1, :varint}
    {value_part, _} = MsbBinary.scan(rest)
    value = Varint.as_int32(value_part) # => 150
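As a possible next step, the same idea can be repeated until the input is empty to read every key/value pair in a message. The sketch below is self-contained and uses shifts and masks rather than the bitstring reversal above. It assumes every field uses the Varint wire type (0) and raises `MatchError` otherwise. The module name `VarintFieldsSketch` and the second test field are hypothetical.

```elixir
defmodule VarintFieldsSketch do
  import Bitwise

  # Returns a list of {key_number, value} pairs.
  # Assumption: every value in the message is a varint (wire type 0).
  def decode(binary) when is_binary(binary), do: decode(binary, [])

  defp decode(<<>>, acc), do: Enum.reverse(acc)

  defp decode(binary, acc) do
    {key, rest} = varint(binary, 0, 0)
    # The last 3 bits of the key are the wire type; only 0 is handled here.
    0 = key &&& 0b111
    {value, rest} = varint(rest, 0, 0)
    decode(rest, [{key >>> 3, value} | acc])
  end

  # MSB is 1: accumulate this 7-bit group and continue.
  defp varint(<<1::size(1), low::size(7), rest::binary>>, shift, acc),
    do: varint(rest, shift + 7, acc ||| (low <<< shift))

  # MSB is 0: accumulate the last 7-bit group and stop.
  defp varint(<<0::size(1), low::size(7), rest::binary>>, shift, acc),
    do: {acc ||| (low <<< shift), rest}
end

IO.inspect(VarintFieldsSketch.decode(<<0b00001000, 0b10010110, 0b00000001>>))
# => [{1, 150}]
IO.inspect(VarintFieldsSketch.decode(<<0x08, 0x96, 0x01, 0x10, 0x02>>))
# => [{1, 150}, {2, 2}]
```

Mapping the pairs back to a struct such as %MyMessage{a: 150} additionally needs the field names from the proto definition, which this sketch leaves out.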

Slide 37

Slide 37 text

Appendix

This slide is made for Erlang & Elixir Fest 2019
  https://elixir-fest.jp/
Protocol Buffer encoding specification
  https://developers.google.com/protocol-buffers/docs/encoding
My Protocol Buffer implementation (very experimental)
  https://github.com/niku/elixir_protobuf
You might also like these articles
  Further beam bitstring talks in elixirforum
  "Building a new MySQL adapter for Ecto"