Protocol Buffers implementation with using Elixir

niku
June 01, 2019

Transcript

  1. Me

    Living in Sapporo, Hokkaido
    Work at Farmnote
      A software and hardware company solving agricultural issues, particularly herd management
    Enjoy sapporo-beam
      A community to talk about the Erlang VM
      Almost every Thursday
      Since 2014
    https://github.com/niku
  2. Bitstring and binary in BEAM

                 description                                a kind of bitstring?   a kind of binary?
    bitstring    a sequence of zero or more bits            Yes                    No
    binary       a bitstring whose size is divisible by 8   Yes                    Yes
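
    The distinction in the table can be checked directly in IEx. This is a minimal sketch; the variable names are only illustrative and are not taken from the slides.

```elixir
# A bitstring may hold any number of bits; a binary is a bitstring
# whose bit count is a multiple of 8.
three_bits = <<5::size(3)>>   # the bits 101
two_bytes = <<1, 2>>          # 16 bits

IO.inspect(bit_size(three_bits))      # => 3
IO.inspect(is_bitstring(three_bits))  # => true
IO.inspect(is_binary(three_bits))     # => false

IO.inspect(bit_size(two_bytes))       # => 16
IO.inspect(is_bitstring(two_bytes))   # => true
IO.inspect(is_binary(two_bytes))      # => true
```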
  3. Manipulations

    Making a binary as you read: binary, bitstring
    Pattern matching: binary, bitstring
    Concatenation: binary, bitstring
    Conversion to integer: binary (result 511), bitstring (result 2)
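
    The four manipulations listed on this slide can be sketched as follows. The concrete values are assumptions chosen so that the integer conversions give 511 and 2, the two results that survive in the transcript; the slide's own code may have differed.

```elixir
# Making a binary as you read it: write the bits with 0b literals.
binary = <<0b00000001, 0b11111111>>         # two whole bytes
bitstring = <<0b1::size(1), 0b0::size(1)>>  # two bits: 10

# Pattern matching: sizes are in bits, the tail takes whatever is left.
<<first::size(8), rest::binary>> = binary
<<high::size(1), low::bitstring>> = bitstring

# Concatenation: list the parts inside <<>> and give each a type.
joined = <<bitstring::bitstring, binary::binary>>

# Conversion to integer: match the whole value as one integer of its bit size.
binary_size = bit_size(binary)
<<binary_as_integer::integer-size(binary_size)>> = binary
bitstring_size = bit_size(bitstring)
<<bitstring_as_integer::integer-size(bitstring_size)>> = bitstring

IO.inspect(first)                 # => 1
IO.inspect(rest)                  # => <<255>>
IO.inspect(high)                  # => 1
IO.inspect(low)                   # => <<0::size(1)>>
IO.inspect(bit_size(joined))      # => 18
IO.inspect(binary_as_integer)     # => 511
IO.inspect(bitstring_as_integer)  # => 2
```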
  4. Protocol Buffers (PB)

    A mechanism for serializing structured data
      like XML, but
        simpler
        3 to 10 times smaller
        20 to 100 times faster
    By default gRPC uses protocol buffers
  5. Overview

    StructuredData: %MyMessage{a: 150}
    Proto:          message MyMessage { optional int32 a = 1; }
    Binary:         00001000_10010110_00000001

    [Serialize]                   [Deserialize]

    StructuredData                Binary
                  \__Binary             \__StructuredData
                  /                     /
           +-----+               +-----+
           |Proto|               |Proto|
           +-----+               +-----+
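
    The deck concentrates on deserializing, so here is a hedged sketch of the opposite direction, showing where the three bytes 00001000_10010110_00000001 come from. The module `EncodeSketch` and its functions are hypothetical names for illustration, not part of the talk or of any library, and only non-negative varint fields are handled.

```elixir
defmodule EncodeSketch do
  import Bitwise

  # Wire type number for varints, as listed in the Protocol Buffers encoding specification.
  @varint_wire_type 0

  # Values below 128 fit in one byte whose msb is 0.
  def varint(n) when is_integer(n) and n >= 0 and n < 128, do: <<0::1, n::7>>

  # Larger values: emit the low 7 bits with msb 1, then encode the remaining bits.
  def varint(n) when is_integer(n) and n >= 128 do
    <<1::1, (n &&& 0x7F)::7, varint(n >>> 7)::binary>>
  end

  # The key is the field number followed by the 3-bit wire type, itself written as a varint.
  def varint_field(field_no, value) do
    key = (field_no <<< 3) ||| @varint_wire_type
    <<varint(key)::binary, varint(value)::binary>>
  end
end

IO.inspect(EncodeSketch.varint_field(1, 150)) # => <<8, 150, 1>>
```

    `<<8, 150, 1>>` is the same byte sequence as `00001000_10010110_00000001` in the slide.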
  6. Binary

                00001000_10010110_00000001_......
    msb 0       ^||||||| |        |
    key number   ^^^^||| |        |
    value type       ^^^ |        |
    msb 1                ^        |
    msb 0                         ^
    drop msb              0010110  0000001
    reverse               0000001  0010110
    concatenate           000000_10010110
    part        +-key--+ +----value------+
    kv pair     +------------------------+
    kv pair                                +--------+
    message     +-----------------------------------+
  7. Value part

                         00001000_10010110_00000001_..........
             msb 0       ^||||||| |        |
             key number   ^^^^||| |        |
    +---+    value type       ^^^ |        |
    | H | => msb 1                ^        |
    | E | => msb 0                         ^
    |   | => drop msb              0010110  0000001
    | R | => reverse               0000001  0010110
    | E | => concatenate           000000_10010110
    +---+    part        +-key--+ +----value------+
             kv pair     +------------------------+
             kv pair                                +--------+
             message     +-----------------------------------+
  8. Varint

    A type of a PB's value
    Represents an integer
    Has variable length (1 to 10 bytes)
  9. MSB (Most Significant Bit)

    Used to divide the input into a varint chunk
      If msb is 1, the next byte is also a part of the varint
      If msb is 0, this byte is the last byte of the varint

                                                          10010110_00000001_...
                                                          |        |
    1. MSB is 1. So next byte is also a part of varint   -+        |
    2. MSB is 0. So this byte is the last byte of varint ----------+
    Varint chunk: 10010110 00000001

                                                          10010110_10000001_00000011_...
                                                          |        |        |
    1. MSB is 1. So next byte is also a part of varint   -+        |        |
    2. MSB is 1. So next byte is also a part of varint   ----------+        |
    3. MSB is 0. So this byte is the last byte of varint -------------------+
    Varint chunk: 10010110 10000001 00000011
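
    The msb rule above can be sketched as a function that splits the leading varint chunk from the rest of a binary without decoding it. `VarintChunk` is a hypothetical module name used only for this illustration.

```elixir
defmodule VarintChunk do
  # Returns {chunk, rest}, where chunk holds every byte of the leading varint.
  def split(binary) when is_binary(binary), do: split(binary, <<>>)

  # msb is 1: keep this byte and continue with the next one.
  defp split(<<1::1, low::7, rest::binary>>, chunk),
    do: split(rest, <<chunk::binary, 1::1, low::7>>)

  # msb is 0: this byte is the last byte of the varint.
  defp split(<<0::1, low::7, rest::binary>>, chunk),
    do: {<<chunk::binary, 0::1, low::7>>, rest}
end

IO.inspect(VarintChunk.split(<<0b10010110, 0b00000001>>))
# => {<<150, 1>>, ""}
IO.inspect(VarintChunk.split(<<0b10010110, 0b10000001, 0b00000011, 0b00000101>>))
# => {<<150, 129, 3>>, <<5>>}
```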
  10. Converting varint binary to integer

    10010110_10000001_00000011_..........
    10010110_10000001_00000011    1. Divide into a varint chunk by using msb
     0010110  0000001  0000011    2. Drop msbs
     0000011  0000001  0010110    3. Reverse order by each 7 bits
    00000_11000000_10010110       4. Concatenate bitstrings
          ||       |  | ||        5. Cast as an integer
          ||       |  | ||
          ||       |  | |+ 2
          ||       |  | +- 4
          ||       |  +--- 16
          ||       +------ 128
          |+-------------- 16384
          +--------------- 32768
    32768 + 16384 + 128 + 16 + 4 + 2 = 49302
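
    As a cross-check of the arithmetic above, the same number can be obtained with shifts instead of reordering bitstrings. This is an alternative technique, not the one the slides go on to implement, and `VarintArithmetic` is a hypothetical name.

```elixir
defmodule VarintArithmetic do
  import Bitwise

  # Returns {integer, rest}. Each byte contributes its low 7 bits,
  # shifted 7 bits further left than the previous byte.
  def decode(binary) when is_binary(binary), do: decode(binary, 0, 0)

  # msb is 1: add this group and continue with the next byte.
  defp decode(<<1::1, bits::7, rest::binary>>, shift, acc),
    do: decode(rest, shift + 7, acc ||| (bits <<< shift))

  # msb is 0: add the final group and stop.
  defp decode(<<0::1, bits::7, rest::binary>>, shift, acc),
    do: {acc ||| (bits <<< shift), rest}
end

IO.inspect(VarintArithmetic.decode(<<0b10010110, 0b10000001, 0b00000011>>))
# => {49302, ""}
```

    Here 22 + (1 <<< 7) + (3 <<< 14) = 22 + 128 + 49152 = 49302, which agrees with the sum on the slide.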
  11. Dividing into a varint chunk - Pattern matching is awesome

    b = :binary.encode_unsigned(0b10010110_10000001_00000011)
    <<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>> = b
    msb         # => 1
    rest_7bits  # => <<22::size(7)>>
    rest_binary # => <<129, 3>>

    10010110_10000001_00000011
    +                             msb          1
     +-----+                      rest_7bits   0010110
             +---------------+    rest_binary  10000001_00000011
  12. defmodule MsbBinary do
        def scan(<<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) do
          IO.inspect(msb: msb, rest_7bits: rest_7bits, rest_binary: rest_binary)
        end
      end

      binary = :binary.encode_unsigned(0b10010110_10000001_00000011)
      MsbBinary.scan(binary)
      # prints:
      # [msb: 1, rest_7bits: <<22::size(7)>>, rest_binary: <<129, 3>>]
  13. Dividing into a varint chunk - Recursion is also awesome

    defmodule MsbBinary do
      # msb is 0: this is the last byte of the varint, so stop here.
      def scan(<<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) when msb == 0 do
        IO.inspect(msb: 0, rest_7bits: rest_7bits, rest_binary: rest_binary)
      end

      # msb is 1: the next byte also belongs to the varint.
      def scan(<<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) when msb == 1 do
        IO.inspect(msb: 1, rest_7bits: rest_7bits, rest_binary: rest_binary)
        scan(rest_binary) # Do recursive
      end
    end

    binary = :binary.encode_unsigned(0b10010110_10000001_00000011)
    MsbBinary.scan(binary)
    # prints:
    # [msb: 1, rest_7bits: <<22::size(7)>>, rest_binary: <<129, 3>>]
    # [msb: 1, rest_7bits: <<1::size(7)>>, rest_binary: <<3>>]
    # [msb: 0, rest_7bits: <<3::size(7)>>, rest_binary: ""]
  14. Scanning varint

    10010110_10000001_00000011_..........
    10010110_10000001_00000011    1. Divide into a varint chunk by using msb
     0010110  0000001  0000011    2. Drop msbs
     0000011  0000001  0010110    3. Reverse order by each 7 bits
    00000_11000000_10010110       4. Concatenate bitstrings
                                  5. Cast as an integer
  15. defmodule MsbBinary do
        def scan(binary) when is_binary(binary) do
          do_scan(<<>>, binary)
        end

        # msb is 0: prepend the last 7 bits and return {bits, remaining bytes}.
        defp do_scan(progress, <<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) when msb == 0 do
          IO.inspect(progress: progress, msb: 0, rest_7bits: rest_7bits, rest_binary: rest_binary)
          {<<rest_7bits::bitstring, progress::bitstring>>, rest_binary}
        end

        # msb is 1: prepend these 7 bits (this reverses the group order) and continue.
        defp do_scan(progress, <<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) when msb == 1 do
          IO.inspect(progress: progress, msb: 1, rest_7bits: rest_7bits, rest_binary: rest_binary)
          do_scan(<<rest_7bits::bitstring, progress::bitstring>>, rest_binary)
        end
      end

      # The varint is followed by two more bytes (5 and 64), which show up as the rest in the output.
      binary = <<0b10010110, 0b10000001, 0b00000011, 5, 64>>
      MsbBinary.scan(binary)
      # prints:
      # [progress: "", msb: 1, rest_7bits: <<22::size(7)>>, rest_binary: <<129, 3, 5, 64>>]
      # [progress: <<22::size(7)>>, msb: 1, rest_7bits: <<1::size(7)>>, rest_binary: <<3, 5, 64>>]
      # [progress: <<2, 22::size(6)>>, msb: 0, rest_7bits: <<3::size(7)>>, rest_binary: <<5, 64>>]
      # returns:
      # {<<6, 4, 22::size(5)>>, <<5, 64>>} # Result
  16. Converting varint

    defmodule MsbBinary do
      def scan(binary) when is_binary(binary) do
        do_scan(<<>>, binary)
      end

      defp do_scan(progress, <<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) when msb == 0 do
        {<<rest_7bits::bitstring, progress::bitstring>>, rest_binary}
      end

      defp do_scan(progress, <<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) when msb == 1 do
        do_scan(<<rest_7bits::bitstring, progress::bitstring>>, rest_binary)
      end
    end

    defmodule Varint do
      # Read the whole bitstring as one unsigned integer.
      def as_int32(b) when is_bitstring(b) do
        size = bit_size(b)
        <<i::integer-size(size)>> = b
        i
      end
    end

    binary = :binary.encode_unsigned(0b10010110_00000001)
    {chunk, _rest} = MsbBinary.scan(binary) # => {<<2, 22::size(6)>>, ""}
    Varint.as_int32(chunk) # => 150
  17. You've been able to understand this part of the figure

                         00001000_10010110_00000001_..........
             msb 0       ^||||||| |        |
             key number   ^^^^||| |        |
    +---+    value type       ^^^ |        |
    | H | => msb 1                ^        |
    | E | => msb 0                         ^
    |   | => drop msb              0010110  0000001
    | R | => reverse               0000001  0010110
    | E | => concatenate           000000_10010110
    +---+    part        +-key--+ +----value------+
             kv pair     +------------------------+
             kv pair                                +--------+
             message     +-----------------------------------+
  18. Key part

    +---+                00001000_10010110_00000001_..........
    | H | => msb 0       ^||||||| |        |
    | E | => key number   ^^^^||| |        |
    | R | => value type       ^^^ |        |
    | E |    msb 1                ^        |
    +---+    msb 0                         ^
             drop msb              0010110  0000001
             reverse               0000001  0010110
             concatenate           000000_10010110
             part        +-key--+ +----value------+
             kv pair     +------------------------+
             kv pair                                +--------+
             message     +-----------------------------------+
  19. Scanning key part

    Same as varint
      Take bytes while msb is 1
      The msb of the last byte of the key part is 0
    Parsing the value is a little bit different
  20. Key part has two pieces of information

    A type of the value
      an integer from 0 to 5
      In the example, the type of the value bound to key a is int32, due to the int32 in the definition
    A value of the key
      an integer greater than 0
      In the example, the key is represented by 1 when serialized, due to the = 1 in the definition

    message MyMessage {
      optional int32 a = 1;
    }
  21. Getting a type of the value

    The last 3 bits of the key part
    Numbers 3 and 4 are deprecated

    number   Meaning            Used For
    0        Varint             int32, int64, uint32, uint64, sint32, sint64, bool, enum
    1        64-bit             fixed64, sfixed64, double
    2        Length-delimited   string, bytes, embedded messages, packed repeated fields
    5        32-bit             fixed32, sfixed32, float
  22. Getting a value of the key

    The last 3 bits are the "type of the value"
    Parse everything except those 3 bits as a varint
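
    For the common case where the key fits in one byte, both pieces of information can be read with a single pattern match. This is a simplified sketch under that one-byte assumption (field numbers 1 to 15); `KeyByteSketch` is a hypothetical name, and the deprecated wire types 3 and 4 are reported as `:unknown` here.

```elixir
defmodule KeyByteSketch do
  @wire_types %{0 => :varint, 1 => :"64bit", 2 => :length_delimited, 5 => :"32bit"}

  # Single-byte key: msb is 0, the next 4 bits are the value of the key,
  # and the last 3 bits are the type of the value.
  def parse(<<0::1, key_no::4, wire_type_no::3>>) do
    {key_no, Map.get(@wire_types, wire_type_no, :unknown)}
  end
end

IO.inspect(KeyByteSketch.parse(<<0b00001000>>)) # => {1, :varint}
IO.inspect(KeyByteSketch.parse(<<0b00010010>>)) # => {2, :length_delimited}
```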
  23. Parsing key part

    type                  bitstring   integer   meaning
    A value of the key    0001        1         The value of the key is 1
    A type of the value   000         0         Type of the value is Varint

                00001000_10010110_........
    msb 0       ^|||||||
    key number   ^^^^|||
    value type       ^^^
    part        +-key--+ +----value-------
  24. Elixir

    defmodule MsbBinary do
      def scan(binary) when is_binary(binary) do
        do_scan(<<>>, binary)
      end

      defp do_scan(progress, <<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) when msb == 0 do
        {<<rest_7bits::bitstring, progress::bitstring>>, rest_binary}
      end

      defp do_scan(progress, <<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) when msb == 1 do
        do_scan(<<rest_7bits::bitstring, progress::bitstring>>, rest_binary)
      end
    end

    defmodule Key do
      @wire_type_size 3
      @wire_types %{
        0 => :varint,
        1 => :"64bit",
        2 => :length_delimited,
        3 => :start_group,
        4 => :end_group,
        5 => :"32bit"
      }

      # The last 3 bits are the wire type; the bits before them are the key number.
      def parse(b) when is_bitstring(b) do
        total_size = bit_size(b)
        key_size = total_size - @wire_type_size
        <<key_no::size(key_size), wire_type_no::size(@wire_type_size)>> = b
        {key_no, @wire_types[wire_type_no]}
      end
    end

    binary = :binary.encode_unsigned(0b00001000_10010110_00000001)
    {chunk, _rest} = MsbBinary.scan(binary) # => {<<8::size(7)>>, <<150, 1>>}
    Key.parse(chunk) # => {1, :varint}
  25. You've been able to understand this part of the figure

    +---+                00001000_10010110_00000001_..........
    | H | => msb 0       ^||||||| |        |
    | E | => key number   ^^^^||| |        |
    | R | => value type       ^^^ |        |
    | E |    msb 1                ^        |
    +---+    msb 0                         ^
             drop msb              0010110  0000001
             reverse               0000001  0010110
             concatenate           000000_10010110
             part        +-key--+ +----value------+
             kv pair     +------------------------+
             kv pair                                +--------+
             message     +-----------------------------------+
  26. You know the BEAM basics of bitstring and binary

    Difference between bitstring and binary
    Make a binary as you read
    Bitstring pattern matching
    Concatenate bitstrings
    Convert bitstring to integer
  27. You almost know what this figure represents

                00001000_10010110_00000001_......
    msb 0       ^||||||| |        |
    key number   ^^^^||| |        |
    value type       ^^^ |        |
    msb 1                ^        |
    msb 0                         ^
    drop msb              0010110  0000001
    reverse               0000001  0010110
    concatenate           000000_10010110
    part        +-key--+ +----value------+
    kv pair     +------------------------+
    kv pair                                +--------+
    message     +-----------------------------------+
  28. defmodule MsbBinary do
        def scan(binary) when is_binary(binary) do
          do_scan(<<>>, binary)
        end

        defp do_scan(progress, <<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) when msb == 0 do
          {<<rest_7bits::bitstring, progress::bitstring>>, rest_binary}
        end

        defp do_scan(progress, <<msb::1, rest_7bits::bitstring-size(7), rest_binary::binary>>) when msb == 1 do
          do_scan(<<rest_7bits::bitstring, progress::bitstring>>, rest_binary)
        end
      end

      defmodule Varint do
        def as_int32(b) when is_bitstring(b) do
          size = bit_size(b)
          <<i::integer-size(size)>> = b
          i
        end
      end

      defmodule Key do
        @wire_type_size 3
        @wire_types %{
          0 => :varint,
          1 => :"64bit",
          2 => :length_delimited,
          3 => :start_group,
          4 => :end_group,
          5 => :"32bit"
        }

        def parse(b) when is_bitstring(b) do
          total_size = bit_size(b)
          key_size = total_size - @wire_type_size
          <<key_no::size(key_size), wire_type_no::size(@wire_type_size)>> = b
          {key_no, @wire_types[wire_type_no]}
        end
      end

      binary = :binary.encode_unsigned(0b00001000_10010110_00000001)
      # Scan the key part first, then scan the value part from what is left.
      {key_part, rest} = MsbBinary.scan(binary)
      {key_no, wire_type} = Key.parse(key_part) # => {1, :varint}
      {value_part, _} = MsbBinary.scan(rest)
      value = Varint.as_int32(value_part) # => 150
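
    The figure shows a message as a sequence of kv pairs, while the code above reads a single pair. A minimal sketch of looping over every pair might look like the following. It is self-contained, so it decodes varints with shifts rather than reusing the modules above, it only handles wire type 0 (varint), and `MessageSketch` is a hypothetical name.

```elixir
defmodule MessageSketch do
  import Bitwise

  # Returns a list of {key number, value} tuples in the order they appear.
  def decode(binary) when is_binary(binary), do: decode_pairs(binary, [])

  defp decode_pairs(<<>>, acc), do: Enum.reverse(acc)

  defp decode_pairs(binary, acc) do
    {key, rest} = varint(binary, 0, 0)
    # Only wire type 0 (varint) is handled; any other type fails this match.
    0 = key &&& 0b111
    {value, rest} = varint(rest, 0, 0)
    decode_pairs(rest, [{key >>> 3, value} | acc])
  end

  # msb is 1: add this 7-bit group and continue with the next byte.
  defp varint(<<1::1, bits::7, rest::binary>>, shift, acc),
    do: varint(rest, shift + 7, acc ||| (bits <<< shift))

  # msb is 0: add the final group and return the remaining bytes.
  defp varint(<<0::1, bits::7, rest::binary>>, shift, acc),
    do: {acc ||| (bits <<< shift), rest}
end

message = <<0b00001000, 0b10010110, 0b00000001, 0b00010000, 0b00000011>>
IO.inspect(MessageSketch.decode(message)) # => [{1, 150}, {2, 3}]
```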
  29. Appendix

    This slide is made for Erlang & Elixir Fest 2019
      https://elixir-fest.jp/
    Protocol Buffers encoding specification
      https://developers.google.com/protocol-buffers/docs/encoding
    My Protocol Buffers implementation (very experimental)
      https://github.com/niku/elixir_protobuf
    Further BEAM bitstring talks
      "Building a new MySQL adapter for Ecto" in elixirforum
      You might also like articles