Slide 2

Slide 2 text

Grzegorz Kołodziejczyk CTO @ Ragnarson

Slide 3

Slide 3 text

Working with
 binary data
 in Ruby

Slide 4

Slide 4 text

How we got into this?
Binary data 101
What's available in the STDLIB
A gem that makes it all easier

Slide 5

Slide 5 text

How we got into this?
Finally not a cookie cutter project
Data from meters and sensors
Collected by gateways, sent over MQTT
MQTT -> Sidekiq
OMS protocol, raw payloads

Slide 6

Slide 6 text

How we got into this?
In production for over 2 years
~2000 "requests" per minute
Industry leaders in number of supported devices

Slide 7

Slide 7 text

How we got into this?
Low-level protocol = C

Slide 8

Slide 8 text

How we got into this?
No good open source libraries

Slide 9

Slide 9 text

How we got into this?
Niche protocol

Slide 10

Slide 10 text

How we got into this?
Do I really want to write C?

Slide 11

Slide 11 text

How we got into this?
Can we do it in Ruby instead?

Slide 12

Slide 12 text

Binary data 101

Slide 13

Slide 13 text

Binary data 101
All data is binary

Slide 14

Slide 14 text

Binary data 101
All keyboards are mechanical keyboards

Slide 15

Slide 15 text

Binary data 101
A bit is either 1 or 0

Slide 16

Slide 16 text

Binary data 101
A byte is a group of 8 bits

  2^7  2^6  2^5  2^4  2^3  2^2  2^1  2^0
  128   64   32   16    8    4    2    1
    0    1    0    1    1    0    0    1
    0   64    0   16    8    0    0    1   = 89
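
The positional weights on this slide can be checked directly in Ruby. This is a minimal sketch; the helper name bits_to_integer is made up for illustration and is not part of the talk.

```ruby
# Hypothetical helper: add up the weights (2**7 .. 2**0) of the bits that are set.
def bits_to_integer(bit_string)
  weights = bit_string.chars.reverse.each_with_index.map do |bit, position|
    bit == "1" ? 2**position : 0
  end
  weights.sum
end

puts bits_to_integer("01011001")  # 64 + 16 + 8 + 1 = 89
puts "01011001".to_i(2)           # built-in equivalent, also 89
puts 89.to_s(2).rjust(8, "0")     # back to the bit string "01011001"
```

In everyday code String#to_i(2) and Integer#to_s(2) do the same job; the helper only spells out the arithmetic.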

Slide 17

Slide 17 text

Binary data 101
Endianness is byte order
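
To make byte order concrete, here is a small sketch that packs the same 16-bit number both ways. The helper hex_bytes is an illustrative name, not something from the talk; the ">" and "<" modifiers are the standard pack/unpack suffixes for big and little endian.

```ruby
# Hypothetical helper: hex dump of a binary string, two digits per byte.
def hex_bytes(binary)
  binary.unpack("C*").map { |byte| format("%02x", byte) }.join(" ")
end

value = 0x1234

puts hex_bytes([value].pack("S>"))  # big endian:    12 34
puts hex_bytes([value].pack("S<"))  # little endian: 34 12

# The same two bytes read back with each byte order give different numbers.
bytes = [0x12, 0x34].pack("C2")
puts bytes.unpack1("S>")  # 4660  (0x1234)
puts bytes.unpack1("S<")  # 13330 (0x3412)
```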

Slide 18

Slide 18 text

Binary data 101
ASCII is for character encoding

"a".ord
=> 97
122.chr
=> "z"
10.chr
=> "\n"

Slide 19

Slide 19 text

Binary data 101
Hexadecimal notation

0b11111111 == 255 == 0xFF
0b01011001 == 89 == 0x59
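
A short sketch of moving between the three notations at runtime. The helper name byte_notations is hypothetical; format, String#hex and String#to_i are core Ruby.

```ruby
# One hex digit covers exactly four bits, so a byte is always two hex digits.
# Hypothetical helper: the same byte value in three notations.
def byte_notations(value)
  {
    decimal: value.to_s,
    binary: format("%08b", value),
    hex: format("%02X", value)
  }
end

p byte_notations(89)
p byte_notations(255)

puts "59".hex            # 89
puts "FF".hex            # 255
puts "01011001".to_i(2)  # 89
```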

Slide 20

Slide 20 text

Binary data 101
Bit
Byte
Endianness
ASCII
Hex

Slide 21

Slide 21 text

What's available in the STDLIB

Slide 22

Slide 22 text

What's available in the STDLIB
Array#pack and String#unpack

Slide 23

Slide 23 text

What's available in the STDLIB
Array#pack and String#unpack

"\x8A\xE3".unpack("H*")
=> ["8ae3"]

Slide 24

Slide 24 text

What's available in the STDLIB
Array#pack and String#unpack

"\x8A\xE3".unpack1("H*")
=> "8ae3"

Slide 25

Slide 25 text

What's available in the STDLIB
Array#pack and String#unpack

["8ae3"].pack("H*")
=> "\x8A\xE3"

Slide 26

Slide 26 text

What's available in the STDLIB
Array#pack and String#unpack

["420A"].pack("H*")
=> "B\n"

Slide 27

Slide 27 text

What's available in the STDLIB

Slide 28

Slide 28 text

What's available in the STDLIB

Integer   |         |
Directive | Returns | Meaning
-----------------------------------------------------------------
C         | Integer | 8-bit unsigned (unsigned char)
S         | Integer | 16-bit unsigned, native endian (uint16_t)
L         | Integer | 32-bit unsigned, native endian (uint32_t)
Q         | Integer | 64-bit unsigned, native endian (uint64_t)
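
The directives in the table use the host's byte order; appending "<" or ">" to S, L or Q pins it to little or big endian. A sketch with a made-up 7-byte record layout (the layout and the name parse_record are assumptions for illustration only):

```ruby
# Hypothetical record: 1-byte type, 2-byte little-endian length,
# 4-byte big-endian counter. Whitespace in the template is ignored.
def parse_record(binary)
  type, length, counter = binary.unpack("C S< L>")
  { type: type, length: length, counter: counter }
end

record = [1, 513, 65_536].pack("C S< L>")
p record.bytes          # [1, 1, 2, 0, 1, 0, 0]
p parse_record(record)
```

Spelling out the byte order keeps the result identical on every machine, which matters once the data comes from a device or a file rather than from the same process.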

Slide 29

Slide 29 text

What's available in the STDLIB

String    |         |
Directive | Returns | Meaning
-----------------------------------------------------------------
A         | String  | arbitrary binary string (remove trailing nulls and ASCII spaces)
a         | String  | arbitrary binary string
Z         | String  | null-terminated string
B         | String  | bit string (MSB first)
b         | String  | bit string (LSB first)
H         | String  | hex string (high nibble first)
h         | String  | hex string (low nibble first)
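
The differences between these directives are easiest to see on the same input. A small sketch; the helper name string_views is made up for illustration.

```ruby
# Hypothetical helper: decode the same bytes with the three text directives.
def string_views(binary)
  {
    "A*" => binary.unpack1("A*"),  # trailing spaces and nulls removed
    "a*" => binary.unpack1("a*"),  # bytes returned untouched
    "Z*" => binary.unpack1("Z*")   # everything up to the first null
  }
end

padded = ["abc ", 0, 0].pack("a4 C C")
p string_views(padded)

byte = [0x59].pack("C")
puts byte.unpack1("B8")  # 01011001 (MSB first)
puts byte.unpack1("b8")  # 10011010 (LSB first)
puts byte.unpack1("H2")  # 59 (high nibble first)
puts byte.unpack1("h2")  # 95 (low nibble first)
```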

Slide 30

Slide 30 text

Some examples

Slide 31

Slide 31 text

IP Address

Slide 32

Slide 32 text

IP Address

"\xC0\xA8\x01\x01"

"\xC0\xA8\x01\x01".unpack("CCCC").join(".")
=> "192.168.1.1"

"192.168.1.1".split(".").map(&:to_i).pack("CCCC")
=> "\xC0\xA8\x01\x01"
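
The same four bytes can also be read as one unsigned 32-bit number in network (big-endian) byte order with the "N" directive. A sketch; the helper names ip_to_integer and integer_to_ip are made up and not part of the talk.

```ruby
# Hypothetical helpers: dotted-quad string <-> unsigned 32-bit integer.
def ip_to_integer(ip)
  ip.split(".").map(&:to_i).pack("C4").unpack1("N")
end

def integer_to_ip(number)
  [number].pack("N").unpack("C4").join(".")
end

puts ip_to_integer("192.168.1.1")  # 3232235777
puts integer_to_ip(3_232_235_777)  # 192.168.1.1
```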

Slide 33

Slide 33 text

MAC Address

"\x00\x01\x02\x03\x04\x05"

"\x00\x01\x02\x03\x04\x05".
  unpack("H2H2H2H2H2H2").join(":")
=> "00:01:02:03:04:05"

"00:01:02:03:04:05".
  split(":").pack("H2H2H2H2H2H2")
=> "\x00\x01\x02\x03\x04\x05"

Slide 34

Slide 34 text

gzip - show original filename

+---+---+---+---+---+---+---+---+---+---+
|ID1|ID2|CM |FLG|     MTIME     |XFL|OS | (more-->)
+---+---+---+---+---+---+---+---+---+---+

+---+---+=================================+
| XLEN  |...XLEN bytes of "extra field"...| (more-->)
+---+---+=================================+

+=========================================+
|...original file name, zero-terminated...| (more-->)
+=========================================+

Slide 35

Slide 35 text

def original_filename(file_path)
  File.open(file_path, "rb") do |f|
    header = f.read(10)
    # "L<" reads MTIME as little endian, the way gzip stores it, on any host
    magic, cm, flg, mtime, xfl, os = header.unpack("H4H2B8L<H2C")

    (magic == "1f8b") || raise("Invalid gzip header")
    (cm == "08") || raise("Unknown compression method")

    # B8 returns the flag bits MSB first: 3 reserved bits, then the five flags
    _, _, _, fcomment, fname, fextra, fhcrc, ftext = flg.split("")

    if fextra == "1"
      xlen = f.read(2).unpack1("S<") # XLEN is little endian as well
      f.seek(xlen, IO::SEEK_CUR)
    end

    f.gets("\x00").unpack1("Z*") if fname == "1"
  end
end

Slide 36

Slide 36 text

gzip - show original filename

original_filename("test2.txt.gz")
=> "test.txt"
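
To try the header parsing without a file on disk, the stream can be built in memory with Zlib::GzipWriter from the standard library. This is a sketch under assumptions: original_filename_from is a variant of the slide's method that takes any IO object, and gzip_bytes is a made-up helper; neither name comes from the talk.

```ruby
require "stringio"
require "zlib"

# Variant of the slide's method that reads from an IO instead of a path.
def original_filename_from(io)
  magic, cm, flg = io.read(10).unpack("H4 H2 B8")
  raise "Invalid gzip header" unless magic == "1f8b"
  raise "Unknown compression method" unless cm == "08"

  # B8 yields the flag bits MSB first, so index 4 is FNAME and index 5 is FEXTRA.
  fname = flg[4]
  fextra = flg[5]

  if fextra == "1"
    xlen = io.read(2).unpack1("S<") # XLEN is stored little endian
    io.seek(xlen, IO::SEEK_CUR)
  end

  io.gets("\x00").unpack1("Z*") if fname == "1"
end

# Hypothetical helper: gzip a string in memory, optionally recording a file name.
def gzip_bytes(content, orig_name: nil)
  buffer = StringIO.new(String.new)
  writer = Zlib::GzipWriter.new(buffer)
  writer.orig_name = orig_name if orig_name # must be set before the first write
  writer.write(content)
  writer.finish
  buffer.string
end

puts original_filename_from(StringIO.new(gzip_bytes("hello", orig_name: "test.txt")))
p original_filename_from(StringIO.new(gzip_bytes("hello")))
```

The first call prints test.txt; the second prints nil because the FNAME flag is not set when no name was recorded.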

Slide 37

Slide 37 text

def original_filename(file_path)
  File.open(file_path, "rb") do |f|
    header = f.read(10)
    # "L<" reads MTIME as little endian, the way gzip stores it, on any host
    magic, cm, flg, mtime, xfl, os = header.unpack("H4H2B8L<H2C")

    (magic == "1f8b") || raise("Invalid gzip header")
    (cm == "08") || raise("Unknown compression method")

    # B8 returns the flag bits MSB first: 3 reserved bits, then the five flags
    _, _, _, fcomment, fname, fextra, fhcrc, ftext = flg.split("")

    if fextra == "1"
      xlen = f.read(2).unpack1("S<") # XLEN is little endian as well
      f.seek(xlen, IO::SEEK_CUR)
    end

    f.gets("\x00").unpack1("Z*") if fname == "1"
  end
end

Slide 38

Slide 38 text

['audio/x-ms-asx', [[0, 'ASF '], [0..64, ' "\u001F\x8B"

Slide 39

Slide 39 text

BinData - a better way

Slide 40

Slide 40 text

BinData - a better way

https://github.com/dmendel/bindata
"BinData provides a declarative way to read and write structured binary data."
used by ~1300 repos

Slide 41

Slide 41 text

class IPAddr < BinData::Primitive
  array :octets, type: :uint8, initial_length: 4

  def set(val)
    self.octets = val.split(/\./).map(&:to_i)
  end

  def get
    self.octets.map(&:to_s).join(".")
  end
end

IPAddr.new("192.168.1.1").to_binary_s
=> "\xC0\xA8\x01\x01"

Slide 42

Slide 42 text

class IPAddr < BinData::Primitive
  array :octets, type: :uint8, initial_length: 4

  def set(val)
    self.octets = val.split(/\./).map(&:to_i)
  end

  def get
    self.octets.map(&:to_s).join(".")
  end
end

IPAddr.read("\xC0\xA8\x01\x01")
=> "192.168.1.1"

Slide 43

Slide 43 text

class MacAddr < BinData::Primitive
  array :octets, type: :uint8, initial_length: 6

  def set(val)
    # Octets are written in hex, so parse them base 16 ("0a" => 10, "ff" => 255)
    self.octets = val.split(/:/).collect { |octet| octet.to_i(16) }
  end

  def get
    octets.collect { |octet| "%02x" % octet }.join(":")
  end
end

MacAddr.new("00:01:02:03:04:05").
  to_binary_s
=> "\x00\x01\x02\x03\x04\x05"

Slide 44

Slide 44 text

class Gzip < BinData::Record
  # Known compression methods
  DEFLATE = 8

  endian :little

  uint16 :ident, asserted_value: 0x8b1f
  uint8 :compression_method, asserted_value: DEFLATE

  bit3 :freserved, asserted_value: 0
  bit1 :fcomment
  bit1 :ffile_name
  bit1 :fextra
  bit1 :fcrc16
  bit1 :ftext

  uint32 :mtime
  uint8 :extra_flags
  uint8 :os

  struct :extra, onlyif: -> { fextra.nonzero? } do
    uint16 :len
    string :data, read_length: :len
  end
  stringz :file_name, onlyif: -> { ffile_name.nonzero? }
  stringz :comment, onlyif: -> { fcomment.nonzero? }
  uint16 :crc16, onlyif: -> { fcrc16.nonzero? }

  # ignore rest so that we don't load everything into memory
end

def original_filename(file_path)
  File.open(file_path, "rb") do |f|
    header = f.read(10)
    # "L<" reads MTIME as little endian, the way gzip stores it, on any host
    magic, cm, flg, mtime, xfl, os = header.unpack("H4H2B8L<H2C")

    (magic == "1f8b") || raise("Invalid gzip header")
    (cm == "08") || raise("Unknown compression method")

    # B8 returns the flag bits MSB first: 3 reserved bits, then the five flags
    _, _, _, fcomment, fname, fextra, fhcrc, ftext = flg.split("")

    if fextra == "1"
      xlen = f.read(2).unpack1("S<") # XLEN is little endian as well
      f.seek(xlen, IO::SEEK_CUR)
    end

    f.gets("\x00").unpack1("Z*") if fname == "1"
  end
end

Slide 45

Slide 45 text

Real life examples

Slide 46

Slide 46 text

class Mbus::BCD4 < BinData::Primitive
  bit4 :d3
  bit4 :d4
  bit4 :d1
  bit4 :d2

  def set(value)
    self.d1 = (value / 10**3) % 10
    self.d2 = (value / 10**2) % 10
    self.d3 = (value / 10**1) % 10
    self.d4 = (value / 10**0) % 10
  end

  def get
    d1 * 10**3 +
      d2 * 10**2 +
      d3 * 10**1 +
      d4 * 10**0
  end
end
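
For comparison, the same 4-digit packed BCD value can be handled with pack/unpack alone. This sketch assumes the layout reads as on the slide (first byte holds the tens and ones digits, second byte the thousands and hundreds); bcd4_decode and bcd4_encode are made-up names, and the input is assumed to be in the range 0..9999.

```ruby
# Stdlib-only sketch: each "H2" is one byte as two hex digits, which for BCD
# are simply the two decimal digits. The low-order byte comes first.
def bcd4_decode(binary)
  binary.unpack("H2 H2").reverse.join.to_i
end

def bcd4_encode(value)
  digits = format("%04d", value)               # 1234 -> "1234"
  [digits[2, 2], digits[0, 2]].pack("H2 H2")   # bytes 0x34, 0x12
end

p bcd4_encode(1234).bytes               # [52, 18], i.e. 0x34 0x12
puts bcd4_decode(bcd4_encode(1234))     # 1234
```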

Slide 47

Slide 47 text

class Mbus::DataInformationField < BinData::Record
  bit1 :extension_bit
  bit1 :storage_number_lsb
  bit2 :function_field_code
  bit4 :data_field_code
  data_information_field_extension :data_information_field_extension, onlyif: -> { extension? }

  # ...
end

class Mbus::DataInformationFieldExtension < BinData::Record
  bit1 :extension_bit
  bit1 :device_unit
  bit2 :tariff
  bit4 :storage_number
  data_information_field_extension :data_information_field_extension, onlyif: -> { extension? }

  # ...
end

data_information_field.data_information_field_extension.data_information_field_extension....
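
The chained extension bytes can also be walked with plain masks and shifts. This is a stdlib-only sketch that collects the extensions into a flat array instead of nesting them; the field names follow the slide, while the method name parse_data_information and the return shape are assumptions for illustration.

```ruby
# Reads one DIF byte, then DIFE bytes for as long as the extension bit is set.
def parse_data_information(bytes)
  first, *rest = bytes
  dif = {
    extension_bit: first >> 7,
    storage_number_lsb: (first >> 6) & 0b1,
    function_field_code: (first >> 4) & 0b11,
    data_field_code: first & 0b1111
  }

  extensions = []
  more = dif[:extension_bit] == 1
  while more
    byte = rest.shift
    raise ArgumentError, "extension bit set but no byte follows" if byte.nil?

    extensions << {
      extension_bit: byte >> 7,
      device_unit: (byte >> 6) & 0b1,
      tariff: (byte >> 4) & 0b11,
      storage_number: byte & 0b1111
    }
    more = (byte >> 7) == 1
  end

  { dif: dif, extensions: extensions }
end

result = parse_data_information([0x84, 0x81, 0x01])
puts result[:dif][:data_field_code]  # 4
puts result[:extensions].length      # 2
```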

Slide 49

Slide 49 text

class Mbus::DataTypeI < BinData::Primitive
  bit1 :leap_year
  bit1 :dst
  bit6 :second

  bit1 :invalid
  bit1 :dst_direction
  bit6 :minute

  bit3 :day_of_week
  bit5 :hour

  bit3 :year_lower
  bit5 :day

  bit4 :year_upper
  bit4 :month

  bit2 :dst_offset
  bit6 :week_number

  def get
    Time.new(year, month, day, hour, minute, second)
  end

  def year
    2000 + year_upper * 2**3 + year_lower
  end
end
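
To show what the bit fields above amount to, here is a stdlib-only sketch that decodes the same six bytes with masks and shifts. It follows the field order on the slide and only reads the fields needed to build a Time; the method name decode_type_i and the sample bytes are made up for illustration.

```ruby
# Each byte is split exactly as in the record: high bits first, low bits last.
def decode_type_i(binary)
  b1, b2, b3, b4, b5, _b6 = binary.unpack("C6")

  second     = b1 & 0b0011_1111
  minute     = b2 & 0b0011_1111
  hour       = b3 & 0b0001_1111
  day        = b4 & 0b0001_1111
  year_lower = b4 >> 5
  month      = b5 & 0b0000_1111
  year_upper = b5 >> 4

  # The 7-bit year offset is split across two bytes: upper 4 bits, lower 3 bits.
  Time.new(2000 + year_upper * 2**3 + year_lower, month, day, hour, minute, second)
end

# 2025-03-15 13:45:30 -> year offset 25 = 0b0011_001 (upper 3, lower 1)
sample = [0x1E, 0x2D, 0x0D, 0x2F, 0x33, 0x00].pack("C6")
puts decode_type_i(sample)
```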

Slide 50

Slide 50 text

class Mbus::AuthenticationAndFragmentationLayer < BinData::Record
  include Mbus::CI

  # AFL-Length [OMSvol2:4.1.2 6.2.1]
  uint8 :afll
  # Fragmentation Control Field [OMSvol2:4.1.2 6.2.1]
  fragmentation_control_field :fcl
  # Message Control Field [OMSvol2:4.1.2 6.2.1]
  uint8 :mcl, onlyif: -> { fcl.mclp == 1 }
  # Key Information Field [OMSvol2:4.1.2 6.2.1]
  int16le :ki, onlyif: -> { fcl.kip == 1 }
  # Message Counter Field [OMSvol2:4.1.2 6.2.1]
  int32le :message_counter_c, onlyif: -> { fcl.mcrp == 1 }
  # Message Authentication Code [OMSvol2:4.1.2 6.2.1]
  string :cmac, length: 8, onlyif: -> { fcl.macp == 1 }
  # Message Length Field [OMSvol2:4.1.2 6.2.1]
  int16le :ml, onlyif: -> { fcl.mlp == 1 }

  uint8 :next_control_info
  count_bytes_remaining :bytes_remaining
  transport_layer :transport_layer, read_length: :bytes_remaining, onlyif: :ci_transport_layer?

  # ...
end

Slide 51

Slide 51 text

Recap
Definitely doable in Ruby
Even quite fun!
Simple operations -> pack/unpack
BinData saved our lives

Slide 53

Slide 53 text

Thanks!