Building a MySQL Database Driver in Elixir

13d3aea89d4b5ba2cd1fa0c88d3038f5?s=47 Maqbool
November 14, 2019

Building a MySQL Database Driver in Elixir

Have you ever wondered what happens beneath the covers when you talk to your Database? Well, you are in for a treat! In this talk, we are going to uncover the dark magic behind Database Drivers. We will look at everything that is needed to talk to a database, query its data and transform it into the native data types in Elixir.

13d3aea89d4b5ba2cd1fa0c88d3038f5?s=128

Maqbool

November 14, 2019
Tweet

Transcript

  1. Building MySQL database driver in Elixir Mohd Maqbool Alam (@maqboolism)

    https://blog.maqbool.net https://t.me/elixirdelhi (Elixir Delhi)
  2. Erlang Ecosystem Foundation https://erlef.org/

  3. Goals • Learning about Elixir binary pattern matching • Building

    a ELF binary parser (readelf) • Making socket connections on the BEAM and learning more about the various :gen_tcp API • Understanding MySQL client-server protocol and convert its various packets into Elixir native types
  4. Binary pattern Matching Binary pattern matching is the powerful feature

    of Elixir for extracting bits and piece out of a binary string
  5. iex(1)> string = <<"Functional Conf 2019">> "Functional Conf 2019" iex(2)>

    <<first:: binary-size(10) , rest::binary>> = string "Functional Conf 2019" iex(3)> first "Functional"
  6. Unit and Size Type Unit integer 1 bit float 1

    bit binary 8 bit
  7. size * unit

  8. Modifiers signed integer unsigned (default) integer little integer, float, utf16,

    utf32 big (default) integer, float, utf16, utf32 native integer, utf16, utf32
  9. iex(1)> File.read!("foo.png") <<137, 80, 78, 71, 13, 10, 26, 10,

    0, 0, 0, 13, 73, 72, 68, 82, 0, 0, 3, 151, 0, 0, 0, 145, 8, 6, 0, 0, 0, 87, 36, 76, 101, 0, 0, 0, 4, 115, 66, 73, 84, 8, 8, 8, 8, 124, 8, 100, 136, 0, ...>> iex(2)> File.read!("bar.jpeg") <<255, 216, 255, 224, 0, 16, 74, 70, 73, 70, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 255, 219, 0, 67, 0, 5, 3, 4, 4, 4, 3, 5, 4, 4, 4, 5, 5, 5, 6, 7, 12, 8, 7, 7, 7, 7, 15, 11, 11, 9, ...>>
  10. defmodule ImageTyper do @png_signature <<137::size(8), 80::size(8), 78::size(8), 71::size(8), 13::size(8), 10::size(8),

    26::size(8), 10::size(8)>> @jpg_signature <<255::size(8), 216::size(8)>> def type(<<@png_signature, rest::binary>>), do: :png def type(<<@jpg_signature, rest::binary>>), do: :jpg def type(_), do: :unknown end credit: Elixir Docs
  11. ~ ⚡ readelf -h a.out ELF Header: Magic: 7f 45

    4c 46 02 01 01 00 00 00 00 00 00 00 00 00 Class: ELF64 Data: 2's complement, little endian Version: 1 (current) OS/ABI: UNIX - System V ABI Version: 0 Type: EXEC (Executable file) Machine: Advanced Micro Devices X86-64 Version: 0x1 Entry point address: 0x401040 Start of program headers: 64 (bytes into file) Start of section headers: 16360 (bytes into file) Flags: 0x0 Size of this header: 64 (bytes) Size of program headers: 56 (bytes) Number of program headers: 11 Size of section headers: 64 (bytes) Number of section headers: 30 Section header string table index: 29
  12. iex(1)> File.read!("a.out") |> Base.encode16 "7F454C4602010100000000000000000002003E000100000040104000000000004000000000000000E83F0000000000000 0000000400038000B0040001E001D000600000004000000400000000000000040004000000000004000400000000000680 2000000000000680200000000000008000000000000000300000004000000A802000000000000A802400000000000A8024 000000000001C000000000000001C000000000000000100000000000000010000000400000000000000000000000000400 00000000000004000000000003804000000000000380400000000000000100000000000000100000005000000001000000 000000000104000000000000010400000000000C501000000000000C501000000000000001000000000000001000000040

    00000002000000000000000204000000000000020400000000000500100000000000050010000000000000010000000000 0000100000006000000102E000000000000103E400000000000103E4000000000001402000000000000180200000000000 000100000000000000200000006000000202E000000000000203E400000000000203E400000000000D001000000000000D 00100000000000008000000000000000400000004000000C402000000000000C402400000000000C402400000000000440 00000000000004400000000000000040000000000000050E57464040000002820000000000000282040000000000028204 000000000003C000000000000003C00000000000000040000000000000051E574640600000000000000000000000000000 000000000000000000000000000000000000000000000000000000000100000000000000052E5746404000000102E00000 0000000103E400000000000103E400000000000F001000000000000F00100000000000001000000000000002F6C6962363 42F6C642D6C696E75782D7838362D36342E736F2E3200040000001000000001000000474E5500000000000300000002000 00000000000040000001400000003000000474E5500E40E42211785A2618243355D362220624CC6039F010000000100000 00100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 B0000001200000000000000000000000000000000000000120000001200000000000000000000000000000000000000300 000002000000000000000000000000000000000000000006C6962632E736F2E36007072696E7466005F5F6C6962635F737 46172745F6D61696E00474C4942435F322E322E35005F5F676D6F6E5F73746172745F5F000000000200020000000100010 0010000001000000000000000751A6909000002002 … "
  13. defmodule ReadElf do end defstruct [ :magic, :class, :endian, :version,

    :os_abi, :abi_version, :type, :machine, :version, :entry_point_address, :start_of_program_headers, :start_of_section_headers, :flags, :size_of_this_header, :size_of_program_headers, :number_of_program_headers, :size_of_section_headers, :number_of_section_headers, :section_header_string_table_index ] def parse( <<magic_number::binary-size(8), arch::binary-size(2), endiness::binary-size(2), elf_version::binary-size(2), padding::binary-size(16), e_type::binary-size(4), rest::binary>> ) do elf = %ReadELF{ magic: magic_number, version: elf_version, machine: arch } end def main(path) do path |> File.read! |> Base.encode16 |> parse |> IO.inspect end
  14. The Berkeley socket API • socket() • bind() • listen()

    • connect() • accept() • send() • recv() • close() • ...
  15. Credit: https://www.cse.iitk.ac.in/users/dheeraj/cs425/lec18.html

  16. :gen_tcp • connect/3,4 • accept/1,2 • close/1 • listen/2 •

    recv/2,3 • send/2 • ...
  17. host = ‘localhost’ port = 3306 opts = [packet: :raw,

    mode: :binary, active: false] {:ok, socket} = :gen_tcp.connect(host, port, 5000, opts) {:ok, #Port<0.181>} Connecting to a MySQL database Server using :gen_tcp erlang module
  18. Behaviours • OTP design pattern • Generic behaviour module(server) and

    Specific callback module(client) • Separation of concern • Easier to read and understand code written by other developers
  19. -module(pattern). -export([start/1, init/1]). start(Args) -> % Start the server. spawn(server,

    init, [Args]). init(Args) -> % Initialize the internal process state State = initialize_state(Args), loop(State). loop(State) -> % Receive and handle messages. receive {handle, Msg} -> NewState = handle(Msg, State), loop(NewState); stop -> terminate(State) % Stop the process. end. terminate(State) -> % Cleanup prior to termination. clean_up(State).
  20. Generic • Initializing the server state • The loop data

    • The client requests • Handling client requests • Contents of server reply • Cleaning up • Spawning the server • Storing the loop data • Sending the requests to the server • Sending replies to the client • Receiving server replies • Stopping the server Specific Designing for Scalability with Erlang/OTP
  21. Standard Erlang/OTP behaviours ❏ gen_server ❏ gen_statem ❏ gen_event ❏

    supervisor
  22. Example GenServer # Start the server (Generic) {:ok, pid} =

    GenServer.start_link(Stack, [:hello]) # This is the client GenServer.call(pid, :pop) #=> :hello GenServer.cast(pid, {:push, :world}) #=> :ok GenServer.call(pid, :pop) #=> :world Credit: Elixir Docs
  23. def handle_cast({:push, element}, state) do {:noreply, [element | state]} end

    # Callbacks (specific) def init(stack) do {:ok, stack} end def handle_cast({:push, element}, state) do {:noreply, [element | state]} end
  24. db_connection “Database connection behaviour and database connection pool designed for

    handling transaction, prepare/execute, cursors and client process describe/encode/decode”
  25. db_connection API (Behaviour) ★ start_link/2 ★ close/3 & close!/3 ★

    execute/4 & execute!/4 ★ status/2 ★ ... Generic ❏ connect/1 ❏ disconnect/2 ❏ ping/1 ❏ handle_prepare/3 ❏ handle_execute/4 ❏ ... Specific
  26. MySQL Client /Server Communication

  27. Protocol Basics • Structure of a standard packet ◦ 4

    bytes header + Packet body ◦ Packet length int<3> ◦ Sequence number int<1> ◦ Packet body int<n> ◦ <<payload_length::24-little, sequence_id::8, payload::binary>> = data • Protocol data types ◦ int, string • Connection & command phase
  28. None
  29. Protocol data types int<1> Fixed-length integers int<lenenc> Length-encoded integers string<fix>

    Fixed-length strings string<NUL> Null-terminated strings string<lenenc> Length-encoded strings string<EOF> End-of-file length strings
  30. Initial Handshake Packet (V10) 1 [0a] protocol version string[NUL] server

    version 4 connection id string[8] auth-plugin-data-part-1 1 [00] filler 2 capability flags (lower 2 bytes) if more data in the packet: 1 character set 2 status flags 2 capability flags (upper 2 bytes) if capabilities & CLIENT_PLUGIN_AUTH { 1 length of auth-plugin-data } else { 1 [00] } string[10] reserved (all [00]) if capabilities & CLIENT_SECURE_CONNECTION { string[$len] auth-plugin-data-part-2 ($len=MAX(13, length of auth-plugin-data - 8)) if capabilities & CLIENT_PLUGIN_AUTH { string[NUL] auth-plugin name }
  31. https://dev.mysql.com/doc/internals/en/connection-phase-packets.html#packet-Protocol::Handshake

  32. iex> <<version::little-integer-size(8), server_version::little-binary-size(6), 0, connection_id::little-integer-size(32), auth_plugin_data_part_1::little-binary-size(8), filler::little-integer-size(8), capability_flags_lower::little-integer-size(16), character_set::little-integer-size(8), status_flags::little-integer-size(16),

    capability_flags_upper::little-integer-size(16), auth_plugin_data_len::little-integer-size(8), reserved::little-binary-size(10), auth_plugin_data_part_2::little-binary-size(13), auth_plugin_name::binary>> = body iex> [auth_plugin_name, _] = String.split(auth_plugin_name, <<0>>)
  33. Handshake Response Packet 4 capability flags, CLIENT_PROTOCOL_41 always set 4

    max-packet size 1 character set string[23] reserved (all [0]) string[NUL] username if capabilities & CLIENT_PLUGIN_AUTH_LENENC_CLIENT_DATA { lenenc-int length of auth-response string[n] auth-response } else if capabilities & CLIENT_SECURE_CONNECTION { 1 length of auth-response string[n] auth-response } else { string[NUL] auth-response } if capabilities & CLIENT_CONNECT_WITH_DB { string[NUL] database } if capabilities & CLIENT_PLUGIN_AUTH { string[NUL] auth plugin name } if capabilities & CLIENT_CONNECT_ATTRS { lenenc-int length of all key-values lenenc-str key lenenc-str value if-more data in 'length of all key-values', more keys and value pairs }
  34. https://dev.mysql.com/doc/internals/en/connection-phase-packets.html#packet-Protocol::HandshakeResponse • capability_flags (4) -- capability flags of the client

    as defined in Protocol::CapabilityFlags • max_packet_size (4) -- max size of a command packet that the client wants to send to the server • character_set (1) -- connection's default character set as defined in Protocol::CharacterSet. • username (string.fix_len) -- name of the SQL account which client wants to log in -- this string should be interpreted using the character set indicated by character set field. • auth-response (string.NUL) -- opaque authentication response data generated by Authentication Method indicated by the plugin name field. • database (string.NUL) -- initail database for the connection -- this string should be interpreted using the character set indicated by character set field. • auth plugin name (string.NUL) -- the Authentication Method used by the client to generate auth-response value in this packet. This is an UTF-8 string. Fields
  35. iex> use Bitwise iex> capability_flags = 0x00000200 ||| 0x00008000 |||

    0x00080000 iex> max_packet_size = 65535 iex> charset = 0x21 iex> username = "myxql_test" iex> auth_response = <<0x00>> iex> client_auth_plugin = "caching_sha2_password" iex> payload = <<capability_flags::32-little, max_packet_size::32-little, charset, 0::8*23, username::binary, 0x00, auth_response::binary, client_auth_plugin::binary, 0x00>> iex> sequence_id = 1 iex> data = <<byte_size(payload)::24-little, sequence_id, payload::binary>>
  36. None
  37. Resources • Understanding MySQL Internals by Sasha Pachev • http://blog.plataformatec.com.br/2018/11/building-a-new-m

    ysql-adapter-for-ecto-part-i-hello-world/ • https://www.youtube.com/watch?v=lBaDmsFTJxY • https://andrealeopardi.com/posts/handling-tcp-connections -in-elixir/ • https://hexdocs.pm/db_connection/DBConnection.html