Let there be light

In this talk we're going to explore the boot process of the Erlang virtual machine. We'll trace the code path from the beginning of the C main function until the Application.start/2 callback is executed. We'll see how the C code interacts with the Erlang code, how the Erlang code is loaded, what is the first Erlang code to run, and finally how applications are started.

Michał Muskała

April 08, 2019

Transcript

  1. LET THERE BE LIGHT
    From nothing to a running application

  2. ~ λ erl

  3. ~ λ erl
    Erlang/OTP 21 [erts-10.0] [source] [64-bit] [smp:8:8] [ds:…]
    Eshell V10.0 (abort with ^G)
    1>

  4. ~ λ time erl -s erlang halt
    Erlang/OTP 21 [erts-10.0] [source] [64-bit] [smp:8:8] [ds:…]
    0.20 real 0.13 user 0.06 sys

  5. MICHAŁ MUSKAŁA
    http://michal.muskala.eu/
    https://github.com/michalmuskala/
    @michalmuskala

  6. int main(int argc, char ** argv)

  7. #!/bin/sh

  8. bin/erl
    #!/bin/sh
    # …
    #
    ROOTDIR="/opt/erlang/21.0"
    BINDIR=$ROOTDIR/erts-10.0/bin
    EMU=beam
    PROGNAME=`echo $0 | sed 's/.*\///'`
    export EMU
    export ROOTDIR
    export BINDIR
    export PROGNAME
    exec "$BINDIR/erlexec" ${1+"$@"}

  9. erlexec

  10. erlexec
    • Merge ERL_AFLAGS, ERL_FLAGS, ERL_ZFLAGS env vars into argv
    • Expand -args_file into argv
    • Validate arguments, expand for compatibility, add -home
    • Consume -env to system environment
    • All + options are either consumed or translated to -;
    those that are left lead to an error in the next step
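
The first bullet can be modelled in a few lines. This is a rough Python sketch, not erlexec's C code: it assumes (following the erl man page) that ERL_AFLAGS is placed before the user's arguments and that ERL_FLAGS and ERL_ZFLAGS are placed after them, it uses shlex as a stand-in for erlexec's own splitting, and it ignores the special treatment of -extra. The name merge_erl_flags is made up for the example.

```python
import shlex


def merge_erl_flags(argv, env):
    """Fold ERL_AFLAGS, ERL_FLAGS and ERL_ZFLAGS into an argument list.

    Assumed order: program, ERL_AFLAGS, user arguments, ERL_FLAGS, ERL_ZFLAGS.
    """
    def split(name):
        # shlex.split is an assumption; erlexec has its own tokenizer.
        return shlex.split(env.get(name, ""))

    program, user_args = argv[0], list(argv[1:])
    return ([program] + split("ERL_AFLAGS") + user_args
            + split("ERL_FLAGS") + split("ERL_ZFLAGS"))


merged = merge_erl_flags(
    ["erl", "-s", "erlang", "halt"],
    {"ERL_AFLAGS": "-emu_args", "ERL_ZFLAGS": "-noshell"},
)
print(merged)
```

With the environment above, -emu_args ends up in front of the user's -s erlang halt and -noshell ends up behind it.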

  11. erlexec
    • Custom malloc with +MYm or ERL_MALLOC_LIB - only libc value
    supported
    • -smp to enable/disable multithreading -
    OTP 21 removed the single-threaded implementation
    • -instr expands to +Mim true that leads to a “bad "Mi" parameter”
    • Gets CPU information (sysctl, NUMA) and immediately discards it

  12. beam.smp

  13. erl_main.c
    int
    main(int argc, char **argv)
    {
        erl_start(argc, argv);
        return 0;
    }

  15. erl_init.c - early_init
    • read CPU topology
    • configure time monitoring for time warp mode
    • start thread progress services
    (see https://github.com/erlang/otp/tree/master/erts/emulator/internal_doc)
    • start allocators, pollset
    • parse CPU topology, thread count, allocators and system-specific arguments

  16. erl_init.c
    • Parse various arguments (+sbwtdcpu none +SDPcpu 50:25 +MBrsbcst 50),
    see http://erlang.org/doc/man/erl.html and http://erlang.org/doc/man/erts_alloc.html
    • Set stack size - 1 MB for regular schedulers, 320 kB for dirty schedulers
    • Set up data structures for processes - links, monitors, process table, garbage collector
    state, tracing, install BIFs, load preloaded modules
    • first processes - erl_first_process_otp
    • system processes - code_purger, literal_area_collector, dirty_process_signal_handler

  17. erl_map.c
    void erts_init_map(void) {
        erts_init_trap_export(&hashmap_merge_trap_export,
                              am_maps, am_merge_trap, 1,
                              &maps_merge_trap_1);
        return;
    }

  18. PRELOADED MODULES
    erlang
    erts_internal
    erl_tracer
    erts_literal_area_collector
    erts_dirty_process_signal_handler
    atomics
    counters
    persistent_term
    erts_code_purger
    erl_init
    init
    prim_buffer
    prim_eval
    prim_file
    prim_inet
    zlib
    prim_zip
    erl_prim_loader

  19. preloaded.c
    const unsigned preloaded_size_erts_code_purger = 4984;
    const unsigned char preloaded_erts_code_purger[] = {
    0x46,0x4f,0x52,0x31,0x00,0x00,0x13,0x70, /* FOR1 ...p */
    0x42,0x45,0x41,0x4d,0x41,0x74,0x55,0x38, /* BEAMAtU8 */
    0x00,0x00,0x02,0xe2,0x00,0x00,0x00,0x44, /* .......D */
    0x10,0x65,0x72,0x74,0x73,0x5f,0x63,0x6f, /* .erts_co */
    0x64,0x65,0x5f,0x70,0x75,0x72,0x67,0x65, /* de_purge */
    0x72,0x05,0x73,0x74,0x61,0x72,0x74,0x06, /* r.start. */
    0x65,0x72,0x6c,0x61,0x6e,0x67,0x04,0x73, /* erlang.s */
    0x65,0x6c,0x66,0x08,0x72,0x65,0x67,0x69, /* elf.regi */
    0x73,0x74,0x65,0x72,0x04,0x74,0x72,0x75, /* ster.tru */
    0x65,0x09,0x74,0x72,0x61,0x70,0x5f,0x65, /* e.trap_e */
    0x78,0x69,0x74,0x0c,0x70,0x72,0x6f,0x63, /* xit.proc */
    0x65,0x73,0x73,0x5f,0x66,0x6c,0x61,0x67, /* ess_flag */
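
The bytes on this slide are an ordinary BEAM file embedded as a C array, so they can be decoded by hand. The Python sketch below copies the 96 bytes shown above and reads the IFF-style header plus the start of the AtU8 (atom table) chunk. It assumes the layout used by this OTP 21 era file, where each atom is stored as a one-byte length followed by its UTF-8 name; parse_beam_prefix is a name made up for the example.

```python
import struct

# The 96 bytes visible on the slide, written as a bytes literal.
data = (
    b"FOR1\x00\x00\x13\x70"          # IFF magic + form size
    b"BEAM"                          # form type
    b"AtU8\x00\x00\x02\xe2"          # first chunk id + chunk size
    b"\x00\x00\x00\x44"              # number of atoms in the chunk
    b"\x10erts_code_purger\x05start\x06erlang\x04self"
    b"\x08register\x04true\x09trap_exit\x0cprocess_flag"
)


def parse_beam_prefix(blob):
    """Decode the file header and as many complete atoms as the prefix holds."""
    magic, form_size, form_type = struct.unpack_from(">4sI4s", blob, 0)
    chunk_id, chunk_size, atom_count = struct.unpack_from(">4sII", blob, 12)
    atoms, pos = [], 24
    while pos < len(blob):
        length = blob[pos]
        name = blob[pos + 1:pos + 1 + length]
        if len(name) < length:       # atom cut off by the end of the prefix
            break
        atoms.append(name.decode("utf-8"))
        pos += 1 + length
    return {
        "magic": magic, "form_size": form_size, "form_type": form_type,
        "chunk_id": chunk_id, "chunk_size": chunk_size,
        "atom_count": atom_count, "atoms": atoms,
    }


info = parse_beam_prefix(data)
print(info["form_size"] + 8)   # form size plus the 8 header bytes
print(info["atoms"])
```

The form size plus the 8 bytes of magic and size field gives 4984, which matches preloaded_size_erts_code_purger on the slide, and the first atom is the module name itself.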

  20. SYSTEM PROCESSES
    • erts_literal_area_collector - when a module is unloaded, garbage-collects
    all processes so that no literal terms from that module are left behind
    • erts_code_purger - when a new version of a module is loaded and the old
    one is purged, any process still using the old version has to be killed
    • erts_dirty_process_signal_handler - the VM sends “internal” messages to
    processes, but processes running dirty code can’t handle them, so
    special processes handle them on their behalf

  21. beam_emu.c - process_main
    • Implements the main process loop
    • A direct-threading bytecode interpreter
    • Huge-ass function with goto all over the place
    • Generated from pseudo-C by perl scripts into C with macros on top of macros
    • Implements bytecode different from BEAM files - translation done by the loader
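
The last bullet is what makes direct threading possible: at load time the portable opcodes are replaced by direct references to the code that implements them, so the main loop never looks an opcode up. Python has no computed goto, so the sketch below is only an analogy, with function references playing the role of label addresses. The two opcodes (push, add) and all names are invented for the example and have nothing to do with the real BEAM instruction set.

```python
def op_push(vm, value):
    vm["stack"].append(value)


def op_add(vm):
    right = vm["stack"].pop()
    left = vm["stack"].pop()
    vm["stack"].append(left + right)


# Lookup table used only by the loader, never by the main loop.
HANDLERS = {"push": op_push, "add": op_add}


def load(portable_code):
    """Loader step: swap each opcode name for a direct handler reference."""
    return [(HANDLERS[name], args) for name, *args in portable_code]


def run(threaded_code):
    """Main loop: no opcode decoding, just call the stored handler."""
    vm = {"stack": []}
    for handler, args in threaded_code:
        handler(vm, *args)
    return vm["stack"]


portable = [("push", 2), ("push", 3), ("add",)]
threaded = load(portable)
result = run(threaded)
print(result)
```

The dictionary lookup happens once per instruction in load, which mirrors the cost being moved from the interpreter loop into the loader.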

  22. instrs.tab
    get_tl(Src, Tl) {
        Eterm* tmp_ptr = list_val($Src);
        $Tl = CDR(tmp_ptr);
    }
    i_get(Src, Dst) {
        $Dst = erts_pd_hash_get(c_p, $Src);
    }
    i_get_hash(Src, Hash, Dst) {
        $Dst = erts_pd_hash_get_with_hx(c_p, $Hash, $Src);
    }
    i_get_tuple_element(Src, Element, Dst) {
        Eterm* src = ADD_BYTE_OFFSET(tuple_val($Src), $Element);
        $Dst = *src;
    }

  23. erl_init.c - erl_first_process_otp
    • builds a dummy parent process
    • builds Erlang terms from remaining arguments in the dummy process
    • forces the dummy process to spawn a process using
    erl_init:start/2
    • cleans up the dummy process

  24. erl_init.erl
    start(Mod, BootArgs) ->
        %% Load the static nifs
        zlib:on_load(),
        erl_tracer:on_load(),
        prim_buffer:on_load(),
        prim_file:on_load(),
        %% Proceed to the specified boot module
        run(Mod, boot, BootArgs).

  25. init.erl
    boot(BootArgs) ->
        register(init, self()),
        process_flag(trap_exit, true),
        {Start0,Flags,Args} = parse_boot_args(BootArgs),
        %% We don't get to profile parsing of BootArgs
        case b2a(get_flag(profile_boot, Flags, false)) of
            false -> ok;
            true -> debug_profile_start()
        end,
        Start = map(fun prepare_run_args/1, Start0),
        boot(Start, Flags, Args).

  26. init.erl - boot/3
    • start init__boot__on_load_handler process
    • start boot process - coordinates boot process
    • start erl_prim_loader process - handles early requests to load modules
    • eval boot script
    • start all processes specified using -s or -run and -eval on CLI
    • init enters the main process loop to handle the user interface of the init module
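
How a -s or -run flag turns into a function call can be modelled as follows. This is a Python sketch, not the code in init.erl. The rules it encodes come from the init documentation: the function defaults to start, no extra words means an arity-0 call, otherwise the function receives one list argument, and -s passes atoms while -run passes strings. parse_start_flag and my_mod are hypothetical names.

```python
def parse_start_flag(flag, words):
    """Describe the call requested by '-s Mod [Func [Arg...]]' or '-run ...'.

    Returns (module, function, argument_list). Atoms and strings are modelled
    as ("atom", text) and ("string", text) pairs, since Python has no atoms.
    """
    if flag not in ("-s", "-run"):
        raise ValueError("unsupported flag: " + flag)
    module = words[0]
    function = words[1] if len(words) > 1 else "start"
    kind = "atom" if flag == "-s" else "string"
    extra = [(kind, word) for word in words[2:]]
    # No extra words: Mod:Func(). Otherwise: Mod:Func([Arg1, Arg2, ...]).
    arguments = [extra] if extra else []
    return (module, function, arguments)


halt_call = parse_start_flag("-s", ["erlang", "halt"])
run_call = parse_start_flag("-run", ["my_mod", "main", "a", "b"])
default_call = parse_start_flag("-s", ["my_mod"])
print(halt_call)
print(run_call)
print(default_call)
```

Under these rules, the erl -s erlang halt command from the timing slide is simply a call to erlang:halt() with no arguments.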

  27. INIT OPTIONS
    -boot File : Absolute file name of the boot script.
    -boot_var Var Value : $Var in the boot script is expanded to Value.
    -loader LoaderMethod : efile, inet (Optional - default efile)
    -hosts [Node] : List of hosts from which we can boot. (Mandatory if -loader inet)
    -mode interactive : Auto load modules not needed at startup (default system behaviour).
    -mode embedded : Load all modules in the boot script, disable auto loading.
    -path : Override path in bootfile.
    -pa Path+ : Add my own paths first.
    -pz Path+ : Add my own paths last.
    -run : Start own processes.
    -s : Start own processes.

  28. BOOT SCRIPT
    • http://erlang.org/doc/man/script.html
    • Boot file is produced from a boot script using
    systools:script2boot/1
    • Instructions on how to start the release - by default one provided by
    OTP, when building releases, provided by distillery or relx
    • Loads all modules and starts all applications as required by the system

  29. bin/start.script
    {script,
     {"Erlang/OTP","22"},
     [{preLoaded,[atomics,…]},
      {progress,preloaded},
      {path,["$ROOT/lib/kernel/ebin","$ROOT/lib/stdlib/ebin"]},
      {primLoad,[error_handler,application,…]},
      {kernel_load_completed},
      {progress,kernel_load_completed},
      {path,["$ROOT/lib/kernel/ebin"]},
      {primLoad,[application_starter,…]},
      {path,["$ROOT/lib/stdlib/ebin"]},
      {primLoad,[array,base64,beam_lib,…]},
      {progress,modules_loaded},
      {path,["$ROOT/lib/kernel/ebin","$ROOT/lib/stdlib/ebin"]},
      {kernelProcess,heart,{heart,start,[]}},
      {kernelProcess,logger,{logger_server,start_link,[]}},
      {kernelProcess,application_controller,
       {application_controller,start,
        [{application,kernel,…}]}},
      {progress,init_kernel_started},
      {apply,{application,load,[{application,stdlib,…}]}},
      {progress,applications_loaded},
      {apply,{application,start_boot,[kernel,permanent]}},
      {apply,{application,start_boot,[stdlib,permanent]}},
      {apply,{c,erlangrc,[]}},
      {progress,started}]}.
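
The $ROOT prefixes in the path entries are placeholders that get filled in at boot, in the same way the -boot_var option is described in the init options. A small Python model of that expansion is shown here. It assumes a variable is only expanded at the very start of a path, and it borrows the ROOTDIR value from the bin/erl script shown earlier; expand_boot_vars is a made-up name, not a function in init.

```python
def expand_boot_vars(path, boot_vars):
    """Replace a leading $Var in a boot-script path with its value.

    Assumption: only a variable at the very start of the path is expanded.
    """
    for name, value in boot_vars.items():
        prefix = "$" + name
        if path == prefix or path.startswith(prefix + "/"):
            return value + path[len(prefix):]
    return path


boot_vars = {"ROOT": "/opt/erlang/21.0"}   # value of ROOTDIR in bin/erl
paths = ["$ROOT/lib/kernel/ebin", "$ROOT/lib/stdlib/ebin"]
expanded = [expand_boot_vars(p, boot_vars) for p in paths]
print(expanded)
```

Paths without a known variable prefix pass through unchanged, which is why a release can mix absolute paths and placeholder paths in one script.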

  30. application:load/2
    • locates the .app file in the code path
    • merges app env - from .config file, from CLI, from the .app file
    • finds the application callback module
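
The env merge in the second bullet boils down to three layers with a fixed precedence. The Python sketch below assumes the order given in the OTP application documentation: values from the .app file are the defaults, the .config file overrides them, and command-line values override both. The keys and values are invented for the example.

```python
def merge_app_env(app_file_env, config_file_env, cli_env):
    """Merge the three env sources, later sources winning over earlier ones."""
    merged = dict(app_file_env)       # defaults from the .app file
    merged.update(config_file_env)    # overridden by the .config file
    merged.update(cli_env)            # overridden by the command line
    return merged


env = merge_app_env(
    {"level": "info", "backends": ["console"]},   # hypothetical .app env
    {"level": "debug"},                           # hypothetical .config entry
    {"backends": ["file"]},                       # hypothetical CLI override
)
print(env)
```

Each key is resolved independently, so a key set only in the .app file survives even when the other two sources override its neighbours.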

  31. application:start/2
    • spawns a process to start the application
    • ends up calling into application_master:start_link
    • spawns a process, that spawns a process that executes the start
    callback
    • monitors the returned PID

  33. elixir

  34. ~ λ ERL_AFLAGS="-emu_args" elixir -e "System.halt()"
    Executing: /opt/erlang/21.0/erts-10.0/bin/beam.smp /opt/erlang/21.0/erts-10.0/bin/beam.smp
    -- -root /opt/erlang/21.0 -progname erl -- -home /Users/michalmuskala --
    -pa /opt/elixir/master/lib/elixir/bin/../lib/eex/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/elixir/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/ex_unit/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/iex/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/logger/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/mix/ebin
    -elixir ansi_enabled true -noshell -s elixir start_cli -extra -e System.halt()

  36. start_cli() ->
        {ok, _} = application:ensure_all_started(?MODULE),
        %% …
        'Elixir.Kernel.CLI':main(init:get_plain_arguments()).

  37. ~ λ ERL_AFLAGS="-emu_args" iex
    Executing: /opt/erlang/21.0/erts-10.0/bin/beam.smp /opt/erlang/21.0/erts-10.0/bin/beam.smp
    -- -root /opt/erlang/21.0 -progname erl -- -home /Users/michalmuskala --
    -pa /opt/elixir/master/lib/elixir/bin/../lib/eex/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/elixir/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/ex_unit/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/iex/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/logger/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/mix/ebin
    -elixir ansi_enabled true -noshell -user Elixir.IEx.CLI -extra --no-halt +iex

  39. mix

  40. #!/usr/bin/env elixir
    Mix.start
    Mix.CLI.main

  41. SUMMARY
    • A lot of VM services implemented in Erlang
    • Lots of cruft accumulated over the ages
    • A lot of indirection that probably could be removed, but nobody bothered
    • Elixir adds further layers on top of and underneath the OTP boot process
    • I discovered the igor module for renaming modules

  42. LET THERE BE LIGHT
    From nothing to a running application

  43. DYNAMIC CODE LOADING
    • process_info(Pid, error_handler), by default - error_handler
    • when a function is not defined - undefined_function/3
    • tries loading module through code or init
    • calls the newly loaded module
    • otherwise calls “magic” $handle_undefined_function function
    • otherwise crashes
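
The fallback order in this list can be traced with a toy model. The Python sketch below is not error_handler.erl: modules are plain dictionaries, "loading" just copies an entry from a pretend code path, and arity is ignored. ON_DISK, LOADED, greeter and proxy are all invented for the example.

```python
class Undef(Exception):
    """Stands in for Erlang's undef error."""


# Pretend code path: modules that exist but have not been loaded yet.
ON_DISK = {
    "greeter": {"hello": lambda name: "hello " + name},
    "proxy": {"$handle_undefined_function": lambda func, args: (func, args)},
}
LOADED = {}


def undefined_function(module, func, args):
    """Follow the fallback chain from the slide, one step at a time."""
    if module not in LOADED:
        if module not in ON_DISK:
            raise Undef((module, func, args))    # nothing to load: crash
        LOADED[module] = ON_DISK[module]         # try loading the module
    exports = LOADED[module]
    if func in exports:
        return exports[func](*args)              # call the newly loaded code
    fallback = exports.get("$handle_undefined_function")
    if fallback is not None:
        return fallback(func, args)              # the "magic" hook
    raise Undef((module, func, args))            # otherwise crash


greeting = undefined_function("greeter", "hello", ["joe"])
proxied = undefined_function("proxy", "anything", [1, 2])
print(greeting)
print(proxied)
```

A call into a module that cannot be found, or to a function that is missing from a module without the hook, ends in Undef, which corresponds to the final bullet.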

  44. error_handler.erl
    crash(Tuple) ->
        try erlang:error(undef)
        catch
            error:undef:Stacktrace ->
                Stk = [Tuple|tl(Stacktrace)],
                erlang:raise(error, undef, Stk)
        end.
