Let there be light

In this talk we're going to explore the boot process of the Erlang virtual machine. We'll trace the code path from the beginning of the C main function until the Application.start/2 callback is executed. We'll see how the C code interacts with the Erlang code, how the Erlang code is loaded, what is the first Erlang code to run, and finally how applications are started.

Michał Muskała

April 08, 2019

Transcript

  1. LET THERE BE LIGHT
    From nothing to a running application

  2. ~ λ erl
    Erlang/OTP 21 [erts-10.0] [source] [64-bit] [smp:8:8] [ds:…]
    Eshell V10.0 (abort with ^G)
    1>

  3. ~ λ time erl -s erlang halt
    Erlang/OTP 21 [erts-10.0] [source] [64-bit] [smp:8:8] [ds:…]
    0.20 real 0.13 user 0.06 sys

  4. MICHAŁ MUSKAŁA
    http://michal.muskala.eu/
    https://github.com/michalmuskala/
    @michalmuskala

  5. int main(int argc, char ** argv)

  6. bin/erl
    #!/bin/sh
    # …
    #
    ROOTDIR="/opt/erlang/21.0"
    BINDIR=$ROOTDIR/erts-10.0/bin
    EMU=beam
    PROGNAME=`echo $0 | sed 's/.*\///'`
    export EMU
    export ROOTDIR
    export BINDIR
    export PROGNAME
    exec "$BINDIR/erlexec" ${1+"$@"}

  7. erlexec
    • Merge ERL_AFLAGS, ERL_FLAGS, ERL_ZFLAGS env vars into argv
    • Expand -args_file into argv
    • Validate arguments, expand for compatibility, add -home
    • Consume -env to system environment
    • All + options are either consumed or translated to - options;
    those that are left lead to an error in the next step
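
The merge in the first bullet can be sketched in C. This is a simplified illustration, not the erlexec source: merge_flags and its helpers are hypothetical names, words are split on spaces and tabs only (no quoting), and the order (ERL_AFLAGS before the command line, ERL_FLAGS and then ERL_ZFLAGS after it) is an assumption based on the erl documentation, which puts ERL_AFLAGS at the beginning and the other two at the end. Special handling of -extra is left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy len bytes of s into a fresh NUL-terminated string stored in out[n].
 * Returns the new count; words that do not fit in cap are dropped. */
static int push_word(char **out, int n, int cap, const char *s, size_t len)
{
    if (n >= cap)
        return n;
    char *word = malloc(len + 1);
    assert(word != NULL);
    memcpy(word, s, len);
    word[len] = '\0';
    out[n] = word;
    return n + 1;
}

/* Split flags on spaces and tabs and append each word to out.
 * A NULL flags string (unset variable) contributes nothing. */
static int push_flags(char **out, int n, int cap, const char *flags)
{
    const char *p = flags;
    while (p != NULL && *p != '\0') {
        while (*p == ' ' || *p == '\t')
            p++;
        const char *start = p;
        while (*p != '\0' && *p != ' ' && *p != '\t')
            p++;
        if (p > start)
            n = push_word(out, n, cap, start, (size_t)(p - start));
    }
    return n;
}

/* Hypothetical merge: argv[0], ERL_AFLAGS, command line, ERL_FLAGS,
 * ERL_ZFLAGS. Returns the number of entries written to out. */
int merge_flags(int argc, char **argv,
                const char *aflags, const char *flags, const char *zflags,
                char **out, int cap)
{
    int n = 0;
    assert(argc >= 1);
    n = push_word(out, n, cap, argv[0], strlen(argv[0]));
    n = push_flags(out, n, cap, aflags);
    for (int i = 1; i < argc; i++)
        n = push_word(out, n, cap, argv[i], strlen(argv[i]));
    n = push_flags(out, n, cap, flags);
    n = push_flags(out, n, cap, zflags);
    return n;
}
```

A caller would pass getenv("ERL_AFLAGS"), getenv("ERL_FLAGS") and getenv("ERL_ZFLAGS"); the values are taken as parameters here only to keep the sketch easy to exercise.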

  8. erlexec
    • Custom malloc with +MYm or ERL_MALLOC_LIB - only libc value
    supported
    • -smp to enable/disable multithreading - OTP 21 removed the
    single-threaded implementation
    • -instr expands to +Mim true, which leads to a “bad "Mi" parameter” error
    • Gets CPU information (sysctl, NUMA) and immediately discards it

  9. erl_main.c
    int
    main(int argc, char **argv)
    {
    erl_start(argc, argv);
    return 0;
    }

  10. erl_init.c - early_init
    • read CPU topology
    • configure time monitoring for time warp mode
    • start thread progress services
    (https://github.com/erlang/otp/tree/master/erts/emulator/internal_doc)
    • start allocators, pollset
    • parse CPU topology, thread count, allocators and system-specific arguments

  11. erl_init.c
    • Parse various arguments (+sbwtdcpu none +SDPcpu 50:25 +MBrsbcst 50),
    see http://erlang.org/doc/man/erl.html and http://erlang.org/doc/man/erts_alloc.html
    • Set stack size - 1 MB for regular schedulers, 320 kB for dirty schedulers
    • Set up data structures for processes - links, monitors, process table, garbage collector
    state, tracing, install BIFs, load preloaded modules
    • first processes - erl_first_process_otp
    • system processes - code_purger, literal_area_collector, dirty_process_signal_handler

  12. erl_map.c
    void erts_init_map(void) {
    erts_init_trap_export(&hashmap_merge_trap_export,
    am_maps, am_merge_trap, 1,
    &maps_merge_trap_1);
    return;
    }

  13. PRELOADED MODULES
    erlang
    erts_internal
    erl_tracer
    erts_literal_area_collector
    erts_dirty_process_signal_handler
    atomics
    counters
    persistent_term
    erts_code_purger
    erl_init
    init
    prim_buffer
    prim_eval
    prim_file
    prim_inet
    zlib
    prim_zip
    erl_prim_loader

  14. preloaded.c
    const unsigned preloaded_size_erts_code_purger = 4984;
    const unsigned char preloaded_erts_code_purger[] = {
    0x46,0x4f,0x52,0x31,0x00,0x00,0x13,0x70, /* FOR1 ...p */
    0x42,0x45,0x41,0x4d,0x41,0x74,0x55,0x38, /* BEAMAtU8 */
    0x00,0x00,0x02,0xe2,0x00,0x00,0x00,0x44, /* .......D */
    0x10,0x65,0x72,0x74,0x73,0x5f,0x63,0x6f, /* .erts_co */
    0x64,0x65,0x5f,0x70,0x75,0x72,0x67,0x65, /* de_purge */
    0x72,0x05,0x73,0x74,0x61,0x72,0x74,0x06, /* r.start. */
    0x65,0x72,0x6c,0x61,0x6e,0x67,0x04,0x73, /* erlang.s */
    0x65,0x6c,0x66,0x08,0x72,0x65,0x67,0x69, /* elf.regi */
    0x73,0x74,0x65,0x72,0x04,0x74,0x72,0x75, /* ster.tru */
    0x65,0x09,0x74,0x72,0x61,0x70,0x5f,0x65, /* e.trap_e */
    0x78,0x69,0x74,0x0c,0x70,0x72,0x6f,0x63, /* xit.proc */
    0x65,0x73,0x73,0x5f,0x66,0x6c,0x61,0x67, /* ess_flag */
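
The dump above is an ordinary BEAM file embedded as a C array: an IFF-style header ("FOR1", a big-endian 32-bit size, "BEAM") followed by chunks. A minimal sketch of reading it, assuming the atom chunk ("AtU8") comes first as it does in this dump, and that the first atom is the module name stored with a one-byte length prefix. The function name beam_module_name is hypothetical; this is not how the emulator's loader is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a big-endian 32-bit integer. */
static uint32_t read_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Layout assumed by this sketch (byte offsets):
 *   0  "FOR1"      4  form size   8  "BEAM"
 *  12  "AtU8"     16  chunk size 20  atom count
 *  24  length of first atom      25  first atom (the module name)
 * Returns 0 on success, -1 if the buffer does not match this layout. */
int beam_module_name(const unsigned char *buf, size_t len,
                     uint32_t *form_size, char *name, size_t name_cap)
{
    assert(buf != NULL && form_size != NULL && name != NULL);
    if (len < 25)
        return -1;
    if (memcmp(buf, "FOR1", 4) != 0 || memcmp(buf + 8, "BEAM", 4) != 0)
        return -1;
    if (memcmp(buf + 12, "AtU8", 4) != 0)
        return -1;
    *form_size = read_be32(buf + 4);
    size_t n = buf[24];
    if (len < 25 + n || name_cap < n + 1)
        return -1;
    memcpy(name, buf + 25, n);
    name[n] = '\0';
    return 0;
}
```

Applied to the bytes on this slide, the form size is 0x1370 = 4976, which is the declared preloaded size of 4984 minus the 8 bytes taken by "FOR1" and the size field itself.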

  15. SYSTEM PROCESSES
    • erts_literal_area_collector - when a module is unloaded, GCs all
    processes to remove literal terms that belong to that module
    • erts_code_purger - when old code of a module is purged, any process
    still running that old version has to be killed
    • dirty_process_signal_handler - the VM sends “internal” messages to
    processes; processes running dirty code can’t handle them, so
    special processes do it for them

  16. beam_emu.c - process_main
    • Implements the main process loop
    • A direct-threading bytecode interpreter
    • Huge-ass function with goto all over the place
    • Generated from pseudo-C by perl scripts into C with macros on top of macros
    • Implements bytecode different from BEAM files - translation done by the loader
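
To make "direct threading" and "translation done by the loader" concrete, here is a toy interpreter in the same style. It is a sketch only: the four opcodes are invented for illustration and have nothing to do with the BEAM instruction set, stack bounds are not checked, and it relies on the GNU C "labels as values" extension (supported by GCC and Clang), so it is not ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a tiny stack machine. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

#define MAX_CODE  64
#define MAX_STACK 16
#define NEXT()    goto **ip++   /* jump straight to the next handler */

long run_threaded(const long *bytecode, size_t len)
{
    /* GNU C extension: &&label yields the address of a label. */
    static void *handlers[] = { &&do_push, &&do_add, &&do_mul, &&do_halt };
    void *threaded[MAX_CODE];
    long stack[MAX_STACK];
    long *sp = stack;
    void **ip = threaded;

    assert(len <= MAX_CODE);

    /* "Loader": rewrite each opcode into the address of its handler.
     * Operands are copied through unchanged. */
    for (size_t i = 0; i < len; i++) {
        long op = bytecode[i];
        assert(op >= OP_PUSH && op <= OP_HALT);
        threaded[i] = handlers[op];
        if (op == OP_PUSH) {
            assert(i + 1 < len);
            i++;
            threaded[i] = (void *)(intptr_t)bytecode[i];
        }
    }

    NEXT();

do_push:
    *sp++ = (long)(intptr_t)*ip++;   /* operand follows the instruction */
    NEXT();
do_add:
    sp--;
    sp[-1] += sp[0];
    NEXT();
do_mul:
    sp--;
    sp[-1] *= sp[0];
    NEXT();
do_halt:
    return sp[-1];
}
```

There is no central dispatch loop or switch: every handler ends by jumping directly to the address stored in the instruction stream, which is why the executed code differs from the opcodes in the file.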

  17. instrs.tab
    get_tl(Src, Tl) {
    Eterm* tmp_ptr = list_val($Src);
    $Tl = CDR(tmp_ptr);
    }
    i_get(Src, Dst) {
    $Dst = erts_pd_hash_get(c_p, $Src);
    }
    i_get_hash(Src, Hash, Dst) {
    $Dst = erts_pd_hash_get_with_hx(c_p, $Hash, $Src);
    }
    i_get_tuple_element(Src, Element, Dst) {
    Eterm* src = ADD_BYTE_OFFSET(tuple_val($Src), $Element);
    $Dst = *src;
    }

  18. erl_init.c - erl_first_process_otp
    • builds a dummy parent process
    • builds Erlang terms from remaining arguments in the dummy process
    • forces the dummy process to spawn a process using
    erl_init:start/2
    • cleans up the dummy process

  19. erl_init.erl
    start(Mod, BootArgs) ->
    %% Load the static nifs
    zlib:on_load(),
    erl_tracer:on_load(),
    prim_buffer:on_load(),
    prim_file:on_load(),
    %% Proceed to the specified boot module
    run(Mod, boot, BootArgs).

  20. init.erl
    boot(BootArgs) ->
    register(init, self()),
    process_flag(trap_exit, true),
    {Start0,Flags,Args} = parse_boot_args(BootArgs),
    %% We don't get to profile parsing of BootArgs
    case b2a(get_flag(profile_boot, Flags, false)) of
    false -> ok;
    true -> debug_profile_start()
    end,
    Start = map(fun prepare_run_args/1, Start0),
    boot(Start, Flags, Args).

  21. init.erl - boot/3
    • start init__boot__on_load_handler process
    • start boot process - coordinates boot process
    • start erl_prim_loader process - handles early requests to load modules
    • eval boot script
    • start all processes specified using -s or -run and -eval on CLI
    • init enters the main process loop to handle the user interface of the init module

  22. INIT OPTIONS
    -boot File : Absolute file name of the boot script.
    -boot_var Var Value : $Var in the boot script is expanded to Value.
    -loader LoaderMethod : efile, inet (Optional - default efile)
    -hosts [Node] : List of hosts from which we can boot. (Mandatory if -loader inet)
    -mode interactive : Auto load modules not needed at startup (default system behaviour).
    -mode embedded : Load all modules in the boot script, disable auto loading.
    -path : Override path in bootfile.
    -pa Path+ : Add my own paths first.
    -pz Path+ : Add my own paths last.
    -run : Start own processes.
    -s : Start own processes.

  23. BOOT SCRIPT
    • http://erlang.org/doc/man/script.html
    • Boot file is produced from a boot script using
    systools:script2boot/1
    • Instructions on how to start the release - by default the one provided
    by OTP; when building releases, the one provided by distillery or relx
    • Loads all modules and starts all applications as required by the system

  24. bin/start.script
    {script,
    {"Erlang/OTP","22"},
    [{preLoaded,[atomics,…]},
    {progress,preloaded},
    {path,["$ROOT/lib/kernel/ebin","$ROOT/lib/stdlib/ebin"]},
    {primLoad,[error_handler,application,…]},
    {kernel_load_completed},
    {progress,kernel_load_completed},
    {path,["$ROOT/lib/kernel/ebin"]},
    {primLoad,[application_starter,…]},
    {path,["$ROOT/lib/stdlib/ebin"]},
    {primLoad,[array,base64,beam_lib,…]},
    {progress,modules_loaded},
    {path,["$ROOT/lib/kernel/ebin","$ROOT/lib/stdlib/ebin"]},
    {kernelProcess,heart,{heart,start,[]}},
    {kernelProcess,logger,{logger_server,start_link,[]}},
    {kernelProcess,application_controller,
    {application_controller,start,
    [{application,kernel,…}]}},
    {progress,init_kernel_started},
    {apply,{application,load,[{application,stdlib,…}]}},
    {progress,applications_loaded},
    {apply,{application,start_boot,[kernel,permanent]}},
    {apply,{application,start_boot,[stdlib,permanent]}},
    {apply,{c,erlangrc,[]}},
    {progress,started}]}.

  25. application:load/2
    • locates the .app file in the code path
    • merges app env - from .config file, from CLI, from the .app file
    • finds the application callback module

  26. application:start/2
    • spawns a process to start the application
    • ends up calling into application_master:start_link
    • spawns a process, that spawns a process that executes the start
    callback
    • monitors the returned PID

  27. ~ λ ERL_AFLAGS="-emu_args" elixir -e "System.halt()"
    Executing: /opt/erlang/21.0/erts-10.0/bin/beam.smp
    /opt/erlang/21.0/erts-10.0/bin/beam.smp -- -root /opt/erlang/21.0
    -progname erl -- -home /Users/michalmuskala --
    -pa /opt/elixir/master/lib/elixir/bin/../lib/eex/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/elixir/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/ex_unit/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/iex/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/logger/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/mix/ebin
    -elixir ansi_enabled true -noshell -s elixir start_cli
    -extra -e System.halt()

  29. start_cli() ->
    {ok, _} = application:ensure_all_started(?MODULE),
    %% …
    'Elixir.Kernel.CLI':main(init:get_plain_arguments()).
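
The -extra flag in the elixir command line is what feeds init:get_plain_arguments() here: everything after -extra is treated as plain arguments and left for the program rather than interpreted as emulator or init flags. A minimal C sketch of that split follows. The helper name plain_args_start is hypothetical, and the sketch ignores the other ways plain arguments can arise (for example after --).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the index of the first argument after "-extra", or argc when
 * "-extra" is absent (no plain arguments in this simplified model). */
int plain_args_start(int argc, char **argv)
{
    assert(argc >= 0 && argv != NULL);
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-extra") == 0)
            return i + 1;
    }
    return argc;
}
```

For the command line shown earlier, the arguments from that index onward are -e and System.halt(), which is the list handed to 'Elixir.Kernel.CLI':main.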

  30. ~ λ ERL_AFLAGS="-emu_args" iex
    Executing: /opt/erlang/21.0/erts-10.0/bin/beam.smp
    /opt/erlang/21.0/erts-10.0/bin/beam.smp -- -root /opt/erlang/21.0
    -progname erl -- -home /Users/michalmuskala --
    -pa /opt/elixir/master/lib/elixir/bin/../lib/eex/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/elixir/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/ex_unit/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/iex/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/logger/ebin
    /opt/elixir/master/lib/elixir/bin/../lib/mix/ebin
    -elixir ansi_enabled true -noshell -user Elixir.IEx.CLI
    -extra --no-halt +iex

  32. #!/usr/bin/env elixir
    Mix.start
    Mix.CLI.main

  33. SUMMARY
    • A lot of VM services implemented in Erlang
    • Lots of cruft accumulated over the ages
    • A lot of indirection that probably could be removed, but nobody bothered
    • Elixir adds more layers on top of and underneath the OTP boot process
    • I discovered the igor module for renaming modules

  34. LET THERE BE LIGHT
    From nothing to a running application

  35. DYNAMIC CODE LOADING
    • process_info(Pid, error_handler), by default - error_handler
    • when a function is not defined - undefined_function/3
    • tries loading module through code or init
    • calls the newly loaded module
    • otherwise calls “magic” $handle_undefined_function function
    • otherwise crashes

  36. error_handler.erl
    crash(Tuple) ->
    try erlang:error(undef)
    catch
    error:undef:Stacktrace ->
    Stk = [Tuple|tl(Stacktrace)],
    erlang:raise(error, undef, Stk)
    end.
