Towards online profiling of Erlang systems

Recent releases of Erlang/OTP introduced features that can be used to improve profiling tools for systems running on the BEAM virtual machine. We discuss the need to improve profiling tools with Erlang-style concurrency in mind, so that they help in understanding the performance of message passing and the utilization of processes. We propose a new approach to implementing the crucial element of such tools: the mechanism for concurrent counter updates. To demonstrate the limitations of current tools in this area and to verify the proposed approach, we present the results of a synthetic benchmark. The results clearly show that the proposed approach is a step towards a new generation of online profiling tools.

Michał Ślaski

August 18, 2019

Transcript

  1. 1.

    TOWARDS ONLINE PROFILING • Michał Ślaski @ Erlang Solutions • Wojciech Turek @ AGH University • Erlang Workshop, 18 August 2019
  2. 2.

    ABOUT ME • AGH'2005 • Workshop'2006 • Tech Lead'2012 • Lambda Days'2020

    From HTTP to HTML: Experiences in Web Based Service Applications. ACM Sigplan Erlang Workshop, Portland, Oregon, September 16, 2006. Francesco Cesarini, Lukas Larsson, Michal Slaski
  3. 5.

    TOWARDS ONLINE PROFILING • rapid analysis of data rather than delayed post-processing • measurements taken from a live system
  4. 7.

    WHAT TO PROFILE? • time spent on executing particular functions

    • memory allocation volumes • garbage collection events • number of processes created and terminated • mailbox sizes
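
A few of the quantities listed on this slide can be read from a live node with built-in functions. The sketch below is not from the talk; the module name `snapshot_sketch` and the map keys are hypothetical, and it only illustrates polling with `erlang:statistics/1`, `erlang:system_info/1`, `erlang:memory/1` and `erlang:process_info/2`.

```erlang
-module(snapshot_sketch).
-export([snapshot/0, mailbox_size/1]).

%% Hypothetical helper: one point-in-time reading of node-wide figures.
snapshot() ->
    %% Returns {NumberOfGCs, WordsReclaimed, 0} accumulated since node start.
    {NumGCs, WordsReclaimed, _} = erlang:statistics(garbage_collection),
    #{process_count      => erlang:system_info(process_count),
      total_memory_bytes => erlang:memory(total),
      gc_runs            => NumGCs,
      gc_words_reclaimed => WordsReclaimed}.

%% Mailbox size of a single process, or 'undefined' if it is not alive.
mailbox_size(Pid) ->
    case erlang:process_info(Pid, message_queue_len) of
        {message_queue_len, Len} -> Len;
        undefined -> undefined
    end.
```

Calling `snapshot_sketch:snapshot()` periodically and comparing consecutive maps gives rates rather than totals. Polling of this kind does not see individual events, which is why the later slides turn to tracing.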
  5. 8.

    MESSAGE PASSING PROFILE • which processes send or receive more messages than other processes • which process pairs communicate more often than other pairs • which processes send messages to pids of non-existing processes
  6. 9.

    FPROF • measure time spent executing functions • values stored in files for off-line analysis • collected data can be visualised using kcachegrind http://blog.equanimity.nl/blog/2013/04/24/fprof-kcachegrind/
  7. 10.

    XPROF • visual tracer tracking execution time of functions •

    helps to decide which functions are of interest http://www.erlang-factory.com/euc2017/peter-gomori https://github.com/Appliscale/xprof
  8. 12.

    PERCEPT2 • explores how Erlang applications perform on multicore CPUs

    Multicore profiling for Erlang programs using percept2 https://doi.org/10.1145/2505305.2505311
  9. 14.

    ERLANG.PL • visualisation of system activity Analysis of distributed systems

    dynamics with Erlang performance lab https://journals.agh.edu.pl/csci/article/view/2752
  10. 16.

    PROBLEM • in some cases the tools can fail due to a sudden and intense increase in load • these unexpected states are the situations for which profiling is used in the first place, so the failure is problematic
  11. 17.

    HOW TOOLS WORK? • tools collect events occurring during execution

    • collected values are aggregated • source of events: • erlang:statistics/1 • erlang:trace/3
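
As an illustration of the tracing source of events, the sketch below uses the calling process as the tracer for a worker's `send` events and counts the resulting trace messages. It is a minimal example under stated assumptions, not the setup used in the talk: the module name `send_trace_sketch`, the worker and the sink process are all hypothetical.

```erlang
-module(send_trace_sketch).
-export([count_sends/1]).

%% Hypothetical example: trace 'send' events of one worker process and
%% count them in the tracer process (here, the caller).
count_sends(N) ->
    Sink = spawn(fun sink/0),
    Worker = spawn(fun() ->
                       receive go -> ok end,
                       lists:foreach(fun(I) -> Sink ! {msg, I} end,
                                     lists:seq(1, N))
                   end),
    %% The caller becomes the tracer; returns the number of matched processes.
    1 = erlang:trace(Worker, true, [send]),
    MRef = erlang:monitor(process, Worker),
    Worker ! go,
    receive {'DOWN', MRef, process, Worker, _Reason} -> ok end,
    %% Wait until all trace messages from Worker have reached the tracer.
    DRef = erlang:trace_delivered(Worker),
    receive {trace_delivered, Worker, DRef} -> ok end,
    exit(Sink, kill),
    drain(Worker, 0).

%% Count the {trace, Pid, send, Msg, To} messages waiting in the mailbox.
drain(Worker, Count) ->
    receive
        {trace, Worker, send, _Msg, _To} -> drain(Worker, Count + 1)
    after 0 ->
        Count
    end.

%% Receives and discards everything.
sink() ->
    receive _ -> sink() end.
```

Every traced event becomes one message in the tracer's mailbox, so the cost of observing grows with the load of the observed system.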
  12. 18.

    HOW TOOLS WORK? • aggregation of trace events can become an expensive task itself, which can influence the monitored system • some tools implement heuristics limiting the number of aggregated events
  13. 19.

    AGGREGATING EVENTS • three implementations of the counter were considered: • ETS counters - ets:update_counter/3 • NIF counters - C11 _Atomic long • R22 counters - counters:add/3
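
Two of the three counter variants can be tried from plain Erlang. The sketch below is an illustration only, not the benchmark from the talk: the module name `counter_sketch`, the table name and the key `events` are hypothetical. Several processes increment an ETS counter and an OTP 22 `counters` array concurrently, and the totals are read once all of them have finished.

```erlang
-module(counter_sketch).
-export([run/2]).

%% Hypothetical example: Procs processes each perform Incs increments
%% on an ETS counter and on a counters array.
run(Procs, Incs) ->
    Tab = ets:new(event_counts, [public, set, {write_concurrency, true}]),
    true = ets:insert(Tab, {events, 0}),
    %% Default 'atomics' mode; counters:new/2 also accepts
    %% 'write_concurrency', which trades read consistency for faster writes.
    Ref = counters:new(1, []),
    Parent = self(),
    Pids = [spawn_link(fun() ->
                           bump(Tab, Ref, Incs),
                           Parent ! {done, self()}
                       end) || _ <- lists:seq(1, Procs)],
    [receive {done, Pid} -> ok end || Pid <- Pids],
    EtsTotal = ets:lookup_element(Tab, events, 2),
    CountersTotal = counters:get(Ref, 1),
    true = ets:delete(Tab),
    {EtsTotal, CountersTotal}.

bump(_Tab, _Ref, 0) ->
    ok;
bump(Tab, Ref, N) ->
    _ = ets:update_counter(Tab, events, 1),  % atomic increment in the table
    ok = counters:add(Ref, 1, 1),            % atomic add on index 1
    bump(Tab, Ref, N - 1).
```

For example, `counter_sketch:run(4, 1000)` returns `{4000, 4000}`, since both kinds of increment are atomic. The NIF variant with a C11 `_Atomic long` needs native code and is not shown here.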
  14. 21.

    COLLECTING EVENTS • send events to process or port •

    aggregate with R22 or ETS counters • call a NIF module implementing tracer behavior • aggregate with NIF counters
  15. 22.

    SCHEDULER UTILIZATION

    # messages | tracer process, ETS counters | tracer process, R22 counters | tracer module, NIF counters | no tracer, no counters
    8M         | 91.18%                       | 93.52%                       | 14.30%                      | 14.15%
    32M        | failure                      | failure                      | 99.94%                      | 99.94%
  16. 24.

    CONCLUSIONS • increment counters in the context of processes being traced, which is possible with erl_tracer • overhead is spread across all available schedulers • overhead slows down the system, but also makes it possible to observe the system
  17. 25.

    TOOLS gcprof eprof cprof recon eflame eflame2 eep redbug visualixir

    erlubi looking_glass fprof xprof erlyberly percept percept2 wombat erlangpl