Exploring BEAM-based systems with erlang.pl

Processes running on the BEAM virtual machine share data through message passing. If a process receives messages faster than it can handle them, its mailbox grows, which over time can create a bottleneck. Similarly, distributed systems share data through network packets sent between nodes, which in some cases leads to network congestion. Visibility into message passing and network traffic therefore helps in understanding the behaviour of the entire system.

During this talk we will explore the behaviour of two systems under load: one based on Erlang (MongooseIM) and one based on Elixir (Phoenix Channels). We will demonstrate the erlang.pl tool and see whether its graphical representation of processes and clustering can help us learn something about the characteristics of the two systems.

Videos used in the presentation:
Phoenix Channels cluster view - https://www.youtube.com/watch?v=ybnQ0rre1Jc
Phoenix Channels node view - https://www.youtube.com/watch?v=ZD5WwGsBA7Q
MongooseIM cluster view - https://www.youtube.com/watch?v=BIbLOVNKZYw

Michał Ślaski

June 28, 2017

Transcript

  1. ERLANG.PL
    new coding perspective
    Erlang User Group Krakow, 28 June 2017

  2. DO YOU KNOW BEAM ?

  9. WHY BEAM
    • BEAM benefits relevant today:
    • concise code
    • reliable system
    • equal latency for all users
    • utilisation of multi core CPUs
    • traceability

  14. WHAT IS ERLANG.PL
    • Tool for the BEAM platform (a.k.a. Erlang VM)
    • Helps with system exploration
    • Exploits benefits of traceability
    • Aims at facilitating developer's work

  19. HISTORY
    • Proof of concept in Jun 2013
    • Open source since Feb 2017
    • Presented in Mar 2017 at EEF SF
    • Accepted for Google Summer of Code 2017

  24. WORKFLOW - STEP 1
    • observe dashboard
    • look for trends
    • observe that message traffic has increased
    • what is the root cause?

  25. Dashboard - mnesia executing transactions

  29. WORKFLOW - STEP 2
    if the root cause is external, we observe increased traffic from external nodes

  30. CLUSTER TRAFFIC
    Cluster traffic - generate load on Phoenix channels
    https://github.com/arkgil/phx_load_gen

  32. CLUSTER TRAFFIC
    Cluster traffic - generate load on MongooseIM XMPP
    https://github.com/esl/amoc

  36. WORKFLOW - STEP 3
    if the root cause is internal, we observe increased traffic between internal processes

  37. Message passing - WebSocket and Phoenix Channel

  42. WORKFLOW - STEP 4
    • view the supervision tree
    • find the process generating a lot of internal traffic
    • get to know its process_info/1
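
The inspection step above can also be tried by hand from an Erlang shell. The following is a minimal sketch, not erlang.pl's own code: the module name `proc_peek` and its function names are hypothetical. It uses `erlang:process_info/2` rather than the `process_info/1` named on the slide, because the OTP documentation describes the `/1` form as intended for debugging only.

```erlang
-module(proc_peek).   % hypothetical module name
-export([summary/1, busiest/1]).

%% Return a few process_info/2 items for a local Pid, or 'undefined'
%% if the process has already exited.
summary(Pid) when is_pid(Pid) ->
    Items = [registered_name, current_function, message_queue_len,
             reductions, memory],
    erlang:process_info(Pid, Items).

%% Return {QueueLength, Pid} for the N local processes with the longest
%% message queues, longest first. Processes that exit during the scan
%% are skipped because process_info/2 returns 'undefined' for them.
busiest(N) when is_integer(N), N >= 0 ->
    Lens = [{Len, Pid} || Pid <- erlang:processes(),
                          {message_queue_len, Len} <-
                              [erlang:process_info(Pid, message_queue_len)]],
    lists:sublist(lists:reverse(lists:sort(Lens)), N).
```

Calling `proc_peek:busiest(5).` gives candidate processes to look at, and `proc_peek:summary(Pid).` then shows what one of them is currently doing.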

  46. WORKFLOW - STEP 5
    • analyse the process' mailbox
    • observe the ordering of messages
    • inspect how processing a message changed the state
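
Without the tool, the mailbox and state checks listed above can be approximated from a shell. This is only a sketch under assumptions: the module name `mailbox_peek` and its functions are hypothetical, and `state_change/2` works only for OTP-compliant processes (such as a `gen_server`) that answer `sys:get_state/1`.

```erlang
-module(mailbox_peek).   % hypothetical module name
-export([mailbox/1, state_change/2]).

%% Return the messages currently waiting in the mailbox of a local Pid,
%% in arrival order, or [] if the process no longer exists.
mailbox(Pid) when is_pid(Pid) ->
    case erlang:process_info(Pid, messages) of
        {messages, Msgs} -> Msgs;
        undefined -> []
    end.

%% Send Msg to an OTP-compliant process and return its state before and
%% after. Messages from other senders that are handled in between also
%% contribute to the difference.
state_change(Pid, Msg) ->
    Before = sys:get_state(Pid),
    Pid ! Msg,
    After = sys:get_state(Pid),
    {Before, After}.
```

Both calls are snapshots taken at one moment, whereas the mailbox view in the talk follows messages over time.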

  50. WORKFLOW - STEP 6
    • inspect a flame graph
    • what keeps it busy?
    • where is most of the time spent?

  55. DESIGN PRINCIPLES
    • Intuitive navigation and visualisations
    • Simplify complex sets of trace events
    • Don't worry about production systems
    • Leverage WebGL as much as possible

  58. WHAT IS AVAILABLE
    • Dashboard, cluster, message passing and supervision tree views already implemented
    • Timeline tracking is work in progress

  63. HOW IT WORKS
    • spawn remote fun()
    • turn on tracer for all processes
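
The two bullets above could be realised roughly as follows. This is a sketch under stated assumptions, not erlang.pl's actual implementation: the module name `remote_trace_sketch` is hypothetical, the choice of the `send` and `'receive'` trace flags is an assumption, and the module is assumed to be already loaded on the remote node so that the spawned fun can run there.

```erlang
-module(remote_trace_sketch).   % hypothetical module name
-export([start/1, start/2]).

-define(FLAGS, [send, 'receive']).   % assumed choice of trace flags

%% Trace Node and deliver the trace messages to the calling process.
start(Node) ->
    start(Node, self()).

%% Spawn a fun on Node. The spawned process enables message tracing for
%% everything running there and forwards each trace message to Collector.
start(Node, Collector) when is_atom(Node), is_pid(Collector) ->
    spawn(Node, fun() ->
        %% No tracer option is given, so this process receives the
        %% trace messages.
        erlang:trace(all, true, ?FLAGS),
        %% If the collector runs on the traced node, stop tracing it;
        %% otherwise every forwarded message would be traced again.
        case node(Collector) =:= node() of
            true  -> erlang:trace(Collector, false, ?FLAGS);
            false -> ok
        end,
        forward(Collector)
    end).

%% Forward trace messages until 'stop' arrives, then switch tracing off.
forward(Collector) ->
    receive
        stop ->
            erlang:trace(all, false, ?FLAGS),
            ok;
        Trace when element(1, Trace) =:= trace ->
            Collector ! {node(), Trace},
            forward(Collector);
        _Other ->
            forward(Collector)
    end.
```

From a connected shell, `Tracer = remote_trace_sketch:start('app@host').` (with `'app@host'` standing in for a real node name) starts the forwarding and `Tracer ! stop.` ends it. Tracing every process generates a large volume of messages, so a sketch like this suits test systems rather than heavily loaded nodes.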

  65. ERLANG.PL
    • www.erlang.pl
    • github.com/erlanglab
    • twitter @erlanglab

  66. QUESTIONS ?
    [email protected]
