wrzasa
March 19, 2017

Predicting Performance Changes of Distributed Applications

This is my presentation from the wroc_love.rb 2017 conference (http://wrocloverb.com) and EuRuKo 2017 (http://euruko2017.org).

The software used in the presentation is now open source: https://github.com/wrzasa/rbsim/ Feel free to contact me if you need assistance.

Transcript

  1. PREDICTING PERFORMANCE CHANGES OF DISTRIBUTED APPLICATIONS
    Wojciech Rząsa @wrzasa
  2. ABOUT ME
    - Passion for informatics
    - PhD, but primarily an engineer
    - Rzeszow University of Technology
    - Research: distributed systems
    - Teaching: Ruby, Rails, ...
    - Rzeszow Ruby User Group http://rrug.pl
  3. (c) e-ScienceCity, http://www.e-sciencecity.org/ Creative Commons Attribution-ShareAlike 3.0 Unported License
  4. GRID MONITORING: OCM-G
    - Debugging
    - Interactive applications
    - Shared infrastructure
    - Distributed
    - No central management
    - Standard interface for tools
    - Tight security requirements
  5. AGENDA
    - How to predict performance changes
    - Basic example
    - Two case studies
  6. Rząsa W.: Timed colored Petri net based estimation of efficiency of the grid applications. Supervisor: E. Nawarecki, AGH-UST, Kraków, 2011.
    Baliś B., Bubak M., Rząsa W., Szepieniec T., Wismüller R.: Security in the OCM-G Grid Application Monitoring System. PPAM 2003, LNCS 3019, pp. 779-787, 2004, Eds. R. Wyrzykowski et al.
    Baliś B., Bubak M., Rząsa W., Szepieniec T., Wismüller R.: Two Aspects of Security Solution for Distributed Systems in the Grid on the Example of the OCM-G. In proc. of CGW'03, pp. 197-206, Kraków 2004, ISBN 83-915141-3-7.
    Baliś B., Bubak M., Rząsa W., Szepieniec T.: Efficiency of the GSI Secured Network Transmission. ICCS 2004, LNCS 3036, pp. 107-115, 2004, Eds. M. Bubak et al.
    Rząsa W., Bubak M., Baliś B., Szepieniec T.: Simulation Method for Estimation of Security Overhead of Grid Applications. In proc. of CGW'05, pp. 300-307, Kraków 2006, ISBN 83-915141-5-3, EAN 9788391514153.
    Rząsa W., Bubak M., Baliś B., Szepieniec T.: Overhead Verification for Cryptographically Secured Transmission in the Grid. Computing and Informatics, Vol. 26, 2007, pp. 89-101.
    Rząsa W., Bubak M.: Application of Petri Nets to Evaluation of Grid Applications Efficiency. In proc. of CGW'08, pp. 261-269, Kraków 2009, ISBN 978-83-61433-00-2.
    Rząsa W.: Combining Timed Colored Petri Nets and Real TCP Implementation to Reliably Simulate Distributed Applications. CN 2009, CCIS 39, pp. 79-86, 2009, Eds. A. Kwiecień, P. Gaj, and P. Stera.
    Dec G., Jędrzejec B., Rząsa W.: Kolorowana sieć Petriego jako model systemu podejmowania decyzji kredytowej. STUDIA INFORMATICA 2010, Volume 31, Number 2A (89).
    Rząsa W., Bubak M.: Simulation Method Supporting Development of Parallel Applications for Grids. In proc. of CGW'10, pp. 194-201, Kraków 2011, ISBN 978-83-61433-03-3.
    Dec G., Rząsa W.: Modelowanie wielowarstwowej rozproszonej aplikacji www z zastosowaniem TCPN, in: L. Trybus, S. Samolej (Eds.): Projektowanie, analiza i implementacja systemów czasu rzeczywistego, ISBN 878-83-206-1822-8, Wyd. Komunikacji i Łączności, Warszawa 2011, pp. 137-148.
    Rząsa W., Rzońca D., Stec A., Trybus B.: Analysis of Challenge-Response Authentication in a Networked Control System, in: Kwiecień A., Gaj P., and Stera P. (Eds.): Computer Networks 2012, Communications in Computer and Information Science 291, Springer-Verlag Berlin Heidelberg 2012, pp. 271-279.
    Rząsa W., Bubak M., Nawarecki E.: High-Level Model for Performance Evaluation of Distributed Applications, in: Balicki J., Krawczyk H., Nawarecki E. (Eds.): Grid and Volunteer Computing, Gdansk University of Technology Faculty of Electronics, Telecommunications and Informatics Press, Gdańsk 2012, pp. 7-23.
    Rząsa W.: Synchronization Algorithm for Timed Colored Petri Nets and Ns-2 Simulators, in: Kwiecień A., Gaj P., and Stera P. (Eds.): CN 2013, CCIS 370, pp. 1-10, Springer-Verlag Berlin Heidelberg, 2013, ISSN 1865-0929, ISBN 978-3-642-38864-4.
    Kowalski M., Rząsa W.: Object-oriented approach to Timed Colored Petri Net simulation. Computer Science and Information Systems (FedCSIS), 2013 Federated Conference on, pp. 1401-1404, 8-11 Sept. 2013, ISBN 978-1-4673-4471-5 (Web), 978-83-60810-53-8 (USB), IEEE Catalog Number: CFP1385N-ART (Web), CFP1385N-USB (USB).
    Jamro M., Rzońca D., Rząsa W.: Testing communication tasks in distributed control systems with SysML and Timed Colored Petri Nets model. Computers in Industry, Vol. 71, August 2015, pp. 77-87.
    Rząsa W.: Simulation-Based Analysis of a Platform as a Service Infrastructure Performance from a User Perspective, in: P. Gaj et al. (Eds.): CN 2015, CCIS 522, pp. 182-192, 2015, ISBN 978-3-319-19418-9.
    Rząsa W., Rzońca D.: Event-Driven Approach to Modeling and Performance Estimation of a Distributed Control System, in: Gaj P., Kwiecień A., and Stera P. (Eds.): Computer Networks 2016, Communications in Computer and Information Science 608, Springer International Publishing 2016, pp. 168-179.
  7. SOLUTION BASICS
    - Simulation
    - Model described in Ruby-based DSL
    - Simulator based on a formalism
    - Stats available via Ruby iterators
  8. BASIC EXAMPLE
    - Web server
    - Web clients
    - Resources
  9. WEB SERVER

        program :apache do
          on_event :data_received do |data|
            stats_start server: :apache, name: process.name
            cpu do |cpu|
              (100 * data.size.in_bytes / cpu.performance).miliseconds
            end
            send_data to: data.src, size: data.size * 10, type: :response, content: data.content
            stats_stop server: :apache, name: process.name
          end
        end
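The CPU-time formulas in the model can be checked by hand. The following is a minimal plain-Ruby sketch that does not use RBSim at all. The helper names `server_cpu_time_ms` and `client_cpu_time_ms` are hypothetical, the inputs (a 1024-byte request, CPU performance 1400 for `:gandalf` and 100 for `:desktop`) are the values used in this deck's basic example, and floating-point division is an assumption, since the DSL may round differently.

```ruby
# Plain-Ruby sketch of the CPU-time formulas from the model (no RBSim needed).
# Helper names are hypothetical; float division is an assumption.

# Server side: 100 * request size in bytes / CPU performance, in milliseconds.
def server_cpu_time_ms(request_bytes, cpu_performance)
  (100 * request_bytes).fdiv(cpu_performance)
end

# Client side: 150 / CPU performance, in milliseconds.
def client_cpu_time_ms(cpu_performance)
  150.fdiv(cpu_performance)
end

puts format('server: %.2f ms', server_cpu_time_ms(1024, 1400)) # prints "server: 73.14 ms"
puts format('client: %.2f ms', client_cpu_time_ms(100))        # prints "client: 1.50 ms"
```

Under these assumptions, serving one request costs the server roughly 73 ms of CPU time, while preparing it costs the client 1.5 ms.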
  10. WEB CLIENT

        program :wget do |opts|
          sent = 0
          on_event :send do
            # . . .
          end

          on_event :data_received do |data|
            # . . .
          end

          register_event :send
        end
  11. WEB CLIENT: SEND

        program :wget do |opts|
          sent = 0
          on_event :send do
            cpu { |cpu| (150 / cpu.performance).miliseconds }
            send_data to: opts[:target], size: 1024.bytes, type: :request, content: sent
            sent += 1
            if sent < opts[:count]
              register_event :send, delay: 5.miliseconds
            end
          end

          on_event :data_received do |data|
            # . . .
          end

          register_event :send
        end
  12. WEB CLIENT: RECEIVE

        program :wget do |opts|
          sent = 0
          on_event :send do
            # . . .
          end

          on_event :data_received do |data|
            log "Got data #{data} in process #{process.name}"
            stats event: :request_served, client: process.name
          end

          register_event :send
        end
  13. RESOURCES

        node :desktop do
          cpu 100
        end

        node :gandalf do
          cpu 1400
        end

        net :net01, bw: 1024.bps
        net :net02, bw: 510.bps

        route from: :desktop, to: :gandalf, via: [ :net01, :net02 ], twoway: true
  14. PROCESSES ON NODES

        new_process :client1, program: :wget, args: { target: :server, count: 10 }
        new_process :client2, program: :wget, args: { target: :server, count: 10 }
        new_process :server, program: :apache

        put :server, on: :gandalf
        put :client1, on: :desktop
        put :client2, on: :desktop
  15. SAVE YOUR MODEL
    e.g. in model.rb
  16. RUN SIMULATION

        class Experiment < RBSim::Experiment
        end

        params = { }
        sim = Experiment.new
        sim.run './model.rb', params
        sim.save_stats 'simulation.stats'
  17. PROCESS SIMULATION STATS

        class Experiment < RBSim::Experiment
          def print_req_times_for(s)
            app_stats.durations(server: s) do |tags, start, stop|
              puts "Req. time #{(stop - start).in_miliseconds} ms."
            end
          end
        end

        all_stats = Experiment.read_stats 'simulation.stats'
        first_experiment = all_stats.first
        first_experiment.print_req_times_for(:apache)
  18. BASIC EXAMPLE SUMMARY
    - Model in DSL
    - No boilerplate
    - Web server and client (~30 LoC)
    - Resources (~12 LoC)
    - Mapping processes to resources (~6 LoC)
    - Running simulation (~8 LoC)
    - Loading saved stats (~3 LoC)
  19. CASE STUDY #1: RAPGENIUS VS. HEROKU, FEBRUARY 2013
    https://genius.com/James-somers-herokus-ugly-secret-annotated
  20. FOR HEROKU
    - Perfect scalability
    - Don't have to detect idle/busy dynos
  21. FOR A CLIENT
    - Is "random routing" worse?
    - How much?
  22. In the good old days physicists repeated each other's experiments, just to be sure. Today they stick to FORTRAN, so that they can share each other's programs, bugs included.
    -- E. Dijkstra
  23. ITEMS TO MODEL
    - Random HTTP router
    - "Intelligent" HTTP router
  24. RANDOM HTTP ROUTER

        program :random_router do |servers|
          on_event :data_received do |data|
            if data.type == :request
              server = servers.sample
              send_data to: server, size: data.size, type: :request,
                        content: { from: data.src, content: data.content }
            elsif data.type == :response
              send_data to: data.content[:from], size: data.size, type: :response,
                        content: data.content[:content]
            else
              raise "Unknown data type #{data.type} received " +
                    "by #{process.name}"
            end
          end
        end
  25. INTELLIGENT HTTP ROUTER

        program :router do |servers|
          request_queue = []

          on_event :data_received do |data|
            # . . .
          end

          on_event :process_request do
            # . . .
          end
        end
  26. INTELLIGENT HTTP ROUTER

        on_event :data_received do |data|
          if data.type == :request
            request_queue << data
            register_event :process_request
          elsif data.type == :response
            servers << data.src
            send_data to: data.content[:from], size: data.size, type: :response,
                      content: data.content[:content]
            register_event :process_request
          else
            raise "Unknown data type #{data.type} received " +
                  "by #{process.name}"
          end
        end
  27. INTELLIGENT HTTP ROUTER

        on_event :process_request do
          unless servers.empty? or request_queue.empty?
            data = request_queue.shift
            server = servers.shift
            send_data to: server, size: data.size, type: :request,
                      content: { from: data.src, content: data.content }
            unless request_queue.empty?
              register_event :process_request
            end
          end
        end
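The difference between the two routers can be illustrated without RBSim. The sketch below is plain Ruby, not the model from the talk: it ignores network delays, and it approximates the "intelligent" router by sending each request to the server that becomes idle first, which is what a shared FIFO queue in front of idle servers amounts to. The names `mean_response_time`, `random_policy` and `least_busy_policy`, the seed, and the workload (one request every 15 ms, 10% of them slow) are all assumptions made for illustration.

```ruby
# Plain-Ruby sketch (not RBSim): compare routing policies on one workload.
# Each request is [arrival_time_ms, service_time_ms], sorted by arrival time.
def mean_response_time(requests, server_count, policy)
  free_at = Array.new(server_count, 0.0) # time at which each server becomes idle
  total = requests.sum do |arrival, service|
    server = policy.call(free_at)          # policy returns a server index
    start = [arrival, free_at[server]].max # wait if the chosen server is busy
    free_at[server] = start + service
    free_at[server] - arrival              # response time = waiting + service
  end
  total / requests.size
end

# Random routing: pick any server, busy or not.
def random_policy
  ->(free_at) { rand(free_at.size) }
end

# Approximation of the "intelligent" router: pick the server that frees up first.
def least_busy_policy
  ->(free_at) { free_at.index(free_at.min) }
end

srand(42) # fixed seed so repeated runs use the same workload
requests = Array.new(1000) { |i| [i * 15, rand < 0.1 ? 300 : 20] }
puts "random:     #{mean_response_time(requests, 4, random_policy).round(1)} ms"
puts "least busy: #{mean_response_time(requests, 4, least_busy_policy).round(1)} ms"
```

Passing the policy as a lambda mirrors the idea of swapping router programs in the model while keeping clients and servers unchanged.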
  28. RESOURCES (EXAMPLE)

        servers.each do |s|
          node s do
            cpu 1
          end
          new_process s, program: :webserver, args: { request_times: params[:request_times] }
          put s, on: s
        end
  29. WHAT IF?
    - we had more independent "intelligent" routers?
    - good scalability + better performance for users?
  30. CASE STUDY #1 SUMMARY
    - Reusability (model items)
    - Flexibility (arbitrary algorithms in routers)
    - Different levels of detail for results: histograms, apdex
    - What ifs
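The two result formats named in the summary can be computed from a plain list of response times. This is a hedged sketch in plain Ruby, independent of RBSim: `apdex` follows the standard Apdex definition (satisfied up to a threshold T, tolerating up to 4T), `histogram` uses fixed-width buckets, and both function names, the 500 ms threshold, and the sample times are hypothetical.

```ruby
# Apdex = (satisfied + tolerating / 2) / total, for a threshold T:
# satisfied: t <= T, tolerating: T < t <= 4T, frustrated: t > 4T.
def apdex(response_times_ms, threshold_ms)
  satisfied = response_times_ms.count { |t| t <= threshold_ms }
  tolerating = response_times_ms.count { |t| t > threshold_ms && t <= 4 * threshold_ms }
  (satisfied + tolerating / 2.0) / response_times_ms.size
end

# Fixed-width histogram: bucket start (ms) => number of samples in the bucket.
def histogram(response_times_ms, bucket_ms)
  response_times_ms
    .group_by { |t| (t / bucket_ms).floor * bucket_ms }
    .transform_values(&:size)
    .sort.to_h
end

times = [100, 200, 600, 1500, 3000] # illustrative values, not simulation output
puts apdex(times, 500)     # prints 0.6
puts histogram(times, 500) # buckets 0, 500, 1500 and 3000
```

A single Apdex number is convenient for comparing configurations, while the histogram shows whether a few very slow requests hide behind an acceptable average.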
  31. CASE STUDY #2: TO SCALE HEROKU APPLICATION... OR NOT TO SCALE?
  32. HEROKU DYNOS

        Name         RAM     CPU share  Compute  Price per dyno-month
        standard-1x  512MB   1x         1x-4x    $25
        standard-2x  1024MB  2x         4x-8x    $50
  33. API BACKEND
    - Rails
    - Unicorn
    - CPU intensive
    - 6 standard-1x dynos
    - scale to 3 standard-2x dynos?
  34. UNICORN
    - Master process
    - Balancing load of worker processes
    - Like Heroku's old "intelligent router"!
  35. APPLICATION SCALING
    - 2x faster dynos
    - 2x fewer dynos
    - More Unicorn workers per dyno
    - Same number of Unicorn workers per application
    - Same price
    - More RAM for peaks
    - Better load balancing?
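The "same price" claim follows directly from the dyno table. As a small worked example in plain Ruby (the helper name `fleet_totals` is hypothetical, the figures are the ones quoted in this deck):

```ruby
# Totals for a fleet of identical dynos, using the RAM and price figures
# from the HEROKU DYNOS table in this deck.
def fleet_totals(count, ram_mb:, price_usd:)
  { ram_mb: count * ram_mb, price_usd: count * price_usd }
end

current  = fleet_totals(6, ram_mb: 512,  price_usd: 25) # 6 x standard-1x
proposed = fleet_totals(3, ram_mb: 1024, price_usd: 50) # 3 x standard-2x

puts current  # 3072 MB in total, $150 per month
puts proposed # 3072 MB in total, $150 per month
```

Total RAM and total price are identical. What changes is the RAM available to a single dyno (1024 MB instead of 512 MB), which is where the extra headroom for peaks comes from.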
  36. APPLICATION PARAMETERS
    - Load (req/min)
    - Response times (distribution)
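One possible way to turn these two parameters into simulation input is sketched below in plain Ruby. It is not taken from the talk: Poisson arrivals (exponentially distributed gaps between requests) are an assumption, sampling response times uniformly from a list of observed values is another, and the names `arrival_times_ms` and `sample_response_time_ms` are hypothetical.

```ruby
# Arrival times (ms) for a given load, assuming Poisson arrivals.
def arrival_times_ms(requests_per_minute, duration_ms, rng: Random.new)
  mean_gap_ms = 60_000.0 / requests_per_minute
  times = []
  t = 0.0
  loop do
    t += -mean_gap_ms * Math.log(1.0 - rng.rand) # exponential inter-arrival gap
    break if t > duration_ms
    times << t
  end
  times
end

# Draw one response time from an empirical distribution (observed values).
def sample_response_time_ms(observed_ms, rng: Random.new)
  observed_ms[rng.rand(observed_ms.size)]
end

rng = Random.new(1)
arrivals = arrival_times_ms(600, 60_000, rng: rng) # 600 req/min for one minute
puts "generated #{arrivals.size} arrivals"
puts "sample response time: #{sample_response_time_ms([20, 25, 30, 400], rng: rng)} ms"
```

Sampling from measured response times keeps the long tail of slow requests, which an average alone would hide.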
  37. REUSED ITEMS
    - HTTP client
    - HTTP server
    - Random HTTP router
    - Unicorn master process (Intelligent Heroku router)
  38. MODELING DYNOS

        node :standard1x do
          cpu 1
        end

        node :standard2x do
          cpu 2
        end

    OR

        node :standard2x do
          cpu 1
          cpu 1
        end
  39. HEROKU DYNOS

        Name         RAM     CPU share  Compute  Price per dyno-month
        standard-1x  512MB   1x         1x-4x    $25
        standard-2x  1024MB  2x         4x-8x    $50
  40. DYNO SCALING: HORIZONTAL OR VERTICAL? DOES IT MATTER!?
  41. SIMULATE

        node :standard2x do
          cpu 2
        end

    AND

        node :standard2x do
          cpu 1
          cpu 1
        end
  42. IT DOES MATTER HOW DYNOS ARE SCALED! HOW TO FIND OUT?
    - documentation does not help...
    - cat /proc/cpuinfo does not help...
  43. SINGLE CPU INTENSIVE TASK
    Comparable time on both dyno types

        def cpu_intensive_task(n)
          start = Time.now
          (1..n).reduce(:*)
          Time.now - start
        end
  44. 16 CPU INTENSIVE TASKS AT ONCE

              on standard-1x (2 CPUs)  on standard-2x (4 CPUs)
        real  1m8.690s                 0m29.182s
        user  2m13.360s                2m17.570s
        sys   0m3.871s                 0m4.053s
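The slides do not show how the 16 tasks were started, so the following is only one possible harness, assuming a Unix-like system where `fork` is available. `cpu_intensive_task` is the function from the deck, while `run_in_parallel` and the small task size used here are hypothetical. For reference, the wall-clock times quoted in the deck give a ratio of 68.690 s / 29.182 s, about 2.35, with nearly identical user time.

```ruby
# CPU-bound task from the deck: multiplies 1..n and returns elapsed seconds.
def cpu_intensive_task(n)
  start = Time.now
  (1..n).reduce(:*)
  Time.now - start
end

# Hypothetical harness: run task_count copies in separate processes
# (fork requires a Unix-like system) and return total wall-clock seconds.
def run_in_parallel(task_count, n)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  pids = Array.new(task_count) { fork { cpu_intensive_task(n) } }
  pids.each { |pid| Process.wait(pid) } # wait for every child to finish
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

single = cpu_intensive_task(5_000)
parallel = run_in_parallel(4, 5_000)
puts format('single task: %.3f s, 4 tasks at once: %.3f s', single, parallel)
```

Separate processes are used instead of threads because, in standard Ruby, threads do not run CPU-bound Ruby code on several cores at the same time. If wall-clock time for many tasks drops on the larger dyno while a single task takes the same time, the dyno has more CPUs rather than faster ones.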
  45. CONCLUSION
    - Dynos are scaled horizontally (more CPUs)
    - We shouldn't change dyno config
  46. CASE STUDY #2 SUMMARY
    - Modeling with reusable components
    - Simulation-tested alternative configurations
    - Simple, cheap experiments to verify crucial factors
  47. SUMMARY
    - Easier, cheaper, faster
    - DSL, no boilerplate code
    - What ifs
    - No magic, just software science
    - Simulation as a Service ;-)
    - Rubber duck