Profiling PHP applications

It's nothing new that speed is important for the success of any web application. Only a few hundred milliseconds may lie between a user leaving your site and staying. Unfortunately, performance problems are often hard to fix and even harder to pinpoint.

In this talk I will show you how we at ResearchGate measure web application performance, which means not only timing how long the PHP backend took to deliver a page, but also tracking the speed users actually perceive in the browser. After that you will see how you can track down and analyze the problems you found through measuring, with the help of tools like Xdebug, XHProf and the Symfony Debug Toolbar. And if you still need to get faster after fixing all these issues, I'll introduce you to some tricks, techniques and patterns to decrease load times even further.

Bastian Hofmann

November 07, 2013

Transcript

  1. ResearchGate gives science back to the people who make it happen.

    We help researchers build reputation and accelerate scientific progress. On their terms. the goal is to give...
  2. Questions? Ask

    by the way, if you have any questions throughout this talk, or if you don't understand something, just raise your hand and ask.
  3. but seriously, in recent years multiple studies were done on the importance of speed for a web application. how does it affect usage and conversion? how long are people waiting for content? how does it affect sales? if people left because the site is slow, are they coming back?
  4. ..

  5. Server

    the first thing that contributes to pagespeed is what happens on the server. this is also the easiest part, because it is completely under our control
  6. Your PHP application Request Response

    so that is what your server and your application are doing between incoming request and outgoing response
  7. Your PHP application Request Response Load balancer

    though this does not mean only your application, but also the rest of your setup, like a load balancer
  8. Your PHP application Request Response Load balancer

    and also your application is probably not a single small php script, but a big application with multiple components that can each affect speed differently. so getting a more detailed view of these components might be interesting as well. more on that later
  9. web server db http service http service cache user request

    additionally most bigger applications have some kind of service-oriented architecture, and the same things apply here. knowing about the speed of the different services is important.
  10. But there is more ...

    your application does not stop at your server. somehow it needs to get to your user
  11. so internet connectivity is also a big part contributing to pagespeed. that means everything from dns lookup, over ssl handshake, to actually transporting the content over the wire
  12. when your user has received the content, the browser needs to display it. and some browsers are way slower than others in doing it
  13. and of course nearly no web application comes without javascript. this needs to be loaded and executed as well
  14. But the rest as well

    my point is: what is important is the pagespeed your user perceives. this covers everything from the server to their browser. in the end it's your fault if the site is slow, even if the user's computer and browser are crappy.
  15. because although everything seems to be fine on your fast, 2 month old machine, with lots of ram, cpu power, the latest chrome and a 50mbit vdsl connection with a very low ping to your data center ...
  16. people in countries with big latencies to your datacenter and/or slow internet connections (remember dial-up)
  17. For older browsers you have to do it yourself though ...

    e.g. by manually measuring the time with javascript and on the server. this is actually kind of hard (clock offsets, etc)
  18. Or

  19. Getting it back to the server

    so now that you have all the timestamps in your javascript, you need to ...
  20. input filter output

    the basic workflow is that you have some input where logstash gets log messages in. on this input you can execute multiple filters that modify the message, and then you can output the filtered message somewhere
  21. Very rich plugin system

    to do this it offers a very large and rich plugin system for all kinds of inputs, filters and outputs, and you can also write your own
  22. browser JS: boomerang logstash trackingServer access log requests tracking image with timing information as query parameters

    for our purpose we can just have boomerang in the browser collect the timestamps and then send a small tracking request (inserting an image) to a tracking server. the timestamps are added as query parameters to this request. the server only returns an empty image and logs the request to its access log, which logstash can parse.
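
To make the tracking request concrete, here is a minimal sketch in Python of the two ends of that round trip: building the image URL with timings as query parameters, and reading them back from a logged URL. The endpoint `tracking.example.com/t.gif` and the `loadTime` parameter are assumptions for illustration; only `page` and `connectTime` appear in the slides, and in the real setup the URL is built by boomerang in the browser.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical tracking endpoint; the talk does not name the real one.
TRACKING_URL = "https://tracking.example.com/t.gif"


def build_tracking_url(page, timings_ms):
    """Return the image URL a browser would request to report its timings."""
    params = {"page": page, **timings_ms}
    return f"{TRACKING_URL}?{urlencode(params)}"


def parse_tracking_url(url):
    """Recover the page name and the timings from a logged request URL."""
    query = parse_qs(urlsplit(url).query)
    page = query.pop("page")[0]
    return page, {name: int(values[0]) for name, values in query.items()}


# "loadTime" is an illustrative parameter name, not one from the talk.
url = build_tracking_url("profile.loggedIn", {"connectTime": 42, "loadTime": 1830})
print(url)
```

The server side never has to understand these parameters. It only has to write the request line to its access log, where a log parser can pick it up.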
  23. Graphite http://graphite.wikidot.com/

    again there are many tools available to collect and display these metrics. one i want to highlight is graphite
  24. graphite comes with a powerful interface where you can plot and aggregate this data into graphs and apply different mathematical functions to it, so that your data is displayed exactly the way you want
  25. browser JS: boomerang logstash trackingServer access log requests tracking image with timing information as query parameters graphite statsd

    and statsd is a small daemon that sits in front of it. so this is your setup: logstash sends these timestamps to statsd, which aggregates the information and sends it to graphite
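
For readers who have not seen what actually travels to statsd, the following Python sketch shows the plain-text timing datagram (`<bucket>:<value>|ms`) sent over UDP. The bucket name is illustrative, and host and port are assumptions (8125 is the usual statsd default). In the setup described here logstash does this for you.

```python
import socket


def format_timing(namespace, name, millis):
    """Build a statsd timing datagram of the form '<bucket>:<value>|ms'."""
    return f"{namespace}.{name}:{millis}|ms"


def send_timing(payload, host="localhost", port=8125):
    """Fire-and-forget UDP send; statsd does not acknowledge datagrams."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("ascii"), (host, port))


# Illustrative bucket name, modelled on the "%{page}.connect" pattern.
payload = format_timing("pagespeed", "profile.loggedIn.connect", 42)
print(payload)
```

Calling `send_timing(payload)` would hand the value to a statsd instance listening on that port, which then flushes aggregated values to graphite at its own interval.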
  26. in logstash this is how to get the data from the log into logstash:

        input {
          file {
            type => "pagespeed-access"
            path => [ "/var/log/nginx/access_log/monitoring-access.log" ]
          }
        }
  27. you can apply filters to put it into a structured form and validate it:

        filter {
          grok {
            type => "pagespeed-access"
            pattern => "^.*\s\"[A-Z]+\s[^\?\s]+\?page=%{DATA:page}\&connectTime=%{NUMBER:connectTime}...)?\sHTTP\/\d\.\d\".*$"
          }
          grok {
            type => "pagespeed-access"
            match => ["page", "^(profile|home|...)\.logged(In|Out)$"]
            exclude_tags => ["_grokparsefailure"]
          }
        }
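
As a rough illustration of what the two grok filters do, this Python sketch extracts `page` and `connectTime` from an access log line and then rejects page names that do not match a whitelist. It is not a translation of the elided pattern: the sample log line is made up, and the whitelist contains only the two page names visible on the slide.

```python
import re

# First step: pull the fields out of the request line.
REQUEST = re.compile(
    r'"[A-Z]+\s[^?\s]+\?page=(?P<page>[^&\s]+)&connectTime=(?P<connectTime>\d+)'
)
# Second step: validate the page name (only the names shown on the slide).
VALID_PAGE = re.compile(r"^(profile|home)\.logged(In|Out)$")


def parse_line(line):
    """Return {'page': str, 'connectTime': int}, or None if the line is rejected."""
    match = REQUEST.search(line)
    if match is None or not VALID_PAGE.match(match["page"]):
        return None  # comparable to an event tagged _grokparsefailure
    return {"page": match["page"], "connectTime": int(match["connectTime"])}


# Made-up access log line for illustration.
line = (
    '127.0.0.1 - - [07/Nov/2013:10:00:00 +0100] '
    '"GET /t.gif?page=home.loggedOut&connectTime=35 HTTP/1.1" 200 43'
)
print(parse_line(line))
```

The validation step matters because the query parameters come from the browser, so anyone can send arbitrary page names that would otherwise create arbitrary metrics.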
  28. and then put the data somewhere else. here we are sending it to statsd. what's that?

        output {
          statsd {
            type => "pagespeed-access"
            exclude_tags => ["_grokparsefailure"]
            host => "localhost"
            port => 8126
            namespace => "pagespeed"
            sender => ""
            timing => [ "%{page}.connect", "%{connectTime}", ... ]
          }
        }
  29. Can we measure more?

    I said earlier we may need information about services etc
  30. Load balancer

    in a service-oriented architecture we can do something similar with the access logs of our services, which also have timing information. or, if we have a load balancer in between as well, we can get useful information from there
  31. Example: HAProxy

    you can get the time of the request, time spent in haproxy queues etc.
  32. example config:

        input {
          file {
            type => "haproxy-http-log"
            path => [ "/var/log/haproxy-http.log*" ]
          }
        }
  33. example config:

        filter {
          grok {
            type => "haproxy-http-log"
            pattern => "%{HAPROXYHTTP}"
          }
          mutate {
            type => "haproxy-http-log"
            gsub => [
              "server_name", "\.", "_",
              "client_ip", "\.", "_"
            ]
          }
        }
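
The `gsub` step is easy to overlook: graphite uses dots as the separator between levels of a metric path, so dots inside a host name or an IP address would split one value into several levels. A small Python sketch of the same replacement, with a made-up backend, server name and client IP:

```python
def graphite_safe(value):
    """Replace dots so the value stays a single level in a graphite metric path."""
    return value.replace(".", "_")


def hits_metric(backend_name, server_name, client_ip):
    """Build a metric name in the shape used by the statsd output config."""
    return "haproxy.{}.{}.{}.hits".format(
        backend_name, graphite_safe(server_name), graphite_safe(client_ip)
    )


# All three values are invented for illustration.
metric = hits_metric("web", "app01.example.com", "10.0.0.7")
print(metric)
```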
  34. example config:

        output {
          statsd {
            type => "haproxy-http-log"
            exclude_tags => ["_grokparsefailure"]
            host => "localhost"
            port => 8125
            namespace => "lb"
            sender => ""
            increment => [
              "haproxy.%{backend_name}.%{server_name}.%{client_ip}.hits",
              "haproxy.%{backend_name}.%{server_name}.%{client_ip}.responses.%{http_status_code}"
            ]
            timing => [
              "haproxy.%{backend_name}.%{server_name}.%{client_ip}.time_request", "%{time_request}",
              "haproxy.%{backend_name}.%{server_name}.%{client_ip}.time_backend_connect", "%{time_backend_connect}",
              "haproxy.%{backend_name}.%{server_name}.%{client_ip}.time_backend_response", "%{time_backend_response}",
              "haproxy.%{backend_name}.%{server_name}.%{client_ip}.time_queue", "%{time_queue}",
              "haproxy.%{backend_name}.%{server_name}.%{client_ip}.time_duration", "%{time_duration}"
            ]
          }
        }
  35. browser JS: boomerang logstash trackingServer access log requests tracking image with timing information as query parameters graphite statsd logstash load balancer access log logstash service access log

    logstash can analyse these logs and send them to statsd as well
  36. From within your PHP app

    what is also useful is to measure certain things from within your php app, e.g. rendering time, the time database requests took, the time spent in certain business logic etc. you can either just log this to a file and use the same logstash mechanism, or, if you only need it for debugging, do it differently. more on that later.
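
The idea of timing a section of code and logging the result can be sketched in a few lines. The example below is in Python rather than PHP to keep it short and self-contained; the helper `timed`, the logger name and the log line format are all assumptions, and in PHP the same pattern would typically be built on `microtime(true)`.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("timing")  # hypothetical logger name


@contextmanager
def timed(name, sink):
    """Measure the wall-clock duration of a block and record it in `sink`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000
        sink[name] = elapsed_ms
        # One structured line per measurement, easy for a log parser to pick up.
        logger.info("timing name=%s ms=%.2f", name, elapsed_ms)


timings = {}
with timed("render", timings):
    time.sleep(0.01)  # stand-in for rendering a template
```

Keeping the measurements in a dictionary as well as in the log makes them available both for the log pipeline and for a debugging view in the same request.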
  37. By pages

    but you should not only measure all your requests together, you should differentiate by...
  38. Define goals

    and after measuring everything, when you see that you are slow somewhere, you should define goals for what performance you want to reach
  39. the first tool useful for this is ...

    xdebug has quite a few features, like offering the ability to set breakpoints in your code, nicer error displays and so on. but one of them is also profiling your app
  40. xdebug.profiler_enable_trigger = 1 http://url?XDEBUG_PROFILE

    you can either activate profiling for every request or selectively for all requests that have a GET, POST or COOKIE parameter called XDEBUG_PROFILE
  41. Webgrind https://github.com/jokkedk/webgrind

    this writes so-called cachegrind files. in order to view them you can use tools like kcachegrind or the easiest one ...
  42. you can see everything that happened in this request: every function that was invoked, how often it was called and how long it took.
  43. Use it locally on your dev machine

    one thing with xdebug: it slows down php, so use it on your dev machine, but not in production
  44. Use it in production for a subset of requests

    you can safely use it in production. it comes with a performance overhead, but only when used, so you can activate it, but only use it for a small percentage of requests or when manually activated (e.g. by a cookie).
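
The decision "profile this request or not" can be sketched as a small function: always profile when a cookie asks for it, otherwise profile a random fraction of requests. This is a Python sketch of the idea only; the cookie name `PROFILE` and the default rate of 1% are assumptions, not values from the talk.

```python
import random


def should_profile(cookies, sample_rate=0.01, rng=random.random):
    """Profile when forced by a cookie, otherwise for a random fraction of requests."""
    if cookies.get("PROFILE") == "1":  # hypothetical cookie name
        return True
    # rng() returns a float in [0, 1), so sample_rate is the expected fraction.
    return rng() < sample_rate


print(should_profile({"PROFILE": "1"}))   # manual activation
print(should_profile({}, sample_rate=0.0))  # sampling switched off
```

Passing the random source in as a parameter keeps the function easy to test, and the sample rate can be lowered as traffic grows so the absolute number of profiled requests stays small.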
  45. Symfony Debug Toolbar

    i said earlier that there is another good way to get information about your application's internals, especially if you only need it for debugging and not in a graph. this is with the..
  46. you can click on it and it gives you nice detailed information about the request. stuff like doctrine queries, a nice timeline, exceptions, routing, events etc.
  47. Extend it http://symfony.com/doc/current/cookbook/profiler/data_collector.html

    but did you know that you can extend it? there are some good ready-made extensions available, e.g. for caching, http calls, versioning etc. just check packagist. but you can also easily write your own.
  48. here are some examples of how we at researchgate extended it (disclaimer: we are not even using full symfony, but only some components).
  49. Step 3

    now that you have all this debugging information to pinpoint your bottlenecks, let's get to ...
  50. That's something you have to do unfortunately ...

    since it is very dependent on your application and your setup