Profiling PHP applications

Confoo 2014, Montreal

It's nothing new that speed is important for the success of any web application. This talk will show how you can correctly measure the performance of your site and track down bottlenecks with tools like Xdebug, XHProf or the Symfony Debug Toolbar. And if you still need to get faster after optimizing and fixing all these issues, I'll introduce you to some tricks, techniques and patterns to even further decrease load times.

Bastian Hofmann

February 27, 2014

Transcript

  1. My site is slow, what can I do? @BastianHofmann Profiling

    PHP applications
  2. This talk is all about... ...

  3. Speed. And with that I mean, not the drug, the

    movie or any game, but
  4. Speed of your web application the ..

  5. We'll talk about...

  6. Why it matters and why you should care about it

  7. How to measure it: what pagespeed actually is and ...

  8. How to find out where the problems are and ..

  9. Before we start ...

  10. A few words about me ...

  11. I work at ResearchGate, the professional network for scientists and

    researchers
  12. over 3 million users

  13. here some impressions of the page

  14. you may have read about us in the news recently

  15. have this, and also work on some cool stuff

  16. have this, and also work on some cool stuff

  17. have this, and also work on some cool stuff

  18. we are hiring

  19. Questions? Ask: by the way, if you have any questions

    throughout this talk, or if you don't understand something, just raise your hand and ask.
  20. http://speakerdeck.com/u/bastianhofmann the slides will be available on speakerdeck

  21. So speed…

  22. Why should you care?

  23. It is really important of course because ...

  24. None
  25. but seriously, in recent years multiple studies have examined

    the importance of speed for a web application. how does it affect usage and conversion? how long will people wait for content? how does it affect sales? if people leave because the site is slow, do they come back?
  26. Every ms counts: in short, the result of every study

    was the same: ...
  27. in detail

  28. in detail

  29. ..

  30. So what is pagespeed?

  31. Server: the first thing that contributes to pagespeed is what

    happens on the server. this is also the easiest part, because it is completely under our control
  32. Your PHP application, Request, Response: so what your server and

    your application are doing between incoming request and outgoing response
  33. Your PHP application, Request, Response, Load balancer: though this does

    not mean only your application, but also the rest of your setup, like a load balancer
  34. Your PHP application, Request, Response, Load balancer: and also your

    application is probably not a single small php script, but a big application with multiple components that can each affect speed differently. so getting a more detailed view of these components might be interesting as well. more on that later
  35. web server, db, http service, http service, cache, user request:

    additionally, most bigger applications have some kind of service-oriented architecture, and the same things apply here. knowing about the speed of the different services is important.
  36. But there is more ... your application does not stop

    at your server. somehow it needs to get to your user
  37. so internet connectivity is also a big part, contributing to

    pagespeed, that means everything from dns lookup, over ssl handshake to actually transporting the content over the wire
  38. once your user has received the content, their browser needs to display

    it. and some browsers are way slower than others at doing so
  39. so the dom needs to be rendered, css fetched and

    applied, images loaded
  40. and of course nearly no web application comes without javascript.

    this needs to be loaded and executed as well
  41. http://www.stevesouders.com/blog/2012/02/10/the-performance-golden-rule/

  42. So what happens on your server is really important

  43. But the rest is as well: my point is, what

    is important is the pagespeed your user perceives. this covers everything from the server to their browser. in the end it's your fault if the site is slow, even if the user's computer and browser are crappy.
  44. Step 1 how to deal with this...

  45. Measuring measuring your actual pagespeed

  46. who is doing it?

  47. if you start: it's going to hurt

  48. because although everything seems to be fine on your fast,

    2-month-old machine, with lots of ram, cpu power, the latest chrome, and a 50mbit vdsl connection with a very low ping to your data center.
  49. reality is: Old computers

  50. Old feature phones, slower smart phones, mobile networks in general

    (edge anyone?)
  51. old browsers

  52. people in countries with big latencies to your datacenter and/or

    slow internet connections (remember dial-up)
  53. So you need to measure on your user's side

  54. Navigation Timing API https://developer.mozilla.org/en-US/docs/Navigation_timing there is actually a great

    javascript api that is supported by a lot of modern browsers
  55. you get timestamps for all important events of a pageload

  56. so we want to get a graph like this

  57. Getting it back to the server so now you have

    all the timestamps in your javascript, you need to ...
  58. Tracking request

  59. https://tracking.example.com/?page=profile&backend=123&complete=890&domReady=580

  60. logstash http://logstash.net/ enter the next tool, logstash is a very

    powerful tool to handle log processing
  61. input, filter, output: the basic workflow is that you have some

    input where logstash receives log messages. on this input you can run multiple filters that modify the message, and then you can output the filtered message somewhere
  62. Very rich plugin system to do this it offers a

    very large and rich plugin system for all kinds of inputs, filters and outputs, and you can also write your own
  63. [Diagram: browser JS (boomerang) requests a tracking image, with timing information as query parameters, from the tracking server; logstash reads the tracking server's access log]

    for our purpose we can just have boomerang in the browser collect the timestamps and then send a small tracking request (inserting an image) to a tracking server. the timestamps are added as query parameters to this request. the server only returns an empty image and logs the request to its access log, which logstash can parse.
  64. Graphing it we want to...

  65. Graphite http://graphite.wikidot.com/ again there are many tools available to collect

    and display these metrics: one i want to highlight is graphite
  66. graphite comes with a powerful interface where you can plot

    and aggregate this data into graphs and perform different mathematical functions on it to get it exactly the way you want to display your data
  67. [Diagram: as before, with logstash feeding statsd, which feeds graphite]

    and statsd is a small aggregation daemon that sits in front of it. so this is your setup: logstash sends these timestamps to statsd, which aggregates the information and sends it to graphite
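To make the data flow concrete, here is a minimal PHP sketch of what travels from the tracking request to statsd. In the setup described in the talk this step is done by logstash, not by PHP; the function name `trackingUrlToStatsdLines`, the `pagespeed` prefix and the choice to use `page` as part of the bucket name are assumptions made for illustration. Only the `<bucket>:<value>|ms` timer format is standard StatsD.

```php
<?php
// Hypothetical helper: turn one tracking URL into StatsD timer lines.
// The "pagespeed" prefix and the use of "page" as a bucket segment are
// assumptions for this sketch, not something the talk prescribes.
function trackingUrlToStatsdLines(string $url, string $prefix = 'pagespeed'): array
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return [];
    }
    parse_str($query, $params);

    // Sanitize the page name so it is safe as a metric path segment.
    $rawPage = $params['page'] ?? 'unknown';
    $page = is_string($rawPage) ? preg_replace('/[^A-Za-z0-9_]/', '_', $rawPage) : 'unknown';
    unset($params['page']);

    $lines = [];
    foreach ($params as $name => $value) {
        // Keep only plain non-negative integers (milliseconds).
        if (!is_string($value) || !ctype_digit($value)) {
            continue;
        }
        $metric = preg_replace('/[^A-Za-z0-9_]/', '_', (string) $name);
        // StatsD timer format: <bucket>:<value>|ms
        $lines[] = sprintf('%s.%s.%s:%d|ms', $prefix, $page, $metric, (int) $value);
    }
    return $lines;
}

$lines = trackingUrlToStatsdLines(
    'https://tracking.example.com/?page=profile&backend=123&complete=890&domReady=580'
);
echo implode("\n", $lines), "\n";
```

Including the page name in the bucket is one way to get the per-page graphs discussed later; the same idea extends to browser or country if those are sent along.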
  68. and that's what such graphs may then look like

  69. Can we measure more? I said earlier we may need

    information about services etc
  70. Load balancer: in a service-oriented architecture we can do something

    similar with the access logs of our services, which also have timing information. if we have a load balancer (like haproxy) in between, we can get useful information from there as well
  71. [Diagram: as before, plus logstash reading the load balancer access log and the service access log]

    logstash can analyse these logs and send them to statsd as well
  72. From within your PHP app: what is also useful is

    to measure certain things from within your php app, e.g. rendering time, the time database requests took, the time spent in certain business logic, etc. you can either just log this to a file and use the same logstash mechanism, or if you just need it for debugging, do it differently. more on that later.
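A minimal sketch of such in-app measurement, assuming a hypothetical helper called `timed()` and made-up metric labels. It collects durations per label and emits one JSON line per request, which a log shipper such as logstash could then parse; in a real application the line would go to a log file rather than standard output.

```php
<?php
// Hypothetical helper: run a callable and record its wall-clock duration in ms.
function timed(string $label, callable $work, array &$timings)
{
    $start = microtime(true);
    try {
        return $work();
    } finally {
        // Recorded even if $work throws, so slow failures show up too.
        $timings[$label] = ($timings[$label] ?? 0.0) + (microtime(true) - $start) * 1000.0;
    }
}

$timings = [];

$row = timed('db.load_profile', function () {
    usleep(2000); // stand-in for a database query
    return ['id' => 42];
}, $timings);

$html = timed('render.profile', function () use ($row) {
    return '<h1>Profile ' . $row['id'] . '</h1>';
}, $timings);

// One JSON object per request is easy for a log shipper to parse.
$logLine = json_encode(['type' => 'timing', 'ms' => $timings]);
echo $logLine, "\n";
```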
  73. More fine grained metrics

  74. By pages: but you should not only measure all your

    requests together, you should differentiate by...
  75. By browser

  76. By country

  77. Logged in / out

  78. Heavy users .. or just everything that makes sense for

    you
  79. Step 2 but before you can start fixing stuff, ...

  80. How to find out where the problems are finding out

    ...
  81. Profiling you can do this through ...

  82. the first useful tool for this is ... xdebug has quite

    a few features, like breakpoints in your code, nicer error displays and so on. one of them is profiling your app
  83. Webgrind https://github.com/jokkedk/webgrind this writes so-called cachegrind files. in order

    to view them you can use tools like kcachegrind or, the easiest one, ...
  84. you can see everything that happened in this request: every

    function that was invoked, how often it was called and how long it took.
  85. Use it locally on your dev machine: one thing about

    xdebug is that it slows down php, so .. but not in production
  86. XHProf https://github.com/facebook/xhprof for production there is xhprof, developed by facebook

  87. Use it in production for a subset of requests: you

    can safely use it in production. it comes with a performance overhead, but only when it is active, so you can install it and enable it only for a small percentage of requests or when manually activated (e.g. by a cookie).
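One way to sketch this sampling in PHP. `xhprof_enable()`, `xhprof_disable()` and the two flag constants come from the xhprof extension; the cookie name `_profile`, the 1% default rate and writing the result to a temp file are assumptions for illustration. In practice a tool like XHGui would handle storage, and a manual trigger should be protected so that not every visitor can switch profiling on.

```php
<?php
// Decide whether this request should be profiled.
// The cookie name "_profile" and the 1% default are assumptions for this sketch.
function shouldProfile(array $cookies, float $sampleRate = 0.01, ?float $roll = null): bool
{
    if (isset($cookies['_profile'])) {
        return true; // manual activation, e.g. set by a developer
    }
    // $roll can be injected for testing; otherwise draw a number in [0, 1).
    $roll = $roll ?? mt_rand() / (mt_getrandmax() + 1);
    return $roll < $sampleRate;
}

// Only touch xhprof when the extension is actually loaded.
if (shouldProfile($_COOKIE) && function_exists('xhprof_enable')) {
    xhprof_enable(XHPROF_FLAGS_CPU | XHPROF_FLAGS_MEMORY);
    register_shutdown_function(function () {
        $profile = xhprof_disable();
        // Storage is application-specific; a temp file keeps the sketch simple.
        file_put_contents(
            sys_get_temp_dir() . '/' . uniqid('xhprof_', true) . '.json',
            json_encode($profile)
        );
    });
}
```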
  88. XHGUI https://github.com/perftools/xhgui to display the xhprof profiles, there is a

    nice tool called xhgui
  89. in addition to the normal things like seeing

  90. the whole callstack

  91. you can also visualize this in a graph

  92. and can do analysis over multiple requests and compare them

    to each other
  93. DEMO http://localhost:8080/xhgui/webroot/

  94. php-meminfo https://github.com/BitOne/php-meminfo

  95. Symfony Debug Toolbar: i said earlier that there is another

    good way to get information about your application's internals, especially if you only need it for debugging and not in a graph. this is with the ..
  96. it comes with standard symfony and you probably all have

    seen it already
  97. you can click on it and it gives you nice

    detailed information about the request. stuff like doctrine queries, a nice timeline, exceptions, routing, events etc.
  98. DEMO http://localhost:8080/profiling_talk/web/app_dev.php

  99. Extend it http://symfony.com/doc/current/cookbook/profiler/data_collector.html but did you know that you

    can extend it? there are some good ready-made extensions available, e.g. for caching, http calls, versioning etc. just check packagist. you can also easily write your own.
  100. here are some examples how we at researchgate extended it

    (disclaimer: we are not even using full symfony, but only some components).
  101. None
  102. None
  103. None
  104. None
  105. None
  106. DEMO

  107. Step 3 now that you have all this debugging information

    to pinpoint your bottlenecks, let's get to ...
  108. Fix it

  109. That's something you have to do unfortunately ... since it

    is very dependent on your application and your setup
  110. None
  111. Some hints on better performance

  112. The obvious Of course you have the obvious things like

  113. GZIP

  114. CDNs

  115. DB Indexes

  116. Minify JS https://github.com/mishoo/UglifyJS

  117. Minify CSS http://sass-lang.com/

  118. Concatenate JS/ CSS/...

  119. Correct caching headers

  120. Opcache

  121. Data Caching

  122. APCU

  123. memcached
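The data-caching hints on slides 121 to 123 usually boil down to a cache-aside pattern. The following is a sketch under assumptions: the helper name `cached()`, the key and the TTL are made up, and a per-process array is used as a fallback so the example also runs where APCu is not loaded or not enabled. `apcu_enabled()`, `apcu_fetch()` and `apcu_store()` are the APCu extension's own functions.

```php
<?php
// Cache-aside sketch. Uses APCu when it is loaded and enabled, otherwise a
// per-process array so the example still runs. Key and TTL are illustrative.
function cached(string $key, int $ttl, callable $loader)
{
    static $local = [];
    $useApcu = function_exists('apcu_enabled') && apcu_enabled();

    if ($useApcu) {
        $value = apcu_fetch($key, $hit);
        if ($hit) {
            return $value;
        }
    } elseif (array_key_exists($key, $local)) {
        return $local[$key];
    }

    $value = $loader(); // the expensive part: database query, HTTP call, ...
    if ($useApcu) {
        apcu_store($key, $value, $ttl);
    } else {
        $local[$key] = $value;
    }
    return $value;
}

$calls = 0;
$load = function () use (&$calls) {
    $calls++;
    return ['id' => 42, 'publications' => 3];
};

$first = cached('profile:42', 60, $load);
$second = cached('profile:42', 60, $load); // served from the cache
```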

  124. PHP 5.5 https://blog.asmallorange.com/2013/08/php-roadmap-performance/

  125. The not so well known

  126. SPDY

  127. Minimize redirects

  128. Check image compression https://github.com/gruntjs/grunt-contrib-imagemin

  129. Compress HTML

  130. Check YSlow, Pagespeed

  131. DNS prefetch

  132. <link rel="dns-prefetch" href="//host_name_to_prefetch.com">

  133. Move logic to async workers

  134. The "crazy" stuff

  135. Varnish

  136. ESI

  137. Profile Publications Publication Publication Publication AboutMe LeftColumn Image Menu Institution

  138. Profile Publications Publication Publication Publication AboutMe LeftColumn Image Menu <esi:include src="..." /> Institution

    because every component has its own url, you can just render an esi placeholder instead of the widget. this tells varnish to fetch it separately and, provided it has caching headers, serve it from the cache
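A minimal sketch of rendering such a placeholder in PHP. The function name, the widget URL and the `$esiEnabled` flag are hypothetical; the flag stands in for however your application detects that a proxy which processes ESI (such as Varnish) sits in front of it.

```php
<?php
// Render either the widget itself or an ESI placeholder for it.
// $renderWidget and the widget URL are hypothetical stand-ins.
function renderWidgetOrEsi(string $widgetUrl, callable $renderWidget, bool $esiEnabled): string
{
    if ($esiEnabled) {
        // The proxy replaces this tag with the (possibly cached) response of $widgetUrl.
        return '<esi:include src="' . htmlspecialchars($widgetUrl, ENT_QUOTES, 'UTF-8') . '" />';
    }
    return $renderWidget();
}

$renderAboutMe = function (): string {
    return '<div class="about-me">About me</div>';
};

echo renderWidgetOrEsi('/widgets/aboutMe?id=42', $renderAboutMe, true), "\n";
echo renderWidgetOrEsi('/widgets/aboutMe?id=42', $renderAboutMe, false), "\n";
```

The fallback branch matters: without a proxy in front (for example on a dev machine) the page still renders completely.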
  139. Load content asynchronously

  140. Profile Publications Publication Publication Publication AboutMe LeftColumn Image Menu Institution

  141. Profile Publications Publication Publication Publication AboutMe LeftColumn Image Menu <div id="placeholder"></div> <script>loadWidget('/aboutMe', function(w) { w.render({ replace : '#placeholder' }); })</script> Institution

    so instead of rendering the widget you render a placeholder dom element and a script tag that loads the widget with an ajax request and then renders it on the client side
  142. Flush content early

  143. Move logic to shutdown handlers

  144. http://www.php.net/manual/en/function.register-shutdown-function.php

  145. http://www.php.net/manual/en/function.fastcgi-finish-request.php
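The two functions linked above can be combined roughly as follows. This is a sketch, not the speaker's implementation: the `Deferred` class and its method names are made up. `fastcgi_finish_request()` only exists under PHP-FPM, so the call is guarded; on other SAPIs the tasks still run at shutdown, just without the early response.

```php
<?php
// Queue of work that does not need to finish before the user sees the page.
// The Deferred class is a hypothetical helper for this sketch.
final class Deferred
{
    /** @var callable[] */
    private static array $tasks = [];

    public static function add(callable $task): void
    {
        self::$tasks[] = $task;
    }

    public static function pending(): int
    {
        return count(self::$tasks);
    }

    public static function run(): void
    {
        // Under PHP-FPM this sends the response and closes the connection
        // while the script keeps running. Other SAPIs do not provide it.
        if (function_exists('fastcgi_finish_request')) {
            fastcgi_finish_request();
        }
        while ($task = array_shift(self::$tasks)) {
            $task();
        }
    }
}

register_shutdown_function([Deferred::class, 'run']);

echo "<html><body>Profile saved</body></html>\n";

Deferred::add(function () {
    usleep(1000); // stand-in for slow work such as sending a notification
});
```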

  146. Promises / Futures https://github.com/facebook/libphutil

  147. $future = new HTTPFuture('http://www.example.com/'); list($status, $body, $headers) = $future->resolve();
  148. pushState: and while you are at it, when you switch

    pages you can also just load the differences between them and use pushState to change the url (if supported) to make your app faster
  149. Bigpipe

  150. if you look at your widget tree you can mostly

    identify larger parts, which are widgets themselves
  151. Profile Menu Header LeftColumn RightColumn like this, so what you

    can do to dramatically reduce the perceived load time is to prioritize the rendering
  152. so our http request looks like this: first you compute

    and render the important parts of the page, like the top menu and the profile header, as well as the rest of the layout. for the left column and right column, which are expensive to compute, you just render placeholders, and then you flush the content to the client so that the browser can already render it
  153. still in the same http request you render out the

    javascript needed to make the already rendered components work, so people can use the menu for example
  154. still in the same http request you then compute the

    data for the left column and render out some javascript that takes this data and renders it into the components template client side and then replaces the placeholder with the rendered template
  155. still in the same request you then can do this

    with the right column -> flush content as early as possible, don't wait for the whole site to be computed
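The sequence on slides 152 to 155 can be sketched with a PHP generator. The function names, the `pagelet()` JavaScript function and the markup are assumptions, and as a simplification the sketch sends pre-rendered HTML for each expensive part instead of data that a client-side template renders. Each expensive part is only computed when its chunk is about to be sent, after everything before it has been flushed.

```php
<?php
// Yield the page in the order it should reach the browser.
// pagelet() and all markup here are assumptions for this sketch.
function bigpipeChunks(string $layout, array $pagelets): Generator
{
    // 1. Layout with the cheap parts rendered and placeholders for the rest.
    yield $layout;
    // 2. Expensive parts, one by one, as script tags that fill the placeholders.
    foreach ($pagelets as $placeholderId => $compute) {
        // JSON_HEX_TAG keeps "<" and ">" out of the payload so it cannot close the script tag.
        $payload = json_encode(['id' => $placeholderId, 'html' => $compute()], JSON_HEX_TAG);
        yield '<script>pagelet(' . $payload . ');</script>';
    }
    yield '</body></html>';
}

$layout = '<html><body><nav>Menu</nav><header>Profile</header>'
    . '<div id="left"></div><div id="right"></div>'
    . '<script>function pagelet(p){document.getElementById(p.id).innerHTML=p.html;}</script>';

$pagelets = [
    'left'  => function () { return '<ul><li>Publication</li></ul>'; },
    'right' => function () { return '<p>Institution</p>'; },
];

foreach (bigpipeChunks($layout, $pagelets) as $chunk) {
    echo $chunk, "\n";
    // Push the chunk to the client now instead of waiting for the whole page.
    if (ob_get_level() > 0) {
        ob_flush();
    }
    flush();
}
```

Whether the flushed chunks really reach the browser early also depends on buffering and compression settings in the web server and any proxy in between.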
  156. Keep up to date

  157. http://www.perfplanet.com/

  158. Remember but

  159. Speed matters

  160. http://twitter.com/BastianHofmann http://lanyrd.com/people/BastianHofmann http://speakerdeck.com/u/bastianhofmann mail@bastianhofmann.de thanks, you can contact me on

    any of these platforms or via mail.