Slide 1

Slide 1 text

My site is slow, what can I do? @BastianHofmann Profiling PHP applications

Slide 2

Slide 2 text

This talk is all about... ...

Slide 3

Slide 3 text

Speed. And with that I mean, not the drug, the movie or any game, but

Slide 4

Slide 4 text

Speed of your web application the ..

Slide 5

Slide 5 text

We'll talk about...

Slide 6

Slide 6 text

Why it matters and why you should care about it

Slide 7

Slide 7 text

How to measure it: what pagespeed actually is, and ...

Slide 8

Slide 8 text

How to find out where the problems are and ..

Slide 9

Slide 9 text

Before we start ...

Slide 10

Slide 10 text

A few words about me ...

Slide 11

Slide 11 text

I work at ResearchGate, the professional network for scientists and researchers

Slide 12

Slide 12 text

over 3 million users

Slide 13

Slide 13 text

here are some impressions of the page

Slide 14

Slide 14 text

you may have read about us in the news recently

Slide 15

Slide 15 text

have this, and also work on some cool stuff

Slide 16

Slide 16 text

have this, and also work on some cool stuff

Slide 17

Slide 17 text

have this, and also work on some cool stuff

Slide 18

Slide 18 text

we are hiring

Slide 19

Slide 19 text

Questions? Ask by the way, if you have any questions throughout this talk, if you don't understand something, just raise your hand and ask.

Slide 20

Slide 20 text

http://speakerdeck.com/u/bastianhofmann the slides will be available on speakerdeck

Slide 21

Slide 21 text

So speed…

Slide 22

Slide 22 text

Why should you care?

Slide 23

Slide 23 text

It is really important of course because ...

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

but seriously, in recent years multiple studies have examined how important speed is for a web application. how does it affect usage and conversion? how long do people wait for content? how does it affect sales? if people leave because the site is slow, do they come back?

Slide 26

Slide 26 text

Every ms counts in short, the result of every study was the same: ...

Slide 27

Slide 27 text

in detail

Slide 28

Slide 28 text

in detail

Slide 29

Slide 29 text

..

Slide 30

Slide 30 text

So what is pagespeed?

Slide 31

Slide 31 text

Server the first thing that contributes to pagespeed is what happens on the server. this is also the easiest part, because it is completely under our control

Slide 32

Slide 32 text

Your PHP application Request Response so what your server and your application are doing between the incoming request and the outgoing response

Slide 33

Slide 33 text

Your PHP application Request Response Load balancer though this does not mean only your application, but also the rest of your setup, like a load balancer

Slide 34

Slide 34 text

Your PHP application Request Response Load balancer and also your application is probably not a single small php script, but a big application with multiple components that can each affect speed differently. so getting a more detailed view of these components might be interesting as well. more on that later

Slide 35

Slide 35 text

web server db http service http service cache user request additionally most bigger applications have some kind of a service oriented architecture, same things apply here. knowing about the speed of the different services is important.

Slide 36

Slide 36 text

But there is more ... your application does not stop at your server. somehow it needs to get to your user

Slide 37

Slide 37 text

so internet connectivity is also a big part, contributing to pagespeed, that means everything from dns lookup, over ssl handshake to actually transporting the content over the wire

Slide 38

Slide 38 text

once your user has received the content, the browser needs to display it. and some browsers are way slower than others at doing that

Slide 39

Slide 39 text

so the dom needs to be rendered, css fetched and applied, images loaded

Slide 40

Slide 40 text

and of course nearly no web application comes without javascript. this needs to be loaded and executed as well

Slide 41

Slide 41 text

http://www.stevesouders.com/blog/2012/02/10/the-performance-golden-rule/

Slide 42

Slide 42 text

So what happens on your server is really important

Slide 43

Slide 43 text

But the rest is as well my point is: what is important is the pagespeed your user perceives. this covers everything from the server to their browser. in the end it's your fault if the site is slow, even if the user's computer and browser are crappy.

Slide 44

Slide 44 text

Step 1 how to deal with this...

Slide 45

Slide 45 text

Measuring measuring your actual pagespeed

Slide 46

Slide 46 text

who is doing it?

Slide 47

Slide 47 text

if you start: it's going to hurt

Slide 48

Slide 48 text

because although everything seems to be fine on your fast, 2 month old machine with lots of ram and cpu power and the latest chrome, on your 50mbit vdsl connection with a very low ping to your data center ...

Slide 49

Slide 49 text

reality is: Old computers

Slide 50

Slide 50 text

Old feature phones, slower smart phones, mobile networks in general (edge anyone?)

Slide 51

Slide 51 text

old browsers

Slide 52

Slide 52 text

people in countries with big latencies to your datacenter and/or slow internet connections (remember dial-up?)

Slide 53

Slide 53 text

So you need to measure at your users' side

Slide 54

Slide 54 text

Navigation Timing API https://developer.mozilla.org/en-US/docs/Navigation_timing there is actually a great javascript api that is supported by a lot of modern browsers

Slide 55

Slide 55 text

you get timestamps for all important events of a pageload
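A minimal sketch of how those timestamps turn into durations. The field names (navigationStart, responseStart, domContentLoadedEventEnd, loadEventEnd) are attributes of window.performance.timing; the metric names backend, domReady and complete and their mapping to these fields are assumptions for illustration, not necessarily what the talk's setup used.

```javascript
// Sketch: derive page-load durations from a Navigation Timing record.
// `timing` has the shape of window.performance.timing (millisecond timestamps).
function pageLoadMetrics(timing) {
  const start = timing.navigationStart;
  return {
    // time until the first byte of the response arrived (assumed "backend" metric)
    backend: timing.responseStart - start,
    // time until the DOM was parsed and DOMContentLoaded handlers finished
    domReady: timing.domContentLoadedEventEnd - start,
    // time until the load event finished
    complete: timing.loadEventEnd - start
  };
}

// In a browser, after the load event: pageLoadMetrics(window.performance.timing)
// Hand-made sample timestamps so the sketch also runs outside a browser:
const sampleTiming = {
  navigationStart: 1000,
  responseStart: 1123,
  domContentLoadedEventEnd: 1580,
  loadEventEnd: 1890
};
console.log(pageLoadMetrics(sampleTiming));
```

The function takes the timing object as a parameter instead of reading the global, so the same code can be fed values collected some other way in browsers without the API.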

Slide 56

Slide 56 text

For older browsers you have to do it yourself though ... e.g. by manually measuring the time with javascript and on the server. this is actually kind of hard (clock offsets, etc)

Slide 57

Slide 57 text

https://github.com/lognormal/boomerang use something from people who have already done this for you

Slide 58

Slide 58 text

so we want to get a graph like this

Slide 59

Slide 59 text

Getting it back to the server so now you have all the timestamps in your javascript, you need to ...

Slide 60

Slide 60 text

Tracking request

Slide 61

Slide 61 text

https://tracking.example.com/?page=profile&backend=123&complete=890&domReady=580
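A rough sketch of how such a tracking URL could be assembled by hand. boomerang does this for you; the helper name buildTrackingUrl is hypothetical, and the endpoint and parameter names are simply taken from the example URL.

```javascript
// Sketch: turn collected timings into a tracking-pixel URL.
// buildTrackingUrl is a hypothetical helper, not part of boomerang.
function buildTrackingUrl(endpoint, page, metrics) {
  const params = [["page", page]].concat(Object.entries(metrics));
  const query = params
    .map(([key, value]) => encodeURIComponent(key) + "=" + encodeURIComponent(value))
    .join("&");
  return endpoint + "?" + query;
}

const trackingUrl = buildTrackingUrl("https://tracking.example.com/", "profile", {
  backend: 123,
  complete: 890,
  domReady: 580
});
console.log(trackingUrl);

// In a browser, requesting the URL as an image sends the data to the server.
// Outside a browser there is no Image constructor, so this step is skipped.
if (typeof Image !== "undefined") {
  new Image().src = trackingUrl;
}
```

Using an image request keeps the tracking call fire-and-forget: the page does not need to read the response.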

Slide 62

Slide 62 text

logstash http://logstash.net/ enter the next tool, logstash is a very powerful tool to handle log processing

Slide 63

Slide 63 text

input filter output basic workflow is that you have some input where logstash gets log messages in, on this input you can execute multiple filters that modify the message and then you can output the filtered message somewhere

Slide 64

Slide 64 text

Very rich plugin system to do this it offers a very large and rich plugin system for all kinds of inputs, filters and outputs, and you can also write your own

Slide 65

Slide 65 text

browser JS: boomerang logstash trackingServer access log requests tracking image with timing information as query parameters for our purpose we can just have boomerang in the browser collect the timestamps and then send a small tracking request (inserting an image) to a tracking server. the timestamps are added as query parameters to this request. the server only returns an empty image and logs the request to its access log, which logstash can parse.

Slide 66

Slide 66 text

Graphing it we want to...

Slide 67

Slide 67 text

Graphite http://graphite.wikidot.com/ again there are many tools available to collect and display these metrics: one i want to highlight is graphite

Slide 68

Slide 68 text

graphite comes with a powerful interface where you can plot and aggregate this data into graphs and perform different mathematical functions on it to get it exactly the way you want to display your data

Slide 69

Slide 69 text

browser JS: boomerang logstash trackingServer access log requests tracking image with timing information as query parameters graphite statsd and statsd is a small aggregation daemon in front of it. so this is your setup: logstash sends these timings to statsd, which aggregates them and sends them on to graphite

Slide 70

Slide 70 text

input {
  file {
    type => "pagespeed-access"
    path => [ "/var/log/nginx/access_log/monitoring-access.log" ]
  }
}
in logstash this is how to get the data from the log into logstash

Slide 71

Slide 71 text

filter {
  grok {
    type => "pagespeed-access"
    pattern => "^.*\s\"[A-Z]+\s[^\?\s]+\?page=%{DATA:page}\&connectTime=%{NUMBER:connectTime}...)?\sHTTP\/\d\.\d\".*$"
  }
  grok {
    type => "pagespeed-access"
    match => ["page", "^(profile|home|...)\.logged(In|Out)$"]
    exclude_tags => ["_grokparsefailure"]
  }
}
you can apply filters to put it into a structured form and validate it

Slide 72

Slide 72 text

output {
  statsd {
    type => "pagespeed-access"
    exclude_tags => ["_grokparsefailure"]
    host => "localhost"
    port => 8126
    namespace => "pagespeed"
    sender => ""
    timing => [ "%{page}.connect", "%{connectTime}", ... ]
  }
}
and then put the data somewhere else. here we are sending it to statsd. what's that?
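To make the pipeline less abstract, here is a sketch of what the grok and statsd steps do conceptually for a single request line: pull the query parameters out, validate the page name, and emit StatsD timer lines of the form bucket:value|ms. This is not how logstash is implemented. The function name, the page whitelist and the way the bucket name is composed are assumptions for illustration, and logstash's real metric naming may differ.

```javascript
// Sketch: one access-log request line in, StatsD timer lines out.
// PAGE_PATTERN is a hypothetical whitelist in the spirit of the second grok filter.
const PAGE_PATTERN = /^(profile|home)\.logged(In|Out)$/;

function toStatsdTimings(requestLine, namespace) {
  // requestLine looks like: GET /?page=...&connectTime=... HTTP/1.1
  const match = /^[A-Z]+\s[^?\s]+\?(\S*)\sHTTP\/\d\.\d$/.exec(requestLine);
  if (!match) return [];
  const params = new URLSearchParams(match[1]);
  const page = params.get("page");
  if (!page || !PAGE_PATTERN.test(page)) return []; // drop unknown pages
  const lines = [];
  for (const [key, value] of params) {
    // keep only integer millisecond values, skip the page name itself
    if (key === "page" || !/^\d+$/.test(value)) continue;
    // StatsD timer line: <bucket>:<value>|ms
    lines.push(namespace + "." + page + "." + key + ":" + value + "|ms");
  }
  return lines;
}

console.log(toStatsdTimings(
  "GET /?page=profile.loggedIn&connectTime=42&domReady=580 HTTP/1.1",
  "pagespeed"
));
```

Validating the page name before emitting anything matters because every distinct bucket name becomes its own metric in graphite, and the query string is user-controlled.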

Slide 73

Slide 73 text

and that's what such graphs may then look like

Slide 74

Slide 74 text

Can we measure more? I said earlier we may need information about services etc

Slide 75

Slide 75 text

Load balancer in a service-oriented architecture we can do something similar with the access logs of our services, which also contain timing information. and if there is a load balancer (like haproxy) in between, we can get useful information from its logs as well

Slide 76

Slide 76 text

input {
  file {
    type => "haproxy-http-log"
    path => [ "/var/log/haproxy-http.log*" ]
  }
}
example config

Slide 77

Slide 77 text

filter {
  grok {
    type => "haproxy-http-log"
    pattern => "%{HAPROXYHTTP}"
  }
  mutate {
    type => "haproxy-http-log"
    gsub => [
      "server_name", "\.", "_",
      "client_ip", "\.", "_"
    ]
  }
}
example config

Slide 78

Slide 78 text

output {
  statsd {
    type => "haproxy-http-log"
    exclude_tags => ["_grokparsefailure"]
    host => "localhost"
    port => 8125
    namespace => "lb"
    sender => ""
    increment => [
      "haproxy.%{backend_name}.%{server_name}.%{client_ip}.hits",
      "haproxy.%{backend_name}.%{server_name}.%{client_ip}.responses.%{http_status_code}"
    ]
    timing => [
      "haproxy.%{backend_name}.%{server_name}.%{client_ip}.time_request", "%{time_request}",
      "haproxy.%{backend_name}.%{server_name}.%{client_ip}.time_backend_connect", "%{time_backend_connect}",
      "haproxy.%{backend_name}.%{server_name}.%{client_ip}.time_backend_response", "%{time_backend_response}",
      "haproxy.%{backend_name}.%{server_name}.%{client_ip}.time_queue", "%{time_queue}",
      "haproxy.%{backend_name}.%{server_name}.%{client_ip}.time_duration", "%{time_duration}"
    ]
  }
}
example config

Slide 79

Slide 79 text

browser JS: boomerang logstash trackingServer access log requests tracking image with timing information as query parameters graphite statsd logstash load balancer access log logstash service access log logstash can analyse these logs and send them to statsd as well

Slide 80

Slide 80 text

From within your PHP app What is also useful is to measure certain things from within your php app, e.g. rendering time, the time database requests took, or the time spent in certain business logic. you can either just log this to a file and use the same logstash mechanism, or, if you just need it for debugging, do it differently. more on that later.

Slide 81

Slide 81 text

More fine grained metrics

Slide 82

Slide 82 text

By pages but you should not only measure all your requests together, you should differentiate by...

Slide 83

Slide 83 text

By browser

Slide 84

Slide 84 text

By country

Slide 85

Slide 85 text

Logged in / out

Slide 86

Slide 86 text

Heavy users .. or just everything that makes sense for you

Slide 87

Slide 87 text

Define goals and once you have measured everything and see that you are slow somewhere, you should define goals: what performance do you want to reach?

Slide 88

Slide 88 text

Step 2 but before you can start fixing stuff, ...

Slide 89

Slide 89 text

How to find out where the problems are finding out ...

Slide 90

Slide 90 text

Profiling you can do this through ...

Slide 91

Slide 91 text

the first tool useful for this is ... xdebug has quite a few features, like breakpoints in your code, nicer error displays and so on. but one of them is also profiling your app

Slide 92

Slide 92 text

xdebug.profiler_enable_trigger = 1 http://url?XDEBUG_PROFILE you can either activate profiling for every request or selectively for all requests that have a GET, POST or COOKIE parameter called XDEBUG_PROFILE

Slide 93

Slide 93 text

Webgrind https://github.com/jokkedk/webgrind this writes so-called cachegrind files. to view them you can use tools like kcachegrind or the easiest one ...

Slide 94

Slide 94 text

you can see everything that happened in this request: every function that was invoked, how often it was called and how long it took.

Slide 95

Slide 95 text

DEMO data/callgrind.example.out.1087

Slide 96

Slide 96 text

Use it locally on your dev machine one thing with xdebug, it slows down php, so .. but not in production

Slide 97

Slide 97 text

XHProf https://github.com/facebook/xhprof for production there is xhprof, developed by facebook

Slide 98

Slide 98 text

Use it in production for a subset of requests you can safely use it in production, it comes with a performance overhead but only when used, so you can activate it, but only use for a small percentage of requests or when manually activated (e.g. by a cookie).
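The decision of which requests to profile is independent of the profiler itself. A minimal sketch of that decision, written in JavaScript for illustration only: the cookie name "profile" and the function name shouldProfile are hypothetical, and in the PHP app a positive result is the point where the profiler would be switched on.

```javascript
// Sketch: decide per request whether to profile it.
// `cookies` is a plain object of request cookies; the cookie name is hypothetical.
// `random` is injectable so the decision can be tested deterministically.
function shouldProfile(cookies, sampleRate, random = Math.random) {
  if (cookies.profile === "1") return true; // manual activation, e.g. by a developer
  return random() < sampleRate;              // 0.01 means roughly 1% of requests
}

// Simulate 10,000 requests at a 1% sample rate.
let profiledCount = 0;
for (let i = 0; i < 10000; i++) {
  if (shouldProfile({}, 0.01)) profiledCount++;
}
console.log("profiled " + profiledCount + " of 10000 requests");
```

Random sampling keeps the overhead spread evenly over real traffic, while the cookie lets you profile one specific request on demand.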

Slide 99

Slide 99 text

XHGUI https://github.com/perftools/xhgui to display the xhprof profiles, there is a nice tool called xhgui

Slide 100

Slide 100 text

in addition to the normal things like seeing

Slide 101

Slide 101 text

the whole callstack

Slide 102

Slide 102 text

you can also visualize this in a graph

Slide 103

Slide 103 text

and can do analysis over multiple requests and compare them to each other

Slide 104

Slide 104 text

DEMO http://localhost:8080/xhgui/webroot/

Slide 105

Slide 105 text

php-meminfo https://github.com/BitOne/php-meminfo

Slide 106

Slide 106 text

DEMO

Slide 107

Slide 107 text

Symfony Debug Toolbar i said earlier that there is another good way to get information about your application's internals, especially if you only need it for debugging and not in a graph. this is with the..

Slide 108

Slide 108 text

it comes with standard symfony and you probably all have seen it already

Slide 109

Slide 109 text

you can click on it and it gives you nice detailed information about the request. stuff like doctrine queries, a nice timeline, exceptions, routing, events etc.

Slide 110

Slide 110 text

DEMO http://localhost:8080/profiling_talk/web/app_dev.php

Slide 111

Slide 111 text

Extend it http://symfony.com/doc/current/cookbook/profiler/data_collector.html but did you know that you can extend it? there are some good ready made extensions available, e.g. for caching, http calls, versioning etc. just check packagist, but you can also write your own easily.

Slide 112

Slide 112 text

here are some examples how we at researchgate extended it (disclaimer: we are not even using full symfony, but only some components).

Slide 113

Slide 113 text

No content

Slide 114

Slide 114 text

No content

Slide 115

Slide 115 text

No content

Slide 116

Slide 116 text

No content

Slide 117

Slide 117 text

No content

Slide 118

Slide 118 text

DEMO

Slide 119

Slide 119 text

Step 3 now that you have all this debugging information to pinpoint your bottlenecks, let's get to ...

Slide 120

Slide 120 text

Fix it

Slide 121

Slide 121 text

That's something you have to do unfortunately ... since it is very dependent on your application and your setup

Slide 122

Slide 122 text

No content

Slide 123

Slide 123 text

Keep up to date

Slide 124

Slide 124 text

http://www.perfplanet.com/

Slide 125

Slide 125 text

Remember but

Slide 126

Slide 126 text

Speed matters

Slide 127

Slide 127 text

Some hints on better performance

Slide 128

Slide 128 text

The obvious Of course you have the obvious things like

Slide 129

Slide 129 text

GZIP

Slide 130

Slide 130 text

CDNs

Slide 131

Slide 131 text

DB Indexes

Slide 132

Slide 132 text

Minify JS https://github.com/mishoo/UglifyJS

Slide 133

Slide 133 text

Minify CSS http://sass-lang.com/

Slide 134

Slide 134 text

Concatenate JS/ CSS/...

Slide 135

Slide 135 text

Correct caching headers

Slide 136

Slide 136 text

Opcache

Slide 137

Slide 137 text

Data Caching

Slide 138

Slide 138 text

APCU

Slide 139

Slide 139 text

memcached

Slide 140

Slide 140 text

PHP 5.5 https://blog.asmallorange.com/2013/08/php-roadmap-performance/

Slide 141

Slide 141 text

The not so well known

Slide 142

Slide 142 text

SPDY

Slide 143

Slide 143 text

Minimize redirects

Slide 144

Slide 144 text

Check image compression https://github.com/gruntjs/grunt-contrib-imagemin

Slide 145

Slide 145 text

Compress HTML

Slide 146

Slide 146 text

Check YSlow, Pagespeed

Slide 147

Slide 147 text

DNS prefetch

Slide 148

Slide 148 text

Slide 149

Slide 149 text

Move logic to async workers

Slide 150

Slide 150 text

The "crazy" stuff

Slide 151

Slide 151 text

Varnish

Slide 152

Slide 152 text

ESI

Slide 153

Slide 153 text

Profile Publications Publication Publication Publication AboutMe LeftColumn Image Menu Institution

Slide 154

Slide 154 text

Profile Publications Publication Publication Publication AboutMe LeftColumn Image Menu Institution because every component has its own url, you can just render out an esi placeholder instead of the widget to tell varnish to fetch it separately and, provided it has caching headers, get it out of the cache
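A small sketch of the placeholder idea. The tag is the standard ESI include element; the widget object and the renderWidget function are hypothetical and only stand in for whatever component system the application uses. Widget URLs are assumed to be trusted, so they are not escaped here.

```javascript
// Sketch: render either the widget itself or an ESI placeholder for it.
// renderWidget and the widget shape are hypothetical.
function renderWidget(widget, useEsi) {
  if (useEsi) {
    // The cache in front of the app replaces this tag with the
    // (possibly cached) response for widget.url.
    return '<esi:include src="' + widget.url + '"/>';
  }
  return widget.render();
}

const aboutMeWidget = { url: "/aboutMe", render: () => "<div>About me ...</div>" };
console.log(renderWidget(aboutMeWidget, true));
console.log(renderWidget(aboutMeWidget, false));
```

ESI processing has to be switched on in Varnish for the surrounding page, otherwise the tag is passed to the browser untouched.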

Slide 155

Slide 155 text

Load content asynchronously

Slide 156

Slide 156 text

Profile Publications Publication Publication Publication AboutMe LeftColumn Image Menu Institution

Slide 157

Slide 157 text

Profile Publications Publication Publication Publication AboutMe LeftColumn Image Menu Institution
loadWidget('/aboutMe', function(w) {
  w.render({ replace : '#placeholder' });
})
so instead of rendering the widget you render a placeholder dom element and a script tag that loads the widget with an ajax request and then renders it on the client side

Slide 158

Slide 158 text

Flush content early

Slide 159

Slide 159 text

Move logic to shutdown handlers

Slide 160

Slide 160 text

http://www.php.net/manual/en/function.register-shutdown-function.php

Slide 161

Slide 161 text

http://www.php.net/manual/en/function.fastcgi-finish-request.php

Slide 162

Slide 162 text

Promises / Futures https://github.com/facebook/libphutil

Slide 163

Slide 163 text

$future = new HTTPFuture('http://www.example.com/');
list($status, $body, $headers) = $future->resolve();

Slide 164

Slide 164 text

pushState and while you are at it: when the user switches pages, you can also load just the differences between the two pages and use pushState to change the url (if supported) to make your app faster

Slide 165

Slide 165 text

Bigpipe

Slide 166

Slide 166 text

if you look at your widget tree you can usually identify larger parts, which are widgets themselves

Slide 167

Slide 167 text

Profile Menu Header LeftColumn RightColumn like this. so what you can do to dramatically improve the perceived load time is prioritize the rendering

Slide 168

Slide 168 text

so our http request looks like this: first you compute and render the important parts of the page, like the top menu and the profile header, as well as the rest of the layout. for the left column and right column, which are expensive to compute, you just render placeholders. then you flush the content to the client so that the browser can already render it

Slide 169

Slide 169 text

still in the same http request you render out the javascript needed to make the already rendered components work, so people can use the menu for example

Slide 170

Slide 170 text

still in the same http request you then compute the data for the left column and render out some javascript that takes this data, renders it into the component's template client side and then replaces the placeholder with the rendered template

Slide 171

Slide 171 text

still in the same request you then can do this with the right column -> flush content as early as possible, don't wait for the whole site to be computed
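The flush order described on the last few slides can be sketched as a generator that yields one chunk per flush. This is only an illustration of the ordering, not ResearchGate's implementation: bigPipeChunks, toScriptJson and the client-side fillPlaceholder function are hypothetical names, and a real server would write each chunk to the response and flush it instead of logging it.

```javascript
// Sketch: produce the chunks of a BigPipe-style response in flush order.

// Serialize data for embedding in a script tag; escaping "<" prevents the
// data from closing the script element early.
function toScriptJson(value) {
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

function* bigPipeChunks(layoutHtml, widgets) {
  // 1. layout with the cheap parts rendered and placeholders for the rest
  yield layoutHtml;
  // 2. one script chunk per expensive widget, as soon as its data is computed
  for (const widget of widgets) {
    const data = widget.compute(); // the expensive part
    // fillPlaceholder is a hypothetical client-side function
    yield "<script>fillPlaceholder(" +
      toScriptJson(widget.placeholderId) + ", " +
      toScriptJson(data) + ");</script>";
  }
  // 3. close the document once every widget has been sent
  yield "</body></html>";
}

const pageLayout = '<html><body><div id="menu">Menu</div>' +
  '<div id="left"></div><div id="right"></div>';
const expensiveWidgets = [
  { placeholderId: "left", compute: () => ({ title: "LeftColumn" }) },
  { placeholderId: "right", compute: () => ({ title: "RightColumn" }) }
];
for (const chunk of bigPipeChunks(pageLayout, expensiveWidgets)) {
  console.log(chunk); // a real server would write and flush here
}
```

Because the generator only computes a widget when its chunk is requested, nothing expensive runs before the layout has been handed to the client.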

Slide 172

Slide 172 text

https://joind.in/10710

Slide 173

Slide 173 text

http://twitter.com/BastianHofmann http://lanyrd.com/people/BastianHofmann http://speakerdeck.com/u/bastianhofmann [email protected] thanks, you can contact me on any of these platforms or via mail.