Web Performance - MercadoLibre

Techniques and tools we use at MELI to improve our app.

Nico Brizuela

July 29, 2013

Transcript

  1. Techniques, tips and tools to improve and measure web performance.

    Web Performance - MercadoLibre. Santiago Aimetta, Nicolas Brizuela
  2. Why performance?

    Reducing response time has a direct impact on your revenue. It also directly affects the bounce rate and the conversion rate, and it is very important for the user experience.
  3. Some numbers of Meli Search

    • 75MM searches/day (870 searches/second)
    • Peak traffic: 102k rpm (1,700 searches/second)
    • Avg response time: 320 ms
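The per-second figures on this slide follow from simple unit conversions. A minimal sketch of that arithmetic; the function names are illustrative and not from the deck:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def per_second_from_daily(count_per_day):
    """Convert a daily total into an average per-second rate."""
    return count_per_day / SECONDS_PER_DAY


def per_second_from_rpm(requests_per_minute):
    """Convert requests per minute into requests per second."""
    return requests_per_minute / 60


print(round(per_second_from_daily(75_000_000)))  # 868, which the slide rounds to 870
print(round(per_second_from_rpm(102_000)))       # 1700
```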
  4. • Amazon test, 2008: +100 ms >> -1% sales

    • Bing test, 2009: +2000 ms >> -4.3% revenue/user
    • MercadoLibre, 2013: +3000 ms >> +3% bounce rate, -1% revenue
  5. Performance golden rules

    • 80-90% of the end-user response time is spent on the frontend. We start there.
    • Greater potential
    • Simple
    • Proven to work
  6. What is this?

    It is the amount of time between the client making an HTTP request and the browser starting to receive the first byte of the response.
  7. Phases of a request

    1. DNS lookup
    2. Initial connection
    3. Waiting
    4. Receiving data
    5. Closing connection
    << Time to First Byte = TTFB
  8. • Static content

      ◦ Such as HTML, JS, CSS and images
      ◦ Should be under 100 milliseconds
    • Dynamic content
      ◦ Includes all the server-side processing plus the network infrastructure work
      ◦ Should be between 200 and 500 milliseconds
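One rough way to check a page against these targets is to time a request from a script. The sketch below uses only Python's standard `http.client`. The function name `measure_ttfb` is hypothetical, and the measurement is an approximation: `getresponse()` returns once the status line and headers have been read, which is slightly later than the very first byte.

```python
import http.client
import time


def measure_ttfb(host, port=80, path="/", timeout=5.0):
    """Return seconds from starting the request until the response headers arrive."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        conn.request("GET", path)      # resolves DNS, opens the TCP connection, sends the request
        response = conn.getresponse()  # blocks until the status line and headers are read
        elapsed = time.perf_counter() - start
        response.read()                # drain the body before closing
        return elapsed
    finally:
        conn.close()
```

Calling `measure_ttfb("www.example.com")` a few times and comparing the results, in seconds, with the 0.1 s (static) and 0.2-0.5 s (dynamic) targets gives a first impression. Browser developer tools break the same time down per phase.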
  9. Configuration check

    • Web server config (Apache, JBoss, ...) / PHP config
    • Database settings
    • Network settings
    • API / web services latency
  10. Hardware

    • CDN - Content delivery network (Akamai, CloudFront, BitGravity)
    • Multiple servers with load balancing (F5, nginx)
    • NAS - Filers (T-com, IBM, HP)
    • Web caches (Varnish, Polipo, Squid, TrafficServer)
  11. Software

    • Parallel processing
    • Database tuning
    • SQL tuning
    • API / web services response caching
    • NoSQL (MongoDB, Bigtable, Redis)
    • Chunking - Early buffer flush
  12. • A group of servers distributed in multiple datacenters across the internet

    • The CDN serves the content using the servers that are closest to the client
    • Network latency is reduced by the proximity between client and server
  13. • The resources can be cached

    • Multiple servers prevent bottlenecks
    • Useful for static resources like HTML, CSS, fonts, JS, videos, images, documents, etc.
  14. • A type of HTTP compression, like deflate

    • Saves bandwidth and increases speed
    • The web client (e.g. the browser) sends an Accept-Encoding: gzip, deflate header
    • The web server responds with Content-Encoding: gzip if the data is compressed
  15. • Reduces the response size by 70%-90%

    • Use it for HTML, CSS, JS, XML, JSON
    • Don't use it for PDF and images; they are already compressed
    • Better compression tips:
      ◦ Sort key values: CSS, HTML attributes
      ◦ Use one type of quotes, " or '
      ◦ CSS and JS minification
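The header exchange and the size reduction described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: `negotiate_gzip` is a hypothetical helper, it ignores `q=` weights in `Accept-Encoding`, and the sample markup is made up, so the saving it prints is not a measurement of a real page.

```python
import gzip


def negotiate_gzip(body, accept_encoding):
    """Compress the body only if the client advertised gzip support."""
    offered = {token.split(";")[0].strip().lower() for token in accept_encoding.split(",")}
    if "gzip" in offered:
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


# Repetitive markup, similar in spirit to a search result list
html = b"<ul>" + b'<li class="item">result</li>' * 500 + b"</ul>"
payload, headers = negotiate_gzip(html, "gzip, deflate")
saving = 1 - len(payload) / len(html)
print(headers, f"{saving:.0%} smaller")
```

Repetitive text is what makes the compression effective, which is why consistent quoting and sorted attributes help.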
  16. Benefits

    • Saves requests for resources that change infrequently.
    • HTTP caching stores the resources in the browser or in a proxy.
    • Should be cached: CSS, JS, static HTML, images, Flash, PDF, media files.
  17. Response Headers

    • Strong ones:
      ◦ These headers express the resource lifetime.
      ◦ The value is a date or a timestamp.
      ◦ The resource is downloaded again when the expiration date is reached.
      ◦ Expires and Cache-Control.
  18. Response Headers

    • Weak ones:
      ◦ Specify characteristics used to identify whether the resource has changed
      ◦ The browser sends conditional GETs to check
      ◦ Last-Modified, ETag
  19. Expires

    • Sets an expiration date in the future.
    • If Cache-Control and Expires are both set for the same resource, Cache-Control takes precedence.
    • e.g. Expires: Mon, 08 Jul 2013 21:31:12 GMT
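The precedence rule can be illustrated with a small helper that works out when a cached response stops being fresh. This is a sketch, not a full cache implementation: `expiry_time` is a hypothetical name, and of all the Cache-Control directives it only looks at `max-age`.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def expiry_time(headers, received_at):
    """Return when a cached response stops being fresh, or None if no lifetime is given."""
    for directive in headers.get("Cache-Control", "").split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            # Cache-Control takes precedence over Expires
            return received_at + timedelta(seconds=int(value))
    if "Expires" in headers:
        return parsedate_to_datetime(headers["Expires"])
    return None


received = datetime(2013, 7, 1, tzinfo=timezone.utc)
both = {"Cache-Control": "public, max-age=3600",
        "Expires": "Mon, 08 Jul 2013 21:31:12 GMT"}
print(expiry_time(both, received))  # one hour after `received`, not the Expires date
```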
  20. Last-Modified

    • It is a time-based header.
    • The application sets the Last-Modified header, e.g. Last-Modified: Tue, 09 Jul 2013 17:45:57 GMT
    • On the next request the browser sends a conditional GET asking whether the resource has changed, e.g. If-Modified-Since: Tue, 09 Jul 2013 17:45:57 GMT
    • If the resource hasn't changed, the server returns an empty response with the 304 code (Not Modified)
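A minimal sketch of the server side of this exchange, assuming the application knows the resource's last modification time as a UTC datetime without microseconds. The function name `conditional_get` and the tuple it returns are illustrative only.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_modified, if_modified_since, body):
    """Return (status, headers, body) for a GET that may carry If-Modified-Since."""
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        if last_modified <= parsedate_to_datetime(if_modified_since):
            return 304, headers, b""  # Not Modified: empty body
    return 200, headers, body


modified = datetime(2013, 7, 9, 17, 45, 57, tzinfo=timezone.utc)
first = conditional_get(modified, None, b"<html>...</html>")
second = conditional_get(modified, first[1]["Last-Modified"], b"<html>...</html>")
print(first[0], second[0])  # 200 304
```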
  21. ETag

    • Uses an MD5 hash to identify whether the resource has changed, e.g. ETag: "15f0fff99ed5aae4edffdd6496d7131f"
    • On the next request the If-None-Match header is sent with the ETag value, e.g. If-None-Match: "15f0fff99ed5aae4edffdd6496d7131f"
    • If the ETag matches, the server responds with 304
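The ETag flow can be sketched the same way. Here the tag is the MD5 hex digest of the response body wrapped in double quotes. The function names are hypothetical, and a real server would also handle lists of tags and weak validators.

```python
import hashlib


def make_etag(body):
    """Build a quoted ETag from the MD5 digest of the body."""
    return '"' + hashlib.md5(body).hexdigest() + '"'


def respond(body, if_none_match=None):
    """Return (status, headers, body), answering 304 when the client's tag still matches."""
    etag = make_etag(body)
    if if_none_match == etag:
        return 304, {"ETag": etag}, b""
    return 200, {"ETag": etag}, body


status, headers, _ = respond(b"body { color: red }")
print(status, respond(b"body { color: red }", headers["ETag"])[0])  # 200 304
```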
  22. Tools

    • Most browser tools have a network analyzer
    • The examples below were made with the Chrome dev tools
  23. Tips

    • For static content: use Cache-Control.
    • Cache-Control is easy to check.
    • Avoid conditional GETs.
    • Use the app version or a fingerprint in the URL.
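One common way to put a fingerprint in the URL is to embed a short hash of the file content in the file name, so the URL changes only when the content does. This is a sketch: `fingerprinted_url`, the 8-character digest length and the example path are assumptions, not details from the deck.

```python
import hashlib
import posixpath


def fingerprinted_url(path, content, length=8):
    """Insert a short content hash before the file extension."""
    digest = hashlib.md5(content).hexdigest()[:length]
    stem, ext = posixpath.splitext(path)
    return f"{stem}.{digest}{ext}"


url_v1 = fingerprinted_url("/static/app.js", b"console.log('v1');")
url_v2 = fingerprinted_url("/static/app.js", b"console.log('v2');")
print(url_v1, url_v2)  # two different URLs of the form /static/app.<hash>.js
```

Because a new release produces a new URL, the old one can be served with a long Cache-Control lifetime and never needs a conditional GET.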
  24. Tips

    • For private content: use Cache-Control: private to avoid proxy caching.
    • To prevent caching: use Cache-Control: no-cache, no-store.
    • URLs with query string.
  25. • Client and server keep the connection open unless the client indicates otherwise (via the Connection: close header).

    • HTTP connections are expensive.
    • Saves the TCP handshake (150 ms on average).
  26. • Persistent connections send multiple request and response interactions over a single connection.

    • If the connection is not persistent you can specify a timeout.
  27. Advantages

    • CPU and memory savings: fewer TCP connections and fewer TCP control blocks.
    • Allows request and response pipelining.
    • Reduces network load: fewer packets sent.
    • Supported by modern browsers.
  28. • Loading steps

      ◦ Downloading (can be parallel)
      ◦ Parsing
      ◦ Executing
    • Rules
      ◦ Scripts block other scripts from being downloaded and parsed
      ◦ Stylesheets block scripts from being downloaded and parsed
      ◦ Modern browsers look ahead in the document and preload stylesheets and scripts
  29. The HTTP/1.1 RFC A single-user client SHOULD NOT maintain more

    than 2 connections with any server or proxy.
  30. The HTTP/1.1 RFC A single-user client SHOULD NOT maintain more

    than 2 connections with any server or proxy.
  31. How do browsers handle it?

    • Browsers don't have to follow this guideline.
    • Parallel connections:
      ◦ IE 6 and 7: 2
      ◦ IE 8: 6
      ◦ IE 9: 6
      ◦ IE 10: 8
      ◦ Firefox 2: 2
      ◦ Firefox 3: 6
      ◦ Firefox 4 to 17: 6
      ◦ Opera 9.63: 4
      ◦ Opera 10: 8
      ◦ Opera 11 and 12: 6
      ◦ Chrome 1 and 2: 6
      ◦ Chrome 3: 4
      ◦ Chrome 4 to 23: 6
      ◦ Safari 3 and 4: 4
  32. Nice trick!

    • The limit on parallel connections applies per server.
    • Use multiple domain names
      ◦ e.g. resources1.domain.com, resources2.domain.com
      ◦ Expands the per-server connection limit.
      ◦ It also works if the domains are CNAMEs of the same IP!
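When spreading resources across several host names, it helps to map each resource to the same host every time; otherwise the browser cache sees two different URLs for one file. A minimal sketch using a stable checksum. The `shard_url` helper is hypothetical, and the host names are the ones from the slide.

```python
import zlib

SHARDS = ["resources1.domain.com", "resources2.domain.com"]


def shard_url(path, shards=SHARDS):
    """Pick a host deterministically from the resource path."""
    # crc32 is stable across processes, unlike the built-in hash() for strings
    index = zlib.crc32(path.encode("utf-8")) % len(shards)
    return f"http://{shards[index]}{path}"


print(shard_url("/img/logo.png"))
print(shard_url("/css/site.css"))
```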
  33. Trade-offs

    • DNS lookup: ~150 ms
    • Browser CPU per parallel download
    • Bandwidth
  34. Objective • Minimize the request overhead • Cut down on

    client request time by reducing the number of bytes uploaded as request header data • Average request size is 1500 bytes.
  35. How

    • Keeping cookies and request headers as small as possible helps an HTTP request fit into a single packet.
    • Small URLs.
    • Small cookies.
    • Remove unused headers.
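To see whether a request would fit in one packet, its serialized size can be compared with the TCP payload available per packet. The sketch below assumes a typical value of 1460 bytes (a 1500-byte Ethernet MTU minus 40 bytes of IPv4 and TCP headers). That figure, the helper names and the sample headers are assumptions for illustration.

```python
TYPICAL_MSS = 1460  # assumption: 1500-byte MTU minus 20 bytes IPv4 and 20 bytes TCP headers


def request_bytes(method, url_path, headers):
    """Size in bytes of a body-less HTTP/1.1 request as sent on the wire."""
    lines = [f"{method} {url_path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return len(("\r\n".join(lines) + "\r\n\r\n").encode("latin-1"))


def fits_in_one_packet(method, url_path, headers, mss=TYPICAL_MSS):
    return request_bytes(method, url_path, headers) <= mss


small = {"Host": "www.example.com", "Cookie": "sid=abc"}
bloated = dict(small, Cookie="tracking=" + "x" * 2000)
print(fits_in_one_packet("GET", "/", small))    # True
print(fits_in_one_packet("GET", "/", bloated))  # False
```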
  36. Static content

    • Objective:
      ◦ If you set a cookie on a particular domain, all subsequent HTTP requests for that domain must include the cookie.
      ◦ Static content, such as images, JS and CSS files, doesn't need to be accompanied by cookies.
      ◦ Avoid caching user info.
  37. Static content • How: ◦ Create a domain for static

    content ◦ Use caching headers ◦ CDNs avoid cookies
  38. Objective • Use the browser idle time to download or

    prefetch documents that the user might visit in the near future.
  39. Which content to prefetch?

    • Commonly used images.
    • The next page of the search results.
    • Prefetch common DNS.
  40. • Image: • Full page • DNS • Be aware

    of ◦ Bandwidth, website statistics
  41. Where and why?

    • In the head: RUM, analytics
    • Before </body>: scripts needed by page load
    • After page load: scripts needed soon after page load
    • On demand: in reaction to user actions
  42. Several ways to avoid it

    • XHR Eval
    • XHR Injection
    • Script in Iframe
    • Script DOM Element
    • Script Defer
  43. Avoid blocking - onload event

    • These block the onload event until the script has been downloaded and executed
      ◦ script defer
      ◦ script async
      ◦ script DOM element
    • Fix
      ◦ If you want to ensure that the JavaScript doesn't start to download or execute until after the load event, you can insert it using the window.onload event handler:
  44. Lossless optimizations

    • They take an image and produce another image that renders exactly the same but is smaller in file size than the original
    • The lossless file size savings come from:
      ◦ Using better compression algorithms to store the pixel information.
      ◦ Removing unneeded metadata that goes with the image file.
  45. GIF

    • The best way to optimize a GIF image is to convert it to PNG8.
    • PNG8 can store up to 256 colors, just like GIF.
    • PNG8 supports alpha transparency.
    • Software:
      ◦ Photoshop
      ◦ OptiPNG
    • Animated GIF
      ◦ Don't convert to PNG
      ◦ Software: GIFSicle
  46. JPEG

    • Edit image metadata
      ◦ Software: JPEGTran, EXIFTool
    • Optimizing compression
      ◦ Software: JPEGTran
    • Cropping
      ◦ Rotation by 90, 180 or 270 degrees
  47. PNG

    • For icons, illustrations and photos with high contrast.
    • Supports transparency (alpha channel).
    • Optimizations
      ◦ Strip PNG chunks
      ◦ Better pixel compression
    • Software
      ◦ TinyPNG
      ◦ OptiPNG
      ◦ PNGOptimizer
  48. • Stoyan's test over 1000 sites

      ◦ Convert GIFs to PNG (-23%)
      ◦ PNG optimization tools (-17%)
      ◦ Run JPEGTran on all JPEGs (-13%)
      ◦ Optimize animations with GIFSicle (-4%)
  49. Progressive JPEG

    • Two types of images, baseline and progressive
    • Baseline JPEG: a full-resolution, top-to-bottom scan of the image
  50. Progressive JPEG

    • Progressive JPEG: a series of scans of increasing quality; it loads from low quality to high in several "passes"
  51. Progressive JPEG

    • The progressive JPEG's first pass is low-resolution, but it contains as much information as the small image, or more
    • Software:
      ◦ jpegtran
      ◦ jpegcrop
    • Images with a file size of 10K and over have a better chance of being smaller when using the progressive JPEG format
  52. WebP

    • What is it?
      ◦ A new image format that provides lossless and lossy compression for images on the web
    • 26% smaller than PNG
    • 25-34% smaller than JPEG
    • Supports transparency (alpha channel)
  53. WebP

    • Software
      ◦ cwebp
      ◦ dwebp
      ◦ libwebp
    • Support
      ◦ Chrome 9+
      ◦ Opera 12+
      ◦ Android 4+
      ◦ Opera Mobile 11+
      ◦ Chrome for Android 27+
  54. Links

    • http://www.rackaid.com/resources/time-to-first-byte/
    • http://jackwhitey.hubpages.com/hub/cdn
    • https://developers.google.com/speed/articles/gzip
    • https://devcenter.heroku.com/articles/increasing-application-performance-with-http-cache-headers#http-cache-headers
    • https://developers.google.com/speed/docs/best-practices/caching
  55. Links

    • http://www.nczonline.net/blog/2009/06/23/loading-javascript-without-blocking/
    • http://www.stevesouders.com/blog/2009/04/27/loading-scripts-without-blocking/
    • http://www.stevesouders.com/blog/2008/12/27/coupling-async-scripts/
    • http://calendar.perfplanet.com/2010/the-truth-about-non-blocking-javascript/
  56. Links

    • http://www.catswhocode.com/blog/mastering-html5-prefetching
    • http://davidwalsh.name/dns-prefetching
    • http://davidwalsh.name/html5-prefetch
    • http://statichtml.com/2011/link-prefetching-broken-in-chrome.html
    • http://www.bookofspeed.com/chapter5.html
    • http://calendar.perfplanet.com/2012/progressive-jpegs-a-new-best-practice/
  57. Links

    • http://caniuse.com/webp
    • https://developers.google.com/speed/webp/?hl=es-ES
    • http://news.cnet.com/8301-1023_3-57585114-93/google-cuts-network-usage-by-terabytes-by-switching-to-webp/
    • https://developers.google.com/speed/spdy/
    • http://www.chromium.org/spdy
    • http://googlecode.blogspot.com.ar/2012/01/making-web-speedier-and-safer-with-spdy.html
  58. Links

    • http://dev.chromium.org/spdy/spdy-best-practices
    • http://www.slideshare.net/nzakas/enough-withthejavascriptalready
    • http://ejohn.org/blog/browser-page-load-performance/
    • http://www.stevesouders.com/blog/2008/03/20/roundup-on-parallel-connections/
    • http://www.yuiblog.com/blog/2007/04/11/performance-research-