Performance: Not an Afterthought [DrupalSouth 2015]

As Drupal continues to experience huge growth with government and enterprise clients, the scale and complexity of Drupal implementations also grow. A common issue affecting these bigger projects is poor website performance. Problems of this nature have huge and costly impacts on the customer: lost sales and advertising revenue, loss of consumer confidence and brand legitimacy, SEO penalties, increased hosting infrastructure costs - the list goes on. Not to mention the developers tasked with fixing the problem! The stakes are high, the pressure is on, and the technical challenges are real - not a position many of us would choose to be in.

Nick Santamaria

March 06, 2015

Transcript

  1. Nick Santamaria
    • Senior Developer at Technocrat
    • Acquia Certified Backend Developer
    • drupal.org: http://drupal.org/user/87915
    • Twitter: @nicksanta
    • GitHub: nicksantamaria
  2. Presentation Outline
    • Introduction to performance & scalability
    • Common problems
    • Strategies for success
    • Infrastructure design and considerations
    • Debugging performance and scalability issues
    • Q&A and discussion
  3. Performance & Scalability
    • Performance: the speed with which a single request can be executed.
    • Scalability: the ability of a request to maintain its performance under increasing load.
  4. What is Performance? Back-end Performance Components
    • PHP
      • Amount of code being executed (i.e. number of modules)
      • Efficiency of code
    • Database
      • Schema design
      • Query execution time
  5. What is Performance? Back-end Performance Components
    • API Requests
      • PHP will wait until the request returns a result or times out
    • Caching
      • Drupal database
      • Memcached / Redis / MongoDB
      • Varnish
  6. What is Performance? Front-end Performance Components
    • Network Overhead
      • Local vs offshore datacenters
      • Number of requests
    • Payload Size
      • Image optimisation
      • CSS / JS minification
      • Markup size & compression
  7. What is Performance? Front-end Performance Components
    • JavaScript
      • Number of scripts being included
      • Synchronous vs asynchronous execution
      • Code efficiency
  8. What is Scalability?
    “Why is scalability so hard? Because scalability cannot be an after-thought.” - Werner Vogels, Amazon CTO
  9. What is Scalability?
    A system is said to be scalable if adding resources results in proportionally increased performance.
    Nine women cannot make a baby in one month.
    Will doubling your site’s server resources double the traffic it can handle?
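
One rough way to reason about the question above is an Amdahl's-law style estimate. The talk does not mention this model; it is brought in here only as an illustration. The sketch assumes that some fraction of the work per request benefits from the extra resources (for example, PHP on additional web servers) while the rest stays bound to a component that was not scaled (for example, a single database). The function and parameter names are hypothetical.

```python
def capacity_gain(scalable_fraction: float, resource_multiplier: float) -> float:
    """Amdahl-style estimate of the overall gain when only part of the work scales.

    scalable_fraction: share of the work that benefits from extra resources (0..1).
    resource_multiplier: how many times more resources are added (2 = doubled).
    """
    if not 0.0 <= scalable_fraction <= 1.0:
        raise ValueError("scalable_fraction must be between 0 and 1")
    if resource_multiplier <= 0:
        raise ValueError("resource_multiplier must be positive")
    fixed_fraction = 1.0 - scalable_fraction
    return 1.0 / (fixed_fraction + scalable_fraction / resource_multiplier)


if __name__ == "__main__":
    # Doubling resources only doubles capacity when all of the work scales.
    for fraction in (1.0, 0.8, 0.5):
        gain = capacity_gain(fraction, 2)
        print(f"{fraction:.0%} scalable, 2x resources -> {gain:.2f}x")
```

Under these assumptions, a site where only half of the work scales gains about a third more capacity from doubled resources, which is why the unscaled component is usually the first thing to look for.
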
  10. What is Scalability? Scalability Components
    • Caching
      • Block cache
      • Page cache
      • Reverse proxy cache
      • Opcode caching
    • Infrastructure
      • Web server load balancing
      • Database clustering
      • Caching backends - Redis, Memcached, etc.
  11. Common Problems: Too many modules - AKA “Open Buffet Syndrome”
    Real life example
    • 365 enabled modules
    • 24 core modules
    • 51 custom modules
    • 72 exported features
    • 750 files loaded on every request.
    • 10 - 20% of PHP execution time was loading files, even with APC.
    • CPU cycles wasted - 25,000+ calls to module_implements() per request.
  12. Common Problems: Anonymous users with sessions
    Seems innocent, but this one line has consequences.
    • Pages with product/* paths are NEVER cached.
    • Anonymous users who visit this page bypass page cache on all subsequent pages.
    • … AND those visitors write to the database on every subsequent page view.
  13. Common Problems: Complicated entity & field architecture
    • Slows down form submission, rendering, views, and more.
  14. Strategies for Success: Complicated entity & field architecture
    • How many INSERT queries per save?
      • node
      • node_revision
      • field_collection_item
      • field_collection_item_revision
      • field_data_field_collection_b
      • field_revision_field_collection_b
      • field_data_field_taxonomy_ref
      • field_revision_field_taxonomy_ref
      • field_data_field_collection_c
      • field_revision_field_collection_c
      • field_data_field_text
      • field_revision_field_text
      • file_managed
      • field_data_field_media
      • field_revision_field_media
    Real world field collection implementations are FAR more complicated than this example!
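
The table list on this slide follows a pattern that can be counted mechanically. The sketch below assumes Drupal 7's default SQL field storage, where each field writes to both a field_data_* and a field_revision_* table, and it treats every table as one INSERT, which is a lower bound because multi-value fields write several rows. The helper name is hypothetical; the table and field names are the ones from the slide.

```python
from typing import List


def tables_written(entity_tables: List[str], field_names: List[str]) -> List[str]:
    """List the tables touched by one save: entity tables plus two per field."""
    tables = list(entity_tables)
    for name in field_names:
        tables.append(f"field_data_{name}")      # current values
        tables.append(f"field_revision_{name}")  # revision history
    return tables


# Entity-level tables and fields taken from the slide's example.
ENTITY_TABLES = [
    "node",
    "node_revision",
    "field_collection_item",
    "field_collection_item_revision",
    "file_managed",
]
FIELDS = [
    "field_collection_b",
    "field_taxonomy_ref",
    "field_collection_c",
    "field_text",
    "field_media",
]

if __name__ == "__main__":
    tables = tables_written(ENTITY_TABLES, FIELDS)
    print(f"At least {len(tables)} INSERT queries for one save")
```

Every additional field adds two more writes, so nested field collections multiply the cost of a single save quickly.
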
  15. Common Problems: Others
    • Never use views_php module - create custom views handlers and plugins.
    • Complex faceted search using Drupal database - use Solr.
    • dblog module enabled on production - use syslog.
    • Carefully consider use of modules with node access functionality - they disable block caching.
  17. Strategies for Success: On-Demand Cache Purging
    • Planning
      • Divide the site into page “types”.
      • For each type, build a list of events which would require a page to be cleared from cache.
    • Considerations
      • No relative dates, i.e. “time ago”.
      • Some page types may be more suited to periodic caching.
      • Create a spidering script to warm the caches!
      • Extend to other caches using CacheTags - drupal.org/project/cachetags
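
The planning step above amounts to a lookup table from page types to the events that invalidate them. The following is a minimal sketch of that idea, not the behaviour of any Drupal module: the page types, event names, URL patterns and function name are all hypothetical placeholders.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical plan: page type -> (events that invalidate it, path builder).
PurgePlan = Dict[str, Tuple[Set[str], Callable[[dict], List[str]]]]

PLAN: PurgePlan = {
    "article": (
        {"node_update", "comment_insert"},
        lambda ctx: [f"/node/{ctx['nid']}"],
    ),
    "section_listing": (
        {"node_insert", "node_update", "node_delete"},
        lambda ctx: [f"/section/{ctx['section']}"],
    ),
    "front_page": (
        {"node_insert", "node_delete"},
        lambda ctx: ["/"],
    ),
}


def paths_to_purge(event: str, ctx: dict, plan: PurgePlan = PLAN) -> List[str]:
    """Return every path whose page type lists this event as a purge trigger."""
    paths: List[str] = []
    for events, build_paths in plan.values():
        if event in events:
            paths.extend(build_paths(ctx))
    return paths


if __name__ == "__main__":
    context = {"nid": 42, "section": "news"}
    print(paths_to_purge("node_update", context))
```

The same list of paths can be handed to a spidering script afterwards, so that each purged page is requested once and the cache is warm before the next visitor arrives.
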
  18. Strategies for Success: Authcache (2.x branch)
    • Replaces Drupal’s default page caching, allowing you to cache authenticated pages.
    • Huge scalability improvements for sites with a large proportion of authenticated visitors.
    • But also much, much more.
      • Personalisation - authcache_p13n
      • Form token magic - authcache_form
      • Store page cache in Varnish - authcache_varnish
      • Integrates with Cache Expiration
  19. Strategies for Success: Authcache
    • Planning
      • Define which page types are cacheable.
      • Design how you will segment your visitors (from a cache perspective).
      • Identify all personalised information which must be displayed.
    • Considerations
      • Forms can be tricky - ensure you test thoroughly.
      • Ensure your analytics / marketing / tracking services are compatible.
      • See Commerce Kickstart for a great out-of-the-box implementation.
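
Segmenting visitors "from a cache perspective" means deciding which visitors may share one cached copy of a page. The sketch below illustrates one possible scheme, assuming visitors are segmented by their set of roles. It is not how Authcache derives its keys; the function names and role names are hypothetical.

```python
import hashlib
from typing import Iterable


def segment_key(roles: Iterable[str]) -> str:
    """Visitors with the same set of roles fall into the same cache segment."""
    canonical = ",".join(sorted(set(roles)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def page_cache_key(path: str, roles: Iterable[str]) -> str:
    """One cached copy of a page per (segment, path) pair."""
    return f"{segment_key(roles)}:{path}"


if __name__ == "__main__":
    # Role order does not matter, so these two visitors share a cached page.
    print(page_cache_key("/dashboard", ["member", "editor"]))
    print(page_cache_key("/dashboard", ["editor", "member"]))
    # A different role set gets its own copy.
    print(page_cache_key("/dashboard", ["member"]))
```

The fewer segments there are, the higher the cache hit rate, so anything that differs per user (name, cart contents) is better treated as personalised information loaded separately than as part of the segment.
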
  20. Strategies for Success: Consuming Feeds & Web Services
    • Regularly importing data into Drupal can be resource intensive.
    • Feeds, Migrate, custom PHP, etc. all share the same fundamental problems:
      • Fetching large datasets, which hog I/O, memory, and CPU cycles.
      • Lots of slow INSERT and UPDATE operations on the database.
      • New data will not display immediately unless caches are cleared.
    • The solution? Move to the front end!
  21. Strategies for Success: Consuming Feeds & Web Services
    • PaRSS - drupal.org/project/parss
      • Integrates simple jQuery RSS parser with link fields.
    • AngularJS - angularjs.org
      • Very powerful front-end MVC framework.
      • Usual implementation may not be suitable for this problem.
    • Angular Blocks - drupal.org/node/2445795
      • Allows other modules to expose AngularJS apps as blocks!
      • Used successfully on a recent intranet project, with some pages having 6 Angular apps on a single page.
  22. Strategies for Success: Load Testing
    • Make it part of your development process.
    • Don't leave it to the last minute or post-launch.
    • Tools
      • Apache JMeter
      • github.com/jacobSingh/Drupal-Performance-Testing-Suite
      • Blazemeter - blazemeter.com
      • Blitz - blitz.io
      • Web Page Test - webpagetest.org
  23. Strategies for Success: Queues
    • Use queues when dealing with:
      • Batch processing large datasets.
      • Performing complex calculations.
      • Sequential processing of tasks.
    • Modules / Tools
      • Advanced Queue - drupal.org/project/advancedqueue
      • Advanced Queue Runner - github.com/nvahalik/advancedqueue-runner
      • Drupal Core Queues - system.queue.inc
  24. Strategies for Success: Queues
    • Improves reliability.
    • If not using queues
      • There is no guarantee the process will be completed.
      • If the process fails, there is no easy way to repeat it.
    • If using queues
      • Each item is executed at least once.
      • If the process fails, the queue remains intact.
      • System load is stabilised because processing of complex or heavy operations is delayed.
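
The reliability claims above come from one rule: an item leaves the queue only after its worker has succeeded. The sketch below shows that rule with an in-memory queue. It is a simplified illustration, not Drupal's queue API; it assumes items are unique strings, and the function name and retry limit are hypothetical.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


def run_queue(
    queue: Deque[str],
    worker: Callable[[str], None],
    max_attempts: int = 3,
) -> List[str]:
    """Process items in order; an item is removed only after it succeeds.

    Returns the items that still failed after max_attempts tries.
    """
    attempts: Dict[str, int] = {}
    failed: List[str] = []
    while queue:
        item = queue[0]  # peek: the item stays queued while it is processed
        attempts[item] = attempts.get(item, 0) + 1
        try:
            worker(item)
        except Exception:
            queue.popleft()
            if attempts[item] < max_attempts:
                queue.append(item)  # move to the back for a later retry
            else:
                failed.append(item)  # stop retrying, but keep a record
        else:
            queue.popleft()  # remove only after success
    return failed


if __name__ == "__main__":
    already_failed = set()
    done = []

    def flaky_worker(item: str) -> None:
        # Simulate a temporary failure the first time "b" is processed.
        if item == "b" and item not in already_failed:
            already_failed.add(item)
            raise RuntimeError("temporary failure")
        done.append(item)

    print(run_queue(deque(["a", "b", "c"]), flaky_worker), done)
```

Because a failed item may run again, workers should be safe to repeat: that is the practical meaning of "at least once".
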
  25. Strategies for Success: Optimised Front-end
    • Image Sprites
      • Minimises the number of HTTP requests.
    • CSS
      • Think about what your Sass / Less becomes once compiled.
      • How complex and specific do the selectors become?
      • Consider architecting your CSS for conditional inclusion.
      • Does the site have “sections”?
      • CSS rendering is a blocking process.
  26. Strategies for Success: Optimised Front-end
    • Asynchronous JavaScript - drupal.org/project/async_js
      • Defers JavaScript execution.
      • Can improve responsiveness of “sluggish” JS-heavy sites.
    • Advanced Aggregation - drupal.org/project/advagg
      • Use CDN version of jQuery.
      • On-demand generation of aggregated assets.
  27. Strategies for Success: Other Recommendations
    • Elysia Cron - drupal.org/project/elysia_cron
      • Configure scheduling and frequency of specific cron tasks.
      • Run heavy cron tasks during low traffic periods.
    • Entity Cache - drupal.org/project/entitycache
      • Stores complete entity objects in your caching backend.
      • Enable appropriate dependent modules such as commerce_entitycache, bean_entitycache, etc.
    • Apache Solr for search
      • drupal.org/project/search_api_solr
      • drupal.org/project/apachesolr
  28. Infrastructure: Caching Backends
    • Memcached - drupal.org/project/memcache
      • Battle tested.
      • Widely deployed.
      • Volatile storage - not suitable for persistent data.
    • Redis - drupal.org/project/redis
      • Less “mature” than Memcached.
      • 1:1 featureset with Memcached.
      • Benchmarks slightly better than Memcached.
      • Commits data to disk by default, so it can be used for persistent data.
      • Use the PHP extension - github.com/phpredis/phpredis (not the Predis class)
  29. Infrastructure: Caching Backends
    • I recommend Redis
    • Store sessions in Redis rather than the database
      • Session Proxy - drupal.org/project/session_proxy
    • Form cache can go straight into Redis - no more need for this line:
      $conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
  30. Infrastructure: Simplest Approach
    • Single server with all components
      • PHP
      • Web Server (Apache)
      • Database (MySQL)
      • Varnish (... sometimes)
    [Diagram: Instance #1 running Varnish, Apache, PHP and MySQL]
  31. Infrastructure: Scaling Vertically
    • Increase instance size.
    • Change instance types:
      • CPU optimised
      • Memory optimised
      • I/O optimised
    • Will hit a ceiling eventually.
    “We’re going to need a bigger box”
  32. Infrastructure: Splitting the box
    Break up stack components onto separate servers.
    [Diagram: Varnish, Apache, PHP and MySQL split across Instance #1 and Instance #2]
  33. Infrastructure: Horizontally Scalable Infrastructure
    • Overcomes CPU ceiling issues.
    • Considerations
      • Load balanced web servers
      • Database clustering
      • Shared / clustered file systems
      • Autoscaling - the holy grail.
    [Diagram: Load Balancer, Varnish, four Apache/PHP web servers, MySQL and Redis]
  34. Debugging Performance and Scalability Issues: Tools
    • New Relic APM, browser & server monitoring
    • MySQL slow query log
      • Add the following lines to my.cnf and restart MySQL
        • log_slow_queries=/var/log/mysql/slow-query.log
        • long_query_time=20
      • On MySQL 5.6 and later, log_slow_queries no longer exists; use slow_query_log=1 and slow_query_log_file=/var/log/mysql/slow-query.log instead.
    • XHProf - PHP profiler
      • Great slides for getting set up here - http://msonnabaum.github.io/xhprof-presentation/
    • Browser Developer Tools
      • JavaScript profiler
      • Network Monitor
  35. Debugging Performance and Scalability Issues: General Tips
    • Look beyond the symptoms to find the underlying cause.
    • Change one thing at a time.
    • Measure, change, measure.
    • Sometimes you just have to throw more RAM at the problem.