Slide 1

Slide 1 text

Performance: Not an Afterthought DrupalSouth 2015

Slide 2

Slide 2 text

Nick Santamaria • Senior Developer at Technocrat • Acquia Certified Backend Developer • drupal.org http://drupal.org/user/87915 • twitter @nicksanta • Github nicksantamaria

Slide 3

Slide 3 text

Presentation Outline • Introduction to performance & scalability • Common problems • Strategies for success • Infrastructure design and considerations • Debugging performance and scalability issues • QA and discussion

Slide 4

Slide 4 text

Performance & Scalability

Slide 5

Slide 5 text

Performance & Scalability Performance The speed with which a single request can be executed. Scalability The ability of a request to maintain its performance under increasing load.

Slide 6

Slide 6 text

What is Performance? Back-end Performance Components • PHP • Amount of code being executed (i.e., number of modules) • Efficiency of code • Database • Schema design • Query execution time

Slide 7

Slide 7 text

What is Performance? Back-end Performance Components • API Requests • PHP will wait until the request returns a result or times out • Caching • Drupal database • Memcached / Redis / MongoDB • Varnish

Slide 8

Slide 8 text

What is Performance? Front-end Performance Components • Network Overhead • Local vs offshore datacenters • Number of requests • Payload Size • Image optimisation • CSS / JS Minification • Markup size & compression

Slide 9

Slide 9 text

What is Performance? Front-end Performance Components • Javascript • Number of scripts being included • Synchronous vs asynchronous execution • Code efficiency

Slide 10

Slide 10 text

What is Scalability? “Why is scalability so hard? Because scalability cannot be an after-thought.” - Werner Vogels, Amazon CTO

Slide 11

Slide 11 text

What is Scalability? A system is said to be scalable if adding resources results in proportionally increased performance. 9 women cannot make a baby in 1 month. Will doubling your site’s server resources double the traffic it can handle?
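
One way to reason about the closing question is Amdahl's law. The slides do not mention it; it is brought in here as a standard model of why adding resources rarely gives proportional gains. The sketch below is in Python, and the 80% "scalable fraction" is a made-up figure for illustration, not a measurement from any site.

```python
def speedup(parallel_fraction: float, resource_factor: float) -> float:
    """Amdahl's law: overall speedup when only part of the work scales."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if resource_factor <= 0:
        raise ValueError("resource_factor must be positive")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / resource_factor)


if __name__ == "__main__":
    # Hypothetical site where 80% of the work scales with added servers.
    for factor in (2, 4, 8):
        print(f"{factor}x resources -> {speedup(0.8, factor):.2f}x capacity")
```

Under that assumption, doubling resources yields roughly 1.67x the capacity rather than 2x, because the part that does not scale (for example a single database) stays the bottleneck.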

Slide 12

Slide 12 text

What is Scalability? Scalability Components • Caching • Block cache • Page cache • Reverse proxy cache • Opcode caching • Infrastructure • Web server load balancing • Database clustering • Caching backends - redis, memcached etc.

Slide 13

Slide 13 text

Common Problems

Slide 14

Slide 14 text

Common Problems Too many modules - AKA “Open Buffet Syndrome” Real life example • 365 enabled modules • 24 core modules • 51 custom modules • 72 exported features • 750 files loaded on every request. • 10 - 20% of PHP execution time was loading files, even with APC. • CPU cycles wasted - 25,000+ calls to module_implements() per request.

Slide 15

Slide 15 text

Common Problems Anonymous users with sessions Seems innocent, but this one line has consequences. • Pages with product/* paths are NEVER cached. • Anonymous users who visit this page bypass page cache on all subsequent pages. • … AND those visitors write to the database on every subsequent page view.
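
The knock-on effect can be modelled roughly as follows. This is a simplified Python illustration, not Drupal's actual code: the function name `is_cacheable` is hypothetical, and the rule it encodes (a request carrying a session cookie is never served from the anonymous page cache) is an assumption about how the page cache decides.

```python
def is_cacheable(method: str, cookies: dict) -> bool:
    """Return True if a request may be served from the anonymous page cache.

    Simplified model: only GET/HEAD requests without a session cookie qualify.
    The SESS/SSESS cookie-name prefixes are assumed for this sketch.
    """
    has_session = any(name.startswith(("SESS", "SSESS")) for name in cookies)
    return method in ("GET", "HEAD") and not has_session


if __name__ == "__main__":
    print(is_cacheable("GET", {}))                    # visitor without a session
    print(is_cacheable("GET", {"SESSabc123": "id"}))  # visitor after a session was started
```

Once a single page starts a session for an anonymous visitor, every later request from that visitor carries the cookie and falls into the uncached path.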

Slide 16

Slide 16 text

Common Problems Complicated entity & field architecture ● Slows down form submission, rendering, views, and more.

Slide 17

Slide 17 text

Common Problems Complicated entity & field architecture • How many INSERT queries per save? • node • node_revision • field_collection_item • field_collection_item_revision • field_data_field_collection_b • field_revision_field_collection_b • field_data_field_taxonomy_ref • field_revision_field_taxonomy_ref • field_data_field_collection_c • field_revision_field_collection_c • field_data_field_text • field_revision_field_text • file_managed • field_data_field_media • field_revision_field_media Real world field collection implementations are FAR more complicated than this example!
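
A rough way to estimate the write cost of a content model is to count the tables touched per save. The Python sketch below assumes the naming pattern visible in the list above (a base table, a revision table, and a `field_data_*` / `field_revision_*` pair per field); the function name is hypothetical, and it counts tables only, not the extra rows written for multi-value fields or nested field collections.

```python
def tables_written(entity_type: str, field_names: list) -> list:
    """List the tables that receive at least one INSERT when an entity is saved."""
    tables = [entity_type, f"{entity_type}_revision"]
    for name in field_names:
        # Each field is stored twice: current data and revision history.
        tables += [f"field_data_{name}", f"field_revision_{name}"]
    return tables


if __name__ == "__main__":
    fields = ["field_text", "field_media", "field_taxonomy_ref"]
    touched = tables_written("node", fields)
    print(len(touched), "tables:", ", ".join(touched))
```

Every field added to a bundle adds at least two INSERTs per save, which is why deep field collection structures slow down form submission so noticeably.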

Slide 18

Slide 18 text

Common Problems Others • Never use views_php module - create custom views handlers and plugins. • Complex faceted search using Drupal database - use Solr. • dblog module enabled on production - use syslog. • Carefully consider use of modules with node access functionality - they disable block caching.

Slide 20

Slide 20 text

Strategies for Success

Slide 21

Slide 21 text

Strategies for Success On-Demand Cache Purging • Planning • Divide the site into page “types”. • For each type, build a list of events which would require a page to be cleared from cache. • Considerations • No relative dates, i.e. “time ago”. • Some page types may be more suited to periodic caching. • Create a spidering script to warm the caches! • Extend to other caches using CacheTags - drupal.org/project/cachetags
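
The planning step (page types, and the events that invalidate each type) can be written down as a simple lookup table. The Python sketch below is illustrative only: the event names, URL patterns, and the `paths_to_purge` helper are all hypothetical and do not correspond to any Drupal module's API.

```python
from fnmatch import fnmatchcase

# Hypothetical mapping: event name -> URL patterns whose cached pages go stale.
PURGE_RULES = {
    "article_saved": ["/", "/news", "/news/*"],
    "product_saved": ["/products", "/products/*"],
    "menu_changed": ["*"],
}


def paths_to_purge(event: str, cached_paths: list) -> list:
    """Return the cached paths that must be cleared when the event fires."""
    patterns = PURGE_RULES.get(event, [])
    return [
        path
        for path in cached_paths
        if any(fnmatchcase(path, pattern) for pattern in patterns)
    ]


if __name__ == "__main__":
    cached = ["/", "/news", "/news/a", "/about", "/products/1"]
    print(paths_to_purge("article_saved", cached))
```

The same list of purged paths is a natural input for a cache-warming spider, since those are exactly the pages that will be cold after the purge.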

Slide 22

Slide 22 text

Strategies for Success Authcache (2.x branch) • Replaces Drupal’s default page caching allowing you to cache authenticated pages. • Huge scalability improvements for sites with a large proportion of authenticated visitors. • But also much, much more. • Personalisation - authcache_p13n • Form token magic - authcache_form • Store page cache in Varnish - authcache_varnish • Integrates with Cache Expiration

Slide 23

Slide 23 text

Strategies for Success Authcache • Planning • Define which page types are cacheable. • Design how you will segment your visitors (from a cache perspective). • Identify all personalised information which must be displayed. • Considerations • Forms can be tricky - ensure you test thoroughly. • Ensure your analytics / marketing / tracking services are compatible. • See Commerce Kickstart for great out-of-the-box implementation.

Slide 24

Slide 24 text

Strategies for Success Consuming Feeds & Web Services • Regularly importing data into Drupal can be resource intensive. • Feeds, migrate, custom PHP etc… All share the same fundamental problems: • Fetching large datasets, which hog i/o, memory, and CPU cycles. • Lots of slow INSERT and UPDATE operations on the database. • New data will not display immediately unless caches cleared. • The solution? Move to the front end!

Slide 25

Slide 25 text

Strategies for Success Consuming Feeds & Web Services • PaRSS - drupal.org/project/parss • Integrates simple jQuery RSS parser with link fields. • AngularJS - angularjs.org • Very powerful front-end MVC framework. • Usual implementation may not be suitable for this problem. • Angular Blocks - drupal.org/node/2445795 • Allows other modules to expose AngularJS apps as blocks! • Used successfully on recent intranet project, some pages having 6 angular apps on a single page.

Slide 26

Slide 26 text

Strategies for Success Load Testing • Make it part of your development process. • Don't leave it to the last minute or post-launch. • Tools • Apache JMeter • github.com/jacobSingh/Drupal-Performance-Testing-Suite • Blazemeter - blazemeter.com • Blitz - blitz.io • Web Page Test - webpagetest.org
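
To show how small a first load test can be, here is a minimal Python sketch using only the standard library. It is not a substitute for the tools listed above; the function names, defaults, and the injectable `request` parameter are assumptions made for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median
from urllib.request import urlopen


def fetch(url):
    """Default request function: fetch the URL and discard the body."""
    with urlopen(url, timeout=30) as response:
        response.read()


def timed(request, url):
    """Return how long a single request took, in seconds."""
    start = time.perf_counter()
    request(url)
    return time.perf_counter() - start


def load_test(url, total_requests=50, concurrency=5, request=fetch):
    """Issue requests concurrently and summarise response times in seconds."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        timings = list(pool.map(lambda _: timed(request, url), range(total_requests)))
    return {"requests": len(timings), "median": median(timings), "max": max(timings)}
```

Calling `load_test()` with the URL of a staging site at increasing `concurrency` values, and recording the median each time, gives a crude picture of where response times start to degrade. Only point it at environments you own.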

Slide 27

Slide 27 text

Strategies for Success Queues • Use queues when dealing with: • Batch processing large datasets. • Performing complex calculations. • Sequential processing of tasks. • Modules / Tools • Advanced Queue - drupal.org/project/advancedqueue • Advanced Queue Runner - github.com/nvahalik/advancedqueue-runner • Drupal Core Queues - system.queue.inc

Slide 28

Slide 28 text

Strategies for Success Queues • Improves reliability. • If not using queues • There is no guarantee the process will be completed. • If the process fails, there is no easy way to repeat it. • If using queues • Each item is executed at least once. • If the process fails, the queue remains intact. • System load is stabilised because processing of complex or heavy operations is delayed.
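
The reliability argument rests on one rule: an item is removed from the queue only after it has been processed successfully. The Python sketch below models that rule in memory. The `RetryQueue` class is hypothetical; Drupal's core queue API expresses the same idea through its claim, delete, and release operations.

```python
from collections import deque


class RetryQueue:
    """In-memory queue where failed items stay queued for a later run."""

    def __init__(self, items=()):
        self._items = deque(items)

    def __len__(self):
        return len(self._items)

    def run(self, worker):
        """Attempt each queued item once; return how many succeeded."""
        processed = 0
        for _ in range(len(self._items)):
            item = self._items[0]       # claim: inspect without removing
            try:
                worker(item)
            except Exception:
                self._items.rotate(-1)  # release: move to the back for a retry
            else:
                self._items.popleft()   # delete only after success
                processed += 1
        return processed
```

Because a failed item is retried on the next run, a worker may see the same item more than once, so workers should be written to tolerate repeats.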

Slide 29

Slide 29 text

Strategies for Success Optimised Front-end • Image Sprites • Minimises the number of HTTP requests. • CSS • Think about what your sass / less becomes once compiled. • How complex and specific do the selectors become? • Consider architecting your CSS for conditional inclusion. • Does the site have “sections”? • CSS rendering is a blocking process.

Slide 30

Slide 30 text

Strategies for Success Optimised Front-end • Asynchronous Javascript - drupal.org/project/async_js • Defers javascript execution. • Can improve responsiveness of “sluggish” JS-heavy sites. • Advanced Aggregation - drupal.org/project/advagg • Use CDN version of jQuery. • On-demand generation of aggregated assets.

Slide 31

Slide 31 text

Strategies for Success Other Recommendations • Elysia Cron - drupal.org/project/elysia_cron • Configure scheduling and frequency of specific cron tasks. • Run heavy cron tasks during low traffic periods. • Entity Cache - drupal.org/project/entitycache • Stores complete entity objects in your caching backend. • Enable appropriate dependent modules such as commerce_entitycache, bean_entitycache etc. • Apache Solr for search • drupal.org/project/search_api_solr • drupal.org/project/apachesolr

Slide 32

Slide 32 text

Infrastructure

Slide 33

Slide 33 text

Infrastructure Caching Backends • Memcached - drupal.org/project/memcache • Battle tested. • Widely deployed. • Volatile storage - not suitable for persistent data. • Redis - drupal.org/project/redis • Less “mature” than Memcached. • 1:1 featureset with Memcached. • Benchmarks slightly better than Memcached. • Commits data to disk by default, can be used for persistent data • Use PHP extension - github.com/phpredis/phpredis (not Predis class)

Slide 34

Slide 34 text

Infrastructure Caching Backends • I recommend Redis • Store sessions in Redis rather than the database • Session Proxy - drupal.org/project/session_proxy • Form cache can go straight into Redis - no more need for this line: $conf['cache_class_cache_form'] = 'DrupalDatabaseCache';

Slide 35

Slide 35 text

Infrastructure Simplest Approach • Single server with all components • PHP • Web Server (Apache) • Database (MySQL) • Varnish (... sometimes) [Diagram: Varnish, Apache, PHP and MySQL all on Instance #1]

Slide 36

Slide 36 text

Infrastructure Scaling Vertically • Increase instance size. • Change instance types: • CPU optimised • Memory optimised • I/O optimised • Will hit an endpoint eventually. “We’re going to need a bigger box”

Slide 37

Slide 37 text

Infrastructure Splitting the box Break up stack components onto separate servers. [Diagram: Varnish, Apache, PHP and MySQL split across Instance #1 and Instance #2]

Slide 38

Slide 38 text

Infrastructure [Diagram: Varnish, Apache, PHP, MySQL and Redis split across Instance #1, Instance #2 and Instance #3]

Slide 39

Slide 39 text

Infrastructure You will still eventually hit an endpoint!

Slide 40

Slide 40 text

Infrastructure Horizontally Scalable Infrastructure • Overcomes CPU ceiling issues. • Considerations • Load balanced web servers • Database clustering • Shared / clustered file systems • Autoscaling - the holy grail. [Diagram: Load Balancer, Varnish, four Apache/PHP web servers, MySQL, Redis]

Slide 41

Slide 41 text

Debugging Performance and Scalability Issues

Slide 42

Slide 42 text

Debugging Performance and Scalability Issues Tools • New Relic APM, browser & server monitoring • MySQL slow query log • Add the following lines to my.cnf and restart MySQL • log_slow_queries=/var/log/mysql/slow-query.log (MySQL 5.5 and earlier; on 5.6+ use slow_query_log=1 and slow_query_log_file=/var/log/mysql/slow-query.log) • long_query_time=20 (seconds) • XHProf - PHP profiler • Great slides for getting set up here - http://msonnabaum.github.io/xhprof-presentation/ • Browser Developer Tools • Javascript profiler • Network Monitor

Slide 43

Slide 43 text

Debugging Performance and Scalability Issues General Tips • Look beyond the symptoms to find the underlying cause. • Change one thing at a time. • Measure, change, measure. • Sometimes you just have to throw more RAM at the problem.

Slide 44

Slide 44 text

Thanks! slideshare.net/TechnocratAu