Delayed operations with queues to improve performance

Delaying work and deferring it to a queue handled asynchronously is one of the most efficient ways to improve full-page performance on the complex page structures typical of content-oriented sites built with Drupal and other CMSes. These are the slides of the talk I gave at DrupalCon Barcelona 2015 with Yuriy Gerasimov on this topic: learn about content cooking, deferred submits, anticipated content refresh, and other tricks to speed up your sites.

Frédéric G. MARAND

September 23, 2015


  1. Yuriy Gerasimov (ygerasimov)

    • FFW
    • Drupal architect & developer
    • Contrib 7 modules: services, draggableviews
    • Founder at Backtrac.io
  2. Frédéric G. Marand (fgm)

    • OSInet: performance/architecture consulting for internal teams at larger accounts
    • Core contributor 4.7 to 8.0.x, MongoDB + XMLRPC maintainer + others
    • Already 4 D8 customer projects before 8.0.0
    • Customer D8 in production since 07/2015
    • Frequently adds queueing to larger Drupal projects
  3. Why use queues?

    To have websites which are:
    • Faster for visitors
    • Snappier for editors
    • More scalable
    To process time-consuming jobs:
    • Video encoding
    • High-resolution gallery uploads and processing
  4. Concrete use cases

    • Prepare content for non-Drupal front-ends
    • Anticipate content generation
    • Deferred submits, e.g. comments handling
    • Slow operations: node saves, previews, image processing
    • External data sources: pull, push
    • Multi-step operations: batch
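The "deferred submits" use case can be sketched as follows. This is a minimal illustration in Python rather than Drupal code, and every name in it (`submit_comment`, `process_comments`, `slow_checks`) is hypothetical: the request handler only does cheap validation and enqueues the comment, while the slow checks run later in a worker.

```python
from collections import deque

# Hypothetical in-process stand-ins for a real queue backend and storage.
comment_queue = deque()
published = []

def submit_comment(author, body):
    """Request path: cheap validation only, then enqueue and return at once."""
    if not body.strip():
        return {"status": "rejected", "reason": "empty comment"}
    comment_queue.append({"author": author, "body": body})
    return {"status": "queued", "position": len(comment_queue)}

def process_comments(slow_checks):
    """Worker path: run the expensive checks outside the request."""
    handled = 0
    while comment_queue:
        comment = comment_queue.popleft()
        if slow_checks(comment):
            published.append(comment)
        handled += 1
    return handled
```

The visitor gets an immediate "queued" answer; spam filtering, notifications and similar slow steps no longer count against page response time.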
  5. Anticipated content generation

    • Blocks
    • Ctools content types
    • Controllers
    • etc.
    Contrib: http://github.com/FGM/lazy

    [Timeline: Usual Drupal] Content created at t0, served from cache while fresh; stale from t1, still served from cache; expired at t2, cache regenerated.
    [Timeline: Anticipated content generation] Content created at t0, served from cache while fresh; stale from t1, served from cache + update requested; update stored; fresh again at t2, served from cache.
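The two timelines above can be sketched as a small cache wrapper. This is a minimal Python illustration under stated assumptions, not the code of the lazy module: the class name `AnticipatingCache`, the `fresh_for` / `stale_for` windows and the in-process refresh queue are all hypothetical.

```python
import time
from collections import deque

class AnticipatingCache:
    """Serve stale entries from cache while a queued job refreshes them."""

    def __init__(self, builder, fresh_for, stale_for, clock=time.monotonic):
        self.builder = builder        # slow function producing the content
        self.fresh_for = fresh_for    # seconds during which content is fresh
        self.stale_for = stale_for    # extra seconds during which stale is acceptable
        self.clock = clock
        self.entries = {}             # key -> (value, created_at)
        self.refresh_queue = deque()  # keys waiting for a background rebuild
        self.pending = set()          # avoids queueing the same key twice

    def _store(self, key):
        value = self.builder(key)
        self.entries[key] = (value, self.clock())
        return value

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return self._store(key)           # first request: build inline
        value, created = entry
        age = self.clock() - created
        if age < self.fresh_for:
            return value                      # fresh: served from cache
        if age < self.fresh_for + self.stale_for:
            if key not in self.pending:       # stale: serve + request update
                self.pending.add(key)
                self.refresh_queue.append(key)
            return value
        return self._store(key)               # expired: rebuild inline

    def run_worker(self):
        """Queue worker: rebuild every requested key outside the request."""
        processed = 0
        while self.refresh_queue:
            key = self.refresh_queue.popleft()
            self._store(key)
            self.pending.discard(key)
            processed += 1
        return processed
```

As long as the worker runs before the stale window ends, no visitor request ever pays for the rebuild.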
  6. Job servers

    • How to get results
    • Rerun failed jobs
    • Separate queue for failed jobs
    • Monitoring queues, workers
    • Supervisor
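Rerunning failed jobs and parking them in a separate queue can be sketched like this. It is a hedged Python illustration, not the behavior of any specific job server; `run_jobs` and `max_attempts` are hypothetical names, and jobs are assumed to be hashable values.

```python
from collections import deque

def run_jobs(jobs, worker, max_attempts=3):
    """Run jobs, retry failures, and park jobs that keep failing."""
    pending = deque((job, 1) for job in jobs)
    results = {}   # job -> result, i.e. "how to get results"
    failed = []    # separate queue for jobs that exhausted their attempts
    while pending:
        job, attempt = pending.popleft()
        try:
            results[job] = worker(job)
        except Exception as exc:
            if attempt < max_attempts:
                pending.append((job, attempt + 1))   # rerun later
            else:
                failed.append((job, repr(exc)))      # give up, keep for inspection
    return results, failed
```

Keeping the failed jobs in their own queue means they can be inspected or replayed without blocking healthy jobs.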
  7. Some implementations

    Queue: D6, D7, D8
    • Memory: core, core
    • Database: OK, core, core
    • AdvancedQueue: OK, Not yet
    • Amazon SQS (aws_sqs): OK, Not yet
    • Beanstalkd: OK, 8.1/8.2
    • evQueue: Started
    • IronMQ (iron.io): OK, Not yet
    • Gearman: OK, OK, Not yet
    • MongoDB: OK, Started
    • PHPResque
    • RabbitMQ: OK, Not yet
    • Redis (redis_queue): OK, OK, Not yet
  8. Queues API: concepts

    • Queue: a minimally-featured FIFO
    • Worker: the code actually doing the work
    • Item: a piece of workload submitted to the queue
    • Runner: the process triggering/monitoring workers
    • Batch subsystem: a high-level API on top of Queue API
    • D8: Manager, Plugins
  9. D6/D7 Queue API

    D7: core
    D6: drupal_queue module
    Declaring queues: hook_cron_queue_info[_alter]()
    • “Skip on cron”: enables decoupling from cron runs
    • Time: max lifetime allocated to process items during a cron run, useless with skip on cron = TRUE
    • Worker callback: an implementation of callback_queue_worker(mixed $queue_item): void
    API usable without cron
    Default Runner:
    • In the cron subsystem
    • Pokemon exception handling (catches every exception)
  10. D8 Queue API

    API usable without cron
    Declaring queue workers:
    • Service: plugin.manager.queue_worker
    • Instantiates QueueWorker plugins
    Definition:
    • Cron, not enabled by default
      ◦ Time: max lifetime allocated to process items during a cron run
    • Core examples: AggregatorRefresh, LocaleTranslation
    • hook_queue_info_alter()
    Default Runner:
    • In the cron subsystem: Drupal\Core\Cron::processQueues()
    • SuspendQueueException: $q->releaseItem()
  11. Queue API methods: Queue

    QueueInterface
    • Q::createItem(mixed $data): void
    • Q::claimItem($lease_time = 3600): mixed $item
      ◦ FALSE | stdClass + [item_id => int, data => mixed, created => timestamp]
      ◦ $lease_time → Assumptions for runner, currently not used
    • Q::deleteItem($item): void → work done
    • Q::releaseItem($item): bool
    • Q::numberOfItems(): int → best guess, unreliable
    • Q::createQueue() / Q::deleteQueue()
    ReliableQueueInterface: ordering, single execution
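The claim / delete / release life cycle listed above can be illustrated with a small in-memory queue. This is a Python sketch that only mirrors the method names from the slide in snake_case; it is not Drupal's implementation, and the lease bookkeeping shown here (an expiry time per item) is an assumption about how a backend might honor `$lease_time`.

```python
import itertools
import time
from types import SimpleNamespace

class MemoryQueue:
    """In-memory FIFO with claim leases, mirroring the slide's method list."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._records = {}              # item_id -> [item, lease_expiry or None]
        self._ids = itertools.count(1)

    def create_item(self, data):
        item_id = next(self._ids)
        item = SimpleNamespace(item_id=item_id, data=data, created=self._clock())
        self._records[item_id] = [item, None]

    def claim_item(self, lease_time=3600):
        now = self._clock()
        for record in self._records.values():   # dicts keep insertion order
            expiry = record[1]
            if expiry is None or expiry <= now:  # unclaimed, or lease ran out
                record[1] = now + lease_time
                return record[0]
        return False                             # nothing claimable

    def delete_item(self, item):
        self._records.pop(item.item_id, None)    # work done

    def release_item(self, item):
        record = self._records.get(item.item_id)
        if record is None:
            return False
        record[1] = None                         # claimable again right away
        return True

    def number_of_items(self):
        return len(self._records)
```

A claimed item is hidden from other consumers until it is deleted, released, or its lease expires, which is what lets a crashed worker's item be picked up again later.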
  12. Queue API methods: others

    Queue service → QueueFactory::get($name, $reliable)
    QueueManager: a vanilla plugin manager
    • In charge of hook_queue_info_alter()
    • createInstance($plugin_id, $configuration)
    QueueWorkerInterface:
    • processItem(mixed $data): void
    • @throws SuspendQueueException
  13. Queue Runners

    Core / Contrib
    • Core Cron / Elysia Cron / Queue_Runner
    • Drush: queue-list / queue-run
    • Similar limitations:
      ◦ Default on in D6 / D7, default off in D8
      ◦ Limited timeout support: non preemptive
      ◦ Single threaded, single process across queues
    Custom runners
    • Provided by queue modules or per-project one-offs
    • Preemption, parallel execution...
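What "limited timeout support: non preemptive" means for a cron-style runner can be sketched as below. This is a Python illustration loosely modeled on the behavior described in these slides, not a port of core code; `TinyQueue`, `SuspendQueue` and `run_queue` are hypothetical names.

```python
import time
from collections import deque

class SuspendQueue(Exception):
    """Hypothetical signal: stop processing this queue for now."""

class TinyQueue:
    """Just enough queue for the runner sketch: claim, delete, release."""
    def __init__(self, items):
        self.waiting = deque(items)
        self.claimed = []
    def claim_item(self):
        if not self.waiting:
            return None
        item = self.waiting.popleft()
        self.claimed.append(item)
        return item
    def delete_item(self, item):
        self.claimed.remove(item)
    def release_item(self, item):
        self.claimed.remove(item)
        self.waiting.appendleft(item)

def run_queue(queue, worker, time_budget, clock=time.monotonic):
    """Process items until the budget is spent; never interrupts a worker."""
    deadline = clock() + time_budget
    done = 0
    while clock() < deadline:         # only checked between items
        item = queue.claim_item()
        if item is None:
            break
        try:
            worker(item)
        except SuspendQueue:
            queue.release_item(item)  # hand the item back and stop this queue
            break
        except Exception:
            continue                  # catch-all: the item simply stays claimed
        queue.delete_item(item)
        done += 1
    return done
```

Because the deadline is only tested between items, a single slow item can overrun the budget by its full duration; a preemptive custom runner would have to enforce the limit from outside the worker.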
  14. Queue API limitations

    Limited FIFO paradigm
    • D8: non-Reliable QueueInterface: datagram
    No monitoring
    No queue disciplines
    • Priority management
    • Tagging
    • Delay, burying ...
    Implementations may provide more
    • Item structure is free-form: add richer interfaces
    No Peek(), no LIFO, no deduplication: hacks
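As one example of the kind of hack the last line alludes to, deduplication can be layered on top of a plain FIFO by fingerprinting the item data. This is a minimal Python sketch, assuming JSON-serializable item data; `DedupQueue` and its methods are hypothetical and do not correspond to any Drupal module.

```python
import hashlib
import json
from collections import deque

class DedupQueue:
    """FIFO that refuses items whose data is already waiting in the queue."""

    def __init__(self):
        self._items = deque()   # (fingerprint, data) pairs in arrival order
        self._seen = set()      # fingerprints of items currently queued

    @staticmethod
    def _fingerprint(data):
        # sort_keys makes the hash independent of dict key order.
        payload = json.dumps(data, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def create_item(self, data):
        key = self._fingerprint(data)
        if key in self._seen:
            return False        # duplicate of a waiting item: dropped
        self._seen.add(key)
        self._items.append((key, data))
        return True

    def claim_and_delete(self):
        if not self._items:
            return None
        key, data = self._items.popleft()
        self._seen.discard(key) # the same work may be queued again later
        return data

    def number_of_items(self):
        return len(self._items)
```

With a shared backend the fingerprint set would have to live in that backend too, which is exactly why this counts as a hack rather than a queue feature.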
  15. Performance edge

    Runners:
    • Avoid active polling à la core DB
    • Use a blocking layer + select()
    • Parallel handling of multiple queues → multiple runners, scheduling
    Workers: read after write
    • Write in the queue → cache invalidated
    • Read again → cache primed
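The difference between active polling and a blocking layer can be sketched with Python's standard `queue.Queue`, whose `get()` call sleeps until an item arrives. This only illustrates the idea; `start_runner` and the `None` stop sentinel are hypothetical conventions, and a real deployment would block on the queue server's own primitive instead.

```python
import queue
import threading

def start_runner(jobs, worker, results):
    """Start a runner thread that blocks on the queue instead of polling it."""
    def loop():
        while True:
            item = jobs.get()        # sleeps until an item is available
            if item is None:         # sentinel: shut the runner down
                jobs.task_done()
                break
            results.append(worker(item))
            jobs.task_done()
    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Starting several such runners, one per queue, is one way to get the parallel handling mentioned above; an idle runner costs nothing because it is not repeatedly querying the backend.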
  16. Sprint: Friday

    Sprint with the Community on Friday. We have tasks for every skillset. Mentors are available for new contributors. An optional Friday morning workshop for first-time sprinters will help you get set up. Follow @drupalmentoring.
    Photo: https://www.flickr.com/photos/amazeelabs/9965814443/in/faves-38914559@N03/