Delayed operations with queues to improve performance

Delaying work and deferring it to a queue handled asynchronously is one of the most effective ways to improve full-page performance on the complex page structures typical of content-oriented sites built with Drupal and other CMSes. These are the slides of the talk I gave at DrupalCon Barcelona 2015 with Yuriy Gerasimov on this topic: learn about content cooking, deferred submits, anticipated content refresh, and other tricks to speed up your sites.

Frédéric G. MARAND

September 23, 2015

Transcript

  1. None
  2. Delayed operations with queues

    Yuriy Gerasimov, Frédéric G. Marand
    Session track: PHP
  3. Who are we?

  4. Yuriy Gerasimov

    • FFW
    • Drupal architect & developer
    • Contrib 7 modules: services, draggableviews
    • Founder at Backtrac.io
    ygerasimov
  5. Frédéric G. Marand

    • OSInet: performance/architecture consulting for internal teams at larger accounts
    • Core contributor 4.7 to 8.0.x, MongoDB + XMLRPC maintainer + others
    • Already 4 D8 customer projects before 8.0.0
    • Customer D8 in production since 07/2015
    • Frequently adds queueing to larger Drupal projects
    fgm
  6. Why use queues?

    To have websites which are:
    • Faster for visitors
    • Snappier for editors
    • More scalable
    To process time-consuming jobs:
    • Video encoding
    • High-resolution gallery uploads and processing
  7. Concrete use cases

    • Prepare content for non-Drupal front-ends
    • Anticipate content generation
    • Deferred submits, e.g. comment handling
    • Slow operations: node saves, previews, image processing
    • External data sources: pull, push
    • Multi-step operations: batch
  8. Cooking for front-ends Front end

  9. Anticipated content generation

    Blocks, Ctools content types, controllers, etc.
    Contrib: http://github.com/FGM/lazy
    [Timeline diagram, two rows plotted over time with marks t0, t1, t2:
    • Usual Drupal: content created; served from cache; states fresh, stale, expired; regenerate cache; served from cache.
    • Anticipated content generation: content created; served from cache; states fresh, stale, fresh; served from cache + request update; store; served from cache.]
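The "served from cache + request update" idea on this slide can be sketched in a few lines. This is a language-neutral model in Python, not the code of the lazy module: the class name `AnticipatingCache`, the `builder` callable, and the `fresh_for` parameter are all hypothetical names chosen for illustration. It assumes a stale entry is still acceptable to serve while a queued job rebuilds it.

```python
import time
from collections import deque


class AnticipatingCache:
    """Serve cached content; once it is stale, queue a refresh instead of rebuilding inline."""

    def __init__(self, builder, fresh_for, clock=time.monotonic):
        self.builder = builder        # hypothetical slow callable: key -> content
        self.fresh_for = fresh_for    # seconds during which an entry counts as fresh
        self.clock = clock            # injectable clock, handy for tests
        self.entries = {}             # key -> (content, stored_at)
        self.refresh_queue = deque()  # keys waiting for a background rebuild
        self.pending = set()          # keys already queued, to avoid double work

    def _store(self, key):
        content = self.builder(key)
        self.entries[key] = (content, self.clock())
        return content

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            # Cold start: nothing to serve yet, so build inline once.
            return self._store(key)
        content, stored_at = entry
        if self.clock() - stored_at > self.fresh_for and key not in self.pending:
            # Stale: still answer from cache, but request an update.
            self.pending.add(key)
            self.refresh_queue.append(key)
        return content

    def run_worker(self):
        """Drain the refresh queue; meant to run outside the page request."""
        while self.refresh_queue:
            key = self.refresh_queue.popleft()
            self._store(key)
            self.pending.discard(key)


# Usage with a fake clock, so that no real waiting is involved.
now = [0.0]
builds = []

def build(key):
    builds.append(key)
    return f"{key} v{len(builds)}"

cache = AnticipatingCache(build, fresh_for=60, clock=lambda: now[0])
first = cache.get("block")   # "block v1": built inline on the cold start
now[0] = 120
stale = cache.get("block")   # "block v1": stale but served, refresh queued
cache.run_worker()
fresh = cache.get("block")   # "block v2": served from the refreshed cache
```

The visitor who hits the stale entry never pays for the rebuild; the cost moves to whatever process drains the refresh queue.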
  10. Comment handling

  11. “Pull” data sources (aggregator)

  12. “Push” data sources

  13. Image processing

  14. Job servers

    • How to get results
    • Rerun failed jobs
    • Separate queue for failed jobs
    • Monitoring queues, workers
    • Supervisor
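  15. Some implementations

    Table columns: Queue, D6, D7, D8. Empty cells did not survive in this transcript, so rows with fewer than three values list them in slide order.
    • Memory: core, core
    • Database: OK, core, core
    • AdvancedQueue: OK, Not yet
    • Amazon SQS (aws_sqs): OK, Not yet
    • Beanstalkd: OK, 8.1/8.2
    • evQueue: Started
    • IronMQ (iron.io): OK, Not yet
    • Gearman: OK, OK, Not yet
    • MongoDB: OK, Started
    • PHPResque
    • RabbitMQ: OK, Not yet
    • Redis (redis_queue): OK, OK, Not yet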
  16. Queues API: concepts

    • Queue: a minimally-featured FIFO
    • Worker: the code actually doing the work
    • Item: a piece of workload submitted to the queue
    • Runner: the process triggering/monitoring workers
    • Batch subsystem: a high-level API on top of Queue API
    • D8: Manager, Plugins
  17. D6/D7 Queue API

    D7: core
    D6: drupal_queue module
    Declaring queues: hook_cron_queue_info[_alter]()
    • “Skip on cron”: enables decoupling from cron runs
    • Time: max lifetime allocated to process items during a cron run, useless with skip on cron = TRUE
    • Worker callback: an implementation of callback_queue_worker(mixed queue_item): void
    API usable without cron
    Default Runner:
    • In the cron subsystem
    • Pokemon exception handling (catch-all)
  18. D8 Queue API

    API usable without cron
    Declaring queue workers:
    • Service: plugin.manager.queue_worker
    • Instantiates QueueWorker plugins
    Definition:
    • Cron, not enabled by default
      ◦ Time: max lifetime allocated to process items during a cron run
    • Core examples: AggregatorRefresh, LocaleTranslation
    • hook_queue_info_alter()
    Default Runner:
    • In the cron subsystem: Drupal\Core\Cron::processQueues()
    • SuspendQueueException: $q->releaseItem()
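The runner behaviour listed on the two slides above (a time budget per run, catch-all exception handling, and releasing the item when a worker asks to suspend the queue) can be modelled roughly as follows. This is a Python sketch of the control flow only, not a copy of Drupal's cron code: `ListQueue`, `SuspendQueue`, and `process_queue` are hypothetical stand-ins named for this example.

```python
import time


class SuspendQueue(Exception):
    """Hypothetical stand-in for a worker signalling 'stop processing this queue'."""


class ListQueue:
    """Tiny in-memory stand-in for a queue backend (illustration only)."""

    def __init__(self, items):
        self.waiting = list(items)
        self.claimed = []

    def claim_item(self):
        if not self.waiting:
            return None
        item = self.waiting.pop(0)
        self.claimed.append(item)
        return item

    def delete_item(self, item):
        self.claimed.remove(item)

    def release_item(self, item):
        self.claimed.remove(item)
        self.waiting.insert(0, item)


def process_queue(queue, worker, time_budget, clock=time.monotonic):
    """Process items until the queue is empty, the budget is spent, or the worker suspends."""
    end = clock() + time_budget
    done = 0
    # The budget is only checked between items: a slow item is never interrupted.
    while clock() < end:
        item = queue.claim_item()
        if item is None:
            break
        try:
            worker(item)
            queue.delete_item(item)  # work done
            done += 1
        except SuspendQueue:
            queue.release_item(item)  # hand the item back and leave this queue
            break
        except Exception:
            # Catch-all: the item stays claimed and is not retried in this run.
            continue
    return done


def demo_worker(data):
    if data == "bad":
        raise ValueError("cannot process")
    if data == "stop":
        raise SuspendQueue()


demo_queue = ListQueue(["a", "bad", "b", "stop", "c"])
processed = process_queue(demo_queue, demo_worker, time_budget=60)
# processed == 2; "stop" and "c" are waiting again; "bad" is still claimed.
```

Because the deadline is tested only at the top of the loop, one long-running item can overshoot the budget, which is the "non preemptive" limitation of such runners.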
  19. Queue API methods: Queue

    QueueInterface
    • Q::createItem(mixed $data): void
    • Q::claimItem($lease_time = 3600): mixed $item
      ◦ FALSE | stdClass + [item_id => int, data => mixed, created => timestamp]
      ◦ $lease_time → Assumptions for runner, currently not used
    • Q::deleteItem($item): void → work done
    • Q::releaseItem($item): bool
    • Q::numberOfItems(): int → best guess, unreliable
    • Q::createQueue() / Q::deleteQueue()
    ReliableQueueInterface: ordering, single execution
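To make the claim / release / delete life cycle concrete, here is a small in-memory model of the methods listed on this slide, written in Python with snake_case names. It is a sketch, not Drupal's implementation: `LeasedQueue` and `Item` are hypothetical, and the rule that an expired lease makes an item claimable again is an assumption about how a backend might honour `$lease_time`.

```python
import time
from dataclasses import dataclass
from typing import Any


@dataclass
class Item:
    item_id: int
    data: Any
    created: float
    expire: float = 0.0  # 0 means "not claimed" in this sketch


class LeasedQueue:
    def __init__(self, clock=time.time):
        self._items = {}   # dicts keep insertion order, which gives FIFO claims
        self._next_id = 1
        self._clock = clock

    def create_item(self, data):
        item = Item(self._next_id, data, created=self._clock())
        self._items[item.item_id] = item
        self._next_id += 1

    def claim_item(self, lease_time=3600):
        """Return the oldest unclaimed (or lease-expired) item, or None."""
        now = self._clock()
        for item in self._items.values():
            if item.expire == 0 or item.expire <= now:
                item.expire = now + lease_time
                return item
        return None

    def delete_item(self, item):
        """Work done: forget the item."""
        self._items.pop(item.item_id, None)

    def release_item(self, item):
        """Give up the claim so another worker can take the item."""
        stored = self._items.get(item.item_id)
        if stored is None:
            return False
        stored.expire = 0.0
        return True

    def number_of_items(self):
        return len(self._items)


# Usage with a fake clock.
now = [1000.0]
q = LeasedQueue(clock=lambda: now[0])
q.create_item("resize image 1")
q.create_item("resize image 2")
claimed = q.claim_item(lease_time=30)  # oldest item, leased for 30 seconds
q.delete_item(claimed)                 # processed successfully
remaining = q.number_of_items()        # 1
```

A worker that fails part-way calls `release_item()` instead of `delete_item()`, which puts the item back in play without waiting for the lease to run out.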
  20. Queue API methods: others

    Queue service → QueueFactory::get($name, $reliable)
    QueueManager: a vanilla plugin manager
    • In charge of hook_queue_info_alter()
    • createInstance($plugin_id, $configuration)
    QueueWorkerInterface:
    • processItem(mixed data): void @throws SuspendQueueException
  21. Queue Runners

    Core / Contrib
    • Core Cron / Elysia Cron / Queue_Runner
    • Drush: queue-list / queue-run
    • Similar limitations:
      ◦ Default on in D6 / D7, default off in D8
      ◦ Limited timeout support: non preemptive
      ◦ Single threaded, single process across queues
    Custom runners
    • Provided by queue modules or per-project one-offs
    • Preemption, parallel execution...
  22. Queue API limitations

    Limited FIFO paradigm
    • D8: non-Reliable QueueInterface: datagram
    No monitoring
    No queue disciplines
    • Priority management
    • Tagging
    • Delay, burying ...
    Implementations may provide more
    • Item structure is free-form: add richer interfaces
    No Peek(), no LIFO, no deduplication: hacks
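As an illustration of the kind of "hack" the last line refers to, deduplication can be layered on top of a plain FIFO by remembering a key for every pending item. The following Python sketch shows one possible approach and is not taken from the talk: `DedupQueue` is a hypothetical name, and it assumes item data is JSON-serialisable so that a stable key can be derived from it.

```python
import json
from collections import deque


class DedupQueue:
    """FIFO that refuses an item while an identical one is still waiting."""

    def __init__(self):
        self._items = deque()
        self._pending = set()  # keys of items currently waiting in the queue

    @staticmethod
    def _key(data):
        # Assumption: data is JSON-serialisable; sort_keys makes the key stable.
        return json.dumps(data, sort_keys=True)

    def create_item(self, data):
        key = self._key(data)
        if key in self._pending:
            return False  # duplicate of a waiting item: dropped
        self._pending.add(key)
        self._items.append(data)
        return True

    def claim_item(self):
        if not self._items:
            return None
        data = self._items.popleft()
        # Once claimed, the same data may be queued again.
        self._pending.discard(self._key(data))
        return data


q = DedupQueue()
q.create_item({"nid": 1, "op": "reindex"})
q.create_item({"op": "reindex", "nid": 1})  # same data, different key order: dropped
q.create_item({"nid": 2, "op": "reindex"})
```

Forgetting the key at claim time is a design choice: it allows a fresh request to be queued while the previous one is being processed, at the cost of occasionally doing the work twice.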
  23. Performance edge

    Runners:
    • Avoid active polling à la core DB
    • Use a blocking layer + select()
    • Parallel handling of multiple queues → multiple runners, scheduling
    Workers: read after write
    • Write in the queue → cache invalidated
    • Read again → cache primed
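The difference between active polling and a blocking wait, and the "one runner per queue" layout, can be sketched with Python's standard library. This is an analogy under stated assumptions rather than a Drupal runner: `queue.Queue.get()` blocks on an internal lock instead of `select()`, and the names `runner`, `run_queues`, and `STOP` are hypothetical.

```python
import queue
import threading

STOP = object()  # sentinel telling a runner to exit


def runner(jobs, worker, results):
    """Process one queue; get() sleeps until an item arrives, so no busy loop."""
    while True:
        item = jobs.get()
        if item is STOP:
            break
        results.append(worker(item))


def run_queues(workloads, worker):
    """workloads maps a queue name to its items; each queue gets its own runner thread."""
    queues = {name: queue.Queue() for name in workloads}
    results = {name: [] for name in workloads}
    threads = [
        threading.Thread(target=runner, args=(queues[name], worker, results[name]))
        for name in workloads
    ]
    for thread in threads:
        thread.start()
    for name, items in workloads.items():
        for item in items:
            queues[name].put(item)
        queues[name].put(STOP)
    for thread in threads:
        thread.join()
    return results


outcome = run_queues({"images": [1, 2, 3], "feeds": [10]}, lambda n: n * 2)
# outcome == {"images": [2, 4, 6], "feeds": [20]}
```

With one runner per queue, a slow queue no longer delays the others, and ordering is still preserved inside each queue because a single thread drains it.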
  24. Sprint: Friday

    https://www.flickr.com/photos/amazeelabs/9965814443/in/faves-38914559@N03/
    Sprint with the Community on Friday. We have tasks for every skillset. Mentors are available for new contributors. An optional Friday morning workshop for first-time sprinters will help you get set up. Follow @drupalmentoring.
  25. None
  26. None