Asynchronous processing with RabbitMQ

Asynchronous processing with RabbitMQ. Ondrej Mirtes, PHPCon Poland 2016

Sending an email – directly

Diagram: HTTP request, PHP application, "Send the email", SMTP server, HTTP response.

There are many approaches you can take when sending an email from your app. In this case, the PHP application connects to the SMTP server while the user is still waiting for the HTTP response.

Sending an email – directly

Pros
• Email arrives instantly

Cons
• User waits longer for the response
• If the SMTP server fails: show the error to the user, or stay silent?
• No chance to repeat the attempt in case of failure

Sending an email – through cron

Diagram: HTTP request, PHP application, "Write a row", Database, HTTP response.

When using cron, we're not sending the email directly, but saving data to the database to be able to generate and send the email later.

Sending an email – through cron

Diagram: Crontab, PHP script, "Send the email", SMTP server.

A few seconds or minutes later, the cron job is executed. The PHP script picks up the queue of emails from the database and sends them one by one. There is a lot of idle time between runs.

Sending an email – through cron

id   user    created            sent
5    75501   2016-10-02 11:30   false

Diagram: Process A and Process B both read the row above.

If you run the cron job every minute and the first script takes longer than a minute, you suddenly have two overlapping scripts reading the same data, which risks sending duplicate emails.
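One common way to reduce the overlap described above is to take an exclusive, non-blocking file lock at the start of the cron script. This is a minimal sketch, not something shown in the talk: the helper name runExclusively and the lock file path are hypothetical, and the lock only protects scripts running on the same machine.

```php
<?php
// Hypothetical helper: run $job only if no other process holds the lock file.
function runExclusively(string $lockFile, callable $job): bool
{
    // Mode 'c' creates the file if it is missing and never truncates it.
    $handle = fopen($lockFile, 'c');
    if ($handle === false) {
        throw new RuntimeException("Cannot open lock file $lockFile");
    }

    try {
        // LOCK_NB makes flock() return immediately instead of waiting.
        if (!flock($handle, LOCK_EX | LOCK_NB)) {
            return false; // a previous run is still working, skip this one
        }
        $job();
        flock($handle, LOCK_UN);
        return true;
    } finally {
        fclose($handle);
    }
}
```

A cron script would wrap its whole sending loop in runExclusively() and exit quietly when it returns false. This addresses overlapping runs only; it does not help with scaling to multiple sending scripts.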

Sending an email – through cron

Pros
• User does not wait for the HTTP response
• Can be repeated in case of failure

Cons
• Email does not arrive instantly
• Susceptible to duplicates
• Does not scale to multiple sending scripts

Sending an email – through a message queue

Diagram: HTTP request, PHP application, "Send a message", Message queue, HTTP response.

The schema is similar to cron, but the result is different: the email is picked up instantly.

Sending an email – through a message queue

Diagram: Queue, Consumer, "Send the email", SMTP server.

Sending an email – through a message queue

Pros
• User does not wait for the HTTP response
• Can be repeated in case of failure
• Each message goes to exactly one consumer
• It's instant
• It's scalable

Cons
• Complex deployment
• Watch out for gotchas

RabbitMQ

Open source technology written in Erlang. It communicates with your application using AMQP. The documentation is very good and describes everything in detail.

Flow of a message

Diagram: Producer, Broker (Exchange, Queue), Consumer, Consumer.

A message is not sent to a queue directly, but to an exchange first. Exchanges come in several types: direct, fanout, topic. A direct exchange forwards a message to the queue whose binding key matches the message's routing key; with the default exchange, that key is the queue name.

The broker is the central component of RabbitMQ. It has a web UI that can be used to configure the broker, monitor your queues, and also publish and read messages.

Each queue should have a different purpose. You can use RabbitMQ to send emails, run bulk operations on data, prepare huge files for download, and also for inter-process communication.

You can see six consumers from multiple servers listening for new messages from this queue. You can control the performance of the system by tuning the number of consumers.

RabbitMQ and PHP

    composer require php-amqplib/php-amqplib

If you're already using RabbitMQ, perhaps you've already heard about this library. It's a well-established one, but its API shows its age.

RabbitMQ and PHP

    composer require bunny/bunny

Bunny is the fastest alternative library out there and it has a nice, consistent API. It also has an asynchronous client, which you can benefit from a lot if you're familiar with ReactPHP.

Bunny – message producer

    $bunny = new \Bunny\Client($options);
    $bunny->connect();

    $channel = $bunny->channel();
    $channel->queueDeclare('queue_name');
    $channel->publish(
        $message,
        $headers,
        '', // default forwarding exchange
        'queue_name'
    );

Bunny – message consumer

A consumer is a long-running PHP process.

PHP is ready for long-running processes. Memory leaks are no longer the language's fault. But a long-running process certainly requires more discipline and more thought about memory allocation.

Bunny – message consumer

    $channel->run(
        function (Message $message, Channel $channel) {
            handleMessage($message);
            $channel->ack($message);
            // or requeue: $channel->reject($message, $requeue);
        },
        'queue_name'
    );

This is what you should run as a command line script. If everything went as expected, acknowledge the message. If you're no longer interested in the message, reject it; if you want to try processing it later, requeue it.
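The three outcomes described above (acknowledge, reject, requeue) can be sketched as one small function. This is an illustration under assumptions rather than the talk's code: processDelivery is a hypothetical name, treating InvalidArgumentException as "this message can never succeed" is an arbitrary choice, and $channel is left untyped so the sketch runs without a broker. Only ack() and reject($message, $requeue) are taken from the slide.

```php
<?php
// Sketch: decide between ack, reject and requeue for one delivery.
// $channel is expected to behave like \Bunny\Channel.
function processDelivery($message, $channel, callable $handler): string
{
    try {
        $handler($message);
    } catch (InvalidArgumentException $e) {
        // Assumption: this exception type means the message itself is bad,
        // so retrying would fail again. Drop it without requeueing.
        $channel->reject($message, false);
        return 'rejected';
    } catch (Throwable $e) {
        // Possibly a temporary failure (for example SMTP is down):
        // put the message back in the queue to try again later.
        $channel->reject($message, true);
        return 'requeued';
    }

    $channel->ack($message);
    return 'acked';
}
```

Inside the callback passed to $channel->run(), the body would then be a single call to processDelivery($message, $channel, 'handleMessage').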

Prefetch count = 1

Diagram: Queue, Consumer A, Consumer B, Consumer C.

Prefetch count is a per-consumer setting that controls how many messages the consumer will preload. Receiving a single message at a time adds overhead; it is better to receive multiple messages at once.

Prefetch count = 3

Diagram: Queue, Consumer A, Consumer B, Consumer C.

If you use a higher prefetch count for time-consuming messages, they can all be prefetched by the first consumer, leaving the other consumers with nothing to do. Not ideal.

Prefetch count = 10 or higher

Diagram: Queue, Consumer A, Consumer B, Consumer C.

A higher prefetch count is ideal for a high number of quickly consumed messages.
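In Bunny, the prefetch count is set on the channel with qos() before consuming starts. The sketch below wraps that call in a hypothetical helper, configurePrefetch, which picks between the two values discussed on the slides; the helper and its boolean parameter are assumptions for illustration, and $channel is left untyped so the snippet runs without a broker.

```php
<?php
// Sketch: choose a prefetch count based on how slow the messages are.
// $channel is expected to be a \Bunny\Channel.
function configurePrefetch($channel, bool $slowMessages): int
{
    // Assumption: 1 for time-consuming messages, 10 for quick ones.
    $prefetchCount = $slowMessages ? 1 : 10;

    // The first argument is the prefetch size in bytes; 0 means no byte limit.
    $channel->qos(0, $prefetchCount);

    return $prefetchCount;
}
```

The right number depends on how long a message takes to process, so it is worth measuring rather than guessing.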

Asynchronous consumer

    $httpClient->requestAsync('GET', $url)
        ->then(
            function (ResponseInterface $res) use ($channel, $message) {
                $channel->ack($message);
            }
        );

With asynchronous processing, you can start consuming the next message while still waiting for something to finish for the previous message. This needs prefetch count > 1.

Deploying a consumer

• Use supervisord to keep the process alive
• Restart the process when deploying a new version
• Implement pcntl_signal to handle kill signals
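The last point, handling kill signals, could look roughly like the sketch below: a signal handler only sets a flag, and the consumer checks the flag between messages so it never stops in the middle of one. ShutdownFlag and consumeUntilStopped are hypothetical names, and the loop over $nextJob stands in for the real consumer callback. The pcntl extension is required.

```php
<?php
// Sketch: remember that a shutdown was requested instead of dying mid-message.
final class ShutdownFlag
{
    public bool $requested = false;

    public function register(): void
    {
        // Deliver signals asynchronously, without declare(ticks=1).
        pcntl_async_signals(true);
        $handler = function (int $signal): void {
            $this->requested = true;
        };
        pcntl_signal(SIGTERM, $handler); // default stop signal of supervisord
        pcntl_signal(SIGINT, $handler);  // Ctrl+C during development
    }
}

// Hypothetical loop: $nextJob returns the next job or null when there is none.
function consumeUntilStopped(ShutdownFlag $flag, callable $nextJob, callable $handle): int
{
    $handled = 0;
    // The flag is checked only between jobs, so a job is never interrupted.
    while (!$flag->requested && ($job = $nextJob()) !== null) {
        $handle($job);
        $handled++;
    }
    return $handled;
}
```

When a new version is deployed and the process is restarted, the message being processed at that moment still finishes and gets acknowledged.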

You specify how many instances of a process you want to run at the same time, which is the number of consumers of a queue. The Supervisor web UI can be used to control the running processes.

Gotchas & Traps

The downsides and problems of using RabbitMQ in your application are not the fault of the technology. They come from how different it is from the way we usually write code. It's essentially parallel programming. New technology is sometimes blamed for our own bugs.

It's really fast!

    $databaseConnection->beginTransaction();
    // do stuff, insert rows into database…
    // and publish a message
    $channel->publish($message, …);
    // the message can arrive at the consumer
    // before the transaction is committed!
    $databaseConnection->commit();
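One way to avoid the race shown above is to collect messages while the transaction is open and publish them only after the commit succeeds. The following is a minimal sketch of that idea, not the talk's solution: runInTransactionThenPublish is a hypothetical helper, $db is anything with beginTransaction(), commit() and rollBack() (PDO has all three), and $publish is whatever actually sends the message.

```php
<?php
// Sketch: buffer messages during the transaction, publish after commit.
function runInTransactionThenPublish($db, callable $work, callable $publish): void
{
    $outbox = [];

    $db->beginTransaction();
    try {
        // $work receives a function it can call to queue a message body.
        $work(function (string $body) use (&$outbox): void {
            $outbox[] = $body;
        });
        $db->commit();
    } catch (Throwable $e) {
        $db->rollBack();
        throw $e; // nothing has been published at this point
    }

    // The rows are now visible to other connections, so consumers can read them.
    foreach ($outbox as $body) {
        $publish($body);
    }
}
```

The trade-off is that a crash between the commit and the publish loses the message, so this sketch suits cases where a missed message can be detected and republished later.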

Clear in-memory caches before consuming the next message

For example, the Doctrine identity map. Otherwise the consumer will see data older than what is in the database.

Higher probability of deadlocks

Two messages with the same content

id   user    created      sent
5    75501   2016-10-02   false

Diagram: two identical Messages, one consumed by Consumer A and one by Consumer B, both touching the row above.

If you publish two identical messages and they are consumed at the same time, they can both lock the same rows, resulting in a deadlock.

Higher probability of deadlocks

Two messages with the same content

Diagram: Message, Message, Aggregating consumer, "Unique message every 5 min.", Queue, Consumer A, Consumer B.

If you encounter this situation (it's hard to discover beforehand), push the messages to a queue with a single consumer, filter the messages there and forward them to another queue.
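The filtering step inside the aggregating consumer can be sketched as a small in-memory deduplicator. This is an assumption about how such a filter might work, not the speaker's implementation: the class name Deduplicator, hashing the body with sha1, and the 300-second default (taken from "every 5 min." on the slide) are all illustrative choices.

```php
<?php
// Sketch: forward a given message body at most once per time window.
final class Deduplicator
{
    /** @var array<string, int> body hash => unix timestamp of last forward */
    private array $lastForwarded = [];
    private int $windowSeconds;

    public function __construct(int $windowSeconds = 300)
    {
        $this->windowSeconds = $windowSeconds;
    }

    public function shouldForward(string $body, int $now): bool
    {
        $key = sha1($body);
        if (isset($this->lastForwarded[$key])
            && $now - $this->lastForwarded[$key] < $this->windowSeconds) {
            return false; // same content was already forwarded recently
        }
        // Note: entries are never evicted here; a long-running consumer
        // would need to prune old keys to keep memory bounded.
        $this->lastForwarded[$key] = $now;
        return true;
    }
}
```

Because the state lives in one process, this only works when the first queue has exactly one consumer, as the slide recommends.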

Time travel?

Diagram: Producer (correct time), Message, Consumer (correct time minus 3 minutes).

A message can be consumed on a different server than the one it was produced on. You should synchronize time on your servers regularly and also set up monitoring that compares the time on each server to a single source.

Check if data is still valid

Diagram: Create order, Publish message, Cancel order, Consume message (?).

Messages can be consumed several hours after they were produced. The state of the referenced data can change in the meantime, so you should check whether the message still makes sense.
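For the order example above, the check can be as small as reloading the current state before acting on the message. In this sketch everything is hypothetical: the payload key orderId, the status value 'created', and the $loadOrderStatus callable that stands in for a fresh database read.

```php
<?php
// Sketch: decide whether a message about an order is still worth processing.
// $loadOrderStatus returns the current status, or null if the order is gone.
function shouldProcessOrderMessage(array $payload, callable $loadOrderStatus): bool
{
    // Always read the current state; never trust the state at publish time.
    $status = $loadOrderStatus($payload['orderId']);

    // Assumption: only orders still in the 'created' state need processing.
    return $status === 'created';
}
```

When the function returns false, the consumer would typically acknowledge the message and move on, since retrying cannot make it valid again.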

Backwards compatibility for old messages

A new version of a consumer can receive a message written for the old version.
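One way to cope with this is to put a version number into every message and let the consumer normalize old shapes into the current one. The sketch below assumes JSON payloads; the field names (version, user, userId, template) and both message layouts are invented for illustration.

```php
<?php
// Sketch: accept both an old and a new message layout.
function decodeEmailMessage(string $json): array
{
    $data = json_decode($json, true, 512, JSON_THROW_ON_ERROR);

    // Messages published before versioning was introduced have no version.
    $version = $data['version'] ?? 1;

    switch ($version) {
        case 1:
            // Hypothetical v1 layout: only the user id under the key "user".
            return ['userId' => $data['user'], 'template' => 'default'];
        case 2:
            // Hypothetical v2 layout: renamed key plus a template name.
            return ['userId' => $data['userId'], 'template' => $data['template']];
        default:
            throw new UnexpectedValueException("Unknown message version $version");
    }
}
```

The old branch can be removed once the queue no longer contains any messages published by the previous release.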

Clustering & High Availability

Redundancy is important. You don't have to worry much about RabbitMQ itself going down (it has never happened to me), but you should have a backup in case the whole server goes down.

Clustering

• Same Erlang cookie (a file) on each server
• Start independent nodes from CLI
• Set them to join the cluster from CLI
• Check cluster_status

https://www.rabbitmq.com/clustering.html

High Availability

Diagram: Server A and Server B each hold Queue 1 to Queue 5, each marked as Master or Slave; a Consumer connects to the cluster.

Each queue has one master server and synchronized mirrors (slaves) on other servers. Which server is the master can differ from queue to queue.

It doesn't matter which server you connect to; the cluster will automatically reroute your connection.

High Availability

https://www.rabbitmq.com/ha.html

Diagram: Server A with Queue 1 to Queue 5, each marked as Master or Slave.

When a server with the master queue goes down, one of the slaves is promoted to master. Clustering & High Availability is a complex topic; check the documentation for details.

@OndrejMirtes feedback: https://joind.in/talk/674b0
