Slide 1

Slide 1 text

Asynchronous processing with RabbitMQ
Ondrej Mirtes
PHPCon Poland 2016

Slide 2

Slide 2 text

Sending an email – directly
Diagram: HTTP request, PHP application, Send the email, SMTP server, HTTP response
There are many approaches you can take when sending an email from your app. In this case, the PHP application connects to the SMTP server while the user is still waiting for the HTTP response.

Slide 3

Slide 3 text

Sending an email – directly
Pros
● Email arrives instantly
Cons
● User waits longer for the response
● If the SMTP server fails – show the error to the user or stay silent?
● No chance to repeat the attempt in case of failure

Slide 4

Slide 4 text

Sending an email – through cron
Diagram: HTTP request, PHP application, Write a row, Database, HTTP response
When using cron, we're not sending the email directly, but saving data to the database to be able to generate and send the email later.

Slide 5

Slide 5 text

Sending an email – through cron
Diagram: Crontab, PHP script, SMTP server, Send the email
A few seconds or minutes later, the cron job is executed. The PHP script picks up the queue of emails from the database and sends them one by one. There is a lot of idle time between the runs.

Slide 6

Slide 6 text

Sending an email – through cron
    id  user   created           sent
    5   75501  2016-10-02 11:30  false
Diagram: Process A, Process B
If you run the cron job every minute and the first script takes longer than a minute, you suddenly have two overlapping scripts reading the same data, which risks sending duplicate emails.

Slide 7

Slide 7 text

Sending an email – through cron
Pros
● User does not wait for the HTTP response
● Can be repeated in case of failure
Cons
● Email does not arrive instantly
● Susceptible to duplicates
● Does not scale to multiple sending scripts

Slide 8

Slide 8 text

Sending an email – through a message queue
Diagram: HTTP request, PHP application, Send a message, Message queue, HTTP response
The schema is similar to cron, but the result is different: the email is picked up instantly.

Slide 9

Slide 9 text

Sending an email – through a message queue
Diagram: Queue, Consumer, SMTP server, Send the email

Slide 10

Slide 10 text

Sending an email – through a message queue
Pros
● User does not wait for the HTTP response
● Can be repeated in case of failure
● Delivers exactly once
● It's instant
● It's scalable
Cons
● Complex deployment
● Watch out for gotchas

Slide 11

Slide 11 text

RabbitMQ is an open source technology written in Erlang. It communicates with your application using AMQP. Its documentation is thorough and well written.

Slide 12

Slide 12 text

Flow of a message
Diagram: Producer, Exchange, Queue, Consumer, Consumer, Broker
A message is not sent to a queue directly, but to an exchange first. There are several exchange types: direct, fanout, topic. A direct exchange forwards a message to a queue based on the routing key, which for the default exchange is the queue name.

Slide 13

Slide 13 text

The broker is the central component of RabbitMQ. It has a web UI that can be used to configure the broker, monitor your queues, and publish and read messages.

Slide 14

Slide 14 text

Each queue should have a different purpose. You can use RabbitMQ to send emails, run bulk operations on data, prepare huge files for download, and handle inter-process communication.

Slide 15

Slide 15 text

You can see six consumers from multiple servers listening for new messages from this queue. You can control the performance of the system by tuning the number of consumers.

Slide 16

Slide 16 text

RabbitMQ and PHP
    composer require php-amqplib/php-amqplib
If you're already using RabbitMQ, perhaps you've already heard about this library. It's a well-established one, but its API shows its age.

Slide 17

Slide 17 text

RabbitMQ and PHP
    composer require bunny/bunny
Bunny is the fastest alternative library out there and it has a nice, consistent API. It also has an asynchronous client, which you can benefit from a lot if you're familiar with ReactPHP.

Slide 18

Slide 18 text

Bunny – message producer
    $bunny = new \Bunny\Client($options);
    $bunny->connect();
    $channel = $bunny->channel();
    $channel->queueDeclare('queue_name');
    $channel->publish(
        $message,
        $headers,
        '', // default forwarding exchange
        'queue_name'
    );

Slide 19

Slide 19 text

Bunny – message consumer
A consumer is a long-running PHP process. PHP is ready for long-running processes: memory leaks are no longer the fault of the language. But a long-running process certainly requires more discipline and more thought about memory allocation.

Slide 20

Slide 20 text

Bunny – message consumer
    use Bunny\Channel;
    use Bunny\Message;

    $channel->run(
        function (Message $message, Channel $channel) {
            handleMessage($message);
            $channel->ack($message);
            // or requeue: $channel->reject($message, $requeue);
        },
        'queue_name'
    );
This is what you should run as a command line script. If everything went as expected, acknowledge the message. If you're no longer interested in the message, reject it. If you want to try processing it later, requeue it.
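
The acknowledge / reject / requeue decision described above could be wrapped like this. It is a minimal sketch, not Bunny's own API: `consumeOne` and the `TemporaryFailure` exception are hypothetical names, and `$channel` is assumed to offer only the `ack($message)` and `reject($message, $requeue)` calls that appear on the slide.

```php
<?php
declare(strict_types=1);

// Hypothetical marker exception: the failure is temporary and worth retrying.
final class TemporaryFailure extends \RuntimeException {}

/**
 * Runs $handler for one message and tells the broker what happened.
 * $channel only needs ack($message) and reject($message, $requeue).
 *
 * @return string 'ack', 'requeue' or 'reject' (handy for logging)
 */
function consumeOne($channel, $message, callable $handler): string
{
    try {
        $handler($message);
    } catch (TemporaryFailure $e) {
        $channel->reject($message, true);   // put it back and try again later
        return 'requeue';
    } catch (\Throwable $e) {
        $channel->reject($message, false);  // no longer interested: drop it
        return 'reject';
    }

    $channel->ack($message);                // everything went as expected
    return 'ack';
}
```

Inside the callback passed to `$channel->run()`, the body would then shrink to a single call such as `consumeOne($channel, $message, 'handleMessage');`.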

Slide 21

Slide 21 text

Prefetch count = 1
Diagram: Consumer A, Queue, Consumer B, Consumer C
Prefetch count is a per-consumer setting that controls how many messages the consumer will preload. Receiving a single message carries an overhead, so it is better to receive multiple messages at once.

Slide 22

Slide 22 text

Prefetch count = 3
Diagram: Consumer A, Queue, Consumer B, Consumer C
If you use a higher prefetch count for time-consuming messages, they will all be prefetched by the first consumer and the other consumers will have nothing to do. Not ideal.

Slide 23

Slide 23 text

Prefetch count = 10 or higher
Diagram: Consumer A, Queue, Consumer B, Consumer C
A higher prefetch count is ideal for a high number of quickly consumed messages.
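
The prefetch rule of thumb from the last three slides can be written down as a small helper. This is a sketch under assumptions: `applyPrefetch` is a hypothetical name, the values 1 and 10 are simply the ones from the slides, and `$channel` is assumed to expose `qos($prefetchSize, $prefetchCount)`, which is how Bunny's channel maps the AMQP `basic.qos` method.

```php
<?php
declare(strict_types=1);

/**
 * Sets the prefetch count on a channel before consuming starts.
 * Assumes $channel has qos($prefetchSize, $prefetchCount).
 *
 * @return int the prefetch count that was applied
 */
function applyPrefetch($channel, bool $messagesAreSlow): int
{
    // Slow messages: take one at a time so other consumers get work too.
    // Fast messages: take a batch to spread the per-message overhead.
    $prefetchCount = $messagesAreSlow ? 1 : 10;

    $channel->qos(0, $prefetchCount); // 0 = no limit on prefetched bytes

    return $prefetchCount;
}
```

It would be called once, after opening the channel and before `$channel->run()`.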

Slide 24

Slide 24 text

Asynchronous consumer
    $httpClient->requestAsync('GET', $url)
        ->then(
            function (ResponseInterface $res) use ($channel, $message) {
                $channel->ack($message);
            }
        );
With asynchronous processing, you can start consuming the next message while still waiting for something to finish for the previous message. This needs a prefetch count > 1.

Slide 25

Slide 25 text

Deploying a consumer
● Use supervisord to keep the process alive
● Restart the process when deploying a new version
● Implement pcntl_signal to handle kill signals
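
The pcntl_signal point could look roughly like the following. It is a minimal sketch that needs the pcntl extension; `runUntilSignalled` and the `$processNext` callable are hypothetical names standing in for "handle one message". The signal handler only sets a flag, so the message currently being processed is finished before the process exits.

```php
<?php
declare(strict_types=1);

/**
 * Calls $processNext repeatedly until SIGTERM or SIGINT arrives.
 *
 * @return int number of completed iterations
 */
function runUntilSignalled(callable $processNext): int
{
    $stopRequested = false;
    $requestStop = function () use (&$stopRequested): void {
        $stopRequested = true;
    };

    pcntl_signal(SIGTERM, $requestStop); // supervisord's default stop signal
    pcntl_signal(SIGINT, $requestStop);  // Ctrl+C during development

    $iterations = 0;
    while (!$stopRequested) {
        $processNext();           // hypothetical: handle exactly one message
        $iterations++;
        pcntl_signal_dispatch();  // run the handlers for any pending signals
    }

    return $iterations;
}
```

After the loop returns, the script can close its connections and exit normally, and supervisord starts the new version.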

Slide 26

Slide 26 text

You specify how many copies of a process should run at the same time, which is the number of consumers of a queue. The Supervisor web UI can be used to control the running processes.

Slide 27

Slide 27 text

Gotchas & Traps
The downsides and problems of using RabbitMQ in your application are not the fault of the technology. They come from how different it is from the way we usually write code every day: it's essentially parallel programming. New technology is sometimes blamed for our own bugs.

Slide 28

Slide 28 text

It's really fast!
    $databaseConnection->beginTransaction();
    // do stuff, insert rows into database…
    // and publish a message
    $channel->publish($message, …);
    // the message will arrive at the consumer
    // before the transaction is committed!
    $databaseConnection->commit();
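
One way to avoid the race shown above is to collect the messages during the transaction and publish them only after the commit. This is a sketch under assumptions, not something the slides prescribe: `commitThenPublish` is a hypothetical helper, the database connection is assumed to be PDO, and `$channel` only needs the `publish($body, $headers, $exchange, $routingKey)` call from the producer slide.

```php
<?php
declare(strict_types=1);

/**
 * Runs $work inside a database transaction and publishes the messages it
 * returns only after the commit has succeeded.
 *
 * $work receives the PDO connection and returns a list of message bodies.
 */
function commitThenPublish(\PDO $db, $channel, string $queue, callable $work): void
{
    $db->beginTransaction();
    try {
        $messages = $work($db);
        $db->commit();
    } catch (\Throwable $e) {
        if ($db->inTransaction()) {
            $db->rollBack();  // nothing was published, so nothing to undo there
        }
        throw $e;
    }

    // From here on the rows are visible to the consumer's own connection.
    foreach ($messages as $body) {
        $channel->publish($body, [], '', $queue);
    }
}
```

The trade-off is that a crash between the commit and the publish loses the message, so the committed rows should still let you detect and resend it.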

Slide 29

Slide 29 text

Clear in-memory caches before consuming the next message
For example, the Doctrine identity map. Otherwise the consumer will see data that is older than what is in the database.

Slide 30

Slide 30 text

Higher probability of deadlocks
Two messages with the same content
    id  user   created     sent
    5   75501  2016-10-02  false
Diagram: Message, Message, Consumer A, Consumer B
If you publish two identical messages and they are consumed at the same time, they can both lock the same rows, resulting in a deadlock.

Slide 31

Slide 31 text

Higher probability of deadlocks
Two messages with the same content
Diagram: Message, Message, Aggregating consumer, Queue, Unique message every 5 min., Consumer A, Consumer B
If you encounter this situation (it's hard to discover beforehand), push messages to a queue with a single consumer, filter the messages and forward them to another queue.
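
The filtering step of the aggregating consumer could be sketched as below. The class name `UniqueMessageFilter` and the choice of hashing the message body are assumptions; the 300 second default mirrors the "unique message every 5 min." label in the diagram. Keeping the state in memory is only safe because this queue has a single consumer.

```php
<?php
declare(strict_types=1);

/**
 * Lets an identical message body through at most once per time window.
 */
final class UniqueMessageFilter
{
    /** @var array<string, int> body hash => timestamp of the last forward */
    private $lastForwarded = [];

    /** @var int */
    private $windowSeconds;

    public function __construct(int $windowSeconds = 300)
    {
        $this->windowSeconds = $windowSeconds;
    }

    /** Returns true when the message should be forwarded to the next queue. */
    public function shouldForward(string $body, int $now): bool
    {
        // Forget expired entries so a long-running consumer does not grow forever.
        foreach ($this->lastForwarded as $hash => $timestamp) {
            if ($now - $timestamp >= $this->windowSeconds) {
                unset($this->lastForwarded[$hash]);
            }
        }

        $key = sha1($body);
        if (isset($this->lastForwarded[$key])) {
            return false; // an identical message went through recently
        }

        $this->lastForwarded[$key] = $now;
        return true;
    }
}
```

The aggregating consumer would acknowledge every incoming message and publish to the second queue only when `shouldForward($body, time())` returns true.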

Slide 32

Slide 32 text

Time travel?
Diagram: Producer (correct time), Message, Consumer (correct time minus 3 minutes)
A message can be consumed on a different server than the one it was produced on. You should synchronize time on your servers regularly and also set up monitoring that compares the time on each server to a single source.

Slide 33

Slide 33 text

Check if data is still valid
Diagram: Create order, Publish message, Cancel order, Consume message (?)
Messages can be consumed several hours after they were produced. The state of the referenced data can change in the meantime, so you should check whether the message still makes sense.
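
For the order example in the diagram, the check at consume time could be as small as this. It is a sketch with hypothetical names: `decideForOrder`, the `$findOrderStatus` lookup and the `'cancelled'` status string are all assumptions made for illustration.

```php
<?php
declare(strict_types=1);

/**
 * Decides what to do with an "order created" message at consume time.
 * $findOrderStatus is a hypothetical lookup that returns the current status
 * of an order id, or null when the order no longer exists.
 *
 * @return string 'process' or 'skip'
 */
function decideForOrder(int $orderId, callable $findOrderStatus): string
{
    // Read the current state; the message only says what was true when published.
    $status = $findOrderStatus($orderId);

    if ($status === null || $status === 'cancelled') {
        return 'skip';    // acknowledge the message without doing the work
    }

    return 'process';
}
```

A skipped message should still be acknowledged, otherwise it stays in the queue.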

Slide 34

Slide 34 text

Backwards compatibility for old messages
A new version of a consumer can receive a message written for the old version.
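
One common way to stay compatible is to put a version number into the message and normalize old payloads when consuming. The slides do not prescribe a scheme, so everything here is an assumption: the JSON format, the `version` field, both payload layouts and the function name `normalizeEmailMessage` are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Normalizes a JSON message body to the newest payload shape.
 * Hypothetical layouts:
 *   v1 (no version field): {"user": 75501}
 *   v2: {"version": 2, "userId": 75501, "template": "welcome"}
 *
 * @return array{userId: int, template: string}
 */
function normalizeEmailMessage(string $body): array
{
    $data = json_decode($body, true);
    if (!is_array($data)) {
        throw new \InvalidArgumentException('Message body is not a JSON object');
    }

    // Messages published before versioning was introduced count as version 1.
    $version = (int) ($data['version'] ?? 1);

    switch ($version) {
        case 1:
            return ['userId' => (int) $data['user'], 'template' => 'default'];
        case 2:
            return ['userId' => (int) $data['userId'], 'template' => (string) $data['template']];
        default:
            throw new \InvalidArgumentException("Unsupported message version: $version");
    }
}
```

The rest of the consumer then works with one shape only, and support for version 1 can be dropped once the old messages have drained from the queue.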

Slide 35

Slide 35 text

Clustering & High Availability
Redundancy is important. You don't have to worry much about RabbitMQ itself going down (it hasn't happened to me even once), but you should have a backup in case the whole server goes down.

Slide 36

Slide 36 text

Clustering
● Same Erlang cookie (a file) on each server
● Start independent nodes from CLI
● Set them to join the cluster from CLI
● Check cluster_status
https://www.rabbitmq.com/clustering.html

Slide 37

Slide 37 text

High Availability
Diagram: Server A, Server B, Consumer; Queue 1 to Queue 5 on each server; legend: Master, Slave
Each queue has one master server and mirrored, synchronized slaves. Which server is the master can differ from queue to queue.

Slide 38

Slide 38 text

High Availability
Diagram: Server A, Server B, Consumer; Queue 1 to Queue 5 on each server; Queue 1; legend: Master, Slave
It doesn't matter which server you connect to; the cluster will automatically reroute your connection.

Slide 39

Slide 39 text

High Availability
Diagram: Server A, Server B, Consumer; Queue 1 to Queue 5 on each server; Queue 2; legend: Master, Slave
It doesn't matter which server you connect to; the cluster will automatically reroute your connection.

Slide 40

Slide 40 text

High Availability
https://www.rabbitmq.com/ha.html
Diagram: Server A; Queue 1 to Queue 5; legend: Master, Slave
When the server holding the master queue goes down, one of the slaves is promoted to master. Clustering and high availability are complex topics; check the documentation for details.

Slide 41

Slide 41 text

@OndrejMirtes feedback: https://joind.in/talk/674b0