Slide 1

Slide 1 text

Full AI potential with async PHP Nemanja Marić maki10

Slide 2

Slide 2 text

About me ● PHP Developer ● Member of the PHP Serbia community ● Co-organizer of the Laravel Serbia and PHP Serbia meetups ● Working with PHP since 2014 ● In the Laravel world since 2016 ● Open source contributor ● Contributing to the Laravel framework ● And most !important: husband and father of two little angels ● Love dad jokes

Slide 3

Slide 3 text

Latest motto

Slide 4

Slide 4 text

Story: What is our representative module?

Slide 5

Slide 5 text

No content

Slide 6

Slide 6 text

Story: Let’s look at what our representative module is.

Slide 7

Slide 7 text

Imagine an application that can harvest all data about one person or company. For example, if you want to know more about Tesla, the Cybertruck, or Elon Musk: who mentioned the company, whether the mention was good or bad, and a lot more. So, you harvest all available data and process it. What is “all”? Articles, news, blogs, images, Twitter, documents, etc. The app collects all available data, converts it into a form AI can work with, and extracts the desired data. An AI-based app for processing news and documents.

Slide 8

Slide 8 text

When you type “Elon Musk”, how much data do you think we get?

Slide 9

Slide 9 text

How would I describe this app? “Beauty and the Beast”

Slide 10

Slide 10 text

Beauty mode: well engineered. The application is designed 90% the way you think it should be designed.

Slide 11

Slide 11 text

Beast mode: a multi-server application whose CPU peaks at roughly 90% almost all the time, with small RAM usage.

Slide 12

Slide 12 text

“Ana”, the main application. Our main application is a Laravel API-based application. It runs on a droplet with 64GB of RAM and 32 CPU cores, capable of scraping 10k jobs per hour at only 10% of its full capacity. Two MySQL read instances and one write instance; MySQL can handle 6,400 parallel connections. Zero cache. Scaled down from 128GB of RAM after a 50% revenue cut. For this presentation, we will focus on the Laravel queue and, later, on OpenSearch (Elasticsearch).

Slide 13

Slide 13 text

How we make PHP async?

Slide 14

Slide 14 text

Laravel Queue (via Horizon). While building your web application, you may have some tasks, such as parsing and storing an uploaded CSV file, that take too long to perform during a typical web request. Thankfully, Laravel allows you to easily create queued jobs that may be processed in the background. By moving time-intensive tasks to a queue, your application can respond to web requests with blazing speed and provide a better user experience to your customers.
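The pattern described above can be sketched as a minimal queued job; the class name, path, and CSV logic are hypothetical, not from the talk's codebase:

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

// Hypothetical job: parse an uploaded CSV in the background
// instead of during the web request.
class ProcessCsvUpload implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(private string $path) {}

    public function handle(): void
    {
        // Heavy work runs here, on a queue worker,
        // not inside the HTTP request cycle.
        foreach (file($this->path) as $line) {
            // ... parse and store each row ...
        }
    }
}
```

Dispatching it from a controller is one line, e.g. `ProcessCsvUpload::dispatch($path);`, and the web request returns immediately while Horizon workers pick the job up.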

Slide 15

Slide 15 text

Scaling Horizon ● Scaling by Horizon workers ● Scaling by queue workers
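For reference, Horizon worker scaling lives in `config/horizon.php`; the values below are illustrative, not the talk's actual configuration:

```php
// config/horizon.php (fragment) — illustrative values.
// Horizon scales worker processes per environment and can
// rebalance them across queues automatically.
'environments' => [
    'production' => [
        'supervisor-1' => [
            'connection'   => 'redis',
            'queue'        => ['default', 'scraping'],
            'balance'      => 'auto', // shift workers toward busy queues
            'minProcesses' => 1,
            'maxProcesses' => 20,     // upper cap on worker processes
        ],
    ],
],
```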

Slide 16

Slide 16 text

Good scaling example

Slide 17

Slide 17 text

No content

Slide 18

Slide 18 text

Pitfall 1: create a script to send a large number of emails to N users, incorporating intricate logic in the email content. Specify the content of the emails, including subject, body, and any attachments. How to handle it? ● API call ● Job ● Command

Slide 19

Slide 19 text

Wrong: you should optimize the logic first.
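One way “optimize the logic first” can look in practice: avoid N+1 queries and one giant loop, and instead chunk users and dispatch lightweight jobs. The model, relation, and job names here are hypothetical:

```php
// Hypothetical sketch: eager-load what the email needs and
// dispatch one small queued job per user, in chunks.
use App\Jobs\SendCampaignEmail;
use App\Models\User;

User::query()
    ->with('preferences')          // eager-load to avoid N+1 queries
    ->where('subscribed', true)
    ->chunkById(500, function ($users) {
        foreach ($users as $user) {
            // Each job serializes only the user id, keeping
            // the queue payload small.
            SendCampaignEmail::dispatch($user->id);
        }
    });
```

With this shape, workers process emails in parallel and a failure retries one email, not the whole batch.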

Slide 20

Slide 20 text

Pitfall 2: create a script to generate two thumbnail images for each of N pages of PDFs, specifying the desired dimensions and format for the thumbnails. How to handle it? ● API call ● Job ● Command

Slide 21

Slide 21 text

We need to scale up?!? We have optimized as much as we can. It's clear that this is a heavy task, and we need to scale the server to process it.

Slide 22

Slide 22 text

The bill increases by 2x. Who is going to tell the client? Checklist: 1. It's our fault (hell no) 2. PHP is not the right choice (hell yes) How to solve it: 1. Python 2. Node.js 3. Call a senior? (Huh, no) PM: Who's gonna implement the changes? Devs: A senior dev. PM: So let's call a senior.

Slide 23

Slide 23 text

A senior joins the project with a first task: write a Python microservice to support the thumbnail logic in the XY component.

Slide 24

Slide 24 text

A senior joins the project with a first task: write a Python microservice to support the thumbnail logic in the XY component. Instead: optimizing here and there, and scaling the server down.

Slide 25

Slide 25 text

The bill decreased by 2x. Conference call: always try to optimize your code as much as possible. Before adopting another language into your codebase, check whether it is really needed. When do you need to consider another language? ● When the language has proven benefits. ● When the language has better support and can process much more data in a given time.

Slide 26

Slide 26 text

Microservice 1 “Ceca”

Slide 27

Slide 27 text

“Ceca” is an async Node.js scraper (news, blogs, Twitter…). A droplet with 24GB of RAM and 32 CPU cores. At the moment I joined the project, it was capable of scraping 10k jobs per hour. PM2 running Node.js.

Slide 28

Slide 28 text

Unlocking its full potential: 10k in a minute, or 500k in an hour.

Slide 29

Slide 29 text

Pitfall 3: how to scrape, for example, Tesla the company, not the person? Apple the company, not the fruit. Mars the chocolate company, not the planet. How to handle this, for Tesla for example?

Slide 30

Slide 30 text

Pitfall 3: how to scrape, for example, Tesla the company, not the person? Apple the company, not the fruit. Mars the chocolate company, not the planet. How we handle this, for Tesla for example: 1. Adding custom indicators to the search query (-person) 2. Adding company aliases to the search query (+tsla) 3. Adding the CEO to the search query (+“Elon Musk”)
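The three tricks above can be sketched as a small query builder; this helper and its signature are my illustration, not the project's actual code:

```php
<?php

// Hypothetical helper assembling a search query from exclusion
// indicators, company aliases, and the CEO's name.
function buildSearchQuery(string $term, array $exclude, array $aliases, ?string $ceo): string
{
    $parts = [$term];
    foreach ($exclude as $word) {
        $parts[] = '-' . $word;       // e.g. -person
    }
    foreach ($aliases as $alias) {
        $parts[] = '+' . $alias;      // e.g. +tsla
    }
    if ($ceo !== null) {
        $parts[] = '+"' . $ceo . '"'; // e.g. +"Elon Musk"
    }
    return implode(' ', $parts);
}

// buildSearchQuery('Tesla', ['person'], ['tsla'], 'Elon Musk')
// → 'Tesla -person +tsla +"Elon Musk"'
```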

Slide 31

Slide 31 text

Microservice 2 “Hana”

Slide 32

Slide 32 text

“Hana” is an async Python (FastAPI) microservice for processing PDFs. A droplet with 4GB of RAM and 4 CPU cores. “Hana”'s purpose is to slice PDFs into images and then send each page to AWS Textract. AWS then returns the parsed table data and all the text that we will use later in our app. So “Ana” communicates asynchronously with “Hana”.
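On “Ana”'s side, the hand-off to “Hana” might look like the sketch below, using Laravel's HTTP client; the hostname, endpoint, and payload shape are assumptions, not the project's real contract:

```php
// Hypothetical sketch of "Ana" handing a PDF to "Hana" and
// reading back the Textract results.
use Illuminate\Support\Facades\Http;

$response = Http::timeout(120)               // PDFs vary wildly in size
    ->post('http://hana.internal/process-pdf', [
        's3_key' => 'documents/example.pdf', // where the PDF lives in S3
    ]);

$parsed = $response->json(); // tables + text extracted via AWS Textract
```

In practice a call like this would itself run inside a queued job, so the web request never waits on PDF processing.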

Slide 33

Slide 33 text

Limitations, pitfall 4: we have PDFs with anywhere from 10 to 400 pages, and 90% of them contain tables or images, so each PDF has a different execution time. How do we stay within the AWS quota (50 transactions per second) in async mode? How to handle this?

Slide 34

Slide 34 text

Limitations, pitfall 4: we have PDFs with anywhere from 10 to 400 pages, and 90% of them contain tables or images, so each PDF has a different execution time. How do we stay within the AWS quota (50 transactions per second) in async mode? How we handle this: 1. Switching “Hana” from asynchronous to synchronous HTTP 2. Asynchronously slicing PDF pages and sending them to AWS 3. With math

Slide 35

Slide 35 text

How we keep within the limits. We know that AWS can process up to 50 images per second (transactions per second), but if an image contains more than one table, its transaction count increases by 1. Let's assume a PDF with 10 pages has 8 tables in total; in that case, we know that one PDF will need 8 transactions. Downloading from S3 and putting all the images back takes roughly 2 seconds per page, and there are several other factors we need to keep track of. Our math in the end comes out to 70 workers working asynchronously from the master application (“Ana”), which hits “Hana” and waits for a response before dispatching a new job.
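A rate limit like this can also be enforced directly in code with Laravel's Redis throttle; the sketch below shows the general shape inside a queued job's `handle()` method, with a hypothetical job and throttle key:

```php
// Hypothetical sketch: keep Textract calls under 50 TPS using
// Laravel's Redis::throttle inside a queued job.
use Illuminate\Support\Facades\Redis;

public function handle(): void
{
    Redis::throttle('textract')
        ->allow(50)->every(1)      // at most 50 transactions per second
        ->then(function () {
            // Within quota: send this page image to Textract.
            $this->sendPageToTextract($this->page);
        }, function () {
            // Quota exhausted: put the job back on the queue
            // and retry after 1 second.
            $this->release(1);
        });
}
```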

Slide 36

Slide 36 text

Results: we stay between 70% and 95% of the threshold at all times.

Slide 37

Slide 37 text

Microservice 3 “Era”

Slide 38

Slide 38 text

“Era” is an async Python (FastAPI) microservice for sentiment analysis. A droplet with 4GB of RAM and 4 CPU cores. “Era”'s purpose is machine learning, and “Ana” communicates asynchronously with “Era”.

Slide 39

Slide 39 text

How does “Era” work? It takes the sentence we want to check, and will later train our model from pre-trained sentences. It tokenizes the sentence in a way that the machine-learning model can understand.

Slide 40

Slide 40 text

Pitfall 5: since our machine-learning model is written in Python and our main app is PHP-based, we need a way for them to communicate with each other. Yes, we all know that PHP can communicate with Python, but it's not compatible with our training model on Hugging Face. How to handle this?

Slide 41

Slide 41 text

Pitfall 5: since our machine-learning model is written in Python and our main app is PHP-based, we need a way for them to communicate with each other. Yes, we all know that PHP can communicate with Python, but it's not compatible with our training model on Hugging Face. How we handle this: “Era” is our adapter-pattern microservice that allows us to train our model as we wish. Since tokenization on Hugging Face for our model is written in Python, the best way is to train the model with Python.

Slide 42

Slide 42 text

Pack all together

Slide 43

Slide 43 text

When all microservices are under heavy load, “Ana” sits at only 40%; the reserved capacity allows the end user to work as usual.

Slide 44

Slide 44 text

No content

Slide 45

Slide 45 text

Pitfall 6: “Ana” is under heavy load from the microservices, but needs to prepare all data to be searchable via OpenSearch (Elastic). Since we are getting 10k results per minute from “Ceca”, Elastic would be DDoS-ed. How do we not tear down Elastic? How to handle this?

Slide 46

Slide 46 text

Pitfall 6: “Ana” is under heavy load from the microservices, but needs to prepare all data to be searchable via OpenSearch (Elastic). Since we are getting 10k results per minute from “Ceca”, Elastic would be DDoS-ed. How do we not tear down Elastic? How we handle this: “Ana” always prepares a small amount of data. When the user demands more data, “Ana” prepares it and pushes all the search data to Elastic. This is achieved with zero downtime; the end user doesn't have a clue. If scraping is in progress, “Ana” updates Elastic every 5 minutes.
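The “update Elastic every 5 minutes while scraping” behaviour maps naturally onto Laravel's scheduler; the job and the in-progress check below are hypothetical names for illustration:

```php
// Hypothetical sketch, in app/Console/Kernel.php: push buffered
// search data to Elastic every 5 minutes, but only while a
// scrape run is active.
use Illuminate\Console\Scheduling\Schedule;

protected function schedule(Schedule $schedule): void
{
    $schedule->job(new PushSearchDataToElastic)
        ->everyFiveMinutes()
        ->when(fn () => ScrapeRun::inProgress()); // skip when idle
}
```

Batching writes this way turns 10k individual index requests per minute into one bulk push every 5 minutes.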

Slide 47

Slide 47 text

Final pitfall 7: when you have multiple queue workers and they have multiple pending jobs, how do you add your job to the top? How do you make it the first job executed? How to handle this?

Slide 48

Slide 48 text

Final pitfall 7: when you have multiple queue workers and they have multiple pending jobs, how do you add your job to the top? How do you make it the first job executed? How we handle this: let's play `nice`.
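Like `nice` for Unix processes, Laravel lets you prioritize work: dispatch urgent jobs to a dedicated queue and tell the workers to drain that queue first (the job name here is hypothetical):

```php
// Send the urgent job to a dedicated "high" queue.
GenerateClientReport::dispatch($report)->onQueue('high');
```

The workers are then started with the queues listed in priority order, e.g. `php artisan queue:work redis --queue=high,default`, so a worker only touches `default` when `high` is empty.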

Slide 49

Slide 49 text

Questions? Links: www.google.com https://laravel.com/docs/10.x/ https://fastapi.tiangolo.com/ https://huggingface.co/ Slides: https://speakerdeck.com/maki10 Twitter: https://twitter.com/NemanjaMaki10