PHP projects beyond the LAMP stack, by Thijs Feryn

Hi, I’m Thijs

I’m @ThijsFeryn on Twitter

I’m an Evangelist At

I’m an at Evangelist

I’m a at board member

Jan 29th & 30th Antwerpen, Belgium

I need feedback

LAMP Stack?

Linux Apache MySQL PHP

✓ ~81% of the web ✓ Easy to learn ✓ Mature (PHP renaissance) ✓ Frameworks & CMS’s ✓ Lots of online resources ✓ THE COMMUNITY

➡ Still considered a scripting language (by some) ➡ Slow(-ish) ➡ Internal variable structure causes overhead ➡ Everyone can program in PHP, unfortunately everyone does ➡ Doesn’t scale that well (without the tricks)

PHP consumes lots of CPU, memory, and I/O

It gets worse “at scale”

Faster PHP

✓ Fast ✓ Just In Time compiler ✓ FastCGI support ✓ Drop-in replacement for PHP-FPM ✓ Hack language HHVM

Link HHVM to Nginx
server {
    listen 80 default_server;
    root /var/www/html;
    index index.php server_name;
    location / {
        try_files $uri $uri/ /index.php?$args;
    }
    location ~ \.php$ {
        include snippets/fastcgi-php.conf;
        fastcgi_pass;
    }
}

Drop-in replacement for PHP-FPM

HACK language

HACK language $x . $y; } function test(): void { $fn = foo(); echo $fn('baz'); // barbaz } Lambdas != closures

HACK language { $x = await \HH\Asio\curl_exec(""); return $x; } async function curl_B(): Awaitable { $y = await \HH\Asio\curl_exec(""); return $y; } async function async_curl(): Awaitable { $start = microtime(true); list($a, $b) = await \HH\Asio\v(array(curl_A(), curl_B())); $end = microtime(true); echo "Total time taken: " . strval($end - $start) . " seconds" . PHP_EOL; } \HH\Asio\join(async_curl()); Async

What kind of PHP code do we run? Why does it slow down?

Waiting for DB or API

HHVM won’t really speed up that process.

Different approach

Different tools

Varnish


Optimize database: Avoid. Optimize runtime: Avoid.

Don’t recompute if the data hasn’t changed

Varnish


Reverse caching Proxy

Normally: User → Server

Forward proxy: User → Proxy → Server (proxy in the office)

Reverse proxy: User → Varnish → Server (proxy in the data center)

It caches pages It’s all about HTTP

✓ Reverse caching proxy ✓ Load balancer ✓ Web application firewall (if you wish) ✓ HTTP accelerator Varnish

✓ Varnish Configuration Language ✓ Edge Side Include support ✓ Gzip compression/decompression ✓ Cache purging ✓ HTTP streaming ✓ Grace mode ✓ Configure backends ✓ Backend load balancing ✓ ACL protection ✓ VMODs in C Varnish

On the request side ✓ Only GET & HEAD ✓ No cookies ✓ No auth headers On the response side ✓ No “no-cache, no-store” ✓ TTL > 0 ✓ No Set-Cookie headers When does Varnish cache?
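The request-side and response-side rules above can be read as two small predicates. This is a minimal sketch, not Varnish's actual implementation: `req` and `res` are plain objects, and the field names (`method`, `headers`, `ttl`) are assumptions made for illustration.

```javascript
// Sketch of the default caching rules listed above (hypothetical object shapes).
function isCacheableRequest(req) {
  const method = req.method.toUpperCase();
  if (method !== 'GET' && method !== 'HEAD') return false; // only GET & HEAD
  if (req.headers.cookie) return false;                     // no cookies
  if (req.headers.authorization) return false;              // no auth headers
  return true;
}

function isCacheableResponse(res) {
  const cacheControl = (res.headers['cache-control'] || '').toLowerCase();
  if (/no-cache|no-store/.test(cacheControl)) return false; // backend forbids caching
  if (res.headers['set-cookie']) return false;              // response sets a cookie
  return res.ttl > 0;                                       // must have a lifetime
}
```

A single cookie on the request is enough to bypass the cache, which is why stripping tracking cookies tends to be one of the first VCL extensions people write.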

Extend the default behavior in VCL

This is the default behavior Even if you don’t use the VCL

vcl 4.0;

sub vcl_recv {
    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }
    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (hash);
}
Incoming request

sub vcl_hash { hash_data(req.url); if ( { hash_data(; } else { hash_data(server.ip); } return (lookup); } sub vcl_purge { return (synth(200, "Purged")); } Compose hash key Evict cache keys

sub vcl_hit {
    if (obj.ttl >= 0s) {
        // A pure unadultered hit, deliver it
        return (deliver);
    }
    if (obj.ttl + obj.grace > 0s) {
        // Object is in grace, deliver it
        // Automatically triggers a background fetch
        return (deliver);
    }
    // fetch & deliver once we get the result
    return (fetch);
}

sub vcl_miss {
    return (fetch);
}

sub vcl_deliver {
    return (deliver);
}
Deliver stored object. Or fetch if it’s stale. Fetch data from backend. Deliver output to client.
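The hit logic above boils down to a three-way decision on the remaining TTL and the grace window. A minimal sketch, assuming `ttl` and `grace` are plain numbers in seconds (the object shape and the returned labels are made up for illustration):

```javascript
// Mirrors the decision in vcl_hit: fresh, stale-but-in-grace, or expired.
function onHit(obj) {
  if (obj.ttl >= 0) {
    return 'deliver'; // still fresh
  }
  if (obj.ttl + obj.grace > 0) {
    return 'deliver-stale-and-refresh'; // serve the old copy, refetch in the background
  }
  return 'fetch'; // too old, wait for the backend
}
```

With a grace of 30 seconds, an object that expired 10 seconds ago is still served while the backend is asked for a new copy.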

sub vcl_backend_response {
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Surrogate-control ~ "no-store" ||
        (!beresp.http.Surrogate-Control &&
            beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
        beresp.http.Vary == "*") {
        /*
         * Mark as "Hit-For-Pass" for the next 2 minutes
         */
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}
Response from the backend

What to extend?

✓Strip tracking cookies (Google Analytics, …) ✓Sanitize URL ✓URL whitelist/blacklist ✓PURGE ACLs ✓Edge Side Include rules ✓Always cache static files ✓Extend hash keys ✓Override TTL ✓Define grace mode What to extend?

Caching in your architecture

Respect HTTP

Cache-Control: public, max-age=3600, s-maxage=7200 VS Cache-Control: no-cache, no-store
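To make the difference between the two headers concrete, here is a rough sketch of how a shared cache such as Varnish could derive a TTL from `Cache-Control`. The function name `sharedCacheTtl` is hypothetical, and the parsing is deliberately simplified (no quoted values, no `Expires` fallback).

```javascript
// Returns the lifetime in seconds a shared cache would use,
// 0 when caching is forbidden, or null when the header gives no lifetime.
function sharedCacheTtl(headerValue) {
  const directives = new Map();
  for (const part of headerValue.split(',')) {
    const [name, value] = part.trim().toLowerCase().split('=');
    if (name) directives.set(name, value);
  }
  if (directives.has('no-cache') || directives.has('no-store') || directives.has('private')) {
    return 0;
  }
  // s-maxage targets shared caches and takes precedence over max-age.
  if (directives.has('s-maxage')) return parseInt(directives.get('s-maxage'), 10);
  if (directives.has('max-age')) return parseInt(directives.get('max-age'), 10);
  return null;
}
```

For the first header on the slide the proxy keeps the page for 7200 seconds while browsers keep it for 3600; for the second header nothing is stored.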

Varnish


The example

Don’t cache header

Cache products for 1 hour

Cache menus for 1 day

Edge Side Includes

The Demo
{% block content %}{% endblock %}
ESI tags

The Demo {{ render_esi(url('nav')) }}
{{ render_esi(url('jumbotron')) }}
{% block content %}{% endblock %}
{{ render_esi(url('footer')) }}
Twig template in Silex. ESI or internal subrequest. Uses HttpFragmentServiceProvider
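What the proxy does with ESI tags can be sketched in a few lines: it scans the page for `<esi:include src="…" />` placeholders and substitutes each one with a fragment that was cached (or fetched) separately. The fragment store, its URLs, and the `assembleEsi` helper below are hypothetical stand-ins, not part of the demo application.

```javascript
// Hypothetical fragment store: URL path -> separately cached HTML.
const fragments = {
  '/nav': '<nav>menu</nav>',
  '/footer': '<footer>footer</footer>',
};

// Replace every ESI include tag with the fragment returned by fetchFragment.
function assembleEsi(page, fetchFragment) {
  return page.replace(
    /<esi:include\s+src="([^"]+)"\s*\/>/g,
    (tag, src) => fetchFragment(src)
  );
}

const page = '<body><esi:include src="/nav" />content<esi:include src="/footer" /></body>';
const html = assembleEsi(page, (src) => fragments[src] || '');
```

Because each fragment is a separate cache object, the menu can live for a day while the surrounding page is not cached at all.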

Or just use AJAX. Async. Graceful degradation

Hit rate

Minimal VCL Let the application handle it

Where does it fit in?

✓Cache pages and static assets ✓Fastest way ✓Hit rate may vary ✓Chop your content up in pieces ✓Use ESI or AJAX ✓Gateway to your application Where does Varnish fit in?

✓Cache all images, js, css, woff, … ✓Cache product pages ✓Cache category pages ✓Cache parts of the layout (via ESI) ✓Cache CMS-ish pages Where does Varnish fit in?

You can also host your static files on a separate set of Nginx servers

Content Delivery Network

CDNs are nothing but a bunch of reverse caching proxies

Put the content where your user is

Avoids network saturation

User specific content

Keeping track of state

Varnish


Stateful caching vs high hit rate

Not all pages can be cached

Make the data retrieval faster

Typical MySQL issues

Data is stored for flexibility, not for performance. SQL (joins) allows different compositions of the same data.

Offload the database

Varnish

✓ Key-value store ✓ Fast ✓ Lightweight ✓ Data stored in RAM ✓ ~Memcached ✓ Data types ✓ Data persistence ✓ Replication ✓ Clustering Redis

Redis
$ redis-cli
> ping
PONG
> set mykey somevalue
OK
> get mykey
"somevalue"

✓ Strings ✓ Hashes ✓ Lists ✓ Sets ✓ Sorted sets ✓ Geo ✓ … Redis data types

Redis
$ redis-cli
> hset customer_1234 id 1234
(integer) 1
> hset customer_1234 items_in_cart 2
(integer) 1
> hmset customer_1234 firstname Thijs lastname Feryn
OK
> hgetall customer_1234
1) "id"
2) "1234"
3) "items_in_cart"
4) "2"
5) "firstname"
6) "Thijs"
7) "lastname"
8) "Feryn"

$ redis-cli
> lpush products_for_customer_1234 5
(integer) 1
> lpush products_for_customer_1234 345
(integer) 2
> lpush products_for_customer_1234 78 12 345
(integer) 5
> llen products_for_customer_1234
(integer) 5
> lindex products_for_customer_1234 1
"12"
> lindex products_for_customer_1234 2
"78"
> rpop products_for_customer_1234
"5"
> rpop products_for_customer_1234
"345"
> rpop products_for_customer_1234
"78"
> rpop products_for_customer_1234
"12"
> rpop products_for_customer_1234
"345"
> rpop products_for_customer_1234
(nil)
Redis
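The list session above shows why LPUSH combined with RPOP behaves like a first-in, first-out queue: new values enter at the head and the oldest value leaves from the tail. The following is only an in-memory model of those semantics, not a Redis client; `createList` and its methods are illustrative names.

```javascript
// In-memory model of a Redis list (head at index 0, tail at the end).
function createList() {
  const items = [];
  return {
    // Push values one by one onto the head; returns the new length.
    lpush(...values) {
      for (const value of values) items.unshift(String(value));
      return items.length;
    },
    // Remove and return the tail, or null when the list is empty.
    rpop() {
      return items.length > 0 ? items.pop() : null;
    },
    // Read by position from the head (non-negative indexes only in this sketch).
    lindex(index) {
      return index >= 0 && index < items.length ? items[index] : null;
    },
    llen() {
      return items.length;
    },
  };
}
```

Replaying the commands from the session against this model yields the same lengths and the same pop order, ending with the first value pushed coming out first.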

daemonize yes
pidfile /var/run/
port 6379
databases 16
maxmemory 1gb
maxmemory-policy volatile-lru
save 900 1
save 300 10
save 60 10000
dbfilename dump.rdb
dir ./
slaveof 6379
appendonly yes
appendfilename "appendonly.aof"
Redis server config

Varnish


Where does it fit in?

✓ Database/API cache ✓ PHP session storage ✓ Message queue (lists) ✓ NoSQL database ✓ Real-time data retrieval Where does Redis fit in?

✓ Stock quantities ✓ Variable pricing information ✓ Shopping cart ✓ User profile information Where does Redis fit in?

Basically: Real-time & volatile data

BUT: MySQL can still remain “source of truth” database
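One common way to keep MySQL as the source of truth while Redis absorbs the reads is the cache-aside pattern: look in the cache first, fall back to the database on a miss, then store the result. In this sketch a `Map` stands in for Redis and a plain object stands in for a MySQL table; the key format and function names are assumptions.

```javascript
const cache = new Map();             // stands in for Redis
const stockTable = { 1: 5, 2: 0 };   // stands in for a MySQL table (hypothetical data)
let databaseReads = 0;

// The slow path: pretend this is a SQL query.
function loadStockFromDatabase(productId) {
  databaseReads += 1;
  const quantity = stockTable[productId];
  return quantity === undefined ? null : quantity;
}

// Cache-aside read: only hit the database when the cache has no entry.
function getStock(productId) {
  const key = 'product:' + productId + ':stock';
  if (cache.has(key)) {
    return cache.get(key);
  }
  const quantity = loadStockFromDatabase(productId);
  cache.set(key, quantity);
  return quantity;
}
```

A real setup would also need an expiry or an explicit invalidation when the database row changes, which is where the worker scripts and message queues later in the deck come in.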

✓Full-text search engine ✓Analytics engine ✓NoSQL database ✓Lucene based ✓Built-in clustering, replication, sharding ✓RESTful interface ✓Schemaless ElasticSearch

{ "name" : "Hijacker", "cluster_name" : "elasticsearch", "version" : { "number" : "2.1.0", "build_hash" : "72cd1f1a3eee09505e036106146dc1949dc5dc87", "build_timestamp" : "2015-11-18T22:40:03Z", "build_snapshot" : false, "lucene_version" : "5.3.1" }, "tagline" : "You Know, for Search" } http://localhost:9200

POST /my-index {"acknowledged":true} POST /my-index/my-type { "key" : "value", "date" : "2015-05-10", "counter" : 1, "tags" : ["tag1","tag2","tag3"] } { "_index": "my-index", "_type": "my-type", "_id": "AU089olr9oI99a_rK9fi", "_version": 1, "created": true } Confirmation

GET /my-index/my-type/AU089olr9oI99a_rK9fi?pretty { "_index": "my-index", "_type": "my-type", "_id": "AU089olr9oI99a_rK9fi", "_version": 1, "found": true, "_source": { "key": "value", "date": "2015-05-10", "counter": 1, "tags": [ "tag1", "tag2", "tag3" ] } } Retrieve document by id Document & meta data

GET /my-index/_mapping?pretty { "my-index": { "mappings": { "my-type": { "properties": { "counter": { "type": "long" }, "date": { "type": "date", "format": "dateOptionalTime" }, "key": { "type": "string" }, "tags": { "type": "string" } } } } } } Schemaless? Not really … “Guesses” mapping on insert

POST /products { "mappings": { "product" : { "_id" : { "path" : "entity_id" }, "properties" : { "entity_id" : {"type" : "integer"}, "name" : { "type" : "string", "index" : "not_analyzed", "fields" : { "raw" : { "type" : "string", "analyzer": "english" } } }, "description" : { "type" : "string", "index" : "not_analyzed", "fields" : { "raw" : { "type" : "string", "analyzer": "english" } } }, "price" : {"type" : "double"}, "sku" : {"type" : "string", "index" : "not_analyzed"}, "created_at" : {"type" : "date", "format" : "YYYY-MM-dd HH:mm:ss"}, "updated_at" : {"type" : "date", "format" : "YYYY-MM-dd HH:mm:ss"} , "category" : { "type" : "string", "index" : "not_analyzed" } } } } } Explicit mapping at index creation time

Analyzed vs non-analyzed. Full-text vs exact value. Filter vs query.

ElasticSearch


POST /products/product/_search?pretty { "query": { "match": { "name.raw": "Linen Blazer" } } } Matches 2 products

POST /products/product/_search?pretty { "query": { "filtered": { "query": { "match_all": {} }, "filter": { "term": { "name": "Linen Blazer" } } } } } Matches 1 product
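The different hit counts come from how the field was indexed. An analyzed field is split into tokens, so a match query finds any document sharing a token; a non-analyzed field is stored as one exact value, so a term filter only finds identical strings. The sketch below imitates that with a toy analyzer (lowercase and split on non-word characters, far simpler than the real `english` analyzer) and hypothetical product names that are not taken from the demo data.

```javascript
// Hypothetical product names for illustration only.
const productNames = ['Linen Blazer', 'Stretch Cotton Blazer', 'Convertible Dress'];

// Toy analyzer: lowercase and split into word tokens.
function analyze(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// "match" on an analyzed field: any shared token counts as a hit.
function matchQuery(names, queryText) {
  const queryTokens = analyze(queryText);
  return names.filter((name) => {
    const tokens = analyze(name);
    return queryTokens.some((token) => tokens.includes(token));
  });
}

// "term" on a non-analyzed field: the whole value must be identical.
function termFilter(names, value) {
  return names.filter((name) => name === value);
}
```

With these names, the match version returns both blazers for "Linen Blazer", while the term version returns only the exact product.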

POST /products/product/_search?pretty { "query": { "filtered": { "filter": { "bool": { "must": [ { "range": { "price": { "gte": 100, "lte": 400 } } } ], "must_not": [ { "term": { "name": "Convertible Dress" } } ], "should": [ { "term": { "category": "Women" } }, { "term": { "category": "New Arrivals" } } ] } } } } }

ElasticSearch


Group by on steroids

POST /products/product/_search?pretty { "fields": ["category","price","name"], "query": { "match": { "name.raw": "blazer" } }, "aggs": { "avg_price": { "avg": { "field": "price" } }, "min_price" : { "min": { "field": "price" } }, "max_price" : { "max": { "field": "price" } }, "number_of_products_per_category" : { "terms": { "field": "category", "size": 10 } } } } Multi-group by & query

"aggregations": { "min_price": { "value": 455 }, "number_of_products_per_category": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "Blazers", "doc_count": 2 }, { "key": "Default Category", "doc_count": 2 }, { "key": "Men", "doc_count": 2 } ] }, "max_price": { "value": 490 }, "avg_price": { "value": 472.5 } } Aggregation output

ElasticSearch
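The "group by on steroids" idea is that one request returns several independent summaries of the same result set. As a rough illustration, the function below computes the same four aggregations over an in-memory array. The two documents are hypothetical; they were chosen so the numbers line up with the aggregation output shown earlier.

```javascript
// Hypothetical documents; each product belongs to several categories.
const products = [
  { name: 'Blazer A', price: 455, category: ['Blazers', 'Default Category', 'Men'] },
  { name: 'Blazer B', price: 490, category: ['Blazers', 'Default Category', 'Men'] },
];

// Compute min, max, avg and a terms-style bucket count in one pass over the hits.
function aggregate(docs) {
  const prices = docs.map((doc) => doc.price);
  const buckets = new Map();
  for (const doc of docs) {
    for (const category of doc.category) {
      buckets.set(category, (buckets.get(category) || 0) + 1);
    }
  }
  return {
    min_price: Math.min(...prices),
    max_price: Math.max(...prices),
    avg_price: prices.reduce((sum, price) => sum + price, 0) / prices.length,
    number_of_products_per_category: [...buckets].map(
      ([key, doc_count]) => ({ key, doc_count })
    ),
  };
}
```

In SQL the same output would take one query per grouping; here the metrics and the category buckets come back together.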


Single node 2 node cluster 3 node cluster

Where does it fit in?

✓ Full-text search engine with drill-down search ✓ NoSQL database ✓ Big data analytics tool using Kibana Where does ElasticSearch fit in?

✓ All product information ✓ All categories & attributes ✓ Log archive ✓ Both NoSQL DB & search engine Where does ElasticSearch fit in?

PHP consumes lots of CPU, memory, and I/O

Depends on what you do with it

✓ Image resizing ✓ Logging/reporting ✓ PDF generation ✓ Check-out on super busy event sites ✓ …

Offload the webserver

Worker scripts

✓ Uses PHP-CLI ✓ Runs continuously ✓ Process forking ✓ Pthreads ✓ Run worker scripts in parallel ✓ Managed by supervisord Worker scripts

✓ Sync MySQL & Redis ✓ Resize images ✓ Async logging & metrics ✓ Update quantities & prices Worker scripts

How does your app communicate with the workers?

Message queues

✓ Pub/sub ✓ Speaks the AMQP protocol ✓ Supported by Pivotal ✓ Channels/Exchanges/Queues ✓ Built-in clustering ✓ Reliable messaging RabbitMQ

RabbitMQ


videlalvaro/php-amqplib

channel(); $channel->queue_declare('hello', false, false, false, false); $msg = new AMQPMessage('Hello World!'); $channel->basic_publish($msg, '', 'hello'); echo " [x] Sent 'Hello World!'\n"; $channel->close(); $connection->close(); Send to queue

body, "\n"; }; $connection = new AMQPConnection('', 5672, 'guest', 'guest'); $channel = $connection->channel(); echo ' [*] Waiting for messages. To exit press CTRL+C', "\n"; $channel->basic_consume('hello', '', false, true, false, false, $callback); while(count($channel->callbacks)) { $channel->wait(); } $channel->close(); $connection->close(); Receive from queue

Where does it fit in?

✓ Take load away from user process ✓ Free up resources on frontend servers ✓ Elaborate messaging strategies ✓ Async event-based actions Where do RabbitMQ/workers fit in?

✓ Synchronize stock and price changes between Redis & MySQL ✓ Synchronize product changes between ElasticSearch & MySQL ✓ Stock/price/sales notifications ✓ Async checkout for busy event sites Where do RabbitMQ/workers fit in?
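The division of labour between the web request and the worker can be sketched without a broker. Below, an in-memory array stands in for a RabbitMQ queue: the request handler only publishes a message and returns, and the worker applies the change later. All names (`createQueue`, `handleStockUpdate`, `createSyncWorker`) are hypothetical, and a real queue would add persistence, acknowledgements and routing.

```javascript
// In-memory stand-in for a message queue.
function createQueue() {
  const messages = [];
  return {
    publish(message) {
      messages.push(JSON.stringify(message)); // messages travel as serialized text
    },
    // Hand every waiting message to the worker, oldest first.
    drain(worker) {
      let handled = 0;
      while (messages.length > 0) {
        worker(JSON.parse(messages.shift()));
        handled += 1;
      }
      return handled;
    },
  };
}

// Web request: record the intent and return immediately.
function handleStockUpdate(queue, productId, stock) {
  queue.publish({ id: productId, stock: stock });
  return 'accepted';
}

// Worker: the slow part (here, pretending to sync MySQL) happens out of band.
function createSyncWorker(database) {
  return function (message) {
    database[message.id] = message.stock;
  };
}
```

The user-facing process never waits for the database write; if the worker is slow or down, messages simply accumulate until it catches up.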

PHP consumes lots of CPU, memory, and I/O

But with most of these tools we still go through the PHP runtime …

Other backend tools to handle stateful data

SOA Microservices AJAX Websockets

✓ Javascript runtime ✓ Async ✓ Event-driven ✓ Non-blocking I/O ✓ Callbacks ✓ Lightweight ✓ NPM packages ✓ Backend-code in Javascript NodeJS

const http = require('http'); const hostname = ''; const port = 1337; http.createServer((req, res) => { res.writeHead(200, { 'Content-Type': 'text/plain' }); res.end('Hello World\n'); }).listen(port, hostname, () => { console.log(`Server running at http://${hostname}:${port}/`); });

ExpressJS framework for Node

var express = require('express');
var app = express();

app.get('/', function (req, res) {
  res.send('Hello World!');
});

app.post('/', function (req, res) {
  res.send('Got a POST request');
});

app.put('/user', function (req, res) {
  res.send('Got a PUT request at /user');
});

app.delete('/user', function (req, res) {
  res.send('Got a DELETE request at /user');
});

var server = app.listen(3000, function () {
  var host = server.address().address;
  var port = server.address().port;
  console.log('Example app listening at http://%s:%s', host, port);
});

Gateway to ElasticSearch, RabbitMQ & Redis

var elasticsearch = require('elasticsearch');
var express = require('express');
var bodyParser = require('body-parser');
var redis = require("redis");
var amqp = require('amqplib/callback_api');

var elasticsearchClient = new elasticsearch.Client({
  host: 'localhost:9200',
  log: 'error'
});

var redisClient = redis.createClient();
redisClient.on("error", function (err) {
  console.log("Error " + err);
});

var app = express();
app.use(bodyParser.urlencoded({ extended: false }));
app.use(bodyParser.json());
Initialize

app.get('/products', function (req, res) {{
 index: 'thedemo',
 type: 'product',
 body: {
 query: {
 match_all: {}
 }).then(function (resp) {
 }, function (err) {
 }); app.get('/products/:id([0-9]+)/stock', function (req, res) {
 redisClient.get(':stock', function(err, reply) {
 }); Get products from ES Get stock from Redis

app.put('/products/:id([0-9]+)/stock', function (req, res) {
 var stock = req.body.stock;
 var action = req.body.action;
 if(action == 'increment') {
 redisClient.incrby(':stock',stock, function(err, reply) {
 amqp.connect('amqp://localhost', function(err, conn) {
 conn.createChannel(function(err, ch) {
 ch.assertExchange('stock', 'direct', {durable: false});
 ch.publish('stock', 'info', new Buffer(JSON.stringify({id:, stock: parseInt(reply)})));
 res.json('Stock for product '' is now '+parseInt(reply));
 } else {
 redisClient.decrby(':stock',stock, function(err, reply) {
 amqp.connect('amqp://localhost', function(err, conn) {
 conn.createChannel(function(err, ch) {
 ch.assertExchange('stock', 'direct', {durable: false});
 ch.publish('stock', 'info', new Buffer(JSON.stringify({id:, stock: parseInt(reply)})));
 res.json('Stock for product '' is now '+parseInt(reply));
 }); Update stock in Redis Send message to queue

➜ ~ node node.js Example app listening at http://:::3000 Running node.js

➜ ~ curl -XPUT localhost:3000/products/1/stock -d "stock=2&action=increment"
"Stock for product 1 is now 7"
API call

➜ ~ php stock.php info
[*] Waiting for logs. To exit press CTRL+C
[x] info:{"id":"1","stock":7}
PHP queue worker

✓ Compiled language for the web ✓ Invented by Google ✓ Strictly typed ✓ Feels like your average interpreted language ✓ Async features ✓ Built for systems programming ✓ REALLY fast ✓ Not object oriented Go(lang)

Where does it fit in?

✓ Really fast workers ✓ Really fast APIs ✓ Could replace PHP workers ✓ Could replace NodeJS

Go(lang)


package main import ( "fmt" "log" "" ) func failOnError(err error, msg string) { if err != nil { log.Fatalf("%s: %s", msg, err) panic(fmt.Sprintf("%s: %s", msg, err)) } } Initialize

func main() {
	conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
	failOnError(err, "Failed to connect to RabbitMQ")
	defer conn.Close()

	ch, err := conn.Channel()
	failOnError(err, "Failed to open a channel")
	defer ch.Close()

	err = ch.ExchangeDeclare(
		"stock",  // name
		"direct", // type
		false,    // durable
		false,    // auto-deleted
		false,    // internal
		false,    // no-wait
		nil,      // arguments
	)
	failOnError(err, "Failed to declare an exchange")

	q, err := ch.QueueDeclare(
		"",    // name
		false, // durable
		false, // delete when unused
		true,  // exclusive
		false, // no-wait
Runs from main function

conn, err := amqp.Dial("amqp://guest:guest@localhost:5672/")
failOnError(err, "Failed to connect to RabbitMQ")
defer conn.Close()

ch, err := conn.Channel()
failOnError(err, "Failed to open a channel")
defer ch.Close()

err = ch.ExchangeDeclare(
	"stock",  // name
	"direct", // type
	false,    // durable
	false,    // auto-deleted
	false,    // internal
	false,    // no-wait
	nil,      // arguments
)
failOnError(err, "Failed to declare an exchange")
Initialize connection
Initialize exchange

q, err := ch.QueueDeclare(
	"",    // name
	false, // durable
	false, // delete when unused
	true,  // exclusive
	false, // no-wait
	nil,   // arguments
)
failOnError(err, "Failed to declare a queue")

err = ch.QueueBind(
	q.Name,  // queue name
	"info",  // routing key
	"stock", // exchange
	false,
	nil)
failOnError(err, "Failed to bind a queue")
Declare queue
Bind to queue

msgs, err := ch.Consume(
	q.Name, // queue
	"",     // consumer
	true,   // auto-ack
	false,  // exclusive
	false,  // no-local
	false,  // no-wait
	nil,    // args
)
failOnError(err, "Failed to register a consumer")

forever := make(chan bool)

go func() {
	for d := range msgs {
		log.Printf(" [x] %s", d.Body)
	}
}()

log.Printf(" [*] Waiting for messages. To exit press CTRL+C")
<-forever
}
Consume messages
Async processing

✓ go get ✓ go run worker.go ✓ go install worker.go ✓ env GOOS=linux GOARCH=amd64 go build worker.go Useful Go commands

The end game

✓ Cache pages (Varnish) ✓ Assemble content via ESI or AJAX ✓ Static assets on Nginx or CDN ✓ Business logic in lightweight API calls (NodeJS, Go) ✓ Key-value stores for volatile & real-time data in Redis ✓ ElasticSearch as a NoSQL database ✓ RabbitMQ for async communication ✓ Worker processes read from message queue End game

Use the right tool for the job

Use the tools you like

I need feedback