Slide 1

Slide 1 text

C.R.E.A.M: CACHE (not CASH) Rules Everything Around Me. Thijs Feryn

Slide 2

Slide 2 text

Hi, I’m Thijs

Slide 3

Slide 3 text

I’m @ThijsFeryn on Twitter

Slide 4

Slide 4 text

I’m an Evangelist At

Slide 5

Slide 5 text

I’m a board member at

Slide 6

Slide 6 text

Slow websites suck

Slide 7

Slide 7 text

Web performance is an essential part of the user experience

Slide 8

Slide 8 text

Infrastructure

Slide 9

Slide 9 text

Code

Slide 10

Slide 10 text

Slow database

Slide 11

Slide 11 text

Browser rendering

Slide 12

Slide 12 text

User location

Slide 13

Slide 13 text

Slowdown ~ downtime

Slide 14

Slide 14 text

Code efficiently

Slide 15

Slide 15 text

Identify slowest parts

Slide 16

Slide 16 text

Optimize database

Slide 17

Slide 17 text

Optimize runtime

Slide 18

Slide 18 text

After a while you hit the limits

Slide 19

Slide 19 text

Optimize → Avoid database. Optimize → Avoid runtime.

Slide 20

Slide 20 text

Don’t recompute if the data hasn’t changed

Slide 21

Slide 21 text

Cache

Slide 22

Slide 22 text

3 x 2 = ?
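The "don't recompute if the data hasn't changed" idea can be illustrated with a memoized function. This is only a minimal Python sketch, not something from the talk: `multiply` and the `calls` counter are hypothetical names, and the "expensive" work is just the 3 x 2 from the slide.

```python
from functools import lru_cache

calls = 0  # counts how often the "expensive" computation really runs


@lru_cache(maxsize=None)
def multiply(a: int, b: int) -> int:
    """Pretend this is expensive; the result depends only on the inputs."""
    global calls
    calls += 1
    return a * b


first = multiply(3, 2)   # computed
second = multiply(3, 2)  # served from the cache, nothing is recomputed
```

The second call returns the stored result without running the function body again, which is the whole point of a cache.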

Slide 23

Slide 23 text

What can you cache?

Slide 24

Slide 24 text

What can you cache? Byte code, database output, external services, files from disk, pages.

Slide 25

Slide 25 text

Caching is not a compensation for poor code

Slide 26

Slide 26 text

Caching is an essential architectural strategy

Slide 27

Slide 27 text

The goal

Slide 28

Slide 28 text

Performance != Scalability

Slide 29

Slide 29 text

Performance: speed

Slide 30

Slide 30 text

Scalability: constant speed with increasing load

Slide 31

Slide 31 text

Caching toolkit

Slide 32

Slide 32 text

Caching toolkit: ✓ Varnish ✓ Redis ✓ Shared memory ✓ ElasticSearch *

Slide 33

Slide 33 text

Quick overview

Slide 34

Slide 34 text

Varnish

Slide 35

Slide 35 text

Normally User Server

Slide 36

Slide 36 text

With Varnish User Varnish Server

Slide 37

Slide 37 text

No content

Slide 38

Slide 38 text

Stores HTTP output in memory

Slide 39

Slide 39 text

Respects cache-control headers
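As a rough illustration of what respecting cache-control headers means, the sketch below derives a time to live from a `Cache-Control` value. It is a simplified Python approximation, not Varnish's actual implementation: the function name is hypothetical, the 120 second fallback is an assumption, and the `no-cache|no-store|private` check mirrors the built-in VCL shown later in the deck.

```python
import re


def ttl_from_cache_control(header: str, default_ttl: int = 120) -> int:
    """Return a TTL in seconds, or 0 when the response should not be cached."""
    value = header.lower()
    if re.search(r"no-cache|no-store|private", value):
        return 0
    # s-maxage targets shared caches, so it is checked before max-age.
    for directive in ("s-maxage", "max-age"):
        match = re.search(r"(?:^|[,\s])" + directive + r"\s*=\s*(\d+)", value)
        if match:
            return int(match.group(1))
    # No usable directive: fall back to the assumed default TTL.
    return default_ttl
```

For example, `ttl_from_cache_control("public, s-maxage=60, max-age=300")` picks the shared-cache value, while any `private` response is treated as uncacheable.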

Slide 40

Slide 40 text

Varnish Configuration Language

Slide 41

Slide 41 text

sub vcl_recv {
    if (req.method == "PRI") {
        /* We do not support SPDY or HTTP/2.0 */
        return (synth(405));
    }
    if (req.method != "GET" &&
        req.method != "HEAD" &&
        req.method != "PUT" &&
        req.method != "POST" &&
        req.method != "TRACE" &&
        req.method != "OPTIONS" &&
        req.method != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }
    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (hash);
}

Slide 42

Slide 42 text

sub vcl_pipe {
    # By default Connection: close is set on all piped requests, to stop
    # connection reuse from sending future requests directly to the
    # (potentially) wrong backend. If you do want this to happen, you can undo
    # it here.
    # unset bereq.http.connection;
    return (pipe);
}

sub vcl_pass {
    return (fetch);
}

sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }
    return (lookup);
}

Slide 43

Slide 43 text

sub vcl_purge {
    return (synth(200, "Purged"));
}

sub vcl_hit {
    if (obj.ttl >= 0s) {
        // A pure unadulterated hit, deliver it
        return (deliver);
    }
    if (obj.ttl + obj.grace > 0s) {
        // Object is in grace, deliver it
        // Automatically triggers a background fetch
        return (deliver);
    }
    // fetch & deliver once we get the result
    return (miss);
}

sub vcl_miss {
    return (fetch);
}

sub vcl_deliver {
    return (deliver);
}

Slide 44

Slide 44 text

sub vcl_backend_fetch {
    return (fetch);
}

sub vcl_backend_response {
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Surrogate-control ~ "no-store" ||
        (!beresp.http.Surrogate-Control &&
            beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
        beresp.http.Vary == "*") {
        /*
         * Mark as "Hit-For-Pass" for the next 2 minutes
         */
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
    }
    return (deliver);
}

Slide 45

Slide 45 text

No content

Slide 46

Slide 46 text

Varnish features: ✓ Caching ✓ Proxying ✓ Load balancing ✓ Edge Side Includes ✓ Streaming ✓ Compression ✓ Invalidation ✓ VMODs ✓ Logging tools

Slide 47

Slide 47 text

HTTP accelerator

Slide 48

Slide 48 text

No content

Slide 49

Slide 49 text

Redis: ✓ Key-value store ✓ Fast ✓ Lightweight ✓ Data stored in RAM ✓ ~Memcached ✓ Data types ✓ Data persistence ✓ Replication ✓ Clustering

Slide 50

Slide 50 text

Redis

$ redis-cli
127.0.0.1:6379> ping
PONG
127.0.0.1:6379> set mykey somevalue
OK
127.0.0.1:6379> get mykey
"somevalue"

Slide 51

Slide 51 text

Redis data types: ✓ Strings ✓ Hashes ✓ Lists ✓ Sets ✓ Sorted sets ✓ Geo ✓ …

Slide 52

Slide 52 text

Redis

$ redis-cli
127.0.0.1:6379> hset customer_1234 id 1234
(integer) 1
127.0.0.1:6379> hset customer_1234 items_in_cart 2
(integer) 1
127.0.0.1:6379> hmset customer_1234 firstname Thijs lastname Feryn
OK
127.0.0.1:6379> hgetall customer_1234
1) "id"
2) "1234"
3) "items_in_cart"
4) "2"
5) "firstname"
6) "Thijs"
7) "lastname"
8) "Feryn"

Slide 53

Slide 53 text

Redis

$ redis-cli
127.0.0.1:6379> lpush products_for_customer_1234 5
(integer) 1
127.0.0.1:6379> lpush products_for_customer_1234 345
(integer) 2
127.0.0.1:6379> lpush products_for_customer_1234 78 12 345
(integer) 5
127.0.0.1:6379> llen products_for_customer_1234
(integer) 5
127.0.0.1:6379> lindex products_for_customer_1234 1
"12"
127.0.0.1:6379> lindex products_for_customer_1234 2
"78"
127.0.0.1:6379> rpop products_for_customer_1234
"5"
127.0.0.1:6379> rpop products_for_customer_1234
"345"
127.0.0.1:6379> rpop products_for_customer_1234
"78"
127.0.0.1:6379> rpop products_for_customer_1234
"12"
127.0.0.1:6379> rpop products_for_customer_1234
"345"
127.0.0.1:6379> rpop products_for_customer_1234
(nil)

Slide 54

Slide 54 text

Shared memory

Slide 55

Slide 55 text

No content

Slide 56

Slide 56 text

I’m a PHP guy

Slide 57

Slide 57 text

APCu

Slide 58

Slide 58 text

APCu

… 15]);
    apc_store('password', $hash, 15);
    $hit = 'miss';
}
$end = microtime();
$time = abs(round($start - $end, 2));
echo "[$time] -> ($hit) $hash\n";

Slide 59

Slide 59 text

OPCache

Slide 60

Slide 60 text

No content

Slide 61

Slide 61 text

Not really a cache

Slide 62

Slide 62 text

ElasticSearch: ✓ Full-text search engine ✓ Analytics engine ✓ NoSQL database ✓ Lucene based ✓ Built-in clustering, replication, sharding ✓ RESTful interface ✓ JSON output ✓ Schemaless

Slide 63

Slide 63 text

Fast retrieval Fast search All REST

Slide 64

Slide 64 text

POST /blog/post/6160
{
    "language": "en-US",
    "title": "WordPress 4.4 is available! And these are the new features…",
    "date": "Tue, 15 Dec 2015 13:28:23 +0000",
    "author": "Romy",
    "category": [
        "News",
        "PHP",
        "Sector news",
        "Webdesign & development",
        "CMS",
        "content management system",
        "wordpress",
        "WordPress 4.4"
    ],
    "guid": "6160"
}

Slide 65

Slide 65 text

Retrieve document by id:

GET /blog/post/6160

Document & meta data:

{
    "_index": "blog",
    "_type": "post",
    "_id": "6160",
    "_version": 1,
    "found": true,
    "_source": {
        "language": "en-US",
        "title": "WordPress 4.4 is available! And these are the new features…",
        "date": "Tue, 15 Dec 2015 13:28:23 +0000",
        "author": "Romy",
        "category": [
            "News",
            "PHP",
            "Sector news",
            "Webdesign & development",
            "CMS",
            "content management system",
            "wordpress",
            "WordPress 4.4"
        ],
        "guid": "6160"
    }
}

Slide 66

Slide 66 text

POST /blog/post/_search
{
    "fields": ["title"],
    "query": {
        "match": {
            "title": "working"
        }
    }
}

Slide 67

Slide 67 text

What can we cache? (reminder) Byte code, database output, external services, files from disk, pages.

Slide 68

Slide 68 text

Byte code Byte code Database output External services Files from disk Pages

Slide 69

Slide 69 text

OPCache

Slide 70

Slide 70 text

Byte code caching
1. Read file from disk
2. Tokenize
3. Compile into bytecode
4. Execute

Slide 71

Slide 71 text

Byte code caching
Skipped: read file from disk, tokenize, compile into bytecode
1. Read bytecode from shared memory
2. Execute
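The same shortcut can be sketched in Python, whose `compile()` built-in also turns source text into a code object. This is only an analogy to illustrate the steps above, not how OPcache works internally: the cache here is a plain dictionary keyed by a hash of the source (an assumption made for brevity), and `run_cached`, `compile_count` and `SCRIPT` are hypothetical names.

```python
import hashlib

_bytecode_cache = {}  # source hash -> compiled code object
compile_count = 0     # how many times the compile step really ran


def run_cached(source: str, filename: str = "<script>") -> dict:
    """Execute source, compiling it only when no cached code object exists."""
    global compile_count
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    code = _bytecode_cache.get(key)
    if code is None:
        code = compile(source, filename, "exec")  # tokenize + compile
        compile_count += 1
        _bytecode_cache[key] = code
    scope = {}
    exec(code, scope)  # the execute step always happens
    return scope


SCRIPT = "answer = 3 * 2"
first = run_cached(SCRIPT)   # compiles, then executes
second = run_cached(SCRIPT)  # reuses the cached code object
```

Both runs produce the same result, but the tokenize and compile work is only paid for once.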

Slide 72

Slide 72 text

Files from disk Byte code Database output External services Files from disk Pages

Slide 73

Slide 73 text

A RAMDisk could solve that problem

Slide 74

Slide 74 text

Or put (some of) that data in Redis

Slide 75

Slide 75 text

Or maybe even ElasticSearch

Slide 76

Slide 76 text

External services Byte code Database output External services Files from disk Pages

Slide 77

Slide 77 text

Potential slow down

Slide 78

Slide 78 text

https://openexchangerates.org/api/latest.json?app_id=123

{
    "disclaimer": "Exchange rates provided for informational purposes only and do not constitute financial advice of any kind. Although every attempt is made to ensure quality, no guarantees are made of accuracy, validity, availability, or fitness for any purpose. All usage subject to acceptance of Terms: https://openexchangerates.org/terms/",
    "license": "Data sourced from various providers; resale prohibited; no warranties given of any kind. All usage subject to License Agreement: https://openexchangerates.org/license/",
    "timestamp": 1463137208,
    "base": "USD",
    "rates": {
        "AED": 3.67297,
        "AFN": 68.589998,
        "ALL": 121.4755,
        "AMD": 479.452503,
        "ANG": 1.78875,
        "AOA": 165.784832,
        "ARS": 14.19349,
        "AUD": 1.372985,
        "AWG": 1.793333,
        …

Slide 79

Slide 79 text

Caching external services

…
// Try the cache first: all rates live in one Redis hash
$rates = $predis->hgetall('rates');
if (count($rates) == 0) {
    // Cache miss: call the external service
    $client = new GuzzleHttp\Client();
    $response = $client->get('https://openexchangerates.org/api/latest.json?app_id=123');
    $data = json_decode($response->getBody()->getContents());
    $reflect = new ReflectionObject($data->rates);
    foreach ($reflect->getProperties(ReflectionProperty::IS_PUBLIC) as $property) {
        $rates[$property->getName()] = $property->getValue($data->rates);
    }
    // Store the result for 15 seconds
    $predis->hmset('rates', $rates);
    $predis->expire('rates', 15);
}
echo $rates['EUR'].PHP_EOL;

Slide 81

Slide 81 text

Flexibility

$rates = $predis->hgetall('rates');
$rates = $predis->hget('rates', 'EUR');
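The pattern used for the exchange rates (look in the cache, call the service only on a miss, store the result with a short expiry) can be sketched without Redis. The Python below is a minimal in-memory illustration under stated assumptions: `TTLCache`, `get_rates` and `fake_fetch` are hypothetical names, the EUR value is made up, and the clock is injectable so the 15 second expiry can be shown without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory stand-in for a key-value store with expiry."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[float, dict]] = {}

    def get(self, key: str) -> Optional[dict]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: behave like a miss
            return None
        return value

    def set(self, key: str, value: dict, ttl: float) -> None:
        self._items[key] = (self._clock() + ttl, value)


def get_rates(cache: TTLCache, fetch_rates: Callable[[], dict], ttl: float = 15.0) -> dict:
    """Cache-aside: only call the slow service when the cache has nothing."""
    rates = cache.get("rates")
    if rates is None:
        rates = fetch_rates()
        cache.set("rates", rates, ttl)
    return rates


# Demo with a fake clock and a fake service (the rate is a made-up value).
now = [0.0]
fetches = []


def fake_fetch() -> dict:
    fetches.append(1)
    return {"EUR": 0.88}


cache = TTLCache(clock=lambda: now[0])
a = get_rates(cache, fake_fetch)  # miss: calls the service
b = get_rates(cache, fake_fetch)  # hit: no call
now[0] = 16.0                     # 16 seconds later the entry has expired
c = get_rates(cache, fake_fetch)  # miss again
```

Three lookups result in only two calls to the service; with real traffic the ratio is far better.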

Slide 82

Slide 82 text

Database output Byte code Database output External services Files from disk Pages

Slide 83

Slide 83 text

Potential slow down

Slide 84

Slide 84 text

Caching database output

…
// The set 'products' holds the SKUs; each SKU has its own hash
$productSkus = $predis->smembers('products');
$products = $predis->pipeline(function ($pipe) use ($productSkus) {
    foreach ($productSkus as $sku) {
        $pipe->hgetall($sku);
    }
});
if (count($productSkus) == 0) {
    // Cache miss: read from MySQL and fill the cache for 15 seconds
    $db = new PDO('mysql:host=localhost;dbname=sample', 'root', '');
    $statement = $db->query('SELECT sku,name,short_description,price FROM catalog_product_flat_1');
    $products = $statement->fetchAll(PDO::FETCH_ASSOC);
    $productSkus = [];
    foreach ($products as $row) {
        $productSkus[] = $row['sku'];
        $predis->hmset($row['sku'], $row);
        $predis->sadd('products', $row['sku']);
        $predis->expire('products', 15);
        $predis->expire($row['sku'], 15);
    }
}
foreach ($products as $product) {
    echo $product['sku'] . ' ' . $product['name'] . PHP_EOL;
}

Slide 86

Slide 86 text

Let’s try this with ElasticSearch

Slide 87

Slide 87 text

…
$client = Elasticsearch\ClientBuilder::create()->build();
$params = [
    'index' => 'products',
    'type' => 'product',
    'body' => [
        'size' => 10000,
        'query' => [
            'match_all' => []
        ]
    ]
];
$response = $client->search($params);
if (!isset($response['hits']['hits']) || count($response['hits']['hits']) == 0) {
    // Nothing indexed yet: read from MySQL and index every row
    $db = new PDO('mysql:host=localhost;dbname=sample', 'root', '');
    $statement = $db->query('SELECT sku,name,short_description,price FROM products');
    $products = $statement->fetchAll(PDO::FETCH_ASSOC);
    foreach ($products as $row) {
        $client->index([
            'index' => 'products',
            'type' => 'product',
            'id' => $row['sku'],
            'body' => $row
        ]);
    }
} else {
    $products = array_map(function ($doc) {
        return $doc['_source'];
    }, $response['hits']['hits']);
}
foreach ($products as $product) {
    echo $product['sku'] . ' ' . $product['name'] . PHP_EOL;
}

Slide 89

Slide 89 text

Pages Byte code Database output External services Files from disk Pages

Slide 90

Slide 90 text

all the way

Slide 91

Slide 91 text

Easy peasy right?

Slide 92

Slide 92 text

Not really

Slide 93

Slide 93 text

There are rules

Slide 94

Slide 94 text

When does Varnish cache? Some rules … ✓ Only GET & HEAD ✓ No authorization headers ✓ No cookies ✓ No set-cookies ✓ Valid cache-control/expires headers
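These rules can be written down as a single predicate. The Python sketch below is a simplified approximation of the default behaviour described on the slide, not Varnish's own code: `is_cacheable` is a hypothetical helper, header lookups are made case-insensitive by hand, and a response without any `Cache-Control` header is assumed to be cacheable with a default TTL.

```python
from typing import Dict


def is_cacheable(method: str,
                 request_headers: Dict[str, str],
                 response_headers: Dict[str, str]) -> bool:
    """Apply the out-of-the-box caching rules from the slide."""
    req = {name.lower(): value for name, value in request_headers.items()}
    resp = {name.lower(): value for name, value in response_headers.items()}

    if method not in ("GET", "HEAD"):
        return False  # only GET & HEAD
    if "authorization" in req or "cookie" in req:
        return False  # request carries state
    if "set-cookie" in resp:
        return False  # response tries to set state
    cache_control = resp.get("cache-control", "").lower()
    if any(token in cache_control for token in ("no-cache", "no-store", "private")):
        return False  # backend forbids shared caching
    return True
```

A plain `GET` with `Cache-Control: public, max-age=60` passes every check, while the same request with a session cookie does not.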

Slide 95

Slide 95 text

It’s all about state

Slide 96

Slide 96 text

Lots of developers don’t follow those rules

Slide 97

Slide 97 text

Cookies everywhere

Slide 98

Slide 98 text

Cache-control quoi?

Slide 99

Slide 99 text

These are the main reasons why Varnish will not work out of the box

Slide 100

Slide 100 text

Write VCL

Slide 101

Slide 101 text

Minimal VCL

vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

Slide 102

Slide 102 text

Normalize

Slide 103

Slide 103 text

Normalize

vcl 4.0;

import std;

sub vcl_recv {
    set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");
    set req.url = std.querysort(req.url);
    if (req.url ~ "\#") {
        set req.url = regsub(req.url, "\#.*$", "");
    }
    if (req.url ~ "\?$") {
        set req.url = regsub(req.url, "\?$", "");
    }
    if (req.restarts == 0) {
        if (req.http.Accept-Encoding) {
            if (req.http.User-Agent ~ "MSIE 6") {
                unset req.http.Accept-Encoding;
            } elsif (req.http.Accept-Encoding ~ "gzip") {
                set req.http.Accept-Encoding = "gzip";
            } elsif (req.http.Accept-Encoding ~ "deflate") {
                set req.http.Accept-Encoding = "deflate";
            } else {
                unset req.http.Accept-Encoding;
            }
        }
    }
}

Slide 104

Slide 104 text

Static assets

Slide 105

Slide 105 text

Cache static assets

vcl 4.0;

sub vcl_recv {
    if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|otf|ogg|ogm|opus|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
        unset req.http.Cookie;
        return (hash);
    }
}

sub vcl_backend_response {
    if (bereq.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|otf|ogg|ogm|opus|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
        unset beresp.http.set-cookie;
    }
    if (bereq.url ~ "^[^?]*\.(7z|avi|bz2|flac|flv|gz|mka|mkv|mov|mp3|mp4|mpeg|mpg|ogg|ogm|opus|rar|tar|tgz|tbz|txz|wav|webm|xz|zip)(\?.*)?$") {
        unset beresp.http.set-cookie;
        set beresp.do_stream = true;
        set beresp.do_gzip = false;
    }
}

Slide 106

Slide 106 text

Do you really want to cache static assets?

Slide 107

Slide 107 text

Nginx or Apache can be fast enough for that

Slide 108

Slide 108 text

Memory consumption vs Speed improvement

Slide 109

Slide 109 text

Don’t cache static assets

vcl 4.0;

import std;

sub vcl_recv {
    if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|otf|ogg|ogm|opus|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
        unset req.http.Cookie;
        return (pass);
    }
}

Slide 110

Slide 110 text

URL whitelist/blacklist

Slide 111

Slide 111 text

URL blacklist

sub vcl_recv {
    if (req.url ~ "^/status\.php$" ||
        req.url ~ "^/update\.php$" ||
        req.url ~ "^/admin$" ||
        req.url ~ "^/admin/.*$" ||
        req.url ~ "^/user$" ||
        req.url ~ "^/user/.*$" ||
        req.url ~ "^/flag/.*$" ||
        req.url ~ "^.*/ajax/.*$" ||
        req.url ~ "^.*/ahah/.*$") {
        return (pass);
    }
}

Slide 112

Slide 112 text

URL whitelist

sub vcl_recv {
    if (req.url ~ "^/products/?") {
        return (hash);
    }
}

Slide 113

Slide 113 text

Those damn cookies again!

Slide 114

Slide 114 text

Remove tracking cookies

vcl 4.0;

sub vcl_recv {
    set req.http.Cookie = regsuball(req.http.Cookie, "has_js=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "__utm.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "_ga=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "_gat=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmctr=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmcmd.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmccn.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "__gads=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "__qc.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "__atuv.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "^;\s*", "");
    if (req.http.cookie ~ "^\s*$") {
        unset req.http.cookie;
    }
}

Slide 115

Slide 115 text

Only keep session cookie

vcl 4.0;

sub vcl_recv {
    if (req.http.Cookie) {
        set req.http.Cookie = ";" + req.http.Cookie;
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        set req.http.Cookie = regsuball(req.http.Cookie, ";(PHPSESSID)=", "; \1=");
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
        if (req.http.cookie ~ "^\s*$") {
            unset req.http.cookie;
        }
    }
}
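The regular expressions above implement a whitelist: every cookie is dropped except the ones named explicitly. The same idea is easier to read outside VCL. The Python below is an illustrative sketch only; `keep_cookies` is a hypothetical helper and the cookie values are made up.

```python
from typing import Set


def keep_cookies(cookie_header: str, allowed: Set[str]) -> str:
    """Return a Cookie header that contains only the whitelisted cookies."""
    kept = []
    for part in cookie_header.split(";"):
        name, separator, value = part.strip().partition("=")
        if separator and name in allowed:
            kept.append(f"{name}={value}")
    return "; ".join(kept)


cleaned = keep_cookies("_ga=GA1.2.3; PHPSESSID=abc123; has_js=1", {"PHPSESSID"})
```

An empty result corresponds to the `unset req.http.cookie` branch: with no cookies left, the request becomes cacheable again.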

Slide 116

Slide 116 text

Language cookie cache variation

sub vcl_recv {
    if (req.http.Cookie) {
        set req.http.Cookie = ";" + req.http.Cookie;
        set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
        set req.http.Cookie = regsuball(req.http.Cookie, ";(language)=", "; \1=");
        set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
        set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
        if (req.http.cookie ~ "^\s*$") {
            unset req.http.cookie;
            return (pass);
        }
        return (hash);
    }
}

sub vcl_hash {
    hash_data(regsub(req.http.Cookie, "^.*language=([^;]*);*.*$", "\1"));
}
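Adding the language to the hash means each language gets its own cached copy of a URL, while unrelated cookies do not fragment the cache. A minimal Python sketch of such a cache key follows; `cache_key` is a hypothetical function, and SHA-256 is used only for illustration, not because Varnish's internal hashing is exposed this way.

```python
import hashlib


def cache_key(url: str, host: str, cookie_header: str = "") -> str:
    """Build a cache key from URL, host and the value of the language cookie."""
    language = ""
    for part in cookie_header.split(";"):
        name, _, value = part.strip().partition("=")
        if name == "language":
            language = value
    digest = hashlib.sha256()
    for piece in (url, host, language):
        digest.update(piece.encode("utf-8"))
        digest.update(b"\0")  # separator so pieces cannot run together
    return digest.hexdigest()


key_nl = cache_key("/products", "example.com", "language=nl")
key_en = cache_key("/products", "example.com", "language=en")
key_en_tracked = cache_key("/products", "example.com", "_ga=1; language=en")
```

The Dutch and English keys differ, while the tracking cookie in the last call has no effect on the key.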

Slide 117

Slide 117 text

Alternative language cache variation

Slide 118

Slide 118 text

Language cookie cache variation

sub vcl_hash {
    hash_data(req.http.Accept-Language);
}

Or just send a “Vary: Accept-Language” header

Slide 119

Slide 119 text

Hash all cookies

sub vcl_hash {
    hash_data(req.http.Cookie);
}

Slide 120

Slide 120 text

Edge Side Includes

Slide 121

Slide 121 text

header.php menu.php main.php footer.php TTL 5s No caching TTL 10s TTL 2s

Slide 122

Slide 122 text

Edge Side Includes

sub vcl_recv {
    set req.http.Surrogate-Capability = "key=ESI/1.0";
}

sub vcl_backend_response {
    if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
        unset beresp.http.Surrogate-Control;
        set beresp.do_esi = true;
    }
}

Slide 123

Slide 123 text

…'; '.PHP_EOL;
echo "Date in the main page: ".date('Y-m-d H:i:s').'…';

Main page: cached for 10 seconds. ESI frame (esi.php): not cached.

Slide 124

Slide 124 text

ESI vs AJAX

Slide 125

Slide 125 text

Control Time To Live

Slide 126

Slide 126 text

Control Time To Live

sub vcl_backend_response {
    set beresp.ttl = 3h;
}

Slide 127

Slide 127 text

Control Time To Live

sub vcl_backend_response {
    if (beresp.ttl <= 0s || beresp.http.Set-Cookie || beresp.http.Vary == "*") {
        set beresp.ttl = 120s;
        set beresp.uncacheable = true;
        return (deliver);
    }
}

Slide 128

Slide 128 text

Debugging

Slide 129

Slide 129 text

Debugging

sub vcl_deliver {
    if (obj.hits > 0) {
        set resp.http.X-Cache = "HIT";
    } else {
        set resp.http.X-Cache = "MISS";
    }
}

Slide 130

Slide 130 text

Purging

Slide 131

Slide 131 text

Purging

acl purge {
    "localhost";
    "127.0.0.1";
    "::1";
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return (synth(405, "Not allowed."));
        }
        return (purge);
    }
}

Slide 132

Slide 132 text

Banning

acl purge {
    "localhost";
    "127.0.0.1";
    "::1";
}

sub vcl_backend_response {
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;
}

sub vcl_deliver {
    unset resp.http.x-url;
    unset resp.http.x-host;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (!client.ip ~ purge) {
            return (synth(405, "Not allowed"));
        }
        if (req.http.x-purge-regex) {
            ban("obj.http.x-host == " + req.http.host + " && obj.http.x-url ~ " + req.http.x-purge-regex);
        } else {
            ban("obj.http.x-host == " + req.http.host + " && obj.http.x-url == " + req.url);
        }
        return (synth(200, "Purged"));
    }
}

Slide 133

Slide 133 text

Banning

curl -XPURGE -H "x-purge-regex:/products" "http://example.com"
curl -XPURGE "http://example.com/products"
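The effect of a regex ban can be illustrated with a toy cache: every stored object remembers its host and URL, and a ban invalidates all objects of a host whose URL matches a pattern. The Python sketch below is a deliberate simplification: it evicts matching objects immediately, whereas Varnish keeps a ban list and checks objects against it later. `PageCache` and the example URLs are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple


class PageCache:
    """Toy page cache keyed by (host, url)."""

    def __init__(self) -> None:
        self._objects: Dict[Tuple[str, str], str] = {}

    def store(self, host: str, url: str, body: str) -> None:
        self._objects[(host, url)] = body

    def lookup(self, host: str, url: str) -> Optional[str]:
        return self._objects.get((host, url))

    def ban(self, host: str, url_pattern: str) -> int:
        """Invalidate every object of this host whose URL matches the pattern."""
        regex = re.compile(url_pattern)
        doomed = [key for key in self._objects
                  if key[0] == host and regex.search(key[1])]
        for key in doomed:
            del self._objects[key]
        return len(doomed)


cache = PageCache()
cache.store("example.com", "/products/1", "product 1")
cache.store("example.com", "/products/2", "product 2")
cache.store("example.com", "/about", "about us")
cache.store("other.com", "/products/1", "other product")
removed = cache.ban("example.com", "/products")
```

One call removes both product pages of `example.com` and leaves the rest of the cache untouched, which is what the `x-purge-regex` request above asks for.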

Slide 134

Slide 134 text

Grace mode

Slide 135

Slide 135 text

Grace mode

sub vcl_backend_response {
    set beresp.grace = 6h;
}
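Grace only matters once an object's TTL has run out. The decision taken in the built-in `vcl_hit` on slide 43 can be restated as a small Python function; this is a sketch of that logic only, with a hypothetical function name and return values, and it leaves out the background fetch that Varnish triggers when it serves a stale object.

```python
def hit_decision(ttl_remaining: float, grace: float) -> str:
    """Mirror the three branches of the built-in vcl_hit."""
    if ttl_remaining >= 0:
        return "deliver"        # still fresh
    if ttl_remaining + grace > 0:
        return "deliver-stale"  # expired, but within the grace period
    return "miss"               # too old: fetch from the backend


SIX_HOURS = 6 * 3600
fresh = hit_decision(10, SIX_HOURS)
in_grace = hit_decision(-60, SIX_HOURS)          # expired one minute ago
too_old = hit_decision(-7 * 3600, SIX_HOURS)     # expired seven hours ago
```

With a grace of 6h, an object that expired a minute ago is still served while fresh content is fetched, so visitors do not wait for a slow backend.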

Slide 136

Slide 136 text

Let’s talk more about

Slide 137

Slide 137 text

GET /blog/post/6160 { "_index": "blog", "_type": "post", "_id": "6160", "_version": 1, "found": true, "_source": { "language": "en-US", "title": "WordPress 4.4 is available! And these are the new features…", "date": "Tue, 15 Dec 2015 13:28:23 +0000", "author": "Romy", "category": [ "News", "PHP", "Sector news", "Webdesign & development", "CMS", "content management system", "wordpress", "WordPress 4.4" ], "guid": "6160" } } Remember this one?

Slide 138

Slide 138 text

Schemaless? Not really … “Guesses” mapping on insert

GET /blog/_mapping
{
    "blog": {
        "mappings": {
            "post": {
                "properties": {
                    "author": { "type": "string" },
                    "category": { "type": "string" },
                    "date": { "type": "string" },
                    "guid": { "type": "string" },
                    "language": { "type": "string" },
                    "title": { "type": "string" }
                }
            }
        }
    }
}

Slide 139

Slide 139 text

Explicit mapping

Slide 140

Slide 140 text

Explicit mapping at index creation time

POST /blog
{
    "mappings": {
        "post": {
            "properties": {
                "title": { "type": "string" },
                "date": { "type": "date", "format": "E, dd MMM YYYY HH:mm:ss Z" },
                "author": { "type": "string" },
                "category": { "type": "string" },
                "guid": { "type": "integer" }
            }
        }
    }
}

Slide 141

Slide 141 text

Alternative mapping

POST /blog
{
    "mappings": {
        "post": {
            "properties": {
                "author": { "type": "string", "index": "not_analyzed" },
                "category": { "type": "string", "index": "not_analyzed" },
                "date": { "type": "date", "format": "E, dd MMM YYYY HH:mm:ss Z" },
                "guid": { "type": "integer" },
                "language": { "type": "string", "index": "not_analyzed" },
                "title": {
                    "type": "string",
                    "fields": {
                        "en": { "type": "string", "analyzer": "english" },
                        "nl": { "type": "string", "analyzer": "dutch" },
                        "raw": { "type": "string", "index": "not_analyzed" }
                    }
                }
            }
        }
    }
}

Slide 142

Slide 142 text

What’s with the analyzers? (Same mapping as on slide 141.)

Slide 143

Slide 143 text

Analyzed vs non-analyzed

Slide 144

Slide 144 text

Full-text vs exact value

Slide 145

Slide 145 text

By default strings are analyzed … unless you mention it in the mapping

Slide 146

Slide 146 text

Analyzer
• Character filters: replace characters for analyzed text
• Tokenizers: break text down into terms
• Token filters: add/modify/delete tokens

Slide 147

Slide 147 text

Built-in analyzers: • Standard • Simple • Whitespace • Stop • Keyword • Pattern • Language • Snowball • Custom
Standard tokenizer, lowercase token filter, English stop word token filter

Slide 148

Slide 148 text

Input: Hey man, how are you doing?
Standard: hey man how are you doing
Whitespace: Hey man, how are you doing?
English: hei man how you do
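The difference between the whitespace and standard analyzers can be approximated in a few lines of Python. This is a rough sketch, not Elasticsearch's implementation: the real standard tokenizer follows Unicode text segmentation rules, whereas a simple `\w+` regular expression is assumed here, and the English analyzer's stemming and stop word removal are left out. Both function names are hypothetical.

```python
import re
from typing import List


def whitespace_analyzer(text: str) -> List[str]:
    """Split on whitespace only; case and punctuation are kept."""
    return text.split()


def standard_analyzer(text: str) -> List[str]:
    """Approximation: split on non-word characters and lowercase each token."""
    return [token.lower() for token in re.findall(r"\w+", text)]


SENTENCE = "Hey man, how are you doing?"
whitespace_tokens = whitespace_analyzer(SENTENCE)
standard_tokens = standard_analyzer(SENTENCE)
```

The whitespace tokens still contain `man,` and `doing?`, so a search for `doing` would not match them, while the standard tokens would.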

Slide 149

Slide 149 text

POST /blog/post/_search
{
    "fields": ["title"],
    "query": {
        "match": {
            "title": "working"
        }
    }
}

Slide 150

Slide 150 text

{ "took": 1, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 1, "max_score": 1.7562683, "hits": [ { "_index": "blog", "_type": "post", "_id": "2742", "_score": 1.7562683, "fields": { "title": [ "Hosted SharePoint 2010: working efficiently as a team" ] } } ] } }

Slide 151

Slide 151 text

POST /blog/post/_search
{
    "fields": ["title"],
    "query": {
        "match": {
            "title.en": "working"
        }
    }
}

Slide 152

Slide 152 text

{ "took": 1, "timed_out": false, "_shards": { "total": 5, "successful": 5, "failed": 0 }, "hits": { "total": 6, "max_score": 2.4509864, "hits": [ { "_index": "blog", "_type": "post", "_id": "828", "_score": 2.4509864, "fields": { "title": [ "Still a lot of work in store" ] } }, { "_index": "blog", "_type": "post", "_id": "3873", "_score": 2.144613, "fields": { "title": [ "SSL: what is it and how does it work?" ] } }, { "_index": "blog", "_type": "post", "_id": "5586", "_score": 2.1184452, "fields": { "title": [ "WebAssembly: several world players work on a faster Internet" ]

Slide 153

Slide 153 text

Search

Slide 154

Slide 154 text

POST /blog/post/_count
{
    "query": {
        "match": {
            "title": "PROXY protocol support in Varnish"
        }
    }
}
Result: 162 posts

POST /blog/post/_count
{
    "query": {
        "filtered": {
            "filter": {
                "term": {
                    "title.raw": "PROXY protocol support in Varnish"
                }
            }
        }
    }
}
Result: 1 post
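Why does the match query count so many more posts than the term filter? A match query analyzes the search text and, by default, accepts any document that shares at least one term, whereas the term filter on the non-analyzed `title.raw` field requires the exact value. The Python sketch below imitates that difference on a handful of titles. It ignores scoring entirely, uses a simplified analyzer, and the second title is a made-up example; all function names are hypothetical.

```python
import re
from typing import List, Set


def analyze(text: str) -> Set[str]:
    """Simplified analyzer: lowercase word tokens."""
    return {token.lower() for token in re.findall(r"\w+", text)}


def match_query(titles: List[str], text: str) -> List[str]:
    """Full-text: any shared term is enough to match."""
    wanted = analyze(text)
    return [title for title in titles if wanted & analyze(title)]


def term_filter(titles: List[str], text: str) -> List[str]:
    """Exact value: the whole, unanalyzed title must be identical."""
    return [title for title in titles if title == text]


TITLES = [
    "PROXY protocol support in Varnish",
    "Varnish in production",                  # made-up title
    "Still a lot of work in store",
    "SSL: what is it and how does it work?",
]
SEARCH = "PROXY protocol support in Varnish"
matched = match_query(TITLES, SEARCH)
exact = term_filter(TITLES, SEARCH)
```

Even the title that only shares the word "in" counts as a match, which is how a five-word search can hit 162 posts while the exact value hits one.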

Slide 155

Slide 155 text

Filter vs Query

Slide 156

Slide 156 text

Match Query, Multi Match Query, Bool Query, Boosting Query, Common Terms Query, Constant Score Query, Dis Max Query, Filtered Query, Fuzzy Like This Query, Fuzzy Like This Field Query, Function Score Query, Fuzzy Query, GeoShape Query, Has Child Query, Has Parent Query, Ids Query, Indices Query, Match All Query, More Like This Query, Nested Query, Prefix Query, Query String Query, Simple Query String Query, Range Query, Regexp Query, Span First Query, Span Multi Term Query, Span Near Query, Span Not Query, Span Or Query, Span Term Query, Term Query, Terms Query, Top Children Query, Wildcard Query, Minimum Should Match, Multi Term Query Rewrite, Template Query

Slide 157

Slide 157 text

And Filter, Bool Filter, Exists Filter, Geo Bounding Box Filter, Geo Distance Filter, Geo Distance Range Filter, Geo Polygon Filter, GeoShape Filter, Geohash Cell Filter, Has Child Filter, Has Parent Filter, Ids Filter, Indices Filter, Limit Filter, Match All Filter, Missing Filter, Nested Filter, Not Filter, Or Filter, Prefix Filter, Query Filter, Range Filter, Regexp Filter, Script Filter, Term Filter, Terms Filter, Type Filter

Slide 158

Slide 158 text

Aggregations

Slide 159

Slide 159 text

Group by on steroids

Slide 160

Slide 160 text

SELECT author, COUNT(guid) FROM blog.post GROUP BY author
Metric: COUNT(guid). Bucket: GROUP BY author.

Slide 161

Slide 161 text

Only aggs, no docs

POST /blog/post/_search?pretty&search_type=count
{
    "aggs": {
        "popular_bloggers": {
            "terms": {
                "field": "author"
            }
        }
    }
}

Slide 162

Slide 162 text

"aggregations": { "popular_bloggers": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "Romy", "doc_count": 415 }, { "key": "Combell", "doc_count": 184 }, { "key": "Tom", "doc_count": 184 }, { "key": "Jimmy Cappaert", "doc_count": 157 }, { "key": "Christophe", "doc_count": 23 } ] } } Aggregation output

Slide 163

Slide 163 text

Nested multi-group by alongside query

POST /blog/_search
{
    "query": {
        "match": {
            "title": "varnish"
        }
    },
    "aggs": {
        "popular_bloggers": {
            "terms": {
                "field": "author",
                "size": 10
            },
            "aggs": {
                "used_languages": {
                    "terms": {
                        "field": "language",
                        "size": 10
                    }
                }
            }
        }
    }
}
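Conceptually, a terms aggregation puts documents into buckets per field value and counts them, and a nested aggregation repeats that inside every bucket. The Python sketch below shows that idea on a small hand-made sample. The seven posts are hypothetical data chosen to resemble the bucket counts on the next slide, and `terms_agg` and `nested_terms_agg` are illustrative helpers, not part of any Elasticsearch client.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

# Hypothetical sample: 4 posts by Romy (3 en-US, 1 nl-NL), 3 by Combell (nl-NL).
POSTS = (
    [{"author": "Romy", "language": "en-US"}] * 3
    + [{"author": "Romy", "language": "nl-NL"}]
    + [{"author": "Combell", "language": "nl-NL"}] * 3
)


def terms_agg(docs: List[dict], field: str) -> List[Tuple[str, int]]:
    """Bucket documents by field value and count them, largest bucket first."""
    return Counter(doc[field] for doc in docs).most_common()


def nested_terms_agg(docs: List[dict], outer: str, inner: str) -> Dict[str, List[Tuple[str, int]]]:
    """Run a second terms aggregation inside every outer bucket."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[outer]].append(doc)
    return {key: terms_agg(members, inner) for key, members in buckets.items()}


popular_bloggers = terms_agg(POSTS, "author")
used_languages = nested_terms_agg(POSTS, "author", "language")
```

The outer result plays the role of `popular_bloggers` and the per-author lists play the role of `used_languages` in the request above.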

Slide 164

Slide 164 text

"aggregations": { "popular_bloggers": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "Romy", "doc_count": 4, "used_languages": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "en-US", "doc_count": 3 }, { "key": "nl-NL", "doc_count": 1 } ] } }, { "key": "Combell", "doc_count": 3, "used_languages": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": "nl-NL", "doc_count": 3 } ] } }, Aggregation output

Slide 165

Slide 165 text

Min Aggregation, Max Aggregation, Sum Aggregation, Avg Aggregation, Stats Aggregation, Extended Stats Aggregation, Value Count Aggregation, Percentiles Aggregation, Percentile Ranks Aggregation, Cardinality Aggregation, Geo Bounds Aggregation, Top hits Aggregation, Scripted Metric Aggregation, Global Aggregation, Filter Aggregation, Filters Aggregation, Missing Aggregation, Nested Aggregation, Reverse nested Aggregation, Children Aggregation, Terms Aggregation, Significant Terms Aggregation, Range Aggregation, Date Range Aggregation, IPv4 Range Aggregation, Histogram Aggregation, Date Histogram Aggregation, Geo Distance Aggregation, GeoHash grid Aggregation

Slide 166

Slide 166 text

Where does all of this fit in?

Slide 167

Slide 167 text

Where does Varnish fit in? ✓ Cache all images, js, css, woff, … ✓ Cache dynamic pages ✓ ESI or AJAX for user-specific content ✓ Sanitize HTTP input/output ✓ Gateway to your application

Slide 168

Slide 168 text

Where does ElasticSearch fit in? ✓ Secondary database (NoSQL) ✓ RDBMS can remain the source of truth ✓ Store in fixed format (no joins) ✓ Full-text search ✓ Fast retrieval of data projections ✓ Aggregations

Slide 169

Slide 169 text

Where does Redis fit in? ✓ Real-time information ✓ Key-value gets, not searches ✓ Volatile data ✓ When data changes a lot ✓ RDBMS is still the source of truth

Slide 170

Slide 170 text

No content

Slide 171

Slide 171 text

https://blog.feryn.eu
https://talks.feryn.eu
https://youtube.com/thijsferyn
https://soundcloud.com/thijsferyn
https://twitter.com/thijsferyn
http://itunes.feryn.eu