C.R.E.A.M. - Cache Rules Everything Around Me - infoShare 2016

C.R.E.A.M. - Cache Rules Everything Around Me - infoShare 2016 https://infoshare.pl

Thijs Feryn

May 19, 2016

Transcript

  1. C.R.E.A.M
    CASH RULES EVERYTHING
    AROUND ME
    CACHE
    Thijs Feryn

    View Slide

  2. Hi, I’m Thijs

    View Slide

  3. I’m
    @ThijsFeryn
    on Twitter

    View Slide

  4. I’m an
    Evangelist
    At

    View Slide

  5. I’m a
    board member
    at

    View Slide

  6. Slow websites suck

    View Slide

  7. Web performance is
    an essential part of
    the user experience

    View Slide

  8. Infrastructure

    View Slide

  9. Code

    View Slide

  10. Slow database

    View Slide

  11. Browser
    rendering

    View Slide

  12. User location

    View Slide

  13. Down
    Slowdown ~ downtime

    View Slide

  14. Code efficiently

    View Slide

  15. Identify slowest parts

    View Slide

  16. Optimize database

    View Slide

  17. Optimize runtime

    View Slide

  18. After a while you
    hit the limits

    View Slide

  19. Optimize database
    Optimize runtime
    Avoid
    Avoid

    View Slide

  20. Don’t
    recompute
    if the data
    hasn’t
    changed

    View Slide

  21. Cache

    View Slide

  22. 3 x 2 = ?

    View Slide
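
The "don't recompute" idea from the last few slides can be sketched in a few lines of PHP. This is only an illustration, not code from the talk: `slowMultiply` and `cachedMultiply` are hypothetical names, and the call counter exists purely to show that the second lookup never reaches the computation.

```php
<?php
// Hypothetical expensive computation; $calls counts how often it really runs.
function slowMultiply(int $a, int $b, int &$calls): int
{
    $calls++;
    return $a * $b;
}

// Memoizing wrapper: compute once per distinct input, then serve from the array.
function cachedMultiply(int $a, int $b, array &$cache, int &$calls): int
{
    $key = "$a x $b";
    if (!array_key_exists($key, $cache)) {
        $cache[$key] = slowMultiply($a, $b, $calls);
    }
    return $cache[$key];
}

$cache = [];
$calls = 0;
$first = cachedMultiply(3, 2, $cache, $calls);  // computed
$second = cachedMultiply(3, 2, $cache, $calls); // served from the cache
echo "$first, $second, computed $calls time(s)" . PHP_EOL;
```

The same shape returns in every later example: look up by key, compute on a miss, store the result.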

  23. What
    can
    you
    cache?

    View Slide

  24. What can you cache?
    Byte code
    Database output
    External services
    Files from disk
    Pages

    View Slide

  25. Caching is not a
    compensation for
    poor code

    View Slide

  26. Caching is an
    essential
    architectural
    strategy

    View Slide

  27. The goal

    View Slide

  28. Performance
    !=
    Scalability

    View Slide

  29. Performance: speed

    View Slide

  30. Scalability: constant speed with
    increasing load

    View Slide

  31. Caching toolkit

    View Slide

  32. ✓Varnish
    ✓Redis
    ✓Shared memory
    ✓ElasticSearch *
    Caching toolkit

    View Slide

  33. Quick overview

    View Slide

  34. Varnish

    View Slide

  35. Normally
    User Server

    View Slide

  36. With Varnish
    User Varnish Server

    View Slide

  37. View Slide

  38. Stores
    HTTP output
    in memory

    View Slide

  39. Respects
    cache-control
    headers

    View Slide

  40. Varnish
    Configuration
    Language

    View Slide

  41. sub vcl_recv {
    if (req.method == "PRI") {
    /* We do not support SPDY or HTTP/2.0 */
    return (synth(405));
    }
    if (req.method != "GET" &&
    req.method != "HEAD" &&
    req.method != "PUT" &&
    req.method != "POST" &&
    req.method != "TRACE" &&
    req.method != "OPTIONS" &&
    req.method != "DELETE") {
    /* Non-RFC2616 or CONNECT which is weird. */
    return (pipe);
    }
    if (req.method != "GET" && req.method != "HEAD") {
    /* We only deal with GET and HEAD by default */
    return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
    /* Not cacheable by default */
    return (pass);
    }
    return (hash);
    }

    View Slide

  42. sub vcl_pipe {
    # By default Connection: close is set on all piped requests, to stop
    # connection reuse from sending future requests directly to the
    # (potentially) wrong backend. If you do want this to happen, you can undo
    # it here.
    # unset bereq.http.connection;
    return (pipe);
    }
    sub vcl_pass {
    return (fetch);
    }
    sub vcl_hash {
    hash_data(req.url);
    if (req.http.host) {
    hash_data(req.http.host);
    } else {
    hash_data(server.ip);
    }
    return (lookup);
    }

    View Slide
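
To make the effect of the built-in `vcl_hash` concrete, here is a rough PHP model of how the cache key is derived. It is a sketch under assumptions: `cacheKey` is a hypothetical helper, and the byte-level details of Varnish's own digest are not reproduced, only the choice of inputs (URL, then Host, or the server IP when there is no Host header).

```php
<?php
// Rough model of the built-in vcl_hash: URL first, then Host, or the
// server IP when the request carries no Host header.
function cacheKey(string $url, ?string $host, string $serverIp): string
{
    $ctx = hash_init('sha256');
    hash_update($ctx, $url);
    hash_update($ctx, ($host !== null && $host !== '') ? $host : $serverIp);
    return hash_final($ctx);
}

$a = cacheKey('/products', 'example.com', '10.0.0.1');
$b = cacheKey('/products', 'example.com', '10.0.0.2');      // same key as $a
$c = cacheKey('/products', 'shop.example.com', '10.0.0.1'); // different host
$d = cacheKey('/products', null, '10.0.0.1');               // falls back to IP
```

Anything that is not fed into the key (cookies, Accept-Language, and so on) does not create a separate cache entry unless the VCL or a `Vary` header adds it.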

  43. sub vcl_purge {
    return (synth(200, "Purged"));
    }
    sub vcl_hit {
    if (obj.ttl >= 0s) {
    // A pure unadulterated hit, deliver it
    return (deliver);
    }
    if (obj.ttl + obj.grace > 0s) {
    // Object is in grace, deliver it
    // Automatically triggers a background fetch
    return (deliver);
    }
    // fetch & deliver once we get the result
    return (miss);
    }
    sub vcl_miss {
    return (fetch);
    }
    sub vcl_deliver {
    return (deliver);
    }

    View Slide
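
The three branches of `vcl_hit` above can be read as a small decision function. The PHP below is an illustrative sketch, with `onHit` as a hypothetical name; TTL and grace are expressed as remaining seconds, and the TTL goes negative once the object has expired.

```php
<?php
// Mirrors the three branches of the built-in vcl_hit shown above.
// $ttl and $grace are remaining seconds; $ttl goes negative once expired.
function onHit(float $ttl, float $grace): string
{
    if ($ttl >= 0) {
        return 'deliver';                  // fresh object
    }
    if ($ttl + $grace > 0) {
        return 'deliver+background-fetch'; // stale, but within grace
    }
    return 'miss';                         // too old: fetch first
}

echo onHit(30, 10) . PHP_EOL;
echo onHit(-5, 10) . PHP_EOL;
echo onHit(-15, 10) . PHP_EOL;
```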

  44. sub vcl_backend_fetch {
    return (fetch);
    }
    sub vcl_backend_response {
    if (beresp.ttl <= 0s ||
    beresp.http.Set-Cookie ||
    beresp.http.Surrogate-control ~ "no-store" ||
    (!beresp.http.Surrogate-Control &&
    beresp.http.Cache-Control ~ "no-cache|no-store|private") ||
    beresp.http.Vary == "*") {
    /*
    * Mark as "Hit-For-Pass" for the next 2 minutes
    */
    set beresp.ttl = 120s;
    set beresp.uncacheable = true;
    }
    return (deliver);
    }

    View Slide

  45. View Slide

  46. ✓Caching
    ✓Proxying
    ✓Loadbalancing
    ✓Edge Side Includes
    ✓Streaming
    ✓Compression
    ✓Invalidation
    ✓VMODS
    ✓Logging tools
    Varnish features

    View Slide

  47. HTTP
    accelerator

    View Slide

  48. View Slide

  49. ✓ Key-value store
    ✓ Fast
    ✓ Lightweight
    ✓ Data stored in RAM
    ✓ ~Memcached
    ✓ Data types
    ✓ Data persistence
    ✓ Replication
    ✓ Clustering
    Redis

    View Slide

  50. Redis
    $ redis-cli
    127.0.0.1:6379> ping
    PONG
    127.0.0.1:6379> set mykey somevalue
    OK
    127.0.0.1:6379> get mykey
    "somevalue"

    View Slide

  51. ✓ Strings
    ✓ Hashes
    ✓ Lists
    ✓ Sets
    ✓ Sorted sets
    ✓ Geo
    ✓ …
    Redis data types

    View Slide

  52. $ redis-cli
    127.0.0.1:6379> hset customer_1234 id 1234
    (integer) 1
    127.0.0.1:6379> hset customer_1234 items_in_cart 2
    (integer) 1
    127.0.0.1:6379> hmset customer_1234 firstname Thijs lastname Feryn
    OK
    127.0.0.1:6379> hgetall customer_1234
    1) "id"
    2) "1234"
    3) "items_in_cart"
    4) "2"
    5) "firstname"
    6) "Thijs"
    7) "lastname"
    8) "Feryn"
    127.0.0.1:6379>
    Redis

    View Slide

  53. $ redis-cli
    127.0.0.1:6379> lpush products_for_customer_1234 5
    (integer) 1
    127.0.0.1:6379> lpush products_for_customer_1234 345
    (integer) 2
    127.0.0.1:6379> lpush products_for_customer_1234 78 12 345
    (integer) 5
    127.0.0.1:6379> llen products_for_customer_1234
    (integer) 5
    127.0.0.1:6379> lindex products_for_customer_1234 1
    "12"
    127.0.0.1:6379> lindex products_for_customer_1234 2
    "78"
    127.0.0.1:6379> rpop products_for_customer_1234
    "5"
    127.0.0.1:6379> rpop products_for_customer_1234
    "345"
    127.0.0.1:6379> rpop products_for_customer_1234
    "78"
    127.0.0.1:6379> rpop products_for_customer_1234
    "12"
    127.0.0.1:6379> rpop products_for_customer_1234
    "345"
    127.0.0.1:6379> rpop products_for_customer_1234
    (nil)
    127.0.0.1:6379>
    Redis

    View Slide

  54. Shared
    memory

    View Slide

  55. View Slide

  56. I’m a PHP guy

    View Slide

  57. APCu

    View Slide

  58. $start = microtime(true);
    $hash = apcu_fetch('password', $success);
    $hit = 'hit';
    if (!$success) {
        // Deliberately slow: bcrypt with cost 15
        $hash = password_hash('azerty1!', PASSWORD_BCRYPT, ['cost' => 15]);
        apcu_store('password', $hash, 15); // keep for 15 seconds
        $hit = 'miss';
    }
    $end = microtime(true);
    $time = round($end - $start, 2);
    echo "[$time] -> ($hit) $hash\n";
    APCu

    View Slide

  59. OPCache

    View Slide

  60. View Slide

  61. Not really a
    cache

    View Slide

  62. ✓Full-text search engine
    ✓Analytics engine
    ✓NoSQL database
    ✓Lucene based
    ✓Built-in clustering, replication,
    sharding
    ✓RESTful interface
    ✓JSON output
    ✓Schemaless
    ElasticSearch

    View Slide

  63. Fast retrieval
    Fast search
    All REST

    View Slide

  64. POST /blog/post/6160
    {
    "language": "en-US",
    "title": "WordPress 4.4 is available! And these are the new features…",
    "date": "Tue, 15 Dec 2015 13:28:23 +0000",
    "author": "Romy",
    "category": [
    "News",
    "PHP",
    "Sector news",
    "Webdesign & development",
    "CMS",
    "content management system",
    "wordpress",
    "WordPress 4.4"
    ],
    "guid": "6160"
    }

    View Slide

  65. GET /blog/post/6160
    {
    "_index": "blog",
    "_type": "post",
    "_id": "6160",
    "_version": 1,
    "found": true,
    "_source": {
    "language": "en-US",
    "title": "WordPress 4.4 is available! And these are the new features…",
    "date": "Tue, 15 Dec 2015 13:28:23 +0000",
    "author": "Romy",
    "category": [
    "News",
    "PHP",
    "Sector news",
    "Webdesign & development",
    "CMS",
    "content management system",
    "wordpress",
    "WordPress 4.4"
    ],
    "guid": "6160"
    }
    }
    Retrieve document by id
    Document & metadata

    View Slide

  66. POST /blog/post/_search
    {
    "fields": ["title"],
    "query": {
    "match": {
    "title": "working"
    }
    }
    }

    View Slide

  67. What can we cache
    (reminder)
    Byte code
    Database output
    External services
    Files from disk
    Pages

    View Slide

  68. Byte code
    Byte code
    Database output
    External services
    Files from disk
    Pages

    View Slide

  69. OPCache

    View Slide

  70. 1.Read file from disk
    2.Tokenize
    3.Compile into bytecode
    4.Execute
    Byte code caching

    View Slide

  71. Read file from disk
    Tokenize
    Compile into bytecode
    1.Read bytecode from shared memory
    2.Execute
    Byte code caching

    View Slide

  72. Files from disk
    Byte code
    Database output
    External services
    Files from disk
    Pages

    View Slide

  73. A RAMDisk
    could solve
    that problem

    View Slide

  74. Or put
    (some of)
    that data in
    Redis

    View Slide

  75. Or maybe
    even
    ElasticSearch

    View Slide

  76. External services
    Byte code
    Database output
    External services
    Files from disk
    Pages

    View Slide

  77. Potential slow down

    View Slide

  78. {
    "disclaimer": "Exchange rates provided for informational purposes only and do not constitute financial advice of any kind. Although every attempt is made to ensure quality, no guarantees are made of accuracy, validity, availability, or fitness for any purpose. All usage subject to acceptance of Terms: https://openexchangerates.org/terms/",
    "license": "Data sourced from various providers; resale prohibited; no warranties given of any kind. All usage subject to License Agreement: https://openexchangerates.org/license/",
    "timestamp": 1463137208,
    "base": "USD",
    "rates": {
    "AED": 3.67297,
    "AFN": 68.589998,
    "ALL": 121.4755,
    "AMD": 479.452503,
    "ANG": 1.78875,
    "AOA": 165.784832,
    "ARS": 14.19349,
    "AUD": 1.372985,
    "AWG": 1.793333,
    https://openexchangerates.org/api/latest.json?app_id=123

    View Slide

  79. require 'vendor/autoload.php';
    $predis = new Predis\Client();
    // Try the cache first: all rates live in one Redis hash
    $rates = $predis->hgetall('rates');
    if (count($rates) == 0) {
        // Cache miss: call the API and copy every rate into the hash
        $client = new GuzzleHttp\Client();
        $response = $client->get('https://openexchangerates.org/api/latest.json?app_id=123');
        $data = json_decode($response->getBody()->getContents());
        $reflect = new ReflectionObject($data->rates);
        foreach ($reflect->getProperties(ReflectionProperty::IS_PUBLIC) as $property) {
            $rates[$property->getName()] = $property->getValue($data->rates);
        }
        $predis->hmset('rates', $rates);
        $predis->expire('rates', 15); // refresh at most every 15 seconds
    }
    echo $rates['EUR'].PHP_EOL;
    Caching external services

    View Slide
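
The pattern in the slide above (look in the cache, fall back to the slow source, store with an expiry) can be isolated from Redis and Guzzle. The sketch below is an assumption-laden stand-in, not the talk's code: `TtlCache` is a hypothetical in-memory class, the rates are made-up numbers, and the clock is passed in explicitly so that expiry can be demonstrated without waiting.

```php
<?php
// In-memory stand-in for Redis; each entry stores a value and an expiry time.
final class TtlCache
{
    private array $items = [];

    // Return the cached value, or call $loader, store its result for $ttl
    // seconds and return that. $now is injectable so expiry can be tested.
    public function remember(string $key, int $ttl, callable $loader, ?int $now = null)
    {
        $now = $now ?? time();
        if (isset($this->items[$key]) && $this->items[$key]['expires'] > $now) {
            return $this->items[$key]['value'];
        }
        $value = $loader();
        $this->items[$key] = ['value' => $value, 'expires' => $now + $ttl];
        return $value;
    }
}

$fetches = 0;
// Hypothetical loader with made-up rates; the real one would call the API.
$loader = function () use (&$fetches): array {
    $fetches++;
    return ['EUR' => 0.88, 'USD' => 1.0];
};

$cache = new TtlCache();
$r1 = $cache->remember('rates', 15, $loader, 1000); // miss: loader runs
$r2 = $cache->remember('rates', 15, $loader, 1010); // hit: still fresh
$r3 = $cache->remember('rates', 15, $loader, 1015); // expired: loader runs again
```

With a 15 second TTL the external service is called at most once per 15 seconds per key, however many requests arrive in between.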

  81. Flexibility
    $rates = $predis->hgetall('rates');
    $rates = $predis->hget('rates', 'EUR');

    View Slide

  82. Database output
    Byte code
    Database output
    External services
    Files from disk
    Pages

    View Slide

  83. Potential slow down

    View Slide

  84. require 'vendor/autoload.php';
    $predis = new Predis\Client();
    // The set 'products' holds the cached SKUs; each SKU has its own hash
    $productSkus = $predis->smembers('products');
    if (count($productSkus) == 0) {
        // Cache miss: read from MySQL and repopulate Redis for 15 seconds
        $db = new PDO('mysql:host=localhost;dbname=sample', 'root', '');
        $statement = $db->query('SELECT sku,name,short_description,price FROM catalog_product_flat_1');
        $products = $statement->fetchAll(PDO::FETCH_ASSOC);
        foreach ($products as $row) {
            $predis->hmset($row['sku'], $row);
            $predis->expire($row['sku'], 15);
            $predis->sadd('products', $row['sku']);
        }
        $predis->expire('products', 15);
    } else {
        // Cache hit: fetch every product hash in a single round trip
        $products = $predis->pipeline(function ($pipe) use ($productSkus) {
            foreach ($productSkus as $sku) {
                $pipe->hgetall($sku);
            }
        });
    }
    foreach ($products as $product) {
        echo $product['sku'] . ' '.$product['name'].PHP_EOL;
    }
    Caching database output

    View Slide

  86. Let’s try this
    with
    ElasticSearch

    View Slide

  87. require 'vendor/autoload.php';
    $client = Elasticsearch\ClientBuilder::create()->build();
    $params = [
    'index' => 'products',
    'type' => 'product',
    'body' => [
    'size'=>10000,
    'query' => [
    'match_all' => new \stdClass() // an empty PHP array would be sent as [] instead of {}
    ]
    ]
    ];
    $response = $client->search($params);
    if(!isset($response['hits']['hits']) || count($response['hits']['hits']) == 0) {
    $db = new PDO('mysql:host=localhost;dbname=sample', 'root', '');
    $statement = $db->query('SELECT sku,name,short_description,price FROM products');
    $products = $statement->fetchAll(PDO::FETCH_ASSOC);
    foreach($products as $row) {
    $client->index([
    'index' => 'products',
    'type' => 'product',
    'id' => $row['sku'],
    'body' => $row
    ]);
    }
    } else {
    $products = array_map(function($doc){
    return $doc['_source'];
    },$response['hits']['hits']);
    }
    foreach($products as $product) {
    echo $product['sku'] . ' '.$product['name'].PHP_EOL;
    }

    View Slide

  89. Pages
    Byte code
    Database output
    External services
    Files from disk
    Pages

    View Slide

  90. all the way

    View Slide

  91. Easy peasy
    right?

    View Slide

  92. Not
    really

    View Slide

  93. There are rules

    View Slide

  94. ✓Only GET & HEAD
    ✓No authorization headers
    ✓No cookies
    ✓No set-cookies
    ✓Valid cache-control/expires headers
    When does Varnish
    cache? Some rules …

    View Slide
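
The rules on the slide above map directly onto the built-in VCL shown earlier. As a hedged illustration, they can be written as two PHP predicates; the function names are hypothetical, header names are matched case-insensitively for convenience, and this is a simplification of what Varnish actually evaluates (for example, `Surrogate-Control` and `Vary: *` are left out).

```php
<?php
// Request side: only GET/HEAD without Authorization or Cookie is looked up.
function isRequestCacheable(string $method, array $headers): bool
{
    $headers = array_change_key_case($headers, CASE_LOWER);
    if (!in_array(strtoupper($method), ['GET', 'HEAD'], true)) {
        return false;
    }
    return !isset($headers['authorization']) && !isset($headers['cookie']);
}

// Response side: no Set-Cookie, a positive TTL, and no Cache-Control
// directive that forbids shared caching.
function isResponseCacheable(array $headers, int $ttl): bool
{
    $headers = array_change_key_case($headers, CASE_LOWER);
    if ($ttl <= 0 || isset($headers['set-cookie'])) {
        return false;
    }
    $cc = $headers['cache-control'] ?? '';
    return preg_match('/no-cache|no-store|private/i', $cc) !== 1;
}

var_dump(isRequestCacheable('GET', ['Cookie' => 'PHPSESSID=abc']));
var_dump(isResponseCacheable(['Cache-Control' => 'public, max-age=60'], 60));
```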

  95. It’s all
    about
    state

    View Slide

  96. Lots of
    developers
    don’t follow
    those rules

    View Slide

  97. Cookies
    everywhere

    View Slide

  98. Cache-Control what?

    View Slide

  99. These are the main reasons why
    Varnish will not work
    out of the box

    View Slide

  100. Write VCL

    View Slide

  101. vcl 4.0;
    backend default {
    .host = "127.0.0.1";
    .port = "8080";
    }
    Minimal VCL

    View Slide

  102. Normalize

    View Slide

  103. vcl 4.0;
    import std;
    sub vcl_recv {
    set req.http.Host = regsub(req.http.Host, ":[0-9]+", "");
    set req.url = std.querysort(req.url);
    if (req.url ~ "\#") {
    set req.url = regsub(req.url, "\#.*$", "");
    }
    if (req.url ~ "\?$") {
    set req.url = regsub(req.url, "\?$", "");
    }
    if (req.restarts == 0) {
    if (req.http.Accept-Encoding) {
    if (req.http.User-Agent ~ "MSIE 6") {
    unset req.http.Accept-Encoding;
    } elsif (req.http.Accept-Encoding ~ "gzip") {
    set req.http.Accept-Encoding = "gzip";
    } elsif (req.http.Accept-Encoding ~ "deflate") {
    set req.http.Accept-Encoding = "deflate";
    } else {
    unset req.http.Accept-Encoding;
    }
    }
    }
    }
    Normalize

    View Slide
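
Why normalization raises the hit rate is easier to see outside VCL. The PHP sketch below approximates the URL and Host steps of the VCL above; `normalizeUrl` and `normalizeHost` are hypothetical helpers, and the parameter sort is a plain string sort rather than a reimplementation of `std.querysort`.

```php
<?php
// Approximates the URL steps of the VCL above: drop the fragment, sort the
// query string parameters, and remove a trailing "?".
function normalizeUrl(string $url): string
{
    $url = preg_replace('/#.*$/', '', $url);
    $parts = explode('?', $url, 2);
    if (count($parts) === 2 && $parts[1] !== '') {
        $params = explode('&', $parts[1]);
        sort($params, SORT_STRING);
        return $parts[0] . '?' . implode('&', $params);
    }
    return $parts[0];
}

// Strips an explicit port from the Host header, as the first regsub does.
function normalizeHost(string $host): string
{
    return preg_replace('/:[0-9]+/', '', $host);
}

echo normalizeUrl('/p?b=2&a=1#top') . PHP_EOL;
echo normalizeHost('example.com:8080') . PHP_EOL;
```

Two requests that differ only in parameter order, a fragment, or a port number now end up with the same URL and Host, and therefore the same cache entry.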

  104. Static assets

    View Slide

  105. vcl 4.0;
    sub vcl_recv {
    if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|otf|ogg|ogm|opus|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
    unset req.http.Cookie;
    return (hash);
    }
    }
    sub vcl_backend_response {
    if (bereq.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|otf|ogg|ogm|opus|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
    unset beresp.http.set-cookie;
    }
    if (bereq.url ~ "^[^?]*\.(7z|avi|bz2|flac|flv|gz|mka|mkv|mov|mp3|mp4|mpeg|mpg|ogg|ogm|opus|rar|tar|tgz|tbz|txz|wav|webm|xz|zip)(\?.*)?$") {
    unset beresp.http.set-cookie;
    set beresp.do_stream = true;
    set beresp.do_gzip = false;
    }
    }
    Cache static assets

    View Slide

  106. Do you really
    want to cache
    static assets?

    View Slide

  107. Nginx or
    Apache can
    be fast
    enough for
    that

    View Slide

  108. Memory
    consumption
    vs
    Speed improvement

    View Slide

  109. vcl 4.0;
    import std;
    sub vcl_recv {
    if (req.url ~ "^[^?]*\.(7z|avi|bmp|bz2|css|csv|doc|docx|eot|flac|flv|gif|gz|ico|jpeg|jpg|js|less|mka|mkv|mov|mp3|mp4|mpeg|mpg|odt|otf|ogg|ogm|opus|pdf|png|ppt|pptx|rar|rtf|svg|svgz|swf|tar|tbz|tgz|ttf|txt|txz|wav|webm|webp|woff|woff2|xls|xlsx|xml|xz|zip)(\?.*)?$") {
    unset req.http.Cookie;
    return (pass);
    }
    }
    Don’t cache static assets

    View Slide

  110. URL whitelist/blacklist

    View Slide

  111. sub vcl_recv {
    if (req.url ~ "^/status\.php$" ||
    req.url ~ "^/update\.php$" ||
    req.url ~ "^/admin$" ||
    req.url ~ "^/admin/.*$" ||
    req.url ~ "^/user$" ||
    req.url ~ "^/user/.*$" ||
    req.url ~ "^/flag/.*$" ||
    req.url ~ "^.*/ajax/.*$" ||
    req.url ~ "^.*/ahah/.*$") {
    return (pass);
    }
    }
    URL blacklist

    View Slide

  112. sub vcl_recv {
    if (req.url ~ "^/products/?") {
    return (hash);
    }
    }
    URL whitelist

    View Slide

  113. Those damn
    cookies again!

    View Slide

  114. vcl 4.0;
    sub vcl_recv {
    set req.http.Cookie = regsuball(req.http.Cookie, "has_js=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "__utm.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "_ga=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "_gat=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmctr=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmcmd.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "utmccn.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "__gads=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "__qc.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "__atuv.=[^;]+(; )?", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "^;\s*", "");
    if (req.http.cookie ~ "^\s*$") {
    unset req.http.cookie;
    }
    }
    Remove tracking cookies

    View Slide

  115. vcl 4.0;
    sub vcl_recv {
    if (req.http.Cookie) {
    set req.http.Cookie = ";" + req.http.Cookie;
    set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
    set req.http.Cookie = regsuball(req.http.Cookie, ";(PHPSESSID)=", "; \1=");
    set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
    if (req.http.cookie ~ "^\s*$") {
    unset req.http.cookie;
    }
    }
    }
    Only keep session cookie

    View Slide
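
The chain of `regsuball` calls above is compact but hard to follow. Replaying it in PHP with `preg_replace` shows what each step does. This is a sketch for illustration; `keepOnlyCookie` is a hypothetical helper and the cookie values are made up.

```php
<?php
// Replays the regsuball() chain above with preg_replace(); $keep is the
// cookie name to preserve (PHPSESSID in the slide).
function keepOnlyCookie(string $cookieHeader, string $keep): string
{
    $c = ';' . $cookieHeader;
    $c = preg_replace('/; +/', ';', $c);
    // Mark the cookie to keep by putting a space after its leading ";".
    $c = preg_replace('/;(' . preg_quote($keep, '/') . ')=/', '; $1=', $c);
    // Remove every cookie that was not marked.
    $c = preg_replace('/;[^ ][^;]*/', '', $c);
    // Trim the leftover separators.
    return preg_replace('/^[; ]+|[; ]+$/', '', $c);
}

$kept = keepOnlyCookie('has_js=1; PHPSESSID=abc123; _ga=GA1.2.3', 'PHPSESSID');
$none = keepOnlyCookie('_ga=1; has_js=1', 'PHPSESSID');
echo $kept . PHP_EOL;
```

When nothing is left, the VCL unsets the Cookie header altogether, which makes the request cacheable under the built-in rules.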

  116. sub vcl_recv {
    if (req.http.Cookie) {
    set req.http.Cookie = ";" + req.http.Cookie;
    set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
    set req.http.Cookie = regsuball(req.http.Cookie, ";(language)=", "; \1=");
    set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
    set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
    if (req.http.cookie ~ "^\s*$") {
    unset req.http.cookie;
    return(pass);
    }
    return(hash);
    }
    }
    sub vcl_hash {
    hash_data(regsub( req.http.Cookie, "^.*language=([^;]*);*.*$", "\1" ));
    }
    Language cookie cache variation

    View Slide

  117. Alternative
    language
    cache
    variation

    View Slide

  118. sub vcl_hash {
    hash_data(req.http.Accept-Language);
    }
    Language cookie cache variation
    Or just send a “Vary: Accept-Language” header

    View Slide

  119. sub vcl_hash {
    hash_data(req.http.Cookie);
    }
    Hash all cookies

    View Slide

  120. Edge Side Includes

    View Slide

  121. header.php
    menu.php main.php
    footer.php
    TTL 5s
    No caching
    TTL 10s
    TTL 2s

    View Slide

  122. sub vcl_recv {
    set req.http.Surrogate-Capability = "key=ESI/1.0";
    }
    sub vcl_backend_response {
    if (beresp.http.Surrogate-Control ~ "ESI/1.0") {
    unset beresp.http.Surrogate-Control;
    set beresp.do_esi = true;
    }
    }
    Edge Side Includes

    View Slide

  123. // ESI frame (esi.php): cached for 10 seconds
    header("Cache-Control: public,must-revalidate,s-maxage=10");
    echo "Date in the ESI tag: ".date('Y-m-d H:i:s').PHP_EOL;
    // Main page: not cached
    header("Cache-Control: no-store");
    header("Surrogate-Control: content='ESI/1.0'");
    // src path assumed; the slide only names the fragment esi.php
    echo '<esi:include src="/esi.php" />'.PHP_EOL;
    echo "Date in the main page: ".date('Y-m-d H:i:s').PHP_EOL;
    View Slide
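
What the ESI processor does with such a page can be imitated in a few lines. The sketch below is not how Varnish implements ESI; `processEsi` is a hypothetical function, the `/esi.php` path and fragment text are placeholders, and the callback stands in for a sub-request that may itself be answered from cache with its own TTL.

```php
<?php
// Replaces each <esi:include src="..."/> tag with the fragment returned by
// $fetch, which stands in for a (possibly cached) sub-request.
function processEsi(string $html, callable $fetch): string
{
    return preg_replace_callback(
        '/<esi:include\s+src="([^"]+)"\s*\/>/',
        function (array $m) use ($fetch): string {
            return $fetch($m[1]);
        },
        $html
    );
}

// Hypothetical fragments keyed by path.
$fragments = ['/esi.php' => 'Date in the ESI tag: (cached copy)'];
$page = 'Main page <esi:include src="/esi.php" /> end';
$out = processEsi($page, function (string $src) use ($fragments): string {
    return $fragments[$src] ?? '';
});
echo $out . PHP_EOL;
```

Because each fragment is fetched separately, the main page can stay uncached while the included block is served from cache for its own 10 seconds.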

  124. ESI
    vs
    AJAX

    View Slide

  125. Control
    Time
    To
    Live

    View Slide

  126. sub vcl_backend_response {
    set beresp.ttl = 3h;
    }
    Control Time To Live

    View Slide

  127. sub vcl_backend_response {
    if (beresp.ttl <= 0s || beresp.http.Set-Cookie ||
    beresp.http.Vary == "*") {
    set beresp.ttl = 120s;
    set beresp.uncacheable = true;
    return (deliver);
    }
    }
    Control Time To Live

    View Slide

  128. Debugging

    View Slide

  129. sub vcl_deliver {
    if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
    } else {
    set resp.http.X-Cache = "MISS";
    }
    }
    Debugging

    View Slide

  130. Purging

    View Slide

  131. acl purge {
    "localhost";
    "127.0.0.1";
    "::1";
    }
    sub vcl_recv {
    if (req.method == "PURGE") {
    if (!client.ip ~ purge) {
    return (synth(405, "Not allowed."));
    }
    return (purge);
    }
    }
    Purging

    View Slide

  132. acl purge {
    "localhost";
    "127.0.0.1";
    "::1";
    }
    sub vcl_backend_response {
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;
    }
    sub vcl_deliver {
    unset resp.http.x-url;
    unset resp.http.x-host;
    }
    sub vcl_recv {
    if (req.method == "PURGE") {
    if (!client.ip ~ purge) {
    return (synth(405, "Not allowed"));
    }
    if(req.http.x-purge-regex) {
    ban("obj.http.x-host == " + req.http.host + " && obj.http.x-url ~ " + req.http.x-purge-regex);
    } else {
    ban("obj.http.x-host == " + req.http.host + " && obj.http.x-url == " + req.url);
    }
    return (synth(200, "Purged"));
    }
    }
    Banning

    View Slide
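
A ban differs from a purge in that it matches many objects at once by expression. The PHP below models only that matching step, as an assumption-based sketch: `applyBan` is a hypothetical function, the object list is made up, and Varnish itself evaluates bans lazily rather than filtering an array up front.

```php
<?php
// $objects: cached entries with the x-host / x-url values stored by
// vcl_backend_response. Returns the entries that survive the ban.
function applyBan(array $objects, string $host, string $urlRegex): array
{
    $pattern = '~' . str_replace('~', '\~', $urlRegex) . '~';
    return array_values(array_filter(
        $objects,
        function (array $o) use ($host, $pattern): bool {
            $banned = $o['x-host'] === $host
                && preg_match($pattern, $o['x-url']) === 1;
            return !$banned;
        }
    ));
}

// Made-up cache contents.
$objects = [
    ['x-host' => 'example.com', 'x-url' => '/products/1'],
    ['x-host' => 'example.com', 'x-url' => '/about'],
    ['x-host' => 'other.example', 'x-url' => '/products/1'],
];
$left = applyBan($objects, 'example.com', '/products');
echo count($left) . ' object(s) left' . PHP_EOL;
```

Only the `/products/1` entry for `example.com` is invalidated; the same URL on another host is untouched because the host is part of the ban expression.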

  133. Banning
    curl -XPURGE -H "x-purge-regex:/products" "http://example.com"
    curl -XPURGE "http://example.com/products"

    View Slide

  134. Grace mode

    View Slide

  135. sub vcl_backend_response {
    set beresp.grace = 6h;
    }
    Grace mode

    View Slide

  136. Let’s talk more about

    View Slide

  137. GET /blog/post/6160
    {
    "_index": "blog",
    "_type": "post",
    "_id": "6160",
    "_version": 1,
    "found": true,
    "_source": {
    "language": "en-US",
    "title": "WordPress 4.4 is available! And these are the new features…",
    "date": "Tue, 15 Dec 2015 13:28:23 +0000",
    "author": "Romy",
    "category": [
    "News",
    "PHP",
    "Sector news",
    "Webdesign & development",
    "CMS",
    "content management system",
    "wordpress",
    "WordPress 4.4"
    ],
    "guid": "6160"
    }
    }
    Remember
    this one?

    View Slide

  138. GET /blog/_mapping
    {
    "blog": {
    "mappings": {
    "post": {
    "properties": {
    "author": {
    "type": "string"
    },
    "category": {
    "type": "string"
    },
    "date": {
    "type": "string"
    },
    "guid": {
    "type": "string"
    },
    "language": {
    "type": "string"
    },
    "title": {
    "type": "string"
    }
    }
    }
    }
    }
    }
    Schemaless?
    Not really …
    “Guesses”
    mapping on
    insert

    View Slide

  139. Explicit
    mapping

    View Slide

  140. POST /blog
    {
    "mappings" : {
    "post" : {
    "properties": {
    "title" : {
    "type" : "string"
    },
    "date" : {
    "type" : "date",
    "format": "E, dd MMM YYYY HH:mm:ss Z"
    },
    "author": {
    "type": "string"
    },
    "category": {
    "type": "string"
    },
    "guid": {
    "type": "integer"
    }
    }
    }
    }
    }
    Explicit
    mapping at
    index creation
    time

    View Slide

  141. POST /blog
    {
    "mappings": {
    "post": {
    "properties": {
    "author": {
    "type": "string",
    "index": "not_analyzed"
    },
    "category": {
    "type": "string",
    "index": "not_analyzed"
    },
    "date": {
    "type": "date",
    "format": "E, dd MMM YYYY HH:mm:ss Z"
    },
    "guid": {
    "type": "integer"
    },
    "language": {
    "type": "string",
    "index": "not_analyzed"
    },
    "title": {
    "type": "string",
    "fields": {
    "en": {
    "type": "string",
    "analyzer": "english"
    },
    "nl": {
    "type": "string",
    "analyzer": "dutch"
    },
    "raw": {
    "type": "string",
    "index": "not_analyzed"
    }
    }
    }
    }
    }
    }
    }
    Alternative
    mapping

    View Slide

  142. POST /blog
    {
    "mappings": {
    "post": {
    "properties": {
    "author": {
    "type": "string",
    "index": "not_analyzed"
    },
    "category": {
    "type": "string",
    "index": "not_analyzed"
    },
    "date": {
    "type": "date",
    "format": "E, dd MMM YYYY HH:mm:ss Z"
    },
    "guid": {
    "type": "integer"
    },
    "language": {
    "type": "string",
    "index": "not_analyzed"
    },
    "title": {
    "type": "string",
    "fields": {
    "en": {
    "type": "string",
    "analyzer": "english"
    },
    "nl": {
    "type": "string",
    "analyzer": "dutch"
    },
    "raw": {
    "type": "string",
    "index": "not_analyzed"
    }
    }
    }
    }
    }
    }
    }
    What’s with
    the analyzers?

    View Slide

  143. Analyzed
    vs
    non-analyzed

    View Slide

  144. Full-text
    vs
    exact value

    View Slide

  145. By default strings are
    analyzed
    … unless you mention
    it in the mapping

    View Slide

  146. Analyzer
    •Character filters: replace characters in the text to be analyzed
    •Tokenizers: break text down into terms
    •Token filters: add/modify/delete tokens

    View Slide

  147. Built-in analyzers
    •Standard
    •Simple
    •Whitespace
    •Stop
    •Keyword
    •Pattern
    •Language
    •Snowball
    •Custom
    Standard tokenizer
    Lowercase token filter
    English stop word token filter

    View Slide

  148. Input: Hey man, how are you doing?
    Standard: hey man how are you doing
    Whitespace: Hey man, how are you doing?
    English: hei man how you do
    View Slide
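
The tokenizer and token filter stages can be imitated in plain PHP to show why the analyzers produce different terms for the same sentence. This is a simplified sketch, not Elasticsearch's implementation: the function names are hypothetical, the "standard-like" tokenizer simply splits on anything that is not a letter or digit, and no stemming is attempted, so the English analyzer's `hei` and `do` are not reproduced.

```php
<?php
// Whitespace-style analysis: split on whitespace, keep tokens untouched.
function analyzeWhitespace(string $text): array
{
    return preg_split('/\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Simplified standard-like analysis: split on non-alphanumerics (tokenizer),
// lowercase (token filter), then optionally drop stop words (token filter).
function analyzeStandardLike(string $text, array $stopWords = []): array
{
    $tokens = preg_split('/[^\p{L}\p{N}]+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    $tokens = array_map('strtolower', $tokens);
    return array_values(array_diff($tokens, $stopWords));
}

$sentence = 'Hey man, how are you doing?';
echo implode(' ', analyzeWhitespace($sentence)) . PHP_EOL;
echo implode(' ', analyzeStandardLike($sentence)) . PHP_EOL;
echo implode(' ', analyzeStandardLike($sentence, ['are'])) . PHP_EOL;
```

A search term is run through the same pipeline before matching, which is why a query against a field analyzed one way can miss documents that a differently analyzed field would find.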

  149. POST /blog/post/_search
    {
    "fields": ["title"],
    "query": {
    "match": {
    "title": "working"
    }
    }
    }

    View Slide

  150. {
    "took": 1,
    "timed_out": false,
    "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
    },
    "hits": {
    "total": 1,
    "max_score": 1.7562683,
    "hits": [
    {
    "_index": "blog",
    "_type": "post",
    "_id": "2742",
    "_score": 1.7562683,
    "fields": {
    "title": [
    "Hosted SharePoint 2010: working efficiently as a team"
    ]
    }
    }
    ]
    }
    }

    View Slide

  151. POST /blog/post/_search
    {
    "fields": ["title"],
    "query": {
    "match": {
    "title.en": "working"
    }
    }
    }

    View Slide

  152. {
    "took": 1,
    "timed_out": false,
    "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
    },
    "hits": {
    "total": 6,
    "max_score": 2.4509864,
    "hits": [
    {
    "_index": "blog",
    "_type": "post",
    "_id": "828",
    "_score": 2.4509864,
    "fields": {
    "title": [
    "Still a lot of work in store"
    ]
    }
    },
    {
    "_index": "blog",
    "_type": "post",
    "_id": "3873",
    "_score": 2.144613,
    "fields": {
    "title": [
    "SSL: what is it and how does it work?"
    ]
    }
    },
    {
    "_index": "blog",
    "_type": "post",
    "_id": "5586",
    "_score": 2.1184452,
    "fields": {
    "title": [
    "WebAssembly: several world players work on a faster Internet"
    ]

    View Slide

  153. Search

    View Slide

  154. POST /blog/post/_count
    {
    "query": {
    "match": {
    "title": "PROXY protocol support in Varnish"
    }
    }
    }
    162 posts
    POST /blog/post/_count
    {
    "query": {
    "filtered": {
    "filter": {
    "term": {
    "title.raw": "PROXY protocol support in Varnish"
    }
    }
    }
    }
    }
    1 post

    View Slide

  155. Filter
    vs
    Query

    View Slide

  156. Match Query
    Multi Match Query
    Bool Query
    Boosting Query
    Common Terms Query
    Constant Score Query
    Dis Max Query
    Filtered Query
    Fuzzy Like This Query
    Fuzzy Like This Field Query
    Function Score Query
    Fuzzy Query
    GeoShape Query
    Has Child Query
    Has Parent Query
    Ids Query
    Indices Query
    Match All Query
    More Like This Query
    Nested Query
    Prefix Query
    Query String Query
    Simple Query String Query
    Range Query
    Regexp Query
    Span First Query
    Span Multi Term Query
    Span Near Query
    Span Not Query
    Span Or Query
    Span Term Query
    Term Query
    Terms Query
    Top Children Query
    Wildcard Query
    Minimum Should Match
    Multi Term Query Rewrite
    Template Query

  157. And Filter
    Bool Filter
    Exists Filter
    Geo Bounding Box Filter
    Geo Distance Filter
    Geo Distance Range Filter
    Geo Polygon Filter
    GeoShape Filter
    Geohash Cell Filter
    Has Child Filter
    Has Parent Filter
    Ids Filter
    Indices Filter
    Limit Filter
    Match All Filter
    Missing Filter
    Nested Filter
    Not Filter
    Or Filter
    Prefix Filter
    Query Filter
    Range Filter
    Regexp Filter
    Script Filter
    Term Filter
    Terms Filter
    Type Filter

  158. Aggregations

  159. Group by on steroids

  160. SELECT author, COUNT(guid)
    FROM blog.post
    GROUP BY author
    Metric: COUNT(guid)
    Bucket: author
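
The same bucket/metric split, sketched in plain Python: the grouping key (author) defines the buckets and the count per bucket is the metric. The sample posts are hypothetical. The output shape only mimics the "key" / "doc_count" pairs of a terms aggregation.

```python
from collections import Counter

# Hypothetical sample rows standing in for blog.post.
posts = [
    {"guid": 1, "author": "Romy"},
    {"guid": 2, "author": "Tom"},
    {"guid": 3, "author": "Romy"},
]

# Bucket: one entry per distinct author. Metric: number of posts in that bucket.
counts = Counter(p["author"] for p in posts)

# Largest buckets first, like a terms aggregation ordered by doc_count.
buckets = [{"key": author, "doc_count": n} for author, n in counts.most_common()]
```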

  161. POST /blog/post/_search?pretty&search_type=count
    {
      "aggs": {
        "popular_bloggers": {
          "terms": {
            "field": "author"
          }
        }
      }
    }
    Only aggs, no docs

  162. "aggregations": {
      "popular_bloggers": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "Romy",
            "doc_count": 415
          },
          {
            "key": "Combell",
            "doc_count": 184
          },
          {
            "key": "Tom",
            "doc_count": 184
          },
          {
            "key": "Jimmy Cappaert",
            "doc_count": 157
          },
          {
            "key": "Christophe",
            "doc_count": 23
          }
        ]
      }
    }
    Aggregation output

  163. POST /blog/_search
    {
      "query": {
        "match": {
          "title": "varnish"
        }
      },
      "aggs": {
        "popular_bloggers": {
          "terms": {
            "field": "author",
            "size": 10
          },
          "aggs": {
            "used_languages": {
              "terms": {
                "field": "language",
                "size": 10
              }
            }
          }
        }
      }
    }
    Nested multi-group by alongside query

  164. "aggregations": {
      "popular_bloggers": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "Romy",
            "doc_count": 4,
            "used_languages": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "en-US",
                  "doc_count": 3
                },
                {
                  "key": "nl-NL",
                  "doc_count": 1
                }
              ]
            }
          },
          {
            "key": "Combell",
            "doc_count": 3,
            "used_languages": {
              "doc_count_error_upper_bound": 0,
              "sum_other_doc_count": 0,
              "buckets": [
                {
                  "key": "nl-NL",
                  "doc_count": 3
                }
              ]
            }
          },
    Aggregation output

  165. Min Aggregation
    Max Aggregation
    Sum Aggregation
    Avg Aggregation
    Stats Aggregation
    Extended Stats Aggregation
    Value Count Aggregation
    Percentiles Aggregation
    Percentile Ranks Aggregation
    Cardinality Aggregation
    Geo Bounds Aggregation
    Top hits Aggregation
    Scripted Metric Aggregation
    Global Aggregation
    Filter Aggregation
    Filters Aggregation
    Missing Aggregation
    Nested Aggregation
    Reverse nested Aggregation
    Children Aggregation
    Terms Aggregation
    Significant Terms Aggregation
    Range Aggregation
    Date Range Aggregation
    IPv4 Range Aggregation
    Histogram Aggregation
    Date Histogram Aggregation
    Geo Distance Aggregation
    GeoHash grid Aggregation

  166. Where does all of this fit in?

  167. Where does Varnish fit in?
    ✓ Cache all images, js, css, woff, …
    ✓ Cache dynamic pages
    ✓ ESI or AJAX for user-specific content
    ✓ Sanitize HTTP input/output
    ✓ Gateway to your application
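
One way an application can cooperate with a caching gateway is by sending explicit Cache-Control headers per content type. The sketch below is only an illustration of that split. The extension list, the TTL values and the `cache_control` helper are assumptions, not settings from the talk.

```python
import os

# Assumed set of static file types; extend to match your own assets.
STATIC_EXTENSIONS = {".png", ".jpg", ".gif", ".js", ".css", ".woff"}

def cache_control(path, user_specific=False):
    """Pick a Cache-Control value for a response (TTLs are illustrative)."""
    if user_specific:
        # Personalised fragments (e.g. loaded via ESI or AJAX) stay out of the shared cache.
        return "private, no-store"
    ext = os.path.splitext(path)[1].lower()
    if ext in STATIC_EXTENSIONS:
        # Static assets: long lifetime for browsers and shared caches alike.
        return "public, max-age=86400"
    # Dynamic pages: short lifetime in the shared cache only.
    return "public, s-maxage=60"

static_header = cache_control("/css/site.css")
page_header = cache_control("/blog/post/42")
cart_header = cache_control("/cart", user_specific=True)
```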

  168. Where does ElasticSearch fit in?
    ✓ Secondary database (NoSQL)
    ✓ RDBMS can remain the source of truth
    ✓ Store in fixed format (no joins)
    ✓ Full-text search
    ✓ Fast retrieval of data projections
    ✓ Aggregations
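
"Store in fixed format (no joins)" usually means doing the join once, at indexing time, and storing the flattened result as a document. A minimal sketch with an in-memory SQLite database playing the RDBMS. The schema, the sample rows and the author assignments are hypothetical, and the actual indexing call to the search engine is left out.

```python
import sqlite3

# Hypothetical relational schema acting as the source of truth.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post (
        guid INTEGER PRIMARY KEY,
        title TEXT,
        language TEXT,
        author_id INTEGER REFERENCES author(id)
    );
    INSERT INTO author VALUES (1, 'Romy'), (2, 'Tom');
    INSERT INTO post VALUES
        (10, 'PROXY protocol support in Varnish', 'en-US', 1),
        (11, 'Still a lot of work in store', 'en-US', 2);
""")

def build_documents(conn):
    """Join once in SQL and emit flat, self-contained documents."""
    rows = conn.execute(
        "SELECT p.guid, p.title, p.language, a.name "
        "FROM post p JOIN author a ON a.id = p.author_id "
        "ORDER BY p.guid"
    )
    return [
        {"_id": str(guid), "title": title, "language": language, "author": author}
        for guid, title, language, author in rows
    ]

documents = build_documents(conn)
```

Each document carries the author name directly, so a search or aggregation on "author" needs no join at query time.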

  169. Where does Redis fit in?
    ✓ Real-time information
    ✓ Key-value gets, not searches
    ✓ Volatile data
    ✓ When data changes a lot
    ✓ RDBMS is still the source of truth
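
A common way to use a key-value store for volatile data is cache-aside with a time-to-live: try the cache, fall back to the source of truth on a miss, then store the result for a short time. The sketch below uses an in-memory `TtlCache` class as a hypothetical stand-in for Redis so that it runs without a server. The names and the TTL are assumptions.

```python
import time

class TtlCache:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def set(self, key, value, ttl):
        # Store the value together with its absolute expiry time.
        self._data[key] = (value, self._clock() + ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave like a miss
            return None
        return value

def get_with_cache(cache, key, load, ttl=30):
    """Cache-aside: return the cached value, or load it and cache it for `ttl` seconds."""
    value = cache.get(key)
    if value is None:
        value = load(key)  # e.g. a query against the RDBMS
        cache.set(key, value, ttl)
    return value
```

The short TTL bounds how stale a value can get, which suits data that changes a lot while the database stays authoritative.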

  171. https://blog.feryn.eu
    https://talks.feryn.eu
    https://youtube.com/thijsferyn
    https://soundcloud.com/thijsferyn
    https://twitter.com/thijsferyn
    http://itunes.feryn.eu
