Optimizing client side performance in highly dynamic content websites

Ori Hoch
October 10, 2013

In this talk Ori shares his experience as a development team leader at YIT while working on ynet, the most popular news website in Israel.

The talk focuses on the challenges he and his team faced trying to optimize client-side performance of a highly dynamic content management system.

The talk can provide valuable input to anyone who is interested in improving client-side performance and especially to those interested in optimizing highly dynamic content websites.

Transcript

  1. Optimizing client-side performance for
    highly dynamic content websites
    Hello, my name is Ori Hoch.
    A few words about myself -

  2. ● Ori Hoch
    ● github.com/astupidog
    I have many years of experience in web
    development in various positions – from developer
    to system architect to team leader.

  3. ● Ori Hoch
    ● github.com/astupidog
    ● kaltura.org
    I work at Kaltura, the leading open source online
    video platform. I work on Kaltura MediaSpace, a
    product that can be used to create video and rich
    media web portals.

  4. ● Ori Hoch
    ● github.com/astupidog
    ● kaltura.org
    ● hasadna.org.il/en
    In my spare time, which I don't have much of
    because of my 2 children, I volunteer for the Public
    Knowledge Workshop (הסדנה לידע ציבורי).

  5. The Public Knowledge Workshop
    We are hacking for a better Israel. You can help!
    The public knowledge workshop is a non-profit,
    non-partisan organization that makes Israeli
    government data and other data of public interest
    openly accessible on the internet.
    We are always looking for volunteers, so feel free
    to approach me after the lecture for further details.

  6. Until about 2 months ago I worked at YIT. YIT started
    about 15 years ago as the IT department for the
    Yedioth group which owns newspapers, magazines
    and internet sites, including Yedioth Aharonot which
    is one of the leading daily newspapers in Israel.

  7. yit.co.il/eng
    Since then YIT has grown from an IT department into an
    independent company which develops some of the
    biggest, most popular websites in Israel.
    At YIT I was the web development team leader for
    ynet.

  8. ynet.co.il
    ynetnews.com
    Ynet is the most popular news website in Israel,
    and during my time working there my team and I
    were very lucky to have the chance to rewrite the
    site, mostly the front-end, almost from scratch.

  9. During that process we learnt a lot and in this
    presentation I want to share with you some of the
    challenges we faced.

  10. [Bar chart: average daily visitors for walla.co.il, ynet.co.il,
    mako.co.il, nana10.co.il, tapuz.co.il and one.co.il, on a scale
    from 0 to 900,000]
    ynet average daily visitors: 749,045
    Feb 2012, Israel Audience Research Board
    The biggest challenge in working on ynet was not
    the popularity of the site – although it is one of the
    most popular in Israel with hundreds of thousands
    of daily visitors.
    (http://www.globes.co.il/news/article.aspx?did=1000734364)

  11. 1395 ynet
    ynet.co.il
    Compared to international sites it's nothing special.

  12. The most challenging part was the dynamic
    nature of the site – the site is based on a very
    sophisticated content management system that
    allows a very high degree of customization to the
    site editors.
    (http://en.m.wikipedia.org/wiki/File:Customized_Can-Am_Spyder.jpg)

  13. The editors make changes to the site all the time
    and in addition always want to add new features.
    All of that needs to happen as soon as possible.
    (baby godfather meme)

  14. This makes the basic optimizations much harder
    to perform and in this presentation I will focus on
    the problems we encountered and possible
    solutions.

  15. The plan
    ● Images
    ● Sprites
    ● CSS / JS
    ● Caching
    I'm assuming everyone here knows the basics of
    these optimizations and I will focus on more
    advanced problems and solutions.
    Feel free to ask questions or shout out comments
    during the presentation.

  16. Images
    On ynet, being primarily a news website, there are a lot
    of images the editors upload and the images change
    frequently.
    (http://ftvlive.com/todays-news/2013/9/20/tv-news-pet-peeves)

  17. Images
    If there is an ongoing event, the editors will get a lot of
    photos from the photographers at the event and will
    want to use those photos as soon as possible.

  18. Images of course take a lot of bandwidth and to lower
    costs and improve client-side performance we want the
    images to be as small as possible but without losing
    quality.
    (http://what-if.xkcd.com/31/)

  19. Also we may want to serve the same image in different
    sizes for different devices – using media queries or for
    external applications.
    So ideally you want to scale down the original image to
    match the specific resolution on each device.
    (http://www.neobytesolutions.com/responsive-web-design-part-3/)

  20. DELEGATE!
    The easiest solution for the developers is to let the
    editors do all the work: crop and scale the images
    manually on the desktop and require them to upload an
    image in the correct size. Of course the editors don't like
    that solution. Also, not all editors are Photoshop masters
    and they might not scale or crop the images properly.

  21. CROP
    SCALE
    A better solution is to add an easy-to-use image editing
    interface online in the site's admin. The editor uploads
    a full-resolution file and is then presented with an
    easy-to-use interface that allows them to crop and scale
    the image accordingly.

  22. SMALL
    BIG
    400x300
    800x600
    We can make the interface even easier to use and more
    fool-proof by limiting it to predefined target sizes.
    If for example we have 2 target sizes, big and small,
    then the user just selects which size to produce and
    the interface limits them to that size and does the
    crop or scaling as needed.

  23. KEEP CALM
    AND
    FIND THE
    USER ERORR
    This protects against user error and ensures the quality
    of the resulting images but it's still not good enough
    when there are a lot of images being uploaded all the
    time and used in different places.

  24. The editors need to have images ready as soon as
    possible – no time to waste trying to manually crop or
    scale the images.
    Also, they sometimes upload a lot of images in advance
    so when they need them they can use them immediately.
    But usually only a fraction of those images are actually
    used.
    (http://knowyourmeme.com/memes/soon)

  25. So the natural solution is to crop or scale automatically.
    This presents a few problems.
    (http://memecrunch.com/meme/P8ER/automate-it)

  26. Usually the editors upload a full-resolution image they
    get from the photographer – for example this image of a
    speaker at a conference. In the full-size image you can
    see who the speaker is and maybe even read the text
    behind him.
    (http://www.mysqlperformanceblog.com/2013/04/29/percona-live-mysql-conference-2013-wrap-up/)

  27. But if we want to use this image for a mobile device, it
    will be tiny – and none of the details are visible.

  28. So for smaller devices you need to crop the image so
    that at least some details are visible, but that can't be
    done automatically (or is very hard to do) because you
    need to identify the relevant part of the image you want
    to focus on.

  29. Another problem is different proportions. For example,
    you may have a vertical image that you want to use in a
    horizontal location.
    If you crop it automatically, you might chop someone's
    head off and if you don't crop it you have wasted space.

  30. Manual crop/scale in the online editing interface
    to predefined sizes
    So the solution we implemented is a semi-automatic
    process. The editors upload the full-resolution image and
    then in the image editing interface they crop and scale
    into as few as possible predefined sizes.

  31. From these sizes we can safely scale automatically to
    other predefined sizes which will be used on the site and
    will be more specifically fitted to the relevant targets.
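
The automatic scaling step could be sketched roughly as follows. This is an illustration only, not ynet's code: the `MASTER_SIZES` table and the `pick_master` helper are assumed names, and the two sizes are the ones shown on the earlier slide. The idea is to scale a target only from an editor-approved size that has the same proportions and is at least as large, so the automatic step never crops or upscales.

```python
from fractions import Fraction

# Sizes the editor crops to by hand (values from the slide; names are assumed).
MASTER_SIZES = {"big": (800, 600), "small": (400, 300)}


def pick_master(target_w, target_h, masters=MASTER_SIZES):
    """Return the name of the smallest master that can be scaled down to the target.

    A master qualifies only if it has the same aspect ratio as the target and is
    at least as large, so no automatic cropping or upscaling is ever needed.
    """
    ratio = Fraction(target_w, target_h)
    candidates = [
        (w * h, name)
        for name, (w, h) in masters.items()
        if Fraction(w, h) == ratio and w >= target_w and h >= target_h
    ]
    if not candidates:
        return None  # no safe source: an editor has to crop a new master by hand
    return min(candidates)[1]
```

A 200x150 thumbnail would be scaled from the small master, while a square 300x300 target has no safe source and goes back to the editor.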

  32. This way we can achieve our goal of serving each user
    the specific image resolution required and the editors
    have an efficient workflow for image uploading which
    minimizes manual work.

  33. sprites
    Every site has a lot of icons, logos, background
    images etc. that repeat on different pages.
    (http://forum.mgbr.net/index.php?showtopic=52640)

  34. sprites
    To reduce the number of http requests we can
    combine all those images into one big image which
    is downloaded once, cached on the client-side
    and then reused on all the site's pages.

  35. .rssicon {
    background: url('sprites.png') no-repeat 0 0;
    background-position: -38px -511px;
    }
    http://brandonsetter.com/60-beautiful-css-sprite-social-media-icons/
    Normally to use sprites you take all your images,
    put them on one big image (there are many tools to
    do that) and then use that image throughout your
    site using css.
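
To show how such a rule relates to the sprite's position in the combined image, here is a small sketch. The `sprite_rule` helper and the 32x32 icon size are assumptions for illustration; the offsets are the ones from the slide. The background is shifted by the negative of the sprite's top-left corner so that only that sprite shows through the element's box.

```python
def sprite_rule(class_name, sheet_url, x, y, width, height):
    """Build a css rule that shows the sprite whose top-left corner is (x, y)."""
    return (
        f".{class_name} {{\n"
        f"  background: url('{sheet_url}') no-repeat;\n"
        f"  background-position: {-x}px {-y}px;\n"
        f"  width: {width}px;\n"
        f"  height: {height}px;\n"
        f"}}"
    )


# Icon located 38px from the left and 511px from the top of the sheet.
rule = sprite_rule("rssicon", "sprites.png", 38, 511, 32, 32)
```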

  36. On ynet it was very hard to do because there are a
    lot of components and different layouts that the site
    editors can use.
    (http://knowyourmeme.com/memes/computer-reaction-faces)

  37. homepage article
    As an example I'll use 2 of the most popular pages on
    ynet – the homepage and the article page. In addition to
    those there are thousands of other pages in varying
    degrees of popularity.

  38. Header
    component
    Top story
    component
    Search bar
    component
    Strip
    component
    Each of these pages is made up of components that the
    site's editors can add, remove and edit. The editors can
    choose from hundreds of different components.

  39. Belong to
    components on the
    homepage
    Belong to
    components on
    articles
    Not used on
    popular pages at
    all, only on special
    pages or not at all
    If we were to create one big image with all those
    sprites it would be too big and only a few of those
    sprites would be used at any given moment.

  40. Image that contains the sprites
    that appear on the homepage
    Image that contains the sprites
    that appear on articles
    The solution is to split this big image into groups of
    sprites that will be used together.

  41. Image that contains the sprites
    that appear on the homepage
    Image that contains the sprites
    that appear on articles
    Overlapping sprites – same
    sprites appear on both homepage
    and article
    One problem with this solution is that if each page
    has a different image, some of the sprites in that
    image might repeat between the different pages.

  42. Image that contains the sprites
    that appear on the homepage
    A component that
    appears on the
    homepage but is not
    included in the
    homepage sprites
    image
    Another problem is that the components that appear on
    the page are determined by the editor – so there might be
    a component not included on the predefined image for
    the page and the sprites in that component will have to
    be displayed directly without using the combined sprites
    image.

  43. How to divide the big sprites image into smaller sprite
    groups?
    The challenge is how to define the sprite groups in
    the most efficient manner.

  44. Automatically look at the components that are on the page
    and create a unique image for those components
    One possible solution is to create the sprite
    images dynamically according to the components
    that are currently on the page. In this way, each
    page will have a unique sprites image associated
    with it. There are a few problems with this solution.

  45. The hardest problem with that is that it's hard to
    implement.

  46. /homepage
    (html)
    contains a link to
    css file:
    /homepage.sprites.css
    (dynamically generated)
    ● collects all the sprites used on the
    homepage
    ● compiles an image with all those
    sprites
    ● returns css rules which point to that
    sprites image
    In that case you need the page's html to contain a link to
    an external css file that dynamically generates css rules
    for all the sprites used on that page, based on a
    dynamically generated sprites image.

  47. Overlapping sprites
    There is also the problem of overlapping sprites –
    some components might be on 2 different pages.
    If each page has a unique sprites image associated
    with it the sprites for those components will be in 2
    sprite images.

  48. Waste bandwidth and increase
    page load times
    Overlapping sprites
    This solution ensures each page will have exactly 1
    sprite image related to it, which drastically reduces
    the number of http requests.
    But it also wastes a lot of bandwidth – the sprite
    images are not reused between different pages.

  49. This solution might be suitable in some cases; it really
    depends on the site editors' usage patterns. For ynet we
    found it's not suitable.

  50. Components
    most likely on
    the homepage
    Components
    most likely on
    articles
    Another solution is to create the sprite groups
    manually in advance according to client
    requirements – trying to anticipate the editor's
    usage patterns and which components will most
    likely be used on different pages.

  51. Components
    most likely on
    the homepage
    Components
    most likely on
    articles
    Common
    sprites – for all
    pages
    Social Sharing
    icons
    You can add different groups in any logical way
    which suits the site's usage patterns.
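
A manual grouping like this can be thought of as a small configuration table. The sketch below is hypothetical: the group names follow the slide, but the component names and the `groups_for_page` helper are made up. Given the components an editor placed on a page, it returns which sprite images the page has to load and which components are not covered by any group.

```python
# Hypothetical configuration: sprite group -> components whose sprites it holds.
SPRITE_GROUPS = {
    "common": {"header", "search_bar"},
    "social": {"share_buttons"},
    "homepage": {"top_story", "strip"},
    "article": {"article_body", "talkbacks"},
}


def groups_for_page(components, groups=SPRITE_GROUPS):
    """Return (sprite groups the page must load, components without a group)."""
    on_page = set(components)
    needed = sorted(name for name, members in groups.items() if members & on_page)
    covered = set().union(*groups.values())
    ungrouped = sorted(on_page - covered)
    return needed, ungrouped
```

Components that come back as ungrouped would display their images directly, without a combined sprites image.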

  52. This solution will slightly increase the number of http
    requests because each page will need to fetch a few
    different sprites images.

  53. Does not waste bandwidth
    So, this solution is not ideal regarding reduction of
    http requests but of course a lot better than without
    sprites. Also the sprites images will be cached on
    the client-side and reused between different pages
    – unlike the previous solution.

  54. Does not waste bandwidth
    No overlapping
    sprites
    It solves the problem of overlapping sprites – we
    divide the sprites manually and can make sure
    there is no overlap.

  55. Easier to implement
    Does not waste bandwidth
    No overlapping
    sprites
    And, it's easier to implement.

  56. The site's logo –
    changes on special
    occasions
    Links to sub-categories are
    images that are modified by the
    editor
    In addition to static sprites we also have some
    dynamic images that the editor uploads and can
    also benefit from spriting.
    For example on ynet – the site's logo can be
    changed by the editor and there are some other
    small images like that – that are used on many
    pages.

  57. The logo must be 130x50
    but the editor uploads 135x55
    If we use an img tag it works:

    But it doesn't work when using sprites
    with css
    background-position
    It's dangerous to compile those sprites
    automatically because they are uploaded by the
    editor – we need to ensure they work well with
    sprites.
    A common problem we encountered is that the
    editor uploads an image which is not exactly the
    required size. It works well in an img tag which
    scales it accordingly but not in a sprite.

  58. So we implemented a semi-automatic process that
    allows us to compile those dynamic sprites and test
    them. If there is a problem with one sprite we don't
    include it in the sprites image and it falls back to
    using a regular image.
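
The test step might look something like the sketch below. The talk does not say which checks were actually run; this example assumes the only check is the size mismatch described on the previous slide, and the `split_sprites` helper and image names are invented for illustration.

```python
def split_sprites(uploads, required_sizes):
    """Separate uploaded images into sprite-safe ones and ones that fall back to img.

    uploads: {name: (width, height)} as measured from the uploaded files.
    required_sizes: {name: (width, height)} that the css expects.
    """
    for_sheet, fallback = [], []
    for name, size in sorted(uploads.items()):
        if size == required_sizes.get(name):
            for_sheet.append(name)
        else:
            fallback.append(name)  # a wrong size would break background-position
    return for_sheet, fallback
```

With the logo from the slide uploaded at 135x55 instead of 130x50, the logo would be served as a regular image while correctly sized images still go into the sprites image.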

  59. All this sprite usage needs tight integration with the
    development and deployment process.
    If the sprites are dynamically allocated and
    compiled, the developers can't create them in
    advance during development.
    (http://www.deviantart.com/art/Puzzle-292285090)

  60. Also developers shouldn't need to wait to create
    sprites – a big image with a lot of sprites can take a
    long time to compile.
    (http://xkcd.com/303/)

  61. To solve these problems we developed a sprites api that
    has several responsibilities.

  62. Sprites Api
    Image Path
    Css class name
    ".ynetlogo"
    During development the usage is very easy – just
    give the api an image and get a css class name
    which can be used to display that image.

  63. This makes it fast and easy to use during development.

  64. .ynetlogo {
    background: url('sprites.png');
    background-position: -38px -511px;
    }
    .ynetlogo {
    background: url('ynetlogo.png');
    }
    Sprites Api
    Development
    Production
    Get css declarations
    The api keeps track of all the sprites that were used
    and after all the components were rendered it
    returns all the css declarations for all those sprites.
    During development, those css declarations will
    refer to the image directly so developers don't wait
    for sprite compilation. For qa or production they will
    refer to the relevant sprites image.
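
A rough sketch of that behaviour is shown below. It is not the actual ynet api: the class and method names, the `production` flag and the `positions` table are all assumptions. It only illustrates the two ideas from the slide, namely that components ask for a class name while rendering, and that the css for all used sprites is emitted afterwards, pointing either at the individual image or at the sprites image.

```python
import os


class SpritesApi:
    """Hands out css class names and later emits css for all used sprites."""

    def __init__(self, production, sheet_url="sprites.png", positions=None):
        self.production = production
        self.sheet_url = sheet_url
        self.positions = positions or {}  # image path -> (x, y) inside the sheet
        self.used = []  # image paths, in the order components asked for them

    def css_class(self, image_path):
        """Register an image and return the class name a component should use."""
        if image_path not in self.used:
            self.used.append(image_path)
        return os.path.splitext(os.path.basename(image_path))[0]

    def css_declarations(self):
        """Return css for every registered sprite; call after rendering is done."""
        rules = []
        for path in self.used:
            name = os.path.splitext(os.path.basename(path))[0]
            if self.production and path in self.positions:
                x, y = self.positions[path]
                body = f"background: url('{self.sheet_url}') no-repeat {-x}px {-y}px;"
            else:
                # Development, or a sprite missing from the sheet: use the image itself.
                body = f"background: url('{path}') no-repeat;"
            rules.append(f".{name} {{ {body} }}")
        return "\n".join(rules)
```

The component code is identical in both modes; only the emitted css differs.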

  65. Sprites Api
    Article sprites image
    Homepage sprites image
    Render sprites image
    The api also renders the sprite images according to a
    configuration that specifies how to distribute the sprites
    to the different groups.

  66. Of course, the css declarations and sprite images need to
    match precisely and we do this using versioning.

  67. homepage.sprites.v4.css
    homepage.sprites.v4.png
    Each sprite group has its related css rules and a related
    version number. Each time the sprite group is modified,
    it's compiled and the version is incremented.

  68. sprite_versions
    group     version  status
    homepage  4        ready
    homepage  5        disabled
    homepage  6        compiling
    Also, each version has a status which indicates whether the
    sprites for that version were compiled and are ready for
    usage on the site. This also allows for easy rollback if
    there is a problem with a sprite version.
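
One way to read the table: the site serves the highest version of a group that is marked ready, so disabling a bad version rolls back to the previous one. The sketch below assumes exactly that rule (the talk does not spell it out); the rows and file name pattern come from the slides, while the function names are invented.

```python
# Rows mirror the sprite_versions table on the slide: (group, version, status).
SPRITE_VERSIONS = [
    ("homepage", 4, "ready"),
    ("homepage", 5, "disabled"),
    ("homepage", 6, "compiling"),
]


def active_version(group, rows=SPRITE_VERSIONS):
    """Return the highest version of a group whose status is 'ready', or None."""
    ready = [version for g, version, status in rows if g == group and status == "ready"]
    return max(ready) if ready else None


def sprite_files(group, rows=SPRITE_VERSIONS):
    """Return the (css, png) file names for the active version of a group."""
    version = active_version(group, rows)
    if version is None:
        return None
    return f"{group}.sprites.v{version}.css", f"{group}.sprites.v{version}.png"
```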

  69. github.com/jakobwesthoff/web-sprite-generator
    We used web-sprite-generator to render the sprites
    images and generate the css rules. We wrapped it
    with our own code that knows where to get the
    individual sprite images from and where to store the
    output image, and that updates the backend so it
    knows the image was compiled and is ready for usage
    on the site.
    The developers have the option to use it directly to
    test the sprite generation and integration before
    pushing to qa.

  70. So, developers don't need to do anything special to use
    sprites: they just upload an image and use the sprites
    api to get a css class name.
    The sprites api handles all the hard work of compiling
    sprites and providing matching css rules.
    The sprites are distributed to different groups in
    production according to the site editor's usage patterns.

  71. css / js
    The site's content changes frequently but the css and
    javascript do not so we can ideally combine all the
    separate css and javascript code files into one big file,
    minify it and it can also be cached on the client-side and
    re-used between pages.
    (https://groups.google.com/forum/#!topic/nodejs/LquacmCOs-0)

  72. Css/js for
    articles
    components
    Css/js for
    homepage
    components
    In a dynamic cms we have similar problems as with
    sprites – if we put all the css declarations in one file it
    will be too big.
    Also, there are a lot of different components and we
    don't know in advance which components will appear on
    the page.

  73. Css/js for
    articles
    components
    Css/js for
    homepage
    components
    The solutions are also similar to the sprites solutions –
    manually or automatically distribute the css and js code
    into several external files.

  74. Unique page
    css / js files,
    each contains
    code for all
    the
    components
    that currently
    appear on the
    page
    homepage.css
    homepage.js
    For css and javascript it might be more effective to
    automatically generate a unique css/js file for every
    page.
    Although there is overlap of code, because the css and
    javascript files are usually smaller than the sprite files it
    does not waste too much bandwidth or harm the page
    load time as much.
    It depends on the editors' usage patterns.

  75. The implementation is somewhat different than for
    sprites.
    (http://usersnap.com/blog/good-habits-in-web-development/)

  76. The ynet cms is made up of a lot of components – each
    component can be placed inside a cell and the cells are
    arranged in a dynamic grid. The site editors can change
    the arrangement of cells or the grid in the layout and can
    place different components inside the cells.

  77. Multi Articles
    Component MultiArticlesController
    display (request) : response
    All the parts are developed using mvc methodology, so
    each component is a controller that has a display method
    that returns the component's html.

  78. MultiArticlesController
    display (request) : response
    getStaticJs (request) : string
    getStaticCss (request) : string
    Each component's controller also has extra methods that
    return its static css and javascript code. The generator
    knows that this static code can be served from an
    external file and the developers make sure that the code
    returned from those methods is static and does not
    depend on any dynamic variables.

  79. Generator
    display
    Iterate over all the
    component's controllers
    on the current page
    call the display method
    on each controller
    returns the combined
    html according to the
    page's layout
    The object that generates the page (combines the
    layout, calls the components etc.) is called the generator.
    The generator's display method returns the combined
    html of all the page's components according to a layout.

  80. Static
    Files
    API
    homepage.css
    homepage.js
    Both files are
    dynamically
    generated by
    the static
    files api
    The returned html contains links to the static javascript
    and css files that are dynamically generated by a static
    files api.

  81. Static
    Files
    API
    GetStaticFile ( type , page )
    "js" , "homepage"
    "css" , "homepage"
    "js" , "article"
    Iterate over all the
    component's controllers
    on the relevant page
    call the
    getStaticCss /
    getStaticJs method
    on each controller
    return the
    combined css /
    js
    The static files api gets as parameters the type of file (js
    or css) and the relevant page. It then iterates over all the
    components on the relevant page and collects all the
    static js / css code.
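
The collection step can be sketched as follows. This is an illustration rather than the real cms code: the method names are snake_case versions of the ones on the slides and take no request argument, the second controller, the css/js strings and the `PAGES` registry are invented, and the real api also deals with versioning and minification.

```python
class MultiArticlesController:
    """Example component; the css/js strings are made up for illustration."""

    def get_static_css(self):
        return ".multi-articles { margin: 0; }"

    def get_static_js(self):
        return "function initMultiArticles() {}"


class TickerController:
    def get_static_css(self):
        return ".ticker { overflow: hidden; }"

    def get_static_js(self):
        return "function initTicker() {}"


# Hypothetical page registry: page name -> controllers currently placed on it.
PAGES = {"homepage": [MultiArticlesController(), TickerController()]}


def get_static_file(file_type, page, pages=PAGES):
    """Combine the static css or js of every component on the given page."""
    method = {"css": "get_static_css", "js": "get_static_js"}[file_type]
    return "\n".join(getattr(controller, method)() for controller in pages[page])
```

Because each controller only exposes its own static code, components stay independent of each other and of the combining step.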

  82. It also handles versioning of those files so they can be
    cached indefinitely, and it can optionally minify the code
    or compile coffeescript / scss etc.
    Some of the functionality is similar to the sprites api and
    they do share some of the code.
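
The talk does not say how the version of a combined file is chosen. One common approach, shown here purely as an assumption, is to derive the file name from a hash of the content: the name changes exactly when the code changes, so each name can safely be cached forever. The `versioned_name` helper is hypothetical.

```python
import hashlib


def versioned_name(page, file_type, content):
    """Derive a file name that changes whenever the combined content changes."""
    digest = hashlib.sha1(content.encode("utf-8")).hexdigest()[:8]
    return f"{page}.{digest}.{file_type}"
```

An incrementing counter, like the one used for sprite groups, would work just as well; the hash simply avoids having to store the counter.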

  83. So, css and js code is similar to sprites, only the
    implementation is slightly different.
    Like for sprites, we made sure it will integrate well into
    the development process. Each component is completely
    separate from the other components and the
    optimizations are done without requiring any specific
    changes to each component's code.

  84. Caching
    The most basic optimization but also the hardest is
    caching. There are a lot of different levels of
    caching; I will focus on full-page caching at the http
    level.

  85. The basic problem is that without caching all the
    requests hit your servers which increases the load
    and exposes you to ddos attacks. Also, if you have
    users from around the world – they all have to
    reach your server which might require many hops.

  86. CACHING
    LAYER
    The solution is to use a cdn (content distribution
    network) and/or an internal caching layer that can
    more effectively handle high loads and can also be
    nearer to the clients' physical location.

  87. “There are only two hard things in computer
    science: cache invalidation, naming things, and
    off-by-one errors.”
    Phil Karlton (excluding the off-by-one joke)
    A famous quote by Phil Karlton, one of the X11
    developers, says that cache invalidation is one of
    the hardest things in computer science.
    He was referring to the X11 implementation:
    making sure the windows are refreshed while doing
    as few operations as possible. But it's also true for
    web caches.

  88. For static assets like images/javascript/css files it's
    relatively easy to do cache invalidation – you can
    use versioning for those assets and when making a
    change increase the version so a new file will be
    served.
    The best option for those assets is to use an
    external cdn that stores your files on their servers,
    ideally with servers around the world that serve the
    files from the servers nearest to each user.

  89. For dynamic content it's more difficult and I'll focus on
    that.
    There are 2 main options for cache invalidation:

  90. using ttls (time to live) – each object is cached for a
    predefined amount of time. When the predefined time
    passes the cache is invalidated and a fresh copy will be
    served.

  91. On-demand caching: the data is saved in the
    cache indefinitely and when some data needs to change
    a request is made to invalidate the relevant data so that
    a fresh copy will be served.

  92. Example:
    ● Default ttl for all pages – 5 minutes
    ● Homepage needs to be refreshed more often so
    set at 2 minutes
    ● Articles are updated less often so can be set at 10
    minutes
    Ttl caching is relatively easy to implement – you specify
    a default ttl for all requests and possibly change that
    default ttl for specific requests.
    You need to determine the desired ttl for each resource.
    If it's too high the site's content will be stale. If it's
    too low your servers will still get a lot of hits.
    Usually we want the ttl to be as high as possible but the
    client wants it lower so the site is more up-to-date.
    On-demand cache invalidation is much harder to
    implement.
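
The ttl example above can be written down as a default plus a list of overrides. In this sketch the ttl values are the ones from the slide, while the url prefixes and function names are assumptions. The result is turned into a standard `Cache-Control: max-age` header value, which is how a ttl is normally communicated to http caches.

```python
DEFAULT_TTL = 5 * 60  # seconds

# Overrides from the example; the url prefixes are assumptions.
TTL_RULES = [
    ("/home", 2 * 60),
    ("/articles/", 10 * 60),
]


def ttl_for(path, rules=TTL_RULES, default=DEFAULT_TTL):
    """Return the cache lifetime in seconds for a request path."""
    for prefix, ttl in rules:
        if path.startswith(prefix):
            return ttl
    return default


def cache_control(path):
    """Build a Cache-Control header value carrying the ttl for the path."""
    return f"public, max-age={ttl_for(path)}"
```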

  93. Edit and save an article Delete the article's page cache
    With on-demand cache invalidation when an editor
    edits an article – you invalidate the relevant article
    page's cache.

  94. Edit and save an article Delete the article's page cache
    The article's title
    is also used on
    the homepage
    Delete the homepage's
    cache
    But if the data from that article is used in other
    places – you also need to invalidate those
    locations.

  95. Edit and save an article Delete the article's page cache
    The article's title
    is also used on
    the homepage
    Delete the homepage's
    cache
    Also used on:
    ● iPhone application
    ● RSS feeds
    ● Headlines page
    ● Headlines ticker
    ● ...
    Sometimes there are a lot of such locations and it's
    difficult to keep track.
    Also, you need to make sure you don't have any
    changes which happen frequently and cause too
    many cache invalidations, because that will make
    the cache useless.
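
One possible way to keep track of all those locations, offered here as an assumption rather than as what ynet did, is to record while rendering which content items each cached url used. When an item is edited, the index answers the question "which urls need to be invalidated?". The class name, item ids and urls below are invented for illustration.

```python
from collections import defaultdict


class InvalidationIndex:
    """Remembers which cached urls were built from which content items."""

    def __init__(self):
        self._urls_by_item = defaultdict(set)

    def record(self, url, item_ids):
        """Call while rendering a page: note every content item the page used."""
        for item_id in item_ids:
            self._urls_by_item[item_id].add(url)

    def urls_to_invalidate(self, item_id):
        """Return every cached url that shows data from the edited item."""
        return sorted(self._urls_by_item.get(item_id, ()))
```

For example, after the homepage, an article page and an rss feed have been rendered, editing that article yields all three urls.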

  96. For a news website the cache must be invalidated
    as quickly as possible. For example, if an editor
    makes a mistake like publishing a story which was
    not approved or using a photo which should be
    censored, it must be fixed immediately.
    For that reason, using a cdn is usually not an
    option: some cdns don't have the option to
    actively invalidate the cache, and even if they do
    have that option the cdn might take a lot of time
    to invalidate all its caches because the cache is
    distributed across data-centers.

  97. External cdn
    Ttl-based cache
    invalidation
    The solution that worked best for us is a
    combination of ttl and cache invalidation.
    We use a cdn for ttl-based cache invalidation with
    relatively low ttls.

  98. External cdn
    Ttl-based cache
    invalidation
    Internal caching layer
    on-demand cache invalidation
    Invalidation
    requests
    And an internal caching layer for on-demand cache
    invalidation – using something like varnish, which
    handles cache invalidation very fast.
    This way most requests are served from the cdn
    but some still reach our servers and they are
    handled by the on-demand caching layer.

  99. External cdn
    Ttl-based cache
    invalidation
    Internal caching layer
    on-demand cache invalidation
    There is still a problem that the first user that
    requests a resource will reach the application
    server, because the resource is not cached yet.

  100. External cdn
    Ttl-based cache
    invalidation
    Internal caching layer
    on-demand cache invalidation
    Stored in
    the cache
    Stored in
    the cache
    When this user's request is returned it is stored in
    the caches and the next user will get it from the
    cache.
    If it takes a long time to generate a page you may
    want to optimize even further by implementing
    cache warming.

  101. External cdn
    Ttl-based cache
    invalidation
    Internal caching layer
    on-demand cache invalidation
    Cache
    invalidation
    Request
    updated
    resource
    updated
    resource
    Stored in
    the cache
    Cache warming means that when the cache is
    invalidated the relevant page is actively
    re-generated, not waiting for the next request.
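
The difference between a plain cache and a warmed one can be shown with a small in-memory stand-in. This is only a sketch: `WarmingCache` and its methods are hypothetical names, and `render` stands for whatever slow code generates the page.

```python
class WarmingCache:
    """In-memory stand-in for a caching layer; render is the slow page generator."""

    def __init__(self, render):
        self.render = render
        self.store = {}

    def get(self, url):
        if url not in self.store:  # cache miss: the request reaches the application
            self.store[url] = self.render(url)
        return self.store[url]

    def invalidate_and_warm(self, url):
        """Regenerate the page right away instead of waiting for the next request."""
        self.store[url] = self.render(url)
```

In this single-process sketch the old copy is swapped for the new one in a single step. With a real caching server the invalidation and the re-request are separate steps.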

  102. External cdn
    Ttl-based cache
    invalidation
    Internal caching layer
    on-demand cache invalidation
    Cache
    invalidation
    Request
    updated
    resource
    updated
    resource
    Stored in
    the cache
    A possible problem with this is that users might
    interfere in this process. If a user requests a page
    after the cache was invalidated but before the
    updated response was cached, the request will still
    reach our application server. This might be a problem
    if a certain page takes an extremely long time to
    generate.

  103. External cdn
    Ttl-based cache
    invalidation
    disabled
    Internal caching layer
    on-demand cache invalidation
    active
    active
    active
    A solution to this problem is to do the cache
    warming one server at a time.
    When an invalidation is needed you can disable
    one caching server so it won't serve requests and
    then the invalidation and regeneration are performed
    on that server.

  104. External cdn
    Ttl-based cache
    invalidation
    disabled
    Internal caching layer
    on-demand cache invalidation
    active
    active
    active
    After the page was regenerated on that server it is
    re-enabled, and then another server is disabled
    and so on. After the first server the page doesn't
    need to regenerate because it can be copied from
    the first server.
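
The rolling process can be sketched as a loop over the caching servers. Everything here is a simplified assumption: `CacheServer` is an in-memory stand-in, the `active` flag represents whatever mechanism takes a server out of rotation, and `render` is the application's page generator.

```python
class CacheServer:
    """Minimal stand-in for one caching server behind a load balancer."""

    def __init__(self, name):
        self.name = name
        self.active = True
        self.store = {}


def rolling_refresh(servers, url, render):
    """Refresh one url on every server, with only one server disabled at a time."""
    fresh = None
    for server in servers:
        server.active = False  # stop routing user requests to this server
        try:
            if fresh is None:
                fresh = render(url)  # only the first server regenerates the page
            server.store[url] = fresh  # later servers receive a copy
        finally:
            server.active = True  # back in rotation with the updated page
    return fresh
```

The page is generated once no matter how many caching servers there are, and no user request can hit a server while its copy is being replaced.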

  105. So, the combination of ttl caching using a cdn with
    internal on-demand cache invalidation worked well for
    us. Only a very low percentage of requests reach our
    servers: most of them are handled by the cdn, and the
    few that do reach our servers are mostly handled by the
    internal caching.
    The cache warming might be a bit overkill, but it's
    useful if you have pages that take a long time to
    generate.

  106. THANK YOU
    FOR YOUR ATTENTION
    QUESTIONS?
