Cache strategies for microservice-based web apps

Cache strategies for microservice-based web apps Glen Campbell @glenc

Yes, I picked the dullest title ever

Buzzword Bingo: “microservice-based web app”
I sometimes hate phrases like “microservice-based web app” because it’s essentially a new phrase coined by some marketing drone for something that already exists.
Back in my day (you can tell that I’m a curmudgeon from that phrase, right?) we called it a “service-oriented architecture,” or SOA for short.
In essence, we’re talking about an application that uses services rather than local procedures. In other words, we’re running code on a separate computer, accessed over a network, so that we can spread out the workload.
In theory, this is a Very Good Thing because we’re not overloading a single computer, but allowing a highly distributed network of computers to take on various parts of the application.

In 2004, I went to work for Yahoo!, first for the Yahoo! News team and then as technical lead for Yahoo! Tech. This was the first “service-oriented architecture” site deployed at Yahoo!, and it was a huge learning experience.
Here’s what the home page looked like when we launched (by the way, one reviewer called this “an explosion in the Web 2.0 factory,” but that’s probably not relevant).
Everything you see here was provided by a backend service called over HTTP: a RESTful web service, if you will.
Some of those services were very simple (for example, one of them just served an ad unit), while others were the tip of a very complex iceberg. For example, the “new and notable” unit recategorized content every few minutes based on the number of links, views, and comments it received. This is the sort of work that you don’t want your web front-end to do, so it’s perfect for a backend service call.
More importantly, that backend data doesn’t change very often. I mentioned that the data is updated every few minutes: in between updates, there’s actually no need to hit the service; we can store a cached version of the data and re-use it.

“A web cache is a mechanism for the temporary storage (caching) of web documents, such as HTML pages and images, to reduce bandwidth usage, server load, and perceived lag. A web cache stores copies of documents passing through it; subsequent requests may be satisfied from the cache if certain conditions are met.” (Wikipedia)
When we first built Yahoo! Tech, we did not include any web caching. We thought it might be premature optimization, and we wanted to measure what benefit it actually gave us.
When we did our first load testing on the site, we achieved a max throughput of about 0.6 requests/second per server without caching.
We turned on the cache, and performance immediately improved to about 50 requests/second.
At that point, we were sold.

What is the most common type of web cache?
Answer: it’s in your browser. Every major browser respects the HTTP caching rules.

REST
• Client-server
• Stateless
• Cacheable
• Layered system
• Code on demand (optional)
• Uniform interface
I’m not going to go into a ton of detail on what REST is, nor will I get involved in some of the, er, excitable arguments around it. Let’s walk through these various components of the architectural style, however, just to refresh our memories.
Client-server: separate the user interface (the client) from the server
Stateless: no client context stored on the server between requests
Cacheable: clients can cache responses, so servers need to be clear on what can and cannot be cached
Layered system: a client cannot tell if it is connecting directly to the server, or through an intermediary; in other words, caches and proxies must be transparent (but only to the layers above them)
Code on demand: the server can transmit code that is executed on the client (JavaScript)
Uniform interface: resources are identified by URLs/URIs and accessed in a standard way

HTTP 1.1, revised
The basic mechanism of RESTful web services is defined by the HTTP 1.1 standard. For those of you who haven’t kept up with your required reading, the revised specification (RFCs 7230 through 7235, which replace RFC 2616) is an update of the standard specification for HTTP: it does not change the protocol or the version number, but clarifies a lot of the ambiguity in the original 1.1 spec, and makes standard some behaviors that have long been common on the web.

Relevant HTTP 1.1 Headers
• Age:
• Authorization:
• Cache-Control:
• Connection:
• ETag:
• Expires:
• If-Match:
• If-None-Match:
• If-Range:
• Pragma:
• Vary:
• Warning:
This looks like a lot; we’re not going to go over all of these in detail, but focus on a few that are important.

http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html    or    http://bit.ly/1p0KHQr
Ok, here’s the deal. Most of this is really, really tedious, but it’s also important. I’ll leave this link here so you can find it and bookmark it. All the details are there, and getting them right can really impact the performance of your web application.

Cache-Control: Requests
• no-cache
• no-store
• max-age={seconds}
• max-stale={seconds}
• min-fresh={seconds}
• no-transform
• only-if-cached

Cache-Control: Responses
• public
• private
• no-cache
• no-store
• no-transform
• must-revalidate
• proxy-revalidate
• max-age={seconds}
• s-maxage={seconds}
public - content may be stored by any cache, including shared proxies
private - content is for a specific user; that user’s own browser cache may store it, but shared caches (proxies, CDNs) may NOT
no-cache - content may be stored, but must be revalidated with the origin server before each reuse
no-store - do not store any part of the response; use this for sensitive or authenticated data
no-transform - some proxies will convert content (for example, between TIFF and JPEG); this directive tells the proxies to not do that
must-revalidate - once the content expires, the cache must not serve it again without successfully revalidating it with the origin server
proxy-revalidate - same thing, but it applies only to shared caches, not private ones
max-age - content is stale after {seconds}
s-maxage - for shared caches, overrides max-age and Expires: values; forces revalidation at expiration
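To make the response directives concrete, here is a minimal sketch of how a shared cache (a proxy) might read them. The function names are hypothetical, and the parser is simplified: it assumes no commas inside quoted arguments and treats any form of private as "do not store," which a real cache would handle more carefully.

```python
from typing import Dict, Optional


def parse_cache_control(value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control header into {directive: argument-or-None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def shared_cache_lifetime(value: str) -> Optional[int]:
    """Seconds a shared cache may reuse the response without revalidating.

    Returns None when a shared cache must not store the response at all.
    """
    d = parse_cache_control(value)
    if "no-store" in d or "private" in d:
        return None
    if "no-cache" in d:
        return 0  # storable, but must be revalidated before every reuse
    # For shared caches, s-maxage takes precedence over max-age.
    for name in ("s-maxage", "max-age"):
        if d.get(name) is not None:
            return int(d[name])
    return 0  # no explicit lifetime: this sketch treats it as already stale
```

For example, shared_cache_lifetime("max-age=300, s-maxage=60") gives 60, while a browser cache reading the same header would use the 300-second max-age.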

ETag:
• Don’t use them
• Generation is not specified by the HTTP standard, and is often not consistent across a cluster.
• Error-prone and can be used to track users who refuse cookies.
• Turn them off; don’t use them

Expires:
• Indicates when the resource is stale.
• Specifies a date/time rather than delta seconds (Cache-Control: max-age=S)
• Mostly used for compatibility with HTTP 1.0; Cache-Control: is more semantically rich.
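As a rough illustration of the difference, the sketch below emits both headers for the same lifetime: Cache-Control carries a delta in seconds, while Expires carries an absolute HTTP date. The helper name expiry_headers is made up for this example; format_datetime comes from Python's standard email.utils module.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def expiry_headers(max_age: int, now: datetime) -> dict:
    """Build equivalent Cache-Control and Expires headers.

    `now` must be timezone-aware and in UTC so that the HTTP date can be
    rendered with the "GMT" suffix.
    """
    expires_at = now + timedelta(seconds=max_age)
    return {
        "Cache-Control": f"max-age={max_age}",                # delta seconds
        "Expires": format_datetime(expires_at, usegmt=True),  # absolute date
    }


headers = expiry_headers(300, datetime(2015, 1, 1, tzinfo=timezone.utc))
# headers["Expires"] is "Thu, 01 Jan 2015 00:05:00 GMT"
```

When both headers are present, an HTTP 1.1 cache uses max-age and ignores Expires, so sending both mainly helps older HTTP 1.0 clients.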

Extensions
• Cache-Control: max-age={s}, stale-while-revalidate={s}
• Cache-Control: max-age={s}, stale-if-error={s}
These let a cache keep serving stale content while it refreshes in the background or, with stale-if-error, permit your continued operation in the event of a backend failure.
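The decision a cache makes with these extensions can be sketched as a small function. This is an illustration only: the function name and the returned labels are invented for this example, and the second directive is spelled stale-if-error, which is the name RFC 5861 gives it.

```python
def cache_decision(age: int, max_age: int,
                   stale_while_revalidate: int = 0,
                   stale_if_error: int = 0,
                   backend_ok: bool = True) -> str:
    """Decide what a cache does with a stored response of the given age.

    All values are in seconds. The returned labels are illustrative.
    """
    if age < max_age:
        return "serve-fresh"
    staleness = age - max_age
    if backend_ok:
        if staleness <= stale_while_revalidate:
            # Answer immediately from the cache; refresh in the background.
            return "serve-stale-and-revalidate"
        return "revalidate-then-serve"
    if staleness <= stale_if_error:
        # Backend is failing: a stale answer beats an error page.
        return "serve-stale"
    return "return-error"
```

With max_age=60 and stale_if_error=3600, a response that is ten minutes old is still served while the backend is down, which is the continued operation the slide refers to.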

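?
Did you know that, according to the original HTTP 1.1 spec (RFC 2616, section 13.9), a cache should not treat a response for a URL with a query string as fresh unless the server supplies an explicit expiration time?
Luckily, most caches are more lenient than that.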

Is data cacheable?
• Highly cacheable data: news stories, blog posts, aggregated data such as ratings or reviews (“likes”).
• Uncacheable: secure, private, personal data such as user login information, credit card info, etc. Also data that must change rapidly: stock quotes, for example, or health monitoring systems.

Cache Architectures

Example 1. No cache
[Diagram: Web Server, Service]
In this simple example, the web server (frontend) calls the service directly, with no intermediary.

Example 2. Shared Cache
[Diagram: Web Server ×2, Cache (Proxy), Service]
This is a more complex example, with multiple web servers using a shared cache/proxy to access the service. You can expect substantially higher performance with this architecture (always assuming that some of your data is cacheable).

Example 3. Distributed Cache
[Diagram: Web Server ×4, Cache (Proxy) ×2 linked by ICP, Service]
Using multiple cache systems provides redundancy and reduces load. ICP between them can ensure consistency, if that’s supported. Note that, in the real world, there will probably be a much higher ratio of web servers to caches.

Example 4. Local+Remote Cache
[Diagram: Web Server ×4, each with a local cache; Cache (Proxy) ×2 linked by ICP; Service]
Experiments have shown this to have about 20% higher performance than the previous example, since locally-cached data does not require network access. This might not be suitable for compute-heavy applications.
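The local+remote lookup order can be sketched as follows. This is a toy model under stated assumptions: plain dictionaries stand in for both the per-server local cache and the shared proxy, expiry is left out entirely, and the class and parameter names are hypothetical.

```python
from typing import Callable, Dict


class TwoTierCache:
    """Check the local cache first, then the shared cache, then the service."""

    def __init__(self, shared: Dict[str, str], fetch: Callable[[str], str]):
        self.local: Dict[str, str] = {}  # in-process; no network hop
        self.shared = shared             # stand-in for the shared cache/proxy
        self.fetch = fetch               # stand-in for the backend service call

    def get(self, key: str) -> str:
        if key in self.local:
            return self.local[key]
        if key in self.shared:
            value = self.shared[key]
        else:
            value = self.fetch(key)
            self.shared[key] = value
        # Remember the value locally so the next request skips the network.
        self.local[key] = value
        return value
```

Two web servers built on the same shared dictionary would each hit the backend at most once per key between them, and each would answer repeat requests from its own local copy.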

HTTP Proxies

Squid
• Old, venerable; the reference implementation for the HTTP standard
• Single-threaded
• Can be tricky to configure (a multitude of options) but very high-performance
• Implements ICP (Internet Cache Protocol) for distributed and hierarchical caches

Varnish
• More modern implementation than Squid; relies on virtual memory and multi-threaded access
• Easier to set up and configure than Squid
• Does not support ICP or cache hierarchies

nginx
• reverse proxy and webserver - does not need a separate web server process
• great for static content, according to users
• uses asynchronous sockets; one process per core architecture

Manual Caching

DIY caching
• Tools let you build your own cache system.
• Not transparent, but can build transparency.
• Most are simple key/value stores
• Requires writing code
Understand that anything done manually is not transparent itself; however, it can be used to build a transparent layer in an application stack.

DIY cache example
• Object retrieval interface fetches data from service.
• Internal methods query the data store (memcached, Redis) first and use stored data if possible.
• If data is not in the cache, fetch it from the backend service and store it in the cache.
All of this can be hidden under a data retrieval interface so that the application developer doesn’t need to know about it.
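The steps above can be sketched in a few lines. This is a minimal illustration, not a production design: an in-memory dictionary stands in for memcached or Redis, and the class name, the ttl parameter, and the injectable clock are all assumptions made for this example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachedRetriever:
    """Data retrieval interface that hides the cache from the caller."""

    def __init__(self, fetch: Callable[[str], Any], ttl: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch    # call to the backend service
        self._ttl = ttl        # seconds a stored value stays usable
        self._clock = clock    # injectable so the expiry logic can be tested
        self._store: Dict[str, Tuple[float, Any]] = {}  # key -> (expiry, value)

    def get(self, key: str) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]            # cache hit: no service call
        value = self._fetch(key)       # miss or expired: go to the service
        self._store[key] = (now + self._ttl, value)
        return value
```

The application only ever calls get(); whether the answer came from the store or from the service is invisible to it, which is the transparency the slide describes.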

Upsides for DIY caching
• Provides a very clean programmatic interface (transparent at the application level)
• Can be tailored to specific solutions where you understand the data.
• Often very high performance

Downsides to DIY caching
• Requires code to be written, tested, etc.
• Requires code maintenance if the underlying data model is changed.
• Not standardized like HTTP for specifying age, freshness of data (i.e., not a generic solution, but a custom one)

Edge Caching

What is an “edge cache?”
• A content delivery network (CDN) that holds static content on the “edges” of the Internet
• Akamai is the biggest, but there are others: Limelight, Microsoft Azure, Amazon CloudFront
• Stores static content in multiple data centers
• Content like JavaScript, CSS, images, and other media
By storing content close to the end user, it reduces latency and removes load from the main provider.

How does a CDN work?
• Primary site (www.example.com) serves the HTML page.
• <script> <style> <img> etc. tags reference static content on the CDN
• The user’s browser loads (and often stores) the static content locally, because it’s served with a Cache-Control: max-age=32767 header.
For large, content-heavy sites, as much as 90% of their traffic is served by the CDN. CDNs also improve reliability because most of them are serviced by multiple providers.

@glenc  http://glencampbell.co http://developer.rackspace.com
