Caching means storing items in a temporary location, or cache, so that they can be accessed more easily later. The term cache in this sense is general. For instance, the members of the first successful expedition to the South Pole stacked, or cached, food supplies along the route on their way there, so that they could consume them on the return journey. Such caching was far more practical, and possibly the only feasible option, compared with having the supplies delivered from their base camp.
In computing, caching serves a similar purpose: when data is needed, it has to be fetched from storage to the place of processing. Since such transfers are generally slow, we often cache a copy of the data next to the processing unit once it has been transferred there for the first time. If it turns out that we need it again later, we save a round-trip to storage, which generally yields a considerable reduction in latency.
Examples in computing include caching memory blocks closer to the CPU so that we save round-trips between the CPU and RAM; caching domain name resolutions in DNS resolvers so that we save (multiple) round-trips between the resolver and the root, top-level domain, and authoritative name servers; and caching HTTP content in a CDN so that clients can load content quickly without having to fetch it from origin servers.
In this article we focus on HTTP and CDN caching in particular.
HTTP has two types of caches: private and shared. A private cache is specific to a particular client, such as a web browser. Private caches can store client-specific content that may contain sensitive information.
A shared cache, on the other hand, resides between the client and the origin server and stores content that is shared among multiple clients; such caches must not store sensitive (personal) content. The most common type of shared cache is a managed cache. Managed caches are set up on purpose to reduce the load on the origin server and to improve content delivery. Examples include reverse proxies, CDNs, and service workers in combination with the Cache API.
Managed caches are called managed because they allow cached content to be managed explicitly, through administration panels, service calls, or a similar mechanism. This is in contrast to other types of caches, which the server controls only through the Cache-Control response header field.
The following are a few example capabilities one may find in a managed cache:
When content is requested for the first time, the request is sent to the origin server, which returns the requested resources. For every resource it returns, the origin server sets a caching policy that states whether the resource can be cached, where, and for how long.
If a request is made for a resource that is already in a cache, the cached copy is used, either from a private cache in the browser or from a shared cache in the CDN. In the first case the HTTP request is avoided entirely; in the second, the request goes to a CDN server that is closer than the origin server and can respond faster.
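As a minimal sketch of how a cache might read such a policy, the snippet below interprets a Cache-Control header value. The function names `parse_cache_control` and `describe_policy` are hypothetical; the directives it looks at (`no-store`, `private`, `max-age`, `s-maxage`) are standard Cache-Control directives, but many others (for example `no-cache` and `must-revalidate`) are deliberately ignored here.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control value into a {directive: value-or-None} mapping."""
    directives = {}
    for part in header_value.split(","):
        name, _, value = part.strip().partition("=")
        if name:
            directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def describe_policy(header_value):
    """Answer: can the response be cached, where, and for how long (seconds)?"""
    d = parse_cache_control(header_value)
    # "no-store" forbids any cache from keeping a copy.
    storable = "no-store" not in d
    # "private" restricts storage to private caches such as the browser.
    shared_allowed = storable and "private" not in d
    max_age = int(d["max-age"]) if d.get("max-age") else None
    s_maxage = int(d["s-maxage"]) if d.get("s-maxage") else None
    return {
        "storable": storable,
        "shared_cache_allowed": shared_allowed,
        "private_lifetime_s": max_age if storable else None,
        # In shared caches, s-maxage takes precedence over max-age.
        "shared_lifetime_s": (
            (s_maxage if s_maxage is not None else max_age)
            if shared_allowed else None
        ),
    }


if __name__ == "__main__":
    print(describe_policy("public, max-age=600, s-maxage=3600"))
```

A real cache has to handle more cases than this, such as responses without any Cache-Control header, so treat the sketch as an illustration of the "whether, where, how long" question rather than a complete parser.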
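The lookup order described above can be sketched with plain dictionaries standing in for the caches and the origin server. Everything here is hypothetical (the `fetch` function, the dictionary-based caches, the `/logo.png` path), and the sketch ignores freshness and caching policy entirely; it only shows which tier ends up answering a request.

```python
def fetch(url, private_cache, shared_cache, origin):
    """Return (body, source), trying the closest cache first."""
    # 1. Private (browser) cache: no HTTP request leaves the client.
    if url in private_cache:
        return private_cache[url], "private cache"
    # 2. Shared (CDN) cache: a request is sent, but only to a nearby server.
    if url in shared_cache:
        body, source = shared_cache[url], "shared cache"
    else:
        # 3. Origin server: the slowest path; the CDN keeps a copy on the way back.
        body, source = origin[url], "origin"
        shared_cache[url] = body
    # The client keeps its own copy for subsequent requests.
    private_cache[url] = body
    return body, source


if __name__ == "__main__":
    origin = {"/logo.png": b"image-bytes"}
    cdn = {}
    browser_a, browser_b = {}, {}
    print(fetch("/logo.png", browser_a, cdn, origin)[1])
    print(fetch("/logo.png", browser_b, cdn, origin)[1])
    print(fetch("/logo.png", browser_a, cdn, origin)[1])
```

The first client pays for the full trip to the origin, a second client benefits from the shared copy, and a repeat request from the first client never leaves the browser.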
A resource in cache is in one of two states: either fresh or stale. A fresh resource is valid and can be used immediately while a stale resource has expired and needs to be validated.
The driving factor in deciding whether a resource is stale is its age. In HTTP, this can be established either by examining the time at which it was fetched or, even simpler, by inspecting the resource’s version number.
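A time-based freshness check can be sketched as follows. This is a simplification under stated assumptions: the cache records when it stored the response (`stored_at`), the allowed lifetime comes from a `max-age` value in seconds, and details such as the Age header added by intermediate caches are ignored. The function name `is_fresh` is hypothetical.

```python
import time


def is_fresh(stored_at, max_age_s, now=None):
    """Return True while the cached copy is younger than its allowed lifetime."""
    if now is None:
        now = time.time()
    age_s = now - stored_at
    # The copy is fresh only while its age is strictly below the lifetime.
    return age_s < max_age_s


if __name__ == "__main__":
    stored_at = time.time() - 30  # stored 30 seconds ago
    print(is_fresh(stored_at, max_age_s=60))
    print(is_fresh(stored_at, max_age_s=10))
```

Passing `now` explicitly keeps the function deterministic, which is convenient when testing expiry behaviour.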
When a cached resource becomes stale, we do not discard it immediately. Instead, its freshness can be restored by asking the origin server whether the resource has changed. This is achieved with a conditional HTTP request that contains an extra header specifying either i) the date when the cached resource was last modified (sent in If-Modified-Since), or ii) the version of the cached content (an ETag value sent in If-None-Match).
If it turns out that the stale resource did not change, the origin server responds with the response code 304 Not Modified; this is the origin server’s feedback stating that the cached version is up to date. Note that this is a very short message: 304 Not Modified is a header-only response; it contains no body. But if the resource has changed, the server returns a normal 200 OK response whose body contains the updated resource.
This process is called validation, or sometimes revalidation.
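The validation exchange can be sketched without any networking by simulating the origin server as a function. This sketch uses the version-based variant: the version is an ETag, which the cache sends back in the If-None-Match request header. The helper names (`make_etag`, `origin_respond`, `revalidate`) are hypothetical, and deriving the ETag from a hash of the body is just one possible choice made for illustration.

```python
import hashlib
from typing import Optional


def make_etag(body: bytes) -> str:
    """Derive a version identifier from the content (an assumption of this sketch)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def origin_respond(current_body: bytes, if_none_match: Optional[str]):
    """Simulated origin: return (status, headers, body) for a conditional request."""
    etag = make_etag(current_body)
    if if_none_match == etag:
        # Unchanged: header-only response, no body is transferred.
        return 304, {"ETag": etag}, b""
    # Changed (or unconditional request): send the full resource.
    return 200, {"ETag": etag}, current_body


def revalidate(cached_body: bytes, cached_etag: str, current_body: bytes):
    """Cache side: refresh a stale entry and return the (body, etag) to keep."""
    status, headers, body = origin_respond(current_body, cached_etag)
    if status == 304:
        return cached_body, cached_etag  # reuse the stored copy
    return body, headers["ETag"]  # replace it with the new version


if __name__ == "__main__":
    old = b"<h1>Hello</h1>"
    etag = make_etag(old)
    print(origin_respond(old, etag)[0])          # resource unchanged
    print(origin_respond(b"<h1>Hi</h1>", etag)[0])  # resource changed
```

The saving comes from the unchanged case: the cache gets its answer from a response without a body and keeps serving the copy it already holds.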
HTTP is the protocol that web browsers use to connect to web servers and request content to view. It is also used to transfer larger files, and is often used for software updates.
A CDN, or "Content Delivery Network," is a network of servers (typically placed around the world) used for the purpose of delivering content (videos, photos, CSS, etc.).