03. April 2023

Removing cached content in Varnish Cache

Content caching is the easy part; cache invalidation is where the rubber meets the road.

There are three ways to remove cached content from Varnish from within your application:

Simple "purge" for single URLs
"Ban" for multiple URLs
Remove by cache tags

Each has its upsides and downsides, and depending on your system and needs you may choose one or the other.

Invalidating cache for a single URL using "purge"

The simplest way to invalidate cached content is "purging": from your application, request the URL you want to invalidate using a custom HTTP method such as "PURGE".

In your Varnish configuration you would need to set up something like this:

vcl 4.1;

acl purge {
    "127.0.0.1";
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (client.ip !~ purge) {
            return (synth(405, "Method Not Allowed"));
        }

        return (purge);
    }
}

In Laravel you would need to send a purge request directly from your application on the very same server like this:

\Http::send('PURGE', 'http://mydomain.com/news');

The "acl purge" block is an access control list (ACL) named "purge". It lists every IP address that is allowed to send purge requests. You should restrict this list, in most cases to the server your application runs on, since that is the one performing the purges.

More information on cache invalidation using purge can be found in the docs or in the developer tutorial on banning content.

Refresh immediately rather than purging

Rather than purging a specific URL, you can also send a "refresh" request, which puts a fresh version of the targeted URL straight back into the Varnish cache.

In your Varnish configuration you would need to set up something like this:

vcl 4.1;

acl purge {
    "127.0.0.1";
}

sub vcl_recv {
    if (req.method == "REFRESH") {
        if (client.ip !~ purge) {
            return (synth(405, "Method Not Allowed"));
        }

        set req.method = "GET";
        set req.hash_always_miss = true;
    }
}

The value of hash_always_miss is that even though Varnish may find a cache entry for the requested URL, it will still act as if it's a miss, fetch a new version from the backend, and, if the response headers allow it, cache it.

In Laravel you would need to send a refresh request directly from your application on the very same server like this:

\Http::send('REFRESH', 'http://mydomain.com/news');

Since "REFRESH" is a custom HTTP method, in Varnish you only need to switch it back to "GET" and you are good to go.

Invalidating cache for wildcard URLs using "ban"

If you update the title of a news article, you naturally want that title updated on the article's detail page as well as in the list view of all news. Most likely the title change also changes the article's URL, so you would have to store the former URL somewhere in order to invalidate it later. Or you can simply wipe out all news content at once, which is also beneficial when other news articles or a paginated news list reference the article.

In your Varnish configuration you would need to set up something like this:

vcl 4.1;

acl purge {
    "127.0.0.1";
}

sub vcl_recv {
  if (req.method == "BAN") {
    if (client.ip !~ purge) {
      return (synth(405));
    }

    if (!req.http.x-cache-invalidation-pattern) {
      return (synth(403, "x-cache-invalidation-pattern header missing."));
    }

    # Ban every cached object on this host whose stored URL matches the pattern.
    ban("obj.http.x-url ~ " + req.http.x-cache-invalidation-pattern + " && obj.http.x-host == " + req.http.host);

    return (synth(200, "Ban added"));
  }

  # All other requests continue with normal request handling.
}

sub vcl_backend_response {
    set beresp.http.x-url = bereq.url;
    set beresp.http.x-host = bereq.http.host;
}

sub vcl_deliver {
    unset resp.http.x-url;
    unset resp.http.x-host;
}

For this to work, Varnish has to store the URL and the host in separate headers on each cached object, so that it can compare them later against the ban expression. The vcl_deliver subroutine removes these headers again before the response is sent to the visitor.

In Laravel you would need to send a ban request with an invalidation pattern like "/news" directly from your application on the very same server like this:

\Http::withHeaders(['x-cache-invalidation-pattern' => '/news'])->send('BAN', 'http://mydomain.com');

With this request, every cached URL matching the pattern "/news" is banned. The next request for such a URL is forwarded to the backend, and the visitor is not sent any cached content, whether fresh or stale.
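
Because the "~" operator treats the pattern as a regular expression, an unanchored "/news" matches more than the news section. The following is a minimal sketch for trying patterns locally with PHP's PCRE functions. The helper name banPatternMatches and the sample URLs are made up for illustration, and PHP only approximates what Varnish does, so treat the result as a hint rather than a guarantee.

```php
<?php

/**
 * Approximate how a ban pattern is matched against a stored URL.
 * Hypothetical helper for local experiments, not part of Varnish or Laravel.
 */
function banPatternMatches(string $pattern, string $url): bool
{
    // "~" is used as the regex delimiter so that "/" needs no escaping.
    return preg_match('~' . $pattern . '~', $url) === 1;
}

// Sample URLs (assumptions for this sketch).
$urls = ['/news', '/news/42', '/news?page=2', '/newsletter', '/archive/news-2022', '/blog'];

// Compare the unanchored pattern with one anchored to the news section.
foreach (['/news', '^/news([/?]|$)'] as $pattern) {
    $hits = array_values(array_filter(
        $urls,
        fn (string $url): bool => banPatternMatches($pattern, $url)
    ));

    echo $pattern, ' => ', implode(', ', $hits), PHP_EOL;
}
```

The anchored variant only hits "/news" itself, its sub-paths, and its query-string variants, while the plain "/news" also catches "/newsletter" and "/archive/news-2022".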

More information on cache invalidation using ban can be found in the developer tutorial on banning content.

Cache Tags

The problem with clearing cache via direct url or url pattern is: it assumes the targeted piece of content is only on those pages. But what if you have a blog which references a news article and that news article is updated or even deleted? What if you have a news article with referenced products, and that list of products gets updated? Maybe even in several places? What if you have a product detail page which references related products, and one of those products changes? Good luck keeping track of that within your application to purge all cached content.

But there's a better way: cache tags.

Simple cache tags solution

For each cache object in Varnish you can save additional information (like previously seen with x-url and x-host). This time you send additional "tags" with your response to tag all used content.

In Laravel you could send the response back with cache tags like this:

public function show() {
  return response($content)->withHeaders(['x-cache-tags' => '1:news,2:news,10:news,3:blog,5:product']);
}

In your Varnish configuration you would need to set up something like this:

vcl 4.1;

acl purge {
    "127.0.0.1";
}

sub vcl_recv {
  if (req.method == "BAN") {
    if (client.ip !~ purge) {
      return (synth(405));
    }

    if (!req.http.x-purge-cache-tags) {
      return (synth(403, "x-purge-cache-tags header missing."));
    }

    # Ban every cached object whose stored tag header matches the pattern.
    ban("obj.http.x-cache-tags ~ " + req.http.x-purge-cache-tags);

    return (synth(200, "Ban added"));
  }

  # All other requests continue with normal request handling.
}

sub vcl_deliver {
    unset resp.http.x-cache-tags;
}

To remove cached content based on tags via Laravel, you would need to send a ban request with an x-purge-cache-tags pattern like "1:news" directly from your application on the very same server like this:

\Http::withHeaders(['x-purge-cache-tags' => '1:news'])->send('BAN', 'http://mydomain.com');

There are some drawbacks to this solution:

You should write 1:news instead of news:1 (context:unique-id), because otherwise banning "news:1" would ban both "news:1" and "news:10". Keep in mind that the pattern is an unanchored regex, so "1:news" still matches "11:news" or "21:news"; a slightly more complex regex that respects the separators avoids both problems.
You need an even more complex regex pattern to ban pages by multiple cache tags at once.
Pages affected by these bans are invalidated immediately. On high-traffic sites this can be an issue, because visitors have to wait for the backend to respond again and the configured grace time is not taken into account.
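
The first two drawbacks can be softened by generating the pattern instead of writing it by hand. The following is a minimal sketch, assuming the tag header is comma-separated without spaces as in the example above. The helpers tagBanPattern and headerMatches are hypothetical names, the tag character whitelist is an assumption chosen to keep the ban expression free of spaces and regex metacharacters, and the matching is done with PHP's PCRE functions rather than Varnish itself.

```php
<?php

/**
 * Build a separator-aware regex matching any of the given cache tags.
 * Hypothetical helper; assumes a comma-separated tag header without spaces.
 */
function tagBanPattern(array $tags): string
{
    foreach ($tags as $tag) {
        // Conservative whitelist (an assumption): no spaces, no regex metacharacters.
        if (preg_match('/^[A-Za-z0-9:_-]+$/', $tag) !== 1) {
            throw new InvalidArgumentException("Unsafe cache tag: {$tag}");
        }
    }

    // A tag must start at the beginning or after a comma,
    // and end at the end or before a comma.
    return '(^|,)(' . implode('|', $tags) . ')(,|$)';
}

/** Approximate the "~" comparison against a stored tag header. */
function headerMatches(string $pattern, string $header): bool
{
    return preg_match('~' . $pattern . '~', $header) === 1;
}

// Unanchored patterns match neighbouring ids as well.
var_dump(headerMatches('news:1', 'news:10,blog:3'));
var_dump(headerMatches('1:news', '11:news,3:blog'));

// The generated pattern only matches whole tags, and handles several at once.
var_dump(headerMatches(tagBanPattern(['1:news']), '11:news,3:blog'));
var_dump(headerMatches(tagBanPattern(['1:news', '3:blog']), '7:product,3:blog'));
```

The generated string would be sent as the x-purge-cache-tags header in place of the plain "1:news". It does not change the third drawback: bans still take effect immediately and ignore grace.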

But there is a better way even for this: xkey.

xkey to the rescue

xkey is a VMOD (an extension written for Varnish Cache) which handles cache tags out of the box in a very nice way. You can define your cache tags, space or comma separated, in a header called "xkey", or copy an existing cache-tags header into xkey in your vcl_backend_response subroutine.

xkey is part of the Varnish module collection by Varnish Software. Once installed, you integrate it simply by importing it, as in the example config below.

xkey supports two ways of purging content: normal "purge" and "softpurge". The normal purge invalidates content immediately, and any visitor to an affected page has to wait until the backend responds with fresh content. "softpurge", on the other hand, simply sets the time-to-live (TTL) of the targeted cache objects to 0. If you have grace time configured, these objects enter the grace period: visitors keep receiving the stale copy while a fresh one is fetched in the background.

Here is an example config:

vcl 4.1;

import std;
import xkey;

acl purge {
    "127.0.0.1";
}

sub vcl_recv {
  if (req.method == "PURGE") {
    if (client.ip !~ purge) {
      return (synth(405, "Method not allowed"));
    }

    if (!req.http.x-purge-cache-tags) {
      return (synth(403, "x-purge-cache-tags header missing."));
    }

    set req.http.x-purges = xkey.purge(req.http.x-purge-cache-tags);

    if (std.integer(req.http.x-purges, 0) != 0) {
      return(synth(200, req.http.x-purges + " objects purged"));
    } else {
      return(synth(404, "Key not found"));
    }
  }

  if (req.method == "SOFTPURGE") {
    if (client.ip !~ purge) {
      return (synth(405, "Method not allowed"));
    }

    if (!req.http.x-purge-cache-tags) {
      return (synth(403, "x-purge-cache-tags header missing."));
    }

    set req.http.x-purges = xkey.softpurge(req.http.x-purge-cache-tags);

    if (std.integer(req.http.x-purges, 0) != 0) {
      return(synth(200, req.http.x-purges + " objects purged"));
    } else {
      return(synth(404, "Key not found"));
    }
  }
}

sub vcl_backend_response {
  if (beresp.http.x-cache-tags) {
    # Copy the tags sent by your backend into the xkey header.
    set beresp.http.xkey = beresp.http.x-cache-tags;
    unset beresp.http.x-cache-tags;
  }
}
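
On the application side, the invalidation request could look much like the earlier ban examples. The following is a minimal sketch, assuming the PURGE and SOFTPURGE methods and the x-purge-cache-tags header are wired up as in the config above. The helper xkeyInvalidation is a hypothetical name, and the Laravel call is shown only as a comment so that the snippet runs without the framework.

```php
<?php

/**
 * Describe the invalidation request for a set of cache tags.
 * Hypothetical helper; method and header names follow the VCL in this article.
 */
function xkeyInvalidation(array $tags, bool $soft = true): array
{
    return [
        // A soft purge lets grace apply, a normal purge removes objects at once.
        'method'  => $soft ? 'SOFTPURGE' : 'PURGE',
        // Several keys are passed in one string, separated by spaces.
        'headers' => ['x-purge-cache-tags' => implode(' ', $tags)],
    ];
}

$request = xkeyInvalidation(['1:news', '3:blog']);

// In Laravel this could then be sent like the earlier examples:
// \Http::withHeaders($request['headers'])->send($request['method'], 'http://mydomain.com');
```

Defaulting to the soft variant is a design choice of this sketch: it keeps visitors on the stale copy while Varnish refreshes in the background, and the hard purge stays available for content that must disappear at once.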

Max header size issues with Nginx/Apache

Depending on your content and the number of cache tags used, you could run into issues with your webserver. For example, Nginx does not like headers that are bigger than 8 KB. So if your Varnish backend is an Nginx server which routes PHP requests to PHP-FPM, you need to keep an eye on this.
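
A rough way to keep an eye on this is to measure the tag header before sending it. The following is a minimal sketch: the helpers cacheTagHeaderBytes and cacheTagHeaderFits are hypothetical names, the 8192-byte default is simply the 8 KB figure mentioned above, and only this one header line is counted, whereas a real server limit usually covers all response headers together.

```php
<?php

/**
 * Estimate how many bytes a cache-tag header line occupies.
 * Hypothetical helper; counts "Name: value" plus the trailing CRLF.
 */
function cacheTagHeaderBytes(array $tags, string $name = 'x-cache-tags'): int
{
    return strlen($name . ': ' . implode(',', $tags) . "\r\n");
}

/**
 * Check the header line against a limit.
 * The 8192-byte default is an assumption; check your own server settings.
 */
function cacheTagHeaderFits(array $tags, int $limit = 8192): bool
{
    return cacheTagHeaderBytes($tags) <= $limit;
}

// Example: a listing page that references 2,000 products.
$tags = array_map(fn (int $id): string => $id . ':product', range(1, 2000));

echo cacheTagHeaderBytes($tags), ' bytes, fits: ';
echo var_export(cacheTagHeaderFits($tags), true), PHP_EOL;
```

If the check fails, shorter tag names or fewer tags per page are the obvious levers.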

Click here for more information on max header sizes on different servers.