Proper basic auth with Varnish Cache

Caching and everything else should work behind basic auth too.

In many default/example Varnish Cache configurations you see the following bit of config, usually somewhere at the top:

sub vcl_recv {
    if (req.http.Authorization) {
        return (pass);
    }
}

That means: if a visitor submits the "Authorization" header, Varnish skips the cache lookup and passes the request directly to the backend. It also means that the caching parts of the configuration are never exercised, so they cannot be tested properly. If you are operating a staging server, which is usually protected by basic authorization, this needs to be fixed.

Why is it that way?

Usually it is because Varnish passes the request over to Apache/Nginx, which handles basic auth. At that point Varnish does not know anything about basic auth. The backend (from Varnish's perspective) responds with a 401 Unauthorized status code, which Varnish won't cache anyway. Then the visitor types in the basic auth credentials and the browser resubmits the request with them. Only at that point does Varnish know: hey, there is an Authorization header, uh-oh, I should not cache that, otherwise other visitors will see the page without basic auth.

What's a better solution?

One nice solution is to move basic auth handling entirely over to Varnish. That way, either for certain pages or for the whole server, Varnish first responds with a 401 message requesting basic auth credentials. Upon successful validation it proceeds with the remaining logic. Since Varnish now knows about basic auth and validates the credentials itself, it can safely respond with cached items.

Show me the code

sub vcl_recv {
    # right at the top
    call custom_basicauth;

    # if credentials match we move forward to this point where Varnish can look up cached content.
    return (hash);
}

sub custom_basicauth {
    # for generating credentials: echo -n user:password | base64
    # note: the Host header is read via req.http.host (there is no req.host variable)
    if ((req.url ~ "/admin" || req.http.host ~ "internal-domain") && req.http.Authorization !~ "Basic bmljZTp0cnk=") {
        return (synth(401));
    }
}

sub vcl_synth {
    if (resp.status == 401) {
        set resp.http.Content-Type = "text/html; charset=utf-8";
        set resp.http.WWW-Authenticate = "Basic realm=PROTECTED";

        # synthetic() is VCL 4.0 syntax; under "vcl 4.1;" the body is
        # assigned with "set resp.body = ...;" instead.
        synthetic({"
          <!doctype html>
          <html>
            <head>
              <meta charset="utf-8">
              <title>Error</title>
            </head>
            <body>
              <h1>401 Unauthorized</h1>
            </body>
          </html>
        "});

        return (deliver);
    }
}

Right from the beginning we check whether the URL belongs to our admin pages or the request targets an internal host. If it does and the credentials are missing or wrong, we respond with a synthetic 401 response asking for proper credentials.
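One possible refinement, sketched here under the assumption that the backend itself does not need the credentials: once Varnish has validated the Authorization header, the header can be removed so it is never forwarded on a cache miss. The sub name, URL pattern, host pattern and credential string are the ones from the example above; the nested `if` and the `unset` are additions for illustration only.

```vcl
sub custom_basicauth {
    if (req.url ~ "/admin" || req.http.host ~ "internal-domain") {
        if (req.http.Authorization !~ "Basic bmljZTp0cnk=") {
            # Missing or wrong credentials: ask the client to authenticate.
            return (synth(401));
        }

        # Credentials were accepted by Varnish. Assumption: the backend
        # does not need them, so drop the header before any fetch.
        unset req.http.Authorization;
    }
}
```

If the backend performs its own authentication or reads the user name from this header, leave the `unset` out.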

Storing credentials in environment variables

If you want to store your basic auth credentials in an environment variable instead of hardcoding them in your Varnish configuration, you can do it this way:

vcl 4.1;

import std;

# ...

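sub vcl_recv {
    call custom_basicauth;

    # if credentials match we move forward to this point where Varnish can look up cached content.
    return (hash);
}

sub custom_basicauth {
    # build the expected header value once; APP_BASIC_AUTH holds the base64-encoded "user:password"
    set req.http.basic_auth_hash = "Basic " + std.getenv("APP_BASIC_AUTH");

    # the Host header is read via req.http.host (there is no req.host variable)
    if ((req.url ~ "/admin" || req.http.host ~ "internal-domain") && req.http.Authorization != req.http.basic_auth_hash) {
        return (synth(401));
    }

    # remove the temporary header so it is not sent to the backend
    unset req.http.basic_auth_hash;
}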

First we need to build the complete Authorization header string that we compare against. Concatenating strings inline and then matching the result as a regex is something Varnish does not allow, so the expected value is stored in a temporary request header and compared with `!=`. The base64-encoded credentials themselves come from the "APP_BASIC_AUTH" env var. If the request is outside of the protected pages, or the credentials matched, we unset the temporary header to avoid leaking the credentials to the backend server.
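A case the comparison above does not cover explicitly is a missing or empty "APP_BASIC_AUTH" variable. The following is a minimal sketch of a fail-closed variant, assuming that an unset variable leaves the temporary header holding only the word "Basic" (possibly followed by whitespace); the 503 status is an arbitrary choice made for this illustration, not something the original configuration uses.

```vcl
import std;

sub custom_basicauth {
    # Same temporary header as in the example above.
    set req.http.basic_auth_hash = "Basic " + std.getenv("APP_BASIC_AUTH");

    if (req.url ~ "/admin" || req.http.host ~ "internal-domain") {
        # Assumption: with APP_BASIC_AUTH unset or empty, the header is
        # just "Basic". Refuse the request instead of comparing against
        # a value that carries no credentials.
        if (req.http.basic_auth_hash ~ "^Basic\s*$") {
            return (synth(503));
        }

        if (req.http.Authorization != req.http.basic_auth_hash) {
            return (synth(401));
        }
    }

    # Remove the temporary header so it is not sent to the backend.
    unset req.http.basic_auth_hash;
}
```

With this in place, a deployment that forgot to export the variable answers protected URLs with an error rather than accepting an Authorization header that contains no credentials.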

Traditional .htpasswd usage

But I have a lot of basic auth passwords which I want to manage via a traditional .htpasswd file.

You are not alone; others have the same requirement. vmod-basicauth comes to the rescue and is quite easy to use, just take a look:

vcl 4.1;

import basicauth;

# ...

sub vcl_recv {
    call custom_basicauth;

    # if credentials match we move forward to this point where Varnish can look up cached content.
    return (hash);
}

sub custom_basicauth {
    # the Host header is read via req.http.host (there is no req.host variable)
    if ((req.url ~ "/admin" || req.http.host ~ "internal-domain") && !basicauth.match("/var/www/.htpasswd", req.http.Authorization)) {
        return (synth(401));
    }
}