Some Trickery or Resilience With Varnish

As of now, Varnish has no means of detecting whether a backend is available or in good health before sending it a request (periodic health checking is scheduled for version 2.0 and will presumably work with the cluster mode as well). So if you have two or more backends, and under some condition one of them can't or won't serve a request immediately, or you want to send the request elsewhere depending on some circumstance, you can do this based on the HTTP return code or a header, using the not-so-well-documented ‘restart’ feature (then again, what feature is well documented in Varnish?).

‘restart’ will effectively increase a counter by 1 and re-run vcl_recv(). You can set how many times a restart may take place before Varnish gives up entirely - should you not use the counter in a condition before it reaches the limit - by starting varnishd with -p max_restarts=n, or with param.set max_restarts n on the CLI. This parameter defaults to 4, and you can of course write conditions depending on the number of restarts.
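For example, something like this would cap it at two restarts per request; the exact varnishd invocation depends on your setup, so treat it as a sketch:

        # at startup: allow at most two restarts per request
        varnishd -f /etc/varnish/default.vcl -a :6081 -T 127.0.0.1:6082 -p max_restarts=2

        # or on a running instance, on the management CLI
        # (varnishadm, or telnet to the -T address):
        param.set max_restarts 2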

Here’s a sample VCL to do this:

        backend be1 {
                .host = "127.0.0.1";
                .port = "81";
        }

        backend be2 {
                .host = "10.0.0.2";
                .port = "81";
        }

        sub vcl_recv {
                # first attempt goes to be1, the first restart to be2
                if (req.restarts == 0) {
                        set req.backend = be1;
                } else if (req.restarts == 1) {
                        set req.backend = be2;
                }
        }

        sub vcl_fetch {
                # anything but a 200 or 302 from the backend triggers a restart
                if (obj.status != 200 && obj.status != 302) {
                        restart;
                }
        }

In this simple VCL, a request destined for this instance of Varnish that doesn't get a 200 or 302 back from the first backend is effectively sent to 10.0.0.2, which may have something else in store for the visitor!
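Since the decision happens in vcl_fetch, you can just as well restart on a response header as on the status code, and it's a good idea to guard with the restart counter so a request doesn't keep bouncing until max_restarts kicks in. A rough sketch of that variant - the X-Try-Next header is just a made-up name for illustration, not anything Varnish or your backend defines:

        sub vcl_recv {
                # give up once both backends have had their chance
                if (req.restarts > 1) {
                        error 503 "No backend wanted this request";
                }
                if (req.restarts == 0) {
                        set req.backend = be1;
                } else {
                        set req.backend = be2;
                }
        }

        sub vcl_fetch {
                # X-Try-Next is a made-up header for this example: restart if
                # the backend asks us to hand the request to the next node
                if (obj.http.X-Try-Next) {
                        restart;
                }
        }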

If I, for instance, use the first VCL above, set be1 to return a 301 for /, and send a request to Varnish, this is what shows up in varnishlog:

...
   10 ObjProtocol  c HTTP/1.1
   10 ObjStatus    c 301
   10 ObjResponse  c Moved Permanently
   10 ObjHeader    c Date: Tue, 22 Jul 2008 00:25:29 GMT
   10 ObjHeader    c Server: Apache/2.0.59 (CentOS)
   10 ObjHeader    c X-Powered-By: PHP/5.1.6
   10 ObjHeader    c Location: http://be1.northernmost.org:6081/links.php/
   10 ObjHeader    c Content-Type: text/html; charset=UTF-8
   13 BackendClose b be1
   10 TTL          c 1839681264 RFC 120 1216686329 1216686329 0 0 0
   10 VCL_call     c fetch
   10 VCL_return   c restart
   10 VCL_call     c recv
   10 VCL_return   c lookup
   10 VCL_call     c hash
   10 VCL_return   c hash
   10 VCL_call     c miss
   10 VCL_return   c fetch
   12 BackendClose b be2
   12 BackendOpen  b be2 10.0.0.1 38478 10.0.0.2 81
   12 TxRequest    b GET
   12 TxURL        b /
   12 TxProtocol   b HTTP/1.1
...
   10 ObjProtocol  c HTTP/1.1
   10 ObjStatus    c 200
   10 ObjResponse  c OK
   10 ObjHeader    c Date: Mon, 21 Jul 2008 23:37:24 GMT
   10 ObjHeader    c Server: Apache/2.2.6 (FreeBSD) mod_ssl/2.2.6 OpenSSL/0.9.8e DAV/2
   10 ObjHeader    c Last-Modified: Thu, 10 Jul 2008 14:26:46 GMT
   10 ObjHeader    c ETag: "35e801-3-3702d580"
   10 ObjHeader    c Content-Type: text/html
   12 BackendReuse b be2
...

You can of course use this for very basic resilience as well, but that's really a job for your load balancer. Also be aware of the overhead involved: the request is, after all, sent to the first backend and processed there before being passed on to the other node.

Maybe it’s not the most useful feature in the world, but I thought it was nifty!

Jul 22nd, 2008