Kristian Lyngstøl's Blog

Varnish backend selection through DNS

Posted on 2010-08-02

A common challenge to using a cache is maintaining a mapping between public site names and actual web servers (backends). If you only have one type of web server (or maybe two?), and it's fairly static, this isn't a big deal. However, if your infrastructure spans tens of different types of web servers, then it starts getting iffy. Here's an example of how this could look:

director sports round-robin {
    { .backend = { .host = "sports1.internal.example.net"; .port = "80"; } }
    { .backend = { .host = "sports2.internal.example.net"; .port = "80"; } }
    { .backend = { .host = "sports3.internal.example.net"; .port = "80"; } }
}
director shop round-robin {
    { .backend = { .host = "shop1.internal.example.net"; .port = "80"; } }
    { .backend = { .host = "shop2.internal.example.net"; .port = "80"; } }
    { .backend = { .host = "shop3.internal.example.net"; .port = "80"; } }
}
director economy round-robin {
    { .backend = { .host = "economy1.internal.example.net"; .port = "80"; } }
    { .backend = { .host = "economy2.internal.example.net"; .port = "80"; } }
    { .backend = { .host = "economy3.internal.example.net"; .port = "80"; } }
}
director main round-robin {
    { .backend = { .host = "main1.internal.example.net"; .port = "80"; } }
    { .backend = { .host = "main2.internal.example.net"; .port = "80"; } }
    { .backend = { .host = "main3.internal.example.net"; .port = "80"; } }
}
sub vcl_recv {
    if (req.http.host ~ "sports.example.net$") {
        set req.backend = sports;
    } elsif (req.http.host ~ "shop.example.net$") {
        set req.backend = shop;
    } elsif (req.http.host ~ "economy.example.net$") {
        set req.backend = economy;
    } else {
        set req.backend = main;
    }
}

This is obviously a bit of a drag, and so far we only added four sites.

Enter the DNS director

The DNS director allows you to define a single director containing one or more backends, just like any other backend director, but uses DNS to decide which one to pick. Simply put, it does a DNS lookup on the Host header and sees if it has a backend that matches.

Notice that it does NOT automatically try whatever IP the Host header resolves to. It has to know about the backend in advance. This might sound like a bit of a major flaw, but I choose to look at it as a safety net.

The DNS director also allows you to add a suffix to the host name before it is looked up, so www.example.com could become www.example.com.internal.example.net. It has rudimentary DNS round-robin support and caches the DNS lookups (both successful lookups and misses). Since there isn't a practical way of obtaining the TTL of a DNS result except apparently hand-coding the resolver or possibly adding some obscure dependency, the lifetime of the DNS cache is defined by a setting in the director, cleverly named .ttl.

As a last added bonus, I also added a really easy way to shoot yourself in the foot. With the DNS director, you can specify a range of backends using .list and an ACL-like syntax. However, remember that adding 10.0.0.0/8 means Varnish will internally generate 16 million backends. That's PROBABLY not a good idea. So do use some moderation. A /24 or two shouldn't be a big deal, but I'd try to narrow it down as much as possible.

Here's the above example, except using FOO.example.net.internal.example.net instead of FOO<N>.internal.example.net, and assuming that the web servers are all in either the 192.168.0.0/24 or the 172.16.0.0/24 range:

director mydir dns {
    .list = {
        .port = "80";
        .connect_timeout = 0.4;
        "192.168.0.0"/24;
        "172.16.0.0"/24;
    }
    .ttl = 5m;
    .suffix = "internal.example.net";
}
sub vcl_recv {
    set req.backend = mydir;
}

Specifying the connect timeout and similar attributes is optional in .list, but they have to come before the list of IPs. You do not have to use .list; you can also add backends the same way you would with the random or round-robin director.
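As a rough sketch of that second form, the director below lists its members explicitly instead of using .list. The two IP addresses are hypothetical and chosen only for illustration, and the fragment is untested, so treat it as a starting point rather than a verified configuration:

```vcl
# Alternative definition of mydir without .list.
# Members are declared the same way as in a round-robin director.
# Both addresses are made up for this example.
director mydir dns {
    { .backend = { .host = "192.168.0.10"; .port = "80"; } }
    { .backend = { .host = "192.168.0.11"; .port = "80"; } }
    .ttl = 5m;
    .suffix = "internal.example.net";
}
```

With only a handful of web servers, this keeps the set of generated backends exactly as large as the set of machines you actually run.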

The above example caches the DNS results for 5 minutes. I've also added some counters (visible through varnishstat): number of DNS lookups, DNS cache hits, failed DNS lookups and how often the DNS cache is full. You may still want to do some basic sanitizing of domain names so as to reduce DNS spam, but now you can probably just use one regsub to match a number of sites.
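One possible shape for that sanitizing is sketched below. It assumes that every public site lives under example.net and that an explicit port in the Host header is safe to drop; both assumptions, as well as the 404 response, are illustrative choices and not something the DNS director requires:

```vcl
sub vcl_recv {
    # Drop an explicit port so "shop.example.net:80" and
    # "shop.example.net" result in the same lookup.
    set req.http.host = regsub(req.http.host, ":[0-9]+$", "");

    # Reject names outside the expected domain (an assumption of
    # this sketch) before they can trigger a DNS lookup.
    if (req.http.host !~ "(^|[.])example[.]net$") {
        error 404 "Unknown host";
    }

    set req.backend = mydir;
}
```

Rejecting unknown names early means arbitrary Host headers never reach the resolver or take up space in the DNS cache.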

Availability

The DNS director was committed to Varnish development trunk yesterday (Sunday, August 1st 2010) and I expect it to be available in Varnish 2.1.4. It has already been used in production at a few customer sites, with good results. Like any non-trivial piece of code, there are certain aspects of it I want to improve, but I do not foresee that as a blocker for including it in a release. It does not affect the rest of Varnish at all if it is not used (unless you count adding 4 counters to varnishstat).

If you want to test it, you'll have to use Varnish trunk. Alternatively, you can become a Varnish Software (http://www.varnish-software.com) customer; then you just drop us a mail and you'll get your RPMs or .debs shortly. (My marketing hat is currently firmly planted on my head.)

The development of this feature was sponsored by Globo and Mercado Libre and implemented by myself/Varnish Software (http://www.varnish-software.com).