Kristian Lyngstøl's Blog

Varnish 3.0.0 - RSN!

Posted on 2011-06-09

Varnish 3.0.0 beta2 was just released (http://www.varnish-cache.org/lists/pipermail/varnish-announce/2011-June/000032.html), and we're aiming for 3.0.0 next week.

The release date is set for Thursday the 16th (that's June 2011, for the potential future archive crawlers), and several release parties are planned all around the world (http://v3party.varnish-cache.org/).

This will be a very special day for everyone in Varnish Software. Varnish 2.0.0 was released roughly the same week I started working at Redpill Linpro, before I was really involved with Varnish, and I still regret that I didn't snatch one of those fancy 2.0-t-shirts.

What's new?

While there are too many new features to mention, two stand out more than anything.

The first is gzip compression and the second is the Varnish module architecture, or simply vmods. Most of ESI has also been refactored under the hood, but the benefit of that will only become evident in future releases, and the same goes for streaming delivery.

Compression

  • "What, you don't already have compression?!"

Remember, Varnish is a caching service. Most of the time you don't need it to do the compression, because the web server will do it for you. But there are good reasons to want it, and ESI is the primary use case.

With ESI, Varnish needs to parse the content, and it can't do that if it doesn't understand the content encoding. With Varnish 2.1, you have to send uncompressed data to Varnish if you want to use ESI. This means you have to either deliver uncompressed content to your clients, or use yet another service in front of Varnish.

But let me get a bit technical, because Varnish 3.0's compression is pretty awe-inspiring.

With ESI, you have multiple, individually cached elements that make up a single user-visible page. One option is to glue it all together and compress it before we send it. The downside is that we'd do the compression over and over. An alternative would be to cache the compressed result for as long as the individual elements are unchanged, but that requires more space and more complexity.

So what does Varnish 3.0 do? It stores the elements compressed, modifies the right gzip bits and glues it all together on the fly, without decompressing anything. If a single element changes, only that element needs to be updated. This is probably the best solution, even if the complexity of meddling with binary gzip headers directly can lead to some pretty tricky code. I challenge you to find a solution that handles compression in a smarter way.
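To get a feel for why gluing compressed pieces together is possible at all, here is a minimal Python sketch. It is not what Varnish does internally: it relies on a simpler property of the gzip format, namely that several complete gzip members written back to back still form a valid gzip file. The fragment contents and the `assemble` helper are made up for the illustration.

```python
import gzip

# Hypothetical page fragments, each compressed once and cached on its own.
header = gzip.compress(b"<html><body><h1>News</h1>")
article = gzip.compress(b"<p>Varnish 3.0 is coming.</p>")
footer = gzip.compress(b"</body></html>")


def assemble(*members: bytes) -> bytes:
    """Glue already-compressed gzip members together without decompressing."""
    return b"".join(members)


page = assemble(header, article, footer)

# Only the changed fragment is recompressed; the others are reused as-is.
article_v2 = gzip.compress(b"<p>Varnish 3.0 is out!</p>")
page_v2 = assemble(header, article_v2, footer)

# A gzip decoder reads each concatenation as one continuous document.
print(gzip.decompress(page).decode())
print(gzip.decompress(page_v2).decode())
```

The point of the sketch is only that no fragment is ever decompressed on the way out, and that changing one fragment leaves the others untouched. Varnish goes further than this and edits the gzip data itself, as described above.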

Varnish 3.0 also does decompression. If your client doesn't support compression (possibly a script or some other tool, since real browsers do support it), Varnish will decompress the object for you on the fly. This is another huge improvement over Varnish 2.1, where this is solved using Vary: and storing different copies of the same object based on compression.
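The idea of keeping one compressed copy and decompressing only for the rare client that needs it can be sketched in a few lines of Python. This is an illustration of the principle, not Varnish code: `accepts_gzip` and `deliver` are hypothetical helpers, and the Accept-Encoding check is deliberately simplified (it ignores q-values).

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """Simplified check: is 'gzip' listed at all? Ignores q-values."""
    tokens = [item.partition(";")[0].strip().lower()
              for item in accept_encoding.split(",")]
    return "gzip" in tokens


def deliver(stored_gzip: bytes, accept_encoding: str):
    """Return (headers, body) from a single cached, gzip-compressed object."""
    if accepts_gzip(accept_encoding):
        # Common case: hand out the stored bytes untouched.
        return {"Content-Encoding": "gzip"}, stored_gzip
    # Rare case: decompress on the fly instead of caching a second copy.
    return {}, gzip.decompress(stored_gzip)


original = b"<p>Hello from the cache</p>"
stored = gzip.compress(original)

browser_headers, browser_body = deliver(stored, "gzip, deflate")
script_headers, script_body = deliver(stored, "")
```

With the Vary: approach, the cache would hold both `stored` and `original`. Here it holds only `stored`.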

We can also do the same with the backend-data: If ESI needs to read the data, Varnish 3.0 can decompress it on the fly, parse it, then re-compress it before storing it.
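As a rough Python sketch of that decompress, parse, re-compress round trip: the `ingest` function and the regular expression below are assumptions made for the illustration, and a regex is a much cruder ESI parser than anything Varnish uses. It only shows the order of the steps.

```python
import gzip
import re

# Crude, illustrative pattern for self-closing ESI include tags.
ESI_INCLUDE = re.compile(rb'<esi:include\s+src="([^"]*)"\s*/>')


def ingest(backend_body: bytes):
    """Gunzip a backend response, find ESI includes, and re-gzip it for storage."""
    raw = gzip.decompress(backend_body)
    includes = [m.group(1).decode() for m in ESI_INCLUDE.finditer(raw)]
    return gzip.compress(raw), includes


backend_body = gzip.compress(b'<h1>Front page</h1><esi:include src="/menu"/>')
stored, includes = ingest(backend_body)
print(includes)
```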

And for you, the user, the complexity is fairly non-existent. Push the button, remove that nginx (no hard feelings) server you had doing compression, ????, profit!

VMODS!

VCL, the Varnish Configuration Language, is a flexible way of configuring Varnish. Since it is already translated from VCL to C, then compiled and linked into the running Varnish instance, VCL has "always" had in-line C: anywhere you want to, you can tell the VCL compiler that this is pure C code, and it is passed directly to the C compiler instead of being translated from VCL to C first. This was mainly provided because:

  1. It didn't add any complexity and was actually less work.
  2. It provided an escape chute for features we didn't want in Varnish-proper, but were valid for some users.

What features could those be? Syslog support, geoip integration, cryptographic verification of authorization headers, etc.

It turned out to be very useful, but impractical and difficult to re-use. Since it was all glued to your VCL, you had to stick C code in the middle of your regular configuration, and it was very hard to combine two different features if you didn't know C yourself. And linking against external libraries required parameter changes.

Enter Varnish modules.

Simply put, vmods are a way of letting you do the same thing you can do with in-line C, but in a more sustainable manner. A vmod will typically expose a few functions to VCL, and your VCL just has to declare that it imports the vmod to use those functions, without a single line of C code in the VCL itself.

This also means that a vmod has its own build system, its own linking process and a flexible development environment, and is much easier to share.

In the time to come, I expect us to have a community site for vmods. I also expect that you will see a lot of minor yet important changes to Varnish during the Varnish 3.0 cycle that expose more of the internal Varnish utilities so vmods can use them.

Small disclaimer, though: There is no API guarantee. We don't wish to slow the pace of Varnish development by restricting what we can change. That said, we won't tear things apart just to see our vmod contributors bleed.

I plan to write a blog post on vmod-hacking in the near future, so expect more detail there.

Finishing words

Varnish 1.0 was, as any 1.0-release is, important. I was not involved with the project back then, but as I understand it, Varnish 1 was very much tailored to a specific site, or type of site.

With Varnish 2.0, Varnish became useful for anyone with a high-traffic site. The 2.0 series was a good release series, adding a healthy mixture of bug fixes and non-intrusive features with each release, while staying constantly focused on real-world usage. We also saw the first Varnish User Group meeting during that time.

Varnish 2.1 has been a sort of intermediate release. Director refactoring, ESI and, to a certain degree, persistent storage paved the way for architectural changes that had to be done. Meanwhile, the user base really exploded.

Now, with Varnish 3.0.0 coming out, we are already seeing how useful vmods are for Varnish Software customers, with the header vmod (https://github.com/KristianLyng/libvmod-header) (still under development) as one example. Gzip also means there are no drawbacks to using ESI with Varnish. You don't need to add a second service to compress data.

It all boils down to Varnish being a more useful part of your architecture. It's easy to get fast and maintainable C code in there if you need it, and you can even pay someone to write just that bit for you, without being confined to Varnish's own road map and release cycle. There are no longer any downsides to using ESI. It's fast. It's free software. There are professional service and support offerings available (http://www.varnish-software.com). Varnish follows standards, unless you tell it not to. And so forth.

I'm not usually one to pat myself too much on the back, but as I'm writing this, I feel proud to be part of the team that gives you Varnish 3.0.

If you're in Oslo, I'll see you next Thursday!