Thursday 8 November 2012

Caching 301 redirects in Varnish while keeping the protocol

I have noticed a drop in our Varnish cache hit ratio and, upon investigation, found that the backend was not allowing 301 redirects to be cached:

Client request:
   42 RxRequest    c GET
   42 RxURL        c /img_resized/au/images/gifts/2012/hero/christmas-homepage-hero-alyce.jpg
   42 RxProtocol   c HTTP/1.1

Backend request: 
   21 TxRequest    b GET
   21 TxURL        b /img_resized/au/images/gifts/2012/hero/christmas-homepage-hero-alyce.jpg
   21 TxProtocol   b HTTP/1.1

Backend response: 
   21 RxHeader     b Cache-Control: no-cache
   21 RxHeader     b Location: http://static.example.com/img/au/images/gifts/2012/hero/christmas-homepage-hero-alyce.jpg
   21 RxHeader     b Status: 301

I logged a bug for the application to be fixed and, in the meantime, added a VCL snippet to override the backend headers:

        # WSF-xxxx: App is not allowing redirects to be cached
        if ((req.http.Host == "static.example.com") &&
           (beresp.status == 301) &&
           (beresp.http.Cache-Control ~ "no-cache")) {
                set beresp.http.Cache-Control = "public, max-age=604800";
        }

There's a catch though.  We do SSL offloading for static.example.com in our F5 BigIP LTM load balancer.  In other words, the load balancer receives a https request and, in turn, makes a http request to Varnish.

As Varnish doesn't support SSL natively it is unaware of the protocol (http or https) being used; thus, the requests below get cached with the first answer it sees:

HTTP request
curl -I http://static.example.com/img/au/images/gifts/2012/hero/christmas-homepage-hero-alyce.jpg
(...)
HTTP/1.1 301 Moved Permanently
Location: http://static.example.com/img/au/images/gifts/2012/rectangle-75-58/christmas-promo-tile-alyce.jpg

HTTPS request
curl -I https://static.example.com/img/au/images/gifts/2012/hero/christmas-homepage-hero-alyce.jpg
(...)
HTTP/1.1 301 Moved Permanently
Location: http://static.example.com/img/au/images/gifts/2012/rectangle-75-58/christmas-promo-tile-alyce.jpg

Notice that the HTTPS request was redirected to a HTTP URL.  This will most certainly cause issues, including mixed-mode SSL alerts in some browsers.

Our backend is a Ruby on Rails application and, to my surprise, it understands the X-Forwarded-Proto HTTP header.  This means that the application will use the value of X-Forwarded-Proto to build the URL in the Location header.

I improved the F5 iRule by instructing the load balancer to insert a X-Forwarded-Proto header in HTTPS requests which were being off-loaded to Varnish:

when HTTP_REQUEST {
      HTTP::header replace X-Forwarded-Proto "https"
}

In addition to this, I also had to instruct Varnish to use the X-Forwarded-Proto header in the cache hash:

sub vcl_hash {
if (req.http.X-Forwarded-Proto) {
set req.hash += req.http.X-Forwarded-Proto;
}
}

With these tweaks Varnish is now able to account for SSL offloading and correctly serve the appropriate cached version of a page.  An improvement on this configuration would be to only use X-Forwarded-Proto in the hash if the response is a redirect; at the moment I can't be bothered that Varnish is caching objects twice.

HTTP request
$curl -I http://static.example.com/img_resized/au/images/gifts/2012/rectangle-75-58/christmas-promo-tile-alyce.jpg
HTTP/1.1 301 Moved Permanently
Location: http://static.example.com/img/au/images/gifts/2012/rectangle-75-58/christmas-promo-tile-alyce.jpg

HTTPS request
$curl -I https://static.example.com/img_resized/au/images/gifts/2012/rectangle-75-58/christmas-promo-tile-alyce.jpg
HTTP/1.1 301 Moved Permanently
Location: https://static.example.com/img/au/images/gifts/2012/rectangle-75-58/christmas-promo-tile-alyce.jpg

No comments: