HTTP Caching in Ruby on Rails Applications

The fastest web page is one you've already loaded. Browsers love to avoid round-trips by caching assets. And HTTP provides ways for us to tell browsers what's changed and what hasn't - so they make the right decisions. In this article, Jonathan Miles introduces us to HTTP caching and shows us how to implement it in Rails.

A general way to describe caching is storing some data so that we can quickly retrieve it later. Sometimes, this means storing computed data so that it does not need to be re-computed, but it can also refer to storing data locally to avoid having to fetch it again. Your computer does this constantly, as your operating system tries to keep frequently accessed data in RAM so that it doesn't have to be fetched again from a hard drive or SSD.

Similarly, your browser tries to re-use resources it has already downloaded. You've probably seen this yourself when visiting a new website for the first time. The initial load takes longer because your browser has to pull down everything it needs, including all the images, javascript, and stylesheets. A fun fact is that when you freshly download the CNN homepage, your browser fetches more data than the original Doom game circa 1993. For the curious, at the time of writing this blog post, CNN downloads just over 3MB on my machine, compressed from ~15MB, and that's with an ad blocker enabled, while the original Doom installer was ~2.2MB.

For the browser to cache this data, there needs to be some coordination with the server. The browser needs to know what it can cache and for how long; otherwise, it could be showing you old content when the server has a newer version available. In this article, we'll look at how this client-server caching coordination is carried out and what Rails provides to alter it.

Although the focus is on how Ruby on Rails handles this, the actual mechanism is part of HTTP specifications. In other words, the caching we're talking about here is baked into the infrastructure of the internet, which makes it a cornerstone of how modern websites and frameworks are developed. Various frameworks, such as Rails, single-page applications (SPAs), and even static sites, can all use these mechanisms to help improve performance.

HTTP Request-Response

You're probably familiar with the request-response lifecycle, at least at a high level: you click a link on a website, your browser sends a request to the server for that content, and the server sends back that content (Note that I'm intentionally glossing over a lot of complexity here).

Let's dig a little bit into the actual data being sent in this back-and-forth transaction. Each HTTP message has a header and a body (not to be confused with <head> and <body> HTML tags). The request header tells the server which path you are trying to access and which HTTP method to use (e.g., GET/PUT/PATCH/POST). If needed, you can dig into these headers using either your browser's developer tools or a command-line tool, such as curl:

# curl -v honeybadger.io
...
> GET / HTTP/1.1
> Host: honeybadger.io
> User-Agent: curl/7.64.1
> Accept: */*

This first portion of the output is the request header. We're issuing a GET to honeybadger.io. This is then followed by what the server sent back (the "response header"):

>
< HTTP/1.1 301 Moved Permanently
< Cache-Control: public, max-age=0, must-revalidate
< Content-Length: 39
< Content-Security-Policy: frame-ancestors 'none'
...
< Content-Type: text/plain

The response includes the HTTP code (e.g., 200 for success or 404 for not found). In this example, it is a permanent redirect (301) because curl is trying to contact the http URL, which redirects to the secure https URL. The response header also includes the content type, which is text/plain here, but a few other common options are text/html, text/css, text/javascript, and application/json.

The response body follows the header. In our case, the body is blank because a 301 redirect does not need a body. If we tried again with curl -v https://www.honeybadger.io, you'd see the homepage content here, the same as if you were viewing the source in a browser.

If you want to experiment with this yourself here are two tips:

  1. To show only the response header with curl (e.g., no request headers or response body), use the -I option, as in curl -I localhost:3000.
  2. By default, Rails does not cache in a development environment; you may need to run rails dev:cache first.

The Cache-Control HTTP Header

The main header we care about, as far as caching goes, is the Cache-Control header. This helps determine which machines can cache a response from our Rails server and when that cached data expire. Within the Cache-Control header, there are several fields, most of which are optional. We'll go through some of the most relevant entries here, but for more information, you can check the official HTTP spec at w3.org.

Here's a sample from a basic out-of-the-box Rails response header:

< Content-Type: text/html; charset=utf-8
< Etag: W/"b41ce6c6d4bde17fd61a09e36b1e52ad"
< Cache-Control: max-age=0, private, must-revalidate

max-age

The max-age field is an integer containing the number of seconds the response is valid. By default, a Rails response for a view will have this set to 0 (i.e., the response expires immediately, and the browser should always get a new version).

public/private

Including public or private in the header sets which servers are allowed to cache the response. If the header includes private, it is only to be cached by the requesting client (e.g., the browser), not any other servers it may have passed through to get there, such as content delivery networks (CDNs) or proxies. If the header includes public, then these intermediary servers are allowed to cache the response. Rails sets each header to private by default.

must-revalidate

Rails also sets the must-revalidate field by default. This means that the client must contact the server to confirm that its cached version is still valid before it is used. To determine whether the cached version is valid, the client and server use ETags.

ETags

ETags are an optional HTTP header added by the server when it sends a response to the client. Typically, this is some sort of checksum on the response itself. When the client (i.e., your browser) needs to request this resource again, it includes the Etag it received (assuming it has a previous response cached) in the If-None-Match HTTP header. The server can then respond with a 304 HTTP code ("Not Modified") and an empty body. This means the version on the server hasn't changed, so the client should use its cached version.

There are two types of ETags: strong and weak (a weak tag is denoted with a W/ prefix). They behave the same way, but a strong ETag means the two copies of the resource (the version on the server and the one in the local cache) are 100% byte-for-byte identical. Weak ETags, however, indicate that the two copies may not be byte-for-byte identical, but the cached version can still be used. A common example of this is Rails' csrf_meta_tags helper, which creates a token that changes constantly; thus, even if you have a static page in your application, it will not be byte-for-byte identical when it's refreshed due to the Cross-Site-Request-Forgery (CSRF) token. Rails uses weak ETags by default.

ETags in Rails

Rails handles ETags automatically on views. It includes the ETag in outgoing headers and has middleware to check incoming ETag headers and returns 304 (Not Modified) codes when appropriate. Notably, however, because Rails generates views dynamically, it still has to do all the rendering work before it can figure out the ETag for that view. This means even if the ETags match, you are only saving the time and bandwidth it takes to send the data across the network, as opposed to something like view caching, where you can skip the rendering step completely if there's a cached version. However, Rails does provide a few ways to tweak the generated ETag.

stale?

One way to overcome the ever-changing CSRF token from changing the ETag is with the stale? helper in ActionController. This allows you to set the ETag (either strong or weak) directly. However, you can also simply pass it an object, such as an ActiveRecord model, and it will compute the ETag based on the object's updated_at timestamp or use the maximum updated_at if you pass a collection:

class UsersController < ApplicationController
  def index
    @users = User.includes(:posts).all

    render :index if stale?(@users)
  end
end

By hitting the page with curl, we can see the results:

# curl -I localhost:3000 -- first page load
ETag: W/"af9ae8f2d66b9b6c4d0513f185638f1a"
# curl -I localhost:3000 -- reload (change due to CSRF token)
ETag: W/"f06158417f290334f47ea2124e08d89d"

-- Add stale? to controller code

# curl -I localhost:3000 -- reload
ETag: W/"04b9b99835c359f36551720d8e3ca6fe" -- now using `@users` to generate ETag
# curl -I localhost:3000 -- reload
ETag: W/"04b9b99835c359f36551720d8e3ca6fe" -- no change

This gives us more control over when the client has to download the full payload again, but it still has to check with the server every time to determine whether its cache is still valid. What if we want to skip that check altogether? This is where the max-age field in the header comes in.

expires_in and http_cache_forever

Rails gives us a couple of helper methods in ActionController to adjust the max-age field: expires_in and http_cache_forever. They both work how you would expect based on their names:

class UsersController < ApplicationController
  def index
    @users = User.includes(:posts).all

    expires_in 10.minutes
  end
end
# curl -I localhost:3000
Cache-Control: max-age=600, private

Rails has set the max-age to 600 (10 minutes in seconds) and removed the must-revalidate field. You can also change the private field by passing a public: true named argument.

http_cache_forever is mostly just a wrapper around expires_in that sets the max-age to 100 years and takes a block:

class UsersController < ApplicationController
  def index
    @users = User.includes(:posts).all

    http_cache_forever(public: true) do
      render :index
    end
  end
end
# curl -I localhost:3000
Cache-Control: max-age=3155695200, public

This kind of extremely-long-term-caching is why Rails assets have a "fingerprint" appended to them, which is a hash of the file's content and creating filenames, such as packs/js/application-4028feaf5babc1c1617b.js. The "fingerprint" at the end effectively links the contents of the file with the name of the file. If the content ever changes, the filename will change. This means browsers can safely cache this file forever because if it ever changes, even in a small way, the fingerprint will change; as far as the browser is concerned, it's a completely separate file that needs to be downloaded.

Spheres of Influence

Now that we've covered some caching options, my advice might seem a bit odd, but I suggest that you try to avoid using any of the methods in this article! ETags and HTTP caching are good to know about, and Rails gives us some specific tools for addressing specific problems. The caveat, though, and it's a big one, is that all of this caching happens outside your application and is, therefore, largely outside your control. If you are using view caching or low-level caching in Rails, as covered in earlier parts of this series, and encounter invalidation issues, you have options; you can touch models, push updated code, or even reach into Rails.cache directly from the console if you have to, but not with HTTP caching. Personally, I'd much rather have to run Rails.cache.clear in production than face an issue where the site is broken for users until they clear their browser cache (your customer service team will love you for it too).

Conclusion

This is the end of the series on caching in Rails; hopefully, it was useful and informative. My advice for caching continues to be as follows: do as little as you can, but as much as you have to. If you experience performance problems, start by looking for methods that are hit often; perhaps, they can be memoized. Need the value to persist across requests? Maybe that heavily-used method could use some low-level caching. Or, perhaps, it's not any particular method; it's just crunching through all those nested partials, which are slowing things down; in this case, maybe view level caching can help. Rails gives us the "sharp knives" to target each of these issues, and we just need to know when to use them.

What to do next:
  1. Try Honeybadger for FREE
    Honeybadger helps you find and fix errors before your users can even report them. Get set up in minutes and check monitoring off your to-do list.
    Start free trial
    Easy 5-minute setup — No credit card required
  2. Get the Honeybadger newsletter
    Each month we share news, best practices, and stories from the DevOps & monitoring community—exclusively for developers like you.
    author photo

    Jonathan Miles

    Jonathan began his career as a C/C++ developer but has since transitioned to web development with Ruby on Rails. 3D printing is his main hobby but lately all his spare time is taken up with being a first-time dad to a rambunctious toddler.

    More articles by Jonathan Miles
    Stop wasting time manually checking logs for errors!

    Try the only application health monitoring tool that allows you to track application errors, uptime, and cron jobs in one simple platform.

    • Know when critical errors occur, and which customers are affected.
    • Respond instantly when your systems go down.
    • Improve the health of your systems over time.
    • Fix problems before your customers can report them!

    As developers ourselves, we hated wasting time tracking down errors—so we built the system we always wanted.

    Honeybadger tracks everything you need and nothing you don't, creating one simple solution to keep your application running and error free so you can do what you do best—release new code. Try it free and see for yourself.

    Start free trial
    Simple 5-minute setup — No credit card required

    Learn more

    "We've looked at a lot of error management systems. Honeybadger is head and shoulders above the rest and somehow gets better with every new release."
    — Michael Smith, Cofounder & CTO of YvesBlue

    Honeybadger is trusted by top companies like:

    “Everyone is in love with Honeybadger ... the UI is spot on.”
    Molly Struve, Sr. Site Reliability Engineer, Netflix
    Start free trial
    Are you using Sentry, Rollbar, Bugsnag, or Airbrake for your monitoring? Honeybadger includes error tracking with a whole suite of amazing monitoring tools — all for probably less than you're paying now. Discover why so many companies are switching to Honeybadger here.
    Start free trial
    Stop digging through chat logs to find the bug-fix someone mentioned last month. Honeybadger's built-in issue tracker keeps discussion central to each error, so that if it pops up again you'll be able to pick up right where you left off.
    Start free trial
    “Wow — Customers are blown away that I email them so quickly after an error.”
    Chris Patton, Founder of Punchpass.com
    Start free trial