Jorge Manrubia

July 21, 2021

Making Rails run just a few tests faster



Some time ago, I learned that Rails parallel testing has significant overhead from database setup and fixture loading. Essentially, each process sets up its own database and loads all the fixtures, and this is not free for nontrivial datasets.

Fixtures are a factor here. At Basecamp, we use fixtures heavily and, when running single tests, the overhead of using parallelization on my box is around 5 seconds on average (with some much longer outliers). Of course, you can disable parallelization by setting PARALLEL_WORKERS=1, but I didn’t love having to care about setting env vars when running tests.

Recently I decided to take a stab at this problem. It resulted in this pull request. Rails will now disable parallelization automatically when the number of tests is under a configurable threshold. I thought I would have to extend Minitest to allow accessing the number of tests before tests started to run, but after some back and forth, I didn’t have to and ended up pretty happy with the result.
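The rule itself is simple enough to sketch in plain Ruby. This is only an illustration, not the code from the pull request: the `parallelize?` helper and its parameter names are hypothetical, and treating a run with exactly `threshold` tests as parallel is an assumption.

```ruby
# Hypothetical helper illustrating the rule; this is not Rails' internal code.
DEFAULT_THRESHOLD = 50 # the default mentioned in this post

def parallelize?(test_count:, workers:, threshold: DEFAULT_THRESHOLD)
  # With a single worker there is nothing to parallelize.
  return false if workers <= 1

  # Under the threshold, per-process database and fixture setup is
  # expected to cost more than the parallel run saves.
  # Assumption: a run with exactly `threshold` tests is parallelized.
  test_count >= threshold
end

puts parallelize?(test_count: 15, workers: 8)   # false
puts parallelize?(test_count: 2829, workers: 8) # true
```

In a Rails app the threshold should not need any custom code like this: to my knowledge, Rails versions that include this change accept it as an option in the test helper, as in `parallelize(workers: :number_of_processors, threshold: 50)`.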

To determine a proper default threshold, I used this script. I was surprised by the results I got: 79 for HEY and 85 for Basecamp. So I conservatively went with 50 for the default.
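To give an idea of what a break-even point means here, below is a rough sketch under a deliberately simplified model. It is not the script linked above. It assumes a fixed setup overhead for parallel runs, a constant time per test, and tests split evenly across workers. The `break_even_test_count` helper, its parameters, and the 100 ms per test figure are all made up for illustration; only the 5-second overhead comes from this post.

```ruby
# Hypothetical model: every name and number here is an illustrative assumption.
# Times are in integer milliseconds to avoid floating-point comparisons.
def break_even_test_count(overhead_ms:, ms_per_test:, workers:, max_tests: 10_000)
  (1..max_tests).find do |test_count|
    sequential_ms = test_count * ms_per_test

    # Each worker runs its share of the tests, rounded up (integer ceiling).
    tests_per_worker = (test_count + workers - 1) / workers
    parallel_ms = overhead_ms + tests_per_worker * ms_per_test

    parallel_ms < sequential_ms
  end
end

# 5 seconds of setup overhead, 100 ms per test, 4 workers.
puts break_even_test_count(overhead_ms: 5_000, ms_per_test: 100, workers: 4) # 68
```

Under this model, the smallest number of tests for which the parallel run wins is the break-even point, and the method returns `nil` if parallelization never pays off within `max_tests`. Real suites are noisier than this, which is one reason to pick a conservative default.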

After getting this merged, I discovered that there was a pretty recent attempt by Ricardo Díaz to solve this same problem here. I had missed it because I always run my tests from RubyMine, and RubyMine uses its own test runner, which is not based on Rails’ test task. Nevertheless, I think the new implementation is better since it doesn’t depend on the mechanism used to launch the tests, and it is based on the number of tests, not on the number of files passed to the test command.

As a bonus, Rails will now include a message about how tests are being run. For example:

Running 2829 tests in parallel using 8 processes

or

Running 15 tests in a single process (parallelization threshold is 50)

I love having parallel testing available in Rails, but I rarely use it. I always run single tests on my box and the full suite on the CI server. I hope this change will save me and others a bunch of aggregated time in the form of a few seconds on each run.

About Jorge Manrubia

A programmer who writes about software development and many other topics. I work at 37signals.

jorgemanrubia.com