TestProf: a good doctor for slow Ruby tests

August 15, 2017

Topics

Share this post on

Vladimir Dementyev
Principal Backend Engineer

Translations

JapaneseTestProf: Ruby/Railsの遅いテストを診断するgem
ChineseRuby慢测试的“良医圣手”

If you’re interested in translating or adapting this post, please contact us first.

Writing tests is a significant part of the development process, especially in the Ruby and Rails community. We don’t care about test suite performance until we find ourselves wasting too much time waiting for the green light.

Other parts:

TestProf: a good doctor for slow Ruby tests
TestProf II: Factory therapy for your Ruby tests
TestProf III: guided and automated Ruby test profiling

I’ve spent a lot of time analyzing test suites performance and developed some useful techniques and tools to make tests faster. I’ve bundled them all into a meta-gem called TestProf, the Ruby test profiling toolbox.

The Motivation

Slow tests waste your time by making you less productive

You’re probably asking yourself: “Why does test suite performance matter?” Let me show you some statistics before proposing any arguments.

I’ve conducted a short survey earlier this year where I’ve asked Ruby developers about their testing preferences.

First of all—good news, everyone—it turned out (and frankly speaking, I was not surprised) that most Ruby developers do write tests. And that’s what I like about Ruby community the most, by the way.

According to the survey, only a quarter of all test suites runs for more than 10 minutes—and about a half runs for less than 5 minutes.

**Question.** For how long does your whole suite run?

**Question.** For how long does your whole suite run?

It seems like the situation is not that bad, is it? Let’s only look at test suites having more than 1000 examples.

**Question.** For how long does your whole suite run? (having more than 1000 examples)

Now it looks much worse: about a half of all suites runs for more than 10 minutes, and almost 30%—for more than 20 minutes.

By the way, a typical Rails application I’ve been working on had 6K to 10K examples.

Of course, you don’t have to run the whole suite every time you make a change. Usually, I run about 100 examples before each commit while working on a mid-size feature, and it only takes about a minute. But even that “minute” affects my feedback loop (see Justin Searls talk) and wastes my time.

Nevertheless, we must run all tests during our deployment cycle, using CI services. Would you like to wait dozens of minutes (or even hours if there are loads of builds in the queue) to deploy a hotfix? I doubt so.

Concurrent builds may help, but they come at a price. Take a look at the chart below:

**Question.** How many parallel jobs do you run on your CI?

**Question.** How many parallel jobs do you run on your CI?

For example, in my current project we have 5x parallelism and our average (per job) RSpec time is 2 minutes and 30 seconds for 1250 examples. That means our EPM (examples per minute) equals 500.

Before any optimizations, 800 examples ran for 4 minutes—that is only 200 EPM! Now we are saving 3-4 minutes with every build.

So, no doubt, slow tests waste your time making you less productive.

The Toolbox

OK, you have realized that your test suite is slow. How to figure out why?

Let me skip all the lyrics and introduce TestProf—the Ruby test profiling toolbox.

TestProf aims to help you identify the bottlenecks in your test suite and provide you with recipes to fix them.

Let me show you how I am using it to analyze and improve test suites.

General Profiling

It is often useful to collect some general information before digging deeper into a test suite.

Try to answer the following questions:

Where does your suite spend more time: in controllers, models, services or jobs?
What is the most time-consuming module/method?

Not so easy, isn’t it?

To answer the first question, you can use Tag Profiler from TestProf, which allows you to collect statistics grouped by a particular RSpec tag value. RSpec automatically adds a type tag to examples, so we can use it:

TAG_PROF=type rspec

[TEST PROF INFO] TagProf report for type

          type          time   total  %total   %time           avg

    controller     08:02.721    2822   39.04   34.29     00:00.171
       service     05:56.686    1363   18.86   25.34     00:00.261
         model     04:26.176    1711   23.67   18.91     00:00.155
           job     01:58.766     327    4.52    8.44     00:00.363
       request     01:06.716     227    3.14    4.74     00:00.293
          form     00:37.212     218    3.02    2.64     00:00.170
         query     00:19.186      75    1.04    1.36     00:00.255
        facade     00:18.415      95    1.31    1.31     00:00.193
    serializer     00:10.201      19    0.26    0.72     00:00.536
        policy     00:06.023      65    0.90    0.43     00:00.092
     presenter     00:05.593      42    0.58    0.40     00:00.133
        mailer     00:04.974      41    0.57    0.35     00:00.121
        ...

Now you can focus only on some test types when searching for bottlenecks.

You’re likely aware of general Ruby profilers, such as RubyProf and StackProf.

TestProf helps you run them against your test suite without any tweaks:

TEST_RUBY_PROF=1 bundle exec rake test

# or

TEST_STACK_PROF=1 rspec

Reports generated by these profilers can help you identify your hottest stack paths and, hence, answer the second question.

Unfortunately, this type of profiling requires a lot of resources and slows down your already not-so-fast test suite even more. You have to run it against a small portion of tests, but how to choose this portion? Well, randomly!

TestProf includes a special patch which allows you to run random RSpec example groups (or Minitest suites):

SAMPLE=10 bundle exec rspec

Now try running StackProf against a sample of your controller tests (because they are the slowest according to TagProf) and read the output. When I did this for one of my projects, I saw the following:

%self     calls  name
20.85       721   <Class::BCrypt::Engine>#__bc_crypt
 2.31      4690  *ActiveSupport::Notifications::Instrumenter#instrument
 1.12     47489   Arel::Visitors::Visitor#dispatch
 1.04    205208   String#to_s
 0.87    531377   Module#===
 0.87    117109  *Class#new

It turned out that our encryption configuration for Sorcery was as strict in test environment as in production.

In a typical Rails application, most of the time you will see something like this in your reports:

 TOTAL    (pct)     SAMPLES    (pct)     FRAME
   205  (48.6%)          96  (22.7%)     ActiveRecord::PostgreSQLAdapter#exec_no_cache
    41   (9.7%)          22   (5.2%)     ActiveModel::AttributeMethods::#define_proxy_call
    20   (4.7%)          14   (3.3%)     ActiveRecord::LazyAttributeHash#[]

A lot of ActiveRecord stuff—a lot of database usage. Wondering how to deal with it? Keep reading.

Database Interactions

Do you know how much time does your test suite spend in the database? Try to guess first, and then use TestProf to calculate if for you.

We’ve already expanded on instrumentation in Rails (ActiveSupport Notification and Instrumentation features), so let me skip the basics and introduce the Event Profiler.

EventProf collects instrumentation metrics during your test suite run and provides a report with general information and top-N slowest groups and examples concerning the specified metrics. Currently, it supports only ActiveSupport::Notifications out-of-the-box, but can be easily integrated with your own solution.

To get the information about the database usage, we can use the sql.active_record event. Then the report could look like this (very similar to rspec --profile):

EVENT_PROF=sql.active_record rspec ...

[TEST PROF INFO] EventProf results for sql.active_record

Total time: 00:05.045
Total events: 6322

Top 5 slowest suites (by time):

MessagesController (./spec/controllers/messages_controller_spec.rb:3)–00:03.253 (4058 / 100)
UsersController (./spec/controllers/users_controller_spec.rb:3)–00:01.508 (1574 / 58)
Confirm (./spec/services/confirm_spec.rb:3)–00:01.255 (1530 / 8)
RequestJob (./spec/jobs/request_job_spec.rb:3)–00:00.311 (437 / 3)
ApplyForm (./spec/forms/apply_form_spec.rb:3)–00:00.118 (153 / 5)

For my current project, the amount of time spent in DB is about 20%—and that’s after a lot of optimizations have already been applied! Initially, it was more than 30%.

There is no single good value for this metric that fits every project. It highly depends on your testing style: whether you write more unit tests or integration tests.

We mostly write integration tests, by the way—and 20% is not that bad (but can be even better).

What are the typical reasons of the high database time? It’s impossible to name them all, let me state some of them:

Useless data generation
Heavy test preparations (before/setup hooks)
Factory cascades.

The first is the famous Model.new vs. Model.create problem (or build_stubbed vs. create in case of FactoryGirl)—you likely don’t need to hit the database to unit-test your models. So don’t do that, right?

But what if I already do that? How to figure out which tests don’t need persistence? Here comes Factory Doctor.

FactoryDoctor informs you when you are creating data which is not necessary:

FDOC=1 rspec

[TEST PROF INFO] FactoryDoctor report

Total (potentially) bad examples: 2
Total wasted time: 00:13.165

User (./spec/models/user_spec.rb:3)
  validates name (./spec/user_spec.rb:8)–1 record created, 00:00.114
  validates email (./spec/user_spec.rb:8)–2 records created, 00:00.514

Unfortunately, FactoryDoctor is not a magician (it is still learning) and sometimes it produces false negatives and false positives too.

The second problem is a bit more tricky. Consider an example:

describe BeatleSearchQuery do
  # We want to test our searching feature,
  # so we need some data for every example
  let!(:paul) { create(:beatle, name: 'Paul') }
  let!(:ringo) { create(:beatle, name: 'Ringo') }
  let!(:george) { create(:beatle, name: 'George') }
  let!(:john) { create(:beatle, name: 'John') }

  # and about 15 examples here
end

You might think: “Just use fixtures here.” And that’s a good point unless you’re working on a pretty big project with dozens of models changing every day.

Another option is to use a before(:all) hook and generate data once. But there is a caveat here—we have to cleanup the database manually, because before(:all) runs outside of the transaction.

Or, we can wrap the whole group in a transaction manually! That is exactly what TestProf’s before_all helper does:

describe BeatleSearchQuery do
  before_all do
    @paul = create(:beatle, name: 'Paul')
    @ringo = create(:beatle, name: 'Ringo')
    @george = create(:beatle, name: 'George')
    @john = create(:beatle, name: 'John')
  end

  # and about 15 examples here
end

If you want to share context between different groups (files), consider using Any Fixture, which allows you to generate fixtures from code (e.g. using factories).

Factory Cascades

Factory cascade is a very common, but rarely addressed problem that can bring your whole test suite to a crawl. In short, it is an uncontrollable process of generating excess data through nested factory invocations. TestProf knows how to deal with that, and we have dedicated a whole post to this subject alone—make sure you read it.

Background Jobs

Besides database bottlenecks, there are many others too, of course. Let’s discuss one of them.

There is a common practice to inline background jobs in tests (e.g. Sidekiq::Testing.inline!).

Usually, we put some heavy stuff in the background, thus running all jobs unconditionally slows down the run time.

TestProf supports profiling of the time spent in background jobs (currently, only for Sidekiq). Just tell it to profile against sidekiq.inline:

EVENT_PROF=sidekiq.inline bundle exec rspec

Now when you know the exact time wasted, what should you do? Simply turning off inline mode would likely break a lot of test examples—too many to fix them quickly.

The solution is to turn off inlining globally and use it only when necessary. If you use RSpec, you can do the following:

# Add a shared context
shared_context "sidekiq:inline", sidekiq: :inline do
  around(:each) { |ex| Sidekiq::Testing.inline!(&ex) }
end

# And use it when necessary
it "do some bg", sidekiq: :inline do
  # ...
end

You still have to add the tags to every failing example, don’t you? Thanks to TestProf, you don’t.

There is a special tool called RSpec Stamp, which is a part of the toolbox. It automagically adds the specified tags:

RSTAMP=sidekiq:inline rspec

By the way, RSpec Stamp uses Ripper under the hood to parse source files and correctly insert tags.

Read the complete manual on how to migrate from inline! to fake! in our guides.

TestProf is published on GitHub and rubygems.org and is ready to be plugged into your application to help you improve your test suite performance.

This post is only an introduction to TestProf and doesn’t cover all the features. Jump to the next part of this series: TestProf II: Factory therapy for your Ruby tests to learn more about TestProf’s doctor bag of tools that can make your tests snappier, your TDD feedback loop shorter, and can help you become a happier Ruby developer.