Parallel testing #31900

eileencodes · 2018-02-05T18:55:32Z

@tenderlove and I worked on this PR, which adds parallel testing to Rails applications. New applications will have parallel testing enabled by default, and older applications can add it to their test helper:

class ActiveSupport::TestCase
  parallelize(workers: 2)
end

This implementation parallelizes tests by forking processes rather than using threads. We (tenderlove and eileencodes) chose forking because it will be faster with single databases, which most applications use locally. Threads are beneficial when tests are IO bound, but the majority of tests are not. NOTE: after some experimentation we added a threaded parallelization method as well, but forked processes are still the default.
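To make the forking approach concrete, here is a minimal sketch in plain Ruby. It is not the code in this PR: `square_in_forks` is a hypothetical name, squaring numbers stands in for running tests, and work is split round-robin over pipes rather than pulled from a shared queue. It relies on `fork`, so it will not run on JRuby or on Windows.

```ruby
# Hypothetical sketch: fork `workers` child processes, give each a
# round-robin share of `items`, and collect the results over pipes.
def square_in_forks(items, workers:)
  readers = Array.new(workers) do |i|
    reader, writer = IO.pipe
    reader.binmode
    writer.binmode

    fork do
      reader.close
      # Each child takes every `workers`-th item, starting at index i.
      share = items.each_with_index.select { |_item, idx| idx % workers == i }.map(&:first)
      writer.write(Marshal.dump(share.map { |n| n * n }))
      writer.close
      exit!(0) # leave the child without running at_exit handlers
    end

    writer.close # the parent only reads from this pipe
    reader
  end

  results = readers.flat_map do |reader|
    data = reader.read
    reader.close
    Marshal.load(data)
  end
  Process.waitall # reap every child before returning
  results
end
```

Because each child is a separate process with its own memory, the workers cannot interfere with each other's in-process state, which is also why each one can be given its own database.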

If an application doesn't want to use parallel testing, it can either remove the parallelize call from the test helper or set PARALLEL_WORKERS to 1 or fewer. For environments where you want a different number of workers from what you've set in your test_helper.rb, you can set an environment variable to override it. The following will use 15 workers and split the tests into 15 processes.

PARALLEL_WORKERS=15 bin/rails test
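The override rule described above can be sketched in a few lines of plain Ruby. This is an illustration of the rule only, not Rails' implementation; `resolve_workers` is a hypothetical helper name.

```ruby
# Hypothetical helper: pick the worker count, letting PARALLEL_WORKERS
# override the value passed to `parallelize`.
def resolve_workers(requested, env = ENV)
  override = env["PARALLEL_WORKERS"]
  workers = override ? Integer(override) : requested
  # One worker or fewer means the tests are not parallelized.
  { workers: workers, parallel: workers > 1 }
end

resolve_workers(2, { "PARALLEL_WORKERS" => "15" }) # => { workers: 15, parallel: true }
resolve_workers(2, { "PARALLEL_WORKERS" => "1" })  # => { workers: 1, parallel: false }
```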

Note: While parallel testing will work with multiple primary databases, it currently doesn't roll back fixtures correctly. I'm actively working on fixing that but decided it was out of scope for this particular feature, since fixing it is not a feature of parallel testing but rather a bug / inconsistency in how Rails is handled. The fix for that should be coming shortly. Parallel testing and multiple primary databases do work with tests that don't use fixtures. Update: I'm not sure why I thought this, but I just tested it locally again and the fixtures work. I think I had a bug in my setup the last time I tested this.

If you have multiple databases they can be set up like this in your test_helper.rb:

class ActiveSupport::TestCase
  parallelize(workers: 2)

  parallelize_setup do |worker|
    # create a db w/ worker. Runs after processes are forked
  end

  parallelize_teardown do |worker|
    # delete the test databases or other cleanup. Runs before processes are closed
  end
end

To do:

Documentation
Guides
CHANGELOG

cc/ @tenderlove @dhh

tenderlove · 2018-02-05T19:02:30Z

This looks great! 3 things off the top of my head:

I'm not sure if Windows supports unix sockets. We should test that and possibly opt-out Windows folks.
JRuby definitely doesn't support fork. We might be able to get this to work with Process.spawn out of the box, but maybe we should opt-out JRuby folks for now too?
I like the class-level parallelize API for setting the number of workers, but do we care if people call the method more than once? (I think the answer is "no", but I just want to check)

nynhex · 2018-02-05T19:28:29Z

This is great work! Thanks @eileencodes @tenderlove ❤️

jasl · 2018-02-06T17:15:58Z

It seems the upcoming Windows 10 RS4 release will partially support Unix sockets; see https://blogs.msdn.microsoft.com/commandline/2017/12/19/af_unix-comes-to-windows/

metacritical · 2018-02-06T20:20:41Z

Does anyone use Rails on Windows in either development or production? MRI is still large, and I don't know anyone using JRuby in production. Opting them out for this release is OK.

GBH · 2018-02-06T20:27:11Z

Is this a replacement for https://github.com/grosser/parallel_tests ?

eileencodes · 2018-02-06T20:35:42Z

@metacritical I don't want to discount users of JRuby or Windows, however I think since we're a ways out from Rails 6.0 there's a ton of potential for this feature to evolve. I'd love to see a way to use threads over processes if someone finds that useful. I don't think it's necessary for the first iteration though. Once we have an API nailed down and this merged we can iterate on it and add a threads feature.

@GBH yes technically it replaces it, but not because we think the gem is doing anything incorrect. We're not personally using it at GitHub but we wrote our own implementation based on how we parallelize our own tests.

bogdanvlviv · 2018-02-06T21:45:26Z

activesupport/lib/active_support/testing/parallelization.rb

+        @url        = "drbunix://#{file}"
+        @pool       = []
+
+        DRb.start_service(@url, @queue)


I think we should let DRb generate the URI in order to guarantee uniqueness. See #31591.
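One way to follow this suggestion, sketched with Ruby's standard `drb` library: with `drb/unix` loaded, passing `"drbunix:"` with an empty path makes DRb create a temporary socket file and report the resulting URI. `start_unique_service` is a hypothetical helper name, and this is not necessarily how the PR ended up doing it.

```ruby
require "drb"
require "drb/unix"

# Start a DRb server for `front` on a socket path that DRb picks itself,
# so two test runs on the same machine do not collide on a file name.
def start_unique_service(front)
  service = DRb.start_service("drbunix:", front)
  [service, service.uri]
end

queue = Queue.new
service, url = start_unique_service(queue)

remote = DRbObject.new_with_uri(url) # what a worker would connect with
remote << :job
job = queue.pop

service.stop_service
```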

bogdanvlviv · 2018-02-06T21:48:48Z

activesupport/lib/active_support/testing/parallelization.rb

+
+require "drb"
+require "drb/unix"
+require "tempfile"


Looks like we can remove require "tempfile" since we don't use Tempfile in this file.

bogdanvlviv · 2018-02-06T22:04:54Z

activerecord/lib/active_record/test_databases.rb

+      ActiveRecord::Tasks::DatabaseTasks.create(connection_spec)
+      ActiveRecord::Base.establish_connection(connection_spec)
+      if ActiveRecord::Base.connection.migration_context.needs_migration?
+        ActiveRecord::Tasks::DatabaseTasks.migrate


I think we should always execute (and rely on) ActiveRecord::Tasks::DatabaseTasks.migrate to ensure full setup for test database since this creates tables like schema_migrations, ar_internal_metadata etc.

They'll always need a migration because they'll always be brand new. I think however I'm going to drop this implementation and load up the structure/schema instead because it will be faster.

bogdanvlviv · 2018-02-06T22:21:13Z

activerecord/lib/active_record/test_databases.rb

+        ActiveRecord::Tasks::DatabaseTasks.migrate
+      end
+    ensure
+      ActiveRecord::Base.establish_connection(ActiveRecord::Base.configurations[Rails.env])


Just a question: I can't see why we need to re-establish the connection here.

We need to re-establish the original connection to AR Base since we want the test connection, not the other db connection. Otherwise the tests will try to all run against AR Base with the other db connection, not the test connection.

bogdanvlviv · 2018-02-06T22:52:11Z

activerecord/lib/active_record/test_databases.rb

+require "active_support/testing/parallelization"
+
+module ActiveRecord
+  module TestDatabases


I think we should mark this module as private API with # :nodoc:

I haven't decided yet if I want to make this private or document it. I think it could be useful for setting up test dbs when you have multiple databases.

Decided to nodoc this for now. It's easier to document later than to undocument.

alissonbrunosa · 2018-02-07T01:57:50Z

I'm afraid this problem is beyond the scope, but is there any way to define whether a test should run on threads or processes? For instance, if I had a test that is IO bound, it would be awesome to run on threads.
I don’t know, maybe this will solve the JRuby’s problem with fork as well.

yeongsheng-tan · 2018-02-07T02:29:30Z

Nice and sweet. 👏💖

simi · 2018-02-07T14:15:47Z

activesupport/lib/active_support/testing/parallelization.rb

+          end
+        end
+
+        def <<(o)


Why not use a delegator for the << and pop methods?
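A delegator-based version could look roughly like the sketch below, using `Forwardable` from the standard library. `JobServer` is a hypothetical stand-in for the class under review, reduced to the two methods in question.

```ruby
require "forwardable"

# Hypothetical stand-in for the server class: << and pop are forwarded
# to the internal queue instead of being written out by hand.
class JobServer
  extend Forwardable
  def_delegators :@queue, :<<, :pop

  def initialize
    @queue = Queue.new
  end
end

server = JobServer.new
server << :first_test
server.pop # => :first_test
```

Delegation only fits if the methods do nothing beyond forwarding; any extra bookkeeping around `<<` or `pop` would still need explicit method definitions.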

dogweather · 2018-02-07T21:30:26Z

activesupport/lib/active_support/testing/parallelization.rb

+          end
+        end
+
+        def <<(o)


I'm wondering what o is — object? Is there some name we could give it that'd help describe what it's intended to be?

pangolingo · 2018-02-07T23:49:56Z

Sometimes I set breakpoints in tests with Pry or Byebug. I assume they might not work with multiple processes. Could we add a note to the docs mentioning that? (Sorry I haven't had a chance to test this for myself.)

yuki24 · 2018-02-08T20:39:46Z

@pangolingo when debugging with pry or byebug you should probably run a single test rather than running all of them. But at least byebug works well in a multi-threaded environment (pauses other threads when it hits byebug and resumes upon continue). I'm not sure how it behaves in a forked process.

tjschuck · 2018-02-14T17:08:12Z

activesupport/lib/active_support/test_case.rb

+      end
+
+      # Cleanup required for parallel testing. This can be used to drop databases
+      # if you're app uses multiple write/read databases or other clean up before


you're => your 😅

tjschuck · 2018-02-14T17:09:30Z

guides/source/testing.md

+end
+```
+
+The number of workers passes is the number of times the process will be forked. You may want to


passes => passed

tjschuck · 2018-02-14T17:10:07Z

guides/source/testing.md

+process. The databases will be suffixed with the number corresponding to the worker. For example, if you
+have 2 workers the tests will create `test-database-0` and `test-database-1` respectively.
+
+If the number of workers passes is 1 or fewer the processes will not be forked and the tests will not


passes => passed

eileencodes · 2018-02-15T19:52:07Z

Decided to add a threaded parallelizer after all. JRuby apps will automatically be generated using the threaded one. If you want to use threads just add with: :threads as a keyword argument.

I've updated docs, guides, and added a changelog.

eileencodes · 2018-02-15T19:53:32Z

activerecord/lib/active_record/test_databases.rb

+      connection_spec["database"] += "-#{i}"
+      ActiveRecord::Tasks::DatabaseTasks.create(connection_spec)
+      ActiveRecord::Base.establish_connection(connection_spec)
+      ActiveRecord::Tasks::DatabaseTasks.migrate


I plan on replacing this with a structure load or a straight copy of the database later on, but structure load currently doesn't work with multiple databases, so I'm sticking with migrate for now.

Provides both a forked process and threaded parallelization options. To use add `parallelize` to your test suite.

Takes a `workers` argument that controls how many times the process is forked. For each process a new database will be created suffixed with the worker number; test-database-0 and test-database-1 respectively.

If `ENV["PARALLEL_WORKERS"]` is set the workers argument will be ignored and the environment variable will be used instead. This is useful for CI environments, or other environments where you may need more workers than you do for local testing.

If the number of workers is set to `1` or fewer, the tests will not be parallelized.

The default parallelization method is to fork processes. If you'd like to use threads instead you can pass `with: :threads` to the `parallelize` method. Note the threaded parallelization does not create multiple databases and will not work with system tests at this time.

    parallelize(workers: 2, with: :threads)

The threaded parallelization uses Minitest's parallel executor directly. The process parallelization uses a Ruby DRb server.

For parallelization via processes a setup hook and cleanup hook are provided.

```
class ActiveSupport::TestCase
  parallelize_setup do |worker|
    # setup databases
  end

  parallelize_teardown do |worker|
    # cleanup database
  end

  parallelize(workers: 2)
end
```

[Eileen M. Uchitelle, Aaron Patterson]

simi · 2018-02-16T06:54:59Z

activerecord/lib/active_record/test_databases.rb

+
+      connection_spec = ActiveRecord::Base.configurations[spec_name]
+
+      connection_spec["database"] += "-#{i}"


What about underscore "_#{i}" to keep file based databases using underscores in file names?

They're temporary databases, unless it's going to break file based dbs I don't think underscore vs dash is a big deal.
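For reference, the naming scheme under discussion can be sketched as a small pure function. `worker_database_config` is a hypothetical helper, not part of the PR; unlike the `+=` in the diff, it returns a copy so the shared base configuration is left untouched.

```ruby
# Hypothetical helper: build a per-worker copy of a database config by
# suffixing the database name with "-<worker number>".
def worker_database_config(config, worker)
  config.merge("database" => "#{config.fetch("database")}-#{worker}")
end

base = { "adapter" => "sqlite3", "database" => "db/test.sqlite3" }
names = (0...2).map { |i| worker_database_config(base, i).fetch("database") }
# names => ["db/test.sqlite3-0", "db/test.sqlite3-1"]
```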

IgorDobryn · 2018-02-18T08:23:15Z

Awesome!

Fudoshiki · 2018-02-21T01:43:24Z

@eileencodes
In rails 5.2

# frozen_string_literal: true

ENV['RAILS_ENV'] ||= 'test'
require_relative '../config/environment'
require 'rails/test_help'
require 'rubocop/rake_task'

RuboCop::RakeTask.new

class ActiveSupport::TestCase
  # Run tests in parallel with specified workers
  parallelize(workers: 2)

  # Add more helper methods to be used by all tests here...
end

rails t shows failed RuboCop tests

Now

# frozen_string_literal: true

ENV['RAILS_ENV'] ||= 'test'
require_relative '../config/environment'
require 'rails/test_help'
require 'rubocop/rake_task'

RuboCop::RakeTask.new

class ActiveSupport::TestCase
  # Add more helper methods to be used by all tests here...
end

rails t ignores RuboCop::RakeTask.new

How do I use that now?

eileencodes · 2018-02-24T20:17:52Z

@Fudoshiki can you open a new issue explaining the problem you're having? From your comment I don't understand what the issue is. Thanks!

dapicester · 2018-02-26T09:30:16Z

activesupport/lib/active_support/test_case.rb

+                   when :threads
+                     Minitest::Parallel::Executor.new(workers)
+                   else
+                     raise ArgumentError, "#{with} is not a supported parallelization exectutor."


I believe this exectutor is a typo.

It was fixed by a4e226f

johnvross · 2018-03-19T13:08:04Z

Someone asked if anyone is using Windows for Rails development. Yes. I work for a state agency and have used the Windows Subsystem for Linux on Windows for all development since the Fall Creators Update, when they fixed file watchers. Just FYI, we are out here.

chrishough · 2018-03-19T17:29:06Z

Does this apply to both rspec and minitest?

eileencodes · 2018-03-19T17:33:54Z

Hey @johnvross I know you're out there and I'm sorry someone asked that question as it's not an opinion the core team holds. I think you should be supported through the threaded parallelizer, but I'm not sure the Unix sockets that DRb relies on will work for you.

@chrishough this is using Minitest's parallel executor so I don't think it will work for rspec per se, but we wrote the API in such a way that it's easy for us to add support for another parallelizer. This feature is still very new and it will be a while before Rails 6 is released.

chrishough · 2018-03-19T17:46:25Z

Thanks @eileencodes. I was hoping it would replace https://github.com/grosser/parallel_tests, and I am definitely curious to see how this plays out.

WaKeMaTTa · 2018-05-24T23:36:08Z

@eileencodes if we use rspec or cucumber, how can we use this feature?

thepeoplesbourgeois · 2018-07-16T16:25:21Z

@WaKeMaTTa I think that's the topic of the comments a few notes above yours; I think an adapter for rspec/cucumber needs to be written for this parallelizer

WaKeMaTTa · 2018-07-16T16:29:59Z

@thepeoplesbourgeois you are right. Thanks

Rails 6.0 introduces parallel testing, and the default degree of parallelism is configured based on the number of CPUs. Refer to rails#34735 and rails#31900.

When any minitest test is executed on an OS where 2 or more CPUs are available, the generated SQLite 3 database files are not git-ignored. Also updated other files which ignore SQLite database files.

* Steps to reproduce

```
$ git clone https://github.com/yahonda/rep_ignore_sqlite3_databases.git
$ cd rep_ignore_sqlite3_databases/
$ bin/rails test
$ git status
```

* Expected behavior:
  - No `Untracked files:`

* Actual behavior:
  - SQLite 3 database files appeared

```
$ git status
On branch master
Untracked files:
  (use "git add <file>..." to include in what will be committed)

	db/test.sqlite3-0
	db/test.sqlite3-1
	db/test.sqlite3-2
	db/test.sqlite3-3
	db/test.sqlite3-4
	db/test.sqlite3-5

nothing added to commit but untracked files present (use "git add" to track)
$
```

atef-ds · 2019-09-25T17:40:58Z

@eileencodes I have an issue with this solution:
sometimes we need to populate our test db with fake data before running the tests. With parallel testing the new test dbs are empty, so we lose our fake data.
To fix it, I did the following:

my original db is called "db_test" (it contains some fake data),
I added a new one called db_test_template: createdb -T db_test db_test_template
in test_helper:

parallelize(workers: 4)
parallelize_setup do |worker|
  configuration = ActiveRecord::Base.connection_config
  # delete the db
  ActiveRecord::Tasks::DatabaseTasks.drop(configuration.stringify_keys)
  ActiveRecord::Base.establish_connection(configuration)

  configuration[:template] = ENV["POSTGRES_TEMPLATE"]
  # create a new db
  ActiveRecord::Tasks::DatabaseTasks.create(configuration.stringify_keys)
  # now I have a db with the data that I populated before in the original db
end

To run the tests:
POSTGRES_TEMPLATE=db_test_template rails test -v
It works well for me.
@eileencodes can we somehow take into account that we need to keep data that we populated before?

eileencodes added this to the 6.0.0 milestone Feb 5, 2018

eileencodes self-assigned this Feb 5, 2018

eileencodes force-pushed the parallel-testing branch from 5be4dc1 to e012780 Compare February 15, 2018 19:50

eileencodes changed the title ~~WIP: Parallel testing~~ Parallel testing Feb 15, 2018

eileencodes force-pushed the parallel-testing branch 2 times, most recently from 548f1de to 313a2a6 Compare February 15, 2018 20:12

eileencodes force-pushed the parallel-testing branch from 313a2a6 to 26821d9 Compare February 16, 2018 00:22

eileencodes merged commit 7286d81 into rails:master Feb 16, 2018

eileencodes deleted the parallel-testing branch February 16, 2018 13:09

bgentry mentioned this pull request Mar 4, 2018

Support parallel testing (coming in Rails 6) TalentBox/sequel-rails#159

Open

ashishbista mentioned this pull request May 2, 2018

Fast Parallel Tests with Docker and Containers RubyNepal/rorh#24

Closed

david-a-wheeler mentioned this pull request May 4, 2018

Parallelize test suite coreinfrastructure/best-practices-badge#1128

Open

palkan mentioned this pull request Jun 17, 2018

Multithreaded tests vs. testing utils #33146

Closed

FlowFX mentioned this pull request Aug 6, 2018

Use multiple test workers for parallel testing FlowFX/reggae-on-rails#102

Closed

zoras mentioned this pull request Apr 1, 2019

Add support for Rails 6 built-in Parallel Testing rspec/rspec-rails#2104

Open

zoras mentioned this pull request Apr 23, 2019

Support Multithread Execution in RSpec rspec/rspec-core#1254

Open

rmacklin mentioned this pull request May 15, 2019

Error using Rails 6 parallel testing and system tests: ChildProcess::LaunchError: Text file busy - /home/circleci/.webdrivers/chromedriver #36288

Closed

yahonda mentioned this pull request Aug 27, 2019

Ignore SQLite3 database files generated by parallel testing #37053

Merged

chrismaltais mentioned this pull request Sep 16, 2019

[Parallel Testing] byebug crashing on new Rails project #37207

Closed

cjlarose mentioned this pull request Oct 4, 2019

Log messages from parallel test workers are interleaved without any way to distinguish which log messages were written by which worker #37372

Closed

ioquatix mentioned this pull request Jul 6, 2020

Review existing literature ioquatix/turbo_test#2

Open

sanemat added a commit to sanemat/rails-boilerplate-mysql that referenced this pull request Dec 20, 2020

chore(ci): set parallel workers 1

ceeda46
