Five Active Record Features You Should Be Using

Active Record is responsible for communicating with the persistence layer by default in Ruby on Rails applications. Using Active Record effectively and efficiently can greatly improve an application’s code.

In Ruby on Rails 4.0, some material changes have been made to Active Record. Understanding these changes, and how they are best utilized is important for any Rails developer.

To help explain these concepts, we can assume a Ruby on Rails application “booksandreviews.com” exists with three models:

class Book < ActiveRecord::Base
  belongs_to :author
  has_many :reviews
end

class Author < ActiveRecord::Base
  has_many :books
end

class Review < ActiveRecord::Base
  belongs_to :book
end

The smart people over at “booksandreviews.com” need to know what the state of their data to build metrics and make sales. Metrics need data and Active Record will be used to fetch that data.

1. Nested Queries

When doing database queries, the fewer the better. Since Active Record is responsible of crafting these queries, it is important to make sure it has all the help it needs. For simple queries this is rarely an issue, but more complex requirements can lead to sub-optimal results.

One day, Tim from sales comes rampaging through the office convinced that there must be a bug in the system. A recent sale for “booksandreviews.com” did not go well and he wants answers. Tim wants an analysis run. He wants all reviews published today, which are for books published in 2015.

Without too much thought, this approach seems reasonable:

book_ids = Book.where(publish_year: '2015').map(&:id)
# => SELECT "books".* FROM "books" WHERE (publish_year = '2015')

reviews = Review.where(publish_date: '2015-11-15',
                       book_ids: book_ids).to_a
# => SELECT "reviews".* FROM "reviews" WHERE "reviews"."publish_date" = '2015-11-15' AND "reviews"."book_ids" IN (1, 2, 3)

This will load the desired books, extract their id’s, and pass that result to the Review query. Not only does this generate two queries, it also wastes memory by creating an array of Book objects to map over and then another array of book_ids. With a large enough list of books, this could cause some serious problems.

Active Record’s where method returns an instance of ActiveRecord::Relation. These relations can be passed to other methods to aid in query construction. With the same request, we can save the map and array creation:

books = Book.where(publish_year: '2015')
# => ActiveRecord::Relation

reviews = Review.where(publish_date: '2015-11-15', book: books).to_a
# SELECT "reviews".* FROM "reviews" WHERE "reviews"."publish_date" = '2015-11-15' AND "reviews"."book_id" IN (SELECT "books"."id" FROM "books" WHERE "books"."publish_year" = '2015')

This still executes two SELECT statements but it nests them to let the database take care of memory allocation and optimization. The book_ids array is replaced with the books relation and is passed to the Review query.

Note: This can reduced to a single query with .joins, but for now we can assume that a nested query is desired.

2. DRY Scopes

Still fuming, Tim demands more information. Now he wants to know the list of all Books published in 2015 which have at least one approved Review. Since Reviews are subjective, they need to be approved in order to maintain the quality that “booksandreviews.com” is known for.

Luckily, a scope has been written on the Review class to accomplish this.

class Review < ActiveRecord::Base
  belongs_to :book

  scope :approved, ->{ where(approved: true) }
end

However, it is Books that we need to return, not Reviews. Repeating the scope definition, a join query can be used for this analysis:

books = Book.where(publish_year: '2015')
            .includes(:reviews)
            .references(:reviews)
            .where('reviews.approved = ?', true )
            .to_a
# => SELECT #long books and reviews column select# FROM "books" LEFT OUTER JOIN "reviews" ON "reviews"."book_id" = "books"."id" WHERE "books"."publish_year" = '2015' AND (reviews.approved = 't')

Books are returned at the cost of duplicating the approved scope. That means that the scope in Review changes, this code will not benefit from that change. The .includes and .references methods are used to ensure that we only return one Book (in the case of many Reviews belonging to the same Book).

The Don’t Repeat Yourself (DRY) principle was created for this exact reason. When identical code is not shared and instead repeated, changes to one version can have dangerous consequences on the other.

The good news is that Active Record provides precisely the medicine for this ailment: .merge.

With .merge, an existing scope can be used in another Active Record query.

books = Book.where(publish_year: '2015')
            .includes(:reviews)
            .references(:reviews)
            .merge(Review.approved)
            .to_a
# => SELECT #long books and reviews column select# FROM "books" LEFT OUTER JOIN "reviews" ON "reviews"."book_id" = "books"."id" WHERE "books"."publish_year" = '2015' AND (reviews.approved = 't')

Great! Now the results are the exact same and the code is DRY.

3. where.not

Typical insatiable Tim is back with yet another request to add to the brand new “totally not vanity metrics dashboard”. Now, he wants to know all the books not published in 2012.

Without even asking why such a silly request is necessary, some more code can be cranked out:

books = Book.where('publish_year != 2012').to_a
# => SELECT "books".* FROM "books" WHERE (publish_year != '2012')

Like before, this code works but could be better. There is some raw SQL in there that the next developer might not understand well enough to manipulate. Whatever the reason, it is best to rely on abstraction instead of explicit SQL.

To help solve this dilemma, the .not modifier has been introduced in Active Record 4.0.

books = Book.where.not(publish_year: 2012).to_a
# => SELECT "books".* FROM "books" WHERE (publish_year != '2012')

The result is the same but look how much nicer that is. Not only is the raw SQL gone, the code is more positive too.

4. first and take

Since “booksandreviews.com” has been around since 2012, chances are it upgraded from Ruby on Rails 3.0 to 4.0. One notable change from Active Record 3 to 4 is the behaviour of .first.

In Ruby on Rails 4.0+, the .first method returns the first row after the table has been ordered by its id.

Author.where(first_name: 'Bill').first
# => SELECT  "authors".* FROM "authors" WHERE "authors"."first_name" = "Bill" ORDER BY "authors"."id" ASC LIMIT 1

This will work fine for every table that has an id column. However, if a table does not need an id column, this method causes a problem.

Despite each Author having an id, complex joins might cause an issue with an implicit ORDER BY on queries.

To alleviate that problem, the take method can be used instead of first:

Author.where(first_name: 'Bill').take
# => SELECT  "authors".* FROM "authors" WHERE "authors"."first_name" = "Bill" LIMIT 1

This behaves in a much more explicit way, returning the same information without a default ordering.

5. .unscoped

During the development life of “booksandreviews.com”, countless modules have been built and gems included. Amidst this chaos, someone must have typed gem install hairball and horribly altered the Author class. This has led to the new guy Mike’s complaint that: “Authors are missing data”.

Mike knows that authors have a first_name but for some reason it is not being returned:

authors = Author.where(last_name: 'Smith').take(5)
authors.map(&:first_name)
# => [nil, nil, nil, nil, nil]

What Mike does not know is that one of those hairball gems added a default scope to all Active Record objects that begin with the letter “A”. However impossible this bug is, it exists and it is ruining Mike’s day.

What Mike needs is the .unscoped method. This method removes all existing scopes on an Active Record relation.

authors = Author.unscoped.where(last_name: 'Smith').take(5)
authors.map(&:first_name)
# => ['Frank', 'Frank', 'Jim', 'Frank', 'Frank']

(Is anyone else concerned that there are four Frank Smith authors?)

With the .unscoped method, all harmful default scopes are removed and the Franks are free.

Queries on Queries

With these five techniques (and probably a lot more), naiive Active Record queries can stay DRY and intuitive. The exhaustive list of what Active Record can provide can be found at Rails Guides.