Abstraction in Rails

by Jason Swett

If we wanted to, we could, of course, write web applications in assembly code. Computers can understand assembly code just as well as Ruby or Python or any other language.

The reason we write programs in higher-level languages like Ruby or Python is that while assembly language is easy for computers to understand, it’s of course not easy for humans to understand.

High-level languages (Ruby, Python, Java, C++, etc.) provide a layer of abstraction. Instead of having to think about a bunch of low-level details that we don’t care about most of the time, we can specify the behavior of our programs at a higher, more abstracted level. Instead of having to expend mental energy on things like memory locations, we can focus on what our program actually does.

In addition to using the abstractions provided by high-level languages, we can also add our own abstractions. A function, for example, is an abstraction that hides low-level details. An object can serve this purpose as well.
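As a minimal sketch of both ideas in Ruby (the method and class names here are made up for illustration and aren’t from any particular codebase), a function can hide a formula behind a name, and an object can hide both the stored data and the formula:

```ruby
# A function hides the arithmetic behind a descriptive name.
def fahrenheit_to_celsius(fahrenheit)
  (fahrenheit - 32) * 5.0 / 9
end

# An object hides the stored representation as well as the arithmetic.
# Callers ask questions in their own terms and never see the formula.
class Temperature
  FREEZING_POINT_CELSIUS = 0

  def initialize(fahrenheit:)
    @fahrenheit = fahrenheit
  end

  def celsius
    fahrenheit_to_celsius(@fahrenheit)
  end

  def freezing?
    celsius <= FREEZING_POINT_CELSIUS
  end
end
```

A caller can write Temperature.new(fahrenheit: 20).freezing? without ever having to know the conversion formula.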

We’ll come back to the technical details of abstraction. First, let’s use an analogy to get a deeper understanding of what an abstraction is.

Abstraction at McDonald’s

Let’s say I go to McDonald’s and decide that I want a Quarter Pounder Meal. The way I express my wishes to the cashier is by saying “Quarter Pounder Meal”. I don’t specify the details: that I want a fried beef patty between two buns with cheese, pickles, onions, ketchup and mustard, along with a side of potatoes peeled and cut into strips and deep-fried. Neither I nor the cashier cares about most of those details most of the time. It’s easier and more efficient for us to use a shorthand idea called a “Quarter Pounder Meal”.

The benefit of abstraction

As a customer, I care about a Quarter Pounder Meal at a certain level of abstraction. I don’t particularly care whether the ketchup goes on before the mustard or the mustard goes on before the ketchup. In fact, I don’t even really think about ketchup and mustard at all most of the time. I just know that I like Quarter Pounders and that’s what I usually get at McDonald’s, so that’s what I’ll get. For me to delve any further into the details would be to needlessly waste brainpower. To me, that’s the benefit of abstraction: abstraction lets me go about my business without having to give or receive information that’s more detailed than I need or want. And of course the benefit of not having to work with low-level details is that it’s easier.

Levels of abstraction

Even though neither the customer nor the cashier wants to think about most of the low-level details of a Quarter Pounder Meal most of the time, it’s true that sometimes they do want to think about those details. If somebody doesn’t like onions, for example, they can drop down a level of abstraction and specify the detail that they would like their Quarter Pounder without onions. Another reason to drop down a level of abstraction may be that you don’t know what toppings come on a Quarter Pounder, and you want to know. So you can ask the cashier what comes on it and they can tell you. (Pickles, onions, ketchup and mustard.)

The cook cares about the Quarter Pounder Meal one level of abstraction lower. When a cook gets an order for a Quarter Pounder, they have to physically assemble the ingredients, so they of course can’t not care about those details. But there are still lower-level details present that the cook doesn’t think about most of the time. For example, the cook probably doesn’t usually think about the process of pickling a cucumber and then slicing it, because those steps are already done by the time the cook is preparing the hamburger.

What would of course be wildly inappropriate is if I, as the customer, specified to the cashier how thick I wanted the pickles sliced, or that I wanted dijon mustard instead of yellow mustard, or that I wanted my burger cooked medium-rare. Those are details that I’m not even allowed to care about. (At least I assume so. I’ve never tried to order a Quarter Pounder with dijon mustard.)

Consistency in levels

Things tend to be easiest when people don’t jump willy-nilly from one level of abstraction to another. When I’m placing an order at McDonald’s, everything I tell the cashier is more or less a pre-defined menu item or some pre-agreed variation on that item (e.g. no onion). It would probably make things weird if I were to order a Quarter Pounder Meal and also ask the cashier to tell me the expiration dates on their containers of ketchup and mustard. The cashier is used to taking food orders, not answering low-level questions about ingredients. If we jump among levels of abstraction, it’s easy for the question to arise: “Hang on, what are we even talking about right now?” The exchange is clearer and easier to understand if we stick to one level of abstraction the whole time.

Abstraction in Rails

In the same way that abstraction can ease the cognitive burden when ordering a Quarter Pounder, abstraction can ease the cognitive burden when working with Rails apps.

Sadly, many Rails apps have a near-total lack of abstraction. Everything that has anything to do with a user gets shoved into app/models/user.rb, everything that has anything to do with an order gets shoved into app/models/order.rb, and the result is that every model file is a mixed bag of wildly varying levels of abstraction.

Soon we’ll discuss how to fix this. First let’s look at an anti-example.

Abstraction anti-example

Forem, the organization behind dev.to, makes its code publicly available on GitHub. At the risk of being impolite, I’m going to use a piece of their code as an example of a failure to take advantage of the benefits of abstraction.

Below is a small snippet from a file called app/models/article.rb. Take a scroll through this snippet, and I’ll meet you at the bottom.

class Article < ApplicationRecord
  # The trigger `update_reading_list_document` is used to keep the `articles.reading_list_document` column updated.
  #
  # Its body is inserted in a PostgreSQL trigger function and that joins the columns values
  # needed to search documents in the context of a "reading list".
  #
  # Please refer to https://github.com/jenseng/hair_trigger#usage in case you want to change or update the trigger.
  #
  # Additional information on how triggers work can be found in
  # => https://www.postgresql.org/docs/11/trigger-definition.html
  # => https://www.cybertec-postgresql.com/en/postgresql-how-to-write-a-trigger/
  #
  # Adapted from https://dba.stackexchange.com/a/289361/226575
  trigger
    .name(:update_reading_list_document).before(:insert, :update).for_each(:row)
    .declare("l_org_vector tsvector; l_user_vector tsvector") do
    <<~SQL
      NEW.reading_list_document :=
        setweight(to_tsvector('simple'::regconfig, unaccent(coalesce(NEW.title, ''))), 'A') ||
        setweight(to_tsvector('simple'::regconfig, unaccent(coalesce(NEW.cached_tag_list, ''))), 'B') ||
        setweight(to_tsvector('simple'::regconfig, unaccent(coalesce(NEW.body_markdown, ''))), 'C') ||
        setweight(to_tsvector('simple'::regconfig, unaccent(coalesce(NEW.cached_user_name, ''))), 'D') ||
        setweight(to_tsvector('simple'::regconfig, unaccent(coalesce(NEW.cached_user_username, ''))), 'D') ||
        setweight(to_tsvector('simple'::regconfig,
          unaccent(
            coalesce(
              array_to_string(
                -- cached_organization is serialized to the DB as a YAML string, we extract only the name attribute
                regexp_match(NEW.cached_organization, 'name: (.*)$', 'n'),
                ' '
              ),
              ''
            )
          )
        ), 'D');
    SQL
  end
end

Given that dev.to is largely a blogging site, the concept of an article must be one of the most central concepts in the application. I would imagine that the Article model has a lot of concerns, and the 800-plus-line article.rb file, which contains a huge mix of apparently unrelated stuff, shows that it does.

Among these concerns, whatever this trigger thing does is obviously a very peripheral one. If you were unfamiliar with the Article model and wanted to see what it was all about, this database trigger code wouldn’t help you get the gist of the Article at all. It’s too peripheral and too low-level. The presence of the trigger code is not only not helpful, it’s distracting.

The trigger code is at a much lower level of abstraction than you would expect to see in the Article model.

The fix to this particular problem could be a very simple one: just move the trigger code out of article.rb and put it in a module somewhere.

class Article < ApplicationRecord
  include ArticleTriggers
end

The trigger code itself is not that voluminous, and I imagine it probably doesn’t need to be touched that often, so it’s probably most economical to just move that code as-is into ArticleTriggers without trying to improve it.
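Here’s a rough sketch of the shape that module could take. It’s plain Ruby so that it can run on its own: ModelStandIn and its define_trigger method are hypothetical stand-ins for ApplicationRecord and the hair_trigger DSL, and the SQL is abbreviated to a single line. In the real app, the module would hold the original trigger call unchanged.

```ruby
# Hypothetical stand-in for ApplicationRecord plus the hair_trigger gem.
# It only records trigger definitions so the sketch can run without Rails.
class ModelStandIn
  def self.trigger_definitions
    @trigger_definitions ||= []
  end

  # Simplified, made-up replacement for hair_trigger's `trigger` DSL.
  def self.define_trigger(name, &body)
    trigger_definitions << { name: name, sql: body.call }
  end
end

module ArticleTriggers
  # Runs once when the module is included, so the trigger is declared
  # on the including class exactly as if it had been written inline.
  def self.included(base)
    base.define_trigger(:update_reading_list_document) do
      <<~SQL
        -- abbreviated: the real body is the SQL shown earlier
        NEW.reading_list_document := to_tsvector('simple', coalesce(NEW.title, ''));
      SQL
    end
  end
end

class Article < ModelStandIn
  include ArticleTriggers
end
```

In a Rails app the same hook is usually written with ActiveSupport::Concern and its included block. Either way, article.rb is left with a single include line, and the low-level SQL lives somewhere a reader only visits when they actually care about it.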

Another anti-example

Here’s a different example which we’ll address in a little bit of a different way.

There are a couple of methods inside article.rb, evaluate_markdown and evaluate_front_matter.

class Article < ApplicationRecord
  def evaluate_markdown
    fixed_body_markdown = MarkdownProcessor::Fixer::FixAll.call(body_markdown || "")
    parsed = FrontMatterParser::Parser.new(:md).call(fixed_body_markdown)
    parsed_markdown = MarkdownProcessor::Parser.new(parsed.content, source: self, user: user)
    self.reading_time = parsed_markdown.calculate_reading_time
    self.processed_html = parsed_markdown.finalize

    if parsed.front_matter.any?
      evaluate_front_matter(parsed.front_matter)
    elsif tag_list.any?
      set_tag_list(tag_list)
    end

    self.description = processed_description if description.blank?
  rescue StandardError => e
    errors.add(:base, ErrorMessages::Clean.call(e.message))
  end

  def evaluate_front_matter(front_matter)
    self.title = front_matter["title"] if front_matter["title"].present?
    set_tag_list(front_matter["tags"]) if front_matter["tags"].present?
    self.published = front_matter["published"] if %w[true false].include?(front_matter["published"].to_s)
    self.published_at = parse_date(front_matter["date"]) if published
    set_main_image(front_matter)
    self.canonical_url = front_matter["canonical_url"] if front_matter["canonical_url"].present?

    update_description = front_matter["description"].present? || front_matter["title"].present?
    self.description = front_matter["description"] if update_description

    self.collection_id = nil if front_matter["title"].present?
    self.collection_id = Collection.find_series(front_matter["series"], user).id if front_matter["series"].present?
  end
end

These methods seem peripheral from the perspective of the Article model. They also seem related to each other, but not very related to anything else in Article.

To me, these qualities suggest that this pair of methods is a good candidate for extraction out of Article, which would help keep Article at a consistent, high level of abstraction.

“Evaluate markdown” is pretty vague. Evaluate how? It’s not clear exactly what’s supposed to happen. That’s fine though. We can operate under the presumption that the job of evaluate_markdown is to clean up the article’s body. Here’s how we could change the code under that presumption.

class Article < ApplicationRecord
  def evaluate_markdown
    # `self.` is required here: a bare `body_markdown = ...` would only assign a local variable
    self.body_markdown = ArticleBody.new(body_markdown).cleaned
  end
end

With this new, finer-grained abstraction called ArticleBody, Article no longer has to be directly concerned with cleaning up the article’s body. Cleaning up the article’s body is a peripheral concern to Article. Understanding the detail of cleaning up the article’s body is neither necessary nor helpful to the task of trying to understand the essence of the Article model.
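ArticleBody doesn’t exist yet, so here’s a minimal sketch of what it might look like. The two cleanup rules (normalizing line endings and stripping trailing whitespace) are placeholders assumed purely for illustration; the real rules would be whatever the existing markdown-fixing code already does.

```ruby
# A plain old Ruby object that owns the "clean up the body" concern.
# The specific cleanup rules below are assumed placeholders.
class ArticleBody
  def initialize(markdown)
    @markdown = markdown.to_s # tolerate nil, like `body_markdown || ""`
  end

  # The one method Article needs to know about.
  def cleaned
    strip_trailing_whitespace(normalize_line_endings(@markdown))
  end

  private

  # Convert Windows (\r\n) and old Mac (\r) line endings to \n.
  def normalize_line_endings(text)
    text.gsub(/\r\n?/, "\n")
  end

  # Remove spaces and tabs at the end of every line.
  def strip_trailing_whitespace(text)
    text.gsub(/[ \t]+$/, "")
  end
end
```

Article only ever calls cleaned. The individual cleanup steps are private, so they can change without anything in article.rb having to change with them.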

Further abstraction

If we wanted to, we could conceivably take the contents of evaluate_markdown and evaluate_front_matter and change them to be at a higher level of abstraction.

Right now the bodies of those methods seem to deal at a very low level of abstraction. They deal with how to do the work rather than what the end product should be. In order to understand what evaluate_markdown does, we have to understand every detail of what evaluate_markdown does, because it’s just a mixed bag of low-level details.

If evaluate_markdown had abstraction, then we could take a glance at it and easily understand what it does, because everything that happens would be expressed in the high-level terms of what rather than the low-level terms of how. I’m not up to the task of trying to refactor evaluate_markdown in this blog post, though, because I suspect what’s actually needed is a much deeper change and a different approach altogether, rather than just a superficial polish. Changes of that depth require time and tests.
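To make the what-versus-how distinction concrete on a much smaller scale, here’s a toy sketch. It has nothing to do with Forem’s actual code: the method names and the 200-words-per-minute figure are assumptions made up for illustration. Both methods return the same string, but the second one reads as a statement of what happens, with the how pushed down into named helpers.

```ruby
WORDS_PER_MINUTE = 200 # assumed reading speed, for illustration only

# "How": the reader has to reverse-engineer the intent from the mechanics.
def describe_how(markdown)
  words = markdown.to_s.split(/\s+/).reject(&:empty?)
  minutes = [(words.size / WORDS_PER_MINUTE.to_f).ceil, 1].max
  "#{words.size} words, #{minutes} min read"
end

# "What": the body names the steps; the details live in the helpers below.
def describe_what(markdown)
  words = word_count(markdown)
  "#{words} words, #{reading_minutes(words)} min read"
end

def word_count(markdown)
  markdown.to_s.split(/\s+/).reject(&:empty?).size
end

# Round up to whole minutes, with a floor of one minute.
def reading_minutes(words)
  [(words / WORDS_PER_MINUTE.to_f).ceil, 1].max
end
```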

How I maintain a consistent level of abstraction in my Rails apps

I try not to let my Active Record models get cluttered up with peripheral concerns. When I add a new piece of behavior to my app, I usually put that behavior in one or more PORO models rather than an Active Record model. Or, sometimes, I put that behavior in a concern or mixin.

The point about PORO models is significant. In the Rails application that I maintain at my job, about two-thirds of my models are POROs. Don’t make the mistake of thinking that a Rails model has to be backed by Active Record.
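As an illustration, here’s a sketch of a model that has no database table behind it. The names (OrderTotal, LineItem and the cents-based fields) are hypothetical; the point is only that a file like this can live in app/models alongside the Active Record models.

```ruby
# Hypothetical value object standing in for an order's line items.
LineItem = Struct.new(:unit_price_cents, :quantity, keyword_init: true)

# A plain old Ruby object (PORO) model: no Active Record, no table.
# In a Rails app this could live at app/models/order_total.rb.
class OrderTotal
  def initialize(line_items, tax_rate:)
    @line_items = line_items
    @tax_rate = tax_rate
  end

  def subtotal_cents
    @line_items.sum { |item| item.unit_price_cents * item.quantity }
  end

  # Tax is rounded to the nearest whole cent.
  def tax_cents
    (subtotal_cents * @tax_rate).round
  end

  def total_cents
    subtotal_cents + tax_cents
  end
end

items = [
  LineItem.new(unit_price_cents: 1000, quantity: 2),
  LineItem.new(unit_price_cents: 500, quantity: 1)
]
total = OrderTotal.new(items, tax_rate: Rational(8, 100))
puts total.total_cents # prints 2700
```

An Order model that needs a total can delegate to OrderTotal instead of carrying the arithmetic itself, and because OrderTotal doesn’t touch the database it can be understood and tested in isolation.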

Takeaways

  • Abstraction is the ability to engage with an idea without having to be encumbered by its low-level details.
  • The benefit of abstraction is that it’s easier on the brain.
  • Active Record models can be made easier to understand by keeping peripheral concerns out of the Active Record models and instead putting them in concerns, mixins or finer-grained PORO models.

7 thoughts on “Abstraction in Rails”

  1. Guillaume Verger

    I agree with Nicolas, your posts get better and better!
    I really like the food comparison for abstractions, it makes it pretty obvious why it is bad to mix different levels together.

  2. James

    Really great article, the point about the McDonalds abstraction really brings the whole concept together.

    I have a question, you mentioned:

    “In the Rails application that I maintain at my job, about two-thirds of my models are POROs”

    I’m curious why do you use models instead of service objects? Is there some benefit you get from using models not backed by ActiveRecord instead of the service object pattern?

    1. Jared White

      In my opinion, there’s no such thing as “service objects”. What you’re describing is likely the Command pattern, and I feel that it serves a very specific and not at all generic purpose within a codebase. Unfortunately, people will just generate a huge volume of Command pattern objects which are then used basically like procedural functions in an imperative (rather than declarative) manner. It just doesn’t adhere to good design principles of OOP. I know Jason’s written a lot about this topic and I’m really on board with his observations.
