Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add delegated type to Active Record #39341

Merged
merged 16 commits into from May 23, 2020
Merged

Add delegated type to Active Record #39341

merged 16 commits into from May 23, 2020

Conversation

dhh
Copy link
Member

@dhh dhh commented May 18, 2020

Class hierarchies can map to relational database tables in many ways. Active Record, for example, offers purely abstract classes, where the superclass doesn't persist any attributes, and single-table inheritance, where all attributes from all levels of the hierarchy are represented in a single table. Both have their places, but neither are without their drawbacks.

The problem with purely abstract classes is that all concrete subclasses must persist all the shared attributes themselves in their own tables (also known as class-table inheritance). This makes it hard to do queries across the hierarchy. For example, imagine you have the following hierarchy:

Entry < ApplicationRecord
Message < Entry
Comment < Entry

How do you show a feed that has both Message and Comment records, which can be easily paginated? Well, you can't! Messages are backed by a messages table and comments by a comments table. You can't pull from both tables at once and use a consistent OFFSET/LIMIT scheme.

You can get around the pagination problem by using single-table inheritance, but now you're forced into a single mega table with all the attributes from all subclasses. No matter how divergent. If a Message has a subject, but the comment does not, well, now the comment does anyway! So STI works best when there's little divergence between the subclasses and their attributes.

But there's a third way: Delegated types. With this approach, the "superclass" is a concrete class that is represented by its own table, where all the superclass attributes that are shared amongst all the "subclasses" are stored. And then each of the subclasses have their own individual tables for additional attributes that are particular to their implementation. This is similar to what's called multi-table inheritance in Django, but instead of actual inheritance, this approach uses delegation to form the hierarchy and share responsibilities.

Let's look at that entry/message/comment example using delegated types:

# Schema: entries[ id, account_id, creator_id, created_at, updated_at, entryable_type, entryable_id ]
class Entry < ApplicationRecord
  belongs_to :account
  belongs_to :creator
  delegated_type :entryable, types: %w[ Message Comment ]
end

module Entryable
  extend ActiveSupport::Concern

  included do
    has_one :entry, as: :entryable, touch: true
  end
end

# Schema: messages[ id, subject ]
class Message < ApplicationRecord
  include Entryable
  has_rich_text :content
end

# Schema: comments[ id, content ]
class Comment < ApplicationRecord
  include Entryable
end

As you can see, neither Message nor Comment are meant to stand alone. Crucial metadata for both classes resides in the Entry "superclass". But the Entry absolutely can stand alone in terms of querying capacity in particular. You can now easily do things like: Account.entries.order(created_at: :desc).limit(50).

Which is exactly what you want when displaying both comments and messages together. The entry itself can be rendered as its delegated type easily, like so:

# entries/_entry.html.erb
<%= render "entries/entryables/#{entry.entryable_name}", entry: entry %>

# entries/entryables/_message.html.erb
<div class="message">
  Posted on <%= entry.created_at %> by <%= entry.creator.name %>: <%= entry.message.content %>
</div>

# entries/entryables/_comment.html.erb
<div class="comment">
  <%= entry.creator.name %> said: <%= entry.comment.content %>
</div>

Sharing behavior with concerns and controllers

The entry "superclass" also serves as a perfect place to put all that shared logic that applies to both messages and comments, and which acts primarily on the shared attributes. Imagine:

class Entry < ApplicationRecord
  include Eventable, Forwardable, Redeliverable
end

Which allows you to have controllers for things like ForwardsController and RedeliverableController that both act on entries, and thus provide the shared functionality to both messages and comments.

Creating new records

You create a new record that uses delegated typing by creating the delegator and delegatee at the same time, like so:

Entry.create! message: Comment.new(content: "Hello!"), creator: Current.user

If you need more complicated composition, or you need to perform dependent validation, you should build a factory method or class to take care of the complicated needs. This could be as simple as:

class Entry < ApplicationRecord
  def self.create_with_comment(content, creator: Current.user)
    create! message: Comment.new(content: content), creator: creator
  end
end

Adding further delegation

The delegated type shouldn't just answer the question of what the underlying class is called. In fact, that's an anti-pattern most of the time. The reason you're building this hierarchy is to take advantage of polymorphism. So here's a simple example of that:

class Entry < ApplicationRecord
  delegated_type :entryable, types: %w[ Message Comment ]
  delegates :title, to: :entryable
end

class Message < ApplicationRecord
  def title
    subject
  end
end

class Comment < ApplicationRecord
  def title
    content.truncate(20)
  end
end

Now you can list a bunch of entries, call Entry#title, and polymorphism will provide you with the answer.

dhh and others added 2 commits May 18, 2020 13:58
Co-authored-by: Aaron Patterson <aaron.patterson@gmail.com>
Copy link
Contributor

@terracatta terracatta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small nits I saw reading through the docs. Super cool!

activerecord/lib/active_record/delegated_type.rb Outdated Show resolved Hide resolved
Co-authored-by: Jason Meller <jason@kolide.co>
Copy link
Contributor

@victorperez victorperez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice! I just leaved some suggestions for your consideration.

activerecord/lib/active_record/delegated_type.rb Outdated Show resolved Hide resolved
activerecord/lib/active_record/delegated_type.rb Outdated Show resolved Hide resolved
@tomdalling
Copy link

Is the name "delegated type" up for review? I don't see any delegation happening in the code. It looks more like a "subtype", or "secondary type", or something like that.

@nwjlyons
Copy link

It seems like a subclass/subtype. Django calls it Multi-table inheritance

@pixeltrix
Copy link
Contributor

What's the story for creating records? Are you intending to use nested attributes? Also do you start with the Entry or do you create records using the delegated types Comment or Message ?

@dhh
Copy link
Member Author

dhh commented May 19, 2020

@nwjlyons Yeah, that's similar, at least in terms of the table backing. It's a bit different in the sense that this is forming a hierarchy with composition and delegation rather than inheritance. But would be good to reference it.

@pixeltrix Good point. In both BC3 and HEY, we have factories creating the entry/entryable (or recording/recordable) combos, which need to do a lot more work than just creating the link. But the basic pattern is just Entry.create! entryable: Message.new(subject: "hello!"), creator: Current.user.

@dhh
Copy link
Member Author

dhh commented May 19, 2020

@tomdalling The type is delegated through the named role. So Entry delegates its type to entryable, which is then answered by Message or Comment. I mean, it's good to recognize that we're doing these gymnastics because of the impedance mismatch. If this was pure OO, we wouldn't bother.

@Qqwy
Copy link

Qqwy commented May 19, 2020

This is very interesting! From the text as it is currently written, though, it is not entirely clear what the advantage would be of this new technique vs. using plain composition.
Maybe this could be added?

@JanDintel
Copy link

JanDintel commented May 19, 2020

I like the documentation, but I don't completely understand what the query methods return.

Does Entry.all return the Entry objects from the entries table or Message and Comment objects from the messages and comments tables? Same applies for Account.entries.

Based on reading the code it's the first, but it is a bit hard to understand from the documentation.

Copy link
Contributor

@mstroming mstroming left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me what the db tables look like in this case. Can you provide an example of the schema for the Entry, Message and Comment tables?

It seems that polymorphic associations are still utilized? I like that this approach here addresses the drawbacks of STI and Abstract Models, but the lack of referential integrity in polymorphic associations is still concerning.

@victorperez
Copy link
Contributor

It's unclear to me what the db tables look like in this case. Can you provide an example of the schema for the Entry, Message and Comment tables?

It seems that polymorphic associations are still utilized? I like that this approach here addresses the drawbacks of STI and Abstract Models, but the lack of referential integrity in polymorphic associations is still concerning.

There is already an Schema note in the documentation:

# # Schema: entries[ id, account_id, creator_id, created_at, updated_at, entryable_type, entryable_id ]

# # Schema: messages[ id, subject ]

# # Schema: comments[ id, content ]

@mstroming
Copy link
Contributor

mstroming commented May 20, 2020

@victorperez Thanks. I missed that.

@dhh For me, maintaining data integrity in the db is usually more important than utilizing polymorphism here. Is there a way to accomplish the delegated type while allowing the db to enforce referential integrity?

@dhh
Copy link
Member Author

dhh commented May 20, 2020

@mstroming There is not. This relies on polymorphic associations, and you're not going to get vanilla foreign key constraints on those. But you could probably setup a stored procedure somehow to enforce it, if you really wanted to.

@jonathanhefner
Copy link
Member

Is the name "delegated type" up for review? I don't see any delegation happening in the code. It looks more like a "subtype", or "secondary type", or something like that.

That was my initial reaction too. I think because we are used to talking about delegating behavior, whereas this is delegating subtyping. Or in other words, delegating the ability to be extended with specialized behavior.

I feel like the following API could help make the gymnastics more clear:

class Entry < ApplicationRecord
  delegates_specialization_to :entryable
end

class Message < ApplicationRecord
  specialization_of :entryable
end

class Comment < ApplicationRecord
  specialization_of :entryable
end

In the above code, specialization_of would include Entryable, and it could handle invoking the latter half of define_delegated_type_methods. I also think it would be pretty slick if delegates_specialization_to dynamically created (or re-opened) the Entryable module.

@lazaronixon
Copy link
Contributor

lazaronixon commented May 21, 2020

I don't know if I got it right. But how does it differ from the polymorphic association? I always thought that bc3 recordable pattern used it.

@tomdalling
Copy link

I like "specialization", as in "an Entry has a specialization" and "a Comment is a specialization of an Entry".

DelegatedType doesn't fit any definition of delegation that I'm familiar with. There are no attributes or methods being forwarded, and Entrys don't look/behave like Comments using Ruby magic or anything like that. I think calling this "delegation" will confuse a lot of people.

@terracatta
Copy link
Contributor

terracatta commented May 21, 2020

My suggestion for naming

class Entry < ApplicationRecord
  describes_an :entryable, types: %w[ Message Comment ] 
  # `describes_a` would also be a valid keyword if you wanted to say
  # `describes_a :readable` or something else that doesn't start with a vowel.
end

@jonathanhefner
Copy link
Member

I think calling this "delegation" will confuse a lot of people.

Reading my comment again, I can see how delegates_specialization_to might also be read like "delegates Entry#specialization to Entryable#specialization", instead of "delegates the ability to specialize to Entryable".

It might be more clear as delegates_specializing_to, but another option could be specializes_via.

@dhh
Copy link
Member Author

dhh commented May 22, 2020

I think I might inadvertently have shared plans for a bike shed and opened the floor to which color it should be painted. My sincerest apologies 😄

"Entry delegates it's type to the entryable role, of which we currently have Message and Comment". The delegation methods, like entry.message?, is a type predicate that's answered by the entryable (even if we shortcut it for performance by going straight to the polymorphic class column).

I shall expand the example to show how half the point here is to erect a delegation scheme. For example, in HEY, we'd delegate #content to the entryable, so Entry#content ends up delegating to Entryable#content, which is answered by Message using the rich text, and by Comment using plain text (you need to make sure they duck adequately, of course).

Anyway, I appreciate the feedback, but I didn't find any of the suggestions offered more compelling than what we have here. I'll get this finished up this weekend! ✌️

@tomdalling
Copy link

I'll leave my suggested naming (using "specialization") here: https://gist.github.com/tomdalling/bc585d824466d102e635d5d03502e095

I appreciate that your time is precious @dhh so no need to reply, and that the decision is ultimately yours, of course. I just thought that if there was any time to improve the naming it would be now, before rolling it out to thousands of devs/projects. I don't think of that as bikeshedding, personally.

dhh and others added 3 commits May 23, 2020 09:29
Co-authored-by: Víctor Pérez <v.rodriguez@volcanic.co.uk>
Co-authored-by: Víctor Pérez <v.rodriguez@volcanic.co.uk>
@dhh dhh changed the title The dash of sugar used to support the key architectural pattern in BC3 + HEY Add delegated type to Active Record May 23, 2020
@dhh dhh merged commit fd8fd4a into master May 23, 2020
@dhh dhh deleted the delegated-type branch May 23, 2020 22:14
@kaspth
Copy link
Contributor

kaspth commented May 23, 2020

Obviously the bikeshed should be Ruby red 😄🙌

Woot, nice! 🤘🥳

@aamdani
Copy link

aamdani commented May 23, 2020

In the final example around adding further delegation, with title for entryables, Message and Comment, could we add the include Entryable line somewhere in the class definitions for clarity's sake? Unless I've misunderstood the implementation details.

This is in order for the class that's signing up to include this "virtual interface" to be responsible for implementing the delegated :title message, right?

@matteeyah
Copy link
Contributor

matteeyah commented May 24, 2020

Since this is currently in master, this'll be included in Rails 6.1, right?

@dhh
Copy link
Member Author

dhh commented May 24, 2020 via email

# You create a new record that uses delegated typing by creating the delegator and delegatee at the same time,
# like so:
#
# Entry.create! message: Comment.new(content: "Hello!"), creator: Current.user

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Entry.create! message: Comment.new… is kinda confusing, did you mean Entry.create! comment: Comment.new…?

Also, does it dynamically imply and set the entryable_id and entryable_type: “Comment”?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Further up in the thread, @dhh mentions the format as being: Entry.create! entryable: Message.new(subject: "hello!"), creator: Current.user which seems to make more sense.

@matteeyah
Copy link
Contributor

matteeyah commented May 26, 2020

@dhh Definitely! I'm really glad that you released this design pattern. I was able to extract redundant columns out of a database table. Refactoring this made me sleep easier at night, so thanks for that!

I did a spike to come up with a PoC for introducing this into the codebase of a product that I'm working on (https://github.com/Respondo/respondo/pull/225) by monkey-patching ActiveRecord with delegated types. It's amazing how can a small code change in ActiveRecord facilitate a big change in the domain model.

@robertomiranda
Copy link
Contributor

I've published a Backport gem for Rails >= 5

mirror source code: https://github.com/robertomiranda/delegated_type
published here: https://rubygems.org/gems/delegated_type

@jhirn
Copy link

jhirn commented Nov 6, 2020

I've been achieving this in psql using HSTORE, then JSONB, columns and STI for heterogeneous attributes for a very long time.

I'm happy to have official support for this in Rails as I’d always prefer and off-the-shelf over a homegrown solution, but would love to to hear thoughts on multi-table vs JSON to avoid sparsely populated tables. Keep I mind this gem was only built to support my original needs and is nowhere near as polished as the solution above.

The only knee jerk curiosity I have is the efficiency of unioning tables vs querying a single table when retrieving a heterogeneous collection of subtypes but without experimenting a little. The API for this PR is definitely nicer in many regards, but at the physical layer I'm not so sure.

Thanks for this and STI to begin with. Due to JSON columns, I overuse and defend that feature to a point I'm not even sure I should talk about publicly 😇

https://github.com/madeintandem/jsonb_accessor

https://madeintandem.com/blog/2013-3-single-table-inheritance-hstore-lovely-combination/

@twada
Copy link

twada commented Nov 9, 2020

@dhh From the perspective of mapping objects and tables, "class-table inheritance" mentioned here is actually a Concrete Table Inheritance, Delegated types newly introduced here looks like a Class Table Inheritance (CTI).

I'm very excited to use this great feature!

@adammiribyan
Copy link

adammiribyan commented Nov 14, 2020

In case anybody finds this helpful, a few extra pieces that fit nicely with the delegated types concept:

N+1 prevention DSL sugar

module Entryable
  TYPES = %w[ List Spot ]
end
class Entry < ApplicationRecord
  delegated_type :entryable, types: Entryable::TYPES

  scope :with_entryables, -> { includes(:entryable) }
end

Automatically matching entry's url helper based on the entryable_type

Rails.application.routes.draw do
  direct :entry do |model|
    route_for model.entryable_name, model
  end
end
class SpotsController < ApplicationController
  def create
    @entry = Entry.create! entryable: Spot.new(params.require(:spot).permit(:address))
    redirect_to @entry # Redirects to e.g. /spots/47, with 47 being the newly created Entry id.
  end
end

@slimdave
Copy link

slimdave commented Dec 6, 2020

@mstroming There is not. This relies on polymorphic associations, and you're not going to get vanilla foreign key constraints on those. But you could probably setup a stored procedure somehow to enforce it, if you really wanted to.

Coming late to this I'm afraid, but referential integrity cannot be enforced with stored procedures/triggers etc. Those approaches are always subject to race conditions because user sessions can only see other sessions' committed changes.

@kurt-mueller-osumc
Copy link

kurt-mueller-osumc commented Dec 16, 2020

In the schema outlined in the original post, I could have an Entry that has an entryable_id that doesn't exist in the table specified by the entryable_type. I could also create Messages and Comments with no backing Entry record.

In order to attempt to make illegal states unrepresentable, having an entry_idcolumn on messages and comments with a foreign key constraint ensures that a Message or Comment cannot exist without a backing Entry. I don't need any stored procedures to enforce that, only database constraints.

@mstroming
Copy link
Contributor

In the schema outlined in the original post, I could have an Entry that has an entryable_id that doesn't exist in the table specified by the entryable_type. I could also create Messages and Comments with no backing Entry record.

In order to attempt to make illegal states unrepresentable, having an entry_idcolumn on messages and comments with a foreign key constraint ensures that a Message or Comment cannot exist without a backing Entry. I don't need any stored procedures to enforce that, only database constraints.

The existence of polymorphic associations does not allow the database to enforce referential integrity, however, because no foreign keys can be defined.

@kurt-mueller-osumc
Copy link

kurt-mueller-osumc commented Dec 16, 2020

In the schema outlined in the original post, I could have an Entry that has an entryable_id that doesn't exist in the table specified by the entryable_type. I could also create Messages and Comments with no backing Entry record.
In order to attempt to make illegal states unrepresentable, having an entry_idcolumn on messages and comments with a foreign key constraint ensures that a Message or Comment cannot exist without a backing Entry. I don't need any stored procedures to enforce that, only database constraints.

The existence of polymorphic associations does not allow the database to enforce referential integrity, however, because no foreign keys can be defined.

If entry_id existed on messages and comments and pointed to the entries table, I definitely could have a foreign key contraint because it's not a polymorphic association. And a message or comment could not exist without a corresponding entry record.

@slimdave
Copy link

In the schema outlined in the original post, I could have an Entry that has an entryable_id that doesn't exist in the table specified by the entryable_type. I could also create Messages and Comments with no backing Entry record.
In order to attempt to make illegal states unrepresentable, having an entry_idcolumn on messages and comments with a foreign key constraint ensures that a Message or Comment cannot exist without a backing Entry. I don't need any stored procedures to enforce that, only database constraints.

The existence of polymorphic associations does not allow the database to enforce referential integrity, however, because no foreign keys can be defined.

If entry_id existed on messages and comments and pointed to the entries table, I definitely could have a foreign key contraint because it's not a polymorphic association. And a message or comment could not exist without a corresponding entry record.

True, but it's a fundamentally different relationship in terms of the possible cardinalities of the join in that case.

@kurt-mueller-osumc
Copy link

True, but it's a fundamentally different relationship in terms of the possible cardinalities of the join in that case.

Good point. In my example, cardinalities would be fundamentally different: an Entry could have_many :messages and have_many :comments. In the original example, a Message could have_many :entries, etc. In either case, there's no way to enforce the cardinalities at the database level (not that I'm aware of).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet