New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Data class implementation: Simple immutable value object #6353
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
78a6767
to
3e4f762
Compare
rb_define_method(rb_cData, "initialize", rb_data_initialize_m, -1); | ||
rb_define_method(rb_cData, "initialize_copy", rb_struct_init_copy, 1); | ||
|
||
rb_define_method(rb_cData, "==", rb_data_equal, 1); | ||
rb_define_method(rb_cData, "eql?", rb_data_eql, 1); | ||
rb_define_method(rb_cData, "hash", rb_data_hash, 0); | ||
|
||
rb_define_method(rb_cData, "inspect", rb_data_inspect, 0); | ||
rb_define_alias(rb_cData, "to_s", "inspect"); | ||
rb_define_method(rb_cData, "to_h", rb_data_to_h, 0); | ||
|
||
rb_define_method(rb_cData, "members", rb_data_members_m, 0); | ||
|
||
rb_define_method(rb_cData, "deconstruct", rb_data_deconstruct, 0); | ||
rb_define_method(rb_cData, "deconstruct_keys", rb_data_deconstruct_keys, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think all these instance methods should be defined directly on the generated class (and not on Data) to allow future optimizations by specializing them for the members, i.e., https://bugs.ruby-lang.org/issues/16122#note-69 and as I mentioned in an earlier comment on this PR.
Otherwise such optimizations would have to create a new Module which is extra footprint and would be visible in ancestors
.
Could you make a PR to adjust that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zverok It's unfortunate you didn't replied to this before the 3.2 release (maybe you didn't see it?).
The cost is we'll need to create an extra Module per Data subclass for optimization.
Hopefully that turns out to not be too much of a footprint drawback.
Given a pattern like
Measure = Data.define(:amount, :unit) do
def initialize(amount:, unit: '-none-') = super
end
as used in https://bugs.ruby-lang.org/issues/16122#note-68
This actually only works if the initialize
defined by Data.define
is defined on Data and not on the Data subclass (named Measure
here), otherwise the custom initialize
would just overwrite it and super
wouldn't work.
So for optimizing Data methods we will need to do the same trick as for Struct, to include a Module between the subclass and Data
and define optimized methods there, so e.g., keyword arguments are dealt much more efficiently than the generic logic in Data#initialize
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunate you didn't replied to this before the 3.2 release (maybe you didn't see it?).
I didn't.
TBH, communicating via GitHub comment in already closed PR is not the most reliable way to reach me :)
I am receiving GH mentions about everything that happens in ruby/ruby repo, but since Feb 24, I didn't have resource to read through them, so I just recently marked all thousands of them as "read" and started to follow the development anew...
This is not to say I am guaranteed to see mentions unless they are in PRs I am explicitly checking. Not the best mode, probably, but I have no resource to rethink it either 🤷 I am much more accessible on Twitter, email, or in new tickets on the bug tracker, which I also follow but more rigorously.
Also, considering... well, everything, ... I am not guaranteed to be reachable at all, so I believe any discussions about Data
or other features I've introduced are better to maintain on the tracker.
Which, I think, is the best place to discuss the problem you raised, because I don't have a strong opinion on this. Though, I would maintain that the "super
should refer to the default implementation" is a useful rule for any method, not only initialize
. But how to find a compromise between this rule and efficiency, that I am not sure.
rb_undef_alloc_func(rb_cData); | ||
rb_define_singleton_method(rb_cData, "define", rb_data_s_def, -1); | ||
|
||
rb_define_singleton_method(rb_cData, "members", rb_data_s_members_m, 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this redundant with https://github.com/ruby/ruby/pull/6353/files#diff-af52c7b2f2401c72137b502f634cbceb9d313a66897d630930c62f4f8bfb7faaR389 (line 389 above)?
* | ||
* Examples: | ||
* | ||
* Measure = Data.new(:amount, :unit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should not it be Data.define
in this case? Also in some other docs below as well.
Shouldn't Data instances be frozen by default as they are immutable ?
Otherwise I have to explicitly freeze the Data instance, which seems to serve no advantage over Struct in terms of usage with Ractor |
Nevermind, turned out to be limitation of my understanding of |
Thank you 🎉 @zverok! Very happy with the design of Super hopeful 🙏 that we can get the |
This was added in Ruby 3.2.0-preview3 [0]. [0] ruby/ruby#6353
Should this not be the singular "datum" rather than the plural "data"? |
Source: #16122
Example docs rendering: Data
Design and implementation decisions made:
1. The "define data object method is called
Data.define
. As per Matz:There were a few quite reasonable arguments towards
define
in that ticket. To add to them, my PoV:def
is strongly associated in Ruby with "defining a method" and became a separate word in Rubyist's dictionary, not a "generic shortcut for 'define'"2.
define
accepts keyword and positional args; they are converted to keyword args there, and checked ininitialize
The fact that
initialize
accepts only keyword args and checks them makes it easy to define custominitialize
with defaults, for example (it is all explicitly documented, see link above):(This might be enough for not to invent a separate API for default values, but this discussion can be postponed.)
3. I didn't introduce any additional APIs yet (e.g. something like
with
which was discussed). So, the full API of theData
as rendered by RDoc is:I believe it is enough for the introduction, and then the feedback might be accepted for how to make it more convenient.
4. I wrote custom docs for
Data
class instead of copy-pasting/editing docs forStruct
. I have a strong belief that the approach to docs used is more appropriate:Probably it is not a good place to discuss those approaches in general, but at least for the Data class, I believe the chosen approach is superior. If the core team believes it is not true, I think my changes can be merged and then formal docs provided by team members who consider them valuable.