New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rust YJIT #5826
Rust YJIT #5826
Conversation
$(Q) if [ -f '$(YJIT_LIBS)' ]; then \ | ||
set -eu && \ | ||
echo 'merging $(YJIT_LIBS) into $@' && \ | ||
$(RMALL) '$(CARGO_TARGET_DIR)/libyjit/' && \ | ||
$(MAKEDIRS) '$(CARGO_TARGET_DIR)/libyjit/' && \ | ||
$(CP) '$(YJIT_LIBS)' '$(CARGO_TARGET_DIR)/libyjit/' && \ | ||
(cd '$(CARGO_TARGET_DIR)/libyjit/' && $(AR) -x libyjit.a) && \ | ||
$(AR) $(ARFLAGS) $@ $$(find '$(CARGO_TARGET_DIR)/libyjit/' -name '*.o') ; \ | ||
fi |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not simply include $(YJIT_LIBS)
in $(LIBRUBY_A_OBJS)
or $(INITOBJS)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did try that, but that creates a nested archive. I got some build errors since ld
has trouble finding YJIT symbols when dealing with nested archives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was messing around on x86_64-darwin, and this patch seems to work. Does this seem reasonable? If so, I will throw up a PR to see if we can make it pass on all platforms
diff --git a/template/Makefile.in b/template/Makefile.in
index fee62c8d30..de339847a0 100644
--- a/template/Makefile.in
+++ b/template/Makefile.in
@@ -304,16 +304,7 @@ $(LIBRUBY_A):
@$(RM) $@
@-[ -z "$(EXTSTATIC)" ] || $(PRE_LIBRUBY_UPDATE)
$(ECHO) linking static-library $@
- $(Q) $(AR) $(ARFLAGS) $@ $(LIBRUBY_A_OBJS) $(INITOBJS)
- $(Q) if [ -f '$(YJIT_LIBS)' ]; then \
- set -eu && \
- echo 'merging $(YJIT_LIBS) into $@' && \
- $(RMALL) '$(CARGO_TARGET_DIR)/libyjit/' && \
- $(MAKEDIRS) '$(CARGO_TARGET_DIR)/libyjit/' && \
- $(CP) '$(YJIT_LIBS)' '$(CARGO_TARGET_DIR)/libyjit/' && \
- (cd '$(CARGO_TARGET_DIR)/libyjit/' && $(AR) -x libyjit.a) && \
- $(AR) $(ARFLAGS) $@ $$(find '$(CARGO_TARGET_DIR)/libyjit/' -name '*.o') ; \
- fi
+ $(Q) $(AR) $(ARFLAGS) $@ $(LIBRUBY_A_OBJS) $(INITOBJS) $(YJIT_LIBS)
@-$(RANLIB) $@ 2> /dev/null || true
verify-static-library: $(LIBRUBY_A)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ianks I don't think that's going to work. If you run CI you should see the failure I saw in various different configurations.
Just for posterity, the specific errors I saw came from bigdecimal's extconf failing during linking. In general all of the extensions had trouble linking against object files that happen to depend on internal references to YJIT private symbols.
EDIT: I ran CI with the patch and this is what the error looks like: https://github.com/Shopify/ruby/runs/6131388624?check_suite_focus=true#step:12:772
The extconf log from bigdecimal is not in the GitHub log output but shows the link errors.
/// thankfully those cases are rare and don't cross the FFI boundary. | ||
#[derive(Copy, Clone, PartialEq, Eq, Debug, Hash)] | ||
#[repr(transparent)] // same size and alignment as simply `usize` | ||
pub struct VALUE(pub usize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
For those interested, the docs say this:
The isize and usize types are pointer-sized signed and unsigned integers. They have the same layout as the pointer types for which the pointee is Sized, and are layout compatible with C's uintptr_t and intptr_t types
You will need to install: | ||
- A C compiler such as GCC or Clang | ||
- GNU Make and Autoconf | ||
- The Rust compiler `rustc` and Cargo (if you want to build in dev/debug mode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since 2021 edition is required, we should specify that. Not sure how up to date rustc is for the various package managers
- The Rust compiler `rustc` and Cargo (if you want to build in dev/debug mode) | |
- The Rust compiler `rustc` (>= 1.60.0) and Cargo (if you want to build in dev/debug mode) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: The OP comment notes that the minimum Rust version is 1.60.0
At the moment YJIT requires Rust 1.60.0 or newer.
// | ||
// What's up with the long prefix? The "rb_" part is to apease `make leaked-globals` | ||
// which runs on upstream CI. The rationale for the check is unclear to Alan as | ||
// we build with `-fvisibility=hidden` so only explicitly marked functions end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rustc doesn't do -fvisibility=hidden
. Instead when linking a cdylib it uses a version script to hide exported symbols. This is because any library may end up being linked into a rust dylib in which case all symbols need to be exported for other crates to depend on. In any case you can only get symbol conflicts for symbols not marked as #[no_mangle]
when using the exact same rustc version as the rustc version is hashed into mangled symbols. To further reduce the chance of conflicts you can pass a unique string to -Cmetadata
which is hashed into mangled symbols too. That leaves the standard library of which most symbols are mangled and thus would only conflict when the exact same version is used which is probably fine. There are a couple of symbols in the standard library that aren't mangled though because of cyclic dependencies between libcore, liballoc and libstd. For them changing rustc to mark them as protected would be fine I think. In addition you could use a version script to prevent exporting any symbols starting with __rust_
. This is true for the majority of the unmangled symbols. Only a couple use a different prefix (rust_eh_personality
, __rg_alloc
, __rdl_alloc
, ...). Arguably they should be changed to use the __rust_
prefix too. In addition maybe rustc could start including the rustc version or a hash in them? In any case most of them haven't been changing a lot between versions and may thus be safe to export until rustc gets changed for better hygiene.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added the rustc changes to my personal TODO list. Unless someone else makes these changes first it will probably take a while for me to get to it though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the context around -fvisibility=hidden
, and for working on rustc! I think risk of collision is acceptably low for now since the Rust code is built as a static lib and linked into a DSO for that configuration, and the linker takes care of only including parts of the Rust stdlib that we use? I haven't verified that, though, and I added that to my TODO list. Thanks!
Also I think I understand the rationale mentioned in the comment a bit better now. It looks like -fvisibility=hidden
is only relevant for dynamic libs so the prefixes are for avoiding collisions in the static lib case. (The comment was about practices in the C parts of Ruby)
@@ -0,0 +1,24 @@ | |||
// Silence dead code warnings until we are done porting YJIT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still necessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, still needed for now. We've accumulated a bunch of these and will need to do a pass to address them.
Adopt suggestions from nobu. Thanks! ruby#5826 (comment)
Dearest Maintainer, I am very excited about this change! I checked out the branch and ran clippy (https://github.com/rust-lang/rust-clippy) Thank you for you work on this! I am happy to leave each clippy suggestion as a github suggestion if that helps. Becker |
Good call! Might be worthwhile to add add a clippy check to CI soon so the lints don’t pile up. Could help with adoption and keeping up with best practices |
Adopt suggestions from nobu. Thanks! ruby#5826 (comment)
Adopt suggestions from nobu. Thanks! ruby#5826 (comment)
I would consider setting up a rustfmt configuration as the formatting is not very consistent. |
I do not know much about JITs and was just linked here by a friend. This is very impressive. There a few points that I just want to make sure are known from the view of a Rust dev.:
I may try running this with Miri for laughs later. Anyways thanks for the cool project! I'll be keeping an eye on it. |
Adopt suggestions from nobu. Thanks! ruby#5826 (comment)
Thanks for all the suggestions! I pushed some commits to adopt some small changes, but please understand that I simply can’t get to all of them in this PR, especially the suggestions that generate big diffs. I’ll try to respond to your comments, but sorry in advance if I don’t get to yours. I’m outnumbered and I’m not very good or quick with words. 🙏 Since we use linear history in this repo I would like to keep the commit set in this PR small to make the merge manageable. After the initial merge of course we’re open to adopting improvements! Regarding clippy/rustfmt specifically, we’ll think about setting something up because it looks like a lot of folks are interested. If you are interested in contributing to YJIT, feel free to file PRs in this repo after the merge! |
Very impressive! Quick question for you: I didn't see any license headers in the YJIT files. Should I understand the Rust YJIT code to be under the same license as the rest of the Ruby codebase? I'm interested in potentially using the dynamic assembler (and possibly more) in my own project. |
Right, we use the same license, no change from when YJIT is in C. I'm not a lawyer though!
You are welcome to do that, but I feel there has got to be better alternatives on crates.io that's more convenient to use and more Rusty. Good library design is hard work, and Rust YJIT doesn't do that work because, well, we don't intend to be a Rust library on crates.io! Just be aware. |
As annoying as the initial diff might be, I’m very +1 on doing this as early as possible. I suspect a large majority of rust devs have this configured as part of their workflow / editor configuration, so enabling it will reduce the burden of contributing. If it’s not in this PR, a follow up formatting PR + a quick-merge would work as well! |
In December 2021, we opened an [issue] to solicit feedback regarding the porting of the YJIT codebase from C99 to Rust. There were some reservations, but this project was given the go ahead by Ruby core developers and Matz. Since then, we have successfully completed the port of YJIT to Rust. The new Rust version of YJIT has reached parity with the C version, in that it passes all the CRuby tests, is able to run all of the YJIT benchmarks, and performs similarly to the C version (because it works the same way and largely generates the same machine code). We've even incorporated some design improvements, such as a more fine-grained constant invalidation mechanism which we expect will make a big difference in Ruby on Rails applications. Because we want to be careful, YJIT is guarded behind a configure option: ```shell ./configure --enable-yjit # Build YJIT in release mode ./configure --enable-yjit=dev # Build YJIT in dev/debug mode ``` By default, YJIT does not get compiled and cargo/rustc is not required. If YJIT is built in dev mode, then `cargo` is used to fetch development dependencies, but when building in release, `cargo` is not required, only `rustc`. At the moment YJIT requires Rust 1.60.0 or newer. The YJIT command-line options remain mostly unchanged, and more details about the build process are documented in `doc/yjit/yjit.md`. The CI tests have been updated and do not take any more resources than before. The development history of the Rust port is available at the following commit for interested parties: 1fd9573 Our hope is that Rust YJIT will be compiled and included as a part of system packages and compiled binaries of the Ruby 3.2 release. We do not anticipate any major problems as Rust is well supported on every platform which YJIT supports, but to make sure that this process works smoothly, we would like to reach out to those who take care of building systems packages before the 3.2 release is shipped and resolve any issues that may come up. [issue]: https://bugs.ruby-lang.org/issues/18481 Co-authored-by: Maxime Chevalier-Boisvert <maximechevalierb@gmail.com> Co-authored-by: Noah Gibbs <the.codefolio.guy@gmail.com> Co-authored-by: Kevin Newton <kddnewton@gmail.com>
Thanks to suggestions from Stranger6667 on GitHub. Co-authored-by: Dmitry Dygalo <dmitry@dygalo.dev>
Thanks to suggestion from bjorn3 on GitHub. Co-authored-by: bjorn3 <bjorn3@users.noreply.github.com>
c9f9147
to
e763afa
Compare
let foo = (0 as *const $struct_type); | ||
|
||
unsafe { | ||
let ptr_field = (&(*foo).$field_name as *const _ as usize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI this is a dereference of a null pointer, which is UB. It probably doesn't codegen to an actual read of address 0, so you don't get a fault, but it's UB nonetheless. If possible, it would be good to use the memoffset
crate instead.
I don't know if this was copy-pasted from elsewhere, but bindgen
has the same UB code pattern in it. I wouldn't be surprised if this code has spread all over.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LLVM definitely doesn't read an address of 0 because this is an address computation, no memory is actually being read.
We've been trying to avoid using external crates which is the problem. I would like to use some sort of built in offsetof
operator instead, it would look cleaner, but unfortunately it's only accessible through an external crate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While I greatly worry about pasting unsafe
code from external crates (multiple times, I've fixed the same piece of UB in 2 or 3 crates because they all copy+pasted instead of having a dependency), memoffset
is MIT and the source for offset_of!
is right here: https://github.com/Gilnaa/memoffset/blob/66f4c956a72daabeafa87ba9aef339d9035d78b4/src/offset_of.rs#L90 and I think it would be better to copy+paste than definitely have UB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for letting us know! This macro is actually unused and I'm in the process of deleting it while addressing other dead code lints. We have #[allow(unused_macros)]
at the moment which hides the lint. We'll be sure to try for a UB free implementation when we need one. :)
In December 2021, we opened an issue to solicit feedback regarding the porting of the YJIT codebase from C99 to Rust. There were some reservations, but this project was given the go ahead by Ruby core developers and Matz. Since then, we have successfully completed the port of YJIT to Rust. We are opening this pull request to upstream Rust YJIT, effectively replacing the C version of YJIT.
The new Rust version of YJIT has reached parity with the C version, in that it passes all the CRuby tests, is able to run all of the YJIT benchmarks, and performs similarly to the C version (because it works the same way and largely generates the same machine code). We've even incorporated some design improvements, such as a more fine-grained constant invalidation mechanism which we expect will make a big difference in Ruby on Rails applications.
Because we want to be careful, YJIT is guarded behind a configure option:
By default, YJIT does not get compiled and cargo/rustc is not required. If YJIT is built in dev mode, then
cargo
is used to fetch development dependencies, but when building in release,cargo
is not required, onlyrustc
. At the moment YJIT requires Rust 1.60.0 or newer.The YJIT command-line options remain mostly unchanged, and more details about the build process are documented in
doc/yjit/yjit.md
.The CI tests have been updated and do not take any more resources than before.
The development history of the Rust port is available at the following commit for interested parties: Shopify@1fd9573
Our hope is that Rust YJIT will be compiled and included as a part of system packages and compiled binaries of the Ruby 3.2 release. We do not anticipate any major problems as Rust is well supported on every platform which YJIT supports, but to make sure that this process works smoothly, we would like to reach out to those who take care of building systems packages before the 3.2 release is shipped and resolve any issues that may come up.
Please feel free to comment or ask questions on this pull request if you have any.
AppVeyor CI checks are currently red on the master branch, so they will be red on this PR as well.