RJIT #7448

Merged
merged 98 commits into from Mar 6, 2023

k0kubun
Member

@k0kubun k0kubun commented Mar 5, 2023

full diff [1]: 55367b3...22d944c

Description

This PR replaces the current implementation of MJIT with a new JIT called "RJIT" [2].

  • RJIT uses a pure-Ruby assembler to generate native code
    • MJIT requires a C compiler at runtime. YJIT requires a Rust compiler at build time. RJIT doesn't require them.
    • This means that RJIT's warmup could be slower than YJIT's, but it's still much faster than MJIT's.
  • The code generated by RJIT looks very similar to YJIT
    • In fact, many methods are direct translations of the Rust code into Ruby.
    • This allows us to simplify the Ruby VM by removing MJIT-specific implementations.
    • We could do some early experiments for YJIT in RJIT too if we want.
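
To make "a pure-Ruby assembler" concrete, here is a minimal sketch of the idea, not RJIT's actual assembler. The class name `TinyAsm` and its methods are hypothetical and exist only for illustration. It encodes two x86-64 instructions (`mov rax, imm64` and `ret`) into a binary string using nothing but `Array#pack`, which is why no C or Rust toolchain is involved at any point.

```ruby
# Hypothetical toy assembler for illustration only; not RJIT's API.
class TinyAsm
  attr_reader :bytes

  def initialize
    # Machine code is accumulated in a binary (ASCII-8BIT) string.
    @bytes = String.new(encoding: Encoding::BINARY)
  end

  # mov rax, imm64: REX.W prefix (0x48), opcode 0xB8,
  # followed by the 8-byte little-endian immediate.
  def mov_rax_imm64(imm)
    @bytes << [0x48, 0xB8].pack('C*') << [imm].pack('Q<')
    self
  end

  # ret: single-byte opcode 0xC3.
  def ret
    @bytes << [0xC3].pack('C')
    self
  end

  # Hex dump of the encoded instructions, handy for inspection.
  def hex
    @bytes.unpack1('H*')
  end
end

asm = TinyAsm.new.mov_rax_imm64(42).ret
puts asm.hex
```

The sketch only produces the bytes. A real JIT additionally has to copy them into executable memory and arrange for the VM to jump there, which this example leaves out.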

See the ticket for motivation and further details: [Feature #19420]

Benchmark

I benchmarked the interpreter, YJIT, RJIT, and MJIT with yjit-bench.

Headline

RJIT's performance is still nowhere near YJIT's, but notably RJIT outperforms MJIT in all headline benchmarks, which are considered the most real-world workloads. RJIT gives a small speedup on railsbench even with yjit-bench's short warmup.

[Figure: output_543]

Other

Sometimes MJIT is still better than RJIT. However, RJIT outperforms both YJIT and MJIT on Optcarrot, which was the benchmark used for the Ruby 3x3 milestone.

[Figure: output_542]

Micro

30k_ifelse and 30k_methods are benchmarks that YJIT is very good at, yet RJIT outperforms YJIT on them. This seems to be because YJIT interleaves inline code and outlined code, a choice made for Code GC and arm64 performance, whereas RJIT doesn't. It's a good reminder of how much code layout matters.

[Figure: output_544]

Footnotes

  1. I merged this branch in multiple batches because pushing hundreds of commits at once pressures our notification system a bit. However, an auto-format commit interrupted the operation, so I needed to resolve the conflict and this PR has only the diff after that commit.

  2. This PR doesn't rename the interface and internal names from MJIT to RJIT yet, but a separate PR will do that soon.

@k0kubun k0kubun force-pushed the rjit branch 6 times, most recently from e37b624 to bac3243 Compare March 6, 2023 06:47
@k0kubun k0kubun marked this pull request as ready for review March 6, 2023 07:21
@k0kubun k0kubun merged commit 22d944c into ruby:master Mar 6, 2023
3 of 7 checks passed
@k0kubun k0kubun deleted the rjit branch March 6, 2023 07:29
@eregon
Member

eregon commented Mar 6, 2023

Nice! Finally a JIT in a nice language^^
Chris Seaton was always saying we should write a Ruby JIT in Ruby, and for sure it feels elegant.

Do you have a link to the full list of commits/diff? -> it's already in the description :)

> I merged this branch in multiple batches because pushing hundreds of commits at once pressures our notification system a bit

Could we just merge them at once (in the future) and let the notification system deal with it at its own pace?
Does the notification system need to notify for every commit? Notifying once per push would probably solve that.

Regarding the Headline benchmarks, would you also have numbers with enough warmup for MJIT? Otherwise it only shows that MJIT takes longer to warm up, without comparing peak performance or how much it can optimize.

@k0kubun
Member Author

k0kubun commented Mar 6, 2023

> Notifying once per push would probably solve that.

Yeah, definitely. I may try that path next time.

> Regarding the Headline benchmarks, would you also have numbers with enough warmup for MJIT?

I used your warmup harness, and these were the medians it printed (full details):

bench          interp (ms)  yjit (ms)  rjit (ms)  mjit (ms)
activerecord           111         67         94        112
hexapdf               1807       1173       1593       1777
liquid-c                44         31         38         44
liquid-render          110         62         83         91
mail                    94         72         88         97
psych-load            1432        986       1174       1202
railsbench            1497        980       1340       1584
ruby-lsp                45         33         40         95
sequel                  89         79        103        194
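
As a reading aid, the medians above can be converted into speedups relative to the interpreter. This is only a sketch: the constant `MEDIANS_MS` and the helper `speedups` are names made up for this example, and the numbers are copied from the table, not re-measured.

```ruby
# Median times in ms, copied from the table above (not re-measured).
MEDIANS_MS = {
  "activerecord"  => { interp: 111,  yjit: 67,   rjit: 94,   mjit: 112 },
  "hexapdf"       => { interp: 1807, yjit: 1173, rjit: 1593, mjit: 1777 },
  "liquid-c"      => { interp: 44,   yjit: 31,   rjit: 38,   mjit: 44 },
  "liquid-render" => { interp: 110,  yjit: 62,   rjit: 83,   mjit: 91 },
  "mail"          => { interp: 94,   yjit: 72,   rjit: 88,   mjit: 97 },
  "psych-load"    => { interp: 1432, yjit: 986,  rjit: 1174, mjit: 1202 },
  "railsbench"    => { interp: 1497, yjit: 980,  rjit: 1340, mjit: 1584 },
  "ruby-lsp"      => { interp: 45,   yjit: 33,   rjit: 40,   mjit: 95 },
  "sequel"        => { interp: 89,   yjit: 79,   rjit: 103,  mjit: 194 },
}.freeze

# Speedup over the interpreter for each JIT in a row.
# Values above 1.0 mean faster than the interpreter.
def speedups(row)
  row.reject { |name, _| name == :interp }
     .transform_values { |ms| (row[:interp].to_f / ms).round(2) }
end

MEDIANS_MS.each do |bench, row|
  s = speedups(row)
  puts format("%-14s yjit %.2fx  rjit %.2fx  mjit %.2fx",
              bench, s[:yjit], s[:rjit], s[:mjit])
end
```

Read this way, RJIT's median beats MJIT's on all nine rows and beats the interpreter on eight of them, with sequel as the exception.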

Also this was the last time I seriously benchmarked MJIT with enough warmup. At least on railsbench, I've only seen 1.02~1.05x speedup with MJIT at best, so RJIT will most likely be faster than MJIT regardless.

@tenderlove
Member

Now I need to make TenderJIT outperform RJIT 🤣

@k0kubun
Member Author

k0kubun commented Mar 6, 2023

Good news is that you've got a lot of new RubyVM::MJIT::C APIs (which will be RubyVM::RJIT::C soon) for making that happen 😄

@jensengrey

This is amazing. Wonderful work.

The benchmark times include JIT overhead, correct? Is there a graph of just the JIT overhead? What Ruby methods does RJIT rely heavily on, and could they be sped up by RJIT? How does one use RJIT to make RJIT itself faster (lower latency)?

@k0kubun
Member Author

k0kubun commented Mar 8, 2023

Full details are in https://gist.github.com/k0kubun/4e31fd289f8e3543dc094421eca90861. I haven't implemented a feature to render a graph to visualize the warmup performance, but you can see how it behaves on the 1st iteration, etc.
