The Ruby Spec Suite Compatibility Report

Introduction

For a long time I have wanted to show visually how compatible Ruby implementations are with ruby/spec. This is not as easy as it sounds. First, we would like the same version of ruby/spec for all tested Ruby implementations, but in general each Ruby implementation uses a different version (except just after they are synchronized, which happens once a month). Second, we would like the same Ruby compatibility version (e.g., 2.6), but some Ruby implementations do not currently target 2.6. And finally, the number of specs (that is, of "it" examples) on a given platform should be fairly constant, which required a few fixes in ruby/spec.

As a quick recap, ruby/spec and CRuby tests are the two major test suites for Ruby compatibility. ruby/spec has the advantage that each spec example has a description of the specific behavior it tests. CRuby tests, on the other hand, often require reverse-engineering a test to understand which behavior(s) it intends to test, and they are more coarse-grained. In this post I focus on ruby/spec.

Approach

In this first report I ran the specs on CRuby 2.6.6, TruffleRuby, JRuby, Opal and Artichoke on Linux. I used development versions because I want to use the same specs for all implementations when possible, along with the corresponding tags – that is specs marked as failing – which are only available from the corresponding repository’s development branch.

The specific development versions and command lines used are shown in this gist. As the total for each group, we use the number of examples that CRuby runs, which is always equal to or slightly higher than the number of examples run on the other implementations.

The setup for running specs on Opal and on Artichoke is more complex – due to typically not having access to the real filesystem – so I used the version of ruby/spec that they import, which is older and therefore has fewer specs. Opal currently targets Ruby 2.5 and Artichoke is still an early Ruby implementation. They both use an include list of specs to run, and so might pass more specs than the ones run in CI. So please take the numbers for those Ruby implementations with a grain of salt.

Below you can see two totals, because some implementations do not support the Ruby C-API.

Without further ado, here is the data.

Results

Group                       Specs   CRuby    TruffleRuby  JRuby    Opal     Artichoke
Command-line                  141   100%     87.94%       73.76%   0.00%    0.00%
Language                     2367   100%     97.85%       97.17%   69.84%   0.00%
Core Library                20841   100%     96.80%       93.61%   42.93%   6.56%
Standard Library             6891   100%     97.62%       85.97%   7.26%    4.75%
Security                       40   100%     100%         92.50%   0.00%    0.00%
TOTAL without C-API specs   30280   100%     97.03%       92.06%   36.66%   5.59%
  (passing examples)                30280    29382        27875    11101    1694
C-API                        1325   100%     97.36%       0.00%    0.00%    0.00%
TOTAL                       31605   100%     97.05%       88.20%   35.12%   5.36%
  (passing examples)                31605    30672        27875    11101    1694
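
The percentages in the two TOTAL rows follow directly from the passing counts. As a quick sanity check, here is a minimal Ruby sketch that recomputes them. Only the numbers come from the table; the constant and method names are mine and are not part of ruby/spec or MSpec.

```ruby
# Counts copied from the TOTAL rows of the table above.
# All names here are illustrative, not part of ruby/spec or MSpec.
TOTAL_WITHOUT_CAPI = 30_280
TOTAL_WITH_CAPI    = 31_605

PASSING_WITHOUT_CAPI = {
  "TruffleRuby" => 29_382,
  "JRuby"       => 27_875,
  "Opal"        => 11_101,
  "Artichoke"   => 1_694,
}

# Among the alternative implementations only TruffleRuby passes C-API specs,
# so the other passing counts are identical in both totals.
PASSING_WITH_CAPI = PASSING_WITHOUT_CAPI.merge("TruffleRuby" => 30_672)

# Percentage of passing examples, rounded to two decimals as in the table.
def percent(passing, total)
  (100.0 * passing / total).round(2)
end

PASSING_WITHOUT_CAPI.each do |impl, passing|
  with_capi = PASSING_WITH_CAPI.fetch(impl)
  puts format("%-12s %6.2f%% without C-API, %6.2f%% with C-API",
              impl,
              percent(passing, TOTAL_WITHOUT_CAPI),
              percent(with_capi, TOTAL_WITH_CAPI))
end
```

This also shows why JRuby, Opal and Artichoke have a lower percentage in the second total: the passing count stays the same while the number of specs grows by the 1325 C-API specs.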

Interpretation

The interpretation that follows is mine, and I tried to keep it as neutral and factual as possible. I apologize if there is any inaccuracy or subjective opinion: let me know, I am happy to update the blog post based on justified comments.

As expected, CRuby passes all specs. This is by design: ruby/spec only includes specs that pass on CRuby, since it is the standard implementation. There is a rare exception for specs specifying the correct behavior for known bugs in CRuby (ruby_bug in MSpec), which we can ignore here (there are only 3 ruby_bug specs currently in ruby/spec).

We see that TruffleRuby passes 97% of all specs, which is quite impressive. To be fair, TruffleRuby focuses more on ruby/spec than on CRuby tests because of the advantages mentioned above.

Next, we have JRuby passing 88% or 92% of all specs, depending on whether you count C-API specs. That’s impressive as well, notably with high scores on language and core specs. It is worth noting JRuby is currently working on finishing Ruby 2.6 support.

Then we have Opal, which passes a sizable number of language, core and library specs. Opal compiles to JavaScript, so unlike JRuby and TruffleRuby it does not aim to implement lower-level system functionality. Opal is also typically run in browsers and not on the command line, so it does not run command-line specs.

Finally, Artichoke, a fairly new Ruby implementation, passes some core and library specs. Based on MRuby, which implements a subset of Ruby, it still has a long way to go to achieve compatibility with CRuby. Update (February 2021): Artichoke now passes an estimated 22% of core library specs when trying to run all core specs with this PR; however, those are not run in CI and might regress.

Note that specs from ruby/spec are not necessarily representative of general compatibility with the entire Ruby ecosystem, even though the two are obviously related. For instance, there might be far more spec examples for a given method than usages in Ruby gems, or the opposite. There are Ruby semantics that are not yet covered in ruby/spec: it is an ongoing effort to improve coverage by way of contributions to ruby/spec from Ruby implementations and from external contributors. In general, when an alternative implementation receives a bug report, its developers either ensure there is already a spec or add one, so the ruby/spec coverage improves over time.

I took a look at CRuby tests for more context: in total there are 20587 CRuby tests when run locally on CRuby 2.6.6. TruffleRuby currently runs and passes 10467 CRuby tests in CI (about 51%). JRuby currently runs and passes 8133 CRuby tests in CI (about 40%). Opal and Artichoke do not seem to run CRuby tests currently. So it seems clear that alternative Ruby implementations focus more on ruby/spec, but also that there are still a lot more CRuby tests they could run. Some CRuby tests are, however, CRuby-specific and not applicable to any other Ruby implementation.

Conclusion

I am looking forward to Ruby implementations improving their compatibility and passing a higher percentage of ruby/spec. I plan to make such a report again when that happens.

Other Ruby Implementations

I would like to report on other Ruby implementations but currently it seems difficult.

Any help running ruby/spec on these implementations with the stats formatter or a compatible output is welcome. The stats formatter has a very simple format: a YAML file containing a Hash that maps each spec file to its counts of examples, errors, failures and tagged specs:

---
language/some_spec.rb:
  :examples: 10
  :errors: 1
  :failures: 2
  :tagged: 3
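
To make the format concrete, here is a minimal sketch of how such a file could be aggregated in Ruby. It assumes the keys are Ruby symbols, as in the sample above, and it assumes that passing examples can be derived as examples minus errors minus failures; the formatter's real accounting may differ. The method name summarize is hypothetical.

```ruby
require "yaml"

# Hypothetical helper: sums the per-file counters of a stats file.
# Assumption (not confirmed by the source):
#   passing = examples - errors - failures
def summarize(yaml_text)
  # Keys such as :examples are Ruby symbols, so Symbol must be permitted.
  stats = YAML.safe_load(yaml_text, permitted_classes: [Symbol])
  totals = Hash.new(0)
  stats.each_value do |counts|
    counts.each { |key, value| totals[key] += value }
  end
  totals[:passing] = totals[:examples] - totals[:errors] - totals[:failures]
  totals
end

sample = <<~STATS
  ---
  language/some_spec.rb:
    :examples: 10
    :errors: 1
    :failures: 2
    :tagged: 3
STATS

p summarize(sample)
```

Under that assumption, the sample file above counts as 7 passing examples out of 10.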

Acknowledgments

I used the technique from this blog post to draw the percentage charts.