Recently in Ruby Core - March edition

Many Ruby developers don’t have time to keep up with recent developments in Ruby core. To make this easier, I decided to write up the latest activity on the Ruby bug tracker. I won’t summarize bug tickets, only feature proposals. If an important bug ticket comes up, though, I will mention it at the end.

Here are the most important issues being discussed in the month of March:

Documentation improvements for combining multiple hash values

In this issue Chris Seaton proposes improving the documentation so that developers are pointed in the right direction when they want to combine multiple values into a hash code.

When would we need to combine hash values?

If we want instances of a custom class to work as keys in a Hash, we need to implement a custom #hash method (together with a matching #eql?). The VM calls #hash when it needs to put the associated value into the correct bucket.

What issue does this feature solve?

Many developers use suboptimal ways to implement this custom #hash method, which causes unnecessary hash table collisions and, in turn, bad performance.

If this feature is approved, developers will be pointed in the right direction when implementing this method, which is quite easy most of the time:

class MyCustomClass
  def initialize(attrs)
    @x = attrs[:x]
    @y = attrs[:y]
  end

  # Combine both fields by delegating to Array#hash
  def hash
    [@x, @y].hash
  end
end

The issue also proposes optimizing this best practice further inside the VM, resulting in even better performance.
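To make the collision problem a bit more concrete, here is a minimal sketch. The Point class is a hypothetical example of mine and not taken from the issue. It compares a naive XOR combination with the Array#hash approach and adds the eql? method that Hash needs to compare keys:

```ruby
# Hypothetical Point class for illustration; not taken from the issue.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  # Delegate to Array#hash so both fields are mixed into one hash code.
  def hash
    [x, y].hash
  end

  # Hash uses eql? to tell keys apart once they land in the same bucket.
  def eql?(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
  alias == eql?
end

# XOR is commutative, so swapping the fields yields the same hash code.
naive_collision = (1.hash ^ 2.hash) == (2.hash ^ 1.hash)

distances = { Point.new(1, 2) => "near" }
lookup = distances[Point.new(1, 2)] # a different instance with equal fields
miss   = distances[Point.new(2, 1)] # same values in a different order
```

Here lookup finds the stored value although the key is a new object, while the swapped point is treated as a different key.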

Merging WASI based WebAssembly support into CRuby

This issue was proposed quite some time ago, but it came to fruition only recently. (To be honest, this happened before March, but I only came across it recently through this Medium blog post.)

If no issues are found, Ruby 3.2 will support the WASI platform as well. This means it would be possible to run Ruby inside the browser and, more importantly, to deploy a WASI image and use Ruby for edge computing or just to deploy a script easily. Because the image includes the Ruby VM, the computer running the script wouldn’t need Ruby installed. To test this out, you can already download Ruby 3.2 preview 1.

No clobber def

In this issue Ed Mangimelli proposes a way for developers who consume a library to get feedback when they define a new method that overwrites an existing one. For now, the proposal is that this feedback should be an exception raised when the file is parsed.

There has been only a little feedback so far, mainly from people who like the idea, but no core committer has commented on the issue yet. Also, there might already be a pure Ruby solution, so we will see whether this proposal makes it into Ruby.
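One way such a pure Ruby solution could look is sketched below. It relies on the existing method_added hook. The NoClobber module and its ClobberError are hypothetical names of mine and are not part of the proposal. Note that the hook fires after the method has already been replaced, so this sketch can only report the overwrite, not prevent it:

```ruby
require "set"

# Hypothetical guard module; name and behaviour are assumptions for illustration.
module NoClobber
  class ClobberError < StandardError; end

  # Snapshot the methods that exist at the moment the guard is switched on.
  def self.extended(base)
    existing = base.instance_methods(false) + base.private_instance_methods(false)
    base.instance_variable_set(:@no_clobber_seen, Set.new(existing))
  end

  # Ruby calls this hook right after a method has been (re)defined.
  def method_added(name)
    seen = (@no_clobber_seen ||= Set.new)
    raise ClobberError, "#{self}##{name} is already defined" unless seen.add?(name)
    super
  end
end

class Greeter
  extend NoClobber

  def greet
    "hello"
  end
end

clobber_message = nil
begin
  # Reopening the class and redefining greet triggers the guard.
  class Greeter
    def greet
      "overwritten"
    end
  end
rescue NoClobber::ClobberError => e
  clobber_message = e.message
end
```

The design choice here is to opt in per class with extend, so that libraries which redefine methods on purpose are not affected.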

Reverse the order of GC Compaction cursor movement

Ruby 2.7 added manual garbage collection (GC) compaction. This issue proposes a change to the algorithm for two reasons. First, the current way of compacting after GC makes it very hard to move objects between size pools. Second, the compactor is missing some size pools. Here is the current progress.

What is a size pool?

A size pool is a heap whose slots all have one fixed size; an object is placed in the pool whose slot size fits it. Ruby now has multiple size pools, and each of them has its own heap and its own scan/compact cursor, which creates the problem this issue is trying to solve.

Introduce general IO#timeout and IO#timeout= for all (non-)blocking operations

This feature is relevant mainly because of the Fiber scheduler. If this PR gets merged, every non-blocking IO instance can have its own timeout. That way, the developer can set a timeout before each IO operation and handle the resulting error.
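As a rough sketch of how this could feel in practice: the accessor names come from the issue title, while the error class IO::TimeoutError is the name used in Ruby 3.2 and is an assumption relative to the proposal as discussed here. A socket pair stands in for any IO object, and the code falls back gracefully on Rubies without the feature:

```ruby
require "socket"

# A connected pair of sockets; nothing is ever written to the reader side.
reader, writer = UNIXSocket.pair

outcome =
  if reader.respond_to?(:timeout=)
    reader.timeout = 0.1 # seconds; applies to this IO instance only
    begin
      reader.read(1)     # without a timeout this call would block forever
      :read_returned
    rescue IO::TimeoutError
      :timed_out         # the per-instance timeout expired
    end
  else
    :unsupported         # Ruby version without IO#timeout=
  end

reader.close
writer.close
```

Because the timeout lives on the IO instance, it can be changed between calls, for example a short timeout for a handshake and a longer one for a large download.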

What is the Fiber scheduler again?

The Fiber scheduler makes it possible to write an auto-switching event loop. When an IO operation would block, the event loop can run another Fiber in the meantime and therefore increase throughput drastically. The async gem, developed by the same developer who created the scheduler, is the best-known event loop that uses this Ruby feature.

Variable Width Allocation: Arrays

This feature proposal comes with a PR as well. It is not a user-facing enhancement, but part of a long-running improvement to the Ruby VM. The first step was already merged when variable width allocation was implemented for Strings.

What is a variable width allocation?

Right now, most larger objects have their data split across different places in memory. This is bad because the CPU can’t cache it well, which hurts performance. Variable width allocation solves this problem by placing the data directly after the metadata in memory. Once this is done for most object types, performance should improve significantly. For more information, take a look at this talk from the author himself.

Default empty string argument for String#sub and String#sub!

This issue proposes a change to the sub method: the author wants the replacement string to default to an empty string. There has not been much response yet, so we will see where this goes.
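A small sketch of the difference, using a made-up sample string. The second call shows the form the proposal would allow. On Rubies without the change, that call raises ArgumentError, which the example catches:

```ruby
text = "hello cruel world"

# Today the replacement has to be spelled out, even when it is empty.
explicit = text.sub(" cruel", "")

# Omitting the replacement (without a block) is what the proposal would allow.
proposed =
  begin
    text.sub(" cruel")
  rescue ArgumentError => e
    e.class # current behaviour: wrong number of arguments
  end
```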

Named ripper fields

This issue might be very interesting for you if you are working on tooling for Ruby. Ripper is a parser for Ruby scripts. It returns the parsed code as an array of nested arrays. Here is a very simple example:

require 'ripper'

Ripper.sexp("puts 'wuhu'")
=> [:program,
 [[:command, [:@ident, "puts", [1, 0]], [:args_add_block, [[:string_literal, [:string_content, [:@tstring_content, "wuhu", [1, 6]]]]], false]]]]

To work with this, a program needs a lot of implicit knowledge. The author created a pull request (https://github.com/ruby/ruby/pull/5679) for a subclass that uses named nodes instead of relying on the meaning implied by each array index.
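To illustrate what "implicit knowledge" means here, the following sketch reads the method name out of the array by position and then wraps the same node in a Struct with named fields. The Command struct and its field names are hypothetical and are not the API from the pull request:

```ruby
require "ripper"

sexp = Ripper.sexp("puts 'wuhu'")

# Positional access: the reader has to know that index 1 holds the statements
# and that a :command node keeps its identifier at index 1.
command     = sexp[1][0]
method_name = command[1][1]

# Hypothetical named wrapper (not the API from the PR) around the same node.
Command = Struct.new(:type, :message, :arguments)
named   = Command.new(*command)
named_method_name = named.message[1]
```

Both lookups return the same identifier, but only the second one tells the reader what is being looked up.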

Coerce anything callable to a Proc

Recently there has been a trend towards functional programming in the Ruby ecosystem. The dry-rb ecosystem uses it a lot. To make working in an FP style with Ruby even easier, this proposal suggests coercing any callable object into a Proc.

One advantage would be that a Proc can be curried. Currying means that we don’t have to give a function all of its arguments at once. If arguments are missing, we don’t get an error but a new Proc that accepts the remaining arguments. Once all arguments have been supplied, the result is returned. With this technique it might be easier to write good composable objects.
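Here is a short sketch of currying with today's Ruby, followed by the explicit conversion a callable object currently needs. The Doubler class is a hypothetical example, and the proposal's own syntax for the coercion is not shown:

```ruby
# A lambda is already a Proc, so it can be curried directly.
volume  = ->(width, height, depth) { width * height * depth }
curried = volume.curry

with_width  = curried[2]     # still waiting for height and depth
with_height = with_width[3]  # still waiting for depth
result      = with_height[4] # all arguments supplied, so the body runs

# Hypothetical callable object; today it needs an explicit conversion.
class Doubler
  def call(value)
    value * 2
  end
end

doubler_proc = Doubler.new.method(:call).to_proc
doubled      = [1, 2, 3].map(&doubler_proc)
```

Each intermediate step is itself a Proc, which is what makes partially applied functions easy to pass around and combine.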

Enhancements to prettyprint

The documentation for this issue is really great, so I will just briefly summarize it here: the data structure that prettyprint uses gets new features so that the class can be used for a code formatter. Here is the corresponding PR.
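For readers who have not used the library, here is a minimal sketch of the existing prettyprint API, without the proposed additions. The layout helper is a hypothetical function of mine. It shows how groups and breakable spaces let the same description render differently depending on the available width:

```ruby
require "prettyprint"

# Hypothetical helper that lays out a list of words inside brackets.
def layout(words, max_width)
  PrettyPrint.format(+"", max_width) do |q|
    q.group(1, "[", "]") do
      words.each_with_index do |word, index|
        if index > 0
          q.text ","
          q.breakable # a space if the group fits, a line break otherwise
        end
        q.text word
      end
    end
  end
end

wide   = layout(%w[alpha beta gamma], 80) # fits on one line
narrow = layout(%w[alpha beta gamma], 10) # too narrow, so the group breaks
```

A code formatter needs exactly this kind of decision, which is why the proposal builds on this class.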

Merge IO#wait_readable and IO#wait_writable into core

This proposal is connected to the Fiber scheduler again. (While following the Ruby core development I noticed that this is a very active area.) This issue seems to be resolved already, and the conclusion is that the methods will be merged into Ruby core.
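As a quick reminder of what these methods do, here is a minimal sketch using a pipe. The require is wrapped in a rescue on the assumption that Rubies shipping the methods in core may no longer provide the library:

```ruby
begin
  require "io/wait" # needed on Rubies where these methods are not in core yet
rescue LoadError
  # Assumption: newer Rubies provide the methods without this library.
end

reader, writer = IO.pipe

before = reader.wait_readable(0.1) # nothing to read yet, so the wait times out
writer.write("ping")
after  = reader.wait_readable(0.1) # data is waiting now

reader.close
writer.close
```

The first call returns nil after the timeout, while the second returns a truthy value because the pipe has data to read.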

Honorable mentions - Bug Edition

FreeBSD 13 Issue

The Ruby CI has recently been failing on FreeBSD 13. The current maintainer of this platform is busy right now, so anyone who can help with this platform is encouraged to take part in this discussion.