Vinicius Stock

Posted on Apr 9, 2019 • Updated on Feb 10, 2022

Creating Ruby native extensions

#rails #ruby #c #webdev

Note

Since this post was written, the --ext option has been added to Bundler. It automatically configures a new gem as a native extension. Instead of following the outdated steps in this article, simply create the new gem using that option.

bundle gem my_gem --ext

What native extensions for Ruby are

When programming in Ruby, our code is compiled into bytecode instructions and then executed by the Ruby virtual machine (which, in the reference implementation CRuby, is written in C).

Ruby native extensions are libraries written in C against the C API exposed by the Ruby VM. Basically, it's C programming with a ton of functions and macros to interact with the virtual machine. Anything that can be achieved in pure Ruby can also be implemented through that API.

Why they are useful

Native extensions can be significantly faster than pure Ruby for CPU-bound work, making them an excellent alternative for heavy processing. Additionally, they permit tailored memory management thanks to direct access to C functions like malloc and free.

A variety of popular gems are native extensions. When running bundle install, any gem that prints out "building native extensions" to the console is one of them. Some examples are: nokogiri, mysql2 and yajl-ruby.
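To check this on your own machine, the sketch below lists installed gems whose gemspec declares at least one extension. It relies on Gem::Specification#extensions from RubyGems. The helper name native_extension_gems is made up for this example, and the output depends entirely on which gems are installed.

```ruby
require "rubygems"

# Hypothetical helper: returns the sorted, unique names of the given gem
# specifications that declare at least one extension (for example an extconf.rb).
def native_extension_gems(specs = Gem::Specification.to_a)
  specs.reject { |spec| spec.extensions.empty? }.map(&:name).uniq.sort
end

# Prints one gem name per line; the list varies from machine to machine.
puts native_extension_gems
```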

How to create a native extension

Let's walk through the steps for creating a native extension from scratch.

Creating the gem
Gemspec configurations
Adding the compile task
The extconf file
Creating the C extension
Requiring the shared object
Testing the extension

Creating the gem

The first step is generating the gem. The bundle gem command encapsulates that task. In this case, our example gem is called "super".

$ bundle gem super

Gemspec configurations

With the default files created, we need to modify the gemspec configuration to register the extension and also add the rake-compiler gem to be able to compile it in development. The important modifications are:

Adding the "ext" folder to spec.files. The ext folder is where the native extension's source files will live
Adding the extconf.rb file path to spec.extensions. We'll go through what this file is later. For now, just remember that it needs to be inside the path "ext/NAME_OF_EXTENSION/"
Adding the rake-compiler as a development dependency

# super.gemspec

lib = File.expand_path("../lib", __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require "super/version"

Gem::Specification.new do |spec|
  ...
  spec.files = Dir["{app,config,db,lib,ext}/**/*",
                   "MIT-LICENSE",
                   "Rakefile",
                   "README.md"]

  spec.extensions << "ext/super/extconf.rb"

  spec.add_development_dependency "rake-compiler"
  ...
end

Adding the compile task

After adding the rake-compiler gem, the compile task needs to be made available to the project. This is done by adding the following to our Rakefile.

# Rakefile
...
require "rake/extensiontask"

Rake::ExtensionTask.new("super") do |ext|
  ext.lib_dir = "lib/super"
end
...

The extconf file

The extconf.rb file contains the configurations to generate the makefile used to compile the extension. Customizing this file can be quite tricky, involving manipulating global variables, checking the current platform and including external libraries.

It becomes increasingly complex if the extension is split across many C files instead of a single one, for instance. However, the default configuration for a single file is straightforward.
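As an illustration of the platform checks mentioned above, here is a minimal sketch that uses only RbConfig from the standard library, so it runs outside an extconf.rb too. The helper names host_platform and extra_cflags, and the specific flags, are assumptions made for this example. A real extconf.rb would typically append such flags to the $CFLAGS global provided by mkmf.

```ruby
require "rbconfig"

# Hypothetical helper: maps the host OS string reported by RbConfig to a coarse symbol.
# "darwin" is matched first because it also contains the letters "win".
def host_platform(host_os = RbConfig::CONFIG["host_os"])
  case host_os
  when /darwin/i then :macos
  when /mswin|mingw|cygwin/i then :windows
  when /linux/i then :linux
  else :other
  end
end

# Hypothetical helper: picks extra compiler flags per platform.
# The flags are placeholders, not a recommendation.
def extra_cflags(platform = host_platform)
  platform == :windows ? [] : ["-Wall", "-Wextra"]
end

puts "platform: #{host_platform}, extra flags: #{extra_cflags.join(' ')}"
```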

# ext/super/extconf.rb

require "mkmf"

extension_name = "super"
dir_config(extension_name)
create_makefile(extension_name)

Creating the C extension

This is certainly the most challenging part of building native extensions. Learning how to use all the functions and macros made available by the RubyVM takes time and a few gotchas might have you looking at your code with a confused expression on your face.

An example of that is type conversions. A Ruby Float is an object (a VALUE on the C side), not a raw C double, so the appropriate macros need to be applied to handle values. A value coming from Ruby into C needs to be converted into a C double (for example with NUM2DBL). It must then be converted back to a Ruby Float (for example with DBL2NUM) when returning to the Ruby context. Let's avoid type conversions in our super extension for simplicity.

The two mandatory steps of the C extension are: including the ruby.h header and creating an initializer for the extension (which is named Init_NAMEOFEXTENSION). Everything else is the gem's logic.

// ext/super/super.c
#include <ruby.h>

// Entry point that Ruby calls when the shared object is required
void Init_super(void) {}

Let's dive into an example. We'll create the following class (represented here in pure Ruby) using the C extension.

# lib/super/super.rb

module Super
  class Super
    def initialize
      @var = {}
    end
  end
end

The equivalent native extension would be:

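// ext/super/super.c
#include <ruby.h>

VALUE SuperModule = Qnil;
VALUE SuperClass = Qnil;

void Init_super(void);
VALUE super_initialize(VALUE self);

// Called by Ruby when the shared object is required
void Init_super(void) {
    SuperModule = rb_define_module("Super");
    SuperClass = rb_define_class_under(SuperModule, "Super", rb_cObject);
    // The last argument is the number of arguments initialize accepts
    rb_define_method(SuperClass, "initialize", super_initialize, 0);
}

// Implements Super::Super#initialize by setting @var to an empty hash
VALUE super_initialize(VALUE self) {
    rb_iv_set(self, "@var", rb_hash_new());
    return self;
}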

The class Super is now defined with the initialize method as presented in pure Ruby. The types, functions and macros used are described below.

VALUE: the C type that represents any Ruby object
Qnil: Ruby's nil as seen from C
rb_define_module: defines a module
rb_define_class_under: creates a class under a given module. The arguments are the module object, the class name as a string and the class it will inherit from (which is Object in this case)
rb_define_method: defines a method (initialize in this case). The arguments are the class object where the method will be defined, the method name as a string, the C function implementing the method and the number of arguments the method accepts
rb_iv_set: sets an instance variable to a given value. Takes the self object, the variable name and the variable value
rb_hash_new: instantiates a new hash, just like {} in Ruby
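To connect these C calls back to Ruby, here is a rough sketch of what Init_super does, written with Ruby's own metaprogramming methods. The mapping is an approximation for illustration only, not a description of how the C API works internally, and the local variable names are made up for this example.

```ruby
# Roughly rb_define_module("Super").
# Assumption: no Super constant exists yet (the C function would reuse an existing module).
super_module = Object.const_set(:Super, Module.new)

# Roughly rb_define_class_under(SuperModule, "Super", rb_cObject)
super_class = super_module.const_set(:Super, Class.new(Object))

# Roughly rb_define_method(SuperClass, "initialize", super_initialize, 0)
super_class.class_eval do
  define_method(:initialize) do
    # Roughly rb_iv_set(self, "@var", rb_hash_new())
    instance_variable_set(:@var, {})
  end
end

p Super::Super.new.instance_variable_get(:@var) # prints {}
```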

Knowing the available RubyVM functions and macros is essential for creating extensions, but they are undoubtedly hard to memorize. Documentation and examples provide valuable assistance during the process.

Requiring the shared object

The native extension has been written. We can now cross our fingers, compile it and require the resulting shared object. Compilation in development is done using the task we previously imported.

$ rake compile

The result is the shared object file super.so (the file extension varies by platform) inside lib/super, which is the lib_dir configured in the Rakefile. Requiring it in the gem's main file will make all our definitions available.

# lib/super.rb

require "super/version"
require_relative "super/super"

module Super
end
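Because the shared object's file extension differs between platforms, it can help to check for the compiled file before requiring it. The sketch below builds the expected path from RbConfig::CONFIG["DLEXT"]. The helper names are hypothetical, and the "lib/super" directory is assumed from the lib_dir set in the Rakefile above.

```ruby
require "rbconfig"

# Hypothetical helper: builds the path where the compiled extension is expected,
# using the platform's shared library extension (for example "so" or "bundle").
def compiled_extension_path(lib_dir, name, dlext = RbConfig::CONFIG["DLEXT"])
  File.join(lib_dir, "#{name}.#{dlext}")
end

# Hypothetical helper: true when the compiled extension file exists on disk.
def extension_compiled?(lib_dir, name)
  File.exist?(compiled_extension_path(lib_dir, name))
end

unless extension_compiled?("lib/super", "super")
  warn "super extension not found in lib/super; run `rake compile` first"
end
```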

Testing the extension

Our extension is complete and tests can be written to verify it. Once the shared object file is required, everything defined in C is available as regular Ruby. Therefore, extensions are tested just like regular Ruby code.

Here is a possible test for the initialize method using rspec.

# spec/super/super_spec.rb

require "spec_helper"

describe Super::Super, type: :lib do
  describe "#initialize" do
    subject { described_class.new }

    it "sets var as an empty hash" do
      var = subject.instance_variable_get(:@var)
      expect(var).to eq({})
    end
  end
end

Conclusion

Using pure Ruby or C native extensions is a tradeoff. Despite the significant performance advantage, C extensions increase the complexity of reading and writing code when compared to Ruby.

Committing to native extensions must be a conscious decision, and the responsible team has to agree that the performance benefits justify the extra maintenance effort.

Nonetheless, knowing your way around native extensions is yet another useful skill for the Rubyist toolbelt.

Top comments (4)

Ashwin Vaswani • Apr 9 '19

Cool! I haven't created a native extension yet but this got me researching other alternatives and I found Helix, looks like it lets you write type safe performant Ruby classes in Rust. What's really interesting is that you can use any arbitrary Rust crate in your code.. Gonna have to find a reason to use this 😄

Vinicius Stock • Apr 9 '19

I haven't used Helix before, but it does seem interesting. I know there are other alternatives for writing native extensions in Crystal and in Java. I wonder if they differ significantly in performance.

Mikhail Krainik • Apr 16 '19

As told there usehelix.com/roadmap#performance-p...

Performance parity with C
In general, Rust is in the same performance ballpark as C for code written in Rust (sometimes it’s even faster). However, the cost of crossing from Ruby to Rust is still high (compared to Ruby C extensions).

That being said, because using native code is so much faster than Ruby, you can recoup the cost difference pretty quickly. This problem is more important for chatty APIs, or drop-in replacements for Ruby APIs that intrinsically require a lot of communication with Ruby (e.g. to_str).

George Plymale II • Apr 11 '19

Here is another interesting tutorial about how to write C extensions.
