One of the most interesting design challenges in a programming language interpreter or VM is what kind of memory management API to offer to native extensions.

This issue is more or less invisible unless you’re writing a native extension. When you are simply writing a program in Ruby, Python, Lua, etc., you don’t usually care how the memory management is implemented. You only care is that you can create objects with reckless abandon and that they will get cleaned up for you, somehow. The people who write the VM get a lot of latitude in how to actually implement this cleanup because the implementation is so hidden from user programs. Different VMs can use completely different strategies, even for the same language! For example, CPython uses reference counting whereas PyPy uses tracing, even though both are VMs for Python.

But when the VM offers a C/C++ API for native extensions, it usually exposes more implementation details. Memory management tasks that would be performed automatically if you were writing directly in Python (for example) need to be performed manually at the C/C++ level. For example, CPython knows how to increment and decrement refcounts when you say x = y in Python. But at the C/C++ level you must explicitly increment and decrement the refcounts yourself.

A native extension can create pointers to VM-owned memory (like a char* pointer to data from a string object), and it needs some way of knowing how long this data will stay alive. In many native extension APIs, you can even get a pointer directly to a VM object, and you need a way of ensuring that the target object isn’t deleted while you are using it!

This is an interesting design problem. What makes it especially interesting is how many different ways there are to solve it. I know of native extension APIs for Ruby (MRI), JavaScript (V8), Lua, PHP, and Python (CPython), and literally no two VMs in this list solve the problem the same way! When I discovered this, the situation seemed ripe for a series of blog articles exploring this problem and all of the solutions.

There is a fair bit of information to cover for each VM, so I’ve split this article into parts. This article will explore Ruby, specifically the C VM for Ruby known as MRI (Matz’s Ruby Interpreter).

Representing Ruby objects in C

Let’s start by painting a quick picture of how we refer to Ruby objects from a C extension. The ruby.h header defines a type called VALUE which we can use to refer to any Ruby object from C:

I put the Ruby object in “Ruby VM space,” which means that it’s owned/managed by Ruby, and its C-level representation is private to the Ruby interpreter.

If the Ruby object is of type String, then it will be represented internally by Ruby as an RString, and we can get a pointer to the string data by writing:

void foo() {
  VALUE str = rb_str_new2("Hello, world!");
  const char *data = RSTRING_PTR(str);
}

Now we have something that looks like this:

Often when we are writing a C extension, we want to be able to attach some custom C data (like a struct) to a Ruby value. Ruby provides APIs for creating Ruby objects that are just wrappers around some C data. These internally use Ruby VM object types RData and RTypedData (depending on whether you use the old or new API for this).

As a simple, example, suppose we wanted to implement a “pair” Ruby object in C that points to two Ruby values:

/* Our C representation of a pair of Ruby objects. */
typedef struct {
  VALUE first;
  VALUE second;
} RubyPair;

/* Function to construct a new RubyPair, which will be
 * wrapped in an Ruby "RTypedData" object. */
VALUE MakeRubyPair(VALUE first, VALUE second) {
  RubyPair *pair;

  pair = ALLOC(RubyPair);
  pair->first = first;
  pair->second = second;

  /* cRubyPair is the Ruby class for this, and specifies the
   * object's behavior in Ruby.  It's outside the scope of
   * this article.
   *
   * We'll describe RubyPair_type below. */
  return TypedData_Make_Struct(cRubyPair, RubyPair_type, pair);
}

It would look something like this:

Ruby’s Garbage Collector

Ruby uses a tracing garbage collector. The basic idea behind tracing GC is to traverse the graph of live objects, starting from the “root” objects (objects that are being directly used) and following all known references, which will determine which objects are still reachable. Unreachable objects can then be freed. There are lots of complicated variations on this theme that try to avoid scanning the entire heap every time, but the basic principle remains the same.

There are a couple key questions that a tracing garbage collector needs to be able to answer:

  1. what are the “root objects” (objects directly in use)?

  2. given an object, what other objects does it directly reference (so we can mark them too)?

Let’s start by addressing #2.

Finding References Between Objects

For types that are part of the Ruby VM (Array, Hash, etc) Ruby already knows how to scan them to find the VALUE pointers inside. Ruby can trace references between objects in “Ruby VM space” without our help.

But what about types we create in a C extension, like RubyPair above? If the GC “mark” phase reaches this object, it needs to know how to follow these two Ruby object pointers. How will it do this, since it doesn’t know how to find the VALUE pointers in our RubyPair struct?

Ruby’s answer is to put this responsibility directly in the hands of the extension author. Anyone who implements a Ruby data type in C that can reference other Ruby objects must implement a “mark” function that visits them. For example, here is a mark function that would work for RubyPair.

void RubyPair_mark(void *_self) {
  RubyPair *self = _self;

  rb_gc_mark(self->first);
  rb_gc_mark(self->second);
}

void RubyPair_free(void *self) {
  xfree(self);
}

rb_data_type_t RubyPair_type = {
  "RubyPair",
  { RubyPair_mark, RubyPair_free, NULL },
};

Ruby calls RubyPair_mark whenever the “mark” phase of garbage collection reaches one of our RubyPair objects. It counts on us to call rb_gc_mark() for every Ruby object we reference. It will then free things that can’t be reached from any mark function. If you make a mistake and forget to call rb_gc_mark() on one of your references, it may get freed even though you are still referencing it. You will probably SEGV the interpreter next time you try to access that object!

Finding GC Roots

So how does the VM find GC roots? This is where things get tricky. Ruby already knows about any live references from Ruby-level variables (local, global, or instance variables). That part is easy. But what about Ruby objects that are only referenced from C? For example, you could write something like this in C:

VALUE RubyPair_inspect(VALUE self) {
  VALUE str = rb_str_new2("<");
  str = rb_str_append(str, rb_inspect(self->first));
  str = rb_str_cat2(str, ", ");
  str = rb_str_append(str, rb_inspect(self->second));
  str = rb_str_cat2(str, ">");

  return str;
}

Any of those calls to rb_str_* functions could trigger a garbage collection cycle. How will the GC know to avoid collecting str itself?

Ruby’s solution to this problem is to scan the C stack and registers (!!) looking for VALUE pointers. This is a highly platform-specific and non-portable thing to do. Ruby needs some way of deciding what addresses signify the begin and end of the stack. And as it scans, it needs to skip/discard any data that isn’t a VALUE – this is somewhat heuristic by its nature. Despite all these caveats, the scheme must work well enough, since Ruby has been using it for over 10 years now. The details of how this is done are beyond the scope of this article. But if you are interested in knowing more, there is some fun reading in gc.c.

If the GC is triggered inside our RubyPair_inspect function above, Ruby will find the str local variable and know to treat it as a GC root, since the C stack references it. This will keep str alive for as long as the function runs. And if we get a pointer to the string’s internal data with RSTRING_PTR(str), we know it will live as long as the String object does.

It is also possible for VALUE pointers to be in global variables . Ruby provides the functions rb_gc_register_address(VALUE*) and rb_gc_unregister_address(VALUE*) that you can call with the address of any variable of type VALUE. This will probably be a global variable (while it could technically be on the heap, I’ve never seen any reason to do this). Ruby will read the variable and treat it as a root every time the GC runs its “mark” phase.

We can put this all together to get a full picture of Ruby’s memory management model for native extensions:

This memory model and API are convenient to use from C. Functions like RubyPair_inspect are nice to write and are GC-safe without requiring any manual memory management bookkeeping.

The trickiest part is making sure you get the mark() function right. To test this properly you need to make sure you trigger some GC cycles while your user-defined objects are alive, and then traverse your objects again to make sure none of the VALUE pointers are dangling. You also need to make sure that every VALUE you are using from C is discoverable as a root by the GC.