You're reading the Ruby/Rails performance newsletter by Speedshop.

Let's talk about three myths about why Ruby isn't faster than it is.

I've been doing this for a while now, so I've heard lots of guesses as to why Ruby isn't faster. I refuse to say "slow", because Ruby isn't meaningfully slower than its close competition (except JavaScript, but I would rather say 'JS is fast' than 'Ruby is slow').

However, these guesses are usually wrong and rooted in misconceptions. Let's talk about the big three I hear the most often.

Myth 1: Ruby is slow because GC is slow

I think this was maybe true during the Bad Old Days of Ruby 1.8, but it certainly hasn't been true since Ruby 2.3's release.

Ruby 2.0 through 2.3 contained a number of extremely important improvements to Ruby's garbage collection which reduced GC time in most programs to near-zero. Almost all of these improvements were authored by Koichi Sasada.

The biggest improvement was generational garbage collection: rather than checking every object on every GC run, Ruby uses the age of an object to decide how often to check whether it's still in use. Old objects get checked less frequently, and new objects get checked more frequently.
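If you want to see the generational split on your own machine, you can read the counters that CRuby exposes through GC.stat. This is a minimal sketch, assuming CRuby 2.1 or later (where the :minor_gc_count and :major_gc_count keys exist). The gc_counts helper is a name I made up for the example.

```ruby
# Hypothetical helper: snapshot of how many GC runs have happened so far.
def gc_counts
  stats = GC.stat
  {
    total: stats[:count],           # all GC runs
    minor: stats[:minor_gc_count],  # runs that skip objects already promoted to "old"
    major: stats[:major_gc_count]   # runs that examine every object
  }
end

before = gc_counts

# Allocate a pile of short-lived strings so the collector has work to do.
200_000.times { |i| "temporary string #{i}" }

# Ask for a minor GC (Ruby may still decide a major one is needed),
# then force a full (major) one.
GC.start(full_mark: false)
GC.start(full_mark: true)

after = gc_counts

puts "minor GCs during this run: #{after[:minor] - before[:minor]}"
puts "major GCs during this run: #{after[:major] - before[:major]}"
```

In a long-running process, you would expect the minor count to grow much faster than the major count.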

I gave a conference talk at FOSDEM 2017 about all of the improvements made to the garbage collector during this period.

I get to see several dozen new Ruby applications in great detail every year, thanks to my consulting work. Most application performance monitoring tools (New Relic, etc.) will now show you how much of your total CPU time is spent on GC. I can tell you that in over 5 years of consulting, I can count on one hand the number of applications that spend more than 1% of their time in GC.
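If you don't have an APM tool handy, you can get a rough version of the same number from Ruby's built-in GC::Profiler. Treat this as a sketch, not a substitute for production data: the workload here is a made-up, allocation-heavy loop, so it will likely show a larger GC share than a real web application would.

```ruby
require "benchmark"

GC::Profiler.enable

# Made-up workload: build and throw away a lot of small strings.
elapsed_seconds = Benchmark.realtime do
  200_000.times.map { |i| "item-#{i}" }.each_slice(100).map(&:join)
end

gc_seconds = GC::Profiler.total_time # time spent in GC since enable, in seconds
GC::Profiler.disable

gc_share = gc_seconds / elapsed_seconds * 100
puts format("GC time: %.1f ms of %.1f ms (%.1f%%)",
            gc_seconds * 1000, elapsed_seconds * 1000, gc_share)
```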

I think this myth comes from a common programmer perception: the primary difference between a language like Ruby and a compiled language like C is that Ruby's memory is managed and C's is not, so memory management and GC must be a huge performance drag. Well, the real-world numbers just don't bear that out.

Major GC pauses on Rails apps take about 100 milliseconds and run about once every 100 requests. That's not a significant source of slowdown. Minor GC pauses are 5-10 milliseconds and run once every 7 or 8 requests. It's just not a big deal.
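To put those pauses in per-request terms, you can spread each pause over the number of requests between collections. The sketch below uses only the rough figures quoted above. The variable names are made up for the example.

```ruby
# Rough figures from the paragraph above.
major_pause_ms = 100.0
requests_per_major_gc = 100

minor_pause_ms = 5.0..10.0    # range of minor GC pause lengths
requests_per_minor_gc = 7..8  # a minor GC every 7 or 8 requests

# Amortized cost: pause length divided by requests between pauses.
major_ms_per_request = major_pause_ms / requests_per_major_gc

# Best case: shortest pause, spread over the most requests.
minor_ms_per_request_low = minor_pause_ms.begin / requests_per_minor_gc.end
# Worst case: longest pause, spread over the fewest requests.
minor_ms_per_request_high = minor_pause_ms.end / requests_per_minor_gc.begin

puts format("major GC: %.2f ms per request", major_ms_per_request)
puts format("minor GC: %.2f to %.2f ms per request",
            minor_ms_per_request_low, minor_ms_per_request_high)
```

That works out to about 1 ms per request for major GCs and roughly 0.6 to 1.4 ms per request for minor GCs.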

This is also why "out of band GC" no longer really matters. What's the point of GC-ing outside of the request if it imposes such a small penalty?

Myth 2: Ruby is slow because "not enough concurrency"

It also seems like many programmers think Ruby is held back because of something relating to the Global VM Lock, concurrency or parallelism.

First, let's be clear: parallelism or concurrency will not make your Ruby programs faster by themselves. Let's say you removed the GVL today: would your programs get faster? Nope. They would not. It's not magical speed juice.

Instead, we have to write our applications in ways that take advantage of the underlying parallelism available.

Consider this: Ruby's concurrency model is not really that different from Node's. Node runs your JavaScript on a single thread, which is effectively a GVL, just like Ruby's. However, the Node community decided to write their applications around an event loop, and all of their applications are written so that they don't block that event loop waiting on anything.
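For comparison, here is what overlapping waits looks like in Ruby with plain threads rather than an event loop. CRuby releases the GVL while a thread is sleeping or blocked on I/O, so the waits can overlap even though only one thread runs Ruby code at a time. This is a minimal sketch: sleep stands in for a slow network call, and fake_io_call and now are made-up names.

```ruby
# Stand-in for a blocking network or database call (hypothetical).
def fake_io_call(id)
  sleep 0.1
  id
end

# Monotonic clock reading, in seconds.
def now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# One call after another: the waits add up.
started = now
serial_results = 4.times.map { |i| fake_io_call(i) }
serial_seconds = now - started

# One thread per call: the waits overlap.
started = now
threads = 4.times.map { |i| Thread.new { fake_io_call(i) } }
threaded_results = threads.map(&:value)
threaded_seconds = now - started

puts format("serial:   %.2f s", serial_seconds)
puts format("threaded: %.2f s", threaded_seconds)
```

Note that this only helps when the work is waiting. CPU-bound Ruby code in those threads would still take turns holding the GVL.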

This style has been attempted in Ruby, but never caught on. First, we had EventMachine. Now, Samuel Williams (ioquatix) is trying something similar with Falcon and his new additions to Ruby core (the Fiber scheduler coming in Ruby 3).

These styles are unpopular with Rubyists because they involve asynchronous code, callbacks and promises, which are confusing and not how we're used to writing programs. I am skeptical they will ever catch on with the community. We could have worked like Node does, but we decided not to.

Ruby already has a parallelism model, it's just not in the language: multiple processes. It works great. It's running the biggest websites in the world, right this moment. It's not meaningfully faster or slower than other approaches for the purposes of web applications, but it is certainly more memory intensive.
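Here's a minimal sketch of process-based parallelism using fork, which is roughly what preforking app servers do on your behalf. It assumes a Unix-like system where fork is available. cpu_work and run_in_processes are made-up names for the example.

```ruby
# Made-up CPU-bound task: sum of squares from 1 to n.
def cpu_work(n)
  (1..n).sum { |i| i * i }
end

# Run cpu_work for each input in its own child process and collect the
# results over pipes. Each child has its own interpreter and its own GVL.
def run_in_processes(inputs)
  children = inputs.map do |n|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.puts(cpu_work(n))
      writer.close
      exit!(0) # leave the child without running the parent's at_exit hooks
    end
    writer.close # the parent only reads from this pipe
    [pid, reader]
  end

  children.map do |pid, reader|
    result = reader.read.to_i
    reader.close
    Process.wait(pid)
    result
  end
end

results = run_in_processes([100_000, 200_000, 300_000])
puts results.inspect
```

The trade-off is the one described above: each child is a full Ruby process, so you pay for the parallelism in memory.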

However, these days, RAM is so cheap, even on cloud providers, that process-based parallelism isn't a blocker anymore. More and more Rails applications (especially outside of Heroku) run out of CPU before they run out of memory.

More parallelism, like Ractor, will be great, but it won't be game changing for existing Ruby applications. Instead, I expect that it will open up new areas to Ruby around massively concurrent applications with many persistent connections: load balancers, WebSockets, WebRTC and other apps that require many open connections at once.

Myth 3: Ruby is slow because core would rather do other things

This particular myth isn't unique to the Ruby community. I hear it in the gaming community a lot: "company X should stop shipping features X, Y and Z and focus solely on bugfixes".

This myth ignores the reality of who does the work. Like developers in any software company, open source contributors have different skills and interests. Not everyone at a software company can ship bugfixes. Some people are artists, illustrators or UI designers: should they just twiddle their thumbs while all the programmers work on bugfixes?

In OSS, some people are good at (or even just interested in) some things, and others are interested in other things. There are a lot of people currently active on Ruby core who are interested in syntax and language features. That's why you've seen things like pattern matching get shipped.

OSS development is not driven by demand from the eventual software consumer. It is driven by the passions, skills and interests of those who donate their time for free to the project. And, often, those two things do not align.

We have maybe a handful of people who contribute performance improvements to Ruby core, and only two people making big, important improvements: Koichi Sasada and Takashi Kokubun. These two people, especially, really don't ever comment or work on the language syntax or new features. They work entirely on their own performance realms (Mr. Sasada on the VM, and Mr. Kokubun on the JIT). They're not being distracted or slowed down by other areas of Ruby.

The reasons why Ruby isn't faster than it is are simple:

1. Not enough C-language programmers contributing improvements to the core VM.
2. There is no number 2.

How we cultivate item number one is up for debate, and I have no good solutions. The most often proposed solution is corporate sponsorship, but that ignores three realities:

1. Who will they sponsor? It's not like there's a massive pool of talented C programmers with tons of experience in Ruby that would love to do this. If they pick someone who's a good C programmer but without any experience in Ruby, how long will it take for that person to make a meaningful change?
2. The free rider problem. Large companies can coast on the contributions and sponsorships of other large companies. Why be the first to sponsor when you can wait for someone else to do it?
3. Maybe Ruby is fast enough. For most large companies, the investment of $250k or more per year into _maybe_ making Ruby _a little bit_ faster may not improve their business meaningfully.

Anyway, I'm interested to hear what you think. As always, you can reply to this email and it goes direct to my normal email inbox.

-Nate

You can share this email with this permalink: https://mailchi.mp/railsspeed/ruby-perf-myths-gc-and-concurrency?e=45f407dd66

Copyright © 2020 Nate Berkopec, All rights reserved.
