Concurrency for HTTP Requests in Ruby and Rails

 

Ruby MRI does not support parallel execution of CPU-bound operations, because the Global Interpreter Lock protects its non-thread-safe C extensions. Input/output operations like HTTP requests are still a perfectly valid use case for spinning up multiple threads. Read on to learn which tools are available for concurrent requests in Ruby, with all their pros and cons.

Global Interpreter Lock and blocking I/O

Let’s start by describing what blocking I/O is. Long story short, any operation that does not directly use CPU cycles from its thread, but instead delegates the work to an external process, is blocking I/O. Typical examples in the context of Ruby on Rails web apps are SQL database queries, reading/writing files, or HTTP requests.

To see the practical difference between a CPU-bound operation and blocking I/O, check out the following code snippets. I encourage you to run them in your local IRB.

require 'securerandom'

def now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

start_sequential = now

# CPU-bound work: generate random hex strings sequentially
5.times.map do
  100000.times do
    SecureRandom.hex
  end
end

finish_sequential = now
puts "Sequential duration #{finish_sequential - start_sequential}"

start_concurrent = now

# The same CPU-bound work, split across five threads
5.times.map do
  Thread.new do
    100000.times do
      SecureRandom.hex
    end
  end
end.each(&:join)

finish_concurrent = now
puts "Concurrent duration: #{finish_concurrent - start_concurrent}"
On Ruby MRI, sequential and concurrent CPU-bound operation times will be almost the same

require 'net/http'

def now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

start_sequential = now

# Blocking I/O: 25 HTTP requests executed sequentially
5.times.map do
  5.times do
    Net::HTTP.get('example.com', '/index.html')
  end
end

finish_sequential = now
puts "Sequential duration #{finish_sequential - start_sequential}"

start_concurrent = now

# The same requests, five threads doing five requests each
5.times.map do
  Thread.new do
    5.times do
      Net::HTTP.get('example.com', '/index.html')
    end
  end
end.each(&:join)

finish_concurrent = now
puts "Concurrent duration: #{finish_concurrent - start_concurrent}"
Multithreaded HTTP requests execute ~5 times faster


The first example, generating random strings, is a CPU-bound operation. Spinning up multiple threads does not affect its execution time. In Ruby MRI, the Global Interpreter Lock (GIL) works as a mutex, not allowing several Ruby threads to run in parallel.

The latter performs external HTTP calls (blocking I/O), so every new thread effectively parallelizes the execution, significantly reducing its duration.

The behavior will be different for JRuby and Rubinius because they don’t use a GIL, but we’ll focus solely on MRI in this tutorial. To dive deeper into the topic of concurrency in the different flavors of Ruby, I highly recommend the somewhat dated, but still surprisingly relevant eBook - Working with Ruby Threads.

Now that we know what the GIL, CPU-bound operations, and I/O-bound operations are all about, let’s find out how this knowledge can be used in practice.


Case study: Slack API Requests

Abot is largely dependent on the Slack API. All of the interactions with the anonymous bot’s command or UI interface can issue multiple HTTP requests.

Occasionally, the Slack API responded slower than usual, accounting for most of an endpoint’s execution time. Slack requires a bot’s backend API to respond within a maximum of three seconds, so it was necessary to optimize the faulty endpoints.

The culprit was the following part of the Abot UI, fetching both public and private channels’ data via two separate Slack API HTTP calls.

Abot Slack UI for writing an anonymous channel message


The code responsible for this part of the app was implemented in a Team::SlackApiClient class:

class Team::SlackApiClient
  ...

  def get_all_conversations
    get_groups_list + get_channels_list
  end
end


Each of the HTTP calls was blocking the single main thread:

Synchronous HTTP calls


Let’s now discuss different approaches to parallelizing the HTTP calls:

Parallel HTTP calls

Native Ruby Threads

A straightforward approach to solving this issue could be to rewrite the code as follows:

def get_all_conversations
  groups_thread = Thread.new do
    get_groups_list
  end

  channels_thread = Thread.new do
    get_channels_list
  end

  [groups_thread, channels_thread].map(&:value).flatten
end

Every request is executed in its own thread, and the threads can run in parallel because the work is blocking I/O. Thread#value implicitly joins a thread and returns the block’s result (or re-raises any exception raised inside it). But can you see the catch here?

If you cannot, that’s exactly the point. The get_groups_list and get_channels_list implementations are potentially non-thread-safe, and there is no simple way to validate that. You could check the methods’ implementation details, but in a typical Ruby project it’s turtles all the way down, with the usual excess of external dependencies.

There’s no way of knowing if some gem down the call stack uses a shared mutable state, or a mutex that can cause a deadlock.
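
To make the hazard concrete, here is a hypothetical pattern (the method name and host are made up, not actual Abot code) that would break under the threaded version above:

require 'net/http'

# Hypothetical sketch of a non-thread-safe client internal.
# Lazy memoization is a classic data race: two threads can both
# observe @http_connection as nil and build two connections, or
# end up sharing a single Net::HTTP socket that is not designed
# for concurrent use.
def http_connection
  @http_connection ||= Net::HTTP.new('slack.com', 443)
end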

Concurrent Ruby gem promises

A somewhat better approach in terms of thread safety could be to use one of the concurrency abstractions offered by the popular concurrent-ruby library. For the price of yet another gem, you get some thread safety guarantees, with an honest warning that:

“No concurrency library for Ruby can ever prevent the user from making thread safety mistakes…“

The discussed code example rewritten using promises would look like this:

def get_all_conversations
  groups_promise = Concurrent::Promise.execute do
    get_groups_list
  end

  channels_promise = Concurrent::Promise.execute do
    get_channels_list
  end

  [groups_promise, channels_promise].map(&:value!).flatten
end

Note that value! blocks until a promise is resolved and re-raises the error if it was rejected. Even if thread safety can reasonably be expected, there are still risks associated with parallelizing Ruby code execution. One worth mentioning is exhausting the SQL database connection pool.

Multithreading and Rails SQL database pool

Every Ruby process has a limit on how many connections it can establish to the database. Each spawned thread that talks to the database checks out its own connection, so performing SQL queries in parallelized blocks of code can quickly deplete the process’s pool. As discussed before, it can be challenging to guarantee that code down the call stack will never connect to the database.
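
If a spawned thread must touch the database, one mitigation is to check a connection out only for the duration of the query and return it immediately, using ActiveRecord’s with_connection. A minimal sketch (the Team model usage and synced_at column are made-up examples):

threads = Team.all.map do |team|
  Thread.new do
    # Check a connection out of the pool only for this block;
    # it is returned to the pool as soon as the block exits.
    ActiveRecord::Base.connection_pool.with_connection do
      team.update!(synced_at: Time.current)
    end
  end
end
threads.each(&:join)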

A case where the pool exhaustion scenario is highly probable is parallelizing code execution within Sidekiq jobs. A single Sidekiq process usually executes jobs in a couple of threads. If any of those jobs also uses threads internally, then up to M x N database connections are required, where M is the Sidekiq process concurrency and N is the number of threads spawned by a job.

Unless necessary, you should avoid spawning new threads within Sidekiq jobs. Usually, a better approach is to enqueue more fine-grained jobs, as sketched below, than to spawn threads inside a job.
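
Here is a sketch of that approach, with hypothetical job and helper names:

# Instead of one job spawning N threads, fan out small jobs and
# let Sidekiq's own worker threads provide the concurrency.
class SyncAllChannelsJob
  include Sidekiq::Worker

  def perform(team_id)
    # fetch_channel_ids is a hypothetical helper
    fetch_channel_ids(team_id).each do |channel_id|
      SyncChannelJob.perform_async(team_id, channel_id)
    end
  end
end

class SyncChannelJob
  include Sidekiq::Worker

  def perform(team_id, channel_id)
    # Sync a single channel: one job, one database connection
  end
end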

Another scenario where depleting the pool can happen is using threads while handling a request in the Puma server. Due to its multithreaded nature, Puma works similarly to Sidekiq. The number of Puma worker threads competing for a database connection can easily exceed the available pool, crashing your Rails production servers.
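
A common safeguard is to make sure the ActiveRecord pool is at least as large as Puma’s thread count. A sketch of the conventional Rails setup (RAILS_MAX_THREADS is the standard environment variable):

# config/puma.rb
max_threads = ENV.fetch("RAILS_MAX_THREADS") { 5 }
threads max_threads, max_threads

# config/database.yml should then allow at least one connection
# per Puma thread, e.g.:
#   pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
# Any code that spawns extra threads per request needs headroom
# on top of that.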

Typhoeus Hydra

Probably the safest solution is to stick with concurrency only on the layer of the HTTP requests themselves. An excellent tool for that is the Typhoeus gem.

Its Hydra API allows dispatching parallelized requests. Contrary to the previous approaches, Typhoeus does not spawn Ruby threads, but instead uses libcurl’s multi interface to run the requests concurrently.

Because no Ruby threads are used, this solution doesn’t pose any of the risks described above. Concurrency is precisely scoped to the I/O section of the app, so shared mutable state, potential deadlocks, or sneaky SQL queries exhausting the database pool are not possible.

Implementing Typhoeus requires a few more code changes than the previous solutions:

def get_all_conversations
  endpoint = "https://slack.com/api/conversations.list?token=#{access_token}&exclude_archived=true"
  hydra = Typhoeus::Hydra.hydra

  # Slack "groups" are private channels, "channels" are public ones
  groups_params = "types=private_channel"
  groups_request = Typhoeus::Request.new("#{endpoint}&#{groups_params}")

  channels_params = "types=public_channel"
  channels_request = Typhoeus::Request.new("#{endpoint}&#{channels_params}")

  hydra.queue groups_request
  hydra.queue channels_request

  # Blocks until both queued requests have completed
  hydra.run

  groups_json = groups_request.response.body
  channels_json = channels_request.response.body

  # Rest of the parsing code
  ...
end

If you plan your HTTP client architecture upfront around Typhoeus and its Hydra API, the implementation details can be abstracted away.
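
For example, a thin wrapper could hide the queue/run dance behind a single method. An illustrative sketch, not the actual Abot architecture:

require 'typhoeus'

class ParallelHttpClient
  # Dispatches all URLs concurrently and returns the responses
  # in the same order as the input.
  def get_all(urls)
    hydra = Typhoeus::Hydra.new
    requests = urls.map do |url|
      request = Typhoeus::Request.new(url)
      hydra.queue(request)
      request
    end
    hydra.run # blocks until every queued request has completed
    requests.map(&:response)
  end
end

# Usage:
#   groups, channels = ParallelHttpClient.new.get_all([groups_url, channels_url])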

Summary

All of the proposed solutions are roughly equivalent in terms of performance. For the case described, the speedup was close to 100%, because two requests were parallelized. With more simultaneous requests, e.g. when paginating over a large collection, the performance gain could be even more considerable.

Multithreading has to be applied with care. Spinning up threads to make code “run faster because concurrency” is a recipe for disaster. Every scenario is different, but for parallelizing HTTP requests, I would recommend sticking with Typhoeus Hydra as the safest of all the described methods.


