TL;DR

This bug has been fixed in 2.6.1. Please upgrade and all should be good.

If you do any HTTP communication (HTTP requests, Elasticsearch, etc.), either do not upgrade to 2.6.0p0 or apply the patch below as soon as possible.

Ruby is eating up characters when pushed over HTTP

Ruby 2.6.0 was released not long ago. Unfortunately, not many people are aware of a major bug that was introduced with this release.

This bug can affect you in many ways, some of which you may not even be aware of. All may run well up until you decide to send a particular type of payload, and then things will get interesting.

What am I talking about?

This. What does it even mean? Well, in the best scenario it means that you will end up with a critical error like so:


Net::HTTP.post(URI('http://httpbin.org/post'), 'あ'*100_000)
Traceback (most recent call last):
       16: from /net/http.rb:502:in `block in post'
       15: from /net/http.rb:1281:in `post'
       14: from /net/http.rb:1493:in `send_entity'
       13: from /net/http.rb:1479:in `request'
       12: from /net/http.rb:1506:in `transport_request'
       11: from /net/http.rb:1506:in `catch'
       10: from /net/http.rb:1507:in `block in transport_request'
        9: from /net/http/generic_request.rb:123:in `exec'
        8: from /net/http/generic_request.rb:189:in `send_request_with_body'
        7: from /net/protocol.rb:247:in `write'
        6: from /net/protocol.rb:265:in `writing'
        5: from /net/protocol.rb:248:in `block in write'
        4: from /net/protocol.rb:275:in `write0'
        3: from /net/protocol.rb:275:in `each_with_index'
        2: from /net/protocol.rb:275:in `each'
        1: from /net/protocol.rb:280:in `block in write0'

However, there's a much more interesting case that you can encounter. You can end up sending data that will be trimmed in a way that will make your server receive incomplete yet valid information.

That is not a security issue per se but can be a massive problem if you use your format as a protocol between some internal services.

Sidenote: bjeanes reported on GitHub that this bug can also corrupt JSON in a way that leaves it parsable but incorrect with regard to the data it contains.

Set HTTP API as a POC of this bug

To illustrate how this bug can become problematic and hard to debug, let's build an HTTP based API that implements basic set operations via the web.

Some assumptions for the sake of simplicity:

  • we always send data in the following format: DATA,COMMAND;
  • we have three commands: GET, ADD and DEL;
  • to save a couple of bytes, when no command is provided as a second argument, we run an ADD command;

This is how our abstract API could work:

client = Api.new
client.get #=> []
client.add('12313131') #=> ['12313131']
client.add('msg') #=> ['12313131', 'msg']
client.del('msg') #=> ['12313131']

A set API server implementation

The implementation of such an API server will just take us a couple of lines in Ruby:

require 'webrick'
require 'set'

server = WEBrick::HTTPServer.new(Port: 3000)
set = Set.new

server.mount_proc '/' do |req, res|
  data, action = req.body.split(',')
  action ||= 'ADD'

  # Return set data for any command, no need to handle GET
  case action
  when 'ADD'
    set.add(data)
  when 'DEL'
    set.delete(data)
  end

  res.body = set.to_a.to_s
end

trap('INT') { server.shutdown }

server.start

You can start it by running:

ruby server.rb
[2019-01-09 22:38:58] INFO  WEBrick 1.4.2
[2019-01-09 22:38:58] INFO  ruby 2.6.0 (2018-12-25)
[2019-01-09 22:38:58] INFO  WEBrick::HTTPServer

A set API client implementation

The client is not much more complicated:

require 'net/http'

class Api
  HOST = 'localhost'
  PORT = 3000

  def initialize
    @http = Net::HTTP.new(HOST, PORT)
  end

  def get
    request nil, 'GET'
  end

  def add(data)
    request data, 'ADD'
  end

  def del(data)
    request data, 'DEL'
  end

  private

  def request(data, cmd)
    Net::HTTP::Post
      .new('/', 'Content-Type': 'application/json')
      .tap { |req| req.body = "#{data},#{cmd}" }
      .yield_self(&@http.method(:request))
      .yield_self(&:body)
  end
end

client = Api.new
client.get
client.add('12313131')
client.add('msg')
client.del('msg')

When executed, you end up with exactly what we wanted to achieve:

puts client.get #=> []
puts client.add('12313131') #=> ["12313131"]
puts client.add('msg') #=> ["12313131", "msg"]
puts client.del('msg') #=> ["12313131"]

Risk of an incomplete payload

So far so good. We have an excellent API that we can use for storing anything we want. And here the magic starts.

We decide to store some analytics results, that are used by other APIs to grant access to some super essential and expensive business information™.

It doesn't matter what the results are. All we need to know from our perspective is that they will fit into memory. So, we hand out our API client code to other developers, we run our server and... in the middle of the night the phone rings:

Data that is supposed to be deleted is still available. We constantly run the DEL command but nothing disappears! We need to revoke all the access ASAP!

How can it be!? This service has been running for months now, and everything was good. There was a recent update in Ruby, but even after that specs were passing and the service has been running for at least two weeks.

And this is the moment when this bug presents itself in all its glory. For a big enough payload, Ruby trims the data being sent and, unfortunately for us, it trims the last three letters, that is, the full DEL command. When we run an ADD and a DEL on a given string, we expect it not to be in the results anymore, however...

Note: the dots from the payload below aren't usual dots but Unicode middle dots - that is important.
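Why the kind of dot matters can be checked without any HTTP involved. The snippet below is only a quick illustration, separate from the reproduction that follows; the variable names are made up for this example. A middle dot (U+00B7) takes two bytes in UTF-8, while an ASCII full stop takes one, so the character count and the byte count of the string drift apart:

```ruby
# Three U+00B7 MIDDLE DOT characters, written with an escape so the
# code point is unambiguous.
middle_dots = "\u00B7" * 3
# Three plain ASCII full stops for comparison.
ascii_dots = '...'

puts middle_dots.length   # 3 characters
puts middle_dots.bytesize # 6 bytes in UTF-8
puts ascii_dots.length    # 3 characters
puts ascii_dots.bytesize  # 3 bytes
```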

PAYLOAD_SIZE = 8_301_500
data = 'a' * PAYLOAD_SIZE + '···'

client = Api.new
client.get
client.add(data)
client.del(data)
puts client.get #=> ["aaaaaaaaaaaa...\xC2\xB7\xC2\xB7\xC2\xB7"]

The data is still there! Because the data contains multibyte characters, the payload got trimmed, and instead of a DEL the server received a body with no command at all (DATA,). It fell back to the default ADD, which for data that is already in the set behaves like a plain GET. We had three multibyte characters in the data, and because of that, Ruby removed the last three characters from the string before sending it to the server.
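The effect on the command can be sketched without a network. This is only a simulation under stated assumptions: `parse` is a hypothetical helper that mirrors how the server splits the body, the payload is a short stand-in for the large one, and the trimming is mimicked by dropping the last three bytes by hand. It does not exercise the real Net::HTTP code path.

```ruby
# Hypothetical helper mirroring the server's handling of "DATA,COMMAND".
def parse(body)
  data, action = body.split(',')
  [data, action || 'ADD'] # a missing command falls back to ADD
end

# Short stand-in for the large payload: ASCII plus three middle dots.
data = ('a' * 5) + ("\u00B7" * 3)
body = "#{data},DEL"

# Assumption: imitate the trimming described above by cutting off the
# last three bytes, which here are exactly the letters D, E and L.
trimmed = body.byteslice(0, body.bytesize - 3)

intact_action  = parse(body).last
trimmed_action = parse(trimmed).last

puts intact_action  # DEL
puts trimmed_action # ADD
```

The trimmed body is still a perfectly valid message for the server, which is why nothing fails loudly on either side.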

Patching things up

As a temporary patch, you can use body_stream combined with Ruby's StringIO instead of body:

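require 'stringio'

def request(data, cmd)
  # The same string that was previously assigned to req.body
  operation = "#{data},#{cmd}"
  Net::HTTP::Post
    .new('/', 'Content-Type': 'application/json')
    .tap { |req| req.body_stream = StringIO.new(operation) }
    .tap { |req| req.content_length = operation.bytesize }
    .yield_self(&@http.method(:request))
    .yield_self(&:body)
end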
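Note that the workaround sets the content length from bytesize, not length. The Content-Length header counts bytes, so for a body with multibyte characters the character count would be too small. A small check, using a short made-up payload in place of the real one:

```ruby
require 'stringio'

# Short stand-in for the request string: ASCII, three middle dots, command.
operation = ('a' * 5) + ("\u00B7" * 3) + ',DEL'

char_count  = operation.length               # counts characters
byte_count  = operation.bytesize             # counts bytes, as Content-Length needs
stream_size = StringIO.new(operation).size   # StringIO reports bytes as well

puts char_count  # 12
puts byte_count  # 15
puts stream_size # 15
```

This is also why a size taken from a StringIO body can be used as a content length: it is a byte count, not a character count.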

or, if you use Faraday, you can just apply the following patch:

module NetHttpFaradayPatch
  def create_request(env)
    super.tap do |request|
      if env[:body].respond_to?(:read)
        request.content_length = env[:body].size
      end
    end
  end
end

Faraday::Adapter::NetHttp.prepend(NetHttpFaradayPatch)

Here's the proper fix, which has since been released as part of Ruby 2.6.1.

Summary

It's a rather unpleasant bug. Ruby 2.6.1 has been released and it fixes the issue, so upgrading is the best option. If you have to stay on 2.6.0 for now and my patches work for you, that's great, but otherwise I would advise you to downgrade to Ruby 2.5.3. It's hard to be sure that there aren't other scenarios in which this bug may become even more problematic.
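If you want a service to complain loudly when it is started on the affected release, a guard along these lines could go into a boot script. It is a minimal sketch: `affected_ruby?` is a hypothetical helper, and it assumes that 2.6.0 is the only affected version, as described in this post.

```ruby
# Hypothetical helper: true only for the release this post describes
# as affected (introduced in 2.6.0, fixed in 2.6.1).
def affected_ruby?(version = RUBY_VERSION)
  version == '2.6.0'
end

if affected_ruby?
  warn 'Ruby 2.6.0 detected: Net::HTTP may trim multibyte request bodies. ' \
       'Upgrade to 2.6.1 or apply a workaround.'
end
```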


Cover photo by theilr, used under the Attribution-ShareAlike 2.0 Generic (CC BY-SA 2.0) license.