21 Feb 2017 · Software Engineering

    Stubbing the AWS SDK

    Originally published on https://devops.college. Republished with author’s permission.

    If you’re reading this, dear Devop, you can probably remember a time before you were a Devop, or a Site Reliability Engineer, or whatever. Back to a time when you were a Systems Administrator, or a Unix Technician, or an Infrastructure Engineer; when your job involved switch blades and RAID cards, and you had to worry about LUN alignment and BIOS configurations.

    Of course nobody is suggesting those tasks have disappeared but, with the advent of cloud computing, much of the yuckiness has been abstracted away from us. Instead of ‘remote hands’ we are presented with a nice clean API, gifting us the ability to build complex infrastructure…provided we have the skills to utilize it.

    A corollary to this has been the rise of OSS. Instead of black-box COTS products with expensive support contracts we are encouraged to use free software, with the source code hung out like fresh linen, inviting us to fix our own bugs and craft our own features. Inevitably then, our roles have transmuted. Increasingly we have to live up to the Dev part of our job title. Where once we wrote scripts we now write code; we can find ourselves unexpectedly in coding interviews, and writing tooling has become an essential skill.

    But this is good news! It should be embraced. In practice it means that we’ve had to imbibe knowledge from our developer cousins, and to catch up with them in a relatively short period of time. We have quickly adopted their coding practices: pull requests, code reviews, coding standards…and testing.

    Testing. Until recently, I’ll admit, tests were things that were bolted on once everything else was done. TDD was something that happened in textbooks. I’d embark on a test suite with a heavy heart and a dramatic sigh, and the afternoon would drag.

    Of course I was wrong. I’m now a convert, a born-again testafarian. The moment of epiphany came when I realised that, far from being a hindrance, a good test suite is absolutely indispensable. It’s a small investment up-front, a down-payment for months or years of smooth maintenance and easy extensibility.

    This is particularly true when writing code that interacts with vendor APIs, such as AWS. As long as the API is well designed and well documented, we are easily able to mock the expected responses, and write whole applications without a single credit card detail being passed.

    To illustrate this point let’s write a very small Ruby class to perform the simplest of tasks — listing DynamoDB tables. We’ll use RSpec and the frankly awesome response-stubbing powers of the AWS Ruby SDK to test two common scenarios: listing only tables with a given prefix, and handling paginated responses.

    In true TDD fashion, let’s write some tests first:

    require_relative "../dynamo_lister.rb"
    require_relative "factories/dynamo_factory.rb"
    
    include DynamoFactory
    
    describe DynamoLister do
      context "list tables" do
        before :context do
          Aws.config[:dynamodb] = {
            stub_responses: {
              list_tables: list_tables_response(prefix: "test")
            }
          }
        end
    
        it "lists all tables" do
          expect(DynamoLister.list_all.length)
            .to eq(10)
        end
    
        it "lists all tables in given environment" do
          expect(DynamoLister.list_all(environment: "dev")
            .length)
            .to eq(0)
        end
      end
    end

    Nothing too radical here. We are describing tests for the eminently useful DynamoLister class. We wish to test listing all tables and then listing only those with a given prefix. (In the absence of any sort of namespacing, it is recommended practice to prefix DynamoDB tables with the environment in which they run.)

    One thing that does beg explanation is the setting of Aws.config[:dynamodb]. Here we are telling the AWS SDK to stub responses for calls made to the list_tables method of any instance of Aws::DynamoDB::Client. Furthermore, we are telling it exactly what should be returned when this call is made: a list_tables_response(prefix: "test").

    Of what provenance is this mythical list_tables_response? Well, let me draw your attention, dear Devop, to the inclusion of the DynamoFactory module and its forebear, the require_relative "factories/dynamo_factory.rb".

    module DynamoFactory
      def list_tables_response(count: 10, prefix: "test")
        Aws::DynamoDB::Types::ListTablesOutput.new(
          table_names: Array.new(count) do |i|
                         "#{prefix}_table_#{i}"
                       end
        )
      end
    end

    This function returns a handcrafted Aws::DynamoDB::Types::ListTablesOutput, populated with a list of names of DynamoDB tables. This is exactly the response that the AWS SDK returns when a list_tables call is made. How do we know this? Because the AWS SDK docs are blessedly complete. We can easily identify the types and nested types that are expected to be returned from any API call. This happens to be a very simple example: the ListTablesOutput consists simply of a table_names array of strings and, potentially, a last_evaluated_table_name, which we’ll come to in a moment.

    The factory function takes 2 values: :count, which is the number of tables to be included in the response, and :prefix, a string which all of the tables will be prefixed with (i.e. the environment).

    So, we now know that any call to list_tables made against any instance of Aws::DynamoDB::Client will result in a list of 10 tables (the default :count value), all prefixed with “test” (the value we pass for :prefix in the initial stub_responses config). With this knowledge, the 2 test cases we have written should start to make sense. The first expects that our DynamoLister will return 10 tables, the second that it will return 0 tables when we ask only for those prefixed with “dev”.
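
    To make the shape of that stubbed payload concrete, here is a minimal sketch that runs without the SDK installed. FakeListTablesOutput is a hypothetical stand-in for Aws::DynamoDB::Types::ListTablesOutput (it is not part of the SDK), and fake_list_tables_response is an illustrative copy of the factory built on top of it.

```ruby
# Hypothetical stand-in for Aws::DynamoDB::Types::ListTablesOutput, so the
# sketch runs without the AWS SDK. Only the two fields discussed here exist.
FakeListTablesOutput = Struct.new(:table_names, :last_evaluated_table_name)

# Same naming logic as the factory, but building the stand-in struct.
def fake_list_tables_response(count: 10, prefix: "test")
  names = Array.new(count) { |i| "#{prefix}_table_#{i}" }
  FakeListTablesOutput.new(names, nil)
end

response = fake_list_tables_response
puts response.table_names.length # 10 names by default
puts response.table_names.first  # test_table_0
puts response.table_names.last   # test_table_9
```

    Note that the generated names run from 0 to count - 1, which is why a default response holds exactly ten names, all starting with "test_".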

    With the tests in place, we can go ahead and write a simple DynamoLister class:

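    require "aws-sdk-core"
    
    class DynamoLister
      class << self
        # Lists table names, optionally restricted to one environment prefix.
        def list_all(environment: "*")
          tables = Aws::DynamoDB::Client.new.list_tables.table_names
          # "*" means every environment. It is checked explicitly because
          # interpolating it into a pattern (/^*_/) is not a valid regexp.
          return tables if environment == "*"
    
          tables.select { |t| t.start_with?("#{environment}_") }
        end
      end
    end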

    This very simply invokes the list_tables call, and processes the returned list of table_names, weeding out those which do not match the given prefix. The prefix defaults to all (*) so by default all tables are returned.
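
    The filtering step can be tried on its own, away from the SDK. The sketch below is an illustration under stated assumptions rather than the class itself: filter_by_environment is a hypothetical helper, the table names are made up, it compares prefixes with String#start_with? instead of a regular expression, and it treats "*" as meaning every environment.

```ruby
# Made-up table names covering the interesting cases.
table_names = ["test_users", "testing_users", "dev_users"]

# Hypothetical helper (not part of DynamoLister): keep names that start
# with "<environment>_"; "*" is treated as "every environment".
def filter_by_environment(names, environment)
  return names if environment == "*"

  names.select { |name| name.start_with?("#{environment}_") }
end

p filter_by_environment(table_names, "*")    # all three names
p filter_by_environment(table_names, "test") # only "test_users"
p filter_by_environment(table_names, "prod") # empty: nothing starts with "prod_"
```

    Including the underscore in the prefix is what keeps "testing_users" out of the "test" environment.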

    A quick run of rspec will confirm that our tests pass.

    However we can’t in all good conscience leave our test suite here.

    The AWS docs tell us explicitly that:

    The output from ListTables is paginated, with each page returning a maximum of 100 table names.

    This means that, if we ever create more than 100 DynamoDB tables (which, across all environments, is wholly feasible) then our DynamoLister will cease to work. We need to ensure it can handle paginated responses. Fortunately, we can easily figure out what a paginated response might look like, and cause our test suite to return one.

    Recall from above that the list_tables response may also include a last_evaluated_table_name. Note, too, that a call to list_tables may include an exclusive_start_table_name, and you can probably start to see how to deal with paginated responses: if a response includes a last_evaluated_table_name, we must make the call again, passing that value as the exclusive_start_table_name.
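
    That paging contract can be sketched in plain Ruby before involving the SDK at all. Everything named here is an assumption made for illustration: FakePage and FakeDynamoClient are hypothetical stand-ins (not SDK classes), the fake reports the last name of each page as its last_evaluated_table_name, and all_table_names is just one way of writing the loop.

```ruby
# Hypothetical stand-ins so the sketch runs without the AWS SDK.
FakePage = Struct.new(:table_names, :last_evaluated_table_name)

class FakeDynamoClient
  def initialize(names, page_size)
    @names = names
    @page_size = page_size
  end

  # Mimics the paging contract: return up to page_size names after
  # exclusive_start_table_name, and set last_evaluated_table_name only
  # when more names remain.
  def list_tables(exclusive_start_table_name: nil)
    start = 0
    start = @names.index(exclusive_start_table_name) + 1 if exclusive_start_table_name
    page = @names[start, @page_size] || []
    more = start + page.length < @names.length
    FakePage.new(page, more ? page.last : nil)
  end
end

# Keep requesting pages until no last_evaluated_table_name comes back.
def all_table_names(client)
  names = []
  page = client.list_tables
  names.concat(page.table_names)
  while page.last_evaluated_table_name
    page = client.list_tables(
      exclusive_start_table_name: page.last_evaluated_table_name
    )
    names.concat(page.table_names)
  end
  names
end

client = FakeDynamoClient.new(Array.new(250) { |i| "test_table_#{i}" }, 100)
puts all_table_names(client).length # three pages: 100 + 100 + 50 names
```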

    Again, let’s write the test first:

      context "paginated response" do
        it "lists all tables with paginated response" do
          @client = Aws::DynamoDB::Client.new(
            stub_responses: {
              list_tables: list_tables_paginated_response(count: 100)
            }
          )
          expect(Aws::DynamoDB::Client)
            .to receive(:new)
            .and_return(@client)
          expect(@client).to receive(:list_tables)
            .with(no_args)
            .and_call_original
          expect(@client).to receive(:list_tables)
            .with(exclusive_start_table_name: "test_table_100")
            .and_return(list_tables_response(count: 10))
    
          expect(DynamoLister.list_all.length)
            .to eq(110)
        end
      end

    The assignment of a stubbed Aws::DynamoDB::Client instance to @client shows that you don’t have to stub responses globally; you can do so on a per-instance basis. The next thing we do here is to cause our doctored @client object to be returned when we instantiate an Aws::DynamoDB::Client object (i.e. the call to :new).

    Next we tell RSpec that we expect @client to receive a call to :list_tables with no arguments, and that when it does it should ‘call_original’. In this case calling the original will result in our stubbed list_tables_paginated_response(count: 100) being returned. Let’s take a look at this response (which has been added to dynamo_factory.rb):

      def list_tables_paginated_response(count: 10, prefix: "test")
        Aws::DynamoDB::Types::ListTablesOutput.new(
          last_evaluated_table_name: "#{prefix}_table_#{count}",
          table_names: Array.new(count) do |i|
                         "#{prefix}_table_#{i}"
                       end
        )
      end

    It is almost identical to the non-paginated response function apart from the addition of a last_evaluated_table_name, which in our case will be set to test_table_100.

    Back in the test above, the next expectation we state is that the @client object will receive a second call to :list_tables, with an exclusive_start_table_name parameter. This time it is to return a standard, non-paginated response of 10 tables. Finally we state that we expect the total number of tables returned by the DynamoLister.list_all call to be 110.

    The last thing we need to do is amend the DynamoLister itself to be aware of paginated responses by telling it to loop until last_evaluated_table_name is nil:

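    class DynamoLister
      class << self
        def list_all(environment: "*")
          client = Aws::DynamoDB::Client.new
          resp = client.list_tables
          tables = resp.table_names.dup
          # Keep fetching pages until no last_evaluated_table_name is returned.
          while resp.last_evaluated_table_name
            resp = client.list_tables(
              exclusive_start_table_name: resp.last_evaluated_table_name
            )
            tables.concat(resp.table_names)
          end
          # "*" means every environment. It is checked explicitly because
          # interpolating it into a pattern (/^*_/) is not a valid regexp.
          return tables if environment == "*"
    
          tables.select { |t| t.start_with?("#{environment}_") }
        end
      end
    end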

    The last last thing we need to do is to run RSpec and check that all of our tests pass. Success!

    The clear benefit to having written this test suite is that we were able to develop the DynamoLister and ensure it worked with over 100 tables without using any AWS resources. But a less obvious benefit is that we were able to completely refactor the code and still be confident that nothing had broken.

    Written by:
    Louis is Lead DevOps Engineer at Space Ape Games in London. He has a keen interest in Ruby, Go and all things DevOps. Find him on Twitter and on his blog.