Software almost never runs in isolation. Today’s systems integrate with a vast number of external services. Ensuring reliability is difficult because every external dependency, be it a database or an authentication system, adds an element of unpredictability that is hard to emulate in isolation. A reliable system should account for the behavior of its dependencies. What good is it that an API is up and running when the underlying service it talks to hasn’t accounted for a specific edge case and, under certain conditions, causes unexpected latency for my clients?
Emulating input and output is easy with regular unit tests. Many people rely on so-called mocks to emulate external systems. Mocks are pieces of code that emulate external dependencies and behave almost like them. Almost being the key word.
An example is Go’s httptest.Server. It gives a fully functional, easy-to-configure HTTP server, but the handlers do not necessarily behave like the dependency. These mocks only get us so far. Often the output depends on quirks of the input, or on the configuration of the system we integrate with. These edge cases may be very difficult to cover because they require reimplementing said quirks.
That’s exactly what I personally dislike about mocking. Reinventing this for one, maybe two simple dependencies may be okay. But when we intend to iterate on our systems for years, we want to focus on our business problem. Our dependencies will mature, change scope, reimplement features and grow in complexity. We will add more dependencies and maybe replace existing ones with alternatives.
Maintaining such mocked services adds unnecessary overhead. When a new version of a dependency is released, not only do we have to adapt our services, our testing infrastructure must also be adapted to cover the changes.
Yes, it does give us the opportunity to understand how our dependencies work. That knowledge is always good to have, but it should come via means other than rewriting the logic of the dependency. If you have to use a hash map, you don’t write your own. You use a library. Why would an OAuth server be any different?
A library is called directly in the path of our code. When the library dependency is upgraded, there’s an instant feedback loop: if the interfaces have changed or the behavior differs, the code will not compile or existing tests will fail immediately. With an external system, things are often not so obvious. Just because the API hasn’t changed doesn’t mean the dependency is doing the same thing. What would be great is to treat an external system in tests like a library.
Some organizations have enough resources to supply the development team with a shared integration environment to run the tests against. This is great because there is no need to maintain these mocking services anymore. Tests run against a real database, a real authentication system, a real queue, and so on.
The cost is the most obvious downside. That infrastructure costs money and someone has to maintain it. If you’re a solo developer or a small team, you may not have the financial resources to simply make it happen, or you may not have the time to maintain all of that yourself.
There’s also the element of inflexibility. There’s a central system configured once, sharing state between multiple instances of tests. This can be improved if the developers have access to good equipment and can run parts of the infrastructure locally. The elephant in the room is that this leads to duplication in environment setup: there’s the production system, and another copy of it which needs to be (but sometimes can’t be) configured as close to production as possible.
§testing with containers
Containers are the third available option. Writing a system in Go and need to integrate with Kafka? Writing a Ruby app and need to talk to Postgres? This is easy with Docker. Just start Kafka in Docker, connect to it in the test and exercise real code paths right in tests. Need Redis? No problem, start a container. Postgres? Why not a container. etcd, MySQL, MinIO, Hydra and Vault together for a complex integration? Containers can do this, regardless of whether your system is PHP, Ruby, Rust, Go or ColdFusion!
The ultimate method is to spin up the containers right in the tests. There are libraries in virtually every programming language enabling this.
§Go and ory/dockertest
Here, I’d like to focus on Go. I’ve spent more than two years evaluating the Ory platform, and the dockertest library from Ory became an invaluable asset. dockertest is a very nice abstraction layer on top of go-dockerclient, making it much easier to configure and run containers, with a focus on short-lived tests.
Evaluating the Ory platform meant setting up Hydra, Keto and Kratos in many different configurations. Reducing the time of every iteration, even one as simple as changing the underlying configuration, was crucial. What better way is there than spinning up a fresh setup in every test and configuring it in-test?
I haven’t found any better method than running containers. So what does it look like to run a container with dockertest? This is an example from its readme:
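Condensed, the readme example looks roughly like this. The Postgres image tag and credentials below are my own choices, and running it requires a local Docker daemon:

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
	"github.com/ory/dockertest/v3"
)

func main() {
	// Connect to the local Docker daemon.
	pool, err := dockertest.NewPool("")
	if err != nil {
		log.Fatalf("could not connect to Docker: %s", err)
	}

	// Start a throwaway Postgres container.
	resource, err := pool.Run("postgres", "13", []string{
		"POSTGRES_PASSWORD=secret",
		"POSTGRES_DB=test",
	})
	if err != nil {
		log.Fatalf("could not start resource: %s", err)
	}

	// The container needs a moment to boot; retry until it accepts connections.
	var db *sql.DB
	if err := pool.Retry(func() error {
		var err error
		db, err = sql.Open("postgres", fmt.Sprintf(
			"postgres://postgres:secret@localhost:%s/test?sslmode=disable",
			resource.GetPort("5432/tcp")))
		if err != nil {
			return err
		}
		return db.Ping()
	}); err != nil {
		log.Fatalf("could not connect to database: %s", err)
	}

	// ... run the actual test against db here ...

	// Tear the container down when done.
	if err := pool.Purge(resource); err != nil {
		log.Fatalf("could not purge resource: %s", err)
	}
}
```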
This gives a running Postgres database server. It’s so easy that there is no reason not to do it.
A couple of weeks ago I open-sourced the app-kit-orytest library. It provides preconfigured Hydra, Keto and Kratos components running against a Postgres database and is available on GitHub. With app-kit-orytest, the result is a full IAM / IdP environment right in the test.
What does it take to have Hydra, Kratos and Keto in the test? It’s only a few lines of code - here’s an example:
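Rather than reproducing the library’s exact API here, the sketch below shows roughly what it automates under the hood: starting a single Ory component (Hydra, in-memory, no Postgres) with plain dockertest and waiting for it to become ready. The image tag, command flags and environment variables are assumptions on my part; app-kit-orytest additionally wires up Postgres, migrations and config files.

```go
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/ory/dockertest/v3"
)

func main() {
	pool, err := dockertest.NewPool("")
	if err != nil {
		log.Fatal(err)
	}

	// Assumed image tag and flags for an in-memory Hydra instance.
	hydra, err := pool.RunWithOptions(&dockertest.RunOptions{
		Repository: "oryd/hydra",
		Tag:        "v1.11.8",
		Cmd:        []string{"serve", "all", "--dangerous-force-http"},
		Env:        []string{"DSN=memory"},
	})
	if err != nil {
		log.Fatal(err)
	}
	defer pool.Purge(hydra)

	// Hydra's public API listens on 4444 inside the container; Docker maps
	// it to a random host port, which is what lets tests run in parallel.
	publicURL := fmt.Sprintf("http://localhost:%s", hydra.GetPort("4444/tcp"))

	// Poll the readiness endpoint until the service is up.
	if err := pool.Retry(func() error {
		resp, err := http.Get(publicURL + "/health/ready")
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		if resp.StatusCode != http.StatusOK {
			return fmt.Errorf("not ready: %d", resp.StatusCode)
		}
		return nil
	}); err != nil {
		log.Fatal(err)
	}

	log.Println("Hydra is ready at", publicURL)
}
```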
The code above starts Hydra, Keto and Kratos. It ensures the migrations are executed, the configuration files are written, the volumes are mounted and the services listen on random ports, making it easy to run multiple tests in parallel if needed. With this library, I was able to very quickly test different Ory configurations.
Testing via external CI/CD might get complicated because third-party CI/CD platforms often do not allow the unit under test to control the Docker daemon, but integration tests often have separate pipelines anyway.
Writing tests using containers requires test code hygiene. To absorb potential dependency changes easily, not too much of the test setup logic should live outside of the library. To what extent this is a downside, everyone has to answer for themselves.
With Docker and dockertest, I was able to achieve a quicker turnaround when testing different configurations and approaches. That’s proving to be not only a great method for maintaining a reliable system but also an enabler of quick exploration. A test can also be a scratch pad.