21 April 2015

Unit testing has come a long way over the years, and continuous integration services even more so. A few years back, Jenkins was the only reasonable choice for CI. It was fairly easy to set up, depending on what one was planning to do with it. Today there are plenty of services offering painless CI for the masses. The best known is Travis CI. Compared to Jenkins, it’s night and day. Simply sign in with GitHub, go to settings, enable the repository to be tested and you are off. The build can be customised by modifying the .travis.yml file placed in the root of the repository. Every push to GitHub, every pull request and every merge is automatically tested. Results are available on GitHub together with the full history of the project. Easy.

I love Travis CI, as I would most likely love a lot of similar products. Snap CI comes to mind, and there are dozens of others out there in the wild. Travis has changed how people think about unit testing / CI. If you are working on an open source project, there’s no reason not to use Travis or a similar solution. For open source projects Travis CI is available for free. But there is one problem with such products. Obviously, the people who run them are not charities. This is not a criticism; we all write software and would like to make a living somehow. Travis, like any other platform of its kind, has a paid version. The free version is a driver to bring customers to the paid version, and it has limits. In the case of Travis, two are most obvious: the free version allows public repositories only, and there’s a global resource limit shared by all users. As the product grows in popularity, testing becomes slower and slower, to the point where it becomes, by far, the slowest process of all. A 10 minute wait is not uncommon these days. Please keep in mind, this is not a complaint. It’s great that these products exist and can be used for free. The problem still exists though. There are a lot of great free tools that don’t get any direct financial support, and for them a paid version is not an option. Any kind of payment may be a stretch.

Let’s have a look at what the modern open source project cycle looks like. There is a (distributed) version control system. The members (committers) are most likely spread around the world, and they contribute at different times and at a different pace. Their work is available to the public via a central repository, most often GitHub, but also Bitbucket and probably more I have never heard of. A central testing / CI mechanism, such as Travis, is employed to give full visibility to everybody involved in development and everyone who wants to use the product. It’s a central source of truth. Any delay hinders progress.

Is there a (better) way to do this? A few weeks ago, while talking to one of my colleagues at work, I mentioned the idea of a system which would enable testing / CI of open source products on a big scale using resources already available to the developers. The purpose is to support the cycle described above. How could such a system work?

There would be a central website displaying build / testing progress, like Travis does. This website would communicate with systems such as GitHub or Bitbucket and receive notifications about pushes, pull requests and merges. It would display the status and progress of every build and report results back to GitHub, Bitbucket and the like. The problem to solve is the availability of resources to run tests / CI. The website could offer a number of virtual machine images for download, say for VMware, VirtualBox, QEMU, KVM and so on. Each of those would be configured to receive a build / test / CI request from the central system, execute it according to the given specification and send the results back to the central system in real time. People could contribute to the overall processing capacity of such a system by simply running these VMs. The more of them out there, the more processing capacity available for everybody. Such VMs could either run Docker images, for fast tests that don’t require root permissions (or sudo, as one prefers), or be used directly to run tests with all the options enabled.
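To make the flow a bit more concrete, here is a minimal in-process sketch in Python of the central system handing a job to a worker and getting the result back. Everything in it is an assumption made for illustration: the names BuildJob, Coordinator and run_worker_once are hypothetical, the "network" is just a method call, and a real worker would run the command inside its VM or a Docker container rather than directly on the host.

```python
import subprocess
import sys
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BuildJob:
    job_id: int
    repo: str
    commit: str
    command: list  # the command a worker runs to execute the tests


@dataclass
class Coordinator:
    """Stands in for the central website: queues jobs and records results."""
    pending: deque = field(default_factory=deque)
    results: dict = field(default_factory=dict)
    _next_id: int = 1

    def on_push(self, repo, commit, command):
        # In the real system this would be triggered by a push,
        # pull request or merge notification from the code host.
        job = BuildJob(self._next_id, repo, commit, command)
        self._next_id += 1
        self.pending.append(job)
        return job

    def next_job(self):
        # Hand the oldest pending job to whichever worker asks first.
        return self.pending.popleft() if self.pending else None

    def report(self, job_id, passed, log):
        self.results[job_id] = {"passed": passed, "log": log}


def run_worker_once(coordinator):
    """One polling cycle of a volunteer worker: fetch, execute, report."""
    job = coordinator.next_job()
    if job is None:
        return False
    proc = subprocess.run(job.command, capture_output=True, text=True)
    coordinator.report(job.job_id, proc.returncode == 0, proc.stdout + proc.stderr)
    return True


if __name__ == "__main__":
    central = Coordinator()
    # Hypothetical repository and commit, with a trivial stand-in for a test run.
    central.on_push("example/project", "abc123", [sys.executable, "-c", "print('tests ok')"])
    run_worker_once(central)
    print(central.results)
```

The worker pulls work instead of the central system pushing it, which seems to fit volunteer machines sitting behind home routers and firewalls, but that is a design guess rather than a requirement.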

There are obvious problems to solve. Security: how to prevent this system from being turned into a botnet, and how to ensure that the results are not tampered with. Timeliness: how to ensure that tests are run in a timely fashion and the results are available. Fault tolerance: once a build is accepted by a VM worker, it has to be completed. A VM failure could prevent the results from arriving at the central system, so the job may have to be replayed somewhere else. The tools are available. It would be in the interest of open source developers to run such VMs, and maybe some organisations could donate computing power to run more VMs? There is cost involved, as running a “central” system wouldn’t come for free. It would have to be load balanced and replicated across different geo-locations to make it fault tolerant in itself. But it would be significantly cheaper to run than a fleet of machines on stand-by, ready to execute tests for the world.
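One common way to handle the "replay somewhere else" part is a lease: a worker claims a job for a limited time, and if no result arrives before the lease runs out, the job goes back into the queue. The sketch below is only an illustration of that idea, not a design for the system. LeaseQueue and its methods are hypothetical names, and time is passed in explicitly as a number of seconds to keep the example deterministic.

```python
from collections import deque


class LeaseQueue:
    """Job queue where a claimed job returns to the queue if its lease expires."""

    def __init__(self, lease_seconds):
        self.lease_seconds = lease_seconds
        self.pending = deque()
        self.leased = {}  # job_id -> (worker_id, deadline)
        self.done = {}    # job_id -> result

    def submit(self, job_id):
        self.pending.append(job_id)

    def requeue_expired(self, now):
        # Any job whose worker went silent past the deadline is replayed.
        expired = [j for j, (_, deadline) in self.leased.items() if deadline <= now]
        for job_id in expired:
            del self.leased[job_id]
            self.pending.append(job_id)
        return expired

    def claim(self, worker_id, now):
        self.requeue_expired(now)
        if not self.pending:
            return None
        job_id = self.pending.popleft()
        self.leased[job_id] = (worker_id, now + self.lease_seconds)
        return job_id

    def complete(self, job_id, worker_id, result):
        lease = self.leased.get(job_id)
        if lease is None or lease[0] != worker_id:
            # Stale report: this worker no longer holds the lease.
            return False
        del self.leased[job_id]
        self.done[job_id] = result
        return True


if __name__ == "__main__":
    queue = LeaseQueue(lease_seconds=60)
    queue.submit("build-1")
    queue.claim("vm-a", now=0)          # vm-a takes the job, then disappears
    print(queue.claim("vm-b", now=61))  # lease expired, vm-b gets the same job
    print(queue.complete("build-1", "vm-b", "passed"))
```

This covers only fault tolerance. It does nothing about tampered results, which would need something extra, for example running the same job on several independent workers and comparing the outcomes.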

Just a thought...