BDD and test parallelisation with Lettuce
One of the most appreciated agile practices, and one that has proved notably valuable in many projects, is Test Driven Development (TDD). Many teams use TDD to validate their work at many stages of the product's lifecycle. The value of getting quick feedback on the status of the code every time a change is made is not difficult to see, and we have already discussed the advantages of having a safety net when refactoring or rewriting small to big pieces of code, which is one of the many positive "side effects" that come for free when adopting a test-driven approach.
Of the many shapes that TDD has taken since its conception, one of the most interesting is certainly Behavior Driven Development (BDD), for the power it has of describing the features of a system in a way that can be shared and understood not only by the developers of the system, but by all of the roles involved in the project (end users, stakeholders…). For web applications, a test written using the BDD approach normally means opening a browser, usually running a step that prepares the data needed to exercise the system under test, and then running one or more steps that validate the expected behavior.
Many tools exist that help both with translating BDD test definitions into something that can actually be executed and automated, and with interacting with the browser.
At Agilo Software, we make extensive use of two of them:
- Lettuce (http://lettuce.it), which is one of the Python alternatives to the Ruby world's Cucumber;
- Selenium (http://www.seleniumhq.org) for web browser automation.
In spite of all the efforts targeting optimization, performance, and speed of execution, tests of this kind are time-consuming and quite expensive to maintain, and these two downsides become much more prominent when they are compared with other types of tests, such as unit tests. Furthermore, maintenance costs are in turn affected by the slow speed of execution, as a slow feedback cycle only makes things worse when debugging such tests. We have found a good way of dealing with the execution time problem, and of dramatically reducing maintenance costs as a consequence: parallelising test execution.
Of all the existing solutions we investigated, none seemed to meet our needs:
- Selenium Grid is a good solution for running the same (or different) tests using a shared server and different browser instances, possibly from different vendors and even on different platforms. In our case, with Lettuce as the main test runner, having multiple browsers available for parallel testing does not solve the problem of running the different scenarios in parallel;
- parallel_tests takes advantage of multiple CPU cores and is a native cucumber-ruby tool. Although it is possible to run non-Ruby tests with Cucumber, we already had quite some infrastructure code tied to Lettuce and preferred not to add another adaptation layer to it, which could have affected, among other things, the very execution time we were trying to reduce. Furthermore, parallel_tests does not support running batches of tests spread over different machines.
We decided to go for a custom solution that would allow us to scale easily as the number of scenarios, and with it the execution time, keeps growing. Because of this, a test runner that splits the tests into groups that can be run by different processes was not enough, so we decided to design the testing infrastructure so that the tests could be split among different processes, which are in turn split among different machines. Of course, in such a distributed environment, all of the processes need to report to the same central process, which can build a collective view of the test results.
It turns out that a solution built with the tools at hand was not too complicated, though it involved quite some thinking, and the results have been definitely satisfying, with a 60% reduction in execution time. So we decided to share this experience with our blog readers. Here is what we did:
- We created a VM template that can be used to replicate the base environment.
- Each node (VM) has an up-to-date local copy of the production code and the test code, and starts multiple instances of the Lettuce runner. Every instance is given all of the Lettuce tests except the ones tagged with the "exclude" tag, each time with a different environment variable that identifies both the node and the runner.
- Each time a runner examines a scenario, it uses the environment variables defined above to determine which of the existing runners should be in charge of running that scenario. If that turns out to be a different runner, the "exclude" tag is added to the scenario at runtime.
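The decision in the last step could be sketched as follows. This is only a minimal illustration, not the actual implementation: it assumes that the scenario name is hashed with SHA-256 and the result is taken modulo the total number of runners, and the environment variable names (NODE_ID, RUNNER_ID, NODE_COUNT, RUNNERS_PER_NODE) as well as all function names are hypothetical.

```python
import hashlib
import os


def owner_index(scenario_name, total_runners):
    """Map a scenario name to the index of the runner that should execute it."""
    # hashlib yields the same digest in every process and on every machine,
    # unlike the built-in hash(), which is salted per process for strings.
    digest = hashlib.sha256(scenario_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % total_runners


def global_runner_index(node_id, runner_id, runners_per_node):
    """Flatten a (node, runner) pair into one index in [0, total_runners)."""
    return node_id * runners_per_node + runner_id


def should_exclude(scenario_name, node_id, runner_id, node_count, runners_per_node):
    """Return True when a different runner is in charge of this scenario."""
    total_runners = node_count * runners_per_node
    mine = global_runner_index(node_id, runner_id, runners_per_node)
    return owner_index(scenario_name, total_runners) != mine


def should_exclude_from_env(scenario_name, environ=None):
    """Same decision, reading the (hypothetical) variables from the environment."""
    env = os.environ if environ is None else environ
    return should_exclude(
        scenario_name,
        node_id=int(env["NODE_ID"]),
        runner_id=int(env["RUNNER_ID"]),
        node_count=int(env["NODE_COUNT"]),
        runners_per_node=int(env["RUNNERS_PER_NODE"]),
    )
```

A runner that gets True back would add the "exclude" tag to the scenario; how that is wired into Lettuce is deliberately left out of this sketch. Because every runner computes the same owner index for a given scenario name, each scenario ends up with exactly one runner.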
The logic used to decide whether or not to exclude a specific scenario is common and reproducible among all of the runners, on all of the existing nodes, and it is essentially based on a mix of hashing and modulus operations involving the node and runner ids. In this way each runner executes a subset of the available scenarios (tests) and produces an XML file with the results of its specific subset, which can then be parsed by a central process together with the results from the other runners. In our case this process is a Continuous Integration server, which also takes care of showing a live status of these tests and publishing the results on a web page.
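The collection step on the central process could look roughly like this. The sketch assumes that every runner writes a JUnit-style XML report whose "testsuite" element carries "tests", "failures" and "errors" attributes; the function name and the report path in the usage note are hypothetical, and a CI server would normally do this parsing itself.

```python
import xml.etree.ElementTree as ET


def summarize(report_paths):
    """Add up the test counters found in a list of XML report files."""
    totals = {"tests": 0, "failures": 0, "errors": 0}
    for path in report_paths:
        root = ET.parse(path).getroot()
        # A report may have <testsuite> as its root element, or wrap one or
        # more suites in a <testsuites> element.
        if root.tag == "testsuite":
            suites = [root]
        else:
            suites = root.findall("testsuite")
        for suite in suites:
            for key in totals:
                # A missing attribute is counted as zero.
                totals[key] += int(suite.get(key, 0))
    return totals
```

Calling summarize(glob.glob("reports/*.xml")) on a directory that collects the files of all runners would then give a single set of counters for the whole run.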
The results vary depending on the overall number of runners, which in turn depends on the number of available nodes and the maximum number of runners per node. We have found that running no more than 10 instances of a browser (our tests were using Google Chrome) on a Linux VM (Ubuntu in our case) with 6GB of RAM and 2 virtual CPU cores is the best tradeoff between parallelism and the degradation caused by resource usage. Of course, these values strictly depend on the system in use and the type of tests.
There are many possible modifications and improvements to this approach, and we are eager to hear from you about them. How? Comments!