January 16th, 2015

Test parallelization with Lettuce, take 2

by Stefano Rago

It’s been a while since we blogged about Behaviour Driven Development and our approach to test parallelisation based on Lettuce. In this blog post we want to discuss an improvement to that solution as well as share some code snippets that are key in our implementation of this approach.

The solution we proposed in our first blog post brought huge benefits: it reduced the time needed to run the test suite and therefore shortened the feedback cycle. With time, though, a few shortcomings started showing up, and we decided to dedicate some more effort to this crucial aspect of our workflow. Here at Agilo Software we embrace the Continuous Improvement principles as the foundation of our everyday job, and this has given us a chance to challenge our original ideas.

The problem

The main disadvantage of our approach was that each of the parallel test runners received its set of scenarios up front, before any test started running, based on pseudo-random criteria. This does not take into account the ‘complexity’ of each scenario, which in this case translates to the time it takes to run. In fact, we started observing that some of the runners were finishing much earlier than others, which led to very unbalanced running times (see chart).

Lettuce parallelization - Table, old approach

Lettuce parallelization - Chart, old approach

Possible solutions

Of the possible solutions that we gathered, two gained the most attention in our team:

  • weight-based approach: look at the previous run times of the scenarios and assign a weight to each of them, so that the sets can be assigned in a more balanced way based on those weights
  • queue-based approach: all of the test runners share a common data structure that they look up dynamically to determine which scenarios to run

The advantage of the first approach is particularly important in situations where the overhead caused by the lookup can be considerable. With weights, there is no need to look up a centralized data structure to know how to handle the workload, since the assignment is defined at the beginning, possibly by a separate process that has an overview of the distribution characteristics.

In our case, though, we did not have this problem, and our team decided to focus on the second approach, as it seemed simpler and at the same time capable of overcoming some downsides of the first one (e.g. new tests don’t have a history, and tests can change, and with them their run times).
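For comparison, here is a minimal sketch of what the weight-based approach could look like. It is not something we implemented: the greedy "longest scenario first, to the least-loaded runner" strategy, the function name `assign_by_weight` and the timings are all assumptions made for illustration.

```python
import heapq


def assign_by_weight(durations, n_runners):
    """Greedy split: hand each scenario, longest first, to the least-loaded runner."""
    # Heap of (seconds assigned so far, runner index); the lightest runner is on top.
    loads = [(0, runner) for runner in range(n_runners)]
    heapq.heapify(loads)
    buckets = [[] for _ in range(n_runners)]
    longest_first = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    for name, seconds in longest_first:
        total, runner = heapq.heappop(loads)
        buckets[runner].append(name)
        heapq.heappush(loads, (total + seconds, runner))
    return buckets


# Hypothetical timings (in seconds) taken from a previous run.
previous_run = {"login": 40, "checkout": 30, "search": 20, "profile": 20, "logout": 10}
buckets = assign_by_weight(previous_run, 2)
print(buckets)
```

With these made-up numbers both runners end up with 60 seconds of work. The sketch also makes the downsides visible: a scenario that is missing from `previous_run` has no weight, and stale timings lead to a skewed split.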

Our solution

The solution we implemented is based on a data structure that serves as a kind of lock table. Each runner goes through the list of all scenarios and, for each one, checks this table to see whether another runner already holds a lock on it. If not, it locks the scenario and starts running it. This way no two runners can run the same scenario, and each runner pulls new work as soon as it becomes available. Our experiments showed that, as expected, this solution distributes the workload in a much more balanced way (see chart).

Lettuce parallelization - Table, new approach

Lettuce parallelization - Chart, new approach
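To get a feeling for why pulling work balances the runners, the following toy simulation compares an up-front split with the pull-based behaviour described above. It is only a sketch: the durations are invented, and the up-front split uses round-robin as a stand-in for the pseudo-random assignment of our old approach.

```python
import heapq


def static_split(durations, n_runners):
    """Total time per runner when scenarios are dealt out up front (round-robin)."""
    totals = [0] * n_runners
    for index, seconds in enumerate(durations):
        totals[index % n_runners] += seconds
    return totals


def pull_based(durations, n_runners):
    """Total time per runner when each free runner takes the next unclaimed scenario."""
    # Heap of (time at which the runner becomes free, runner index).
    free_at = [(0, runner) for runner in range(n_runners)]
    heapq.heapify(free_at)
    for seconds in durations:
        now, runner = heapq.heappop(free_at)
        heapq.heappush(free_at, (now + seconds, runner))
    return sorted(total for total, _ in free_at)


durations = [50, 5, 5, 50, 5, 5, 50, 5, 5]  # hypothetical run times in seconds
print(static_split(durations, 3))  # [150, 15, 15]
print(pull_based(durations, 3))    # [55, 60, 65]
```

In this example the slowest runner finishes after 150 seconds with the up-front split and after 65 seconds when runners pull their work, even though the total amount of work is the same.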

Implementation

To implement the locking data structure we decided to use a simple SQLite database, which can be created (or flushed) at the beginning of the test suite run, for example by a wrapper script. In our case, we used something along the lines of this snippet in a bash script:

sqlite3 /tmp/scenarios.db "create table scenarios (name TEXT PRIMARY KEY);"

This table contains just one column, holding the names of the scenarios being locked. To lock a scenario, the runner simply inserts the name of the scenario it is about to run into the table. Since the column has a uniqueness constraint, an insert that fails because of that constraint tells the runner that the scenario is already locked, and the runner moves on to the next one. To implement this functionality on the runner side, we decided to fork Lettuce and augment the “run” method of the Feature class. Our fork is publicly accessible here and this specific change is probably generic enough to be useful in other projects as well.

import os
import sqlite3

scenarios_db = os.getenv('SCENARIOS_DB', '/tmp/scenarios.db')

for scenario in scenarios_to_run:
    connection = sqlite3.connect(scenarios_db)
    try:
        # The PRIMARY KEY constraint makes this insert fail if another
        # runner has already locked the scenario.
        with connection:
            connection.execute("INSERT INTO scenarios(name) VALUES (?)", (scenario.name,))
    except sqlite3.IntegrityError:
        # Already locked by another runner: move on to the next scenario.
        continue
    finally:
        connection.close()
    scenarios_ran.extend(scenario.run(ignore_case, failfast=failfast))
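If you want to try the locking behaviour outside of Lettuce, a self-contained sketch could look like the following. The helper names `create_lock_table` and `try_lock`, the temporary database path and the scenario name are made up for this example and are not part of our fork.

```python
import os
import sqlite3
import tempfile


def create_lock_table(db_path):
    """Create the one-column lock table if it does not exist yet."""
    connection = sqlite3.connect(db_path)
    try:
        with connection:
            connection.execute(
                "CREATE TABLE IF NOT EXISTS scenarios (name TEXT PRIMARY KEY)")
    finally:
        connection.close()


def try_lock(db_path, scenario_name):
    """Return True if the caller claimed the scenario, False if it was already claimed."""
    connection = sqlite3.connect(db_path)
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(
                "INSERT INTO scenarios(name) VALUES (?)", (scenario_name,))
        return True
    except sqlite3.IntegrityError:
        # The uniqueness constraint rejected the insert: someone else holds the lock.
        return False
    finally:
        connection.close()


# Hypothetical database location and scenario name, for illustration only.
db_path = os.path.join(tempfile.mkdtemp(), "scenarios.db")
create_lock_table(db_path)
first_attempt = try_lock(db_path, "User logs in")
second_attempt = try_lock(db_path, "User logs in")
print(first_attempt, second_attempt)
```

The first attempt claims the scenario and the second one is rejected, which is exactly the signal a runner uses to skip a scenario that another runner has already picked up.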

Collecting the results from all of the tests was a no-brainer, since each runner produces an XML file with its results, and each file can be parsed individually by the Continuous Integration server (Jenkins in our case).
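Should you want a quick combined summary without a CI server, a rough sketch like the one below could do. It assumes that the result files follow the common xUnit layout, with one `testcase` element per test and a `failure` or `error` child for the ones that did not pass; we have not verified every detail of the Lettuce output against it, and the `summarize` helper is hypothetical.

```python
import glob
import xml.etree.ElementTree as ET


def summarize(pattern):
    """Count test cases and failing test cases across all matching result files."""
    total = 0
    failed = 0
    for path in sorted(glob.glob(pattern)):
        for case in ET.parse(path).getroot().iter("testcase"):
            total += 1
            # Assumption: a failing case carries a <failure> or <error> child.
            if case.find("failure") is not None or case.find("error") is not None:
                failed += 1
    return total, failed
```

Calling `summarize("lettuce_test_results_*.xml")` from the directory that holds the result files returns a `(total, failed)` pair covering all runners.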

What follows is an excerpt of our wrapper script that shows how we launch the parallel Lettuce runners:

TOTAL_RUNNERS=3
PIDS_TO_WAIT_FOR=()

for (( i=1; i<=TOTAL_RUNNERS; i++ )); do
    # start each runner in the background, with its own results file
    python ./manage.py harvest --with-xunit --xunit-file="lettuce_test_results_${i}.xml" ../features &
    PIDS_TO_WAIT_FOR+=($!)
    sleep 3
done

for PID in "${PIDS_TO_WAIT_FOR[@]}"; do
    echo "waiting for ${PID}"
    set +e # keep looping even if the waited-for runner has already quit or returns a non-zero status
    wait "${PID}"
    set -e
done

As always, we are eager to hear comments and improvement ideas from you, so just head over to the comments section and start typing!

Stefano Rago

Stefano Rago joined the agile42 and Agilo Software team in 2010 and has been growing his agile skills as a scrum team member ever since. The main technical aspects he has been focusing on include continuous integration and delivery, test driven development and refactoring. He's also a technical trainer and coach at agile42, helping and challenging teams to find ways of always getting better.

Posted in Development