Low Budget Regression Testing for Developers
Having a regression test suite can increase developer
productivity. Verification that changes do not break existing
functionality allows for confident, aggressive coding.
Unfortunately, one is often placed into a situation where there is no
regression suite. In this article, I describe tactics I've used to
quickly create a regression suite designed to facilitate development.
Low Budget Objectives
The objectives for a low budget regression suite are:
Requires a small amount of time to build and maintain,
Runs quickly, and
Provides a simple pass/fail result.
The overriding goal is to keep the level of effort low. The
optimal level of effort is low enough that you don't need to mention to your
manager that you're building a regression suite; otherwise, you'll waste
time defending the need to do so.
I'll run this type of test suite multiple times per day, right after
I complete a small to medium sized set of changes. That
way, when a failure is encountered, it is pretty obvious where the fault was
introduced. This also removes the need for detailed test
reporting. The last set of changes is known, so a
simple pass/fail indication is often sufficient information to track
down the problem.
There is a trade-off between execution time and coverage.
Designing for quick execution implies less than perfect coverage; finding
the right balance between execution time and coverage is tricky. The
goal here is to provide immediate feedback for a series of small
changes. The proper balance with this goal in mind is heavily
weighted towards quick execution times.
The Perfect World
In a perfect world, testing will be multi-layered and frequent. A
nice model is the continuous
integration approach. Unit tests are developed along with the
application code. Changes are committed frequently and the application
is tested daily with an automated process.
For Java developers, JUnit
is the standard for unit testing. Another nice touch is a code coverage
tool to ensure that sufficient unit tests have been written to engage all
code branches. A thorough approach will perform an automated testing
cycle on a daily basis including pulling the latest sources from a repository,
building the application, performing the tests, and creating reports.
Many tools are
available to automate these tasks.
The Real World
But alas, reality intrudes. I usually find myself new to a project,
asked to implement a significant new feature, and there are no test cases
in sight. Major re-factoring might look like the right course of
action. However, without being confident about not breaking existing
code, it takes a lot of nerve to aggressively attack a problem. The
right answer for me is to build a quick, low budget regression
suite. With a regression suite in place, I feel the confidence to
do the right thing and leave the code base cleaner when I'm done.
There are many reasons for not having test cases. It is easy to
assign blame in this area. But let's face the facts; developing and
maintaining a test suite can be expensive. A project team
is often constrained by available time and resources. Implementing new
features and fixing existing bugs is typically given more emphasis than
developing project infrastructure. This emphasis on features is for a
good reason; features and bugs are much more apparent to users.
I also believe that many managers and customers fail to make
the connection between a good testing program and high quality software.
The Elements of a Regression Test Suite
The quickest way to generate a test suite is to do black
box testing. A black box test presents a set of
inputs to the system, captures the output, and compares the output
to known expected values. Figure 1 shows the major elements
needed to put this plan into effect.
Figure 1. Elements and execution flow for a regression test suite.
The main software element needed is a test harness. The test harness
reads the test cases (step 1) and uses the contained information to
invoke the application (step 2). The test harness then collects the
output from the application (step 3). I call the output from the
application the actual results. The next step is for the test harness to
compare the actual results to a known set of expected results (step 4).
The final step is to emit a test report (step 5).
One item not shown is the setup and tear-down phases. In the best
case the test harness will be able to start with a clean system
and load any required supporting elements (such as reference data in a
database). Even nicer is a test harness that cleans up after itself.
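The flow in Figure 1 can be sketched in a few lines of a scripting language. The following is a minimal sketch in Python, not a definitive harness; the function names `run_case` and `run_suite` and the shape of the test case tuples are assumptions made for illustration, and it assumes the application can be driven from the command line and writes its results to standard output.

```python
import subprocess
from pathlib import Path


def run_case(command, expected_path):
    """Run one test case; return True if the output matches the expected file."""
    # Steps 2 and 3: invoke the application and collect its output.
    result = subprocess.run(command, capture_output=True, text=True)
    actual = result.stdout
    # Step 4: compare the actual results to the expected results.
    expected = Path(expected_path).read_text()
    return actual == expected


def run_suite(cases):
    """cases is a list of (name, command, expected_path) tuples (step 1).

    Returns the names of the failing cases.
    """
    failed = [name for name, command, expected_path in cases
              if not run_case(command, expected_path)]
    # Step 5: emit a simple pass/fail report.
    print("PASS" if not failed else "FAIL: " + ", ".join(failed))
    return failed
```

Setup and tear-down are left out here as well; in this sketch they would be extra calls made before and after `run_suite`.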
The main way to keep this plan "low budget" is to Keep It
Simple. For the test case file, I like a nice simple text format that
can be easily edited in a text editor. XML is nice for this, but may
involve more overhead when building your test harness. I tend to pick
a well known format such as the Java properties format or the Windows .ini
format. Several other well established text file formats can be
found in The Art of Unix Programming.
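As an illustration of how little code a simple format needs, here is a sketch that reads test cases from an .ini style file using Python's standard `configparser` module. The section layout, the `command` and `expected` keys, and the `myapp` program are hypothetical; they are one possible convention, not a prescribed one.

```python
import configparser
import shlex

# Hypothetical test case file: one section per test case.
SAMPLE = """
[list_users]
command = myapp --list-users
expected = expected/list_users.txt

[show_version]
command = myapp --version
expected = expected/show_version.txt
"""


def load_cases(text):
    """Return a list of (name, command_args, expected_path) tuples."""
    # Interpolation is disabled so a literal % in a command is left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return [(name,
             shlex.split(parser[name]["command"]),
             parser[name]["expected"])
            for name in parser.sections()]
```

Adding a test case is then a matter of adding three lines to the file in a text editor.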
The use of text files applies to the expected and actual output files as
well. The comparison can then be done by a diff between the two
files. This avoids having to write your own code to make the
comparison. One common issue is that the actual and expected files may
have fields that are expected to differ between runs (such as sequence numbers
or timestamps) and need to be ignored in the comparison. I
recommend developing a regular
expression based filter to blank out the fields in
question. The filter is then applied to both the expected and
actual output files prior to comparing them with a diff program.
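A filter of this kind might look like the following sketch. The two patterns (a `YYYY-MM-DD HH:MM:SS` timestamp and a `seq=` counter) are assumptions chosen for illustration; the real patterns depend on what your application emits. Python's standard `difflib` module stands in for an external diff program here.

```python
import difflib
import re

# Hypothetical volatile fields; replace with patterns that fit your output.
VOLATILE = [
    (re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}"), "<TIMESTAMP>"),
    (re.compile(r"(?<=seq=)\d+"), "<SEQ>"),
]


def mask(text):
    """Blank out fields that are expected to differ between runs."""
    for pattern, placeholder in VOLATILE:
        text = pattern.sub(placeholder, text)
    return text


def diff_lines(expected, actual):
    """Return unified diff lines; an empty list means the outputs match."""
    return list(difflib.unified_diff(
        mask(expected).splitlines(),
        mask(actual).splitlines(),
        "expected", "actual", lineterm=""))
```

Because the same `mask` function is applied to both sides, a timestamp or sequence number can never cause a failure on its own, while any other change still shows up in the diff.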
For development level testing, the status I want reported is whether or not
I broke anything. The test harness output can be as simple as a line
count of the diff program's output. If there are no
differences, things are in good shape and coding can continue. One can
even do this by hand:
<mybox> diff actual.txt expected.txt | wc
Unix command line tools are well suited for this kind of text file
processing. When I am working on a Windows box, I install either
for these tasks. For examining the
Merge is a good choice.
Preparing expected results can be difficult. One approach is to craft
what the output should be based on the test cases. This can be a lot of
work; consider the output of a web application where the expected
results could be many pages of complex HTML. In some cases,
you might not know enough about the application to prepare the expected results.
The approach I take is to just go with the current state of the system.
Sure, there may be a few (or a lot) of bugs. However, the goal
is to avoid unexpected changes. The quickest way to generate
expected results is to run the test harness and capture the actual
output. Then, simply use this as the expected output going forward.
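One way to automate this is to have the harness record the output the first time a test case runs and compare against that recording afterwards. The sketch below assumes one expected file per test case; the function name `check_or_record` is hypothetical.

```python
from pathlib import Path


def check_or_record(actual, expected_path):
    """Return True on a pass.

    If no expected file exists yet, the actual output is saved as the
    expected output and the case is treated as passing.
    """
    path = Path(expected_path)
    if not path.exists():
        # First run: capture the current behavior as the baseline.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(actual)
        return True
    return path.read_text() == actual
```

To accept an intentional change in behavior, delete the expected file and run the suite again.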
The Test Harness
The test harness needs to process text files and interact with the
application. Scripting languages are a perfect fit for this task as they
feature quick development and excellent text processing features.
Interacting with the application may involve command line invocations,
creating HTTP requests, or perhaps calling stored procedures. Most
scripting languages will have libraries that make short work of invoking the
application. My favorite scripting language is
Perl. Libraries for
most any task can be found on
CPAN. For Java
is a scripting language that will make the most of your existing Java
skills. Every developer should have scripting language expertise in
their tool box. I believe that the choice of language is not
overly critical, so if you don't know a scripting language, just pick
one from the
and learn it.
The main points to keep in mind are:
Do develop a regression suite to increase development productivity and
Keep the test harness simple.
Favor quick execution and development time over extensive coverage.
Use text files for inputs and outputs.
Implement the test harness in a scripting language.
Having a regression suite helps to ensure success in difficult
situations. I think you'll find the effort of developing a low budget
regression suite to be well worth your time.