Self-Stocking Test Suites: Dynamic Testing Made Easy

Dynamic generation of tests in the unittest framework can seem difficult at first, but there's nothing fundamentally broken about unittest in this regard, recent comments by Ian Bicking and Jeremy Hylton notwithstanding. A few simple helper classes make all the difference.

A common example used to demonstrate unittest sucking is automatic checking of code against a database of inputs and correct outputs. Most unittest newbies' first attempt looks like this:

import unittest

def generateOutput(input): 
    return input # or, do some real work

def getOutputTests(): 
    # outputChecks.txt holds one check per line: input<TAB>expectedOutput
    for check in file('outputChecks.txt', 'r'): 
        yield check.rstrip().split('\t')

class OutputCheck(unittest.TestCase): 
    def runTest(self): 
        for input, expectedOutput in getOutputTests(): 
            output = generateOutput(input)
            self.failUnlessEqual(output, expectedOutput)

This is a lot easier to maintain than a single monolithic test case with a test_ method for each input/output combination you want to check, but it's brittle: any test run will only tell you about the first problem, not all of them. Some people find it so frustrating to suffer more crash, compile, edit cycles than necessary that they abandon unittest entirely, but it needn't be that much work.

A generic output check class

A generic TestCase for checking output can be implemented as follows:

class OutputCheck(unittest.TestCase):
    "Basic comparison test."
    
    def __init__(self, input, expectedOutput, methodName='runTest'):
        "Initialise the `OutputCheck`."
        unittest.TestCase.__init__(self, methodName=methodName)
        self.input = input
        self.expectedOutput = expectedOutput

    def generateOutput(self, input):
        "Override this method with your choice of output code."
        return input
        
    def runTest(self):
        "Verify `generateOutput` generates correct output."
        self.failUnlessEqual(self.generateOutput(self.input),
                             self.expectedOutput)

To use OutputCheck in your own tests, subclass it and override the generateOutput method.
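
For example, here's a minimal sketch of a subclass; the upper-casing is a stand-in for whatever code you really want to test, and UpperCaseCheck is a name I've made up for illustration:

class UpperCaseCheck(OutputCheck):
    "Check that upper-casing the input gives the expected output."

    def generateOutput(self, input):
        # Replace this with a call to the code you're actually testing.
        return input.upper()

if __name__ == '__main__': 
    suite = unittest.TestSuite()
    suite.addTest(UpperCaseCheck('spam', 'SPAM'))
    unittest.TextTestRunner(verbosity=2).run(suite)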

Which check failed?

If you're running a hundred output checks from a database, you probably don't want to have to count the dots to figure out precisely which test failed. This is another point at which many people run into trouble with dynamic testing: the test runner prints the doc-string of the method that failed, which is great for normal tests but breaks down when all hundred tests are made from the same runTest method. With OutputCheck as defined above, all you'd see is a dozen failures all labelled "FAIL: Verify generateOutput generates correct output." -- less than helpful.

The good news is that for short, simple tests you can rely on the traceback generated by TestCase.failUnlessEqual, which displays the two values being compared. As your testing life gets more complicated, though, you'll run into two problems.

The first is that if you're not comparing the basic types, you're completely dependent on the __str__ method of your object to see what's going on. Most objects lack their own __str__, so all you're likely to see is something like this:

========================================================================
FAIL: Verify `generateOutput` generates correct output.
------------------------------------------------------------------------
Traceback (most recent call last):
  File "test.py", line 48, in runTest
    self.expectedOutput)
  File "unittest.py", line 302, in failUnlessEqual
    raise self.failureException, \
AssertionError: <F instance at 0x00CC4B48> != <F instance at 0x00CDE5F8>
------------------------------------------------------------------------

The second is that even when you are comparing strings, the way the values are rendered in a traceback won't necessarily be that helpful to you. In either case, being able to supply a test ID or description helps greatly. A simple adaptation of OutputCheck does the job:

class OutputCheck(unittest.TestCase):
    "Basic comparison test."
    
    def __init__(self, input, expectedOutput, testDoc=None, 
                 methodName='runTest'):
        "Initialise the `OutputCheck`."
        unittest.TestCase.__init__(self, methodName=methodName)
        self.input = input
        self.expectedOutput = expectedOutput
        if methodName == 'runTest': 
            if testDoc is None:
                testDoc = input
            self._TestCase__testMethodDoc = testDoc # Filth!

    def generateOutput(self, input):
        "Override this method with your choice of output code."
        return input
        
    def runTest(self):
        "Verify `generateOutput` generates correct output."
        self.failUnlessEqual(self.generateOutput(self.input),
                             self.expectedOutput)

If you pass a testDoc argument to OutputCheck.__init__, it will rudely over-write TestCase.__init__'s private definition of __testMethodDoc. I get paranoid when breaking someone else's encapsulation like this, so I only do it when methodName == 'runTest'.
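
A quick interactive check shows the hack doing its job; shortDescription is what the text test runner uses to label each test:

>>> check = OutputCheck('spam', 'eggs', testDoc='spam should turn into eggs')
>>> check.shortDescription()
'spam should turn into eggs'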

Stocking the test suite

Having derived your own comparison test class with an appropriate generateOutput method, it's time to stock a test suite with instances:

def getOutputTests(): 
    # outputChecks.txt now carries a leading title column:
    # title<TAB>input<TAB>expectedOutput
    for check in file('outputChecks.txt', 'r'): 
        yield check.rstrip().split('\t')

if __name__ == '__main__': 
    suite = unittest.TestSuite()
    for title, input, expectedOutput in getOutputTests(): 
        suite.addTest(OutputCheck(input, expectedOutput, testDoc=title))
    unittest.TextTestRunner(verbosity=2).run(suite)

That's fine for a stand-alone test script, but not so good for a test module, because it denies other modules that import it access to the generated suite. We could move the first three lines after if __name__ to the main body of the module to fix the problem, but reading the tests in at import time makes me nervous; I'd prefer to defer the heavy lifting until the user expects it: when the tests are run.
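
For the record, the import-time version would look roughly like this, with the suite built as a module-level name the moment anyone imports the module:

suite = unittest.TestSuite()
for title, input, expectedOutput in getOutputTests(): 
    suite.addTest(OutputCheck(input, expectedOutput, testDoc=title))

if __name__ == '__main__': 
    unittest.TextTestRunner(verbosity=2).run(suite)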

Having the test suite stock itself at run-time

Enter SelfStockingTestSuite, which stocks itself with tests at run-time. Simply over-ride the stockSelf method to stock the suite with your own tests rather than the default self-test:

class SelfStockingTestSuite(unittest.TestSuite):
    "A `TestSuite` that stocks itself with test cases at run-time."
    
    def __call__(self, result):
        "Stock self with tests via `stockSelf` before running tests."
        self.stockSelf()
        return unittest.TestSuite.__call__(self, result)
    
    def stockSelf(self):
        "Stock self with tests. Over-ride this in a subclass."
        def testMethod():
            print "Self-checked OK."
        test = unittest.FunctionTestCase(testMethod, 
            description="SelfStockingTestSuite self-test")
        self.addTest(test)
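
Running a bare instance exercises the default self-test; a couple of lines are enough to try it out:

if __name__ == '__main__': 
    unittest.TextTestRunner(verbosity=2).run(SelfStockingTestSuite())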

My self-testing test script runner, test.py, uses this technique both to import test modules from within a test suite and to run the tests defined in those modules, but I haven't yet published the updates that use this code.

A generic automatic output checking suite

From SelfStockingTestSuite, it's easy to derive a generic automatic output checking test suite:

class AutomaticOutputChecker(SelfStockingTestSuite):
    "A `SelfStockingTestSuite` to compare output with expected results."

    outputCheckClass = OutputCheck
    
    def stockSelf(self):
        "Stock self with comparison tests."
        for title, input, output in self.getOutputTests():
            test = self.outputCheckClass(input, output, testDoc=title)
            self.addTest(test)

    def getOutputTests(self):
        "Over-ride this to yield (title, input, expectedOutput) tuples."
        yield ('forgot to over-ride getOutputTests', 0, 1)

To use it, specify an appropriate outputCheckClass and over-ride getOutputTests with your own method returning or yielding a sequence of (title, input, expectedOutput) tuples.
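
Here's a sketch of how that might look, reading titles, inputs and expected outputs from the same tab-separated file as before (the file name and layout are my assumptions, not requirements of the class):

class FileOutputChecker(AutomaticOutputChecker):
    "Check output against the expectations listed in outputChecks.txt."

    outputCheckClass = OutputCheck # or your own OutputCheck subclass

    def getOutputTests(self):
        "Yield (title, input, expectedOutput) tuples from the file."
        # Each line: title<TAB>input<TAB>expectedOutput
        for check in file('outputChecks.txt', 'r'):
            yield check.rstrip().split('\t')

if __name__ == '__main__': 
    unittest.TextTestRunner(verbosity=2).run(FileOutputChecker())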

Who shall test the testers?

The next issue is that an exception raised outside of a test case (in stockSelf, for instance) doesn't increment the FAILED or ERROR counts; instead, you get an unhandled exception. That's quite easy to tolerate if you're running one test script at a time, but can get a little confusing if you're trying to run a few thousand tests defined in a few dozen test scripts.

To modify SelfStockingTestSuite to register an error when stockSelf throws an exception, we need to call result.addError from __call__. It turns out that addError expects its first argument to be a TestCase, but it seems sufficient to add a shortDescription method, after which the code looks like this:

import sys

class SelfStockingTestSuite(unittest.TestSuite):
    "A `TestSuite` that stocks itself with test cases at run-time."
    
    def __call__(self, result):
        "Stock self with tests via `stockSelf` before running tests."
        try: 
            self.stockSelf()
        except:
            result.addError(self, sys.exc_info())
        return unittest.TestSuite.__call__(self, result)
    
    def shortDescription(self):
        "Pretend we're a `TestCase` for `result.addError`."
        return self.__class__.__name__
    
    def stockSelf(self):
        "Stock self with tests. Over-ride this in a subclass."
        def testMethod():
            print "Self-checked OK."
        test = unittest.FunctionTestCase(testMethod, 
            description="SelfStockingTestSuite self-test")
        self.addTest(test)
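
To see the error registration at work, here's a deliberately broken suite; BrokenChecker and its complaint are made up purely for the demonstration. Running it reports an ERROR labelled with the class name and finishes with errors=1, rather than killing the whole run:

class BrokenChecker(SelfStockingTestSuite):
    "A suite whose stocking code blows up, to exercise the error handling."

    def stockSelf(self):
        raise IOError("pretend outputChecks.txt has gone missing")

if __name__ == '__main__': 
    unittest.TextTestRunner(verbosity=2).run(BrokenChecker())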

What next?

unittest.TestLoader only picks up subclasses of TestCase, not TestSuite subclasses or instances of either. I need to re-work test.py to include a custom TestLoader to address that.

With any luck, more discussion will prompt some minor changes to unittest to make some of this work easier.

last change 2003-10-24 01:48:16



© 2003-2004, Garth T Kidd