Narrative tests are lousy unit tests

I want to stop people abusing Python's doctest format. Many of the tests I've seen written as doctest files would have been better off as plain unittest files. I'm going to try to explain why. I have many gripes about how people use doctests, but probably the biggest is that narrative tests are lousy unit tests.

Narratives tell a story. Something happens, then another thing, and another thing, one after the other, in sequence. Earlier events influence later ones as the story gradually assembles a complete picture. Humans like stories; our brains are used to telling them and receiving them.

Technical documentation is often written with a narrative. Tutorials are an obvious case, but not the only one. A guide to an API may show a series of different examples, each contrasting with the others in ways that explain to the reader what they need to understand.

Automated tests can have narratives too, of course. A narrative test is quite easy to write: write some code that does something (and check the result), then do something else (and check that result), and so on until you've done (and checked) everything you want to do (and check). Doctests make this particularly easy. Here's a toy example of a doctest:

   Instantiate a Frobber.

     >>> frobber = Frobber()
     >>> frobber.has_frobbed()
     False

   Now frob it.

     >>> frobber.frob()
     >>> frobber.has_frobbed()
     True

   It can't be frobbed twice.

     >>> frobber.frob()
     Traceback (most recent call last):
     ...
     AlreadyFrobbedError: ...
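
To try the toy narrative for real, it needs a Frobber to run against. The sketch below is one hypothetical implementation (the class body, the error message and the run_narrative helper are assumptions made up for illustration), together with one way of feeding the narrative to the standard doctest module. The ELLIPSIS flag is what lets the "..." in the last line stand in for the error message, and IGNORE_EXCEPTION_DETAIL keeps the match working when the exception's name is reported with a module prefix.

```python
import doctest


class AlreadyFrobbedError(Exception):
    """Raised when frob() is called on a Frobber that has already frobbed."""


class Frobber:
    """Hypothetical Frobber: just enough behaviour to satisfy the narrative."""

    def __init__(self):
        self._frobbed = False

    def has_frobbed(self):
        return self._frobbed

    def frob(self):
        if self._frobbed:
            raise AlreadyFrobbedError("this Frobber has already frobbed")
        self._frobbed = True


# The narrative from the text, as it might appear in a doctest file.
NARRATIVE = """
Instantiate a Frobber.

  >>> frobber = Frobber()
  >>> frobber.has_frobbed()
  False

Now frob it.

  >>> frobber.frob()
  >>> frobber.has_frobbed()
  True

It can't be frobbed twice.

  >>> frobber.frob()
  Traceback (most recent call last):
  ...
  AlreadyFrobbedError: ...
"""


def run_narrative():
    """Run the narrative as a single doctest; return (failed, attempted)."""
    globs = {"Frobber": Frobber}
    test = doctest.DocTestParser().get_doctest(
        NARRATIVE, globs, "frobber_narrative", None, 0)
    flags = doctest.ELLIPSIS | doctest.IGNORE_EXCEPTION_DETAIL
    runner = doctest.DocTestRunner(verbose=False, optionflags=flags)
    results = runner.run(test)
    return results.failed, results.attempted
```

Note that every example shares the same frobber object: each step only makes sense because of the steps that ran before it.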

Narrative tests can be good acceptance tests. An acceptance test often takes the form of a story; an example might be “a logged-out user visits a web page. They click a particular link that needs a logged-in user, so they get taken to a login screen. The user has no account yet, so they walk through the account creation wizard. Once the wizard is completed, the account is created and they are logged in, and they are taken to the link they originally clicked on.”

So, having shown how they are easy to write, and appropriate for some tests, I'll now explain why narratives make lousy unit tests.

A typical unit test has four phases:

  1. Set up a fixture
  2. Interact with the system-under-test
  3. Verify the outcome
  4. Tear down the fixture

Or, to phrase it the way Behaviour-driven Development people might, each unit test says: “Given situation X, when I do Y, then Z happens.”
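
As a rough illustration of those four phases, here is how the Frobber toy example might look as a plain unittest file. This is only a sketch: the Frobber class and AlreadyFrobbedError are hypothetical stand-ins written for this example, and the test names are my own invention.

```python
import unittest


class AlreadyFrobbedError(Exception):
    """Raised when frob() is called on a Frobber that has already frobbed."""


class Frobber:
    """Hypothetical system-under-test, assumed for this example."""

    def __init__(self):
        self._frobbed = False

    def has_frobbed(self):
        return self._frobbed

    def frob(self):
        if self._frobbed:
            raise AlreadyFrobbedError("this Frobber has already frobbed")
        self._frobbed = True


class TestFrobber(unittest.TestCase):

    def setUp(self):
        # Phase 1: set up a fixture. Each test gets its own fresh Frobber.
        self.frobber = Frobber()

    def tearDown(self):
        # Phase 4: tear down the fixture. There is nothing to release here;
        # the method is spelled out only to show where this phase lives.
        self.frobber = None

    def test_new_frobber_has_not_frobbed(self):
        # Given a new Frobber, when nothing is done to it,
        # then it reports that it has not frobbed.
        self.assertFalse(self.frobber.has_frobbed())

    def test_frob_marks_frobber_as_frobbed(self):
        # Phase 2: interact with the system-under-test.
        self.frobber.frob()
        # Phase 3: verify the outcome.
        self.assertTrue(self.frobber.has_frobbed())

    def test_frobbing_twice_raises(self):
        # Given a Frobber that has already frobbed...
        self.frobber.frob()
        # ...when it is frobbed again, then AlreadyFrobbedError is raised.
        with self.assertRaises(AlreadyFrobbedError):
            self.frobber.frob()


if __name__ == "__main__":
    unittest.main()
```

Unlike the steps of a narrative, none of these test methods depends on another one having run first.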

Good unit tests are small and specific: they will test just one condition per test method, i.e. the X and the Y will be as minimal as reasonably possible. There's considerable benefit to this style:

So that's why I think narrative tests are poor unit tests. And I think unit tests ought to be the bulk of most automated test suites.

See also “Tests are code, doctests aren't”.

Andrew Bennetts, October 2008