Choosing a Common Lisp Unit Testing Framework

I have recently become dissatisfied with the unit testing framework I was using: LIFT. After reading Phil Gold’s fairly comprehensive Common Lisp Testing Frameworks I decided to switch to Stefil.

So what’s so wrong with LIFT? Whilst I don’t want to detract from metabang’s efforts, LIFT was annoying me enough that I was considering writing my own unit-testing framework! No one wants YAUTF (yet another unit testing framework), especially mine, so I went shopping. I should also say that I’m overjoyed with other metabang creations like bind and log5, but LIFT doesn’t seem to elevate me much any more (groan).

In my experience (your mileage may vary), LIFT seems slow for what it does. Yes, my machine is a little old and beat-up, but still, the unit-testing machinery should not be a significant burden on the unit-testing process itself! To illustrate this point, let’s look at a highly subjective example. Suppose I want to test the plain and simple truth, but I want to do it 10,000 times, because I never take “yes” for an answer. Here’s a REPL snippet doing just that in LIFT:

CL-USER> (lift:deftestsuite test-lift () ()
           ;; assumed :tests clause holding one anonymous test
           (:tests
            ((lift:ensure t))))

Start: TEST-LIFT
#<Results for TEST-LIFT [1 Successful test]>
CL-USER> (time (loop for i from 1 to 10000 do (lift:run-tests :suite 'test-lift)))

<snip 9,997 lines>
Evaluation took:
  4.029 seconds of real time
  2.100131 seconds of user run time
  0.076005 seconds of system run time
  [Run times include 0.06 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  60,780,256 bytes consed.

And then let’s do the same for Stefil:

CL-USER> (stefil:defsuite* test-stefil)
CL-USER> (stefil:deftest test-true ()
	   (stefil:is t))
<snip 9,997 lines>
Evaluation took:
  1.238 seconds of real time
  0.932059 seconds of user run time
  0.116008 seconds of system run time
  [Run times include 0.357 seconds GC run time.]
  0 calls to %EVAL
  0 page faults and
  88,813,344 bytes consed.
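
For comparison with the LIFT run, the timed Stefil form would be along the following lines. This is only a sketch: it assumes Stefil is installed where ASDF can find it, and that the suite function test-stefil (rather than test-true on its own) is what gets called inside the loop.

```lisp
;; Sketch only: assumes Stefil is installed where ASDF can find it,
;; and that the forms are evaluated one by one at the REPL.
(asdf:load-system :stefil)

(stefil:defsuite* test-stefil)   ; define the suite and make it current

(stefil:deftest test-true ()     ; defined while TEST-STEFIL is current
  (stefil:is t))

;; Assumption: the suite function is what was called in the timed loop.
(time (loop for i from 1 to 10000 do (test-stefil)))
```

The figures reported by time will of course depend on the machine and the Lisp implementation.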

Part of the slowness might be that LIFT prints “Start: TEST-LIFT” 10,000 times, but I didn’t dig any deeper. LIFT also seems slow when just running a handful of suites. Apart from the slowness, the output produced by LIFT isn’t particularly useful. It’s better than nothing, but I can’t really be sure of the testing progress within a suite. Ideally I would like to see some incremental indication of progress; a single “.” per test and a new line after each suite, as Stefil does, is much cleaner.

Secondly, and this is the kicker, I find it difficult with LIFT to find out what went wrong and where, which is surely the whole point of unit testing. We expect stuff to fail, and hunting down the causes of failure in LIFT via the inspector is a bit tiresome. By contrast, Stefil handles test failures by dropping you straight into the debugger when an assertion fails. This is perfect because you can look at the code that caused the error, dig about in the source, fix it and continue the test. This is a natural way to go about test-driven development. It also leverages the REPL, making it a far more interactive experience. The only snag is that this sort of behaviour is not what you want when running automated test and build environments. For that, Stefil provides a special variable, *debug-on-assertion-failure*; when it is bound to nil, Stefil registers the failure but doesn’t drop you into the debugger. LIFT does seem to have a testing parameter, break-on-error?, but this only catches errors; it probably needs a break-on-assertion? as well.
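
A minimal sketch of the unattended case, assuming Stefil is loadable via ASDF. The suite and test names are hypothetical, and the double colon is used so that the form works whether or not the variable is exported from the stefil package.

```lisp
;; Sketch only: evaluate form by form at the REPL or via LOAD.
(asdf:load-system :stefil)

(stefil:defsuite* test-batch)            ; hypothetical suite name

(stefil:deftest test-deliberate-failure ()
  ;; Fails on purpose so there is an assertion failure to record.
  (stefil:is (= 5 (+ 2 2))))

;; Bind the variable to NIL only for the duration of this run, so the
;; failure is recorded in the results instead of entering the debugger.
(let ((stefil::*debug-on-assertion-failure* nil))
  (test-batch))
```

Binding the variable with let, rather than setting it globally, keeps the drop-into-the-debugger behaviour for interactive work at the REPL.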

Finally, Stefil just seems more concise and natural. Since what we’re doing here is creating functions that test other functions, surely we should be able to call tests like functions. In my view classes are not the primary units of a test; functions are. And so it is in Stefil, because every suite and every test is a callable function. In LIFT you have to tell the function lift:run-test to find you a test/suite class with a specific name and then run it.
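
To make the difference concrete, here is a small sketch of calling Stefil tests as ordinary functions. The suite and test names are made up for illustration, and Stefil is again assumed to be loadable via ASDF.

```lisp
;; Sketch only: evaluate form by form at the REPL or via LOAD.
(asdf:load-system :stefil)

(stefil:defsuite* test-arithmetic)       ; hypothetical suite

(stefil:deftest test-addition ()         ; hypothetical test
  (stefil:is (= 4 (+ 2 2))))

(stefil:deftest test-multiplication ()   ; hypothetical test
  (stefil:is (= 6 (* 2 3))))

;; Each test is a plain function, so a single test is just a call...
(test-addition)

;; ...and so is the suite, which runs every test defined in it.
(test-arithmetic)

;; The LIFT equivalent looks the suite up by name instead:
;; (lift:run-tests :suite 'test-lift)
```

Because tests are functions, they can also be called from other tests or from any other code, with no lookup by name involved.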

I didn’t want this blog entry to be a ‘hatchet-job’ on LIFT. That wouldn’t be constructive, and there’s already way too much ranting on the internet. However, in the final analysis, LIFT could be made a lot better than it is. Since the effort in switching wasn’t really that great, I decided to switch to Stefil rather than persevere and try to improve LIFT directly.

Phil Gold actually makes two recommendations in Common Lisp Testing Frameworks: Stefil and fiveam. I would have tried fiveam, which was Phil’s framework of choice, but it wouldn’t install via asdf. Whilst not being asdf-installable isn’t a huge barrier to entry, it suggests something (perhaps wrongly) about the quality of the solution. So I skipped it.