Assertions (by )

Assertions are a useful tool in defensive programming. If you're writing a bit of code that assumes some condition to be true, since that condition being true is a design constraint of your system so it can only possibly be untrue if there is a bug elsewhere in your code, then it can help to put an explicit assert of that condition before your code.

For example, if you have a program that handles the progress of patients through a suite of operating theatres in a large hospital, you might have a Person object that has a theatre pointer, which is NULL when the person is not in a theatre, or contains a reference to a Theatre object when the person is in theatre.

Clearly, some methods of Person might only make sense while that person is in a theatre, and will dereference the theatre pointer. If they just assume they are only called when the person is in theatre, and for some reason they are called when the person is not, then you'll get a segmentation violation in C++ or a NullPointerException in Java, or its equivalents in other languages with checked references.

Which is a little unfriendly; much better if you make an up-front assertion upon entry to your methods that theatre != NULL. That makes it clear where the problem is - a method that needs theatre to be non-NULL has been called when theatre is NULL; your code still dies - with a SIGABRT in C-based languages or some kind of assertion-failure exception - but now the source of the exception, found in the stacktrace, points to much closer to where the problem lies (bear in mind that the method might not directly reference theatre itself but call other methods that eventually do, meaning that the actual problem - the call of a method in a context it shouldn't be called in - might be anywhere in the backtrace, and thus hard to find).

An assertion is usually implemented in the output code as something along the lines of if (!condition) { abort() }, where abort() is either "kill the process and force a core dump" or "throw an assertion failure exception"; so at run time, the condition is checked.

But in principle, a smart compiler could examine assertions and use them as input to a compile-time static analysis of the program. If a method starts by asserting that some field of the object is not NULL, why not look through all the other methods and see which ones assign to that field, and see if we can statically deduce the state of the field in cases where the method is called? This can have several beneficial effects:

  1. The compiler might find bugs: if it sees you calling the method in a context where the field could well be NULL at run time, it can point out how that field could be NULL in that context, thus revealing a bug.
  2. The compiler might be able to optimise code; if it can prove that the field will never be NULL it can skip the conditional in the assertion, and after the assertion it will know that the field cannot possibly be NULL (please, let's keep multithreading out of this for now...) and so can skip checks for NULL in the generated object code, thus speeding up execution.
  3. The presence of assertions can simplify the compiler's deductions concerning the state of the system used for checking other assertions. For example, if every method of the class and the constructor all assert that the field is not NULL before they return, then without needing to do potentially undecidable analysis of the bodies of the methods, we can know that the field will never by NULL on entry to any method (although it could become NULL temporarily during the body of a method).

This is really a generalisation of type inference; in a dynamically typed language we might assert that certain expressions have certain types at certain points in the program, but we might make different assertions for the same field in different contexts, expressing how the type might change; and as well as simple type declarations - foo is an integer - they can also express arbitrary relationships - foo is less than the sine of bar.

So it might be interesting to construct a static semantic inferencing system for a programming language that's driven entirely by assertions. Since the assertions might be arbitrary expressions, and thus not necessarily statically (or even dynamically!) decidable, it would have to be an inferencing system that chooses when to give up trying to prove something and makes a conservative assumption, dropping back towards run-time dynamic checks for things it can't prove at compile time - in effect, making a best-effort to deduce what it can in advance as a form of optimisation; one that optimises for execution time and space by removing explicit checks and type tags, and that optimises for programmer time by performing compile-time bug detection!

But I wouldn't implement such a thing directly in terms of an assert construct in your base language. Better to still implement assert as a macro that expands to if (!condition) { abort() }, and make that abort() into a special construct that means "This should not happen".

Then the static analyser becomes a front-end to a compiler, that takes a dynamically typed language at the input and outputs a type-safe program in a simpler type-unsafe language (by which I mean, one where you can attempt to add a floating point number to a pointer and the system will happily return a value that is neither a useful pointer nor a useful floating point number, without complaining, such as C or FORTH) that a backend compiler then compiles into something useful.

The analyser would compile x+y into a direct integer addition of x and y if it could deduce they're both integers in all possible contexts, and if not, it would compile it into code that has explicit checks for the type of x and y, branching off to appropriate addition functions; and its goal would be to reduce the amount of such testing it generates as much as possible.

Any assertions it can't entirely prove at compile time would have to result in runtime checks - in which case, the abort construct would indeed compile to code that generates a run-time core dump / backtrace.

But the reason I think that we should make abort the fundamental construct rather than assert is purely because it's more general, in a way that will make certain optimisations easier. abort might appear in places other than asserts - for example, we might write switch statements with a default case that aborts, rather than either prefacing the switch with assert(x == 1 || x ==2 || x ==3) (checking the valued switched upon is one of the expected values, but being rather long-winded and having to be manually modified whenever the list of cases in the switch is modified), or adding default: assert(1==2) to the switch (which produces an ugly and misleading error message; the problem is that x was not an expected value, not that 1==2).

Through standard program simplification rules, the compiler can find some bugs simply by managing to eliminate conditionals, thus exposing an abort to code that is always called - and thus trivially marking that code as illegal.

1 Comment

  • By Faré, Thu 21st May 2009 @ 4:14 am

    Have you seen Spec# ? It basically does that!

Other Links to this Post

RSS feed for comments on this post.

Leave a comment

WordPress Themes

Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales
Creative Commons Attribution-NonCommercial-ShareAlike 2.0 UK: England & Wales