Sound and Complete

A jury reaches a verdict: guilty! But was she, in fact, innocent after all?

Throughout life, we must make decisions: evaluate, then accept or reject. Life being what it is, there are more ways to screw up than not. The field of formal logic gives us two properties we can use to describe any evaluation procedure, such as the jury above: soundness and completeness. A sound jury will never send a guilty person free, while a complete jury will never send an innocent person to jail. (As an aside, I think the concrete example of a jury and its verdicts is much easier to grok than Wikipedia’s treatment, which says “Informally, a soundness theorem for a deductive system expresses that all provable sentences are true. Completeness states that all true sentences are provable.”) Intuitively, soundness and completeness are orthogonal: you can have any combination of (non-) soundness and (non-) completeness.

The ideal, of course, is sound and complete: the perfect justice system that will never exist. What we have is neither sound nor complete: innocent people are sent to jail, and guilty people go free.

Anyways, soundness and completeness are also pervasively used in static analysis and programming language type theory, each of which are close siblings to formal logic. Well, type theory is more like a mirrored reflection of formal logic, but that’s neither here nor there. A sound type system (or analysis) is one that rejects all erroneous programs, and a complete system accepts all correct programs.

Now, the interesting question is: which is more important? Is it better to have a complete but unsound static analysis, which will report every error that exists, and then maybe some errors that don’t exist? Or better to have the opposite, sound and incomplete, where every error that it reports is an actual error, but it might not find all the issues in the program? I happen to think sound and incomplete is better: first, because sound analyses compose, and second, because many of the properties you really want to have verified (such as “does this rendering look right?”) don’t fall in the scope of the verification system. Complete and unsound certainly has its appeal: after working through all the reported errors (and dealing with the wild goose chases), you’ll know that your program is free of all the bugs that the analysis is capable of finding. Of course, completeness is unachievable for many of the most interesting and critical problems, whereas sound-but-incomplete is more within reach.

What about for type systems? A sound type system is one that is not lenient, in that it rejects all invalid programs plus some number of valid programs. A complete type system accepts all valid programs, and some invalid ones to boot. Which to pick?

The situation isn’t quite as obvious as with a static analysis, for two reasons. The first difference is that a programming language with a complete and unsound (static) type system has a fallback in the form of dynamic typing. In fact, one of the most powerful type system features, dependent typing, is not statically verifiable unless carefully restricted. Dependent type systems therefore blur the distinction between compile-time type checking and runtime type checking, and thus between static and dynamic typing. The second complication is that the arithmetic mean programmer tends to value expressiveness rather more than compile-time verification. Programmers dislike having the computer reject a program that would have run fine, simply because the computer couldn’t make sure it would run fine without actually running it. In short, restrictive type systems drive programmers to more flexible environments. Consider how Mozilla Firefox is designed: a core of statically-typed C++ to do heavy lifting, with dynamically-typed JavaScript running the user interface. Not coincidentally, this static core/dynamic shell is shared by many large video games (like World of WarCraft: C++ and Lua).

The target domain matters, of course. One hopes that the language for a program controlling an airliner’s fuel supply preferred soundness over completeness. And type inference raises another issue: there exist type systems that are sound and complete, but are not amenable to type inference.

Decisions, decisions…

p.s. Since I’m a visual learner, I created a reference chart:

Chart of soundness and completeness properties

Chart of soundness and completeness properties


3 thoughts on “Sound and Complete

  1. The definition given here is exactly wrong: “Is it better to have a complete but unsound static analysis, which will report every error that exists, and then maybe some errors that don’t exist?”. A complete but unsound static analysis will report no false positives but may report false negatives i.e. every bug reported is an actual bug, but not all bugs may be reported.

  2. Hi Finn, I double-checked the definition — I think the post is correct. From :

    “A proof system is sound if everything that is provable is in fact true. […]
    A proof system is complete if everything that is true has a proof.”

    Translating terminology, a bug report is a “proof”, and the presence or absence of an underlying flaw is ground truth. So soundness means all bug reports correspond to actual bugs (no false positives). Completeness means all bugs (“everything that is true”) get reported (“has a proof”) (no false negatives).

    Of the two properties, I think completeness is the easier one to keep intuitively straight in my head because the formal and informal meanings are closer — “incomplete” is roughly interchangeable with “missing” so “complete” means “nothing missing”.

  3. I’m on Finn side. They often talk about sound type systems, but not complete. ML-family (OCaml, Haskell) is sound – e.g. if the program compiles it is proven to not have type errors. (Sidenote: But still can have dynamic errors, like division by zero or integer overflow – anything which is not possible to express as a type error. What you can express as type error depends on your type system.)

    > So soundness means all bug reports correspond to actual bugs (no false positives)

    Yes. And if there are no reports it means there are no bugs due to type errors.

    > Completeness means all bugs (“everything that is true”) get reported (“has a proof”) (no false negatives).

    This is the flaw in your judgment. Completeness means that type checker will not reject programs that actually does make sense but can be rejected by the sound type checker. But because it is so permissive it is also can pass programs with type errors.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s