The aim of these notes is to achieve a basic understanding of the concepts of mathematical logic. The process of logical deduction is clearly a central theme of mathematics, where the idea is to prove a stated result by the means of an argument which is broken down into small steps, each of which should be obviously valid to the intended audience. This is typically done in a slightly informal fashion, where the validity of each stage of the proof is supposed to be clear, but does not necessary follow a fixed and clearly stated framework. The study of logic is meant to clarify how this process works, and give a clearly defined framework for logical reasoning. While the history of logic goes back thousands of years, a solid mathematical foundation was only developed in the late nineteenth and early twentieth centuries. For example, the Principia Mathematica published in 1910 by Whitehead and Russell was an attempt to show that all of mathematics can be derived from some precisely stated set of axioms together with inference rules for obtaining conclusions from lists of premises. From this standpoint, logic becomes another field of mathematical study. Just as rings, fields, measure spaces, etc., are objects of mathematical study, so is logic. As in any of these other areas, we use standard mathematical reasoning to prove results about logical systems. However, these results can then shed light on the process of mathematical reasoning itself, as well as providing justification for the underlying frameworks used for much of mathematics.
This has been a very successful endeavor, with theories such as Zermelo-Fraenkel set theory (ZF) providing the standard set of axioms used by mathematicians in many different fields of study. Many important ideas and results have arisen from the study of logic, such as the relative consistency of different theories and independence of certain statements. The axiom of choice, for example, is often considered to be intuitively obvious but, at other times, has been considered controversial. Thanks to the mathematical study of logic, it is now known to be independent of the other axioms of ZFC. Similarly, the continuum hypothesis which Georg Cantor spent many years trying to prove, has also been shown to be independent of ZFC. It is also known that, although the axiom of choice can be used to construct sets which are not Lebesgue measurable, it is consistent with the axiom of dependent choice that all sets of real numbers are measurable. In the other direction, reverse mathematics has been successful in determining precisely which axioms are really required for various mathematical theorems. At a higher level, results such as the completeness theorem have put the theory on a solid footing, establishing the equivalence of semantic truth and syntactic provabilty in first order logics. The incompleteness theorem, on the other hand, shows that no recursively enumerable proof system can prove all true statements about the arithmetic of the natural numbers and, furthermore, no sufficiently strong proof system is able to prove its own consistency (unless it is actually inconsistent). For example, it is not possible to prove the consistency of ZFC just by using the axioms and rules of ZFC itself, although it is possible in the presence of additional large cardinal axioms. Similarly, Peano arithmetic is not able to prove its own consistency, but it is possible if the well-ordering property of the ordinal is added.
The mathematical study of logic has also made clear the distinction between classical and intuitionistic or constructive logics where the law of the excluded middle does not hold. Another consequence of putting logic on a solid mathematical foundation is that it should be possible to check the validity of mathematical proofs in an entirely systematic way and could, in theory, be checked by computer. There are various proof-checkers available, although they are currently much more difficult and tedious to use than writing out proofs for human readers, so tend not to be used for most mathematics. There is also a large intersection between mathematical logic and computer science. For example, the Curry-Howard correspondence gives a one-to-one relation between statements of intuitionistic implicational logic and the types of valid programs in simply typed lambda calculus, with the programs or lambda expressions playing the part of proofs. This correspondence has been used as the basis for various computer implementations of proof systems.
The starting point for most logical theories is a language in which statements can be formed according to rules for what constitutes a valid statement (wffs, or well-formed formulas). This generally includes certain special logical connectives which allow statements to be formed which express a logical connection between its component parts. For example, if `P’ and `Q’ are valid sentences, then `‘ is also valid (meaning, `if P then Q’). These logical connectives come with prescribed rules of inference and, together with a fixed list of axioms, we can prove theorems. This does, however, raise various questions. What is the `correct’ set of connectives and what is the correct set of rules of inference that they should follow. It is possible that, with different connectives or rules of inference, an entirely separate set of theorems would result. In this post, I take a step back from such specific theories or rules. The idea is to first look at the most general concept of what a logic is, and only after that, can we can determine exactly what connectives or rules of inference are possible or desirable.
Possibly the most basic concept in logic is that of entailment or inference. We start with a collection of premises, which are statements in some language and, according to some rules, we establish a result. The starting point is a set , which can be thought of as a set of well-formed statements in some language although, to be as general as possible, we just assume that is a set with no specific restriction or interpretation assumed of its elements. We write
for , to mean that entails or, equivalently, that is a logical consequence of . Quite what the relation really means is left open at this stage. For example, considering a collection of interpretations or models, each of which assigns truth values to the elements of , (1) can be taken to mean that any model assigning the truth value 1 to the also assigns the value 1 to . Alternatively, (1) could mean that, with respect to some formal proof system, there exists a proof of from the premises . More generally,
means that is a logical consequence of the set of premises . For convenience, we often put a list of subsets or elements of as the list of premises, so that
for and , is just another way of writing
Similarly, the statement with no premises is just another way of writing .
As a matter of terminology, expression (2) is a sequent, and the symbol is referred to as a turnstile. The left hand side, , is the set of premises or the antecedent and is the consequent, and we say that entails or that is a consequence of . Although we are being rather general here and not considering any particular interpretation of the set or of the relation , there is a short list of `obvious’ properties which should hold for logical inference.
Definition 1 A consequence relation on is a relation on satisfying
- (reflexivity/axiom of identity),
- if then (weakening),
- if for all and then (transitivity/rule of cut),
for all and . We will say that is finitary if, whenever then for some finite .
A pair consisting of a set together with a consequence relation on will be known as a logic. We will often just use to denote the logic .
The notion of consequence relations goes back to Tarski (Logic, Semantics, Metamathematics). For logical consequence to be finitary means that the proof of a statement only ever requires invoking finitely many premises, which is clearly a desirable property if we are wanting to have a computable theory, where the truth of consequences could potentially be verified by a computer or expressed on paper. Note that any consequence relation , whether or not it is finitary, will uniquely determine a finitary consequence relation by iff for some finite .
It should be noted that there are more general logics than those given by definition 1. We could drop any, or all, of the defining properties. For example, non-monotonic logics do not satisfy the second condition. Furthermore, it is possible to generalise the antecedent to be a multiset where the number of times that each element occurs is important, or even a sequence where the number of occurences and the order of the premises matters. I will not, however, be concerned with such logics, with those satisfying the definition above including everything which I intend to consider.
There will generally be many different consequence relations on a set . A relation on can alternatively be viewed as a subset of consisting of all pairs for which . These can be ordered by inclusion , inducing an ordering on relations iff implies . We will also say that is weaker than or, equivalently, that is stronger. In particular, this induces an ordering on the consequence relations. The following are examples of consequence relations.
- The minimal consequence relation on any set , given by if and only if .
- The maximal consequence relation on any set , given by for all and .
- The Boolean domain containing two values, thought of as `false’ and `true’. These are variously denoted as , , or simply . The consequence relation on is defined by if either or . It can be seen that this is the minimal consequence relation satisfying .
- Any map (truth valuation) from to the Boolean logic defines a consequence relation on given by iff whenever for all . Equivalently, . This corresponds to the rather idealised situation where is a set of logical statements, each of which is either true or false. Assuming that is a set of true statements, then iff is true.
- A set of models or interpretations on defines, for each , a truth valuation . We define a consequence relation on by iff, for each satisfying for all , we have . Equivalently, for all . This is known as semantic consequence, and has an intuitive interpretation as follows. Consider as a collection of universes, within which, every statement in has a definite truth value. A set of premises restricts us to being in , the set of universes in which all statements in are true. Writing signifies that is true for all universes in .
This interpretation leads us to consider the concepts of consistency and explosion. A set is consistent if there exists at least one model in which all elements of are true and, otherwise, is inconsistent. From an inconsistent set of premises , everything follows, so for all . This is the principle of explosion (“from falsehood, everything”). On the other hand, for a reasonably expressive logic system, every model will make some statements false. For example, if contains a statement `P’ and another statement `not P’, then these should not both be true in any model and, hence, if is consistent then it will not entail everything in . This suggests the concept of consistent and inconsistent sets of premises in a general logic: is inconsistent if for all and, otherwise, is consistent. The logic itself is inconsistent if for all and, otherwise, is consistent.
- We can define a consequence relation on any collection of subsets of a set . Say if . For example, could be the power set , or it could be the collection of open (resp., closed, compact) subsets of a topological space.
This consequence relation can also be defined by the collection of truth valuations given by if and if .
- Any collection of consequence relations () on a set has a unique infimum given by iff for all .
- For any relation on , there will be a unique minimal consequence relation such that whenever . This is the consequence relation generated by .
- For any logic and set of premises , there exists a minimal consequence relation, on satisfying whenever , and, for all . This has the rather simple construction as iff . Such constructions are common. For example, may consist of statements built from some specified list of symbols combined with logical connectives. The base consequence relation would then satisfy the deduction rules associated with those connectives, and we add a set of axioms specific to the theory at hand to obtain the required consequence relation.
- Let be a partially ordered set, and write iff for some . Then, is a consequence relation.
- Syntactic consequence: Given rules for constructing well-formed formulas (wffs) in a given language, and rules of inference to generate new wffs from old ones, can be taken to denote that the wff can be deduced from those in by a sequence of applications of the deduction rules.
- For a strongly typed programming language (e.g., simply typed lambda calculus), we can let be the collection of types and means that there exists a function defined in the language with return type and whose arguments are of types in .
Finitary consequence relations are uniquely determined by their restriction to finite sets of antecedents. That is, in definition 1, we need only consider to be a relation on , where denotes the collection of finite subsets of . Then, the sets in the definition should be taken to be finite. Any such relation uniquely extends to a finitary consequence relation on by iff for some finite . For finitary consequence relations, the statements of definition 1 can be significantly weakened.
Theorem 2 A relation on is a consequence relation iff,
- if then ,
- if and then ,
for all finite and .
Proof: Note that the first statement is the axiom of identity for the special case , the second statement is weakening for the case , and the third statement is transitivity for the case . So, consequence relations satisfy the properties above. It remains to show that if satisfies the properties of the theorem, then it is a consequence relation.
If and then, to prove weakening, it needs to be shown that . We will use induction on the size of . If is empty, then the statement is trivial. Otherwise, we can write for strictly included in . By the induction hypothesis, . The second statement of the theorem says that or, equivalently, as required. The axiom of identity, follows from the first statement of the theorem together with weakening.
For transitivity, suppose that for all and that . We need to show that . Again, induction on the size of will be used. If is empty then the statement is trivial. Suppose that for strictly smaller than . Then, we have for all and . So, by the induction hypothesis, and . By the third statement of the theorem, this gives as required. ⬜
An alternative approach to logic is via closure operators. The logical closure of is the set of all for which , and can be shown to satisfy the axioms of a closure operator. Similarly, is said to be logically closed if every satisfying is itself in . That is, is logically closed if it equals its own logical closure.
Definition 3 A closure operator is a map satisfying
- if then .
We say that is finitary iff
Note that if is finitary then the second condition in the definition of a closure operator can be removed, as it already follows from (3). Specifying a closure operator on a set is equivalent to specifying a consequence relation.
Theorem 4 Given a set , there is a one-to-one relation between each of the following
- a set which is closed under intersections and such that .
- a closure operator .
- a consequence relation on .
These are related by
and is finitary iff is finitary.
Proof: If is a consequence relation, define the map by the second equality of (4). We need to show that this satisfies each the three properties of definition 3 of a closure operator. The first property, follows immediately from the axiom of identity and the second from weakening. For the third, if then, setting , we have for all and so, by transitivity, , showing that and, hence, . Conversely, if is a closure operator then define if . The axiom of identity follows from and weakening follows from for . For transitivity, suppose that for all and . Then, and
and, hence, , showing that is a consequence relation.
For a closure operator , define to be the set of with , so that the third equality of (4) holds. By definition, every is of the form . Also, as , can also be expressed as the sets for , showing that the final equality of (4) also holds. We have . If over , then setting ,
Taking the intersection over shows that and, hence, is closed under intersections. Conversely, suppose that is any collection of subsets of and define by the first equality of (4). It is clear that this satisfies the first two conditions of definition 3. Next, for every with we have by definition so,
So, is a closure operator. We have not even used the conditions that or that is closed under intersections yet. This is only required to verify the final identity of (4). It is clear that every is the intersection of all containing , so that . Conversely, it remains to show that for all . As , there does exist at least one containing and, as is closed under intersections, is then in .
Finally, suppose that is finitary. Then, for any , we have and, hence, for a finite . This shows that and, hence (3) holds and is finitary. Conversely, if is finitary and then and, by (3), for a finite . So, , showing that is finitary. ⬜
As noted above, any relation on generates a consequence operator. Furthermore, if is finitary, then is also finitary, and an entailment can be established by a finite sequence of applications of , as stated below. In particular, if is recursively enumerable, then so are the consequences , and proofs of these entailments can be checked by following a recursively enumerable set of rules, so could potentially be verified by computer.
Lemma 5 Let be the consequence operator generated by a relation on . Then iff or there exists with
Furthermore, is finitary.
Proof: We show that by induction on n. First, if then and, by weakening, . Next, for , we have for each by the induction hypothesis and, hence, for all . As , transitivity gives .
Conversely, it needs to be shown that every with can be expressed as above. Let us start by defining to mean that satisfies the conclusion of the theorem. If it can be shown that is a finitary consequence relation then we are done. The axiom of weakening is immediate and, as the construction above only refers to finitely many finite subsets of , it is finitary. The axiom of identity is also immediate as, by definition, for all .
By theorem 2, it only remains to show that if and then . If or then the result is immediate. So, we only need to consider the case where we have finite sequences , finite sets and satisfying and , and , . In that case, set and for . Similarly, set and for . We then have , and , so as required. ⬜
A consequence relation defines a preorder on , given by . This can be expressed in either of the following ways.
Lemma 6 For any , the following are equivalent.
- for all satisfying .
- for all and satisfying .
Proof: If the first statement holds and then by transitivity.
Suppose that the second statement holds and . As , the second statement gives . Transitivity gives .
Finally, suppose that the third statement holds. Using , we have and, hence, . Putting gives the first statement. ⬜
Consequently, defines an equivalence relation on given by and . We say that and are logically equivalent.
Corollary 7 For any , the following are equivalent.
- and .
- iff , for all .
- iff , for all and .
If any (or all) of these hold, then we say that are logically equivalent and write .
The following is just a statement of the fact that a consequence relation is not able to distinguish logically equivalent statements.
Lemma 8 Let and . Suppose that and for every there is an with .
If then .
Proof: For every , we have for some . As , this implies that . By transitivity, . So, . ⬜
Note that for any logic we can define its set of equivalence classes, together with a natural function taking each to its equivalence class. We can then define a consequence relation on by iff for and .