The aim of these notes is to achieve a basic understanding of the concepts of mathematical logic. The process of logical deduction is clearly a central theme of mathematics, where the idea is to prove a stated result by the means of an argument which is broken down into small steps, each of which should be obviously valid to the intended audience. This is typically done in a slightly informal fashion, where the validity of each stage of the proof is supposed to be clear, but does not necessary follow a fixed and clearly stated framework. The study of logic is meant to clarify how this process works, and give a clearly defined framework for logical reasoning. While the history of logic goes back thousands of years, a solid mathematical foundation was only developed in the late nineteenth and early twentieth centuries. For example, the Principia Mathematica published in 1910 by Whitehead and Russell was an attempt to show that all of mathematics can be derived from some precisely stated set of axioms together with inference rules for obtaining conclusions from lists of premises. From this standpoint, logic becomes another field of mathematical study. Just as rings, fields, measure spaces, etc., are objects of mathematical study, so is logic. As in any of these other areas, we use standard mathematical reasoning to prove results about logical systems. However, these results can then shed light on the process of mathematical reasoning itself, as well as providing justification for the underlying frameworks used for much of mathematics.
This has been a very successful endeavor, with theories such as Zermelo-Fraenkel set theory (ZF) providing the standard set of axioms used by mathematicians in many different fields of study. Many important ideas and results have arisen from the study of logic, such as the relative consistency of different theories and independence of certain statements. The axiom of choice, for example, is often considered to be intuitively obvious but, at other times, has been considered controversial. Thanks to the mathematical study of logic, it is now known to be independent of the other axioms of ZFC. Similarly, the continuum hypothesis which Georg Cantor spent many years trying to prove, has also been shown to be independent of ZFC. It is also known that, although the axiom of choice can be used to construct sets which are not Lebesgue measurable, it is consistent with the axiom of dependent choice that all sets of real numbers are measurable. In the other direction, reverse mathematics has been successful in determining precisely which axioms are really required for various mathematical theorems. At a higher level, results such as the completeness theorem have put the theory on a solid footing, establishing the equivalence of semantic truth and syntactic provabilty in first order logics. The incompleteness theorem, on the other hand, shows that no recursively enumerable proof system can prove all true statements about the arithmetic of the natural numbers and, furthermore, no sufficiently strong proof system is able to prove its own consistency (unless it is actually inconsistent). For example, it is not possible to prove the consistency of ZFC just by using the axioms and rules of ZFC itself, although it is possible in the presence of additional large cardinal axioms. Similarly, Peano arithmetic is not able to prove its own consistency, but it is possible if the well-ordering property of the ordinal is added.
The mathematical study of logic has also made clear the distinction between classical and intuitionistic or constructive logics where the law of the excluded middle does not hold. Another consequence of putting logic on a solid mathematical foundation is that it should be possible to check the validity of mathematical proofs in an entirely systematic way and could, in theory, be checked by computer. There are various proof-checkers available, although they are currently much more difficult and tedious to use than writing out proofs for human readers, so tend not to be used for most mathematics. There is also a large intersection between mathematical logic and computer science. For example, the Curry-Howard correspondence gives a one-to-one relation between statements of intuitionistic implicational logic and the types of valid programs in simply typed lambda calculus, with the programs or lambda expressions playing the part of proofs. This correspondence has been used as the basis for various computer implementations of proof systems.
The starting point for most logical theories is a language in which statements can be formed according to rules for what constitutes a valid statement (wffs, or well-formed formulas). This generally includes certain special logical connectives which allow statements to be formed which express a logical connection between its component parts. For example, if `P’ and `Q’ are valid sentences, then `‘ is also valid (meaning, `if P then Q’). These logical connectives come with prescribed rules of inference and, together with a fixed list of axioms, we can prove theorems. This does, however, raise various questions. What is the `correct’ set of connectives and what is the correct set of rules of inference that they should follow. It is possible that, with different connectives or rules of inference, an entirely separate set of theorems would result. In this post, I take a step back from such specific theories or rules. The idea is to first look at the most general concept of what a logic is, and only after that, can we can determine exactly what connectives or rules of inference are possible or desirable.
Consequence Relations
Possibly the most basic concept in logic is that of entailment or inference. We start with a collection of premises, which are statements in some language and, according to some rules, we establish a result. The starting point is a set , which can be thought of as a set of well-formed statements in some language although, to be as general as possible, we just assume that
is a set with no specific restriction or interpretation assumed of its elements. We write
(1) |
for , to mean that
entails
or, equivalently, that
is a logical consequence of
. Quite what the relation
really means is left open at this stage. For example, considering a collection of interpretations or models, each of which assigns truth values to the elements of
, (1) can be taken to mean that any model assigning the truth value 1 to the
also assigns the value 1 to
. Alternatively, (1) could mean that, with respect to some formal proof system, there exists a proof of
from the premises
. More generally,
(2) |
means that is a logical consequence of the set of premises
. For convenience, we often put a list of subsets or elements of
as the list of premises, so that
for and
, is just another way of writing
Similarly, the statement with no premises is just another way of writing
.
As a matter of terminology, expression (2) is a sequent, and the symbol is referred to as a turnstile. The left hand side,
, is the set of premises or the antecedent and
is the consequent, and we say that
entails
or that
is a consequence of
. Although we are being rather general here and not considering any particular interpretation of the set
or of the relation
, there is a short list of `obvious’ properties which should hold for logical inference.
Definition 1 A consequence relation
on
is a relation on
satisfying
(reflexivity/axiom of identity),
- if
then
(weakening),
- if
for all
and
then
(transitivity/rule of cut),
for all
and
. We will say that
is finitary if, whenever
then
for some finite
.
A pair
consisting of a set
together with a consequence relation
on
will be known as a logic. We will often just use
to denote the logic
.
The notion of consequence relations goes back to Tarski (Logic, Semantics, Metamathematics). For logical consequence to be finitary means that the proof of a statement only ever requires invoking finitely many premises, which is clearly a desirable property if we are wanting to have a computable theory, where the truth of consequences could potentially be verified by a computer or expressed on paper. Note that any consequence relation , whether or not it is finitary, will uniquely determine a finitary consequence relation by
iff
for some finite
.
It should be noted that there are more general logics than those given by definition 1. We could drop any, or all, of the defining properties. For example, non-monotonic logics do not satisfy the second condition. Furthermore, it is possible to generalise the antecedent to be a multiset where the number of times that each element occurs is important, or even a sequence where the number of occurences and the order of the premises matters. I will not, however, be concerned with such logics, with those satisfying the definition above including everything which I intend to consider.
There will generally be many different consequence relations on a set . A relation
on
can alternatively be viewed as a subset of
consisting of all pairs
for which
. These can be ordered by inclusion
, inducing an ordering on relations
iff
implies
. We will also say that
is weaker than
or, equivalently, that
is stronger. In particular, this induces an ordering on the consequence relations. The following are examples of consequence relations.
- The minimal consequence relation on any set
, given by
if and only if
.
- The maximal consequence relation on any set
, given by
for all
and
.
- The Boolean domain containing two values, thought of as `false’ and `true’. These are variously denoted as
,
,
or simply
. The consequence relation on
is defined by
if either
or
. It can be seen that this is the minimal consequence relation satisfying
.
- Any map (truth valuation)
from
to the Boolean logic
defines a consequence relation on
given by
iff
whenever
for all
. Equivalently,
. This corresponds to the rather idealised situation where
is a set of logical statements, each of which is either true or false. Assuming that
is a set of true statements, then
iff
is true.
- A set
of models or interpretations on
defines, for each
, a truth valuation
. We define a consequence relation on
by
iff, for each
satisfying
for all
, we have
. Equivalently,
for all
. This is known as semantic consequence, and has an intuitive interpretation as follows. Consider
as a collection of universes, within which, every statement in
has a definite truth value. A set
of premises restricts us to being in
, the set of universes in which all statements in
are true. Writing
signifies that
is true for all universes in
.
This interpretation leads us to consider the concepts of consistency and explosion. A set
is consistent if there exists at least one model
in which all elements of
are true and, otherwise, is inconsistent. From an inconsistent set of premises
, everything follows, so
for all
. This is the principle of explosion (“from falsehood, everything”). On the other hand, for a reasonably expressive logic system, every model will make some statements false. For example, if
contains a statement `P’ and another statement `not P’, then these should not both be true in any model and, hence, if
is consistent then it will not entail everything in
. This suggests the concept of consistent and inconsistent sets of premises in a general logic:
is inconsistent if
for all
and, otherwise, is consistent. The logic itself is inconsistent if
for all
and, otherwise, is consistent.
- We can define a consequence relation on any collection
of subsets of a set
. Say
if
. For example,
could be the power set
, or it could be the collection of open (resp., closed, compact) subsets of a topological space.
This consequence relation can also be defined by the collection
of truth valuations given by
if
and
if
.
- Any collection of consequence relations
(
) on a set
has a unique infimum given by
iff
for all
.
- For any relation
on
, there will be a unique minimal consequence relation such that
whenever
. This is the consequence relation generated by
.
- For any logic
and set of premises
, there exists a minimal consequence relation,
on
satisfying
whenever
, and,
for all
. This has the rather simple construction as
iff
. Such constructions are common. For example,
may consist of statements built from some specified list of symbols combined with logical connectives. The base consequence relation
would then satisfy the deduction rules associated with those connectives, and we add a set
of axioms specific to the theory at hand to obtain the required consequence relation.
- Let
be a partially ordered set, and write
iff
for some
. Then,
is a consequence relation.
- Syntactic consequence: Given rules for constructing well-formed formulas (wffs) in a given language, and rules of inference to generate new wffs from old ones,
can be taken to denote that the wff
can be deduced from those in
by a sequence of applications of the deduction rules.
- For a strongly typed programming language (e.g., simply typed lambda calculus), we can let
be the collection of types and
means that there exists a function defined in the language with return type
and whose arguments are of types in
.
Finitary consequence relations are uniquely determined by their restriction to finite sets of antecedents. That is, in definition 1, we need only consider to be a relation on
, where
denotes the collection of finite subsets of
. Then, the sets
in the definition should be taken to be finite. Any such relation
uniquely extends to a finitary consequence relation on
by
iff
for some finite
. For finitary consequence relations, the statements of definition 1 can be significantly weakened.
Theorem 2 A relation on
is a consequence relation iff,
,
- if
then
,
- if
and
then
,
for all finite
and
.
Proof: Note that the first statement is the axiom of identity for the special case , the second statement is weakening for the case
, and the third statement is transitivity for the case
. So, consequence relations satisfy the properties above. It remains to show that if
satisfies the properties of the theorem, then it is a consequence relation.
If and
then, to prove weakening, it needs to be shown that
. We will use induction on the size of
. If
is empty, then the statement is trivial. Otherwise, we can write
for
strictly included in
. By the induction hypothesis,
. The second statement of the theorem says that
or, equivalently,
as required. The axiom of identity,
follows from the first statement of the theorem together with weakening.
For transitivity, suppose that for all
and that
. We need to show that
. Again, induction on the size of
will be used. If
is empty then the statement is trivial. Suppose that
for
strictly smaller than
. Then, we have
for all
and
. So, by the induction hypothesis,
and
. By the third statement of the theorem, this gives
as required. ⬜
An alternative approach to logic is via closure operators. The logical closure of is the set of all
for which
, and can be shown to satisfy the axioms of a closure operator. Similarly,
is said to be logically closed if every
satisfying
is itself in
. That is,
is logically closed if it equals its own logical closure.
Definition 3 A closure operator is a map
satisfying
.
- if
then
.
.
We say that
is finitary iff
(3)
Note that if is finitary then the second condition in the definition of a closure operator can be removed, as it already follows from (3). Specifying a closure operator on a set is equivalent to specifying a consequence relation.
Theorem 4 Given a set
, there is a one-to-one relation between each of the following
- a set
which is closed under intersections and such that
.
- a closure operator
.
- a consequence relation
on
.
These are related by
(4) and
is finitary iff
is finitary.
Proof: If is a consequence relation, define the map
by the second equality of (4). We need to show that this satisfies each the three properties of definition 3 of a closure operator. The first property,
follows immediately from the axiom of identity and the second from weakening. For the third, if
then, setting
, we have
for all
and
so, by transitivity,
, showing that
and, hence,
. Conversely, if
is a closure operator then define
if
. The axiom of identity follows from
and weakening follows from
for
. For transitivity, suppose that
for all
and
. Then,
and
and, hence, , showing that
is a consequence relation.
For a closure operator , define
to be the set of
with
, so that the third equality of (4) holds. By definition, every
is of the form
. Also, as
,
can also be expressed as the sets
for
, showing that the final equality of (4) also holds. We have
. If
over
, then setting
,
Taking the intersection over shows that
and, hence,
is closed under intersections. Conversely, suppose that
is any collection of subsets of
and define
by the first equality of (4). It is clear that this satisfies the first two conditions of definition 3. Next, for every
with
we have
by definition so,
So, is a closure operator. We have not even used the conditions that
or that
is closed under intersections yet. This is only required to verify the final identity of (4). It is clear that every
is the intersection of all
containing
, so that
. Conversely, it remains to show that
for all
. As
, there does exist at least one
containing
and, as
is closed under intersections,
is then in
.
Finally, suppose that is finitary. Then, for any
, we have
and, hence,
for a finite
. This shows that
and, hence (3) holds and
is finitary. Conversely, if
is finitary and
then
and, by (3),
for a finite
. So,
, showing that
is finitary. ⬜
As noted above, any relation on
generates a consequence operator. Furthermore, if
is finitary, then
is also finitary, and an entailment
can be established by a finite sequence of applications of
, as stated below. In particular, if
is recursively enumerable, then so are the consequences
, and proofs of these entailments can be checked by following a recursively enumerable set of rules, so could potentially be verified by computer.
Lemma 5 Let
be the consequence operator generated by a relation
on
. Then
iff
or there exists
with
and
.
Furthermore,
is finitary.
Proof: We show that by induction on n. First, if
then
and, by weakening,
. Next, for
, we have
for each
by the induction hypothesis and, hence,
for all
. As
, transitivity gives
.
Conversely, it needs to be shown that every with
can be expressed as above. Let us start by defining
to mean that
satisfies the conclusion of the theorem. If it can be shown that
is a finitary consequence relation then we are done. The axiom of weakening is immediate and, as the construction above only refers to finitely many finite subsets of
, it is finitary. The axiom of identity is also immediate as, by definition,
for all
.
By theorem 2, it only remains to show that if and
then
. If
or
then the result is immediate. So, we only need to consider the case where we have finite sequences
, finite sets
and
satisfying
and
, and
,
. In that case, set
and
for
. Similarly, set
and
for
. We then have
,
and
, so
as required. ⬜
A consequence relation defines a preorder on
, given by
. This can be expressed in either of the following ways.
Lemma 6 For any
, the following are equivalent.
.
for all
satisfying
.
for all
and
satisfying
.
Proof: If the first statement holds and then
by transitivity.
Suppose that the second statement holds and . As
, the second statement gives
. Transitivity gives
.
Finally, suppose that the third statement holds. Using , we have
and, hence,
. Putting
gives the first statement. ⬜
Consequently, defines an equivalence relation on
given by
and
. We say that
and
are logically equivalent.
Corollary 7 For any
, the following are equivalent.
and
.
iff
, for all
.
iff
, for all
and
.
If any (or all) of these hold, then we say that
are logically equivalent and write
.
The following is just a statement of the fact that a consequence relation is not able to distinguish logically equivalent statements.
Lemma 8 Let
and
. Suppose that
and for every
there is an
with
.
If
then
.
Proof: For every , we have
for some
. As
, this implies that
. By transitivity,
. So,
. ⬜
Note that for any logic we can define its set
of equivalence classes, together with a natural function
taking each
to its equivalence class. We can then define a consequence relation on
by
iff
for
and
.