Zermelo–Fraenkel set theory

Zermelo–Fraenkel set theory is a first-order axiomatic set theory. Under this name are known two axiomatic systems - a system without axiom of choice (abbreviated ZF) and one with axiom of choice (abbreviated ZFC). Both systems are very well known foundational systems for mathematics, thanks to their expressive power.

Fundamentals
What follows is a beginner's overview of the background necessary to understand what ZFC exactly is. We assume that the reader is familiar with sets and the notation of formal logic.

Formal languages and theories
A formal language consists of an alphabet and a collection of formation rules. The alphabet is a collection of valid symbols. Sentences within the language consist of strings of symbols, and the formation rules specify precisely which sentences are considered valid. Intuitively, formation rules can be thought of as being grammars telling us if a sentence is correctly written.

A theory is a set of sentences in a given formal language, called the axioms of the theory. A deductive system within a theory is a set of inference rules, which we use to produce new sentences given a set of existing sentences. If we start with a set of axioms and then keep applying all sorts of inference rules, we build up a collection of sentences that we call theorems. The intuitive connection to make here is that axioms are things that we assume to be true, and the theorems are things that we prove to be true. The inference rules are the basic elements necessary to make a proof -- any time you apply an inference rule to a true statement, we get another true statement.

One theory, called predicate calculus, is very useful to set theorists. Its most important feature is to provide us with logical connectives \(\land,\lor,\neg,\Rightarrow,\Leftrightarrow,\forall,\exists,\) as part of its alphabet. It also gives us rules of inference to give these logical connectives meaning. A complete description is beyond the scope of this article; see ProofWiki for that.

The language of set theory augments predicate calculus with infinitely many set variables, denoted by letters, and two set connectives \(=,\in\). A single set theory is therefore a theory employing the language of set theory.

Truth, consistency, and incompleteness

 * From here, we'll assume that the theories we are dealing with are extensions of predicate calculus. For example, the notion of refutation is meaningless if our theory doesn't even have a negation operator.

Starting from the axioms of a theory, we can deduce which statements are true, and which are false. Given a theory \(T\), we say that \(T\) proves sentence \(\phi\), denoted \(T\vdash\phi\) if the rules of inference can be repeatedly applied to axioms ultimately leading to \(\phi\). We also say that \(T\) proves \(\phi\) true. The negation of \(\phi\), denoted \(\neg\phi\), can be thought of as the logical opposite of \(\phi\). If \(T\vdash\neg\phi\), we say \(T\) refutes \(\phi\), or proves \(\phi\) false.

If a statement can be proven true, then common sense tells us that it cannot be proven false. Unfortunately, in formal languages, it's not that simple. Here's trivial example: theory \(T\) contains both \(\phi\) and \(\neg\phi\) as axioms. It's clear that such a theory won't have very wide uses: why bother working with true and false if we don't have a dichotomy? But it turns out that truth is more brutal — if we can prove that some statement is both true and false, the tells us that we can prove that everything is true and false! A theory which proves some statement as well as its negation is called inconsistent, and when this is impossible, the theory is consistent.

On the other hand, in a theory there can be sentences which cannot be proven nor refuted by axioms of the theory, even though it's a properly written sentence. If such sentence exists, we say that theory is incomplete, and such a statement is said to be independent of the theory. Of course, it can only happen in a theory which is consistent, because otherwise every sentence is provable.

A famous example is the, which states that there are no cardinal numbers strictly between \(\aleph_0\) and \(2^{\aleph_0}\). Kurt Gödel showed that it cannot be refuted in ZFC, but later Paul Cohen showed that it cannot be proven in ZFC either. Oops. So we say that the continuum hypothesis has been proven independent of ZFC, which isn't a very satisfactory answer to the problem. Another well-known example is the axiom of choice — it's known to be independent from ZF. Set theorists realized that the axiom of choice is a very reasonable assumption, so it was added to form the now widely-accepted ZFC. (You can see here that, in mathematical logic, we must walk the line as far as what axioms we'll accept. What makes a reasonable axiom is a controversial and subjective matter — the early 20th century was rife with academic debates on this topic.)

shows that if the language of a theory is expressive enough to form statements about basic arithmetic, and the set of axioms is simple enough to be written down by a Turing machine, then it cannot be both consistent and complete. That is, there either is a sentence which is both provable and refutable, or is neither provable nor refutable. In very fuzzy and informal terms:


 * We can't develop a sensible, sufficiently powerful theory that conclusively solves every problem.

For an in-depth, beginner-level coverage of this topic, see the Pulitzer Prize-winning book  by Douglas Hofstadter. Among the many interwoven topics in the book, one of them is a walkthrough of the construction of an axiomatic theory of arithmetic and a specific example of how to "stump" it with an independent statement.

It's worth noting that there are some theories which are both complete and consistent. Recall that the first incompleteness theorem applies to theories expressive enough to form statements about basic arithmetic, but not necessarily to those that can't., for example, is complete and consistent, but it is not strong enough to form statements about multiplication.

Models: Informal
A theory is quite a bit like a code of law. It dictates what's legal and what isn't. But what's the use of a code of law without a city that enforces and abides by the law? This is where the notion of a model comes in. A theory gives us rules (theorems) to abide by, whereas a model is an actual implementation of those rules.

After all this discussion of provability and independence, it may seem weird to say that every valid statement in a given formal language is either objectively true or false inside a set. You could visualize a set as an infinite table of valid statements, each assigned to a truth value:

There's no notion of independence here, since we haven't dealt with inference rules or axioms. Every statement is 100% true or 100% false. (Imagine that these statements are part of a formal system, which English is not.)

A theory, on the other hand, is a collection of axioms and deduction rules that generate a table like this one:

Now we can move on to the notion of satisfaction. Given a theory \(T\) and a set \(M\), if every statement that's true in \(T\) is also true for \(M\), and every statement that's false in \(T\) is also false in \(M\), then \(M\) is a model of \(T\), or \(M\) models \(T\). The set represented by the first table is a valid model of the theory represented by the second table. The truth values of the non-independent statements match up with those of the model.

Models: Formal
The motivation for most of our theories is to formalize our attempts to prove statements about abstract objects. For example, Peano arithmetic was created to formalize theorems about the set of natural numbers. The set of natural numbers satisfies all of Peano's axioms, and we thus can say that this set models these axioms, or that it's a model of PA. Similarly, one can argue that the is a model of ZF theory. The difference between these two examples is that universe of all sets isn't a set — it's something known as proper class. This is a problem because neither ZF nor ZFC can talk about proper classes. We want the model to be a set.

Inside a set, every sentence is either true or false. This is not subject to provability: the validity of every sentence is now an inherent property of the set. If, inside this set, all the axioms of a theory \(T\) are true, then we say that this set is a model of \(T\).

An important property of predicate calculus is that, if sentence \(\phi\) can be deduced from sentences which are true inside the model, then \(\phi\) is true in the model as well. Thanks to this, we can see that if \(T\) is inconsistent, then it doesn't have a model — inside a model, no sentence can be both true and false. So every theory with a model is consistent.

The converse is also true — every theory which is consistent has a model. This is exactly the. This equivalence is very important, because working with models of set theory allows us to prove the independence of many statements.

Axioms
First eight of these axioms are axioms of ZF. Together with the last one, they form ZFC. Intended interpretation for \(\in\) is "is a member of" and for \(=\) is "is the same set as".

Axiom of Extensionality
\[\forall x \forall y (\forall z: (z \in x \Leftrightarrow z \in y) \Rightarrow (x = y))\]

Given sets \(x,y\), if they have exactly the same elements, then they are equal.

This can also be treated as a definition of equality rather than an axiom, as it's not required for us to have \(=\) in our alphabet.

Axiom of Pairing
\[\forall x \forall y \exists z \forall w (w \in z \Rightarrow (w = x \lor w = y))\]

Given sets \(x,y\) there exists set \(z=\{x,y\}\). By using induction, we can show that for sets \(x_1,...,x_n\) there exists a set \(z=\{x_1,...,x_n\}\).

Axiom of Regularity
Also known as axiom of foundation.

\[\forall x (\exists a (a \in x) \Rightarrow \exists y (y \in x \land \neg \exists z (z \in y \land z \in x)))\]

If set \(x\) is non-empty (i.e. it has an element \(a\)), then there exists an element \(y\in x\) such that \(y\) and \(x\) are disjoint. Equivalent formulation is following: suppose we start with a set \(x_0\), and at every step we take some element \(x_{i+1}\in x_i\). Then this process eventually stops, as we get to empty set.

Example application is that no set can be an element of itself.

Axiom Schema of Specification
Axiom schema is an infinite class of axioms which share their overall structure. For every sentence \(\phi(a,b)\) of set theory we throw in following axiom:

\[\forall x \forall w \exists y \forall z (z \in y \Leftrightarrow (z \in x \land \phi(z, w)))\]

Intuitive meaning is that definable subset of a set is a set itself. To be exact, for every set \(x\), formula \(\phi\) and parameter \(w\) there exists a set \(y\) consisting of these elements \(z\in x\) satisfying \(\phi(z,w)\).

This axiom schema is also known as axiom schema of restricted comprehension, as opposed to unrestricted comprehension. The latter axiom throws out the requirement \(z\in x\). That axiom is however inconsistent, as, by using \(\phi(a,b):=\neg a\in a\), we lead to Russel's paradox.

Axiom of Power Set
\[\forall x \exists y \forall z (z \in y \Leftrightarrow \forall w (w \in z \Rightarrow w \in x))\]

For every \(x\) there is a set \(y\) elements of which are exactly subsets of \(x\). This set is called power set of \(x\) and is denoted \(\mathcal{P}(x)\). By, this allows us to construct sets of increasing sizes.

Axiom of Infinity
\[\exists x (\emptyset \in x \land \forall y (y \in x \Rightarrow (y \cup \{y\}) \in x))\]

Note that this is the only axiom which asserts existence of anything. It says that there exists a set containing infinitely many elements, and it follows from this axiom that empty set exists.

Note that \(\emptyset\) and \(y \cup \{y\}\) is just a notational shorthands for objects expressible purely in set theory language.

Axiom of Union
\[\forall x \exists y \forall z (z \in y \Leftrightarrow \exists w (w \in x \land z \in w))\]

For every set \(x\) there exists a set \(y=\bigcup_{z\in x}z\), a union of all elements of x.

Axiom Schema of Replacement
Another axiom schema. The following is an axiom for every formula \(\phi(a,b,c)\):

\[\forall p ( \forall x \forall y \forall z (\phi(x, y, p) \land \phi(x, z, p) \Rightarrow y = z) \Rightarrow \forall x \exists y \forall z (z \in y \Leftrightarrow \exists w (w \in x \Rightarrow \phi(w, z, p))))\]

Here \(p\) is parameter, and \(\phi(x,y,p)\) expresses that some function \(f\) takes value \(y\) at \(x\). Then for every \(x\) there exists a set \(y\) which is the image of \(x\) under \(f\), so \(z\) is in \(y\) iff we have some \(w\) with \(f(w)=z\).

Axiom of Choice
As already noted, this is the only point where ZF and ZFC differ.

\[\forall x (\neg(\emptyset \in x)\land\forall p\forall q\forall r(p\in q\land p\in r\land q\in x\land r\in x\Rightarrow q=r) \Rightarrow \exists y \forall z (z\in x\Rightarrow \exists! w (w\in z\land w \in y)))\]

If \(x\) is a collection of disjoint sets, then there is a set \(y\) which has exactly one element in common with every element of \(x\).

\(\exists!a\phi(a)\) is a notational shorthand saying "there exists a unique \(a\) satisfying \(\phi(a)\)", that is, \(\exists a(\phi(a)\land\forall b(\phi(b)\Rightarrow b=a))\).

This important and controversial axiom allows us to make existence proofs, such that we can't point out a single example satisfying the theorem. Because of this axiom of choice is a highly nonconstructive statement.