Forcing is a technique invented by Paul Cohen in 1963, first used in set theory to prove that the continuum hypothesis is independent from the axioms of set theory (it can neither be proved nor disproved from them). It requires quite a bit of machinery, so the aim of this text is to provide, for mathematicians or university students, an entry point into forcing with almost no prior knowledge of logic needed. A lot of things are not proved or made explicit, but I tried my best to outline what was going on and to justify the definitions. It should then be easier for you to dive into more detailed books (references are given at the end).
Set theory is often described as a rigorous foundation for mathematics. But in my opinion (and that of others) it is best seen as a theory which studies the notion of infinity. To do so it uses objects called sets. What these sets can do is described by a list of axioms based on our intuition of how they should behave; this list is called ZFC (Zermelo-Fraenkel + the axiom of Choice). A mathematical structure satisfying these axioms is called a model of set theory, just like a structure can satisfy the axioms that define a group.
The definitions of models, true statements and proofs are all made precise by the formalism of first-order logic. The latter also provides powerful theorems and tools, for example:

- the completeness theorem: a theory has a model if and only if it is consistent (i.e. it proves no contradiction);
- the Löwenheim-Skolem theorem: if a theory has an infinite model, then it also has a countable one.
A fundamental question quickly arose in set theory when Georg Cantor proved that the infinity of the natural numbers (written \(\aleph_0\)) was strictly smaller than the infinity of the reals (called the continuum, written \(2^{\aleph_0}\)): is there an in-between? The continuum hypothesis (CH), which Cantor believed was true, is the statement answering "no" to this question; it can be written as \(2^{\aleph_0}=\aleph_1\), where \(\aleph_1\) is the first uncountable cardinal (and \(\aleph_2\) is the first strictly bigger than \(\aleph_1\), and so on...). But despite its simple appearance no one managed to prove or disprove it.
The last assertion I made about the Löwenheim-Skolem theorem might have shocked you. Indeed it implies, if ZFC does have a model, that it also has a countable one. But there are uncountable sets like \(\mathcal{P}(\mathbb{N})\), so how could that be? Recall that a set \(A\) is uncountable if there is no bijection between \(\mathbb{N}\) and \(A\). We actually have to distinguish two things: the point of view of the model, and the point of view from outside of it. From outside the model we can see that it is countable, but from inside we can't: even if you can see that \(A\) is countable from the exterior, there is in fact no bijection between \(A\) and \(\mathbb{N}\) inside the model. The same sort of reasoning applies to similar situations, involving the powerset for example. Do not hesitate to pause and ponder over all this as it is quite mind-boggling; you can also search online for explanations from other people.
In model theory, different models of the same theory may not always agree on what is true and what is not, and this applies to ZFC as well. For example consider the structures \(\langle\{-1, 1\}, \times\rangle\) and \(\langle\mathbb{Q}^*, \times\rangle\), both satisfying the axioms of groups. For the former, the sentence \(\forall x, x^2=1\) is true, but not for the latter. Formulas whose truth value is the same in all the different models are called absolute. Absoluteness is important to point out when you need to be formal, but in this text I will try not to get into that, to keep things a bit simpler.
Around 1940, Gödel proved (via inner models) that if ZFC is consistent then so is ZFC + CH. In other words, one cannot disprove CH from ZFC: otherwise ZFC + CH would prove both CH and ¬CH (the negation of CH), and thus would not be consistent. So two possibilities remain: either ZFC proves CH, or CH is independent from ZFC. To establish the latter we need to prove that if ZFC is consistent then so is ZFC + ¬CH. Thanks to the completeness theorem, we can equivalently show that if there is a model of ZFC then there is a model of ZFC + ¬CH.
So suppose we have a model \(M\) of ZFC, what should we do? Refine Gödel's method maybe? Unfortunately I've been told it cannot work... But in that case maybe we can have more luck with extensions. An analogy can be made with field extensions. If \(K\) is a field in which a polynomial \(P\) has no root, we can construct an extension \(K(\alpha)\) of \(K\) containing a root \(\alpha\) of \(P\), so that \(P(\alpha)=0\) this time. This construction can be controlled "from the ground" by considering \(K[X]\) and then quotienting by the ideal generated by \(P\) (when \(P\) is irreducible). The trouble is, the axioms of ZFC are much more complicated than the ones for fields; a substantial part of mathematics can be carried out with them after all. So it is not possible to just adjoin an arbitrary set and expect to get another model of ZFC every time.
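For a concrete example: \(X^2+1\) has no root in \(\mathbb{R}\), and quotienting yields \[\mathbb{R}[X]/(X^2+1)\cong\mathbb{R}(i)=\mathbb{C}\] an extension in which it now has the root \(i\), built entirely from material available in the ground field.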
But in fact, and surprisingly, you can find sets \(G\) such that \(M[G]\) remains a model of ZFC and such that the properties of \(M[G]\) have been controlled from the ground model \(M\)! This is what the forcing technique manages to do, by using a set (in the ground model) of approximations of the desired object, and a relation \(\Vdash\), called the forcing relation, relating the truth value of statements in \(M[G]\) with these approximations. To set up all of this, we are going to suppose that \(M\) is a countable transitive model (which is possible thanks to techniques like Löwenheim-Skolem and the Mostowski collapse); we will see later what this means and why.
The proof of this theorem can be seen as a simple application of the genericity at play in forcing; we will see how later, but first let's prove it.
Suppose by contradiction that \(F:\mathbb{N}\to2^\mathbb{N}\) is such a surjection. To establish a contradiction, we want to show that \(F\) is actually not surjective, by finding a function \(g\in2^\mathbb{N}\) which is not listed by \(F\).
To do so we can try to construct one, making sure it is different from all the functions listed by \(F\). Since doing this all at once seems like a cumbersome task, we would rather proceed one function at a time, by making sure the following requirement is satisfied for all \(n\): \[\mathcal{R}_n := g\neq F(n)\] This can be spelled out a tad further: \(\mathcal{R}_n := \exists x, g(x)\neq F(n)(x)\). And to make sure each of them is fulfilled, we are going to define \(g\) one value at a time by induction, ensuring \(\mathcal{R}_n\) is fulfilled at step \(n\).
We start with the empty function for initialisation (step -1). For heredity, suppose \(g\) is defined on \(\{0, \ldots, n-1\}\); we wish to extend \(g\) so that it is also defined at \(n\) and fulfils \(\mathcal{R}_n\). Maybe \(g\) is already different (on its domain) from \(F(n)\), but maybe it is not. In all cases we can just define: \[g(n):=\begin{cases}0&\text{ if }F(n)(n)=1\\ 1&\text{ otherwise}\end{cases}\] Or in a more concise manner: \(g(n):=1-F(n)(n)\).
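To make the construction concrete, here is a minimal sketch in Python (the listing `F` chosen below is an arbitrary stand-in; any alleged surjection would do):

```python
# Illustrative stand-in for the alleged surjection: F(n) is the n-th
# listed function from N to {0, 1}.
def F(n):
    return lambda x: (n + x) % 2

# The diagonal function: g(n) := 1 - F(n)(n), so g differs from F(n) at n.
def g(n):
    return 1 - F(n)(n)

# Each requirement R_n is fulfilled: g is not the n-th listed function.
assert all(g(n) != F(n)(n) for n in range(1000))
```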
Before proceeding, here's another way to look at the induction: say \(g_n\) corresponds to \(g\) as it is defined at the \(n\)-th step of the induction; we shall call it an approximation. We know that it has the finite domain \(\{0, \ldots, n\}\) and also that \(g_{n+1}\) is an extension of \(g_n\). By using these two facts, the union \(\bigcup g_n\) (in set theory a function \(f\) is represented as the set of pairs \((x, y)\) where \(x\) is in the domain of \(f\) and \(y\) is its unique image) is a well-defined function from \(\mathbb{N}\) to \(\{0, 1\}\), and it is precisely the \(g\) we defined. And since each \(g_n\) satisfies the requirements up to \(\mathcal{R}_n\), we have that \(g\) satisfies all of them.
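Continuing the Python sketch, each approximation \(g_n\) can be modelled as a finite dictionary, and \(g\) recovered as their union (truncated to a finite stage here, of course):

```python
# The approximation g_n, defined on {0, ..., n}, as a finite partial function.
def g_n(n):
    return {k: 1 - F(k)(k) for k in range(n + 1)}

# g_{n+1} extends g_n, so the union of the chain is a well-defined function.
g_union = {}
for n in range(1000):
    g_union.update(g_n(n))

# The union agrees with the diagonal function g everywhere we computed it.
assert all(g_union[n] == g(n) for n in range(1000))
```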
To see how this relates to forcing we are going to analyse the proof a bit, to unveil some definitions. We can distinguish three types of objects here: the finite approximations \(g_n\), the requirements \(\mathcal{R}_n\), and the limit object \(g\).
The finite approximations have a certain structure: some can be extensions of others, like \(0, 1\mapsto 0\) and \(0\mapsto 0\), but others can be incompatible, like \(0\mapsto 0\) and \(0\mapsto 1\). They form a partial order (\(f\leqslant g\) iff \(\operatorname{dom}f\subseteq\operatorname{dom}g\) and \(g_{\upharpoonright\operatorname{dom}f}=f\)), and moreover this order contains a smallest element (\(m\) is a smallest element of a set \(S\) iff \(\forall s\in S, m\leqslant s\)): the empty function. All this motivates our first definition.
In our case we had the forcing notion \(\{f:\mathbb{N}\to\{0, 1\}\mid\operatorname{dom}f\text{ is finite}\}\), which is called Cohen forcing.
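To illustrate, a condition of Cohen forcing can be modelled as a finite Python dictionary, with the order and compatibility described above (a sketch, with illustrative names):

```python
# p <= q iff dom p is included in dom q and q agrees with p on dom p.
def leq(p, q):
    return all(k in q and q[k] == v for k, v in p.items())

# p and q are compatible iff they agree on the overlap of their domains,
# in which case their union is a common extension.
def compatible(p, q):
    return all(q[k] == v for k, v in p.items() if k in q)

p = {0: 0}          # the condition 0 |-> 0
q = {0: 0, 1: 0}    # the condition 0 |-> 0, 1 |-> 0: an extension of p
r = {0: 1}          # the condition 0 |-> 1: incompatible with p
assert leq(p, q) and compatible(p, q) and not compatible(p, r)
assert leq({}, p)   # the empty function is the smallest element
```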
As we saw earlier, to construct \(g\) from the approximations \(g_n\), it was crucial for them to be pairwise compatible. To be more precise, we needed a common extension for any pair of approximations. Moreover, to avoid ending up with a trivial limit object, the sequence cannot be empty. This leads us to our new representation of limit objects:
Our induction worked because at each step we had the possibility to extend \(g_n\), which did not necessarily satisfy \(\mathcal{R}_{n+1}\), into a function satisfying it. Requirements possessing this property are the ones we are interested in, due to their flexibility: we can satisfy them at any time during a construction.
Now a requirement \(\mathcal{R}\) is represented by the subset \(R:=\{p\in\mathbb{P}\mid\mathcal{R}\text{ is satisfied by }p\}\subseteq\mathbb{P}\); this way, any subset of \(\mathbb{P}\) represents a requirement. Notice how "a condition satisfies \(\mathcal{R}\)" and "an ideal satisfies \(\mathcal{R}\)" can now be simply written \(p\in R\) and \(G\cap R\neq\varnothing\) respectively. This leads us to:

Why the name generic? When studying the sets that made adjunction to the ground model impossible, Cohen realised that they all had specific features. So he had the idea that adjoining a set with no particularities could make it work. Here is how he puts it:
"Now one can trivially adjoin an element already in M. To test the intuition, one should try to adjoin to M an element which enjoys no “specific” property to M, i.e., something akin to a variable adjunction to a field. I called such an element a “generic” element. Now the problem is to make precise this notion of a generic element." (exctract of The origin of forcing)
With that in mind we can see how being \(\mathbb{P}\)-generic over \(M\) captures this idea, as it intuitively means "possessing all the properties you can have in \(M\)".

Let \(\mathbb{P}\) be Cohen forcing, and let \[ D_n := \{p\in\mathbb{P}\mid \exists x, p(x)\neq F(n)(x)\}\\ E_m := \{p\in\mathbb{P}\mid m\in\operatorname{dom}p\}\] These sets are dense. Indeed for any \(D_n\) and \(p\in\mathbb{P}\), since \(p\) has finite domain, we can find \(x\notin\operatorname{dom}p\) and then the extension \(q:=p\cup\{(x, 1-F(n)(x))\}\in D_n\). And for any \(E_m\) and \(p\in\mathbb{P}\), either \(m\) is already in \(\operatorname{dom}p\), or it is not and we can take the extension \(q:=p\cup\{(m, 0)\}\in E_m\).
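In the running Python sketch, these density arguments, and the construction behind the existence lemma invoked next, look as follows (finitely truncated, names illustrative):

```python
# Density of D_n: any condition p extends into D_n by choosing a point
# outside its (finite) domain and differing from F(n) there.
def into_D(p, n):
    x = max(p, default=-1) + 1          # some x not in dom p
    return {**p, x: 1 - F(n)(x)}

# Density of E_m: extend p by defining it at m, if needed.
def into_E(p, m):
    return p if m in p else {**p, m: 0}

# Existence lemma, in this countable situation: enumerate the dense sets
# and extend into each one in turn; the union of the chain is generic.
g_gen = {}
for k in range(1000):
    g_gen = into_E(into_D(g_gen, k), k)

# g_gen meets every D_n and E_m considered: it differs from each F(n)
# somewhere, and it is defined at each m.
assert all(any(g_gen[x] != F(n)(x) for x in g_gen) for n in range(1000))
assert all(m in g_gen for m in range(1000))
```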
Thus we can use the existence lemma to get a \(D\cup E\)-generic function \(g:\mathbb{N}\to\{0, 1\}\). By \(D\)-genericity it is not in the range of \(F\), and by \(E\)-genericity its domain is all of \(\mathbb{N}\).

Due to the axioms of ZFC, adding \(G\) to our ground model will also add all the sets that can be built using it. So we need to be able to know what those "stowaway" sets are, as we are building \(G\) with approximations, in order to influence its construction, and thus get \(M[G]\) to be a model of ZFC + our desired property. To achieve all this while remaining in our ground model \(M\), a bit of work is needed, and this is what names are for.
Names are basically recipes or blueprints which use an unknown ideal to define what they are. You can also keep in mind the analogy with polynomials, whose value is only determined once the values of the unknowns are given. For example: \[A:=\{n\in\mathbb{N}\mid 2n\in G\}\quad\text{or}\quad B:=\begin{cases}G&\text{ if }\varnothing\in G\\\varnothing&\text{otherwise}\end{cases}\] Notice how you can describe these sets without knowing what \(G\) contains: you don't know their actual value until you know \(G\). Besides, you can use previously defined recipes to define new ones, again much like polynomials, which you can add or multiply together to get a new one. For example: if \(A\) and \(B\) are defined recipes then you can define \(C:=\{A, B\}\).
So a name is a set which contains potential elements, and the ones we end up keeping will depend on the ideal we have. Thus we need a way to keep track of that for each potential element. A simple solution would be to have some sort of labelling, where labels represent ideals. Though we can't use ideals directly, because we will ultimately use generic ones, which are not in our ground model (indeed if \(G\in M\) then so is \(\mathbb{P}-G\), and you can check that the latter is dense, so it intersects \(G\)... a contradiction). But there is no problem in using their approximations, which leads us to the following definition:
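In the usual textbook formulation (sketched here; see the references for the precise statement): a \(\mathbb{P}\)-name is a set of pairs \((\sigma, p)\) where \(\sigma\) is itself a \(\mathbb{P}\)-name (a definition by recursion) and \(p\in\mathbb{P}\) is its label. Given an ideal \(G\), the value of a name is then computed recursively: \[\operatorname{val}(\tau, G):=\{\operatorname{val}(\sigma, G)\mid(\sigma, p)\in\tau\text{ and }p\in G\}\] so a potential element \(\sigma\) is kept exactly when its label makes it into \(G\).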
When we proved Cantor's theorem, there was no problem in establishing that the limit object would satisfy the same requirements as its approximations. This is because we could prove that all the limit objects you could get starting from a certain approximation \(p\) would satisfy the same requirements. But in more general situations things can be more complicated, and this is why we need the forcing relation. Its definition corresponds to what we just described:
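In its usual semantic formulation (a sketch; see the references for the precise statement): \[p\Vdash\varphi\quad\text{iff}\quad M[G]\models\varphi\text{ for every }M\text{-generic ideal }G\text{ containing }p\]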
This definition is great, but it depends on all possible \(G\), and such generic sets are not in \(M\). This means that in \(M\) we cannot know whether \(p\) forces \(\varphi\) or not.
To address this issue, the idea consists of defining a syntactic equivalent of the forcing relation, i.e. we define a relation \(p\Vdash^*\varphi\) by induction on \(\varphi\) (in first-order logic, formulas (or statements) are defined by an inductive process, so by following the same process we can define things that apply to all formulas). We can then verify the following:

The basic idea to tackle the problem is to add \(\aleph_2\)-many different elements of \(2^\mathbb{N}\) to our ground model \(M\). Consequently, using the truth lemma, we will have \(|2^\mathbb{N}|\geqslant\aleph_2>\aleph_1\) in \(M[G]\) (which is a model of ZFC since \(G\) is generic), hence refuting CH.
To do so we are going to use the existence lemma on the Cohen forcing \(\mathbb{P}:=\{f:\aleph_2^M\times\mathbb{N}\to\{0, 1\}\mid\operatorname{dom}f\text{ is finite}\}\) to find an \(M\)-generic ideal \(G\); this is possible because \(M\) is countable. Hence \(\bigcup G\) is a total function encoding \(\aleph_2^M\) different elements of \(2^\mathbb{N}\), because the following sets are dense: \[ D_{\alpha, \beta}:=\{p\in\mathbb{P}\mid\exists n\in\mathbb{N}, p(\alpha, n)\neq p(\beta, n)\}\\ E_{\alpha, n}:=\{p\in\mathbb{P}\mid (\alpha, n)\in\operatorname{dom}p\}\]
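To spell out the encoding (a standard computation): for each \(\alpha<\aleph_2^M\) define \[g_\alpha := n\mapsto\left(\bigcup G\right)(\alpha, n)\] Genericity over the \(E_{\alpha, n}\) makes each \(g_\alpha\) a total function in \(2^\mathbb{N}\), and genericity over the \(D_{\alpha, \beta}\) makes \(g_\alpha\neq g_\beta\) whenever \(\alpha\neq\beta\), giving \(\aleph_2^M\)-many distinct elements of \(2^\mathbb{N}\) in \(M[G]\).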
We made the assumption that our model was countable and transitive, which was the historical approach, but other approaches have been developed since then. One of them, attributed to Scott and Solovay, is based on boolean-valued models and doesn't require the ground model to be countable and transitive. It is well covered in Thomas Jech's Set Theory and Timothy Chow's A beginner's guide to forcing. Note also that nowadays forcing is not restricted to set theory anymore: it has reached many fields, has connections with topology, and there are many other things I omitted, like Martin's axiom or iterated forcing for instance.
Here are some resources I gathered on the subject; I used most of them.
I have an MSc in logic but I am not an expert in set theory or even forcing. The present document was produced, in a hurry, for the first Summer of Math Exposition. I'm not entirely satisfied with it, so if you find a mistake, have a suggestion or a comment, you are more than welcome to contact me at william.gaudelier@gmail.com