Cantor-Schroeder-Bernstein made easy
Earlier this month I made a post about one of Felix Bernstein’s theorems on infinite cardinals, and briefly mentioned something much more famous with his name on it: the Cantor-Schroeder-Bernstein theorem. Though the result of this very important theorem is intuitively obvious, its proof gave me a lot of trouble as an undergrad. Recently I tried going through it again, and was surprised to find it isn’t so hard after all. This proof will follow the attractive one given in Stewart & Tall’s excellent book The Foundations of Mathematics (Oxford Science Publications):

Prerequisites: Bijective functions, Cardinality, Inverse functions
Theorem (Cantor, Schroeder, Bernstein): Suppose f:X -> Y and g:Y -> X are injections. Then there exists a bijection b:X -> Y.
(In terms of cardinalities, this says that if we know both |X| =< |Y| & |Y| =< |X|, then in fact |X| = |Y|. Of course, you may say. If two objects are both smaller than or equal to each other in size, their size must equal. Yet this fact requires a careful proof, for X and Y may both be infinitely large, and f & g generally have nothing to do with each other.)
Proof: The trick is to use f and g to partition X and Y into three subsets each, and then to construct b out of bijections found between those subsets. Given an arbitrary element y in Y, we can build a certain chain leading to three distinct possibilities, as follows:
If there happens to exist an element x in X such that f(x) = y, then that x is unique, because f is an injection, i.e. one-to-one function. We don’t assume f to be a surjection, so it’s possible no such x exists. Suppose there is such an x; then we can ask the same question about x, whether or not there exists a y* in Y with g(y*) = x. If such a y* does exist, again it is unique, since g is an injection. At this point, there may or may not exist a unique x* with f(x*) = y*…and so on…and so on…
Let’s suppose this chain goes up to x* and then stops, so that there’s no y** in Y satisfying g(y**) = x*. Then the picture looks like this:

Now in general for our arbitrary y in Y, three things can happen:
Possibility 1: the chain comes to an end with an element of X, as pictured above.
Possibility 2: the chain comes to an end with an element of Y. This case covers the situation in which the chain never starts in the first place.
Possibility 3: the chain goes on forever.
It doesn’t take much thought to see these three possibilities cover every situation, without any overlap. Therefore, Y gets divided into three disjoint zones: the place where Possibility 1 happens we’ll call Y_X, and similarly we have Y_Y and Y_∞ for Possibilities 2 and 3 respectively. Symmetrically, we can build chains in X to obtain disjoint subsets X_X, X_Y, and X_∞. Here’s the picture now, forgetting about elements:

Note that in general these zones will be scattered around X and Y in a disorderly way, but since they’re disjoint, for conceptual clarity we can represent them in this clean way.
Now consider restricting f to X_X; elements in X_X sent through f go to Y_X in a one-to-one fashion. Furthermore, for any element in Y_X, there’s a unique element in X_X sent to that element via f. In other words, f restricted to X_X is both injective and surjective - so this is a bijection between X_X and Y_X! In a symmetric way, if we restrict g to Y_Y, we get a bijection between Y_Y and X_Y. There are two bijections available between X_∞ and Y_∞ - f restricted to the former, and g restricted to the latter. In the following picture, I used f restricted to X_∞, and also included all of the inverse functions that we get for free.

(Yes, these pictures were painstakingly produced in MS Paint.)
Now it should be clear that there is a bijection b from all of X to all of Y. We define b piecewise: if x is in X_X, b(x):= (f|X_X)(x); if x is in X_Y, b(x):= (g|Y_Y)^(-1)(x); and if x is in X_∞, b(x):= (f|X_∞)(x). Clearly b is a bijection because its constituent parts are bijections.
QED!
This is the type of proof that takes some words to get down, but which the picture makes extremely clear. Having it under our belt, we can safely compare the sizes of any two infinite sets in a consistent way. Woe if that weren’t actually possible!