## PREVIEW

We prove that every natural number is the sum of four integer squares, following a proof of Hurwitz. This proof has been chosen because it resembles the proof of Fermat’s two square theorem already given in Chapter 6, and because it introduces the quaternions, a mathematical structure with many beautiful algebraic and geometric features.

We define the quaternions to be the matrices $\left(\begin{array}{cc}a+d i & b+c i \\ -b+c i & a-d i\end{array}\right)$, where $a, b, c, d \in \mathbb{R}$, after verifying that the matrices $\left(\begin{array}{cc}a & b \\ -b & a\end{array}\right)$ behave like the complex numbers. In this representation, the norm is just the determinant, and its multiplicative property follows from the multiplicative property of determinants. For the matrices representing complex numbers, the determinant again gives the two square identity, and for quaternions it gives a four square identity.

“Quaternion integers” should be the quaternions with $a, b, c, d \in \mathbb{Z}$. However, these lack the division property. To bring it in we augment them with “half integer points” to form the so-called Hurwitz integers. We can then establish a Euclidean algorithm and a prime divisor property. (The quaternion product is noncommutative, which is a slight obstacle, but we get around it by taking care always to multiply and divide on the same side.)

The proof of the four square theorem then follows the proof of the two square theorem very closely.

• Using conjugates, any ordinary prime that is not a Hurwitz prime is shown to be a sum of four squares.
• If an ordinary prime $p$ divides a Hurwitz integer product $\alpha \beta$, then $p$ divides $\alpha$ or $p$ divides $\beta$.
• Any ordinary odd prime $p$ divides a natural number of the form $1+l^{2}+m^{2}$ (analogous to Lagrange’s lemma in Section 6.5 but easier to prove).
• The number $1+l^{2}+m^{2}$ factorizes in the Hurwitz integers. Hence $p$ is not a Hurwitz prime and therefore $p$ is a sum of four squares.
• Since every natural number $n$ is a product of odd primes and the prime 2 (which equals $0^{2}+0^{2}+1^{2}+1^{2}$), the four square identity shows that $n$ is a sum of four squares.
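
The claims in the last three steps can be checked by brute force for small numbers. The following Python sketch is only a numerical sanity check, not part of the proof; the function names `lagrange_witness` and `four_squares` are illustrative and do not come from the text.

```python
from itertools import product
from math import isqrt

def lagrange_witness(p):
    """Search 0 <= l, m < p for a pair with p dividing 1 + l^2 + m^2."""
    for l, m in product(range(p), repeat=2):
        if (1 + l * l + m * m) % p == 0:
            return l, m
    return None

def four_squares(n):
    """Exhaustive search for (a, b, c, d) with a^2 + b^2 + c^2 + d^2 == n."""
    r = isqrt(n)
    for a in range(r + 1):
        for b in range(a, r + 1):
            for c in range(b, r + 1):
                rest = n - a * a - b * b - c * c
                if rest < 0:
                    break  # larger c only makes rest smaller
                d = isqrt(rest)
                if d * d == rest:
                    return a, b, c, d
    return None

# Every n in a small range has a representation, and small odd primes
# divide some number of the form 1 + l^2 + m^2.
assert all(four_squares(n) is not None for n in range(1, 200))
assert all(lagrange_witness(p) is not None for p in (3, 5, 7, 11, 13))
```

For example, `four_squares(2)` finds the representation $0^{2}+0^{2}+1^{2}+1^{2}$ mentioned above.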

## Real matrices and $\mathbb{C}$

We introduce 4-dimensional “hypercomplex numbers” called quaternions. A quaternion is easily defined as a $2 \times 2$ matrix of complex numbers, but to see why we might expect matrices to behave like numbers, we first show how to model the complex numbers by $2 \times 2$ real matrices.
For each $a+b i \in \mathbb{C}$, with real $a$ and $b$, consider the matrix
$$M(a+b i)=\left(\begin{array}{rr} a & b \\ -b & a \end{array}\right)$$
It is easy to check (exercise) that
$$\begin{aligned} M\left(a_{1}+b_{1} i\right)+M\left(a_{2}+b_{2} i\right) &=M\left(a_{1}+a_{2}+\left(b_{1}+b_{2}\right) i\right) \\ &=M\left(\left(a_{1}+b_{1} i\right)+\left(a_{2}+b_{2} i\right)\right), \\ M\left(a_{1}+b_{1} i\right) M\left(a_{2}+b_{2} i\right) &=M\left(a_{1} a_{2}-b_{1} b_{2}+\left(a_{1} b_{2}+b_{1} a_{2}\right) i\right) \\ &=M\left(\left(a_{1}+b_{1} i\right)\left(a_{2}+b_{2} i\right)\right). \end{aligned}$$
Thus matrix sum and product correspond to complex sum and product, and therefore the matrices
$$\left(\begin{array}{rr} a & b \\ -b & a \end{array}\right) \text { for } a, b \in \mathbb{R}$$
behave exactly like the complex numbers $a+b i$.
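
The exercise can be spot-checked numerically. The sketch below is a minimal Python check on one pair of sample values, not a proof; the helpers `M`, `mat_add`, and `mat_mul` are illustrative names, with `M` mirroring the map $M(a+b i)$ defined above.

```python
def M(z):
    """2x2 real matrix representing the complex number z = a + bi."""
    a, b = z.real, z.imag
    return [[a, b], [-b, a]]

def mat_add(X, Y):
    """Entrywise sum of two 2x2 matrices."""
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def mat_mul(X, Y):
    """Ordinary product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

z1, z2 = complex(2, 3), complex(-1, 4)

# Matrix sum and product agree with complex sum and product.
assert mat_add(M(z1), M(z2)) == M(z1 + z2)
assert mat_mul(M(z1), M(z2)) == M(z1 * z2)
```

The sample values have small integer parts, so the floating-point comparisons are exact here.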

Another way to see this is to write
$$\left(\begin{array}{rr} a & b \\ -b & a \end{array}\right)=a\left(\begin{array}{ll} 1 & 0 \\ 0 & 1 \end{array}\right)+b\left(\begin{array}{rr} 0 & 1 \\ -1 & 0 \end{array}\right)=a \mathbf{1}+b \mathbf{i}$$
The identity matrix
$$\mathbf{1}=\left(\begin{array}{ll} 1 & 0 \\ 0 & 1 \end{array}\right)$$
behaves like the number 1, and
$$\mathbf{i}=\left(\begin{array}{rr} 0 & 1 \\ -1 & 0 \end{array}\right)$$
behaves like $\sqrt{-1}$. Indeed
$$\mathbf{i}^{2}=\left(\begin{array}{rr} -1 & 0 \\ 0 & -1 \end{array}\right)=-\mathbf{1}$$
Not only does this matrix representation of $\mathbb{C}$ have natural counterparts of 1 and $i$, it also has a natural interpretation of the norm on $\mathbb{C}$ as the determinant. This is so because
$$\operatorname{norm}(a+b i)=a^{2}+b^{2}=\operatorname{det}\left(\begin{array}{rr} a & b \\ -b & a \end{array}\right)$$
The multiplicative property of the norm follows from the multiplicative property of the determinant:
$$\operatorname{det}\left(\begin{array}{rr} a_{1} & b_{1} \\ -b_{1} & a_{1} \end{array}\right) \operatorname{det}\left(\begin{array}{rr} a_{2} & b_{2} \\ -b_{2} & a_{2} \end{array}\right)=\operatorname{det}\left(\left(\begin{array}{rr} a_{1} & b_{1} \\ -b_{1} & a_{1} \end{array}\right)\left(\begin{array}{rr} a_{2} & b_{2} \\ -b_{2} & a_{2} \end{array}\right)\right) \qquad (*)$$
And since the matrix product on the right-hand side equals
$$\left(\begin{array}{rr} a_{1} a_{2}-b_{1} b_{2} & a_{1} b_{2}+b_{1} a_{2} \\ -a_{1} b_{2}-b_{1} a_{2} & a_{1} a_{2}-b_{1} b_{2} \end{array}\right)$$
equation (*) gives a new way to derive the Diophantus two square identity. Replacing each $\operatorname{det}\left(\begin{array}{rr}a & b \\ -b & a\end{array}\right)$ in (*) by $a^{2}+b^{2}$, we get
$$\left(a_{1}^{2}+b_{1}^{2}\right)\left(a_{2}^{2}+b_{2}^{2}\right)=\left(a_{1} a_{2}-b_{1} b_{2}\right)^{2}+\left(a_{1} b_{2}+b_{1} a_{2}\right)^{2}$$
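
As a small worked instance, the two square identity can be evaluated on integers. The Python sketch below is illustrative only; the function name `two_square_product` is an assumption of this example.

```python
def two_square_product(a1, b1, a2, b2):
    """Return (x, y) with (a1^2 + b1^2)(a2^2 + b2^2) == x^2 + y^2,
    using the entries of the matrix product shown above."""
    return a1 * a2 - b1 * b2, a1 * b2 + b1 * a2

# 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2, so 65 should be a sum of two squares.
x, y = two_square_product(1, 2, 2, 3)
assert (1**2 + 2**2) * (2**2 + 3**2) == x**2 + y**2 == 65
```

Here the identity produces $65=(-4)^{2}+7^{2}$.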

## A geometric property of multiplication

Here is a good place to point out a property of multiplication that we have previously observed in special cases in Chapters 6 and 7: multiplication of all members of $\mathbb{C}$ by some fixed nonzero $z_{0} \in \mathbb{C}$ is a similarity or shape-preserving map, that is, it multiplies all distances by a constant (namely, $\left|z_{0}\right|$).

This is because the distance between complex numbers $z_{1}$ and $z_{2}$ equals $\left|z_{2}-z_{1}\right|$. When multiplied by $z_{0}$, the numbers $z_{1}$ and $z_{2}$ are sent to $z_{0} z_{1}$ and $z_{0} z_{2}$, the distance between which is
$$\left|z_{0} z_{2}-z_{0} z_{1}\right|=\left|z_{0}\left(z_{2}-z_{1}\right)\right|=\left|z_{0}\right|\left|z_{2}-z_{1}\right|$$
by the multiplicative property of the norm.
We observed cases of this in Chapter 6, where multiplying $\mathbb{Z}[i]$ by some $\beta \neq 0$ gave a grid of the same square shape, and in Chapter 7, where multiplying $\mathbb{Z}[\sqrt{-2}]$ by $\beta \neq 0$ gave a grid of the same rectangular shape. In Section 8.4 we use the multiplicative property of the quaternion norm to show similarly that any nonzero multiple of the quaternion “integers” is a grid of the same shape in $\mathbb{R}^{4}$. (Here we use the word “grid” rather loosely, since the quaternion integers are not simply a grid of 4-dimensional cubes.)
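
The scaling of distances can be observed numerically. The following Python sketch multiplies the corners of a unit square by a sample $z_{0}$ and compares all pairwise distances; the choice $z_{0}=3+4i$ and the helper name `distance` are assumptions made for illustration.

```python
import math

def distance(z, w):
    """Distance between complex numbers z and w."""
    return abs(w - z)

z0 = complex(3, 4)  # |z0| = 5
points = [complex(0, 0), complex(1, 0), complex(1, 1), complex(0, 1)]
images = [z0 * z for z in points]

# Every pairwise distance is multiplied by the same constant |z0|.
for i in range(len(points)):
    for j in range(i + 1, len(points)):
        before = distance(points[i], points[j])
        after = distance(images[i], images[j])
        assert math.isclose(after, abs(z0) * before)
```

Since all six pairwise distances scale by the same factor, the image of the square is again a square, which is the grid behaviour described above for $\mathbb{Z}[i]$.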

Hence $|\rho|<|\beta|$, as required.
The units in $\mathbb{Z}[\sqrt{-2}]$, as in $\mathbb{Z}$, are just $\pm 1$. We prove this using the norm $a^{2}+2 b^{2}$ of $a+b \sqrt{-2}$. Units are elements of norm 1, and $a^{2}+2 b^{2}=1$ only if $b=0$ and $a=\pm 1$.

Now suppose we have a factorization of a cube into relatively prime factors $s$ and $t$ in $\mathbb{Z}[\sqrt{-2}]$:
$$y^{3}=s t$$
Since $s$ and $t$ have no prime factor in common, the cubed prime factors of $y^{3}$ must separate into cubes inside $s$ and cubes inside $t$. There could also be unit factors in $s$ and $t$, but these can only be 1 or $-1$, both of which are cubes. Hence the relatively prime factors of a cube are themselves cubes.
This fills another gap in Euler’s solution of $y^{3}=x^{2}+2$. The only gap that now remains is to show that $\operatorname{gcd}(x-\sqrt{-2}, x+\sqrt{-2})=1$.

## Complex matrices and $\mathbb{H}$

For each pair $\alpha, \beta \in \mathbb{C}$ we consider the matrix
$$\left(\begin{array}{rr} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{array}\right)$$
which we call a quaternion.

The set of quaternions is called $\mathbb{H}$, after Hamilton, who discovered them in 1843 (the matrix definition, however, is due to Cayley (1858)).
It is easy to check that the sum and difference of quaternions are again quaternions. So, too, is the product because
$$\left(\begin{array}{rr} \alpha_{1} & \beta_{1} \\ -\overline{\beta_{1}} & \overline{\alpha_{1}} \end{array}\right)\left(\begin{array}{rr} \alpha_{2} & \beta_{2} \\ -\overline{\beta_{2}} & \overline{\alpha_{2}} \end{array}\right)=\left(\begin{array}{rr} \alpha_{3} & \beta_{3} \\ -\overline{\beta_{3}} & \overline{\alpha_{3}} \end{array}\right)$$
where
$$\alpha_{3}=\alpha_{1} \alpha_{2}-\beta_{1} \overline{\beta_{2}}, \quad \beta_{3}=\alpha_{1} \beta_{2}+\beta_{1} \overline{\alpha_{2}}$$
This can be verified by matrix multiplication and complex conjugation.
The norm of a quaternion $q$ is defined to be its determinant, hence if $q=\left(\begin{array}{rr}\alpha & \beta \\ -\bar{\beta} & \bar{\alpha}\end{array}\right)$ then $\operatorname{norm}(q)$ is
$$\operatorname{det}\left(\begin{array}{rr} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{array}\right)=\alpha \bar{\alpha}+\beta \bar{\beta}=|\alpha|^{2}+|\beta|^{2}$$
The multiplicative property of determinants now gives a “complex two square identity” similar to the Diophantus two square identity:
$$\left(\left|\alpha_{1}\right|^{2}+\left|\beta_{1}\right|^{2}\right)\left(\left|\alpha_{2}\right|^{2}+\left|\beta_{2}\right|^{2}\right)=\left|\alpha_{1} \alpha_{2}-\beta_{1} \overline{\beta_{2}}\right|^{2}+\left|\alpha_{1} \beta_{2}+\beta_{1} \overline{\alpha_{2}}\right|^{2}$$
This identity was discovered by Gauss around 1820, but he left it unpublished.
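
The closure of quaternions under products and the complex two square identity can both be checked on sample values. In the Python sketch below a quaternion is stored as a pair $(\alpha, \beta)$ of complex numbers; this encoding and the names `as_matrix`, `mat_mul`, `qmul`, and `qnorm` are assumptions of the example, not notation from the text.

```python
def as_matrix(q):
    """2x2 complex matrix of the quaternion given by the pair (alpha, beta)."""
    alpha, beta = q
    return [[alpha, beta], [-beta.conjugate(), alpha.conjugate()]]

def mat_mul(X, Y):
    """Ordinary product of two 2x2 matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def qmul(q1, q2):
    """Product computed from the formulas for alpha_3 and beta_3."""
    (a1, b1), (a2, b2) = q1, q2
    return (a1 * a2 - b1 * b2.conjugate(), a1 * b2 + b1 * a2.conjugate())

def qnorm(q):
    """Determinant of the matrix: |alpha|^2 + |beta|^2."""
    alpha, beta = q
    return (alpha * alpha.conjugate() + beta * beta.conjugate()).real

q1 = (complex(1, 2), complex(3, 4))
q2 = (complex(2, -1), complex(1, 1))

# The matrix product is again a matrix of quaternion form ...
assert mat_mul(as_matrix(q1), as_matrix(q2)) == as_matrix(qmul(q1, q2))
# ... and the norm is multiplicative on this example.
assert qnorm(q1) * qnorm(q2) == qnorm(qmul(q1, q2))
```

For these sample values the norms are 30 and 7, and the product has norm 210.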

## The quaternion units

If we write $\alpha=a+d i$ and $\beta=b+c i$, where $a, b, c, d \in \mathbb{R}$, then each quaternion can be viewed as a linear combination of four special matrices $\mathbf{1}, \mathbf{i}, \mathbf{j}, \mathbf{k}$ called quaternion units.
$$\begin{aligned} \left(\begin{array}{rr} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{array}\right) &=\left(\begin{array}{rr} a+d i & b+c i \\ -b+c i & a-d i \end{array}\right) \\ &=a\left(\begin{array}{ll} 1 & 0 \\ 0 & 1 \end{array}\right)+b\left(\begin{array}{rr} 0 & 1 \\ -1 & 0 \end{array}\right)+c\left(\begin{array}{cc} 0 & i \\ i & 0 \end{array}\right)+d\left(\begin{array}{cc} i & 0 \\ 0 & -i \end{array}\right) \\ &=a \mathbf{1}+b \mathbf{i}+c \mathbf{j}+d \mathbf{k} \end{aligned}$$
The matrices $\mathbf{1}, \mathbf{i}, \mathbf{j}, \mathbf{k}$ are quaternions of norm 1 that satisfy the following easily verified relations:
$$\begin{aligned} &\mathbf{i}^{2}=\mathbf{j}^{2}=\mathbf{k}^{2}=-\mathbf{1} \\ &\mathbf{i j}=\mathbf{k}=-\mathbf{j i} \\ &\mathbf{j k}=\mathbf{i}=-\mathbf{k j} \\ &\mathbf{k i}=\mathbf{j}=-\mathbf{i k} \end{aligned}$$
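
These relations can be confirmed mechanically by multiplying the four matrices. The Python sketch below does so with nested tuples and Python's built-in complex numbers; the names `ONE`, `I`, `J`, `K`, `mul`, and `neg` are illustrative.

```python
# The four quaternion units as 2x2 complex matrices (1j is Python's sqrt(-1)).
ONE = ((1, 0), (0, 1))
I = ((0, 1), (-1, 0))
J = ((0, 1j), (1j, 0))
K = ((1j, 0), (0, -1j))

def mul(X, Y):
    """Ordinary product of two 2x2 matrices, returned as nested tuples."""
    return tuple(
        tuple(sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def neg(X):
    """Entrywise negation."""
    return tuple(tuple(-x for x in row) for row in X)

assert mul(I, I) == mul(J, J) == mul(K, K) == neg(ONE)
assert mul(I, J) == K == neg(mul(J, I))
assert mul(J, K) == I == neg(mul(K, J))
assert mul(K, I) == J == neg(mul(I, K))
```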

Thus the product of quaternions is noncommutative; in general,
$$q_{1} q_{2} \neq q_{2} q_{1}$$
Apart from this, however, the quaternions have the same basic properties as numbers. They form an abelian group under addition, the nonzero quaternions form a group under multiplication, and we also have
$$\begin{aligned} &q_{1}\left(q_{2}+q_{3}\right)=q_{1} q_{2}+q_{1} q_{3} \\ &\left(q_{2}+q_{3}\right) q_{1}=q_{2} q_{1}+q_{3} q_{1} \end{aligned}$$
(left and right distributive laws).

## The four square identity

If $q=a \mathbf{1}+b \mathbf{i}+c \mathbf{j}+d \mathbf{k}$ then $\operatorname{norm}(q)$ is
$$\operatorname{det}\left(\begin{array}{rr} a+d i & b+c i \\ -b+c i & a-d i \end{array}\right)=a^{2}+b^{2}+c^{2}+d^{2}$$
Since $\operatorname{det}\left(q_{1}\right) \operatorname{det}\left(q_{2}\right)=\operatorname{det}\left(q_{1} q_{2}\right)$, we can also write the “complex two square identity” as a real four square identity, which turns out to be
$$\begin{aligned} \left(a_{1}^{2}+b_{1}^{2}+c_{1}^{2}+d_{1}^{2}\right)\left(a_{2}^{2}+b_{2}^{2}+c_{2}^{2}+d_{2}^{2}\right)=&\left(a_{1} a_{2}-b_{1} b_{2}-c_{1} c_{2}-d_{1} d_{2}\right)^{2} \\ &+\left(a_{1} b_{2}+b_{1} a_{2}+c_{1} d_{2}-d_{1} c_{2}\right)^{2} \\ &+\left(a_{1} c_{2}-b_{1} d_{2}+c_{1} a_{2}+d_{1} b_{2}\right)^{2} \\ &+\left(a_{1} d_{2}+b_{1} c_{2}-c_{1} b_{2}+d_{1} a_{2}\right)^{2} \end{aligned}$$
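
The four bracketed expressions are the coefficients of $\mathbf{1}, \mathbf{i}, \mathbf{j}, \mathbf{k}$ in the product $q_{1} q_{2}$, so the identity can be checked by multiplying quaternions in coordinates. The Python sketch below stores $a \mathbf{1}+b \mathbf{i}+c \mathbf{j}+d \mathbf{k}$ as the tuple `(a, b, c, d)`; that encoding and the names `quat_mul` and `norm` are assumptions of this example.

```python
def quat_mul(q1, q2):
    """Coordinates of q1*q2, read off from the four square identity."""
    a1, b1, c1, d1 = q1
    a2, b2, c2, d2 = q2
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def norm(q):
    """a^2 + b^2 + c^2 + d^2."""
    return sum(x * x for x in q)

# 3 = 1 + 1 + 1 + 0 and 7 = 1 + 1 + 1 + 4, so 21 is a sum of four squares.
q1, q2 = (1, 1, 1, 0), (1, 1, 1, 2)
assert norm(q1) * norm(q2) == norm(quat_mul(q1, q2)) == 21

# The same formulas reproduce ij = k and ji = -k.
i, j = (0, 1, 0, 0), (0, 0, 1, 0)
assert quat_mul(i, j) == (0, 0, 0, 1)
assert quat_mul(j, i) == (0, 0, 0, -1)
```

On this example the identity gives $21=(-1)^{2}+4^{2}+0^{2}+2^{2}$.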
Remarkably, the four square identity was discovered by Euler in 1748, nearly 100 years before the discovery of quaternions. Euler hoped to use it to prove that every natural number is the sum of four squares, by proving also that every prime is the sum of four squares. The four square theorem was first proved by Lagrange in 1770. We can now give a simpler proof with the help of quaternions. This will be done in the next few sections.