What is Spin?

Motivating the property of spin as it relates to quantum mechanics

Introduction

For me, Griffiths' Quantum Mechanics was the textbook used in my introductory Quantum Mechanics course. Spin and angular momentum are briefly discussed in the textbook, and at the very end of section 4.4.3 the following statement is made:

In a mathematical sense this is all applied group theory —what we are talking about is the decomposition of the direct product of two irreducible representations of the rotation group into a direct sum of irreducible representations (you can quote that, to impress your friends).

If none of what you've just read makes any sense at all, then this might be the perfect introduction to understanding spin in Quantum Mechanics. The whole purpose of this post is to (hopefully) clear up some of the confusion about spin as it relates to Quantum Mechanics, and give you a mathematical intuition for some of the tools used along the way, all built from the ground up.

This topic will be broken up into two separate posts. The first will motivate the idea of spin and introduce some basic quantum mechanical tools. The second will discuss how we handle the addition of spin and the mathematical formalism used to describe it.

Note: I am assuming some baseline knowledge of electromagnetism and classical mechanics.

Spin as a Classical Theory

Classical Angular Momentum

To describe spin, we must first cover angular momentum. In classical mechanics, this is a measure of the rotational motion of an object, together with its tendency to continue rotating, for a mass moving about a central axis of rotation.

\[ \tag{1} \vec{L} = \vec{r} \times{} \vec{p} \]

Here, \(\vec{r}\) is the position of the object relative to the axis of rotation, and \( \vec{p}\) is the linear momentum of the object as it moves around that axis. The direction of \(\vec{L}\) follows from the cross product, which is determined by the right-hand rule.

Figure 1. Procedure for determining the direction of a cross product. Point your right hand in the direction of the first vector and curl your fingers toward the second vector; your thumb then points in the direction of the cross product.
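As a quick sanity check on equation (1) and the right-hand rule, here is a minimal Python sketch. The function names and the numbers (a 2 kg mass at x = 3 m moving in +y at 4 m/s) are made up purely for illustration.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples (x, y, z)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def angular_momentum(r, v, mass):
    """L = r x p with p = m v (equation 1)."""
    p = tuple(mass * vi for vi in v)
    return cross(r, p)

# Hypothetical example values, not taken from the post.
L = angular_momentum((3.0, 0.0, 0.0), (0.0, 4.0, 0.0), 2.0)
print(L)  # (0.0, 0.0, 24.0): along +z, as the right-hand rule predicts
```

Pointing your fingers along +x and curling them toward +y leaves your thumb along +z, which matches the only non-zero component.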

Classical Spin

Spin, on the other hand, is a similar measure with the axis of rotation now passing through the object's center of mass.

\[ \vec{S} = I\vec{\omega{}} \]

Here our moment of inertia \( I \) is the tendency to keep rotating, and the angular velocity \( \vec{\omega{}} \) is the rotational motion. This assumes that we are rotating the object about its axis of symmetry, but since we are mainly dealing with objects with spherical symmetry, this is not a problem.

Generally speaking, we can look at this spin classically and note that computing this quantity is functionally the same as adding up the angular momentum of many small mass components making up the spinning object. We call this a rigid body, and all mass components \(dm\) have the same angular velocity \(\vec{\omega{}}\).

For instance the formula for the angular momentum of a mass of a finite volume about a central axis is:

\[ \tag {2} \vec{L} = \int{}\vec{r}\times{}(\vec{v}dm) \]

For a rigid body, each mass element moves in a circle about the axis. With \(\vec{r}\) measured perpendicular to the axis, the angular velocity and the velocity are related by

\[ \vec{\omega{}} = \frac{\vec{r} \times{} \vec{v}}{r^{2}} \\ \vec{v} = \vec{\omega{}} \times{} \vec{r} \]

Plugging this back into (2) we get:

\[ \vec{L} = \int{}\vec{r}\times{}(\vec{\omega{}} \times{} \vec{r})dm \]

by the BAC-CAB identity we can expand the triple product,

\[ \vec{L} = \int{} \left[ (\vec{r}\cdot{}\vec{r})\vec{\omega{}} - (\vec{r}\cdot{}\vec{\omega})\vec{r} \right]dm \\ \]

given that \(\vec{\omega{}}\) and \(\vec{r}\) are orthogonal (with \(\vec{r}\) measured perpendicular to the axis of rotation), the second term vanishes and

\[ \vec{L} = \left[ \int{} r^{2}dm \right] \vec{\omega{}} \]

at this point we can define this integral to be our moment of inertia \(I\).
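To see the integral in action, the sketch below chops a thin ring into point masses, sums \(\vec{r}\times(\vec{\omega}\times\vec{r})\,dm\) directly, and compares the result with \(I\vec{\omega}\) using \(I = MR^{2}\) for a ring. The ring geometry and all the numbers are assumptions chosen for illustration.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def ring_angular_momentum(M, R, omega_z, n=1000):
    """Sum r x (omega x r) dm over n equal mass elements of a thin ring
    lying in the xy-plane and rotating about the z-axis."""
    dm = M / n
    omega = (0.0, 0.0, omega_z)
    total = [0.0, 0.0, 0.0]
    for k in range(n):
        phi = 2.0 * math.pi * k / n
        r = (R * math.cos(phi), R * math.sin(phi), 0.0)
        v = cross(omega, r)      # v = omega x r for a rigid body
        dL = cross(r, v)         # integrand of equation (2), per unit mass
        for i in range(3):
            total[i] += dL[i] * dm
    return tuple(total)

# Hypothetical ring: M = 2 kg, R = 0.5 m, omega = 3 rad/s about z.
L = ring_angular_momentum(2.0, 0.5, 3.0)
print(L[2], 2.0 * 0.5 ** 2 * 3.0)  # direct sum vs. I * omega
```

Both numbers agree, because every element of the ring sits at the same distance \(R\) from the axis and \(\int r^{2}dm\) collapses to \(MR^{2}\).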

We will come back to this notion of angular momentum and spin when we discuss the quantum mechanical analog to these quantities. For now we will make note of these properties.

Magnetic Dipole Moment

So, how can we measure these rotations on a scale small enough to measure potential quantum mechanical effects? We will begin with the description of a loop of current flowing through wire in a uniform magnetic field \(\vec{B}\). We begin with the Lorentz force law:

\[ \tag{3} \vec{F} = q\vec{E}+q(\vec{v}\times{}\vec{B}) \]

For the sake of simplicity, we are going to assume a steady current with \( \vec{E}_{net}=\vec{0}\), and that we are only concerned with a loop carrying current \(I = nAv_{d}q \), where \(n\) is the number density of charge carriers, \(q\) is the charge of each carrier, \(A\) is the cross-sectional area of the wire, and \( v_{d}\) is the drift velocity of the charge carriers in the wire.

If we consider a differentially small section of the wire \(dl\), the number of charge carriers in that section is

\[ dN = nAdl \]

thus the charge carriers in this section of the wire together experience a force, by equation (3),

\[ d\vec{F} = dNq(\vec{v_d}\times{}\vec{B}) \\ d\vec{F} = nAdlq(\vec{v_d}\times{}\vec{B}) \]

assuming the velocity of the charge carriers is in the same direction as the wire, \(\vec{v_{d}}=v_{d}\hat{l} \), we can move the direction onto the line element \(d\vec{l}\):

\[ d\vec{F} = nAv_{d}q(d\vec{l}\times{}\vec{B}) \\ d\vec{F} = I(d\vec{l}\times{}\vec{B}) \]

this generates a torque, which can be calculated using the definition

\[ \vec{\tau{}} = \oint{}\vec{r}\times{}d\vec{F} \\ \vec{\tau{}} = \oint{}\vec{r}\times{}I(d\vec{l}\times{}\vec{B}) \\ \]

for a uniform field we can rearrange the vector triple product: expanding it with the BAC-CAB identity, the term proportional to \(\oint{}\vec{r}\cdot{}d\vec{l}\) vanishes around a closed loop, and the remaining term can be written as

\[ \vec{\tau{}} = -I\left[ \vec{B} \times{} \frac{1}{2}\oint{}\vec{r}\times{}d\vec{l} \right] \]

where \(\frac{1}{2}\oint{}\vec{r}\times{}d\vec{l}\) is the vector area \(\vec{A}\) enclosed by the loop (not the cross-sectional area of the wire used above), pointing along the loop's normal

\[ \vec{\tau{}} = -I\left[ \vec{B} \times{} \vec{A} \right] \]

swapping \(\vec{A}\) and \(\vec{B}\), which flips the sign of the cross product, we get the following formula,

\[ \vec{\tau{}} = I\vec{A} \times{} \vec{B} \]

from this we define the magnetic dipole moment:

\[ \vec{\mu{}} = I\vec{A} \]
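The torque formula can be checked numerically. The sketch below sums \(\vec{r}\times I(d\vec{l}\times\vec{B})\) around a circular loop and compares it with \(\vec{\mu}\times\vec{B}\), taking \(\vec{\mu}\) to be the current times the enclosed area along the loop normal. The loop radius, current, and field values are assumptions made up for the example.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def loop_torque(current, R, B, n=360):
    """Sum r x (I dl x B) around a circular loop of radius R in the xy-plane,
    with the current flowing counter-clockwise when seen from +z."""
    dphi = 2.0 * math.pi / n
    tau = [0.0, 0.0, 0.0]
    for k in range(n):
        phi = (k + 0.5) * dphi
        r = (R * math.cos(phi), R * math.sin(phi), 0.0)
        dl = (-R * math.sin(phi) * dphi, R * math.cos(phi) * dphi, 0.0)
        dF = tuple(current * c for c in cross(dl, B))   # dF = I (dl x B)
        dtau = cross(r, dF)
        for i in range(3):
            tau[i] += dtau[i]
    return tuple(tau)

def dipole_torque(current, R, B):
    """mu x B with mu = I * (pi R^2) along the loop normal (+z)."""
    mu = (0.0, 0.0, current * math.pi * R ** 2)
    return cross(mu, B)

# Hypothetical values: I = 2 A, R = 0.1 m, B = (0.3, 0, 0.4) T.
B = (0.3, 0.0, 0.4)
print(loop_torque(2.0, 0.1, B))
print(dipole_torque(2.0, 0.1, B))
```

The two printed vectors agree to rounding error. Only the field component perpendicular to the loop normal contributes, which is why the component of \(\vec{B}\) along z drops out.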

Let's first review what we have done. We have an amperian loop of current \(I\) that, for all purposes, can be in any orientation. By Ampère's law (or, equivalently, the Biot-Savart law), this loop generates its own magnetic field, which threads the loop along its normal. When the loop is subjected to an external magnetic field, it experiences a torque that reorients it so that its dipole moment is parallel to the external field.

In essence, this quantity tells us how strongly a magnet will align with an external magnetic field.

Figure 2. Comparison of amperian loop behavior in an external magnetic field. Left: Precession + Alignment - The loop's magnetic moment precesses around the external field direction (z-axis) while gradually aligning with it. Right: Alignment Only - The loop aligns directly with the field without precession. The magnetic field is depicted in blue, the amperian loop in red, and the loop's normal vector (magnetic moment direction) in green.

The torque we derived tells us how the dipole wants to rotate, but we can also understand this from an energy perspective. To rotate the dipole from one orientation to another, we must do work against the magnetic field. The work done by an external agent to rotate the dipole through an angle \( d\theta{} \) is:

\[ dW = \tau{}d\theta{} \]

Using our expression for torque \( \vec{\tau{}} = \vec{\mu{}} \times{} \vec{B} \), and noting that the magnitude of the torque is \( \tau{} = |\vec{\mu{}} \times{} \vec{B}| = \mu{}B\sin\theta{} \), where \( \theta{} \) is the angle between \( \vec{\mu{}} \) and \( \vec{B} \), we have:

\[ dW = \mu{}B\sin\theta{}d\theta{} \]

The work done to rotate the dipole from an initial angle \( \theta_{i} \) to a final angle \( \theta_{f} \) is:

\[ W = \int_{\theta_{i}}^{\theta_{f}} \mu{}B\sin\theta{}d\theta{} = \mu{}B(-\cos\theta{})\Big|_{\theta_{i}}^{\theta_{f}} = \mu{}B(\cos\theta_{i} - \cos\theta_{f}) \]

This work is stored as potential energy. If we define the potential energy to be zero when the dipole is perpendicular to the field (\( \theta{} = \pi{}/2 \)), then the potential energy at an angle \( \theta{} \) is:

\[ U(\theta{}) = \mu{}B(\cos(\pi{}/2) - \cos\theta{}) = -\mu{}B\cos\theta{} \]

More generally, using vector notation, the potential energy of a magnetic dipole in an external magnetic field is:

\[ \tag{4a} U = -\vec{\mu{}} \cdot{} \vec{B} \]

This expression shows that the energy is minimized when \( \vec{\mu{}} \) and \( \vec{B} \) are parallel (aligned), and maximized when they are antiparallel. This is consistent with our earlier observation that the torque tends to align the dipole with the field.

The energy difference between the aligned and anti-aligned states is:

\[ \Delta{}U = U(\theta{} = \pi{}) - U(\theta{} = 0) = \mu{}B - (-\mu{}B) = 2\mu{}B \]
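As a rough numerical cross-check of the work integral and the energy gap, the sketch below integrates \(dW = \mu B\sin\theta\,d\theta\) with a midpoint rule and compares it with the closed-form potential energy. The values of \(\mu\) and \(B\) are arbitrary placeholders.

```python
import math

def work_to_rotate(mu, B, theta_i, theta_f, n=10000):
    """Midpoint-rule integral of dW = mu * B * sin(theta) dtheta."""
    h = (theta_f - theta_i) / n
    return sum(mu * B * math.sin(theta_i + (k + 0.5) * h) for k in range(n)) * h

def potential_energy(mu, B, theta):
    """U = -mu . B = -mu * B * cos(theta) (equation 4a)."""
    return -mu * B * math.cos(theta)

# Hypothetical values: mu = 1.5 (A m^2), B = 0.2 (T).
mu, B = 1.5, 0.2
W = work_to_rotate(mu, B, 0.0, math.pi)
dU = potential_energy(mu, B, math.pi) - potential_energy(mu, B, 0.0)
print(W, dU, 2 * mu * B)  # all three should agree
```

Flipping the dipole from aligned to anti-aligned costs \(2\mu B\) regardless of the path taken in \(\theta\), which is what makes a potential energy description possible.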

This energy shift will be crucial when we discuss the quantum mechanical treatment, where the orientation of the dipole becomes quantized and can only take discrete values.

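The magnetic dipole moment is also directly related to angular momentum, and this is how we might measure spin.

If we ignore the orientation for a moment and consider a single charge \(q\) moving at speed \(v\) around a circle of radius \(r\), it passes any point on the circle once per period \(T = 2\pi{}r/v\), so it carries a current \(I = q/T = qv/2\pi{}r\) around an area \(A = \pi{}r^{2}\):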

\[ \mu{} = IA \] \[ \mu{} = \left( \frac{qv}{2\pi{}r} \right)(\pi{}r^{2}) \] \[ \mu{} = \frac{qvr}{2} \]

given the definition of the angular momentum \(L = mvr\) from equation (1), we obtain the identity

\[ \mu{} = \frac{q}{2m}L \]

This identity is still commonly used today to relate the magnetic moment and angular momentum of charged particles, although in general it picks up a dimensionless constant \(g\)

\[ \tag{4} \vec{\mu{}} = g\frac{q}{2m}\vec{L} \]

which is known as the g-factor. For the amperian loops used in this example, this factor is \( g=1\). When we discuss this same idea in quantum mechanics, we'll find that for many systems this factor relates the magnetic moment and angular momentum of an atom, nucleus, or single particle.
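The equivalence of \(\mu = IA\) and \(\mu = \frac{q}{2m}L\) is easy to confirm with numbers. In the sketch below the charge, mass, speed, and radius are rough, made-up values on an atomic scale, and the function names are hypothetical.

```python
import math

def magnetic_moment_from_current(q, v, r):
    """mu = I * A for a charge q circling at speed v on a circle of radius r."""
    current = q * v / (2.0 * math.pi * r)
    area = math.pi * r ** 2
    return current * area

def magnetic_moment_from_L(q, mass, L, g=1.0):
    """mu = g * q / (2 m) * L (equation 4)."""
    return g * q / (2.0 * mass) * L

# Hypothetical values, loosely electron-like.
q, mass, v, r = 1.6e-19, 9.1e-31, 2.0e6, 5.0e-11
L = mass * v * r
print(magnetic_moment_from_current(q, v, r))
print(magnetic_moment_from_L(q, mass, L))  # same value when g = 1
```

Setting `g` to anything other than 1 rescales the second result, which is exactly the role the g-factor plays in equation (4).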

Spin as a Quantum Theory

Now that we have the mathematical tools to relate a small particle's angular momentum to something measurable, we need to understand how the first measurements of this angular momentum were actually taken. This brings us to the Zeeman effect. At the time of its discovery, it was not known that angular momentum was the cause of this phenomenon, but the formalism brought forth with quantum mechanics would correctly explain many of these results.

The Zeeman Effect

The Zeeman effect is named for an experiment first successfully conducted all the way back in 1896, in which asbestos is soaked in a table salt solution and held in an open flame. As the sodium in the salt solution is heated, it emits photons of specific wavelengths that, when passed through a diffraction grating, create distinct lines.

These lines were known as the spectral lines of the sodium atom. Each element has its own characteristic spectral lines, which had already been determined decades before.

Figure 3. Spectral lines for the sodium atom, transparency of the spectral lines is relative to their intensity. Data retrieved from NIST.

However, what was interesting about this particular experiment was the inclusion of two strong magnets on either side of the sample. When a strong enough magnetic field was applied, these lines would spread apart and eventually split into their own distinct lines.

Figure 4. The Anomalous Zeeman Effect for the bright yellow spectral lines of sodium. These lines are commonly referred to as the D2 and D1 spectral lines. What made this splitting 'anomalous' was its irregular pattern. In simpler examples, the lines predictably split into 3. Side note: this splitting was not initially observed for sodium in 1896; the field simply increased the line width due to experimental limitations. The splitting of lines would eventually be measured by Zeeman in 1897 with the blue cadmium spectral lines.

Discoveries made during this time would often take about 30 years to even begin to describe in a way that correctly predicted the results. It wasn't until the Bohr model of the atom that a working theory was constructed to describe this behavior.

The Schrödinger Equation

While the historical background of the mathematics involved is interesting, I will skip these developments for the sake of brevity and jump directly to the first model of Quantum Mechanics that we still use today: the Schrödinger equation.

We will first discuss the Time Independent Schrödinger equation (TISE). We can view this equation as our \( \vec{F}=m\vec{a} \). Its derivation can be pretty confusing ('Why this particular equation?'), but there are other resources online that do a great job of justifying this equation, which stands as the basis for many of the observables we can measure in Quantum Mechanics.

For those curious, I will link this video here, which I've found to be a pretty interesting account of how the Schrödinger equation came to be developed. For now we will take this equation as a postulate and move on.

The Time Independent Schrödinger equation (TISE) is written as:

\[ -\frac{\hbar{}^{2}}{2m}\nabla{}^{2}\psi{} + V\psi{} = E \psi{} \]

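Let us break down each component of this equation: \( \hbar{} \) is the reduced Planck constant, \( m \) is the mass of the particle, \( \nabla{}^{2} \) is the Laplacian (so the first term represents the kinetic energy), \( V \) is the potential energy, \( \psi{} \) is the wave function, and \( E \) is the total energy.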

The TISE is an eigenvalue equation: when we operate on the wave function \( \psi{} \) with the Hamiltonian operator \( \hat{H} = -\frac{\hbar{}^{2}}{2m}\nabla{}^{2} + V \), we get back the same function multiplied by the energy eigenvalue \( E \). This means \( \psi{} \) is an eigenfunction of the Hamiltonian, and \( E \) is the corresponding eigenvalue. For bound states, only certain discrete values of \( E \) are allowed, which explains why atomic energy levels are quantized.

Given this, the TISE is often better written in its simplified form.

\[ \hat{H}\psi{} = E \psi{} \]

Here is really the essence of what 90% of Quantum Mechanics is about: these wave functions are treated as vectors, and the operators are treated as matrices acting on these vectors. Even though the wave functions are functions, we can treat them this way. Given all of this, most problems in quantum mechanics boil down to trying to find ways of solving these eigenvalue problems for different operators, and as we will see in the next section, even for the simplest of cases this can be difficult.
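To make "operator acting on a function gives back the same function times a number" concrete, here is a small sketch using an example that is not part of this post: a free particle in a one-dimensional box of width \(a\), with states \(\sin(n\pi x/a)\), \(V = 0\), and units where \(\hbar = m = 1\). The second derivative is approximated with central differences, so the check is only approximate.

```python
import math

def apply_hamiltonian(psi, x, h=1e-4):
    """(H psi)(x) for H = -(1/2) d^2/dx^2, i.e. hbar = m = 1 and V = 0,
    using a central finite difference for the second derivative."""
    second = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / h ** 2
    return -0.5 * second

def box_state(n, a=1.0):
    """Assumed example state sin(n pi x / a) of a particle in a box."""
    return lambda x: math.sin(n * math.pi * x / a)

psi = box_state(2)
# If psi is an eigenfunction, (H psi)(x) / psi(x) is the same number at every x.
ratios = [apply_hamiltonian(psi, x) / psi(x) for x in (0.1, 0.2, 0.3)]
expected = 0.5 * (2 * math.pi) ** 2   # E = (n pi / a)^2 / 2 for n = 2, a = 1
print(ratios, expected)
```

The ratio comes out the same at every sample point, which is the numerical signature of an eigenfunction. Trying a function that is not an eigenfunction, such as a sum of two different box states, makes the ratio vary with \(x\).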

Now, starting with the TISE

\[ -\frac{\hbar{}^{2}}{2m}\nabla{}^{2}\psi{} + V\psi{} = E \psi{} \]

the Laplacian in spherical coordinates acts on \( \psi{} \) as

\[ \nabla{}^{2}\psi{} = \frac{1}{r^2}\frac{\partial{}}{\partial{}r}\left( r^{2}\frac{\partial{}\psi{}}{\partial{}r} \right) + \frac{1}{r^2\sin\theta{}}\frac{\partial{}}{\partial{}\theta} \left( \sin\theta{}\frac{\partial{}\psi{}}{\partial{\theta{}}} \right) + \frac{1}{r^2\sin^{2}\theta{}} \frac{\partial{}^{2}\psi{}}{\partial{}\phi{}^{2}} \]

When we plug in this definition into the TISE

\[ -\frac{\hbar{}^{2}}{2m}\left[ \frac{1}{r^2}\frac{\partial{}}{\partial{}r}\left( r^{2}\frac{\partial{}\psi{}}{\partial{}r} \right) + \frac{1}{r^2\sin\theta{}}\frac{\partial{}}{\partial{}\theta} \left( \sin\theta{}\frac{\partial{}\psi{}}{\partial{\theta{}}} \right) + \frac{1}{r^2\sin^{2}\theta{}} \frac{\partial{}^{2}\psi{}}{\partial{}\phi{}^{2}} \right] + V\psi{} = E \psi{} \]

we may use separation of variables to obtain two differential equations, via the relation

\[ \psi{}(r, \theta{}, \phi{}) = R(r)Y(\theta{}, \phi{}) \]

where \( R(r) \) depends only on the radial coordinate and \( Y(\theta{}, \phi{}) \) depends only on the angular coordinates. Substituting this ansatz into the TISE, we obtain:

\[ -\frac{\hbar{}^{2}}{2m}\left[ \frac{1}{r^2}\frac{\partial{}}{\partial{}r}\left( r^{2}\frac{\partial{}}{\partial{}r} \right) + \frac{1}{r^2\sin\theta{}}\frac{\partial{}}{\partial{}\theta} \left( \sin\theta{}\frac{\partial{}}{\partial{\theta{}}} \right) + \frac{1}{r^2\sin^{2}\theta{}} \frac{\partial{}^{2}}{\partial{}\phi{}^{2}} \right] R(r)Y(\theta{}, \phi{}) + V(r)R(r)Y(\theta{}, \phi{}) = E R(r)Y(\theta{}, \phi{}) \]

Expanding the derivatives and dividing through by \( R(r)Y(\theta{}, \phi{}) \):

\[ -\frac{\hbar{}^{2}}{2m}\left[ \frac{1}{R}\frac{1}{r^2}\frac{d}{dr}\left( r^{2}\frac{dR}{dr} \right) + \frac{1}{Y}\frac{1}{r^2\sin\theta{}}\frac{\partial{}}{\partial{}\theta} \left( \sin\theta{}\frac{\partial{}Y}{\partial{\theta{}}} \right) + \frac{1}{Y}\frac{1}{r^2\sin^{2}\theta{}} \frac{\partial{}^{2}Y}{\partial{}\phi{}^{2}} \right] + V(r) = E \]

Multiplying through by \( -2mr^2/\hbar{}^{2} \) and rearranging:

\[ \frac{1}{R}\frac{d}{dr}\left( r^{2}\frac{dR}{dr} \right) - \frac{2mr^2}{\hbar{}^{2}}(V(r) - E) + \frac{1}{Y\sin\theta{}}\frac{\partial{}}{\partial{}\theta} \left( \sin\theta{}\frac{\partial{}Y}{\partial{\theta{}}} \right) + \frac{1}{Y\sin^{2}\theta{}} \frac{\partial{}^{2}Y}{\partial{}\phi{}^{2}} = 0 \]

The first two terms depend only on \( r \), while the last two terms depend only on \( \theta{} \) and \( \phi{} \). For this equation to hold for all values of \( r \), \( \theta{} \), and \( \phi{} \), each group must equal a constant, and the two constants must sum to zero. Let us call the radial constant \( \lambda \), so that the angular group equals \( -\lambda \). This gives us two equations:

\[ \frac{1}{R}\frac{d}{dr}\left( r^{2}\frac{dR}{dr} \right) - \frac{2mr^2}{\hbar{}^{2}}(V(r) - E) = \lambda \]

and

\[ \frac{1}{\sin\theta{}}\frac{\partial{}}{\partial{}\theta} \left( \sin\theta{}\frac{\partial{}Y}{\partial{\theta{}}} \right) + \frac{1}{\sin^{2}\theta{}} \frac{\partial{}^{2}Y}{\partial{}\phi{}^{2}} = -\lambda Y \tag{5} \]

Equation (5) is the angular equation. We will now solve this equation directly using separation of variables, without reference to angular momentum operators. Let us write \( Y(\theta{}, \phi{}) = \Theta{}(\theta{})\Phi{}(\phi{}) \). Substituting into equation (5):

\[ \frac{1}{\sin\theta{}}\frac{d}{d\theta{}} \left( \sin\theta{}\frac{d\Theta{}}{d\theta{}} \right)\Phi{} + \frac{1}{\sin^{2}\theta{}} \Theta{}\frac{d^{2}\Phi{}}{d\phi{}^{2}} = -\lambda \Theta{}\Phi{} \]

Dividing through by \( \Theta{}\Phi{} \) and multiplying by \( \sin^{2}\theta{} \):

\[ \frac{\sin\theta{}}{\Theta{}}\frac{d}{d\theta{}} \left( \sin\theta{}\frac{d\Theta{}}{d\theta{}} \right) + \lambda\sin^{2}\theta{} + \frac{1}{\Phi{}}\frac{d^{2}\Phi{}}{d\phi{}^{2}} = 0 \]

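The first two terms depend only on \( \theta{} \), while the third term depends only on \( \phi{} \). For this equation to hold, the \( \phi{} \) term must equal a constant and the \( \theta{} \) terms must equal its negative. Let us call the \( \phi{} \) constant \( -m^{2} \), where \( m \) is a constant to be determined (not to be confused with the mass \( m \) in the TISE). This gives us: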

\[ \tag{6} \frac{d^{2}\Phi{}}{d\phi{}^{2}} = -m^{2}\Phi{} \]

and

\[ \tag{7} \frac{1}{\sin\theta{}}\frac{d}{d\theta{}} \left( \sin\theta{}\frac{d\Theta{}}{d\theta{}} \right) - \frac{m^{2}}{\sin^{2}\theta{}}\Theta{} + \lambda\Theta{} = 0 \]

Let us first solve equation (6). This is a second-order linear homogeneous differential equation with constant coefficients. The characteristic equation is \( r^{2} = -m^{2} \), so \( r = \pm{} im \). The general solution is:

\[ \Phi{}(\phi{}) = Ae^{im\phi{}} + Be^{-im\phi{}} \]

For the wave function to be single-valued (i.e., \( \Phi{}(\phi{} + 2\pi{}) = \Phi{}(\phi{}) \)), we require that \( e^{im(\phi{} + 2\pi{})} = e^{im\phi{}} \), which implies \( e^{2\pi{}im} = 1 \). This is satisfied if and only if \( m \) is an integer: \( m = 0, \pm{}1, \pm{}2, \ldots \).

Without loss of generality, we can take \( B = 0 \) (the general solution can be written using just one exponential since \( m \) can be negative). Normalizing over \( 0 \leq{} \phi{} \leq{} 2\pi{} \), we find:

\[ \int_{0}^{2\pi{}} |\Phi{}(\phi{})|^{2} d\phi{} = |A|^{2} \int_{0}^{2\pi{}} d\phi{} = 2\pi{}|A|^{2} = 1 \]

Therefore, \( A = 1/\sqrt{2\pi{}} \) (choosing the phase to be zero), and we obtain:

\[ \Phi{}_{m}(\phi{}) = \frac{1}{\sqrt{2\pi{}}}e^{im\phi{}} \]

where \( m = 0, \pm{}1, \pm{}2, \ldots \).
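The single-valuedness and normalization conditions can be checked numerically. This is a minimal sketch using only Python's standard library; the function names and the sample angle are arbitrary choices for the example.

```python
import cmath
import math

def phi_m(m, phi):
    """Normalized azimuthal solution exp(i m phi) / sqrt(2 pi)."""
    return cmath.exp(1j * m * phi) / math.sqrt(2.0 * math.pi)

def is_single_valued(m, phi=0.7, tol=1e-12):
    """True when Phi(phi + 2 pi) equals Phi(phi) to within tol."""
    return abs(phi_m(m, phi + 2.0 * math.pi) - phi_m(m, phi)) < tol

def overlap(m1, m2, n=256):
    """Riemann-sum approximation of the integral of conj(Phi_m1) * Phi_m2
    over one full turn, 0 <= phi < 2 pi."""
    dphi = 2.0 * math.pi / n
    return sum(phi_m(m1, k * dphi).conjugate() * phi_m(m2, k * dphi)
               for k in range(n)) * dphi

print(is_single_valued(2), is_single_valued(0.5))  # integer m passes, m = 1/2 fails
print(abs(overlap(1, 1)), abs(overlap(1, 2)))      # close to 1 and close to 0
```

The half-integer case fails the single-valuedness test because the function changes sign after one full turn, which is why only integer \( m \) survives here.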

Now we turn to equation (7), which is more involved. Making the substitution \( x = \cos\theta{} \), we have: \( \sin^{2}\theta{} = 1 - x^{2} \), \( \sin\theta{} = \sqrt{1 - x^{2}} \) (for \( 0 \leq{} \theta{} \leq{} \pi{} \), so \( \sin\theta{} \geq{} 0 \)), and \( d/d\theta{} = -\sin\theta{}(d/dx) = -\sqrt{1 - x^{2}}(d/dx) \).

Let \( \Theta{}(\theta{}) = P(x) \), where \( x = \cos\theta{} \). Then:

\[ \frac{d\Theta{}}{d\theta{}} = \frac{dP}{dx}\frac{dx}{d\theta{}} = \frac{dP}{dx}(-\sin\theta{}) = -\sqrt{1-x^{2}}\frac{dP}{dx} \]

and

\[ \frac{d}{d\theta{}} \left( \sin\theta{}\frac{d\Theta{}}{d\theta{}} \right) = \frac{d}{d\theta{}} \left( \sin\theta{}(-\sqrt{1-x^{2}}\frac{dP}{dx}) \right) = \frac{d}{d\theta{}} \left( -(1-x^{2})\frac{dP}{dx} \right) \]

Using the chain rule:

\[ \frac{d}{d\theta{}} = \frac{dx}{d\theta{}}\frac{d}{dx} = -\sin\theta{}\frac{d}{dx} = -\sqrt{1-x^{2}}\frac{d}{dx} \]

Therefore:

\[ \frac{d}{d\theta{}} \left( \sin\theta{}\frac{d\Theta{}}{d\theta{}} \right) = -\sqrt{1-x^{2}}\frac{d}{dx} \left( -(1-x^{2})\frac{dP}{dx} \right) = \sqrt{1-x^{2}}\frac{d}{dx} \left( (1-x^{2})\frac{dP}{dx} \right) \]

Substituting into equation (7), where the prefactor \( 1/\sin\theta{} = 1/\sqrt{1-x^{2}} \) is already present:

\[ \frac{1}{\sqrt{1-x^{2}}} \sqrt{1-x^{2}} \frac{d}{dx} \left( (1-x^{2})\frac{dP}{dx} \right) - \frac{m^{2}}{1-x^{2}}P + \lambda P = 0 \]

Simplifying:

\[ \frac{d}{dx} \left( (1-x^{2})\frac{dP}{dx} \right) + \left( \lambda - \frac{m^{2}}{1-x^{2}} \right)P = 0 \]

Expanding the derivative:

\[ (1-x^{2})\frac{d^{2}P}{dx^{2}} - 2x\frac{dP}{dx} + \left( \lambda - \frac{m^{2}}{1-x^{2}} \right)P = 0 \tag{8} \]

This is the associated Legendre equation. For the case \( m = 0 \), this reduces to Legendre's equation:

\[ (1-x^{2})\frac{d^{2}P}{dx^{2}} - 2x\frac{dP}{dx} + \lambda P = 0 \tag{9} \]

This equation has regular singular points at \( x = \pm{}1 \). Using the method of Frobenius, we seek a power series solution of the form \( P(x) = \sum_{n=0}^{\infty{}} a_{n}x^{n} \). Substituting into equation (9) and requiring that the series converges for all \( x \in{} [-1, 1] \), we find that \( \lambda = l(l+1) \), where \( l \) is a non-negative integer: \( l = 0, 1, 2, \ldots \).
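To make that last step a little more explicit, here is a sketch of the standard recurrence argument, written in the notation of equation (9). Substituting the power series and collecting the coefficient of each power \( x^{n} \) gives

\[ \sum_{n=0}^{\infty{}} \left[ (n+2)(n+1)a_{n+2} - n(n-1)a_{n} - 2na_{n} + \lambda a_{n} \right] x^{n} = 0 \]

so every bracket must vanish on its own:

\[ a_{n+2} = \frac{n(n+1) - \lambda}{(n+1)(n+2)}a_{n} \]

The even and odd coefficients form two independent chains, started by \( a_{0} \) and \( a_{1} \). If \( \lambda = l(l+1) \), the numerator vanishes at \( n = l \), so the chain with the same parity as \( l \) stops and leaves a polynomial of degree \( l \); the other chain is removed by setting its starting coefficient to zero. For any other value of \( \lambda \), neither chain terminates and the series diverges at \( x = \pm{}1 \).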

The solutions are the Legendre polynomials \( P_{l}(x) \), which can be generated using Rodrigues' formula:

\[ P_{l}(x) = \frac{1}{2^{l}l!}\frac{d^{l}}{dx^{l}}(x^{2}-1)^{l} \]

For example:

\[ P_{0}(x) = 1, \quad P_{1}(x) = x, \quad P_{2}(x) = \frac{1}{2}(3x^{2}-1), \quad P_{3}(x) = \frac{1}{2}(5x^{3}-3x), \ldots \]
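Rodrigues' formula can be carried out mechanically on lists of polynomial coefficients. The sketch below does this with exact fractions; the helper names are made up for the example and coefficients are stored lowest power first.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest power first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_diff(a):
    """Differentiate a polynomial given as a coefficient list."""
    return [k * a[k] for k in range(1, len(a))] or [Fraction(0)]

def legendre(l):
    """Coefficients of P_l(x) from Rodrigues' formula."""
    p = [Fraction(1)]
    for _ in range(l):
        p = poly_mul(p, [Fraction(-1), Fraction(0), Fraction(1)])  # times (x^2 - 1)
    for _ in range(l):
        p = poly_diff(p)                                           # d/dx, l times
    scale = Fraction(1, 2 ** l * factorial(l))
    return [scale * c for c in p]

for l in range(4):
    print(l, [str(c) for c in legendre(l)])
```

For `l = 2` the list reads `-1/2, 0, 3/2`, which is \( \frac{1}{2}(3x^{2}-1) \) with the constant term first, matching the examples above.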

Now, for the case \( m \neq{} 0 \), equation (8) has solutions called the associated Legendre functions. They can be generated from the Legendre polynomials:

\[ P_{l}^{m}(x) = (1-x^{2})^{|m|/2}\frac{d^{|m|}}{dx^{|m|}}P_{l}(x) \]

where \( m \) must satisfy \( |m| \leq{} l \). Note that \( P_{l}^{m}(x) = 0 \) if \( |m| > l \), since the \( |m| \)-th derivative of a polynomial of degree \( l \) vanishes when \( |m| > l \).
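As a spot check, the sketch below takes one associated Legendre function, \( P_{2}^{1}(x) = 3x\sqrt{1-x^{2}} \) (obtained by applying the formula above to \( P_{2} \)), and evaluates the left-hand side of equation (8) with \( \lambda = l(l+1) = 6 \) and \( m = 1 \) using finite differences. The step size and sample points are arbitrary choices, so the residual is only approximately zero.

```python
import math

def P21(x):
    """P_2^1(x) = sqrt(1 - x^2) * d/dx[(3x^2 - 1)/2] = 3x sqrt(1 - x^2)."""
    return 3.0 * x * math.sqrt(1.0 - x * x)

def legendre_residual(P, x, lam, m, h=1e-4):
    """Left-hand side of equation (8), with derivatives from central differences."""
    d1 = (P(x + h) - P(x - h)) / (2.0 * h)
    d2 = (P(x + h) - 2.0 * P(x) + P(x - h)) / h ** 2
    return (1.0 - x * x) * d2 - 2.0 * x * d1 + (lam - m * m / (1.0 - x * x)) * P(x)

for x in (-0.6, 0.1, 0.3, 0.7):
    print(x, legendre_residual(P21, x, lam=6.0, m=1))  # all close to zero
```

Changing `lam` to a value that is not \( l(l+1) \) for this function, say 5, gives a residual that is clearly non-zero.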

Returning to our original variable \( \theta{} \), we have \( \Theta{}_{lm}(\theta{}) = P_{l}^{m}(\cos\theta{}) \). To normalize this function, we need to compute:

\[ \int_{0}^{\pi{}} |\Theta{}_{lm}(\theta{})|^{2}\sin\theta{}d\theta{} = \int_{-1}^{1} |P_{l}^{m}(x)|^{2}dx \]

Using the orthogonality property of associated Legendre functions:

\[ \int_{-1}^{1} P_{l}^{m}(x)P_{l'}^{m}(x)dx = \frac{2}{2l+1}\frac{(l+|m|)!}{(l-|m|)!}\delta_{ll'} \]

The normalized angular solution is therefore:

\[ \Theta{}_{lm}(\theta{}) = \sqrt{\frac{(2l+1)(l-|m|)!}{2(l+|m|)!}}P_{l}^{|m|}(\cos\theta{}) \]

Combining \( \Theta{}_{lm}(\theta{}) \) and \( \Phi{}_{m}(\phi{}) \), we obtain the spherical harmonics:

\[ \tag{10} Y_{lm}(\theta{}, \phi{}) = \sqrt{\frac{(2l+1)(l-|m|)!}{4\pi{}(l+|m|)!}}P_{l}^{|m|}(\cos\theta{})e^{im\phi{}} \]

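where \( l = 0, 1, 2, \ldots \) is the orbital angular momentum quantum number and \( m = -l, -l+1, \ldots, l \) is the magnetic quantum number.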

These functions form a complete orthonormal set on the unit sphere:

\[ \int_{0}^{2\pi{}}\int_{0}^{\pi{}}Y_{lm}^{*}(\theta{}, \phi{})Y_{l'm'}(\theta{}, \phi{})\sin\theta{}d\theta{}d\phi{} = \delta_{ll'}\delta_{mm'} \]
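The orthonormality integral can be approximated numerically. The sketch below writes out equation (10) by hand for \( l \leq{} 1 \) only, and integrates with a simple midpoint rule; the grid sizes are arbitrary, so the results are approximate.

```python
import cmath
import math

def Y(l, m, theta, phi):
    """Spherical harmonics from equation (10), written out for l <= 1."""
    if (l, m) == (0, 0):
        return complex(math.sqrt(1.0 / (4.0 * math.pi)))
    if (l, m) == (1, 0):
        return complex(math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta))
    if l == 1 and abs(m) == 1:
        norm = math.sqrt(3.0 / (8.0 * math.pi))
        return norm * math.sin(theta) * cmath.exp(1j * m * phi)
    raise ValueError("only l <= 1 is written out in this sketch")

def overlap(l1, m1, l2, m2, n_theta=200, n_phi=64):
    """Midpoint-rule estimate of the integral of conj(Y_l1m1) * Y_l2m2 over the sphere."""
    dth = math.pi / n_theta
    dph = 2.0 * math.pi / n_phi
    total = 0j
    for i in range(n_theta):
        th = (i + 0.5) * dth
        for j in range(n_phi):
            ph = (j + 0.5) * dph
            total += Y(l1, m1, th, ph).conjugate() * Y(l2, m2, th, ph) * math.sin(th)
    return total * dth * dph

print(abs(overlap(1, 1, 1, 1)))   # close to 1
print(abs(overlap(1, 1, 1, -1)))  # close to 0
print(abs(overlap(0, 0, 1, 0)))   # close to 0
```

The factor \( \sin\theta{} \) in the integrand is the area element on the sphere, and leaving it out is the most common way for a check like this to fail.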

This completes our derivation of the spherical harmonics directly from the differential equation, using only separation of variables and standard techniques for solving second-order linear differential equations. The quantum numbers \( l \) and \( m \) arise naturally from the requirement that solutions be well-behaved (finite, single-valued, and normalizable) on the sphere.

Note that this solution holds for any central potential \( V(r) \), and thus describes the angular part of the wave function for any spherically symmetric system.

Quantum Angular Momentum

Now that we have derived the spherical harmonics, let us understand their connection to angular momentum. In classical mechanics, angular momentum is defined as \( \vec{L} = \vec{r} \times{} \vec{p} \). To transition to quantum mechanics, we promote the classical quantities to operators using the correspondence principle:

\[ \vec{r} \rightarrow{} \hat{\vec{r}} = \vec{r}, \quad \vec{p} \rightarrow{} \hat{\vec{p}} = -i\hbar{}\vec{\nabla{}} \]

Therefore, the quantum mechanical angular momentum operator is:

\[ \tag{11} \hat{\vec{L}} = \hat{\vec{r}} \times{} \hat{\vec{p}} = -i\hbar{}\vec{r} \times{} \vec{\nabla{}} \]

In Cartesian coordinates, this gives us the three components:

\[ \hat{L}_{x} = -i\hbar{}\left( y\frac{\partial{}}{\partial{}z} - z\frac{\partial{}}{\partial{}y} \right) \] \[ \hat{L}_{y} = -i\hbar{}\left( z\frac{\partial{}}{\partial{}x} - x\frac{\partial{}}{\partial{}z} \right) \] \[ \hat{L}_{z} = -i\hbar{}\left( x\frac{\partial{}}{\partial{}y} - y\frac{\partial{}}{\partial{}x} \right) \]

Converting to spherical coordinates \( (r, \theta{}, \phi{}) \), where:

\[ x = r\sin\theta{}\cos\phi{}, \quad y = r\sin\theta{}\sin\phi{}, \quad z = r\cos\theta{} \]

we can express the angular momentum operators in spherical coordinates. After some algebra, we find:

\[ \hat{L}_{z} = -i\hbar{}\frac{\partial{}}{\partial{}\phi{}} \]

and

\[ \hat{L}^{2} = \hat{L}_{x}^{2} + \hat{L}_{y}^{2} + \hat{L}_{z}^{2} = -\hbar{}^{2}\left[ \frac{1}{\sin\theta{}}\frac{\partial{}}{\partial{}\theta{}}\left( \sin\theta{}\frac{\partial{}}{\partial{}\theta{}} \right) + \frac{1}{\sin^{2}\theta{}}\frac{\partial{}^{2}}{\partial{}\phi{}^{2}} \right] \]

Notice that, up to the factor \( -\hbar{}^{2} \), this is exactly the operator that appeared in our angular equation (5)! This is no coincidence: the spherical harmonics are eigenfunctions of the angular momentum operators.

Let us verify this. Applying \( \hat{L}_{z} \) to \( Y_{lm}(\theta{}, \phi{}) \):

\[ \hat{L}_{z}Y_{lm} = -i\hbar{}\frac{\partial{}}{\partial{}\phi{}}\left[ \sqrt{\frac{(2l+1)(l-|m|)!}{4\pi{}(l+|m|)!}}P_{l}^{|m|}(\cos\theta{})e^{im\phi{}} \right] \] \[ = -i\hbar{}\sqrt{\frac{(2l+1)(l-|m|)!}{4\pi{}(l+|m|)!}}P_{l}^{|m|}(\cos\theta{})\frac{\partial{}}{\partial{}\phi{}}e^{im\phi{}} \] \[ = -i\hbar{}im\sqrt{\frac{(2l+1)(l-|m|)!}{4\pi{}(l+|m|)!}}P_{l}^{|m|}(\cos\theta{})e^{im\phi{}} \] \[ = m\hbar{}Y_{lm} \]

Similarly, applying \( \hat{L}^{2} \) to \( Y_{lm}(\theta{}, \phi{}) \), we find:

\[ \hat{L}^{2}Y_{lm} = l(l+1)\hbar{}^{2}Y_{lm} \]
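A short worked case makes this less abstract. Take \( l = 1 \), \( m = 0 \), for which equation (10) gives \( Y_{10} \propto{} \cos\theta{} \) with no \( \phi{} \) dependence, so only the \( \theta{} \) part of \( \hat{L}^{2} \) contributes:

\[ \frac{1}{\sin\theta{}}\frac{\partial{}}{\partial{}\theta{}}\left( \sin\theta{}\frac{\partial{}}{\partial{}\theta{}}\cos\theta{} \right) = \frac{1}{\sin\theta{}}\frac{\partial{}}{\partial{}\theta{}}\left( -\sin^{2}\theta{} \right) = -2\cos\theta{} \]

Multiplying by \( -\hbar{}^{2} \) gives

\[ \hat{L}^{2}Y_{10} = 2\hbar{}^{2}Y_{10} \]

which is \( l(l+1)\hbar{}^{2} \) with \( l = 1 \), as claimed.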

Therefore, the spherical harmonics are simultaneous eigenfunctions of \( \hat{L}^{2} \) and \( \hat{L}_{z} \):

\[ \tag{12} \hat{L}^{2}Y_{lm} = l(l+1)\hbar{}^{2}Y_{lm} \] \[ \tag{13} \hat{L}_{z}Y_{lm} = m\hbar{}Y_{lm} \]

This explains why we called \( l \) the orbital angular momentum quantum number and \( m \) the magnetic quantum number. The quantum number \( l \) determines the magnitude of the angular momentum, while \( m \) determines its projection along the \( z \)-axis.

Commutation Relations

One of the most important properties of angular momentum operators is their commutation relations. Using the definitions in Cartesian coordinates, we can compute the commutators:

\[ [\hat{L}_{x}, \hat{L}_{y}] = i\hbar{}\hat{L}_{z} \] \[ [\hat{L}_{y}, \hat{L}_{z}] = i\hbar{}\hat{L}_{x} \] \[ [\hat{L}_{z}, \hat{L}_{x}] = i\hbar{}\hat{L}_{y} \]

These can be written more compactly using the Levi-Civita symbol:

\[ [\hat{L}_{i}, \hat{L}_{j}] = i\hbar{}\epsilon_{ijk}\hat{L}_{k} \]

where \( \epsilon_{ijk} \) is the totally antisymmetric tensor (equal to \( +1 \) for even permutations of \( (1,2,3) \), \( -1 \) for odd permutations, and \( 0 \) if any indices are repeated).

Importantly, \( \hat{L}^{2} \) commutes with all components of \( \hat{\vec{L}} \):

\[ [\hat{L}^{2}, \hat{L}_{i}] = 0, \quad i = x, y, z \]

This means that we can simultaneously measure \( \hat{L}^{2} \) and any one component of \( \hat{\vec{L}} \) (typically \( \hat{L}_{z} \)). However, we cannot simultaneously measure different components of \( \hat{\vec{L}} \) with arbitrary precision. This is a manifestation of Heisenberg's uncertainty principle.

Once again, an entire section could be dedicated to this property, but for now I will briefly mention that this limitation on simultaneous measurement comes solely from the properties of quantum mechanics and this operator formalism. Once again, treat it as a law.

Ladder Operators

It is often convenient to work with the ladder (raising and lowering) operators:

\[ \hat{L}_{\pm{}} = \hat{L}_{x} \pm{} i\hat{L}_{y} \]

These operators have the following commutation relations:

\[ [\hat{L}_{z}, \hat{L}_{\pm{}}] = \pm{}\hbar{}\hat{L}_{\pm{}} \] \[ [\hat{L}^{2}, \hat{L}_{\pm{}}] = 0 \]

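With the standard Condon-Shortley phase convention (an extra factor of \( (-1)^{m} \) for \( m > 0 \) in equation (10)), the ladder operators act on the spherical harmonics as follows: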

\[ \hat{L}_{\pm{}}Y_{lm} = \hbar{}\sqrt{l(l+1) - m(m \pm{} 1)}Y_{l, m \pm{} 1} \]

As their name suggests, \( \hat{L}_{+} \) raises the value of \( m \) by 1, while \( \hat{L}_{-} \) lowers it by 1. This provides a convenient way to generate all spherical harmonics for a given \( l \) starting from \( Y_{l, -l} \) or \( Y_{l, l} \).
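Since operators can be treated as matrices, the ladder formula is enough to build explicit matrices for \( \hat{L}_{x} \), \( \hat{L}_{y} \), and \( \hat{L}_{z} \) and to test the commutation relations. The sketch below does this in plain Python, in units where \( \hbar{} = 1 \) and in the basis ordered \( m = l, l-1, \ldots, -l \); the helper names and the basis ordering are choices made for this example.

```python
import math

HBAR = 1.0  # assumption: work in units where hbar = 1

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(A, B, a=1.0, b=1.0):
    """Return a*A + b*B element by element."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def commutator(A, B):
    return combine(matmul(A, B), matmul(B, A), 1.0, -1.0)

def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def angular_momentum_matrices(l):
    """Lx, Ly, Lz for integer l, built from the ladder-operator matrix elements."""
    ms = [l - k for k in range(2 * l + 1)]
    n = len(ms)
    Lz = [[HBAR * ms[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    Lp = [[0.0] * n for _ in range(n)]
    for j, m in enumerate(ms):
        if m < l:
            # L+ takes m to m + 1, which sits one row higher in this ordering.
            Lp[j - 1][j] = HBAR * math.sqrt(l * (l + 1) - m * (m + 1))
    Lm = [[Lp[j][i] for j in range(n)] for i in range(n)]  # L- is the transpose of L+
    Lx = combine(Lp, Lm, 0.5, 0.5)        # Lx = (L+ + L-) / 2
    Ly = combine(Lp, Lm, -0.5j, 0.5j)     # Ly = (L+ - L-) / (2i)
    return Lx, Ly, Lz

Lx, Ly, Lz = angular_momentum_matrices(1)
lhs = commutator(Lx, Ly)
rhs = [[1j * HBAR * z for z in row] for row in Lz]
print(max_abs_diff(lhs, rhs))  # difference between [Lx, Ly] and i*hbar*Lz
```

The printed difference is zero to rounding error. The same matrices can be squared and summed to confirm that \( \hat{L}^{2} \) is \( l(l+1)\hbar{}^{2} \) times the identity.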

Physical Interpretation

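The quantization of angular momentum has profound physical consequences. In classical mechanics, angular momentum can take any continuous value. In quantum mechanics, however, the magnitude is restricted to \( |\vec{L}| = \hbar{}\sqrt{l(l+1)} \) with \( l = 0, 1, 2, \ldots \), and its projection along the \( z \)-axis is restricted to the \( 2l+1 \) values \( L_{z} = m\hbar{} \) with \( m = -l, \ldots, l \).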

This quantization explains many phenomena in atomic physics, including the discrete energy levels of atoms and the splitting of spectral lines in magnetic fields (the Zeeman effect, which we discussed earlier).

The fact that \( |\vec{L}| = \hbar{}\sqrt{l(l+1)} \) rather than \( \hbar{}l \) (which one might naively expect) is a purely quantum mechanical effect. It arises because the angular momentum components do not commute, and we cannot simultaneously specify all three components. The maximum projection along the \( z \)-axis is \( l\hbar{} \) (for \( m = l \)), but the magnitude is slightly larger due to the uncertainty in the \( x \) and \( y \) components.

The Zeeman Effect Explained

Now that we have developed the angular momentum formalism, we can understand the Zeeman effect from a quantum mechanical perspective. Recall from our classical discussion that a magnetic dipole moment \( \vec{\mu{}} \) in a magnetic field \( \vec{B} \) experiences an energy shift:

\[ \Delta{}E = -\vec{\mu{}} \cdot{} \vec{B} \]

In quantum mechanics, we promote this to an operator. For orbital angular momentum, we have from equation (4) that \( \vec{\mu{}} = g\frac{q}{2m}\vec{L} \), where for orbital motion \( g = 1 \). For an electron with charge \( q = -e \), the magnetic moment operator is:

\[ \hat{\vec{\mu{}}} = -\frac{e}{2m_{e}}\hat{\vec{L}} = -\frac{\mu_{B}}{\hbar{}}\hat{\vec{L}} \]

where \( \mu_{B} = e\hbar{}/2m_{e} \) is the Bohr magneton, a fundamental unit of magnetic moment in atomic physics.

When we apply a magnetic field along the \( z \)-axis, \( \vec{B} = B\hat{z} \), the energy shift becomes:

\[ \Delta{}E = -\hat{\vec{\mu{}}} \cdot{} \vec{B} = \frac{\mu_{B}}{\hbar{}}\hat{L}_{z}B = \mu_{B}mB \]

where we have used the fact that \( \hat{L}_{z}Y_{lm} = m\hbar{}Y_{lm} \). Therefore, each energy level with quantum number \( m \) shifts by an amount proportional to \( m \):

\[ \tag{14} E_{m} = E_{0} + \mu_{B}mB \]

where \( E_{0} \) is the energy in the absence of the magnetic field.
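To get a feel for the size of these shifts, the sketch below computes the Bohr magneton from \( e \), \( \hbar{} \), and \( m_{e} \) (CODATA values typed in by hand) and evaluates \( \mu_{B}mB \) for each \( m \). The field strength of 1 T is an assumption picked only for illustration.

```python
E_CHARGE = 1.602176634e-19      # elementary charge, C
HBAR = 1.054571817e-34          # reduced Planck constant, J s
M_ELECTRON = 9.1093837015e-31   # electron mass, kg

def bohr_magneton():
    """mu_B = e * hbar / (2 * m_e), in J/T."""
    return E_CHARGE * HBAR / (2.0 * M_ELECTRON)

def zeeman_shifts_eV(l, B):
    """Energy shift mu_B * m * B in eV for m = -l, ..., l (equation 14 minus E_0)."""
    mu_B = bohr_magneton()
    return {m: mu_B * m * B / E_CHARGE for m in range(-l, l + 1)}

print(bohr_magneton())
for m, shift in zeeman_shifts_eV(1, B=1.0).items():  # assumed field: 1 T
    print(m, shift)
```

At 1 T the shift per unit of \( m \) is a few hundredths of a milli-electronvolt, tiny compared with the electronvolt-scale energies of optical transitions, which is why strong magnets and good gratings were needed to see the splitting.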

The Normal Zeeman Effect

Consider a transition between two energy levels. In the absence of a magnetic field, an atom emits a photon with energy \( \hbar{}\omega{} = E_{i} - E_{f} \), where \( E_{i} \) and \( E_{f} \) are the initial and final energies. When a magnetic field is applied, both levels split according to equation (14).

For a transition from a state with quantum numbers \( (l_{i}, m_{i}) \) to a state with \( (l_{f}, m_{f}) \), the energy of the emitted photon becomes:

\[ \hbar{}\omega{} = (E_{i,0} + \mu_{B}m_{i}B) - (E_{f,0} + \mu_{B}m_{f}B) = (E_{i,0} - E_{f,0}) + \mu_{B}(m_{i} - m_{f})B \]

The selection rules for electric dipole transitions require that \( \Delta{}m = m_{i} - m_{f} = 0, \pm{}1 \). This means that the spectral line splits into three components:

\[ \hbar{}\omega{} = \hbar{}\omega_{0} + \mu_{B}B\,\Delta{}m, \quad \Delta{}m = 0, \pm{}1 \]

where \( \hbar{}\omega_{0} = E_{i,0} - E_{f,0} \) is the unperturbed transition energy. This is the normal Zeeman effect, which predicts that spectral lines split into exactly three components, equally spaced by \( \mu_{B}B \).

The normal Zeeman effect occurs when the total angular momentum of the atom is due solely to orbital angular momentum, and when the spin of the electron can be ignored. This is the case for transitions between singlet states, where all electron spins are paired and the total spin is zero.

Let's take, for instance, the spectrum of the cadmium atom:

Figure 5. Spectral lines for the Cadmium atom (top), transparency of the spectral lines is relative to their intensity. Data retrieved from NIST. Paired with the normal Zeeman splitting of the red cadmium line at 643.8 nm (bottom). Note that each of the split lines corresponds to a transition with a different change \( \Delta{}m \) in the projection of the atom's angular momentum along the z-axis.

The red line at 643.8 nm exhibits precisely the three-component splitting predicted by the normal Zeeman effect. Each of the three lines corresponds to a different value of \( \Delta{}m \): one unchanged in frequency, one shifted to higher energy, and one to lower. The spacing between them matches \( \mu_{B}B \), confirming that the atom's response to the magnetic field is governed by its orbital angular momentum alone. Cadmium's singlet states, with paired electron spins that cancel, make it an ideal case for an orbital-only picture.
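The size of this splitting in wavelength can be estimated from \( \hbar{}\omega{} = \hbar{}\omega_{0} + \mu_{B}B\,\Delta{}m \). The sketch below converts the three photon energies back to wavelengths for the 643.8 nm line. The field of 1 T is an assumption (the field used for the figure is not stated here), and the constants are CODATA values typed in by hand.

```python
H_PLANCK = 6.62607015e-34   # Planck constant, J s
C_LIGHT = 299792458.0       # speed of light, m / s
MU_B = 9.2740100783e-24     # Bohr magneton, J / T

def zeeman_triplet_nm(wavelength_nm, B):
    """Wavelengths (nm) of the Delta m = +1, 0, -1 components of a line
    that sits at wavelength_nm when B = 0."""
    E0 = H_PLANCK * C_LIGHT / (wavelength_nm * 1e-9)   # unperturbed photon energy
    return [H_PLANCK * C_LIGHT / (E0 + dm * MU_B * B) * 1e9 for dm in (+1, 0, -1)]

lines = zeeman_triplet_nm(643.8, B=1.0)   # assumed field: 1 T
print(lines)
print(lines[2] - lines[1])                # spacing in nm, roughly 0.02 nm at 1 T
```

A shift of about two hundredths of a nanometre on a 643.8 nm line is a change of a few parts in a hundred thousand, which gives a sense of the resolution needed to separate the three components.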

However, as we saw in Figure 4, the irregular splitting of the anomalous Zeeman effect seen in sodium and many other atoms cannot be explained by orbital angular momentum alone. That puzzle led to the introduction of spin, a new quantum mechanical degree of freedom that particles carry, independent of their motion through space. The full treatment of spin, its addition to orbital angular momentum, and the group-theoretic structure hinted at in that Griffiths quote will be the subject of a follow-up post.

  1. Griffiths, D. J., & Schroeter, D. F. (2018). Introduction to Quantum Mechanics (3rd ed.). Cambridge University Press.
  2. Kox, A. J. (1997). The discovery of the electron: II. The Zeeman effect. Eur. J. Phys., 18, 139–144.