One of the most useless concepts in physics is “force.” Admittedly, the concept has its use in teaching classical physics at an elementary level. When dealing with a repulsive force like the electrostatic one between charges of the same sign, or the magnetic one between equal polarities, we can fall back in imagination on our bodily familiarity with pushing, and when dealing with an attractive force like gravity, we can fall back in imagination on our bodily experience of pulling. But that’s as much as can be said in favor of “force.”
Consider a very basic example, Newtonian gravity. It is proportional to the product of two masses divided by the squared distance d between the two masses:
F = G m₁ m₂ / d².
This force is held responsible for the acceleration of m₁ according to F = m₁ a₁, and for the acceleration of m₂ (in the opposite direction) according to F = m₂ a₂. It is easy to get rid of the fictitious agent by eliminating F. Just plug F = m₁ a₁ into the above equation to obtain the acceleration that m₁ undergoes (a₁ = G m₂ / d²), and vice versa for m₂. We may, if we are so inclined, retain the concept of causation, thinking of the presence of one mass at its particular location as cause, and of the acceleration that the other mass undergoes as effect. But we must not cook up some physical mechanism or process by which the cause produces the effect.
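As a minimal sketch of this elimination, the following Python snippet computes the two accelerations directly from the masses and their distance, without ever forming F. The function name `accelerations` and the sample values (rough figures for the Earth's mass and radius, and a 1 kg stone) are illustrative assumptions, not part of the argument above.

```python
G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2

def accelerations(m1, m2, d):
    """Return (a1, a2), the magnitudes of the accelerations of m1 and m2.

    Each acceleration depends only on the *other* mass and on the
    distance d; the force F never appears.
    """
    if d <= 0:
        raise ValueError("distance must be positive")
    a1 = G * m2 / d**2  # acceleration of m1, from F = m1 * a1
    a2 = G * m1 / d**2  # acceleration of m2, from F = m2 * a2
    return a1, a2

# Illustrative values: a 1 kg stone at the surface of an Earth-like mass.
EARTH_MASS = 5.972e24    # kg (approximate)
EARTH_RADIUS = 6.371e6   # m (approximate)
a_stone, a_earth = accelerations(1.0, EARTH_MASS, EARTH_RADIUS)
print(a_stone, a_earth)
```

The stone's acceleration does not depend on the stone's own mass, which is why F drops out so easily.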
Classical physics is what quantum physics degenerates into in the so-called classical limit, so let’s take a look at how this happens. This task is made (relatively) easy by the fact that it requires no metaphysics. It’s a simple matter of showing how one set of calculational tools degenerates into another. That’s the good news. The bad news is that this takes some math talk.
The parable of the vanishing slits
If you do not remember the iconic two-slit experiment with electrons, you may want to (re)visit this post. The setup features an electron gun, a plate with two slits, and a screen composed of an array of detectors, one of which is shown in the following diagram.
Let us modify the setup. First we replace the two slits by two holes. Then we add more plates. Then we drill more holes. And more plates. And more holes. What happens if we do the Zen-like thing of adding so many plates and drilling so many holes that there are no longer any plates with holes? Let’s find out.
If you take another look at the aforementioned post, you will come across two essential rules of the quantum game. Here they are in a simplified form:
[Preamble] In order to calculate the probability of a possible outcome of a measurement M₂, given the actual outcome of an earlier measurement M₁, choose a sequence of measurements that may be made in the meantime, and apply the appropriate rule. The possible sequences of intermediate outcomes are called “alternatives.” Associated with each alternative there is a complex number called “amplitude.”
[Rule 1] If the intermediate measurements are made, first square the magnitudes of the amplitudes associated with the alternatives, then add the results.
[Rule 2] If the intermediate measurements are not made, first add the amplitudes associated with the alternatives, then square the magnitude of the result.
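Here is a toy illustration of the difference between the two rules, assuming just two alternatives (say, two slits) with made-up amplitudes of equal magnitude. The function names and the numbers are hypothetical, and the results are unnormalized, so they should be read as relative probabilities only.

```python
import cmath
import math

def probability_rule1(amplitudes):
    """Intermediate measurements made: square magnitudes first, then add."""
    return sum(abs(a) ** 2 for a in amplitudes)

def probability_rule2(amplitudes):
    """Intermediate measurements not made: add amplitudes first, then square."""
    return abs(sum(amplitudes)) ** 2

# Two alternatives with equal magnitude 0.5 (an arbitrary choice).
in_phase = [cmath.rect(0.5, 0.0), cmath.rect(0.5, 0.0)]
out_of_phase = [cmath.rect(0.5, 0.0), cmath.rect(0.5, math.pi)]

for label, amps in (("in phase", in_phase), ("out of phase", out_of_phase)):
    print(label, probability_rule1(amps), probability_rule2(amps))
```

Rule 1 is blind to the relative phase of the two amplitudes, whereas Rule 2 lets them reinforce or cancel each other. This is the interference seen in the two-slit experiment.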
As long as we are dealing with a finite number of plates and a finite number of holes in each plate, each intermediate measurement serves to determine the hole that an electron takes as it passes a given plate. If the intermediate measurements are not made, Rule 2 applies. This, then, is what happens in the limit in which the number of plates and the number of holes in each plate grow infinitely large, so that neither plates nor holes are left: the sum over alternatives (each contributing an amplitude) becomes an integral over all continuous paths leading from a starting point A (where the electron is launched) to an end point B (where it is detected). In other words, it becomes a path integral ∫𝒟𝒞 Z[𝒞].
For those familiar with integrals: In the case of an ordinary (Riemann) integral ∫[a,b] dx f(x), where f is a function that assigns a number y = f(x) to every number x from some interval [a,b], we imagine the interval [a,b] divided into infinitely many intervals of infinitely small width dx. As the word “infinitely” indicates, a limit is taken. In the case of ∫𝒟𝒞 Z[𝒞], the integrand is a functional, meaning that it assigns an (in this case complex) number Z[𝒞] to every continuous path 𝒞 leading from A to B. While the ordinary integral adds up contributions dx f(x) from infinitely small intervals, the path integral adds up contributions 𝒟𝒞 Z[𝒞] from infinitely narrow bundles of paths leading from A to B.
For everyone else: ∫𝒟𝒞 Z[𝒞] sums up contributions from infinitely many infinitely narrow bundles of paths, each path leading from a fixed starting point A to a fixed end point B. Each bundle contributes a complex number 𝒟𝒞 Z[𝒞], where 𝒟𝒞 represents the width of the bundle and Z[𝒞] is a mathematical machine that assigns a complex number Z to every path 𝒞. (Because Z[𝒞] assigns a number to a path, which mathematically is represented by a function, it is called a “functional.”)
To proceed, we must take account of the second pillar of contemporary physics, which is the theory of relativity. What this means is that we must sum contributions from paths in 4-dimensional spacetime, rather than contributions from paths in 3-dimensional space. As long as we are dealing with a freely moving stable particle, Z[𝒞] equals a complex number z of unit magnitude whose phase is proportional to a functional s[𝒞]. The proportionality factor is some positive constant b, and s[𝒞] assigns a positive number s to every continuous path 𝒞 from A to B. s[𝒞] turns out to be the length of 𝒞 that would be measured by a clock (thus in seconds) if one were to travel from A to B via 𝒞.
What is most memorable at this point is that the behavior of a freely moving stable particle is determined by a single positive number b (which can differ between particle types). And it bears repetition that what is meant by “the behavior of a particle” is encapsulated in a mathematical machine that assigns a probability to the possibility of detecting the particle at any given point and any given time, based on where and when it was last detected.
The meaning of mass
So what is the physical significance of this number b? Remember that z, being a complex number of unit magnitude, can be pictured as an arrow of unit length. The direction in which it points is determined by the phase of z, which equals the product b s[𝒞]. Let us imagine a particle that follows a spacetime path 𝒞, and let us take s to be the length of the path already traveled. As the particle continues to move along 𝒞, that arrow rotates, and this allows us to think of it as the hand of a watch which ticks each time a cycle is completed.
Since the phase of z is positive, by mathematical convention the arrow rotates anticlockwise. It is customary to insert a minus sign (i.e., to replace bs by –bs), which makes the arrow rotate clockwise, as befits a clock. It is also customary to multiply the phase by 2π, which allows us to think of b as the rate at which the watch ticks or as the number of cycles it completes each second. It is further customary to divide by Planck’s constant h, so that b is measured in units of energy, whereupon it comes to be known as the particle’s rest energy. And finally it is customary to multiply by the speed of light squared, so that b is measured in units of mass, whereupon it comes to be known as the particle’s mass. And since particle physicists like to work with “natural” units (i.e., to set both the reduced Planck constant ℏ = h/2π and the speed of light c equal to 1), the end result is that the constant b is none other than the particle’s mass m.
The most appropriate way of thinking of the mass m of a freely moving stable particle thus is to think of the particle as carrying a watch, and to think of m as the rate at which this ticks as the particle travels along a path 𝒞 in spacetime. Caveat: The particle doesn’t travel along any particular path, nor does it travel along all possible paths. I am merely paraphrasing a way of calculating the statistical correlations which exist between particle detections, and which encapsulate the observable behavior of the simplest type of particle — one that moves freely and is stable (it doesn’t decay).
It is, moreover, worth keeping in mind that the assumption that two successive particle detections are detections of the same individual particle (rather than merely detections of the same type of particle) is justifiable only under exceptional experimental conditions. (You may want to revisit the discussion of sameness in this post.)
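The watch picture of mass can be sketched numerically. The snippet below assumes natural units (ℏ = c = 1), one space dimension, and paths made of straight segments between events (t, x). The function names and the two sample paths are hypothetical, chosen only for illustration.

```python
import cmath
import math

def proper_time(events):
    """Clock-measured length s of a piecewise-straight spacetime path.

    `events` is a list of (t, x) pairs in units with c = 1. Each segment
    must point into the future and be slower than light (|dx| < dt).
    """
    s = 0.0
    for (t0, x0), (t1, x1) in zip(events, events[1:]):
        dt, dx = t1 - t0, x1 - x0
        if dt <= 0 or abs(dx) >= dt:
            raise ValueError("segment is not timelike and future-directed")
        s += math.sqrt(dt**2 - dx**2)
    return s

def path_amplitude(m, events):
    """Unit-length arrow exp(-i m s); the minus sign makes it turn clockwise."""
    return cmath.exp(-1j * m * proper_time(events))

def ticks(m, events):
    """Number of full cycles the arrow completes along the path."""
    return m * proper_time(events) / (2 * math.pi)

# Two paths between the same events A = (0, 0) and B = (10, 0).
straight = [(0.0, 0.0), (10.0, 0.0)]
detour = [(0.0, 0.0), (5.0, 3.0), (10.0, 0.0)]

for name, path in (("straight", straight), ("detour", detour)):
    print(name, proper_time(path), ticks(2.0, path), path_amplitude(2.0, path))
```

For the same m, the two paths yield arrows of equal length but different direction, because the watch ticks a different number of times along each.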
Geometry through and through
To get to the classical theory, according to which every material object follows a definite path, we need an important physical quantity, which goes by the name of “action.” Imagine, if you will, an infinitely small segment d𝒞 of a path in spacetime, situated at a point 𝒫 with coordinates (x,y,z,t). Its projections onto the four spacetime axes are the four intervals dx, dy, dz, dt. Because an infinitely short path segment is straight,[1] the expression Z[d𝒞] is a function, unlike Z[𝒞], which is a functional. It assigns to d𝒞 a complex number of unit magnitude whose phase equals the infinitesimal action dS(x,y,z,t,dx,dy,dz,dt) divided by the reduced Planck constant ℏ. This infinitesimal action has the following property: multiplying each of the four infinitesimal intervals by a factor u has the same effect as multiplying dS itself by u. This makes it possible to interpret dS as defining a differential geometry, the kind of geometry that allows one to assign lengths to paths on a warped surface, in a warped space, or in a warped spacetime.
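Written out, the scaling property just described reads as in the first line below. The restriction to positive u is an assumption made here to keep the square root in the second line unambiguous. That second line, the free-particle expression in natural units (ℏ = c = 1) with the clockwise sign convention, is added only as an illustration: it is the simplest dS that has the property.

```latex
dS(x,y,z,t,\,u\,dx,\,u\,dy,\,u\,dz,\,u\,dt) = u\,dS(x,y,z,t,\,dx,\,dy,\,dz,\,dt), \qquad u > 0

dS_{\mathrm{free}} = -m\,\sqrt{dt^{2} - dx^{2} - dy^{2} - dz^{2}}
```

Multiplying dt, dx, dy and dz by u multiplies the expression under the square root by u², and hence the free-particle dS by u, as required.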
There are essentially two ways in which spacetime can be warped, one which affects different types of particles differently, and one which affects all particles equally. The first is described by a type of geometry that is named after Paul Finsler, the second by a type of geometry that is named after Bernhard Riemann. The first turns out to represent the effects that electrically charged particles have on electrically charged particles, and the second turns out to represent the gravitational effects that all particles have on each other. Because gravity affects all particles alike, the second is often spoken of as the geometry of spacetime. But this is misleading, inasmuch as neither space nor time nor spacetime comes equipped with an inbuilt metric (a way to measure distances and/or durations). The gravitational effect of material objects is to bend the trajectories of material objects, not to warp spacetime itself.
The classical limit
So how do we get objects that follow definite paths? We might start by asking: what is the probability of finding that a particle has traveled from spacetime point A to spacetime point B via a specific path 𝒞₀? But since it is strictly impossible to ascertain the specific path that a particle has taken, that question is meaningless. Let us then make the more realistic assumption that it is possible to determine whether a particle has traveled from A to B within some narrow bundle of paths. The probability of finding that a particle has traveled from A to B inside a narrow bundle 𝔅 of paths is formally given by the magnitude squared of a path integral I(𝔅) that sums up contributions from all paths contained in 𝔅. We need to consider two cases: either (i) 𝔅 contains a path 𝒞₀ whose action is less than the action of any path near 𝒞₀, or (ii) 𝔅 does not contain such a path.
If 𝔅 contains such a path, the actions of the paths close to 𝒞₀ will be almost equal to S[𝒞₀]. A more familiar analogue is the relative minimum f(x₀) of a continuous function f(x): for values of x close to x₀, f(x) is almost equal to f(x₀). Hence, there will be a large number of paths that make almost identical contributions to the path integral I(𝔅). These contributions are complex numbers, and if we represent them as arrows, their sum will be an arrow of considerable length, as the following illustration shows. The magnitude of the amplitude associated with the possibility of finding that the particle has traveled from A to B inside 𝔅 will therefore be large, and so will be the corresponding probability.
If 𝔅 does not contain a path whose action is less than the action of any neighboring path, most of the contributions to I(𝔅) will be represented by arrows that add up to some sort of coil. The sum of these arrows, which is an arrow that points from the tail of the first arrow to the head of the last, will never amount to much. Nor, therefore, will the probability of finding that the particle has traveled from A to B inside 𝔅.
Finally let us take into account that the contribution from a path 𝒞 to the integral I(𝔅) is a complex number of unit magnitude whose phase φ is proportional to S[𝒞] divided by the reduced Planck constant ℏ. The classical limit can formally be obtained by letting ℏ go to zero. As ℏ is taken to 0, φ grows infinitely large. The overall result therefore is that the probability of finding that the particle has traveled inside a bundle containing a path of least action (compared to its neighboring paths) tends to 1, while the probability of finding that the particle has traveled inside a bundle that does not contain a path of least action tends to 0.
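The argument can be checked numerically in a toy model. The sketch below assumes a one-parameter family of paths labeled by a number q, with a made-up action S(q) = q² that is stationary at q = 0. A bundle is an interval of q values, and I(𝔅) is approximated by a midpoint sum. None of this is the actual path integral, only a simplified stand-in for it, and all names are hypothetical.

```python
import cmath

def action(q):
    """Made-up action of the path labeled q; stationary (minimal) at q = 0."""
    return q * q

def bundle_amplitude(center, width, hbar, n=20000):
    """Midpoint-sum approximation to the sum of arrows exp(i S / hbar)
    over the bundle of paths with labels in [center - width/2, center + width/2]."""
    dq = width / n
    start = center - width / 2
    total = 0j
    for k in range(n):
        q = start + (k + 0.5) * dq
        total += cmath.exp(1j * action(q) / hbar) * dq
    return total

def suppression_ratio(hbar):
    """Relative probability of a bundle without a stationary path (around q = 3)
    compared with a bundle containing the stationary path (around q = 0)."""
    near = abs(bundle_amplitude(0.0, 1.0, hbar)) ** 2
    away = abs(bundle_amplitude(3.0, 1.0, hbar)) ** 2
    return away / near

for hbar in (1.0, 0.1, 0.01):
    print(hbar, suppression_ratio(hbar))
```

In this toy model both magnitudes shrink as ℏ decreases, but the bundle without a stationary path shrinks much faster. After normalization, the probability therefore concentrates on the bundle that contains the path of stationary action.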
This result is known as the principle of least action. Since it also holds for paths whose actions are larger than those of their close neighbors, it is also known as the principle of stationary action. The bottom line: in the classical limit of quantum mechanics, every material object follows a spacetime path whose action is stationary. In other words, it follows a geodesic of some differential geometry.
How the classical electromagnetic field bends the trajectories of electric charges
The rectangles below illustrate the effects that electrically charged objects have on electrically charged objects in the classical limit. For an object of given charge and mass, the curved spacetime path shown is the path of least action from A to C. For this reason — and for no other — said object moves from A to C along this path.
The rectangle on the left is situated in a spacetime plane that contains the time axis. In such a plane, curvature means acceleration or deceleration. If you follow the path from A to C, you will notice that its slope decreases: it takes less and less time to cover the distance of, say, a meter. In other words, the object’s speed increases. According to the classical storybooks, this effect is “caused” by the electric part of the electromagnetic field. (It is actually caused by the distribution and motion of the other charged objects around.)
The rectangle on the right is situated in a spacetime plane that does not contain the time axis. In such a plane, curvature means curvature plain and simple. According to the classical storybooks, this effect is “caused” by the magnetic part of the electromagnetic field, which is thought to accelerate any charged object in a direction perpendicular to the direction in which it moves. (It is actually caused by the distribution and motion of the other charged objects around.)
But if, classical storybooks notwithstanding, accelerations aren’t caused by forces pushing or pulling things, then what do today’s physicists have in mind when they speak of the “fundamental forces” of nature? We already have discussed on several occasions the meaning of “particle” in contemporary physics. Our predicament is that quantum physics is incompatible with the two cornerstones of our classical universe of discourse — the concepts of substance and causality, from which the concepts “particle” and “force” derive their original meanings.
The evidentiary basis of contemporary particle physics consists of scattering experiments, where the relevant theoretical tool is the S-matrix (the S stands for “scattering”). This serves to calculate the probability with which any given set of incoming particles gets transformed into any given set of outgoing particles. The calculations are most efficiently done with the help of diagrams containing vertices at which straight and wiggly lines meet. Internal lines, which represent neither incoming nor outgoing particles, are referred to as “virtual particles,” and are said to represent both particles and the forces that act between them. As Brigitte Falkenburg points out here, talk of virtual particles “is a mere façon de parler” which “provokes conceptual confusion as soon as one insinuates that virtual particles are physical, i.e., on a par with the real field quanta or the incoming and outgoing physical particles of a scattering experiment.” In short, to talk about particles that mediate interactions (between physical particles or other virtual ones) is to mistake a mathematical procedure for an actual physical process.
[1] If you begin by imagining a small but finite segment of a curved path, and then zoom in on smaller and smaller segments, you will notice that they are less and less curved. In the infinitesimal limit, their curvature vanishes.