Testing Einstein

An Unfinished Job

Einstein's theory of general relativity has passed every test that it has ever been put to. Nevertheless there are at least four good reasons to think that the theory is incomplete and will eventually need to be overthrown in just the same way that Newton's was. Firstly, general relativity predicts its own demise; it breaks down in singularities, regions where the curvature of spacetime becomes infinite and the field equations can no longer be applied. These cannot be dismissed as mere academic curiosities, because they do apparently occur in the real universe if general relativity holds. Theoretical work by Stephen Hawking and others has proven that singularities must form within a finite time (the universe is necessarily "geodesically incomplete"), given only very generic assumptions such as the positivity of energy. Two places where we expect to find them are at the big bang, and inside black holes like the one at the center of the Milky Way. If we are to fully understand these phenomena, then general relativity must be modified or extended in some way.

Secondly, there is the question of cosmology. Under the reasonable assumptions that the universe on large scales is homogeneous and isotropic (the same in all places and in all directions), as suggested by observation in combination with the Copernican principle, general relativity has led to the creation of a cosmological theory known as the big bang theory. This theory has had some spectacular successes; for instance, the prediction of the cosmic microwave background radiation, the calculation of the abundances of light elements, and a basis for understanding the origin of structure in the universe. It also has some weaknesses, notably involving finely tuned initial conditions (the "flatness" and "horizon" problems).

Background galaxy (blue) being gravitationally
lensed by dark matter in foreground cluster
CL 0024+1654 (yellow) (Hubble Space Telescope image)

More troublingly, in recent decades it has become impossible to match the predictions of big-bang cosmology with observation unless the thin density of matter observed in the universe (i.e. that which can be seen by emission or absorption of light, or inferred from consistency with light-element synthesis) is supplemented by much larger amounts of unseen dark matter and dark energy that cannot consist of anything in the standard model of particle physics. The observations are quite clear: the required exotic dark matter has a density some five times that of standard-model matter, and the required dark energy has an energy density some three times greater still. To date, there is no direct experimental evidence for the existence of either component, and there are strong theoretical reasons (the "cosmological constant problem") to be suspicious of dark energy in particular. There is also no convincing explanation of why two new and as-yet unobserved forms of matter-energy should be so closely matched in energy density (the "coincidence problem"). While the majority of cosmologists seem prepared to accept both dark matter and dark energy as necessary, if inelegant, facts of life, others are beginning to interpret them as possible evidence of a breakdown of general relativity at large distances and/or small accelerations.
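
The ratios quoted above can be turned into a rough cosmic energy budget. The short sketch below does only that arithmetic. It reads "three times greater still" as three times the dark-matter density, which is an interpretation of the text rather than a measured input, and the variable names are illustrative.

```python
# Relative energy densities in units of the standard-model matter density.
# The factors 5 and 3 are the rough ratios quoted in the text; reading the
# second one as "three times the dark-matter density" is an assumption.
standard_matter = 1.0
dark_matter = 5.0 * standard_matter
dark_energy = 3.0 * dark_matter

total = standard_matter + dark_matter + dark_energy

fractions = {
    "standard-model matter": standard_matter / total,
    "dark matter": dark_matter / total,
    "dark energy": dark_energy / total,
}

for name, fraction in fractions.items():
    print(f"{name:>22s}: {100.0 * fraction:4.1f}% of the total")
```

On these round numbers, ordinary matter makes up only about one part in twenty of the total, with dark energy accounting for roughly seven tenths.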

Thirdly, existing tests of general relativity have been restricted to weak gravitational fields (or moderate ones in the case of the binary pulsar). Major departures from Einstein's predictions in this regime would have been surprising, since Einstein's theory goes over to Newton's in the weak-field limit, and we know that Newtonian gravity works reasonably well. But surprises are quite possible, and even likely, in the strong-field regime. The reason why is closely related to the fourth motivation for continuing to test Einstein's theory: general relativity as it stands is incompatible with the rest of physics (i.e. the "standard model" based on quantum field theory). The problem is only partly due to the fact that the gravitational field carries energy and thus "attracts itself"; this makes the theory nonlinear and more difficult, but not necessarily impossible, to quantize. (Yang-Mills fields also possess self-couplings but are perfectly quantizable.) The deeper problem is not nonlinearity but nonrenormalizability, which is inherent in the physical dimensionality of gravity itself (i.e., in the fact that the gravitational field couples to mass rather than any other kind of "charge"). In field-theory language, quantization of gravity requires an infinite number of renormalization parameters. It is widely believed that our present theories of gravity and/or the other interactions are only approximate "effective field theories" that will eventually be seen as limiting cases of a unified theory in which all four forces become comparable in strength at very high energies. But there is no consensus as to whether it is general relativity or particle physics, or both, that must be modified, let alone how. Experimental input may be our only guide to unification, the last great remaining problem in theoretical physics.

The Equivalence Principle

Gravitational experiments can be divided into two kinds: those that test fundamental principles and those that test individual theories, including general relativity. The fundamental principles include such basic axioms as local position invariance (or LPI; the outcome of any experiment should be independent of where or when it is performed) and local Lorentz invariance (or LLI; the outcome of any experiment should be independent of the velocity of the freely-falling reference frame in which it is performed), which we will not discuss here. The fundamental principle of most direct physical relevance to general relativity is the equivalence principle, which asserts that gravitation is locally equivalent to acceleration. In practical terms this means that different falling bodies should follow the same trajectory in the same gravitational field, independent of their mass or internal structure, provided they are small enough not to disturb the environment or to be affected by tidal forces. To test this principle, one drops objects of different mass or composition in the same gravitational field and looks for differences in rate of fall. Such experiments have a long and fascinating history.

Portrait of Simon Stevin
Stevin

Galileo's inclined plane experiment.
(Fresco by G. Bezzuoli, 1841)

The Greek philosopher Aristotle saw no need to do them at all; he knew by reason alone that a heavier mass "must" fall more quickly than a lighter one, since it is the nature of earth-like elements to strive toward the center of the universe. Such was his authority that there are no records of anybody actually testing this prediction until nearly ten centuries later (6th century AD) when the Byzantine scholar John Philoponus wrote in a commentary on Aristotle: "If you let fall from the same height two weights of which one is many times as heavy as the other, you will see that the ratio of times required for the motion does not depend on the ratio of the weights, but that the difference in time is a very small one" (italics added). First to describe an actual experiment in the modern sense was Flemish engineer Simon Stevin (1548/9-1620), who wrote in 1586: "My experience against Aristotle is the following. Let us take ... two spheres of lead, one ten times larger and heavier than the other, and drop them together from a height of 30 feet onto a board ... Then it will be found that the lighter will not be ten times longer on its way than the heavier, but that they will fall together onto the board so simultaneously that their two sounds seem to be as one" (italics added). Some years later Galileo Galilei (1564-1642) described a similar experiment involving a cannonball and a musket ball. Contrary to almost universal belief, he did not claim to have dropped these balls from the Leaning Tower of Pisa; that story comes from his last pupil and biographer and its authenticity is far from certain. What is certain is that Galileo understood the importance of this test better than any before him. He used a variety of materials including gold, lead, copper and stone, and improved the experiment by rolling his test masses down inclined tables (to dilute gravity) and eventually by using pendulums (to reduce friction). He concluded in the Discourses and Mathematical Demonstrations Concerning Two New Sciences (1638) that "If one could totally remove the resistance of the medium, all substances would fall at equal speeds".

Portrait of Isaac Newton
Newton

Portrait of Loránd Eötvös
Eötvös

Many people have improved on these tests since, most notably Isaac Newton (1643-1727) and Loránd Eötvös (1848-1919). Newton improved on Galileo's pendulum experiments, and perceived with characteristic brilliance that celestial bodies could also serve as test masses (in particular he checked that the earth and moon, as well as Jupiter and its satellites, fall toward the sun at the same rate). Newton's idea was reintroduced as a test of the equivalence principle by Kenneth Nordtvedt in the 1970s, and it now provides one of the two strongest limits on possible violations of equivalence: the earth (with a nickel-iron core) and the moon (composed mostly of silicates, like the earth's mantle) fall toward the sun with accelerations that differ by no more than 3 parts in 10¹³. This accuracy is made possible by lunar laser ranging measurements that make use of reflectors left on the moon's surface by the Apollo astronauts.

Torsion Balance

Eötvös' innovation was to introduce the use of the torsion balance, which enabled an improvement in sensitivity of six orders of magnitude over Newton's pendulum tests, reaching a precision of 5 parts in 10⁹. Torsion balances are still the basis for the best terrestrial limits on violations of the equivalence principle today; the best such limits (by Eric Adelberger and his collaborators) are identical to those obtained from the celestial method, and limit any difference between the accelerations of different test masses to less than 3 parts in 10¹³. Other kinds of equivalence-principle experiments using laser atom interferometry to measure differences in rate of fall for isotopes of different atomic mass may reach even higher precision in the future. However, earthbound tests of the equivalence principle are subject to fundamental limitations imposed by seismic noise, tidal effects and systematic uncertainties in lunar modeling. It is likely that further significant increases in precision will require going into space.

Artist's conception of the STEP
spacecraft

Testing equivalence on earth ...
and in space.

One such experiment, the Satellite Test of the Equivalence Principle (STEP), is currently under development at Stanford University with a design sensitivity of one part in 10¹⁸, an improvement of more than five orders of magnitude over current limits. At this level, an equivalence-principle experiment is capable of testing not only general relativity, but also theories that go beyond it and attempt to unify gravity with the other forces of nature, including versions of quantum gravity, string theory and quintessence cosmology. STEP will inherit some of the key technologies that have been proven by Gravity Probe B, including drag-free control and a readout system based on SQUIDs (Superconducting QUantum Interference Devices).
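
As a quick check on the "more than five orders of magnitude" figure, the following sketch compares the current limit of 3 parts in 10¹³ with the STEP design sensitivity of 1 part in 10¹⁸. Both numbers come from the text; the variable names are illustrative.

```python
import math

# Fractional differences in acceleration, as quoted in the text.
current_limit = 3e-13   # best torsion-balance and lunar-ranging limits
step_design = 1e-18     # STEP design sensitivity

# Improvement expressed as a plain factor and in orders of magnitude.
improvement_factor = current_limit / step_design
orders_of_magnitude = math.log10(improvement_factor)

print(f"improvement factor : {improvement_factor:.1e}")
print(f"orders of magnitude: {orders_of_magnitude:.2f}")
```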

Gravitational Redshift

Physicist Kip Thorne describes
the Gravity Probe A experiment

This was the first experimental test of gravitation that Einstein proposed, and it is often called one of the "three classical tests" of general relativity. The existence of the gravitational redshift effect, however, follows from the equivalence principle alone, so it is not a test of general relativity per se and is more properly grouped with the fundamental tests. (Some have called it the "half-test" in Einstein's "two and a half classical tests" of general relativity.) A clock in a gravitational field is, by the equivalence principle, indistinguishable from an identical one in an accelerated frame of reference. The gravitational redshift is thus equivalent to a Doppler shift between two accelerating frames. The first accurate measurement of this effect was made by R.V. Pound, G.A. Rebka and J.L. Snider in the 1960s using the frequency shift between two atomic "clocks" moving up and down inside Harvard University's Jefferson tower. They made use of a sensitive phenomenon called the Mössbauer effect to measure this shift to an accuracy of about 1%.
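
To give a sense of how small the effect is in a tower experiment, the sketch below evaluates the weak-field expression for the fractional frequency shift over a height h, namely g·h/c². The tower height of 22.5 m is an assumed round value that does not appear in the text above, and the function name is illustrative.

```python
# Fractional gravitational frequency shift over a small height difference,
# using the weak-field, uniform-gravity approximation  delta_nu / nu = g*h/c^2.
G_SURFACE = 9.81          # m/s^2, surface gravity (assumed round value)
C = 299_792_458.0         # m/s, speed of light
TOWER_HEIGHT = 22.5       # m, assumed height of the tower (not from the text)


def fractional_shift(height_m: float, g: float = G_SURFACE) -> float:
    """Return the fractional frequency shift for a vertical separation."""
    return g * height_m / C**2


shift = fractional_shift(TOWER_HEIGHT)
print(f"fractional frequency shift over {TOWER_HEIGHT} m: {shift:.2e}")
```

A shift of a few parts in 10¹⁵ is far below ordinary spectral linewidths, which is why an exceptionally sharp resonance such as the Mössbauer effect was needed to see it.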

Vessot & Levine

A similar accuracy has been reached by experiments comparing clocks on earth to those on spacecraft such as Voyager (in Saturn's gravitational field) or Galileo (in the field of the sun). Other experimenters have looked for the shift of spectral lines in the sun's gravitational field, an attempt that was confounded for many years by solar "limb effects". Oxygen triplet lines finally allowed a 2% detection by James LoPresto et al. in 1991. Another test compares terrestrial timepieces to the highly stable astronomical "clocks" known as pulsars; this yields accuracies of about 10%. The most precise gravitational redshift test to date was carried out by Robert Vessot and Martin Levine in 1976 and is known as Gravity Probe A. It compared a hydrogen maser clock on earth to an identical one lifted into orbit at about 10,000 km, and confirmed theoretical expectations to an accuracy of 0.02%.

It is interesting to note that the Global Positioning System (GPS), while not intended or used as a test of general relativity, does effectively serve as confirmation of the gravitational redshift effect. To reach their specified (civilian) navigational accuracy of about 15 m, GPS satellites must coordinate their time signals to within about 50 nanoseconds, a precision nearly 1000 times smaller than the size of the gravitational redshift effect (almost 40 microseconds per day at their operating altitude of 20,000 km). If they did not take Einstein's theory into account, GPS trackers in aircraft cockpits would be off by kilometers within a day!
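
The size of the GPS clock offset can be estimated with a few lines of arithmetic. The sketch below assumes a circular orbit at the 20,000 km altitude quoted above, a spherical non-rotating earth, and first-order weak-field formulas. It combines the gravitational blueshift of the satellite clock with the special-relativistic slowing due to its orbital speed, then converts a day's accumulated offset into a ranging error. The constants and names are illustrative choices, not values taken from the GPS specification.

```python
# First-order estimate of the relativistic clock offset for a GPS satellite.
# Assumptions: circular orbit, spherical non-rotating earth, weak fields.
GM_EARTH = 3.986004418e14   # m^3/s^2, earth's gravitational parameter
C = 299_792_458.0           # m/s, speed of light
R_EARTH = 6.371e6           # m, mean earth radius (assumed)
ALTITUDE = 2.0e7            # m, the 20,000 km altitude quoted in the text
SECONDS_PER_DAY = 86_400.0

r_orbit = R_EARTH + ALTITUDE

# Gravitational term: the orbiting clock sits higher in the potential,
# so it runs fast relative to a clock on the ground.
grav_rate = GM_EARTH / C**2 * (1.0 / R_EARTH - 1.0 / r_orbit)

# Velocity term: time dilation slows the moving clock (v^2 = GM/r).
vel_rate = -(GM_EARTH / r_orbit) / (2.0 * C**2)

net_us_per_day = (grav_rate + vel_rate) * SECONDS_PER_DAY * 1e6
range_error_km = net_us_per_day * 1e-6 * C / 1e3

print(f"gravitational term: {grav_rate * SECONDS_PER_DAY * 1e6:+.1f} microseconds/day")
print(f"velocity term     : {vel_rate * SECONDS_PER_DAY * 1e6:+.1f} microseconds/day")
print(f"net offset        : {net_us_per_day:+.1f} microseconds/day")
print(f"ranging error     : about {range_error_km:.0f} km per day if uncorrected")
```

Under these assumptions the net offset comes out just under 40 microseconds per day, which corresponds to a ranging error of roughly ten kilometers.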

Mercury's Perihelion Shift

Animation showing Mercury's
perihelion shift

Mercury provided Einstein with his first true "classical" test of general relativity, and it formed the immediate basis for the rapid acceptance of his theory by his peers. Astronomers had known since the time of Urbain Le Verrier in 1859 that Mercury's orbit (as measured by the location of its perihelion, the planet's closest approach to the sun) was slewing too quickly around the sun to be explained by known factors such as the perturbing influence of the other planets. All explanations for this phenomenon (such as the notion that a new planet named "Vulcan" lay inside Mercury's orbit) had failed. When Einstein found that his theory explained the anomalous perihelion advance perfectly with no special assumptions, he experienced heart palpitations and was (as he wrote to a colleague) "for several days beside myself with joyous excitement".

Einstein in 1933

The perihelion shift test, like most other gravitational tests, is now usually expressed using a formalism invented by Arthur S. Eddington and later developed by Kenneth Nordtvedt Jr. and Clifford Will into what is known as the Parametrized Post-Newtonian (PPN) framework. Eddington described weak spherically symmetric gravitational fields like that around the sun in a rather general form with only two parameters, β describing the nonlinearity in time warping and γ describing space warping. (There is also a third parameter α, but it does not test any aspect of relativity theory and merely allows one to rescale the value of Newton's gravitational constant G.) General relativity predicts that β and γ are both equal to one, and most of the experimental tests effectively place upper limits on |β-1| and/or |γ-1|. Mercury's anomalous perihelion shift is proportional to (2+2γ-β)/3, which is equal to one in general relativity. This is the only "classic" test that probes the nonlinear nature of Einstein's theory. Initial measurements relied on optical telescopes; modern ones are based on radar data and constrain any departure from general relativity to less than 0.3%. An important early source of systematic error came from uncertainty in solar oblateness (quadrupole moment), but this has now been well constrained from helioseismology. Perihelion shift affects other planets besides Mercury, but is far smaller. It has also been observed using radio telescopes in distant binary pulsar systems, where it is known as periastron shift.
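
The PPN factor can be made concrete with a short calculation. The sketch below uses the standard first-order expression for the relativistic perihelion advance per orbit, 6πGM/(c²a(1−e²)), multiplied by (2+2γ−β)/3. The orbital elements for Mercury are approximate textbook values supplied here as assumptions, and the function name is illustrative.

```python
import math

GM_SUN = 1.32712440018e20    # m^3/s^2, solar gravitational parameter
C = 299_792_458.0            # m/s, speed of light
A_MERCURY = 5.7909e10        # m, semi-major axis (approximate, assumed)
E_MERCURY = 0.2056           # orbital eccentricity (approximate, assumed)
PERIOD_DAYS = 87.969         # orbital period in days (approximate, assumed)
DAYS_PER_CENTURY = 36_525.0
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def perihelion_advance(beta: float = 1.0, gamma: float = 1.0) -> float:
    """Relativistic perihelion advance of Mercury in arcseconds per century."""
    ppn_factor = (2.0 + 2.0 * gamma - beta) / 3.0
    per_orbit = (6.0 * math.pi * GM_SUN
                 / (C**2 * A_MERCURY * (1.0 - E_MERCURY**2)))
    orbits_per_century = DAYS_PER_CENTURY / PERIOD_DAYS
    return ppn_factor * per_orbit * orbits_per_century * ARCSEC_PER_RAD


print(f"general relativity (beta = gamma = 1): "
      f"{perihelion_advance():.1f} arcsec/century")
print(f"example with beta = 1.003            : "
      f"{perihelion_advance(beta=1.003):.2f} arcsec/century")
```

With β = γ = 1 this gives about 43 arcseconds per century. Because the result scales linearly with the PPN factor, a 0.3% bound on the shift translates directly into a bound on the combination 2γ−β.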

Light Deflection

Animation showing gravitational
deflection of light

If Mercury's perihelion shift led to the acceptance of general relativity among Einstein's peers, then light deflection made him famous with the public. He had already found in 1911 that the equivalence principle implies some light deflection, since a beam of light sent horizontally across a room will appear to bend toward the floor if the room is accelerating upwards. (Similar arguments had in fact been proposed on purely Newtonian grounds by Henry Cavendish in 1784 and Johann Georg von Soldner in 1803.) In 1915, however, Einstein realized that space curvature doubles the size of the effect, and that it might be possible to detect it by observing the bending of light from background stars around the sun during a solar eclipse. He had to wait until the end of the war, when expeditions led by Arthur Eddington and Andrew Crommelin returned decent photographs of the eclipse of May 1919.

Photo of Eddington
Eddington

NY Times, Nov 10, 1919

The results vindicated Einstein's new theory, though with an accuracy of only about 30%. When they were announced in November that year, the press picked up the story and Einstein became an international star. Many have speculated on the reasons for Einstein's mythic and seemingly endless appeal. Part of the explanation must lie in the end of the most shattering war in history; the fact that a German-speaking physicist's theory had been confirmed by English observers portended a peaceful future where pure thought might triumph over narrow minds. Abraham Pais, Einstein's great scientific biographer, put it like this in Subtle is the Lord... (1982): "A new man appears abruptly ... He carries the message of a new order in the universe. He is a new Moses come down from the mountain to bring the law, a new Joshua controlling the motion of heavenly bodies. He speaks in strange tongues but wise men aver that the stars testify to his veracity ... His mathematical language is sacred yet amenable to transcription into the profane: the fourth dimension ... light has weight, space is warped ... He fulfills two profound needs in man, the need to know and the need not to know but to believe."

The light deflection angle is proportional to (1+γ)/2, which is equal to one in general relativity. Experimental limits on γ using optical telescopes never managed to improve very much on those obtained by Eddington and Crommelin, and it was not until the late 1960s that radio astronomers were able to make significant progress by using linked arrays of radio telescopes (VLBI) to measure the bending of radio waves from distant quasars around the sun. By 1995 these observations had confirmed general relativity to an accuracy of 0.04%. The entire sky is slightly distorted by light deflection around the sun, and since this effect reaches 4 milliarcseconds perpendicular to the earth-sun direction it must be taken into account by modern astrometric satellites such as Hipparcos, which determine the positions of millions of stars to within 3 milliarcseconds. These corrections confirm general relativity indirectly at the 0.3% level. It has also been possible to measure light deflection by the planet Jupiter, though with an accuracy of only about 50%. In cosmology, light deflection (better known as gravitational lensing) is used routinely to weigh dark matter, measure the Hubble parameter and even function as a cosmic "magnifying glass" to bring the faintest and most distant objects into closer view.
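
Both numbers in this paragraph, the deflection at the solar limb and the 4 milliarcsecond figure at right angles to the sun, follow from one weak-field formula. The sketch below assumes a very distant source and an observer at one astronomical unit, for which the deflection at angular separation χ from the sun is (1+γ)(GM/c²r)·cot(χ/2). The constants and function name are illustrative assumptions.

```python
import math

GM_SUN = 1.32712440018e20    # m^3/s^2, solar gravitational parameter
C = 299_792_458.0            # m/s, speed of light
R_SUN = 6.957e8              # m, solar radius (assumed nominal value)
AU = 1.495978707e11          # m, astronomical unit
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi


def deflection_arcsec(elongation_deg: float, gamma: float = 1.0,
                      observer_distance: float = AU) -> float:
    """Deflection of light from a distant source seen at the given angle
    from the sun, in arcseconds (weak-field, first order)."""
    chi = math.radians(elongation_deg)
    angle = ((1.0 + gamma) * GM_SUN / (C**2 * observer_distance)
             / math.tan(chi / 2.0))
    return angle * ARCSEC_PER_RAD


# A ray grazing the solar limb is seen at this angle from the sun's center.
limb_elongation_deg = math.degrees(math.asin(R_SUN / AU))

limb = deflection_arcsec(limb_elongation_deg)
right_angle_mas = 1000.0 * deflection_arcsec(90.0)

print(f"grazing the solar limb     : {limb:.2f} arcsec")
print(f"at 90 degrees from the sun : {right_angle_mas:.1f} milliarcsec")
```

Setting γ = 0 halves the result, which recovers the Newtonian and equivalence-principle value that Einstein obtained in 1911.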

Shapiro Time Delay

Animation of the Shapiro Time Delay

Physicist Kip Thorne describes the
Viking spacecraft measurement of the
Shapiro Time Delay

Photo of Shapiro
Shapiro

Until the early 1960s it seemed that general relativity had been tested experimentally in every way it could, and gravitational physicists were left to focus on mathematical aspects of the theory. A frustrated Richard Feynman wrote to his wife as follows from a conference in 1962: "I am not getting anything out of the meeting. I am learning nothing ... There is a great deal of 'activity in the field' these days, but this 'activity' is mainly in showing that the previous 'activity' of somebody else resulted in an error or in nothing useful ... It is like a lot of worms trying to get out of a bottle by crawling all over each other ... Remind me not to go to any more gravity meetings!" The space age changed all that. In 1964, Irwin Shapiro realized that if general relativity was correct, a light signal sent across the solar system past the sun to a planet or satellite would be slowed in the sun's gravitational field by an amount proportional to the light-bending factor, (1+γ)/2, and that it would be possible to measure this effect if the signal were reflected back to earth. Typical time delays are on the order of several hundred microseconds; this is sometimes referred to as the "fourth classical test" of general relativity. Passive radar reflections from Mercury and Mars were consistent with general relativity to an accuracy of about 5%. Use of the Viking Mars lander as an active radar retransmitter in 1976 confirmed Einstein's theory at the 0.1% level. Other targets included artificial satellites such as Mariners 6 and 7 and Voyager 2, but the most precise of all Shapiro time delay experiments involved Doppler tracking of the Cassini spacecraft on its way to Saturn in 2003; this limited any deviations from general relativity to less than 0.002%, the most stringent test of the theory so far.
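
The quoted size of the delay can be checked with the usual logarithmic approximation for a signal passing close to the sun. For a round trip the extra time is approximately 2(1+γ)(GM/c³)·ln(4·r₁·r₂/b²), where r₁ and r₂ are the distances of earth and target from the sun and b is the closest approach of the ray. The sketch below applies this to a signal grazing the sun on its way to Mars. The Mars distance and the grazing geometry are assumptions chosen for illustration.

```python
import math

GM_SUN = 1.32712440018e20    # m^3/s^2, solar gravitational parameter
C = 299_792_458.0            # m/s, speed of light
R_SUN = 6.957e8              # m, solar radius (assumed nominal value)
AU = 1.495978707e11          # m, astronomical unit
R_MARS = 1.524 * AU          # m, mean sun-Mars distance (approximate, assumed)


def round_trip_delay(r_earth: float, r_target: float, impact: float,
                     gamma: float = 1.0) -> float:
    """Round-trip Shapiro delay in seconds near superior conjunction,
    valid when the impact parameter is much smaller than both distances."""
    return (2.0 * (1.0 + gamma) * GM_SUN / C**3
            * math.log(4.0 * r_earth * r_target / impact**2))


delay = round_trip_delay(AU, R_MARS, R_SUN)
print(f"round-trip delay for a sun-grazing signal to Mars: "
      f"{delay * 1e6:.0f} microseconds")
```

For this geometry the delay comes out at roughly 250 microseconds, in line with the "several hundred microseconds" quoted above.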

The Binary Pulsar

Animation showing gravitational
radiation from a binary pulsar

Pulsars are rapidly rotating neutron stars which emit regular radio pulses as they rotate. As such they act as clocks, which allows their orbital motions to be monitored very precisely. Tests based on these objects are particularly valuable because they allow us to probe gravitational fields stronger than those in our own solar system: not strong-field by any means, but arguably "moderate-field" tests. The first binary pulsar (a pulsar and another object in orbit around each other) was discovered in 1974 by Joseph Taylor and Russell Hulse during a routine search for new pulsars; it goes by the prosaic name B1913+16. The companion is also a compact object, likely a neutron star. As they orbit around each other, both stars are continuously accelerating, which causes them to emit gravitational radiation in the same way that an accelerated electric charge emits electromagnetic radiation. The emission of radiation leads to a loss of energy from the system, causing the two bodies to spiral toward each other and resulting in a gradual speeding up of the pulses from the pulsar. Precise timing measurements allow one to reconstruct three relativistic effects: the average rate of periastron shift, a combination of gravitational redshift and (special-relativistic) time dilation, and the rate of change of orbital period. Together these three pieces of information impose three constraints on the two unknown masses; the extra constraint can then be used to test the theoretical prediction for energy loss. If general relativity is assumed to be valid, then all three constraints are satisfied simultaneously to an accuracy of 0.2% or better. For this work Hulse and Taylor won the Nobel Prize in 1993.
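
The predicted rate of orbital decay can be sketched numerically. The code below evaluates the standard quadrupole-formula result for the period derivative of an eccentric binary (often attributed to Peters and Mathews). The masses, orbital period and eccentricity used for B1913+16 are approximate published values supplied here as assumptions; none of them appear in the text above.

```python
import math

T_SUN = 4.925490947e-6   # s, G * M_sun / c^3 (solar mass expressed in seconds)

# Approximate parameters for B1913+16 (assumed, not taken from the text).
M_PULSAR = 1.4398        # solar masses
M_COMPANION = 1.3886     # solar masses
PERIOD_S = 27_906.98     # s, orbital period (about 7.75 hours)
ECCENTRICITY = 0.6171


def period_derivative(m1: float, m2: float, period_s: float, ecc: float) -> float:
    """Dimensionless rate of change of orbital period (seconds per second)
    from gravitational-wave emission, at quadrupole order."""
    # Enhancement of the radiated power due to orbital eccentricity.
    ecc_factor = ((1.0 + (73.0 / 24.0) * ecc**2 + (37.0 / 96.0) * ecc**4)
                  / (1.0 - ecc**2) ** 3.5)
    mass_factor = m1 * m2 / (m1 + m2) ** (1.0 / 3.0)
    return (-(192.0 * math.pi / 5.0)
            * (2.0 * math.pi * T_SUN / period_s) ** (5.0 / 3.0)
            * ecc_factor * mass_factor)


pdot = period_derivative(M_PULSAR, M_COMPANION, PERIOD_S, ECCENTRICITY)
seconds_per_year = 365.25 * 86_400.0

print(f"predicted dP/dt: {pdot:.3e} s/s")
print(f"orbital period shrinks by about {-pdot * seconds_per_year * 1e6:.0f} "
      f"microseconds per year")
```

The large eccentricity matters here: the eccentricity factor boosts the energy loss by more than a factor of ten compared with a circular orbit of the same period.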

Photo of Russell Hulse
Hulse

Photo of Joseph Taylor
Taylor

Observations of the binary pulsar do not constitute a direct detection of gravitational waves; we see only an energy loss that is consistent with the emission of gravitational radiation in precise agreement with Einstein's quadrupole formula (which, incidentally, he derived while bedridden with stomach ulcers in 1917). Nevertheless it is widely believed that gravitational waves do exist, and hoped that they will be directly detected eventually (see below). As of 2006, nine relativistic binary systems have been discovered with orbital periods of less than a day. Some, like B2127+11C, are virtual clones of B1913+16, while others show promise as further tests of Einstein's theory, like B1534+12 (whose orbital plane is seen almost edge-on) and J1141-6545 (in which the companion star is probably a white dwarf rather than a neutron star). Most fascinating is the recently discovered double pulsar system J0737-3039, in which radio pulses are detected from both stars, giving us so much information that the two masses are constrained by six constraints rather than three, allowing for four independent tests of general relativity. That all four of these tests are consistent with each other is itself impressive confirmation of the theory. After two and a half years of observation, the most precise of them (the Shapiro delay) verifies Einstein's theory to within 0.05%.

Gravitational Waves

Effect of a gravity wave on a ring of particles.

Weber's resonant detector (~1965)

Direct detection of gravitational waves would verify one of general relativity's most striking predictions and open up a new astronomical window on the cosmos. There is good news and bad news about these waves. The good news is that they interact so weakly with matter that they can travel over vast distances without being scattered, potentially bringing us information from the most distant and most violent places in the universe. The bad news is that it is extremely difficult to get them to interact with a detector. Like electromagnetic waves (light), gravitational waves move at the speed of light and are "transverse": they cause test masses to accelerate at right angles to the direction of wave propagation, just as an electromagnetic wave does to test charges. A gravitational wave passing through your computer screen acts on a ring of free particles as shown in the diagram at left. In principle this is simple to detect; the particles behave as if they are being subjected to a strain exactly like a Newtonian tidal force. However, the motions involved are so tiny (the ring is "compressed" by at most 10⁻²⁰) that detecting them is an immense challenge. Early detectors were large metal cylinders designed to respond to such a force resonantly, like a bell. One such detector, a 3100-pound aluminum cylinder built by Joseph Weber in 1963, led to a claimed detection of gravitational waves in 1969, but it could never be duplicated and is generally agreed to have been spurious. Four modern versions of the resonant or "Weber detector" are in operation as of 2006 (ALLEGRO in the U.S.A., AURIGA and NAUTILUS in Italy and EXPLORER at CERN).

February 11, 2016: Press Conference
video on the first detection of
gravitational waves by LIGO

Physicist Kip Thorne describes the LISA
mission to detect gravitational waves

The most sensitive detectors use interferometry to make precise distance measurements. The Laser Interferometer Gravitational-Wave Observatory (LIGO) consists of two L-shaped pairs of interferometers 4 km long, in which beams of laser light measure the difference between the lengths of the two "legs" induced by a gravitational wave. Expected mirror displacements are smaller than the size of an atomic nucleus, and can only be measured with careful suppression of seismic, thermal and shot noise effects. LIGO began operation in August 2002. No gravitational waves have yet been detected, but the experiment is placing useful upper limits on the frequency of possible sources such as exploding supernovae and collisions or coalescences of compact objects like neutron stars and black holes. LIGO was joined by another interferometric detector, GEO 600, in November 2005, and by a third, VIRGO, in May 2007. Other experiments are in various stages of construction around the world, including an upgraded version of LIGO (Advanced LIGO) with at least ten times the initial sensitivity. All these ground-based detectors are sensitive primarily to the high-frequency gravitational waves produced by transient phenomena (explosions, collisions, inspiraling binaries). A complementary Laser Interferometer Space Antenna (LISA) is currently in the planning stages; this will look for the lower-frequency waves from quasi-periodic sources like supermassive black-hole binaries in the final months of coalescence and compact star binaries long before coalescence. LISA is a triangular system of three satellites in solar orbit forming an interferometer with legs millions of km long. It will rely crucially on some of the technology (such as drag-free control) that has been proven by Gravity Probe B.
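
The claim about mirror displacements follows from simple arithmetic. The sketch below multiplies a fractional strain of 10⁻²⁰, the upper figure quoted earlier in this section, by the 4 km arm length and compares the result with a nuclear size of about 10⁻¹⁵ m. The nuclear size is an assumed order-of-magnitude value, and the estimate ignores factors of order unity that depend on the wave's orientation.

```python
import math

STRAIN = 1e-20            # fractional length change, as quoted in the text
ARM_LENGTH_M = 4.0e3      # m, the 4 km interferometer arm
NUCLEUS_SIZE_M = 1e-15    # m, assumed order of magnitude for an atomic nucleus

# Order-of-magnitude change in arm length caused by the passing wave.
displacement_m = STRAIN * ARM_LENGTH_M
ratio = NUCLEUS_SIZE_M / displacement_m

print(f"arm length change: {displacement_m:.1e} m")
print(f"that is about {ratio:.0f} times smaller than an atomic nucleus")
```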

James Overduin, December 2007, Updated April 2016


Gravity Probe B