How Heisenberg invented (discovered?) matrix mechanics
How did Werner Heisenberg create the first comprehensive mathematical apparatus for describing quantum phenomena? Why did he use matrices instead of differential equations, the usual mathematical tool in his time and the one successfully applied by his rival Erwin Schrödinger? The mystery deepens since Heisenberg didn’t even know matrices!
First, we have to say something about myths in history and science. We still regard every action by a politician or a scientist as something planned and logically executed. If I were to describe Heisenberg’s discovery (or was it a creation?) of matrix mechanics the way a Disney studio would, here’s the fairy tale:
Once upon a time, a young and handsome German “Wanderbursche” (travelling youth) sat down and said to himself: I have to invent something useful, perhaps the groundwork for quantum physics. But what do I take as a mathematical foundation? So he thought and brooded and ate nothing and slept neither by day nor by night. Until a compassionate fairy saw his crestfallen face, listened to his murmurs and whispered into his ear: Why don’t you take the formalism of matrix calculus? “What?” exclaimed a startled Heisenberg, and the rest is history.
Actually, it was quite different. Heisenberg admired Einstein’s philosophy, and Einstein admired Mach’s philosophy. The Austrian philosopher, historian of science and eminent physicist Ernst Mach would be classified as a “positivist”. Mach did not believe that science describes reality (which is inaccessible), but only sensory impressions. Physics, according to Mach, is not about facts but about human beings observing facts. Actually, that would be the realm of psychologists, but that’s another story.
Mach contemplating the world. Illustration: Ernst Mach: “Antimetaphysische Vorbemerkungen”. Public domain, color by Peter Ripota
Anyway, Einstein was impressed by Mach’s thoughts and modelled his special relativity (sr) according to these premises. That’s the reason there’s nothing stable in sr (except the speed of light in vacuum), and you read about observers moving relative to other observers. (Later, in general relativity, Einstein reverted to realism).
What can be observed in quantum physics? Only light.
But what is observable in quantum physics? Only light, with its frequency and intensity. Light is created by an electron jumping from an “orbit” (figuratively) of high energy to one of low energy. Since energy has to be conserved, the energy set free has to be emitted from the atom. The difference of energies determines the frequency; the intensity (the square of the amplitude of a light wave) reflects how often the jump occurs. That’s all you may observe or measure.
Jumping of an electron from higher to lower energy level (levels depicted as orbits) causes internal energy to be released as light
But electrons may skip several energy levels. The problem: energy levels can only be calculated via individual jumps, i.e. via a kind of cascade. But how should the two numbers (n→k, k→m) be combined purely mathematically to get the correct value for the jump n→m? Should you add them, multiply them, or combine them in some other way? In addition, for some quantities you need the square of these numbers, for example for the energy (the kinetic energy is known to be mv²/2) and for the intensity (= square of the amplitude). How should you square such a pair of numbers?
Heisenberg subjected the question to thorough analysis, eliminating what he did not need (e.g. the current position of the electron), using formulas that were already known (e.g. the “Ritz combination formula” for frequencies), and concentrating on observable quantities. He considered that what matters in a quantum jump is the probability with which an electron jumps from one orbit to another — you cannot give an exact time or even an exact reason. Now Heisenberg had a mathematical problem. In the case of frequencies, these are added in a cascade, but the probabilities are multiplied. What’s the correct connection between the two?
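The additive rule for frequencies can be checked with a few lines of Python. This is only a sketch: it assumes the Rydberg formula for hydrogen, which is not part of the text above, and the function name `jump_frequency` is made up for the illustration. Frequencies are given in units of the Rydberg frequency, so only the fractions 1/n² matter.

```python
from fractions import Fraction


def jump_frequency(n_upper: int, n_lower: int) -> Fraction:
    """Frequency of the jump n_upper -> n_lower, in units of the Rydberg frequency.

    Assumption for this sketch: hydrogen, where the Rydberg formula
    gives 1/n_lower^2 - 1/n_upper^2.
    """
    return Fraction(1, n_lower**2) - Fraction(1, n_upper**2)


# Direct jump 5 -> 2 versus the cascade 5 -> 4 followed by 4 -> 2.
direct = jump_frequency(5, 2)
cascade = jump_frequency(5, 4) + jump_frequency(4, 2)

print(direct, cascade, direct == cascade)  # 21/100 21/100 True
```

The frequencies of the two partial jumps add up exactly to the frequency of the direct jump, which is what the Ritz combination formula states.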
Let’s remember from math class: The probability of A or B is the sum of the individual probabilities (when A and B exclude each other). The probability of A and B is the product of the individual probabilities (when A and B are independent). In a cascade, all jumps must occur, so the probabilities must be multiplied together. On the other hand, there are different possibilities for a larger jump (it can occur in this way or in that way), so the corresponding probabilities must be added together. Confusing? That’s why it was so hard for Heisenberg to find the correct equations!
An example should clarify the matter. Suppose an electron jumps from orbit 5 to orbit 2. Then there are these possibilities: (5→4) and (4→2) or (5→3) and (3→2)
Now we replace “and” with “times”, “or” with “plus”, as usual in probability theory, and get the very symbolic equation:
(5→2) = (5→4) * (4→2) + (5→3) * (3→2).
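The general equation, for a jump from orbit n to orbit m via every possible intermediate orbit k, is:
(n→m) = Σₖ (n→k) * (k→m).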
Every mathematician immediately knows what the above formula means: the multiplication of matrices. But Heisenberg didn’t know that, because physicists didn’t work with matrices. That’s the story of how he came to matrices. He got there, and then other problems arose.
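The “and becomes times, or becomes plus” rule can be written out as a short Python sketch. The tables `A` and `B` below hold made-up numbers, not physical values, and the function name `combine` is hypothetical. The entry in row n, column m stands for the symbolic quantity (n→m), with orbits simply numbered 0, 1, 2.

```python
def combine(first, second):
    """Combine two square tables of jump numbers.

    For every pair (n, m), multiply along each path n -> k -> m
    ('and' becomes times) and add over all intermediate k
    ('or' becomes plus). This is ordinary matrix multiplication.
    """
    size = len(first)
    return [
        [sum(first[n][k] * second[k][m] for k in range(size)) for m in range(size)]
        for n in range(size)
    ]


# Illustrative numbers only.
A = [[1, 2, 0],
     [0, 1, 3],
     [4, 0, 1]]
B = [[2, 0, 1],
     [1, 1, 0],
     [0, 5, 2]]

product = combine(A, B)
print(product)  # [[4, 2, 1], [1, 16, 6], [8, 5, 6]]

# One entry written out by hand, in the spirit of the (5→2) example:
entry_2_1 = A[2][0] * B[0][1] + A[2][1] * B[1][1] + A[2][2] * B[2][1]
print(entry_2_1 == product[2][1])  # True
```

Note that the sum in the matrix rule runs over every index k, whereas the (5→2) example lists only the orbits lying between start and end.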
There is a special feature when multiplying two matrices A and B: A × B ≠ (is not the same as) B × A, or A × B − B × A ≠ 0. Multiplication is, as they say, non-commutative (non-interchangeable). This irritated Heisenberg enormously. Again, he spent sleepless nights and remarked afterwards:
The fact that xy is not equal to yx caused me great discomfort. I saw this as the only difficulty in the whole scheme, with which I was otherwise completely satisfied. I had written down the Thomas-Kuhn summation rule as the quantization rule, but I was not aware that it was pq − qp.
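That the order of the factors matters is easy to verify on a small case. The following Python sketch uses two arbitrary 2×2 matrices chosen for illustration. They are not Heisenberg’s position and momentum matrices (those have infinitely many rows and columns), and the helper names `matmul` and `subtract` are made up here.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    size = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]


def subtract(a, b):
    """Entry-wise difference a - b of two matrices of equal shape."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Arbitrary illustrative matrices.
X = [[0, 1],
     [1, 0]]
Y = [[1, 0],
     [0, -1]]

XY = matmul(X, Y)
YX = matmul(Y, X)

print(XY)                # [[0, -1], [1, 0]]
print(YX)                # [[0, 1], [-1, 0]]
print(subtract(XY, YX))  # [[0, -2], [2, 0]]
```

The difference XY − YX is not the zero matrix, so the two products really are different.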
But Max Born and his student Pascual Jordan immediately recognized the significance of Heisenberg’s discovery. They expanded on Heisenberg’s formalism, called it “matrix mechanics” and thereby created the first usable, although mathematically complicated, quantum theory. But there are three things to note:
First, the matter was not as simple as I have presented it here. Heisenberg did not think in terms of probabilities, but rather transformed his observable quantities (frequency and intensity) into Fourier series. Each quantity thus becomes an infinite series, the product of two quantities becomes the product of two series, each with an infinite number of summands — mathematically demanding, but no problem for Heisenberg.
Second: Heisenberg could not stick to Mach’s program of only describing observable quantities. There are so many other quantities that are important and whose observability or non-observability is not yet certain. For example, position and momentum (mass times velocity) are part of classical mechanics, and you cannot do without them in quantum physics. And what the matrices actually meant physically was not obvious.
Third: In the years that followed, Heisenberg’s matrix formalism was replaced by Schrödinger’s wave mechanics. Schrödinger’s theory was clear, the physicists liked that, and Schrödinger didn’t rule out anything, not even positions or velocities. In addition, not every physical quantity can be represented as an abstract matrix element. Mass, for example, always remains a simple number, as does time. And that is still a problem today: Neither theory can deal with the phenomenon of “time”. It doesn’t appear at all with Heisenberg; Schrödinger at least later introduced a “time-dependent” wave equation. However, calculations are usually made using the equation without time dependence. In these images of reality there are only snapshots of events before and after a measurement, not an internal, continuous development. This corresponds exactly to the picture that Heisenberg had of nature, but not to what Schrödinger wanted. Only Bohm reintroduced space and time on an equal footing in his realistic quantum mechanics. But that’s another story.
This was an excerpt from my book “Das Rätsel der Quanten … und seine Lösung”