This essay is about two properties that some theories of physics have — determinism and locality — and the gaps that can exist between how they are understood as properties of physical reality, how they are understood as properties of mathematical theories, and how they are formally defined as properties of mathematical theories. I will point out one such gap that seems to have gone widely unremarked, and that could admit an interesting class of physical theories.
I hope (optimistically) to make the essay accessible to both experts and laymen, without boring the former or overwhelming the latter. The subject does not require very advanced technical machinery, as it is actually possible to discuss these properties in relation to quantum mechanics (the modern theory that most challenges them), in some depth, without directly engaging the mathematics of quantum mechanics itself. Some more elementary tools are assumed. The reader should, for example, be able to do simple algebra; know that the sum of the probabilities of all possible outcomes is equal to 1; and know what a cosine and an integral are (especially that an integral is a sort of infinite sum; but you don't need to remember things like the Chain Rule or the Law of Sines). If you've heard that Special Relativity says things can't go faster than light, so much the better.
On the other hand, for readers already well acquainted with Bell's Theorem, it may be helpful to know up front that, ultimately, I will identify a particular class of mathematical theories that have a sort of locality —mathematical locality, but not apparently physical locality— but that do not satisfy the assumptions of the Theorem and therefore are not constrained by Bell's Inequality (and no, this is not related to Joy Christian's work; I'm going to take an orthodox view of Bell's Theorem).
In presenting an overview of my material, and curtailing some nuances so as not to distract from the points most salient to the essay, here and there I will probably have worded things in ways that send informed purists into conniption fits. I sympathize, and when I become aware of such things I try to defuse them, without compromising accessibility to less heavily prepared readers. Constructive feedback is welcome.
It's possible that the insight I mean to bring out in this essay is well known in some circles. I haven't seen it mentioned anywhere, though; and there are good popular discussions of Bell's Theorem out there, by broad-minded writers who presumably would not have omitted a significant fork in the logic if they were aware of it; so the fact that I had to work it out for myself suggests, at least, that its circle of initiates is smaller than it ought to be.
Hidden variables and EPR
Bohm's pilot wave
Bell's Theorem
Higher-order time
Quantum locality
Conclusion
In late-nineteenth-century physics, the physical world was made up of particles and fields in three dimensions of space, evolving over a fourth dimension of time. The four dimensions were a Euclidean continuum — that is, time was just another dimension qualitatively like the three dimensions of space — and the particles were ideal points, i.e., they were infinitely small. At any given time, the particles had definite positions and properties (notably, velocity, mass, and electrical charge), and the fields had definite values at each point in space. All these things evolved deterministically over time; and, moreover, the evolution of a closed volume could be understood entirely in terms of its contents and its interactions with neighboring volumes. In short, the evolution of physical systems over time was deterministic and local.
Except for the part about space-time being Euclidean, Einstein's Special Theory of Relativity actually strengthened this classical view of the physical world. To see why, consider how a finite volume of space — say, a cube one foot on each side — would interact with the rest of the universe over a finite interval of time — say, one nanosecond. (I didn't just pull these units out of a hat, by the way; one foot is, to a pretty good approximation, the distance that light travels in a vacuum in one nanosecond.) How much of the rest of the universe do you have to take into account? A particle traveling at, say, ten feet per nanosecond might be just under ten feet away at the beginning of the time interval, yet inside the volume at the end; indeed, no matter how far away a particle was at the start of the time interval, nineteenth-century physics said that it might just be traveling fast enough to pass through the volume before the end of the interval. Thus, according to nineteenth-century physics, the volume could be affected, before the end of the interval, by particles that were located anywhere in the universe at the start of the interval. Special Relativity, by limiting particles (and waves) to the speed of light, guaranteed that, in order to fully understand the evolution of the volume over an interval of one nanosecond, you could ignore everything in the universe that was more than one foot away from the volume at the start of the interval. Longer time intervals would allow interactions from farther away — for an interval of one second, for example, you'd need to account for particles up to 186 thousand miles out — but that's still a lot better than having to consider the entire content of the universe.
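The arithmetic here is simple enough to check directly. The sketch below is my own illustration (the function name and the choice of units are mine), computing the radius of the region that can influence a volume over a given interval, with and without Special Relativity's speed limit:

```python
# Speed of light, to a good approximation: one foot per nanosecond
# (more precisely about 0.9836 ft/ns, but the round figure is the
# point of the essay's choice of units).
C_FEET_PER_NS = 1.0

def influence_radius_feet(interval_ns, speed_limit=C_FEET_PER_NS):
    """Radius around a volume that must be considered over the interval.

    With a speed limit (Special Relativity), only things within
    speed_limit * interval_ns of the volume at the start of the interval
    can reach it before the end. Without one (nineteenth-century
    physics), the radius is unbounded.
    """
    if speed_limit is None:
        return float("inf")  # anything, anywhere, might arrive in time
    return speed_limit * interval_ns

# One nanosecond: only the surrounding foot of space matters.
assert influence_radius_feet(1) == 1.0

# One second (1e9 ns): about 1e9 feet, on the order of 186 thousand miles.
miles = influence_radius_feet(1e9) / 5280
assert 186_000 < miles < 190_000

# Nineteenth-century physics: the whole universe is potentially relevant.
assert influence_radius_feet(1, speed_limit=None) == float("inf")
```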
Quantum mechanics defies both determinism and locality. Each apparent particle (the emphasis being on "apparent") is described in quantum mechanics by a wave function — a complex-valued field, extending over the space of all possible states in which the particle might be observed, that determines the probability that the particle would be observed in each state. Each possible state of the particle includes its position and momentum; so the space of possible states has at least six dimensions, although only the three dimensions of position are obviously relevant to the notion of locality. According to the mathematics of quantum mechanics, these wave functions do actually evolve in a strictly deterministic manner, in that if you know the wave function of a particle (or of a larger physical system, whose state may encompass the individual states of many particles) at the start of an interval, there is only one way for the wave function to be at the end of the interval — however, this isn't determinism in the classical sense, because it's only the wave function that can only be one way at the end of the interval; you still have only a probability distribution for the particle (or system) itself. As for locality, anything that interacts with a wave function (such as performing an observation of the particle) affects the entire field at once, at all parts of its state space, which appears classically as an instantaneous propagation of the change to all possible positions in the entire universe.
Einstein — who had been, remember, personally responsible for greatly strengthening the locality of classical physics — called the instantaneous-propagation property "spooky action at a distance". His overall assessment of quantum mechanics is summed up in the following oft-quoted passage, from a private communication to Max Born of 4 December 1926.
Quantum mechanics demands serious attention. But an inner voice tells me that this is not the true Jacob. The theory accomplishes a lot, but it scarcely brings us closer to the secret of the Old One. In any case, I am convinced that He does not play dice.

Three things should be fairly clear about Einstein's reaction to quantum mechanics: (1) he didn't like its nondeterminism; (2) he didn't like its non-locality; and (3) his intuition told him it wasn't getting at the fundamental workings of physical reality. Often, in retellings of this chapter in the history of quantum mechanics, Einstein's intuition is portrayed as rejecting the fundamentality of quantum mechanics because he could not accept that reality isn't deterministic or local — but that isn't quite what I just said. When seeking to explain in depth his reasons for doubting quantum mechanics (as opposed to challenging quantum mechanics on specific technical points, which we'll get to in a moment), his discussion was not so much about how the natural world works, but rather about what it means to conceive a theory of the natural world — that is, metaphysics rather than physics. (See, for example, Paul Arthur Schilpp, ed., Albert Einstein: Philosopher-Scientist, 1949, pp. 665–688.) It would still be safe to claim that he couldn't accept nondeterministic non-local physics, provided that his position on the metaphysics really implies his position on the physics. However, suppose a physical description of reality could be organized in some way, different from what he supposed in his discussion, such that it would answer his metaphysical objections without providing the physical properties he was asking for. To claim that he "couldn't accept" the physics, one would then also have to presuppose that his position on the physics was really more important to him than his position on the metaphysics.
That is, one would presuppose that when presented with this alternative description of reality, he would modify or abandon his position on the metaphysics, rather than modify or abandon his position on the physics. If there's any evidence to support that presupposition, I'd be interested to see it; meanwhile, second-guessing Einstein's intuition on anything related to physics seems to me like a supremely bad idea.
Prompted by his disapproval of quantum nondeterminism (whatever the cause of that disapproval), Einstein became an advocate of hidden variable theories. The basic premise of these theories is that the predictions of quantum mechanics are probabilistic only because it is based on an incomplete description of physical reality. Its probability distributions describe what is likely to happen, given that we don't know the values of those physical parameters that are missing from the quantum description, the "hidden variables". A theory of physics accounting for both the parameters of quantum mechanics, and the hidden variables as well, would be deterministic, with only one possible observable outcome from given initial conditions. It might be impossible, even in principle, for any observer from inside physical reality to find the values of these hidden variables (indeed, if quantum mechanics is quite correct, then it must be impossible to find their values); but the hidden variables, and the metaphysical determinism they imply, would be there, nonetheless.
As an attempt to show that hidden variables are really needed to account for quantum mechanics, Einstein and two of his colleagues, Boris Podolsky and Nathan Rosen, devised a thought experiment, now commonly called the EPR paradox. In outline: A centrally located apparatus emits pairs of particles, which travel in opposite directions but are physically constrained, by the nature of the emitter, to have opposite spin (another property of particles in quantum mechanics, like mass or charge... only different; there will be more about spin later in the essay). Two observers, Alice and Bob —whose names conveniently allow them to be represented in a diagram as A and B— stand some distance away on opposite sides of the emitter, and observe the spins of the emitted particles that come their way. As long as an emitted pair of particles remains unobserved, their spins are described in quantum mechanics by a wave function, which is to say (sort of) that their spins are depicted as having only probability distributions, not definite values. When the emitted particles reach Alice and Bob, however, their spins may be observed, at substantially the same time. At that moment, their spins are abruptly represented by definite values, rather than probability distributions — and those definite values are correlated with each other. Yet, Alice and Bob are too far away from each other for information to have passed from either observation event to the other at the speed of light. Therefore, either information was passed between the observations at arbitrarily high speeds (presumed absurd, since it would violate Special Relativity), or the correlation must have been prearranged by hidden variables, perhaps determined at the emitter and propagated to both observation events as hidden attributes of the particles.
One might synopsize the EPR paradox as arguing that without hidden variables —the device needed to restore determinism— the experiment would violate locality. Justifying one's belief in hidden variables by invoking locality, though, only works if one can first justify one's belief in locality; otherwise, in a non-local world, the EPR paradox says nothing one way or the other about the existence of hidden variables.
In 1932, John von Neumann, as part of his formulation of an axiomatic foundation for quantum mechanics, proved a theorem that no deterministic hidden variable theory can be experimentally indistinguishable from quantum mechanics. In 1952, David Bohm published a deterministic hidden variable theory experimentally indistinguishable from quantum mechanics.
They were, at least in some sense, both right. Von Neumann's proof correctly showed that a certain class of formal theories are necessarily experimentally distinguishable from quantum mechanics. Bohm's theory is experimentally indistinguishable from quantum mechanics. Therefore, one can immediately deduce, quite rightly, that Bohm's theory does not belong to the class of theories addressed by von Neumann's theorem. The moral of the story, or at least one such, is that theorems of the form "no scientific theory of type X can do Y", where type X is understood semantically (rather than formally), should be taken with a grain of salt — because any proof of the theorem must be limited by a formal definition of type X, while a counterexample to the theorem is limited only by human understanding of type X. Since human understanding is more flexible than formalism (indeed, formalism isn't flexible at all), sooner or later someone is liable to find a gap between them where a counterexample can be wedged in.
In broad outline, Bohm's theory worked like this: For each quantum (i.e., "apparent particle"), Bohm posited both a particle and a field. The field, called the quantum potential field of the particle, was simply a reformulation of the wave function, and evolved according to the usual rules of quantum mechanics. The particle had a definite location at all times, and its behavior was guided by its quantum potential field. By presenting the quantum wave function as a classical potential field, Bohm produced identically the predictions of quantum mechanics — but in doing so, he had a potential field that could propagate changes instantaneously across all of space. He had achieved determinism by abandoning locality. Einstein, on seeing Bohm's theory, remarked, "This is not at all what I had in mind."
When presented with a new theory that is experimentally indistinguishable from pre-existing theory, one might fairly ask what useful purpose the new theory can serve. Critics of Bohm's theory certainly asked. Moreover, the question has come up again in recent years, as proponents of superstring theory, looking to justify their faith in a theory that hasn't produced experimentally testable predictions, have suggested that one may prefer one of two experimentally indistinguishable theories based on which one is more mathematically elegant.
Lest this question be perceived by some readers as more decisive than it necessarily is, I suggest the following brief list of possible uses for a new, experimentally indistinguishable theory. One ought, I think, to be able to judge the usefulness of any given possible attribute of theories (such as mathematical elegance) according to what it contributes to each of these. Bohm's theory, by the way, is only obviously subject to use (1), and perhaps use (4).
(1) The new theory may demonstrate a philosophical point, as with Bohm's demonstration that quantum mechanics isn't inherently nondeterministic. This might sound like a one-shot deal, in that once the demonstration has been made, the theory would have no further use; but once created, the theory might also be useful in one of the other capacities below.

(2) The new theory may make some problems easier to solve, either analytically (using symbolic math) or numerically (using computers), than they would be under the pre-existing theory. If this is the case, it is likely to go both ways; i.e., there may also be some problems that are easier to solve using the pre-existing theory.

(3) The new theory may suggest new experimentally testable predictions. This could happen simply because the different form of the new theory inspires scientists to think of questions that hadn't occurred to them before; or it could happen because, following from use (2), the new predictions would have been too difficult to work out under the old theory. Assuming that the old and new theories are really theoretically equivalent, an experimental refutation of the new theory would also refute the old, even though the old theory might itself have never directly made any refutable prediction.

(4) The new theory may suggest alternative strategies for the invention of rival theories, i.e., theories that disagree with the old theory in experimentally testable ways. Although the devising of rival theories may sound like a revolutionary activity, reserved for times of scientific crisis when the old theory has been experimentally refuted, in fact rival theories can be part of normal scientific work, closely related to use (3), above. New experimental tests of a theory aren't of much interest if they simply do things at random and hope that their consequences will say something useful about the theory. A good experimental test of a theory is one that distinguishes its predictions from those of a rival theory — which requires that a rival be devised, if only as a straw man to be torn down (or startlingly confirmed).
While Bohm's theory established that quantum mechanics does not require nondeterminism, it conspicuously did not resolve whether or not quantum mechanics requires non-locality. In 1964, John Bell proposed a variation on the EPR paradox that would allow an experimental test to distinguish the predictions of quantum mechanics from those of any local hidden variable theory. This is commonly stated in the form of a theorem, called Bell's Theorem:
No physical theory of local hidden variables can reproduce all of the predictions of quantum mechanics.

This was a fairly spectacular result (Einstein's reaction to which is a moot subject of speculation, since Einstein had died nearly a decade earlier); but keep in mind, throughout the following explanation of Bell's reasoning, that his result is of the same form that got von Neumann's theorem into trouble: "no scientific theory of type X can do Y", where "type X" is understood semantically. In fact, the ultimate purpose of this essay will be to point out a particular class of physical theories that could reasonably be described as "local hidden variable theories", but that do not belong to the class of theories addressed by Bell's Theorem.
(One of the most remarkable things about Bell's proof is that most of the math involved is absurdly simple algebra, on the order of statements like "(−a) − (−b) = b − a". Outside the actual derivation of the quantum mechanical predictions for the experiment —which I will omit— there's really nothing in the proof that should be out of reach for the level of background I'm pitching this essay to. My description draws on the excellent treatment in the Afterword of D.J. Griffiths' Introduction to Quantum Mechanics, Prentice Hall, 1994.)
Bell's experiment is much like the one described earlier for the EPR paradox (which is actually a simplified version of the original EPR experiment, due to David Bohm); but where Einstein, Podolsky, and Rosen were only concerned with the fact that Alice and Bob's observations were correlated with each other, Bell looked closely at exactly how they were correlated. Bell's reasoning pivoted on a peculiarly non-classical behavior inherent in the particle attribute being measured, spin.
The spin of a particle appears, superficially, to be a simple directed magnitude; but you can't actually measure the direction itself, only the component of the magnitude in some particular direction. This could still be okay: classically, if the spin has direction v and magnitude s, and you measure it in a direction at angle θ away from v, you would expect the component in that direction to be s cos(θ). What's odd —that is, inherently non-classical— about spin is that, apparently, no matter what value θ has, your measurement of the magnitude will always come out to one of just a small set of values — in this case (with s = 1/2), either s or −s. The angle θ only determines how likely your measurement is to be positive rather than negative. If θ is zero, the measurement is sure to be positive; if θ = π (that is, π radians, which is 180 degrees, meaning that you're measuring in exactly the opposite direction from the direction of the spin), the measurement is sure to be negative; and in between these extremes, the probability of a positive measurement is (1 + cos(θ)) / 2 — so that the expected value of the measurement (i.e., the average value you'd get if you could do the measurement many times under exactly the same circumstances) is s cos(θ).
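Both claims in the last paragraph — that every individual measurement comes out to ±s, while the average still tends to s cos(θ) — can be spot-checked with a short Monte Carlo sketch. This is entirely my own illustration; the function name and the choice of numbers are made up:

```python
import math
import random

def measure_spin(s, theta, rng):
    """One quantum spin measurement at angle theta from the spin direction.

    The outcome is always +s or -s; theta only sets the probability
    (1 + cos(theta)) / 2 of getting the positive result.
    """
    return s if rng.random() < (1 + math.cos(theta)) / 2 else -s

rng = random.Random(0)
s, theta, n = 0.5, math.pi / 3, 200_000
average = sum(measure_spin(s, theta, rng) for _ in range(n)) / n

# Every single measurement is +0.5 or -0.5, yet the average converges
# on the classical-looking value s*cos(theta) = 0.25.
assert abs(average - s * math.cos(theta)) < 0.01
```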
This seems a rather elegant and disarming result, suggestive that perhaps the situation isn't quite as non-classical as it had seemed: even though s cos(θ) isn't what you measure on any one occasion, it is still the expected value of the measurement. However, Bell showed that, on closer analysis, the situation is even more non-classical than it had seemed.
Suppose Alice measures the spins of particles using a detector oriented in direction a, and Bob measures the spins of particles using a detector oriented in direction b. Rather than having them always use the same direction, a = b, suppose that directions a and b are skewed from each other by some angle θ. Each pair of emitted particles will have opposite spin of uniformly random direction. We're interested in the relationship between the measurement by Alice, normalized by dividing by s — call that a — and the measurement by Bob, again normalized by dividing by s — call that b. The value of a is either 1 or −1, and the value of b is either 1 or −1. The particles have spin in exactly opposite directions; so if Alice and Bob measure in exactly the same direction, θ = 0 and a = b, then the product ab will always be −1. If Alice and Bob measure in exactly opposite directions, θ = π and a = −b, then the product ab will always be 1. In general, the expected value of the product ab will be E(ab) = −cos(θ).
(This is one place where informed purists are especially at risk of conniption fits. You don't really get this expected value of ab, as a function of the angle between Alice and Bob's detectors, by reasoning from what I said earlier about the expected value of a single spin measurement as a function of the angle between its detector and the direction of spin. There's nothing wrong with my account of E(ab) itself; the trouble is that, according to quantum mechanics, there is no such thing as the direction of spin as an entity distinct from any measurement. I introduced it as a rhetorical device, in pointing out how nearly classical spin can seem; but no "actual direction of spin" occurs in the quantum mechanical description of the system — it would have to be a hidden variable.)
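The prediction E(ab) = −cos(θ) can likewise be spot-checked numerically. The sketch below is my own illustration, and deliberately models no mechanism at all: it samples the product ab directly from the probability P(ab = +1) = (1 − cos(θ)) / 2, which is just the distribution a ±1-valued product must have for its expected value to come out to −cos(θ):

```python
import math
import random

def sample_ab(theta, rng):
    """Sample the product ab for one emitted pair, per quantum mechanics.

    Draws directly from the quantum probability P(ab = +1) =
    (1 - cos(theta)) / 2; no mechanism behind it is being modeled.
    """
    return 1 if rng.random() < (1 - math.cos(theta)) / 2 else -1

rng = random.Random(1)
n = 200_000
for theta in (0.0, math.pi / 4, math.pi / 2, math.pi):
    avg = sum(sample_ab(theta, rng) for _ in range(n)) / n
    # Same detector direction: always -1; opposite directions: always +1;
    # in between: E(ab) = -cos(theta).
    assert abs(avg - (-math.cos(theta))) < 0.01
```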
Next, Bell introduces his notion of a local hidden variable, as a series of assumptions. Suppose that some hidden variable, which Bell calls λ, is established at the time the pair of particles are emitted, and will ultimately determine the outcomes of both spin measurements. Suppose that each of the measurements, a and b, is a function of the hidden variable λ and the orientation of the detector; that is,

a = A(a,λ)
b = B(b,λ).

Finally, suppose that λ is determined independently of the detector orientations a and b; this makes λ the only possible source of correlation between measurements a and b, and thus makes the theory local. Using some of the most elementary things we know about the measurements and their correlations with each other and the orientations of the detectors,

A(v,λ) = ±1
B(v,λ) = ±1
A(v,λ) B(v,λ) = −1.

Therefore,

A(v,λ) = −B(v,λ)
A(v,λ) A(v,λ) = 1.

We haven't specified just what kinds of values λ can take on; they might be real numbers, or complex numbers, or something more exotic, like wave functions; but whatever the possible values of λ look like, there are liable to be an infinite number of them, since the probability distribution of the experimental outcomes is continuous rather than discrete. If there were only finitely many possible values of λ, we could describe the probability distribution of λ by simply specifying for each possible value a finite probability, between 0 and 1, such that the sum of all these probabilities added up to 1. With a possibly-infinite number of values, each individual value may be infinitely unlikely; but we can still describe which regions of the space of possible values are more likely, by means of a probability density function. This is just a function, called ρ, that maps each possible value of λ to a non-negative real number ρ(λ) indicating the relative likelihood that that value of λ will occur. Just as, in the finite case, the probability that λ would be one of a given set of values was the sum of the probabilities of those values, in the infinite case the probability that the value of λ will fall within a given region is the integral of ρ over that region. In particular, the integral of ρ over the entire space of possible values of λ is 1,

∫ ρ(λ) dλ = 1.
The expected value of the product of observations, E(ab), is evidently a function of the two detector orientations, E(ab) = P(a,b). In the finite case, P(a,b) would be the sum, for each possible value of λ, of the value of ab at that λ, A(a,λ) B(b,λ), times the probability of that λ. In the infinite case, P(a,b) is the integral, over all possible values of λ, of the value of ab at that λ, A(a,λ) B(b,λ), times the probability density at that λ, ρ(λ). That is,
P(a,b) = ∫ ρ(λ) A(a,λ) B(b,λ) dλ.

Substituting B(b,λ) = −A(b,λ),

P(a,b) = − ∫ ρ(λ) A(a,λ) A(b,λ) dλ.
Bell now asks how the expected value changes if we reorient Bob's detector from direction b to a third direction c, while leaving Alice's alone. That is, he derives an expression for P(a,b) − P(a,c).
P(a,b) − P(a,c) = ( − ∫ ρ(λ) A(a,λ) A(b,λ) dλ ) − ( − ∫ ρ(λ) A(a,λ) A(c,λ) dλ )
= − ∫ ρ(λ) ( A(a,λ) A(b,λ) − A(a,λ) A(c,λ) ) dλ.

Using the identity A(v,λ) A(v,λ) = 1,

A(a,λ) A(b,λ) − A(a,λ) A(c,λ) = A(a,λ) A(b,λ) − A(a,λ) A(b,λ) A(b,λ) A(c,λ)
= (1 − A(b,λ) A(c,λ)) A(a,λ) A(b,λ).

The two factors in the product, (1 − A(b,λ) A(c,λ)) and A(a,λ) A(b,λ), can be bounded because A(v,λ) = ±1:

0 ≤ 1 − A(b,λ) A(c,λ)
−1 ≤ A(a,λ) A(b,λ) ≤ 1.

Therefore,

| (1 − A(b,λ) A(c,λ)) A(a,λ) A(b,λ) | ≤ 1 − A(b,λ) A(c,λ).

Now everything is in place, and with a few simple algebraic manipulations of integrals, we'll have Bell's Inequality.
| P(a,b) − P(a,c) | = | − ∫ ρ(λ) ( A(a,λ) A(b,λ) − A(a,λ) A(c,λ) ) dλ |
= | ∫ ρ(λ) ((1 − A(b,λ) A(c,λ)) A(a,λ) A(b,λ)) dλ |
≤ ∫ ρ(λ) | (1 − A(b,λ) A(c,λ)) A(a,λ) A(b,λ) | dλ
≤ ∫ ρ(λ) (1 − A(b,λ) A(c,λ)) dλ
= ( ∫ ρ(λ) dλ ) − ( ∫ ρ(λ) A(b,λ) A(c,λ) dλ )
= 1 + P(b,c).

Bell's Inequality is just this without the intermediate steps:

| P(a,b) − P(a,c) | ≤ 1 + P(b,c).

We've proven this inequality directly from our assumptions about local hidden variables, and a few basic facts about the behavior of the experiment. What we haven't used yet is our knowledge that, according to quantum mechanics, P(x,y) in general is minus the cosine of the angle between x and y. Suppose that a and b are perpendicular to each other, with c bisecting the plane angle between them. Then the angle between a and b is π / 2, and the angle between c and each of the others is π / 4. Therefore,

P(a,b) = 0
P(a,c) = P(b,c) = −sqrt(1/2).

The square root of 1/2 is slightly more than 0.707. Plugging these choices of a, b, and c into Bell's Inequality,

| P(a,b) − P(a,c) | ≤ 1 + P(b,c)
sqrt(1/2) ≤ 1 − sqrt(1/2)
0.707 ≤ 0.293,

which is plainly false. In other words, the numerical predictions of quantum mechanics for this experiment do not satisfy Bell's Inequality, and therefore they cannot possibly result from any hidden variable theory that satisfies Bell's assumptions.
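To see what kind of theory Bell's assumptions describe, and how it falls short of quantum mechanics, here is a toy local hidden variable model (a construction often used to illustrate the theorem, not anything from this essay's derivation): λ is a direction in the plane, Alice's outcome is the sign of the cosine of the angle between her detector and λ, and Bob's outcome is the opposite. Numerically, the model reproduces perfect anticorrelation and satisfies Bell's Inequality, but its correlation varies linearly with θ rather than as −cos(θ), so it is experimentally distinguishable from quantum mechanics:

```python
import math

# Hidden variable: a direction in the plane, swept over a uniform grid
# (a deterministic stand-in for "uniformly random direction").
N = 100_000
LAMBDAS = [2 * math.pi * k / N for k in range(N)]

def A(x, lam):
    """Alice's outcome: +1 or -1, fixed entirely by her angle x and lambda."""
    return 1 if math.cos(x - lam) >= 0 else -1

def B(x, lam):
    """Bob's outcome: always the opposite of what Alice would get at x."""
    return -A(x, lam)

def P(x, y, lambdas=LAMBDAS):
    """Expected value of the product ab under this hidden variable model."""
    return sum(A(x, lam) * B(y, lam) for lam in lambdas) / len(lambdas)

a, b, c = 0.0, math.pi / 2, math.pi / 4  # a and b perpendicular, c bisecting

# Perfect anticorrelation when the detectors are aligned, as required.
assert P(a, a) == -1.0

# Bell's Inequality holds for this model (to within grid error)...
assert abs(P(a, b) - P(a, c)) <= 1 + P(b, c) + 1e-3

# ...but its correlation at pi/4 comes out near -1 + 2*(pi/4)/pi = -0.5,
# not the quantum mechanical -cos(pi/4) = -0.707, so the model is
# experimentally distinguishable from quantum mechanics.
assert abs(P(a, c) - (-0.5)) < 0.01
assert abs(P(a, c) - (-math.sqrt(0.5))) > 0.1
```

It is no accident that the model falls exactly on the boundary of Bell's Inequality for these detector angles; what no model of this kind can do is cross it.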
Besides distinguishing the predictions of quantum mechanics from those of local hidden variable theories (in Bell's sense), Bell's Inequality also provides an experimental means to test whether or not we live in a physical world that can be described by a local hidden variable theory. Build the Bell experimental apparatus, run the experiment very many times, and see what the shape of the statistical distribution of results is. If the observations violate Bell's Inequality to a high degree of statistical confidence, then we are just that confident that we don't live in a world that can be accurately described by a local hidden variable theory. Experimental physicists have actually done this, and concluded that we don't live in a world described by local hidden variables — though, of course, this is a tremendously delicate experiment, and, in keeping with the principles of experimentally based skepticism that make science work, the reliability of the experimental procedures in every Bell test to date is challenged by some critics.
(This may sound disconnected from what I've just been talking about, but it will lead back into the main stream of the essay further down.)
A classic argument against the concept of time travel is the grandfather paradox. It works like this: Suppose you build a time machine, travel back in time, meet your father's father before he meets your father's mother, and kill him. (Yes, it's a rather violent thought experiment, but it's traditional.) Since your grandfather died before having children, your father would never be born, therefore you would never be born, therefore you wouldn't build a time machine, go back in time, and kill your grandfather — in which case, since he wasn't killed, you would be born, build a time machine, and go back and kill him. In short, your birth causes you not to be born, while your not being born causes you to be born, in a vicious circle of causation.
Some modern physical theories do allow the possibility of time travel, so they have to take some position that resolves the grandfather paradox. The usual solution is to postulate that all parts of physical reality are always consistent with each other. Just as, in the Bell experiment, the equations of quantum mechanics will not permit any solution that fails to correlate Alice's observations with Bob's, in our time-travel scenario, the equations that govern the flow of history (whatever those equations are) will not permit any solution that involves a grandfather paradox. If you go back in time and kill your grandfather, then it must be that your grandfather was killed at that time, by you, and somehow this didn't prevent you from existing. If one were plotting a science fiction story, one could imagine really outre explanations for this, involving, say, highly advanced genetic technologies, but it really isn't necessary to do so. There is certainly some explanation, because if there weren't, it all wouldn't have happened. When people started talking about the grandfather paradox, for example, there was no such thing as a sperm bank.
Each of these scenarios, in which there is some neat explanation for how it happens that you exist even though you went back in time and killed your grandfather, is an example of a stable state of history, that is, a state in which no part of the scenario causes any part to change. In any given physical system that is subject to change —specifically, deterministic change— there may be one or more stable states, which are determined by how change to the system works. As the system jostles about, if it ever happens to enter a stable state, it will stay there — because saying that the system won't leave that state on its own is just a different way of explaining what we mean by "stable". Determinism is generally implied because without it, there would always be a chance that the system would randomly leave an otherwise stable state, so that no state would be entirely stable. (This implication also degrades gracefully: approximate stability implies at least approximate determinism.) In classical physics, when you see a physical system in a complicated stable state —a state that looks fragile— you expect that it got to that state by jostling around through unstable states until it happened into the stable one. We can use this same technique to account for the stable state of history as well, i.e., we can assume that history will jostle around unstably until it enters a stable state — but before we can make that assumption, we have to answer a more basic question: what does it mean to say that history "changes"?
Whenever we say that something changes, we mean that it changes along some dimension; that is, its state at one coordinate along that dimension is different from its state at another coordinate along that dimension. Usually the dimension is understood from context. When we say that a board varies in thickness, we probably mean that its thickness varies along its length; perhaps it's thicker toward one end than toward the other. When we say that a star varies in brightness, we probably mean that its brightness varies over time. When we talk about history changing, we need to explain along what dimension it changes. It doesn't make any logical sense, to me anyway, to talk about something changing along a dimension that is part of the thing that changes; and history is a four dimensional structure, spread out over all of the three dimensions of space and one of time; therefore, if history changes, it can't be changing over any of those four dimensions.
So let's postulate an additional dimension over which history changes. For this essay, I'll call it meta-time. In the following table, I've tried to describe how meta-time copes with the grandfather paradox. History evolves, over meta-time, through a sequence of states; and in each state of history, the three-dimensional universe varies over time (that's ordinary time) through a sequence of states. Each column of the table is a moment in time, and each row of the table is a moment in meta-time. Each entry describes what happens at a given moment in time, in the version of history that exists at a given moment in meta-time.
| | Time T1 | Time T2 | Time T3 |
|---|---|---|---|
| M1 | Your grandfather is minding his own business. | You are born. | You get in your time machine, and embark on a journey into the past. |
| M2 | You kill your grandfather. | You are born. (The consequences of your killing your grandfather have to propagate across history, and haven't gotten this far yet.) | You get in your time machine, and embark on a journey into the past. |
| M3 | You kill your grandfather. | You aren't born, because your father never existed to sire you. | You get in your time machine, and embark on a journey into the past. |
| M4 | You kill your grandfather. | You aren't born, because your father never existed to sire you. | You don't get into your time machine, because you never existed. |
| M5 | Your grandfather is minding his own business. | You aren't born, because your father never existed to sire you. | You don't get into your time machine, because you never existed. |
| M6 | Your grandfather is minding his own business. | You are born. | You don't get into your time machine, because you never existed. |
| M7 | Your grandfather is minding his own business. | You are born. | You get in your time machine, and embark on a journey into the past. |
State M7 is identical to M1, and the sequence starts all over. It might seem that history would simply loop indefinitely through this same sequence of states, without ever reaching a stable state (stable states of history being, remember, the reason we postulated meta-time in the first place); but I'll come back to that.
The reason this meta-time evolution de-paradoxifies the grandfather paradox is that when you change history by killing your grandfather, that change doesn't instantly affect all of history. It propagates forward across time, at a finite rate (measured in displacement across time per unit of meta-time), and thereby prevents any circular causation. The event at M2-T2 (which is you being born) is not causally affected by events, such as M2-T1, that are earlier in time but not earlier in meta-time; it is affected by events, such as M1-T1, that are earlier in both time and meta-time. We already have a name for the property that causation propagates at a finite rate: it's called locality.
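The propagation described here can be mimicked with a tiny local update rule: each moment of time is recomputed from the state of the previous moment of time at the previous moment of meta-time. The boolean encoding below is mine, purely illustrative, but it reproduces the table row by row:

```python
def step(history):
    """One meta-time step. history = (killed, born, departed): whether, at
    times T1, T2, T3 respectively, your grandfather is killed, you are
    born, and you depart in your time machine."""
    killed, born, departed = history
    return (
        departed,    # T1: he is killed iff, a meta-moment ago, you departed at T3
        not killed,  # T2: you are born iff, a meta-moment ago, he survived T1
        born,        # T3: you depart iff, a meta-moment ago, you were born at T2
    )

m = (False, True, True)      # M1: minding his business; born; departing
states = [m]
for _ in range(6):
    m = step(m)
    states.append(m)
# states[1] is M2 (killed, born, departing), and so on down the table;
# states[6] is M7, identical to M1, so history cycles with period six
```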
Admittedly, locality in this case is rather different from the sort that Bell and Einstein were considering. To completely disambiguate the sense of locality, I could specify both the dimension over which change occurs, and the dimension(s) in which effects are localized: the sort presented so far in this section is locality in time, over meta-time, while the sort discussed earlier was locality in space, over time. When the intent seems clear from context, I'll leave off as many qualifications as I can. Locality over time always localizes in space. For this section, locality over meta-time always localizes in time, and it often doesn't matter whether it also localizes in space; and in later sections, locality over meta-time will always localize in both time and space; so I'll usually just say "locality over meta-time".
Locality over meta-time precludes locality over time. This is because locality over time is a property of history (that is, the state of history at any particular moment in meta-time) that constrains the way the three-dimensional universe evolves over time — but given meta-time locality, there is no causal connection between the states of the universe at different times but the same meta-time. That was why at M2-T2 you could be born even though at M2-T1 your grandfather was killed before meeting your grandmother (disarming the paradox).
Note that, whereas Bell asked whether quantum mechanics could tolerate locality (to which his answer was "no"), our meta-time solution of the grandfather paradox doesn't just tolerate it, but exploits it as a positive asset (because without it, the paradox wouldn't be solved).
One can similarly distinguish between determinism over meta-time, which we consider here, and determinism over time, which Einstein and Bohm considered. Again, locality over meta-time precludes causation across time within any given history, and therefore precludes determinism over time. However, while locality over meta-time destroys these properties over time in general, determinism over meta-time actually enables them to be recovered in a limited form. Remember that we postulated meta-time so that we could have stable states of history — and determinism is what generally enables the possibility of stable states. Given determinism over meta-time, the rules by which history changes over meta-time might be set up in such a way that, in any stable version of history, the state of the three-dimensional universe at any given time can be predicted entirely from states of the universe at preceding times. Thus, determinism over time might hold for all stable states of history, even though it doesn't hold for arbitrary states of history — and, for that matter, some rules for how history changes would similarly endow stable states of history with locality over time. (Of course, if stable states of history are both deterministic and local over time, and are also consistent with quantum mechanics, then they must violate some other precondition of Bell's Theorem, most likely by correlating λ with a and b.)
I promised to come back to the question of whether the evolution of history might get stuck in a loop of repeating states, preventing any stable version of history from being reached, as suggested by the table. If the change to history that you have made by killing your grandfather is the only factor causing history to change, then a loop of states of history does seem possible; but if many different details of history are changing, then they will interact in chaotic ways, and it becomes unlikely that a cycle of states of history would repeat itself perfectly. The more details of history are changing at once, the more unlikely a cycle of states of history becomes. It would therefore seem desirable, for this purpose at least, that the rules governing history should involve common occurrences in which effects propagate backward through time — perhaps time travel in the routine interactions of elementary particles, thus creating a great deal of random "noise" in the evolution of history that would tend to spoil any cyclic pattern. Effects propagating backward through time are also interesting because they allow a set of laws, both deterministic and local over meta-time, to produce stable versions of history that violate Bell's Inequality — which is the final subject of the essay.
The main point of the essay is now at hand: to show that a set of rules for the evolution of history can be deterministic over meta-time, and local over meta-time in both time and space, while stable versions of history satisfy the quantum-mechanical prediction for the Bell experiment, therefore violating Bell's Inequality. Despite its far-reaching potential implications (which I will discuss, briefly, below in the conclusion), the result I'm showing here is itself very limited. The rules I'll propose don't cover any kind of situation except Bell's experiment; and even in that case, they are entirely ad hoc, merely serving to demonstrate that Bell's Inequality does not necessarily hold for this type of theory.
First, I'll describe rules that are local over meta-time and consistent with the quantum-mechanical predictions for the experiment, but are not deterministic. These rules do give rise to stable versions of history, but call for repeated generation of uniformly random numbers until a stable state is reached. Then, to complete the demonstration, I'll simply point out how the entire indefinite sequence of random numbers could be extracted from a single uniformly random initial condition.
The rules are only concerned with three positions in space-time: E, the time and place where the emitter emits the pair of particles; A, the time and place where Alice's detector measures the spin of one particle; and B, the time and place where Bob's detector measures the spin of the other particle.
The behavior of E (the emitter at the start of the experiment) is governed by three states:

1. Send a signal in the directions of space-time positions A and B, telling them to generate observations and report back the results. Then enter state (2).
2. Wait to receive results of observations from both directions — the direction of A and the direction of B. These results are spatial directions; call them x and y. Once both are received, enter state (3).
3. Having received x and y, generate a uniformly random number λ in the range −1 ≤ λ ≤ +1. If the dot product of x and y —that is, the product of their magnitudes times the cosine of the angle between them— is greater than λ, enter state (1); otherwise, return to state (2).
Each observer at the end of the experiment, X (which is either A or B), maintains a memory of the orientation of its detector, v (which is either a or b), and the value observed, s (which is either −1 or +1). Whenever X receives a signal from E telling it to report an observation, it chooses a new value for s, uniformly at random, and then sends back to E the product s v.
At meta-time M0, no signals are in transit, and E is in state (1).
Each time E sends out requests, A and B will randomly generate new observations, and report back to E the products of these observations with the detector orientations. Once the replies reach E, it will randomly decide whether to accept or reject them. If it rejects the observations, it sends out fresh requests, and the whole transaction starts over again. If it accepts the observations, it returns to state (2), and the history of the experiment is then stable — because E is waiting for signals from A and B, which are complacently waiting for a signal from E.
Starting from meta-time M0, there is no finite upper bound on how many times the emitter–observer transaction may be repeated; but because the probability of reaching a stable state on any given iteration has a fixed, non-zero value (which we will calculate in a moment), there is a one-hundred percent probability that eventually a stable history of the experiment will be reached.
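These rules can be sketched as a simple rejection loop over meta-time. In this sketch I assume unit-length detector orientations separated by angle theta, so the dot product of the reported values reduces to a product of signs times cos(theta); the function and variable names are mine, not part of the rules:

```python
import math, random

def stable_history(theta):
    """Evolve the experiment's history over meta-time until it is stable.
    theta is the angle between the (unit) detector orientations a and b.
    Returns the two observations in the eventual stable history."""
    while True:
        # E signals A and B; each generates a fresh observation at random
        alpha = random.choice((-1, +1))        # A's observed value
        beta = random.choice((-1, +1))         # B's observed value
        # A reports x = alpha a, B reports y = beta b, so
        # x . y = alpha beta cos(theta)
        dot = alpha * beta * math.cos(theta)
        lam = random.uniform(-1.0, 1.0)        # E's uniformly random lambda
        if dot <= lam:
            return alpha, beta                 # E accepts: history is stable
        # otherwise E rejects, sends fresh requests, and the loop repeats
```

Each pass of the loop is one emitter–observer transaction; since each pass has a fixed nonzero probability of acceptance, the loop terminates with probability 1.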
Let a and b be the detector orientations, and let θ be the angle between them (all of which we are assuming to be constant throughout meta-time). On any given iteration, let α and β be the observations generated by A and B, each either −1 or +1 (I use distinct symbols so as not to confuse the observations with the orientations). The probability that αβ = +1 is 1/2, and the probability that αβ = −1 is 1/2. The reported values are x = α a and y = β b, so the dot product of x and y is αβ cos(θ), and the probability of acceptance is just the probability that αβ cos(θ) ≤ λ. Then,

αβ cos(θ) ≤ λ iff −λ ≤ −αβ cos(θ) iff 1 − λ ≤ 1 − αβ cos(θ) iff (1 − λ)/2 ≤ (1 − αβ cos(θ))/2.

Since λ is uniformly distributed on the interval from −1 to +1, (1 − λ)/2 is uniformly distributed on the interval from 0 to 1. Moreover, for any real number p in the interval from 0 to 1, the probability that (1 − λ)/2 ≤ p is just p. Therefore, the probability that E will accept on any given iteration, as a function of the observations α and β, is (1 − αβ cos(θ))/2.

The probability that αβ = +1 is 1/2, as noted earlier; and if αβ = +1, then the probability of acceptance is (1 − cos(θ))/2; therefore, the probability that αβ = +1 and E accepts is the product of these, (1 − cos(θ))/4. Similarly, if αβ = −1 then the probability of acceptance is (1 + cos(θ))/2; therefore, the probability that αβ = −1 and E accepts is (1 + cos(θ))/4. Altogether, the probability that E accepts is the sum of the probability that αβ = +1 and E accepts, plus the probability that αβ = −1 and E accepts, which is

(1 − cos(θ))/4 + (1 + cos(θ))/4 = (1 − cos(θ) + 1 + cos(θ))/4 = 1/2.

The probability that αβ = +1 given that E accepts on this iteration is just the probability that E accepts with αβ = +1, divided by the probability that E accepts with any value of αβ. This is

((1 − cos(θ))/4) / (1/2) = (1 − cos(θ))/2.

In other words, the probability that αβ = +1 given that E accepts on this iteration is just twice the probability that αβ = +1 and E accepts on this iteration. By similar reasoning, the probability that αβ = −1 given that E accepts on this iteration is (1 + cos(θ))/2.

The probability that the accepting state will have αβ = +1 does not depend on how many rejecting iterations occur before the accepting iteration. Therefore, the probability that the eventual stable history of the experiment will have αβ = +1 is (1 − cos(θ))/2, and the probability that the eventual stable history of the experiment will have αβ = −1 is (1 + cos(θ))/2. The expected value of αβ in the eventual stable history is the sum of each possible value of αβ times the probability of that value, which is

E(αβ) = +1 × (1 − cos(θ))/2 + −1 × (1 + cos(θ))/2 = (1 − cos(θ) − 1 − cos(θ))/2 = −cos(θ).

This is just the expected value predicted for the experiment by quantum mechanics, which violates Bell's Inequality and therefore, according to Bell's chosen definitions, cannot be produced by any local hidden variable theory.
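The derivation can be spot-checked numerically. The sketch below (the helper name is mine) runs the rejection loop to completion many times and estimates the expected product of the two observations in the stable history, which should come out near −cos(θ):

```python
import math, random

def stable_product(theta):
    """Product of the two observations once E finally accepts."""
    while True:
        product = random.choice((-1, +1)) * random.choice((-1, +1))
        lam = random.uniform(-1.0, 1.0)
        if product * math.cos(theta) <= lam:   # E accepts
            return product

random.seed(42)                # reproducible run
theta = 2.0                    # an arbitrary angle, in radians
n = 200_000
mean = sum(stable_product(theta) for _ in range(n)) / n
# mean is a Monte Carlo estimate of the expected product in the stable
# history; it should land within about 0.01 of -cos(theta)
```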
There is still one detail to clean up. My claim was that Bell's Inequality could be violated by a set of rules that would be deterministic and local over meta-time; but the rules I've described thus far are nondeterministic. Because I'm only trying to give an existence proof, not anything like a serious candidate for a general physical theory, all I have to do is describe how to overlay a superficially deterministic structure on top of the mathematical model — rather like the way Bohm overlaid a deterministic structure on top of quantum mechanics without disrupting the pre-existing structure.
Here is my technique for introducing determinism: Assume that E maintains a stored real number, called h. Just before meta-time M0, h has a value uniformly randomly distributed on the interval between 0 and 1. All the random numbers required for the entire evolution of the experiment history can then be extracted from this initial value of h.
To choose a uniformly random value for the observation that A will report —call it α, to keep it distinct from A's orientation a— if h ≥ 1/2 then let α = +1, otherwise let α = −1; and in either case, compute a new value for h by multiplying the old value by 2 and, if the result is greater than 1, subtracting 1. Thus, if α = +1 then h ← (2 h) − 1, otherwise h ← 2 h. The new value of h is again uniformly random on the interval from 0 to 1, and independent of the value chosen for α. This value for α can be passed to A as part of the signal from E to A, while the seed number h remains at E.

A uniformly random value β for B's observation can be extracted from h similarly.

To extract a uniformly random value for λ, let λ = (2 h) − 1; then λ is uniformly random on the interval from −1 to +1. If the value of λ causes E to accept the observations, then there is no need for a new value of h, because the experiment history has reached a stable state. Otherwise, we know the value of αβ cos(θ), and we know that αβ cos(θ) > λ; so λ is uniformly random on the interval from −1 to αβ cos(θ). Assign h ← (λ + 1) / (αβ cos(θ) + 1). (If the denominator in this division were zero, we wouldn't be computing it, because αβ cos(θ) = −1 would force the observations to be accepted.) Then h is uniformly random on the interval from 0 to 1, and contains no information about previous states of the experiment.
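The whole seed-extraction scheme can be rendered as code. This is a toy sketch (all names are mine): a floating-point h carries only finitely many bits, so it is merely illustrative rather than a true uniform random source, but it shows how every choice in the evolution of the experiment is fixed in advance by the one initial value of h:

```python
import math, random

def next_sign(h):
    """Extract one uniformly random sign from the seed h, returning the
    sign together with a fresh seed that is independent of the sign."""
    return (+1, 2*h - 1) if h >= 0.5 else (-1, 2*h)

def stable_product_from_seed(theta, h):
    """Deterministic version of the experiment: every 'random' choice is
    extracted from the single initial seed h, uniform on [0, 1)."""
    while True:
        alpha, h = next_sign(h)     # A's observation
        beta, h = next_sign(h)      # B's observation
        lam = 2*h - 1               # lambda, uniform on [-1, +1)
        d = alpha * beta * math.cos(theta)
        if d <= lam:
            return alpha * beta     # accepted: history is stable
        # rejected: lam is uniform on [-1, d); renormalize it into a
        # fresh seed, uniform on [0, 1)
        h = (lam + 1) / (d + 1)

# Averaging over many random initial seeds reproduces the -cos(theta)
# statistics of the nondeterministic rules, while any single seed fixes
# the entire evolution in advance:
random.seed(0)
theta = 1.0
mean = sum(stable_product_from_seed(theta, random.random())
           for _ in range(50_000)) / 50_000
```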
The key significance in the suggested class of physical theories using meta-time is that it separates mathematical locality and determinism from physical locality and determinism. Viewed in this light, Bell's Theorem says that the predictions of quantum mechanics cannot be duplicated by a theory that is physically local and deterministic — but says nothing about a theory that is only mathematically local and deterministic.
Recall, from near the top of the essay, that under classical physics after Special Relativity, the three-dimensional universe evolved over time, and in order to predict exactly how it would evolve within a bounded region of space over a bounded interval of time, one only had to know about the contents of a somewhat larger bounded region at the start of the interval. This is the essence of physical local determinism, which Bell addressed. The analogous mathematical properties are derived by abstracting away from the particular choices of time and space as the dimensions involved: The system being studied evolves over some dimension of change; and in order to predict exactly how it will evolve within a bounded region of the system over a bounded interval of the dimension of change, one only has to know about the contents of a somewhat larger bounded region at the start of the interval. Bell assumed that the dimension of change was time, and the system being studied was the three-dimensional universe; whereas, for the class of theories described in the last section, the dimension of change is meta-time, and the system being studied is the four-dimensional history of the universe.
Bell's choice of dimensions suggests the classical view of physics as the science of predicting, from the current state of a physical system, what will happen to it next — despite the fact that, if the hidden variables are truly hidden, then no such prediction could be made in practice, even though the hidden variables exist in theory. Under the suggested alternative choice of dimensions, we cannot expect to predict exactly how a physical system will evolve, because we cannot possibly know the entire state of history at the start of a meta-time interval; we only have direct access to a cross-section of history at a particular time, in a stable version of history after all the evolution has already happened. On one hand, this view of physics as predicting possible ways for systems to evolve is very much in keeping with the approach admitted by quantum mechanics. On the other hand, the form of mathematics used in a physics of meta-time could be startlingly classical in character. One might actually be able to describe the evolution of the history of a bounded physical system, over a bounded interval of meta-time, using only a finite number of ideal particles and real-valued fields in a mere four dimensions, without ever encountering such eldritch creatures as wave functions or Hilbert spaces. (There would still have to be one notable difference, however, from the typical nineteenth-century model, beyond simply that the particles and fields inhabit four dimensions rather than three: an "ideal particle" would be one-dimensional, the curve traced over time by a zero-dimensional point-particle in space.)
In the discussion of Bohm's hidden variable theory, I suggested four kinds of uses for a new theory that cannot be experimentally distinguished from pre-existing theory. If a physical theory with locality and determinism over meta-time were experimentally indistinguishable from quantum mechanics, its existence would automatically serve use (1); while a sharp contrast of mathematical techniques from quantum mechanics seems likely to favor the other three kinds of uses.
Note that, while locality and determinism are fundamental to the scenario, only a weak form of stability of history is required. The potential uses for an indistinguishable theory of physics over meta-time are plausible because the theory is likely to use mathematical techniques different from those of quantum mechanics; and its mathematical techniques are likely to be different because the theory has local determinism — but stability only enters the picture because of the way we are connecting the theory with quantum mechanics. We expect that, after some point in meta-time, the time-space pattern of observations predicted by the meta-time theory will be stable, and the predicted statistical distribution of these patterns will be the same as under quantum mechanics. This doesn't actually require that the whole state of history be stable after that point, provided that all states after that point agree on the time-space pattern of observations. Fluctuations of history that don't change the pattern of observations —essentially, fluctuations of the hidden variables— should not invalidate the theory as a practical or even theoretical tool. Even its philosophical implications might be only partially modified.
While a theory of physics based on meta-time might be characterized, broadly, as a "hidden variable theory", it should be acknowledged that the "hidden" part of the theory has greater scope than Bohm, or even Bell (who was going out of his way to be very general), was talking about. Both envisioned hidden variables as populating space and evolving over time, hence particle-like or field-like, augmenting quantum mechanics with mathematical structures of scope substantially comparable to that of the wave functions of quantum mechanics. Under the meta-time hypothesis, though, almost all of the mathematical model is missing from the quantum mechanical description; indeed, if one interprets meta-time as a real dimension rather than just a mathematical device, then almost all of reality is beyond our observation, including the most important dimension (since apparent change over time is a mere symptom of causal change over meta-time).
Back in the first section of the essay, I expressed doubt as to whether Einstein's metaphysical reasoning was an excuse for his physical intuition, or prior to it. I can now be more specific: it seems consistent with Einstein's writings (those, at least, that I am familiar with) that his primary intuitive requirement was mathematical locality and determinism, while physical locality and determinism were essentially symptomatic of the mathematical requirement. Of course the only sure test would be Einstein's reaction when presented with a theory of physics based on the meta-time hypothesis. Since we presumably can't carry out that experiment, either the value measured by the experiment doesn't exist, or we don't know what the value is.