Ergodicity

In mathematics and ergodic theory, ergodicity expresses the idea that a point of a moving system, either a dynamical system or a stochastic process, will eventually visit all parts of the space in which the system moves, in a uniform and random sense.[1] Uniformity and randomness imply that the average behavior of the system can be deduced from the trajectory of a "typical" point. Equivalently, a sufficiently large collection of random samples from a process can represent the average statistical properties of the entire process.

The property the flow needs to satisfy in order to equate time means and phase means of real-valued functions is what is now called ergodicity.[2]

Ergodicity is also a property of the trajectories of a system; it is a statement that any system can be reduced or factored into smaller atomic groups of trajectories defined by ergodic measures and that some systems cannot be further decomposed.[3]

Ergodic systems occur in a broad range of systems in physics and in geometry. This can be roughly understood to be due to a common phenomenon: the motions of particles, that is, geodesics, on a hyperbolic manifold are divergent; when that manifold is compact, that is, of finite size, those orbits return to the same general area, eventually filling the entire space.

Ergodicity is the condition under which the strong law of large numbers holds for a dynamical system.[4]

Ergodic systems capture the common-sense, everyday notions of randomness, such as the idea that smoke might come to fill an entire room, or that a block of metal might eventually come to have the same temperature throughout, or that flips of a fair coin may come up heads and tails half the time. A stronger concept than ergodicity is that of mixing, which aims to mathematically describe the common-sense notions of mixing, such as mixing drinks or mixing cooking ingredients.

The proper mathematical formulation of ergodicity is founded on the formal definitions of measure theory and dynamical systems, and rather specifically on the notion of a measure-preserving dynamical system. The origins of ergodicity lie in statistical physics, where Ludwig Boltzmann formulated the ergodic hypothesis.

Informal explanation

Ergodicity occurs in broad settings in physics and mathematics.[5] All of these settings are unified by a common mathematical description, that of the measure-preserving dynamical system. Equivalently, ergodicity can be understood in terms of stationary stochastic processes[6].

Measure-preserving dynamical systems

The mathematical definition of ergodicity aims to capture ordinary everyday ideas about randomness and ground them in measure theory and dynamical systems.[7] This includes ideas about systems that move in such a way as to (eventually) fill up all of space, such as diffusion and Brownian motion, as well as common-sense notions of mixing.[8]

As an example, consider the mixing of the dust in Saturn's rings. The equations of motion are deterministic, and the system is in general not integrable.[9] There is an effective dynamics, which is essentially given by the Navier–Stokes equations.[10] The description concerns statistical distributions of particles in stable orbits, but the dynamics is hyperbolic and unstable,[11] and therefore the Kolmogorov–Sinai entropy is not zero.[12]

The second important motivation behind ergodicity is the analogy with mixing. Ergodic theory can be motivated by phase transitions, and phase transitions are deeply interlinked with the concept of mixing of different fluids and different phases.[13] There is a hierarchy of types of mixing that leads from weak mixing to strong mixing and to chaos.[14] For example, a Maxwell distribution at equilibrium implies a fully mixed gas and fully mixed microstates, and the existence of local and global averages in both time and space, whereas fully developed turbulence can be interpreted as fluids that do not mix at any energy scale.

The extremal equilibrium states are τ-ergodic measures. They are interpreted as pure thermodynamic phases. Since the equilibrium states correspond to tangents to the graph of P, the discontinuities of the derivative of P correspond to phase transitions. One would thus like to know if P is piecewise analytic (in a suitable sense). An extremal equilibrium state σ may have a non-trivial decomposition into extremal Gibbs states. This is an example of symmetry breaking (the broken symmetry is the invariance under τ).[15]

Ergodic processes

The measure-theoretic discussion assumes a concept of volume and integral, typically from a Lebesgue measure. Intuitively, when mixing different liquids, the volume of each liquid is preserved. More generally, in a statistical system the measure is a probability measure and is normalized to one. In this sense measure theory is dual to the Kolmogorov axioms of probability[16] and is a global approach on the full space of available states. Locally, instead, one can see time evolution as a stochastic process, with a transfer operator from one state to the following state, such as in a Markov chain or in successive coin tosses.

Quick example

As an example, this can be seen explicitly in the case of the Bernoulli scheme:

  1. It is possible to treat the dynamical system as a Bernoulli process of successive coin tosses, where the evolution across multiple states is described by a string like 0.10010101... [17][18]
  2. It is also possible to treat the dynamical system as a given infinite string of 1s and 0s. A state of the system can then be represented as a binary number, such as the starting state 0.10010101, and the time evolution is the shift operator leading to the subsequent state 0.0010101: each binary digit is shifted one place to the left and the leading digit is discarded. More precisely, the time evolution is the dyadic map $T: x \mapsto 2x \bmod 1$. Indeed, in binary, $2 \cdot (0.10010101) \bmod 1 = 0.0010101$, and the measure is the Bernoulli measure.

One can also easily see in this example that at each time step a digit, and hence a piece of information, is destroyed by the modulo operation; in this sense the system destroys information and creates entropy.
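The correspondence between the shift on a binary string and the dyadic map can be checked numerically. The following is a minimal sketch, assuming a finite truncation of the infinite string; the helper names `bits_to_fraction`, `shift` and `dyadic_map` are illustrative choices and do not come from the text.

```python
from fractions import Fraction

def bits_to_fraction(bits: str) -> Fraction:
    """Read a finite string of 0s and 1s as the binary fraction 0.b1 b2 b3 ..."""
    return sum(
        (Fraction(int(b), 2 ** (i + 1)) for i, b in enumerate(bits)),
        Fraction(0),
    )

def shift(bits: str) -> str:
    """Shift operator: discard the first digit and keep the rest."""
    return bits[1:]

def dyadic_map(x: Fraction) -> Fraction:
    """Dyadic map x -> 2x mod 1, computed exactly with rational arithmetic."""
    return (2 * x) % 1

state = "10010101"                      # the starting state 0.10010101 (binary)
x = bits_to_fraction(state)

# Shifting the string and applying the dyadic map give the same next state.
assert dyadic_map(x) == bits_to_fraction(shift(state))
print(x, "->", dyadic_map(x))
```

Exact fractions are used instead of floating-point numbers so that the comparison does not depend on rounding.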

Formalization of the example

Let us now take a random process, the Bernoulli process, and convert it to a measure-preserving dynamical system $(X, \mathcal{B}, \mu, T)$. Consider the space $X$ of infinite sequences of heads and tails. We will use three symbols: h for heads, t for tails and * for "don't know". One generic element can be described as $(*, \cdots, *, h, *, \cdots)$. The space $X$ is the product of the individual spaces of coin tosses with probabilities {p, 1-p}, and we can represent it as $\{p, 1-p\}^{\mathbb{N}}$, where the natural numbers are used to show the time direction explicitly. Each individual space of a single coin toss has a sigma-algebra;[19] equivalently, each coin toss also has a discrete topology and is Hausdorff. The sigma-algebra is also a Borel sigma-algebra because it is minimal. From a σ-algebra standpoint, the resulting σ-algebra is a product of sigma-algebras; from a topological standpoint, the topology is the product topology of the individual spaces of coin tosses, and a set like $(*, \cdots, *, h, *, \cdots)$ is a cylinder set of the single extractions. The set of all possible intersections, unions and complements of the cylinder sets then forms the Borel σ-algebra $\mathcal{B}$. This can also be summarized by saying that the measure is a Radon measure, i.e. that measure and topology are consistent. In formal terms, the cylinder sets form the base for a topology on the space $X$ of all possible infinite-length coin-flips.

We can now define a measure $\mu$ on $X$ with the following properties:

  1. the volume and the probability of the total space is $\mu(X) = 1$
  2. the probability of a single coin toss is then $\mu(*, \cdots, *, h, *, \cdots) = p$, i.e. this is the volume of the cylinder set
  3. the probability of its complement is $\mu(*, \cdots, *, t, *, \cdots) = 1 - p$
  4. the probability of a combination of tosses is then the product of probabilities: $\mu(*, \cdots, *, h, *, \cdots, *, t, *, \cdots) = \mu(*, \cdots, *, t, *, \cdots)\,\mu(*, \cdots, *, h, *, \cdots)$
  5. the probability for a generic set of n tosses with k heads and (n-k) tails is then $P([\omega_1, \omega_2, \cdots, \omega_n]) = \mu(h, t, h, \cdots, t, t, h) = p^k (1-p)^{n-k}$

All together, these form the axioms of a sigma-additive measure; measure-preserving dynamical systems always use sigma-additive measures. For coin flips, this measure is called the Bernoulli measure. For the coin-flip process, the time-evolution operator $T$ is the shift operator, more precisely the Bernoulli shift, that says "throw away the first coin-flip, and keep the rest". Formally, if $(x_1, x_2, \cdots)$ is a sequence of coin-flips, then $T(x_1, x_2, \cdots) = (x_2, x_3, \cdots)$.

The measure is shift-invariant: as long as we are talking about some set $A \in \mathcal{B}$ where the first coin-flip $x_1 = *$ is the "don't care" value, the volume $\mu(A)$ does not change: $\mu(A) = \mu(T(A))$. In order to avoid talking about the first coin-flip, it is easier to define $T^{-1}$ as inserting a "don't care" value into the first position: $T^{-1}(x_1, x_2, \cdots) = (*, x_1, x_2, \cdots)$. With this definition, one has $\mu(T^{-1}(A)) = \mu(A)$ with no constraints on $A$. This is again an example of why $T^{-1}$ is used in the formal definitions.
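The invariance of the Bernoulli measure under $T^{-1}$ can be illustrated on cylinder sets. The sketch below assumes a cylinder set is stored as a dictionary from toss position (starting at 1) to the symbol "h" or "t", with every unlisted position playing the role of the "don't care" value; this representation and the function names are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def cylinder_measure(cylinder: dict, p: Fraction) -> Fraction:
    """Bernoulli measure of a cylinder set: a factor p per fixed head,
    a factor (1 - p) per fixed tail, and no factor for "don't care" slots."""
    measure = Fraction(1)
    for symbol in cylinder.values():
        measure *= p if symbol == "h" else 1 - p
    return measure

def preimage(cylinder: dict) -> dict:
    """T^{-1}: insert a "don't care" in the first position, which moves
    every fixed position one step to the right."""
    return {position + 1: symbol for position, symbol in cylinder.items()}

p = Fraction(1, 3)                      # probability of heads (assumed value)
A = {1: "h", 4: "t"}                    # heads on toss 1, tails on toss 4

# The measure of the preimage equals the measure of the set itself,
# even though the first toss of A is fixed.
assert cylinder_measure(preimage(A), p) == cylinder_measure(A, p)
print(cylinder_measure(A, p))
```

Working with $T^{-1}$ avoids the special case of a fixed first toss, which the forward shift would simply throw away.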

Other processes

The same conversion (equivalence, isomorphism) can be applied to any stochastic process. Thus, an informal definition of ergodicity is that a sequence is ergodic if it visits all of $X$; such sequences are "typical" for the process. Another is that its statistical properties can be deduced from a single, sufficiently long, random sample of the process (thus uniformly sampling all of $X$), or that any collection of random samples from a process must represent the average statistical properties of the entire process (that is, samples drawn uniformly from $X$ are representative of $X$ as a whole). In the present example, for a fair coin, a sequence of coin flips where half are heads and half are tails is a "typical" sequence.
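The idea that a single long sample reveals the statistics of the whole process can be illustrated with simulated coin flips. This is a rough numerical sketch, not a proof: it assumes a pseudo-random generator is an adequate stand-in for a Bernoulli process, and the function name and parameter values are hypothetical.

```python
import random

def time_average_heads(p: float, steps: int, seed: int = 0) -> float:
    """Fraction of heads along one simulated trajectory of a coin
    that lands heads with probability p."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    heads = sum(1 for _ in range(steps) if rng.random() < p)
    return heads / steps

p = 0.3
for steps in (100, 10_000, 200_000):
    # The time average along one trajectory approaches the ensemble value p.
    print(steps, time_average_heads(p, steps))
```

For longer trajectories the printed frequency should settle near the assumed value of p, which is the time-average counterpart of the ensemble probability.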

Collateral properties

There are several important points to be made about the Bernoulli process. If one writes 0 for tails and 1 for heads, one gets the set of all infinite strings of binary digits. These correspond to the base-two expansion of real numbers. Explicitly, given a sequence $(x_1, x_2, \cdots)$, the corresponding real number is

$$y = \sum_{n=1}^{\infty} \frac{x_n}{2^n}.$$

The statement that the Bernoulli process is ergodic is equivalent to the statement that the real numbers are uniformly distributed. The set of all such strings can be written in a variety of ways: $\{h, t\}^{\infty} = \{h, t\}^{\omega} = \{0, 1\}^{\omega} = 2^{\omega} = 2^{\mathbb{N}}$. This set is the Cantor set, sometimes called the Cantor space to avoid confusion with the Cantor function

$$C(x) = \sum_{n=1}^{\infty} \frac{x_n}{3^n}.$$

In the end, these are all "the same thing".

The Cantor set plays key roles in many branches of mathematics. In recreational mathematics, it underpins the period-doubling fractals; in analysis, it appears in a vast variety of theorems. A key one for stochastic processes is the Wold decomposition, which states that any stationary process can be decomposed into a pair of uncorrelated processes, one deterministic, and the other being a moving average process.

The Ornstein isomorphism theorem states that every stationary stochastic process is equivalent to a Bernoulli scheme (a Bernoulli process with an N-sided (and possibly unfair) gaming die). Other results include that every non-dissipative ergodic system is equivalent to the Markov odometer, sometimes called an "adding machine" because it looks like elementary-school addition, that is, taking a base-N digit sequence, adding one, and propagating the carry bits. The proof of equivalence is very abstract; understanding the result is not: by adding one at each time step, every possible state of the odometer is visited, until it rolls over, and starts again. Likewise, ergodic systems visit each state, uniformly, moving on to the next, until they have all been visited.
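The "adding machine" picture can be made concrete with a finite toy odometer. The sketch below is only an illustration of the add-one-and-carry rule on a fixed number of base-N digits, not the Markov odometer itself; the digit order (least significant first) and the function names are assumptions made for the example.

```python
def odometer_step(digits: tuple, base: int) -> tuple:
    """Add one to a base-`base` digit sequence, least significant digit
    first, propagating the carry as in elementary-school addition."""
    result = list(digits)
    for i in range(len(result)):
        result[i] += 1
        if result[i] < base:
            return tuple(result)        # no carry needed, done
        result[i] = 0                   # digit overflows: reset and carry on
    return tuple(result)                # every digit overflowed: rolled over

def orbit(start: tuple, base: int) -> list:
    """States visited from `start` until the odometer returns to it."""
    visited = [start]
    state = odometer_step(start, base)
    while state != start:
        visited.append(state)
        state = odometer_step(state, base)
    return visited

states = orbit((0, 0, 0), base=2)
# With 3 binary digits there are 2**3 states, and each is visited once.
assert len(states) == len(set(states)) == 2 ** 3
print(states)
```

In this finite version every state is visited exactly once before the counter rolls over, which mirrors the informal description of visiting each state in turn.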

Systems that generate (infinite) sequences of N letters are studied by means of symbolic dynamics. Important special cases include subshifts of finite type and sofic systems.

History and etymology

The word ergodic was introduced by Boltzmann... is an amalgamation of the Greek words ἔργον ergon (work) and ὁδός odos (path)[20]

Ludwig Boltzmann was trying to justify statistical mechanics and why it was possible to work with averages and probability distributions.[21] At the same time it is also claimed to be a derivation of ergomonode, coined by Boltzmann in a relatively obscure paper from 1884. The etymology appears to be contested in other ways as well.[22]

The idea of ergodicity was born in the field of thermodynamics, where it was necessary to relate the individual states of gas molecules to the temperature of a gas as a whole and to its time evolution. In order to do this, it was necessary to state what exactly it means for gases to mix well together, so that thermodynamic equilibrium could be defined with mathematical rigor. Once the theory was well developed in physics, it was rapidly formalized and extended, so that ergodic theory has long been an independent area of mathematics in itself. As part of that progression, more than one slightly different definition of ergodicity and multitudes of interpretations of the concept in different fields coexist.[23]

For example, in classical statistical physics the term implies that a system satisfies the ergodic hypothesis of thermodynamics.[24] The prototype system is the ideal gas, the relevant state space being position and momentum space. In dynamical systems theory the state space is usually taken to be a more general phase space, and typical prototype systems are the irrational rotation, often on a torus, which has a trajectory that covers the full space but has zero entropy, and the Sinai billiard, which has non-zero entropy.[25] On the other hand, in coding theory the state space is often discrete in both time and state, with less concomitant structure.

In all those fields the ideas of time average and ensemble average can also carry extra baggage, as is the case with the many possible thermodynamically relevant partition functions used to define ensemble averages in physics. As such, the measure-theoretic formalization of the concept also serves as a unifying discipline. In 1913 Michel Plancherel proved the strict impossibility of ergodicity for a purely mechanical system.[26]

Ergodicity in physics and geometry

A review of ergodicity in physics and in geometry follows. In all cases, the notion of ergodicity is exactly the same as that for dynamical systems; there is no difference, except for outlook, notation, style of thinking and the journals where results are published.

Physical systems can be split into three categories: classical mechanics, which describes machines with a finite number of moving parts, quantum mechanics, which describes the structure of atoms, and statistical mechanics, which describes gases, liquids, solids; this includes condensed matter physics. These are presented below.

In statistical mechanics

This section reviews ergodicity in statistical mechanics. The above abstract definition of a volume is required as the appropriate setting for definitions of ergodicity in physics. Consider a container of liquid, or gas, or plasma, or other collection of atoms or particles. Each and every particle $x_i$ has a 3D position and a 3D velocity, and is thus described by six numbers: a point in six-dimensional space $\mathbb{R}^6$. If there are $N$ of these particles in the system, a complete description requires $6N$ numbers. Any one system is just a single point in $\mathbb{R}^{6N}$. The physical system is not all of $\mathbb{R}^{6N}$, of course; if it's a box of width, height and length $W \times H \times L$ then a point is in $\left(W \times H \times L \times \mathbb{R}^3\right)^N$. Nor can velocities be infinite: they are scaled by some probability measure, for example the Boltzmann–Gibbs measure for a gas. Nonetheless, for $N$ close to the Avogadro number, this is obviously a very large space. This space is called the canonical ensemble.

A physical system is said to be ergodic if any representative point of the system eventually comes to visit the entire volume of the system. For the above example, this implies that any given atom not only visits every part of the box $W \times H \times L$ with uniform probability, but it does so with every possible velocity, with probability given by the Boltzmann distribution for that velocity (so, uniform with respect to that measure). The ergodic hypothesis states that physical systems actually are ergodic. Multiple time scales are at work: gases and liquids appear to be ergodic over short time scales. Ergodicity in a solid can be viewed in terms of the vibrational modes or phonons, as obviously the atoms in a solid do not exchange locations. Glasses present a challenge to the ergodic hypothesis; time scales are assumed to be in the millions of years, but results are contentious. Spin glasses present particular difficulties. The approach to ergodicity in these cases has been shown to be anomalous.[27]

Formal mathematical proofs of ergodicity in statistical physics are hard to come by; most high-dimensional many-body systems are assumed to be ergodic, without mathematical proof. Exceptions include the dynamical billiards, which model billiard ball-type collisions of atoms in an ideal gas or plasma. The first hard-sphere ergodicity theorem was for Sinai's billiards, which considers two balls, one of them taken as being stationary, at the origin. As the second ball collides, it moves away; applying periodic boundary conditions, it then returns to collide again. By appeal to homogeneity, this return of the "second" ball can instead be taken to be "just some other atom" that has come into range, and is moving to collide with the atom at the origin (which can be taken to be just "any other atom".) This is one of the few formal proofs that exist; there are no equivalent statements e.g. for atoms in a liquid, interacting via van der Waals forces, even if it would be common sense to believe that such systems are ergodic (and mixing). More precise physical arguments can be made, though.

Simple dynamical systems

The formal study of ergodicity can be approached by examining fairly simple dynamical systems. Some of the primary ones are listed here.

The irrational rotation of a circle is ergodic: the orbit of a point is such that eventually, every other point in the circle is visited. Such rotations are a special case of the interval exchange map. The beta expansions of a number are ergodic: beta expansions of a real number are done not in base-N, but in base-$\beta$ for some $\beta$. The reflected version of the beta expansion is the tent map; there are a variety of other ergodic maps of the unit interval. Moving to two dimensions, the arithmetic billiards with irrational angles are ergodic. One can also take a rectangle, squash it, cut it and reassemble it; this is the previously mentioned baker's map. Its points can be described by the set of bi-infinite strings in two letters, that is, extending to both the left and right; as such, it looks like two copies of the Bernoulli process. If one deforms sideways during the squashing, one obtains Arnold's cat map. In most ways, the cat map is prototypical of any other similar transformation.
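The behavior of the circle rotation can be explored numerically by comparing the fraction of time an orbit spends in an arc with the length of that arc. This is a floating-point sketch under stated assumptions: the circle is modeled as the interval [0, 1) with addition modulo 1, the rotation angle is the golden-ratio conjugate, and the function name is hypothetical.

```python
import math

def rotation_visit_frequency(alpha: float, arc: tuple, steps: int,
                             x0: float = 0.0) -> float:
    """Fraction of the first `steps` points of the orbit x0, x0 + alpha,
    x0 + 2*alpha, ... (mod 1) that fall inside the half-open arc [a, b)."""
    a, b = arc
    hits = 0
    for n in range(steps):
        x = (x0 + n * alpha) % 1.0      # n-th point of the orbit
        if a <= x < b:
            hits += 1
    return hits / steps

irrational = (math.sqrt(5) - 1) / 2     # irrational rotation angle
rational = 0.5                          # rational angle, for contrast

# Irrational case: the time average approaches the arc length, 0.25.
print(rotation_visit_frequency(irrational, (0.0, 0.25), 100_000))
# Rational case: the orbit only visits 0 and 0.5, so the frequency is 0.5.
print(rotation_visit_frequency(rational, (0.0, 0.25), 100_000))
```

The rational rotation is included as a contrast: its orbit is periodic, so the time average depends on the starting point and does not match the arc length.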

In classical mechanics and geometry

Ergodicity is a widespread phenomenon in the study of symplectic manifolds and Riemannian manifolds. Symplectic manifolds provide the generalized setting for classical mechanics, where the motion of a mechanical system is described by a geodesic. Riemannian manifolds are a special case: the cotangent bundle of a Riemannian manifold is always a symplectic manifold. In particular, the geodesics on a Riemannian manifold are given by the solution of the Hamilton–Jacobi equations.

The geodesic flow of a flat torus following any irrational direction is ergodic (however, this requires the state space of the flow to be taken as the torus itself and not as its unit tangent bundle, which is the usual state space for the geodesic flow). Informally this means that when drawing a straight line in a square starting at any point, and with an irrational angle with respect to the sides, if every time one meets a side one starts over on the opposite side with the same angle, the line will eventually meet every subset of positive measure (however, the direction of the tangent vectors of this trajectory remains constant, which is why if this flow is taken on the unit tangent bundle, it is non-ergodic and actually an integrable system). More generally, on any flat surface there are many ergodic directions for the geodesic flow.

For non-flat surfaces, one has that the geodesic flow of any negatively curved compact Riemann surface is ergodic. A surface is "compact" in the sense that it has finite surface area. The geodesic flow is a generalization of the idea of moving in a "straight line" on a curved surface: such straight lines are geodesics. One of the earliest cases studied is Hadamard's billiards, which describes geodesics on the Bolza surface, topologically equivalent to a donut with two holes. Ergodicity can be demonstrated informally, if one has a sharpie and some reasonable example of a two-holed donut: starting anywhere, in any direction, one attempts to draw a straight line; rulers are useful for this. It doesn't take all that long to discover that one is not coming back to the starting point. (Of course, crooked drawing can also account for this; that's why we have proofs.)

These results extend to higher dimensions. The geodesic flow for negatively curved compact Riemannian manifolds is ergodic. A classic example for this is the Anosov flow, which is the horocycle flow on a hyperbolic manifold. This can be seen to be a kind of Hopf fibration. Such flows commonly occur in classical mechanics, which is the study in physics of finite-dimensional moving machinery, e.g. the double pendulum and so forth. Classical mechanics is constructed on symplectic manifolds. The flows on such systems can be deconstructed into stable and unstable manifolds; as a general rule, when this is possible, chaotic motion results. That this is generic can be seen by noting that the cotangent bundle of a Riemannian manifold is (always) a symplectic manifold; the geodesic flow is given by a solution to the Hamilton–Jacobi equations for this manifold. In terms of the canonical coordinates $(q, p)$ on the cotangent manifold, the Hamiltonian or energy is given by

$$H = \tfrac{1}{2} \sum_{ij} g^{ij}(q)\, p_i p_j$$

with $g^{ij}$ the (inverse of the) metric tensor and $p_i$ the momentum. The resemblance to the kinetic energy $E = \tfrac{1}{2} m v^2$ of a point particle is hardly accidental; this is the whole point of calling such things "energy". In this sense, chaotic behavior with ergodic orbits is a more-or-less generic phenomenon in large tracts of geometry.
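As a worked step, the equations of motion generated by this Hamiltonian can be written out explicitly. The following is a standard computation, given here as a sketch that assumes the inverse metric is symmetric, $g^{ij} = g^{ji}$, and uses the summation indices $j, k$ as notation introduced for this example:

```latex
\begin{aligned}
\dot{q}^{i} &= \frac{\partial H}{\partial p_{i}}
             = \sum_{j} g^{ij}(q)\, p_{j}, \\
\dot{p}_{i} &= -\frac{\partial H}{\partial q^{i}}
             = -\tfrac{1}{2} \sum_{jk}
               \frac{\partial g^{jk}(q)}{\partial q^{i}}\, p_{j} p_{k}.
\end{aligned}
```

The first line identifies the momentum with the velocity lowered by the metric; substituting it into the second line gives the geodesic equation for $q$.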

Ergodicity results have been provided in translation surfaces, hyperbolic groups and systolic geometry. Techniques include the study of ergodic flows, the Hopf decomposition, and the Ambrose–Kakutani–Krengel–Kubo theorem. An important class of systems are the Axiom A systems.

A number of both classification and "anti-classification" results have been obtained. The Ornstein isomorphism theorem applies here as well; again, it states that most of these systems are isomorphic to some Bernoulli scheme. This rather neatly ties these systems back into the definition of ergodicity given for a stochastic process, in the previous section. The anti-classification results state that there are more than a countably infinite number of inequivalent ergodic measure-preserving dynamical systems. This is perhaps not entirely a surprise, as one can use points in the Cantor set to construct similar-but-different systems. See measure-preserving dynamical system for a brief survey of some of the anti-classification results.

In wave mechanics

All of the previous sections considered ergodicity either from the point of view of a measurable dynamical system, or from the dual notion of tracking the motion of individual particle trajectories. A closely related concept occurs in (non-linear) wave mechanics. There, the resonant interaction allows for the mixing of normal modes, often (but not always) leading to the eventual thermalization of the system. One of the earliest systems to be rigorously studied in this context is the Fermi–Pasta–Ulam–Tsingou problem, a string of weakly coupled oscillators.

A resonant interaction is possible whenever the dispersion relations for the wave media allow three or more normal modes to sum in such a way as to conserve both the total momentum and the total energy. This allows energy concentrated in one mode to bleed into other modes, eventually distributing that energy uniformly across all interacting modes.
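For the simplest case of three modes, the resonance condition can be written as a pair of equations. This is a sketch of the usual three-wave form, assuming wave vectors $k_1, k_2, k_3$ and a dispersion relation $\omega(k)$, symbols introduced here for illustration that do not appear in the text:

```latex
\begin{aligned}
k_{1} + k_{2} &= k_{3}
  && \text{(momentum conservation)}, \\
\omega(k_{1}) + \omega(k_{2}) &= \omega(k_{3})
  && \text{(energy conservation)}.
\end{aligned}
```

Whether both equations can hold at once depends on the shape of $\omega(k)$, which is why the dispersion relation decides whether a resonant interaction is possible.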

Resonant interactions between waves help provide insight into the distinction between high-dimensional chaos (that is, turbulence) and thermalization. When normal modes can be combined so that energy and momentum are exactly conserved, then the theory of resonant interactions applies, and energy spreads into all of the interacting modes. When the dispersion relations only allow an approximate balance, turbulence or chaotic motion results. The turbulent modes can then transfer energy into modes that do mix, eventually leading to thermalization, but not before a preceding interval of chaotic motion.

In quantum mechanics

As to quantum mechanics, there is no universal quantum definition of ergodicity or even chaos (see quantum chaos).[28] However, there is a quantum ergodicity theorem stating that the expectation value of an operator converges to the corresponding microcanonical classical average in the semiclassical limit $\hbar \rightarrow 0$. Nevertheless, the theorem does not imply that all eigenstates of the Hamiltonian whose classical counterpart is chaotic are featureless and random. For example, the quantum ergodicity theorem does not exclude the existence of non-ergodic states such as quantum scars. In addition to the conventional scarring,[29][30][31][32] there are two other types of quantum scarring, which further illustrate the weak-ergodicity breaking in quantum chaotic systems: perturbation-induced[33][34][35][36][37] and many-body quantum scars.[38]

Definition for discrete-time systems

Ergodic measures provide one of the cornerstones with which ergodicity is generally discussed. A formal definition follows.

Invariant measure

Let $(X, \mathcal{B})$ be a measurable space. If $T$ is a measurable function from $X$ to itself and $\mu$ a probability measure on $(X, \mathcal{B})$, then a measure-preserving dynamical system is defined as a dynamical system for which $\mu(T^{-1}(A)) = \mu(A)$ for all $A \in \mathcal{B}$. Such a $T$ is said to preserve $\mu$; equivalently, $\mu$ is said to be $T$-invariant.

Ergodic measure

A measurable function $T$ is said to be $\mu$-ergodic, or $\mu$ is said to be an ergodic measure for $T$, if $T$ preserves $\mu$ and the following condition holds:

For any $A\in\mathcal{B}$ such that $T^{-1}(A)=A$, either $\mu(A)=0$ or $\mu(A)=1$.

In other words, there are no nontrivial $T$-invariant subsets, up to sets of measure 0 (with respect to $\mu$).

Some authors[39] relax the requirement that $T$ preserves $\mu$ to the requirement that $T$ is a non-singular transformation with respect to $\mu$, meaning that for every $N\in\mathcal{B}$, the set $N$ has zero measure if and only if $T^{-1}(N)$ has zero measure.

Examples

The simplest example is when $X$ is a finite set and $\mu$ the (normalised) counting measure. Then a self-map $T$ of $X$ preserves $\mu$ if and only if it is a bijection, and it is ergodic if and only if $T$ has only one orbit (that is, for every $x,y\in X$ there exists $k\in\mathbb{N}$ such that $y=T^{k}(x)$). For example, if $X=\{1,2,\ldots,n\}$ then the cycle $(1\,2\,\cdots\,n)$ is ergodic, but the permutation $(1\,2)(3\,4\,\cdots\,n)$ is not (it has the two invariant subsets $\{1,2\}$ and $\{3,4,\ldots,n\}$).
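The finite example can be checked mechanically. The following is a minimal sketch, assuming a permutation is stored as a Python dict; the helper names (`cycles_to_map`, `orbit`, `is_ergodic`) are chosen here for illustration and do not come from any library.

```python
def cycles_to_map(cycles, n):
    """Build the self-map of {1, ..., n} given by a product of disjoint cycles."""
    T = {x: x for x in range(1, n + 1)}  # points not listed in any cycle are fixed
    for cycle in cycles:
        for i, x in enumerate(cycle):
            T[x] = cycle[(i + 1) % len(cycle)]
    return T

def preserves_counting_measure(T):
    """On a finite set, preserving the counting measure means being a bijection."""
    return sorted(T.values()) == sorted(T.keys())

def orbit(T, x):
    """Forward orbit {x, T(x), T^2(x), ...} of the point x."""
    seen = set()
    while x not in seen:
        seen.add(x)
        x = T[x]
    return seen

def is_ergodic(T):
    """Ergodic for the counting measure: a bijection with a single orbit."""
    start = next(iter(T))
    return preserves_counting_measure(T) and len(orbit(T, start)) == len(T)

n = 6
single_cycle = cycles_to_map([list(range(1, n + 1))], n)        # (1 2 ... n)
two_cycles = cycles_to_map([[1, 2], list(range(3, n + 1))], n)  # (1 2)(3 4 ... n)

print(is_ergodic(single_cycle))  # single orbit
print(is_ergodic(two_cycles))    # two invariant subsets
```

For a bijection of a finite set, the forward orbit of a point is its whole cycle, so comparing one orbit's size with the size of the set is enough.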

Equivalent formulations

The definition given above admits the following immediate reformulations:

  • for every $A\in\mathcal{B}$ with $\mu\left(T^{-1}(A)\bigtriangleup A\right)=0$ we have $\mu(A)=0$ or $\mu(A)=1$ (where $\bigtriangleup$ denotes the symmetric difference);
  • for every $A\in\mathcal{B}$ with positive measure we have $\mu\left(\bigcup_{n=1}^{\infty}T^{-n}(A)\right)=1$;
  • for every two sets $A,B\in\mathcal{B}$ of positive measure, there exists $n>0$ such that $\mu\left(T^{-n}(A)\cap B\right)>0$;
  • every measurable function $f:X\to\mathbb{R}$ with $f\circ T=f$ is constant on a subset of full measure.

Importantly for applications, the condition in the last characterisation can be restricted to square-integrable functions only:

  • If $f\in L^{2}(X,\mu)$ and $f\circ T=f$ then $f$ is constant almost everywhere.

Further examples

Bernoulli shifts and subshifts

Let $S$ be a finite set and $X=S^{\mathbb{Z}}$ with $\mu$ the product measure (each factor $S$ being endowed with its normalised counting measure). Then the shift operator $T$ defined by $T\left((s_{k})_{k\in\mathbb{Z}}\right)=(s_{k+1})_{k\in\mathbb{Z}}$ is $\mu$-ergodic.[40]

There are many more ergodic measures for the shift map $T$ on $X$. Periodic sequences give finitely supported measures. More interestingly, there are infinitely supported ones, carried by subshifts of finite type.

Irrational rotations

Let $X$ be the unit circle $\{z\in\mathbb{C},\,|z|=1\}$, with its Lebesgue measure $\mu$. For any $\theta\in\mathbb{R}$ the rotation of $X$ of angle $\theta$ is given by $T_{\theta}(z)=e^{2i\pi\theta}z$. If $\theta\in\mathbb{Q}$ then $T_{\theta}$ is not ergodic for the Lebesgue measure as it has infinitely many finite orbits. On the other hand, if $\theta$ is irrational then $T_{\theta}$ is ergodic.[41]

Arnold's cat map

Let $X=\mathbb{R}^{2}/\mathbb{Z}^{2}$ be the 2-torus. Then any element $g\in\mathrm{SL}_{2}(\mathbb{Z})$ defines a self-map of $X$ since $g\left(\mathbb{Z}^{2}\right)=\mathbb{Z}^{2}$. When $g=\begin{pmatrix}2&1\\1&1\end{pmatrix}$ one obtains the so-called Arnold's cat map, which is ergodic for the Lebesgue measure on the torus.
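As a small sanity check on how the matrix acts on the torus (not a proof of ergodicity), the map can be restricted to the points with coordinates in $\frac{1}{N}\mathbb{Z}$, where it becomes exact integer arithmetic modulo $N$. This is a sketch; the grid size `N = 5` and the function names are assumptions made for illustration.

```python
def cat_map(x, y, N):
    """Arnold's cat map g = [[2, 1], [1, 1]] on the grid (Z/N)^2,
    i.e. on the torus points (x/N, y/N) of R^2/Z^2."""
    return (2 * x + y) % N, (x + y) % N

def period(N):
    """Smallest k >= 1 such that the k-th iterate fixes every grid point."""
    points = [(x, y) for x in range(N) for y in range(N)]
    current = points
    k = 0
    while True:
        current = [cat_map(x, y, N) for x, y in current]
        k += 1
        if current == points:
            return k

N = 5
grid = {(x, y) for x in range(N) for y in range(N)}
image = {cat_map(x, y, N) for x, y in grid}

determinant = 2 * 1 - 1 * 1   # g lies in SL_2(Z), so it preserves area
print(image == grid)          # the map permutes the grid points
print(period(N))              # every grid point is periodic
```

Every point with rational coordinates is periodic, but these points form a countable set of Lebesgue measure zero, so this does not conflict with ergodicity for the Lebesgue measure.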

Ergodic theorems

If $\mu$ is a probability measure on a space $X$ which is ergodic for a transformation $T$, the pointwise ergodic theorem of G. D. Birkhoff states that for every integrable function $f:X\to\mathbb{R}$ and for $\mu$-almost every point $x\in X$ the time average on the orbit of $x$ converges to the space average of $f$. Formally this means that $$\lim_{k\to+\infty}\left(\frac{1}{k+1}\sum_{i=0}^{k}f\left(T^{i}(x)\right)\right)=\int_{X}f\,d\mu.$$
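The convergence of time averages can be observed numerically for an irrational rotation. The sketch below writes the circle additively as $[0,1)$; the angle $\sqrt{2}-1$, the starting point and the test function $\cos^{2}(2\pi t)$ (whose space average is $1/2$) are choices made here for illustration, and floating-point arithmetic only approximates the true orbit.

```python
import math

def birkhoff_average(f, T, x, k):
    """Time average (1/(k+1)) * sum_{i=0}^{k} f(T^i(x))."""
    total = 0.0
    for _ in range(k + 1):
        total += f(x)
        x = T(x)
    return total / (k + 1)

theta = math.sqrt(2) - 1                 # irrational angle (illustrative choice)

def rotation(t):
    """Rotation by theta on the circle, written additively as [0, 1)."""
    return (t + theta) % 1.0

def f(t):
    """Test function whose integral over [0, 1) equals 1/2."""
    return math.cos(2 * math.pi * t) ** 2

avg = birkhoff_average(f, rotation, x=0.1, k=100_000)
print(avg)  # close to the space average 1/2
```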

The mean ergodic theorem of J. von Neumann is a similar, weaker statement about averaged translates of square-integrable functions.

Dense orbits

An immediate consequence of the definition of ergodicity is the following: if $X$ is a topological space, $\mathcal{B}$ is the σ-algebra of Borel sets and $T$ is $\mu$-ergodic, then $\mu$-almost every orbit of $T$ is dense in the support of $\mu$.

This is not an equivalence: for a transformation which is not uniquely ergodic, but for which there is an ergodic measure $\mu_{0}$ with full support, and for any other ergodic measure $\mu_{1}$, the measure $\frac{1}{2}(\mu_{0}+\mu_{1})$ is not ergodic for $T$ but its orbits are dense in the support. Explicit examples can be constructed with shift-invariant measures.[42]

Mixing

A transformation $T$ of a probability measure space $(X,\mu)$ is said to be mixing for the measure $\mu$ if for any measurable sets $A,B\subset X$ the following holds: $$\lim_{n\to+\infty}\mu\left(T^{-n}A\cap B\right)=\mu(A)\mu(B)$$

It is immediate that a mixing transformation is also ergodic (taking $A$ to be a $T$-stable subset and $B$ its complement). The converse is not true: for example, a rotation with irrational angle on the circle (which is ergodic per the examples above) is not mixing (for a sufficiently small interval, its successive images will not intersect it most of the time). Bernoulli shifts are mixing, and so is Arnold's cat map.
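The failure of mixing for an irrational rotation can be made concrete, since the measure of the intersection of two arcs is an explicit formula. The sketch below takes $A=B=[0,\varepsilon)$ on the circle written additively as $[0,1)$; the angle $\sqrt{2}-1$ and the value $\varepsilon=0.05$ are illustrative assumptions.

```python
import math

def overlap(n, theta, eps):
    """Lebesgue measure of T^{-n}(A) ∩ A for the arc A = [0, eps), eps <= 1/2,
    under the rotation T(t) = t + theta mod 1."""
    d = (-n * theta) % 1.0      # T^{-n}(A) is the arc [d, d + eps) mod 1
    delta = min(d, 1.0 - d)     # circular distance between the arcs' left endpoints
    return max(0.0, eps - delta)

theta = math.sqrt(2) - 1        # irrational angle (illustrative choice)
eps = 0.05
values = [overlap(n, theta, eps) for n in range(1, 2001)]

disjoint_fraction = sum(v == 0.0 for v in values) / len(values)
print(disjoint_fraction)        # most iterates of A miss A entirely
print(max(values), eps ** 2)    # yet some overlaps are far above mu(A)^2
```

Mixing would require the overlaps to converge to $\mu(A)^{2}=\varepsilon^{2}$; instead they keep returning to 0 and to values well above $\varepsilon^{2}$.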

This notion of mixing is sometimes called strong mixing, as opposed to weak mixing which means that $$\lim_{n\to+\infty}\frac{1}{n}\sum_{k=1}^{n}\left|\mu(T^{-k}A\cap B)-\mu(A)\mu(B)\right|=0$$

Proper ergodicity

The transformation $T$ is said to be properly ergodic if it does not have an orbit of full measure. In the discrete case this means that the measure $\mu$ is not supported on a finite orbit of $T$.

Definition for continuous-time dynamical systems

The definition is essentially the same for continuous-time dynamical systems as for a single transformation. Let $(X,\mathcal{B})$ be a measurable space. Such a system is given by a family $(T_{t})_{t\in\mathbb{R}_{+}}$ of measurable functions from $X$ to itself, so that for any $t,s\in\mathbb{R}_{+}$ the relation $T_{s+t}=T_{s}\circ T_{t}$ holds (usually it is also asked that the orbit map $\mathbb{R}_{+}\times X\to X$ be measurable). If $\mu$ is a probability measure on $(X,\mathcal{B})$ then we say that $T_{t}$ is $\mu$-ergodic, or that $\mu$ is an ergodic measure for $T_{t}$, if each $T_{t}$ preserves $\mu$ and the following condition holds:

For any $A\in\mathcal{B}$, if for all $t\in\mathbb{R}_{+}$ we have $T_{t}^{-1}(A)\subset A$ then either $\mu(A)=0$ or $\mu(A)=1$.

Examples

As in the discrete case the simplest example is that of a transitive action, for instance the action on the circle given by $T_{t}(z)=e^{2i\pi t}z$ is ergodic for Lebesgue measure.

An example with infinitely many orbits is given by the flow along an irrational slope on the torus: let $X=\mathbb{S}^{1}\times\mathbb{S}^{1}$ and $\alpha\in\mathbb{R}$. Let $T_{t}(z_{1},z_{2})=\left(e^{2i\pi t}z_{1},e^{2\alpha i\pi t}z_{2}\right)$; then if $\alpha\notin\mathbb{Q}$ this is ergodic for the Lebesgue measure.

Ergodic flows

Further examples of ergodic flows are:

Ergodicity in compact metric spaces

If $X$ is a compact metric space it is naturally endowed with the σ-algebra of Borel sets. The additional structure coming from the topology then allows a much more detailed theory for ergodic transformations and measures on $X$.

Functional analysis interpretation

A very powerful alternate definition of ergodic measures can be given using the theory of Banach spaces. Radon measures on $X$ form a Banach space of which the set $\mathcal{P}(X)$ of probability measures on $X$ is a convex subset. Given a continuous transformation $T$ of $X$, the subset $\mathcal{P}(X)^{T}$ of $T$-invariant measures is a closed convex subset, and a measure is ergodic for $T$ if and only if it is an extreme point of this convex set.[43]

Existence of ergodic measures

In the setting above it follows from the Banach-Alaoglu theorem that there always exist extreme points in $\mathcal{P}(X)^{T}$. Hence a continuous transformation of a compact metric space always admits ergodic measures.

Ergodic decomposition

In general an invariant measure need not be ergodic, but as a consequence of Choquet theory it can always be expressed as the barycenter of a probability measure on the set of ergodic measures. This is referred to as the ergodic decomposition of the measure.[44][45][46]

Example

In the case of $X=\{1,\ldots,n\}$ and $T=(1\,2)(3\,4\,\cdots\,n)$ the counting measure is not ergodic. The ergodic measures for $T$ are the uniform measures $\mu_{1},\mu_{2}$ supported on the subsets $\{1,2\}$ and $\{3,\ldots,n\}$, and every $T$-invariant probability measure can be written in the form $t\mu_{1}+(1-t)\mu_{2}$ for some $t\in[0,1]$. In particular $\frac{2}{n}\mu_{1}+\frac{n-2}{n}\mu_{2}$ is the ergodic decomposition of the counting measure.
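The decomposition of the (normalised) counting measure can be verified with exact rational arithmetic. This is a minimal sketch for the illustrative value `n = 7`; measures are stored as dicts from points to masses, a representation chosen here for convenience.

```python
from fractions import Fraction

n = 7
X = range(1, n + 1)
orbit1, orbit2 = {1, 2}, set(range(3, n + 1))   # the two orbits of (1 2)(3 4 ... n)

def uniform(support):
    """Uniform probability measure on `support`, as a dict point -> mass."""
    return {x: (Fraction(1, len(support)) if x in support else Fraction(0)) for x in X}

mu1, mu2 = uniform(orbit1), uniform(orbit2)     # the two ergodic measures
counting = uniform(set(X))                      # normalised counting measure on X

t = Fraction(2, n)
mixture = {x: t * mu1[x] + (1 - t) * mu2[x] for x in X}

print(mixture == counting)  # (2/n) mu1 + ((n-2)/n) mu2 is the counting measure
```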

Continuous systems

Everything in this section transfers verbatim to continuous actions of $\mathbb{R}$ or $\mathbb{R}_{+}$ on compact metric spaces.

Unique ergodicity

The transformation $T$ is said to be uniquely ergodic if there is a unique Borel probability measure $\mu$ on $X$ which is ergodic for $T$.

In the examples considered above, irrational rotations of the circle are uniquely ergodic;[47] shift maps are not.

Probabilistic interpretation: ergodic processes

If $\left(X_{n}\right)_{n\geq 1}$ is a discrete-time stochastic process on a space $\Omega$, it is said to be ergodic if the joint distribution of the variables on $\Omega^{\mathbb{N}}$ is invariant and ergodic under the shift map $\left(x_{n}\right)_{n\geq 1}\mapsto\left(x_{n+1}\right)_{n\geq 1}$. This is a particular case of the notions discussed above.

The simplest case is that of an independent and identically distributed process which corresponds to the shift map described above. Another important case is that of a Markov chain which is discussed in detail below.

A similar interpretation holds for continuous-time stochastic processes though the construction of the measurable structure of the action is more complicated.

Ergodicity of Markov chains

The dynamical system associated with a Markov chain

Let $S$ be a finite set. A Markov chain on $S$ is defined by a matrix $P\in[0,1]^{S\times S}$, where $P(s_{1},s_{2})$ is the transition probability from $s_{1}$ to $s_{2}$, so for every $s\in S$ we have $\sum_{s'\in S}P(s,s')=1$. A stationary measure for $P$ is a probability measure $\nu$ on $S$ such that $\nu P=\nu$; that is, $\sum_{s'\in S}\nu(s')P(s',s)=\nu(s)$ for all $s\in S$.

Using this data we can define a probability measure $\mu_{\nu}$ on the set $X=S^{\mathbb{Z}}$ with its product σ-algebra by giving the measures of the cylinders as follows: $$\mu_{\nu}(\cdots\times S\times\{(s_{n},\ldots,s_{m})\}\times S\times\cdots)=\nu(s_{n})P(s_{n},s_{n+1})\cdots P(s_{m-1},s_{m}).$$

Stationarity of $\nu$ then means that the measure $\mu_{\nu}$ is invariant under the shift map $T\left(\left(s_{k}\right)_{k\in\mathbb{Z}}\right)=\left(s_{k+1}\right)_{k\in\mathbb{Z}}$.
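The link between stationarity of $\nu$ and shift-invariance of $\mu_{\nu}$ can be illustrated on a small chain. The two-state matrix `P` and the measure `nu` below are hypothetical values picked for the example; the code checks $\nu P=\nu$ exactly and evaluates cylinder measures by the product formula.

```python
from fractions import Fraction

# Hypothetical two-state chain on S = {0, 1}, chosen only for illustration.
P = [[Fraction(1, 2), Fraction(1, 2)],
     [Fraction(1, 4), Fraction(3, 4)]]
nu = [Fraction(1, 3), Fraction(2, 3)]

def is_stationary(nu, P):
    """Check nu P = nu, i.e. sum_{s'} nu(s') P(s', s) = nu(s) for every s."""
    size = len(P)
    return all(sum(nu[s2] * P[s2][s] for s2 in range(size)) == nu[s]
               for s in range(size))

def cylinder_measure(word, nu, P):
    """mu_nu of the cylinder fixing consecutive coordinates to `word`:
    nu(s_n) P(s_n, s_{n+1}) ... P(s_{m-1}, s_m)."""
    mass = nu[word[0]]
    for a, b in zip(word, word[1:]):
        mass *= P[a][b]
    return mass

word = (0, 1, 1)
# Shifting a cylinder frees one more coordinate on the left; summing over its
# possible values gives back the same measure exactly when nu is stationary.
shifted = sum(cylinder_measure((a,) + word, nu, P) for a in range(len(P)))

print(is_stationary(nu, P))
print(cylinder_measure(word, nu, P), shifted)
```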

Criterion for ergodicity

The measure $\mu_{\nu}$ is always ergodic for the shift map if the associated Markov chain is irreducible (any state can be reached with positive probability from any other state in a finite number of steps).[48]

The hypotheses above imply that there is a unique stationary measure for the Markov chain. In terms of the matrix $P$, a sufficient condition for this is that 1 be a simple eigenvalue of $P$ and that all other eigenvalues of $P$ (in $\mathbb{C}$) be of modulus $<1$.
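Irreducibility is a reachability property of the directed graph whose edges are the positive entries of $P$, so it can be tested by a graph search. The sketch below uses hypothetical helper names and three small illustrative matrices.

```python
def reachable(P, start):
    """States reachable from `start` with positive probability in finitely
    many steps (the start state itself is included)."""
    seen, stack = {start}, [start]
    while stack:
        s = stack.pop()
        for t, p in enumerate(P[s]):
            if p > 0 and t not in seen:
                seen.add(t)
                stack.append(t)
    return seen

def is_irreducible(P):
    """Every state can be reached from every other state."""
    return all(len(reachable(P, s)) == len(P) for s in range(len(P)))

identity = [[1, 0], [0, 1]]            # two closed classes
swap = [[0, 1], [1, 0]]                # irreducible but periodic
uniform = [[0.5, 0.5], [0.5, 0.5]]     # independent, identically distributed steps

print(is_irreducible(identity), is_irreducible(swap), is_irreducible(uniform))
```

The test says nothing about aperiodicity: the `swap` chain passes it even though it is periodic.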

Note that in probability theory the Markov chain is called ergodic if in addition each state is aperiodic (the times where the return probability is positive are not multiples of a single integer >1). This is not necessary for the invariant measure to be ergodic; hence the notions of "ergodicity" for a Markov chain and the associated shift-invariant measure are different (the one for the chain is strictly stronger).[49]

Moreover, the criterion is an "if and only if" if all communicating classes in the chain are recurrent and we consider all stationary measures.

Examples

Counting measure

If $P(s,s')=1/|S|$ for all $s,s'\in S$ then the stationary measure $\nu$ is the (normalised) counting measure, and the measure $\mu_{\nu}$ is the product of counting measures. The Markov chain is ergodic, so the shift example from above is a special case of the criterion.

Non-ergodic Markov chains

Markov chains with recurrent communicating classes which are not irreducible are not ergodic, and this can be seen immediately as follows. If $S_{1},S_{2}\subsetneq S$ are two distinct recurrent communicating classes, there are nonzero stationary measures $\nu_{1},\nu_{2}$ supported on $S_{1},S_{2}$ respectively, and the subsets $S_{1}^{\mathbb{Z}}$ and $S_{2}^{\mathbb{Z}}$ are both shift-invariant and of measure 1/2 for the invariant probability measure associated with $\frac{1}{2}(\nu_{1}+\nu_{2})$. A very simple example of that is the chain on $S=\{1,2\}$ given by the matrix $\begin{pmatrix}1&0\\0&1\end{pmatrix}$ (both states are stationary).

A periodic chain

The Markov chain on $S=\{1,2\}$ given by the matrix $\begin{pmatrix}0&1\\1&0\end{pmatrix}$ is irreducible but periodic. Thus it is not ergodic in the sense of Markov chains, though the associated measure $\mu$ on $\{1,2\}^{\mathbb{Z}}$ is ergodic for the shift map. However, the shift is not mixing for this measure, as for the sets $$A=\cdots\times\{1,2\}\times 1\times\{1,2\}\times 1\times\{1,2\}\cdots$$

and $$B=\cdots\times\{1,2\}\times 2\times\{1,2\}\times 2\times\{1,2\}\cdots$$

we have $\mu(A)=\frac{1}{2}=\mu(B)$ but $$\mu\left(T^{-n}A\cap B\right)=\begin{cases}\frac{1}{2}&\text{if }n\text{ is odd}\\0&\text{if }n\text{ is even.}\end{cases}$$
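The alternating values can be reproduced by a direct computation. The sketch below makes a simplifying assumption: it uses the one-coordinate cylinders $A=\{s_{0}=1\}$ and $B=\{s_{0}=2\}$, for which $\mu\left(T^{-n}A\cap B\right)=\nu(2)P^{n}(2,1)$ by the cylinder formula; for this chain they have the same measures as the sets described in the text.

```python
from fractions import Fraction

P = [[0, 1], [1, 0]]                      # states 1 and 2 stored at indices 0 and 1
nu = [Fraction(1, 2), Fraction(1, 2)]     # the unique stationary measure

def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def correlation(n):
    """mu(T^{-n}A ∩ B) = mu(s_0 = 2, s_n = 1) = nu(2) * P^n(2, 1)."""
    Pn = [[1, 0], [0, 1]]
    for _ in range(n):
        Pn = mat_mul(Pn, P)
    return nu[1] * Pn[1][0]

values = [correlation(n) for n in range(1, 7)]
print(values)            # alternates between 1/2 and 0
print(nu[0] * nu[1])     # mu(A) mu(B) = 1/4, which is never approached
```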

Generalisations

The definition of ergodicity also makes sense for group actions. The classical theory (for invertible transformations) corresponds to actions of $\mathbb{Z}$ or $\mathbb{R}$.

For non-abelian groups there might not be invariant measures even on compact metric spaces. However the definition of ergodicity carries over unchanged if one replaces invariant measures by quasi-invariant measures.

Important examples are the action of a semisimple Lie group (or a lattice therein) on its Furstenberg boundary.

A measurable equivalence relation is said to be ergodic if all saturated subsets are either null or conull.

Notes

  1. "Definition of ERGODICITY". www.merriam-webster.com. Retrieved 2026-01-13.
  2. Peter Walters (1982). "preliminaries". An introduction to ergodic theory. p. 2.
  3. Silva; Danilenko (2023). "Ergodicity and mixing properties , Ergodic decomposition". Ergodic theory A Volume in the Encyclopedia of Complexity and Systems Science (Second ed.). p. 45.
  4. Silva; Danilenko (2023). "Ergodicity and mixing properties". Ergodic theory A Volume in the Encyclopedia of Complexity and Systems Science (Second ed.). p. 35.
  5. Schöpf, H.‐G. (January 1970). "V. I. Arnold and A. Avez, Ergodic Problems of Classical Mechanics. (The Mathematical Physics Monograph Series) IX + 286 S. m. Fig. New York/Amsterdam 1968. W. A. Benjamin, Inc. Preis geb. $ 14.75, brosch. $ 6.95". ZAMM - Journal of Applied Mathematics and Mechanics / Zeitschrift für Angewandte Mathematik und Mechanik. 50 (7–9): 506. doi:10.1002/zamm.19700500721. ISSN 0044-2267.
  6. Typically ergodic theory is applied when probability measures don't depend on time or at best they are periodic in time. Namely standard dynamical systems are well defined in such cases, they are deterministic, it is possible to make them discrete and describe time evolution through a shift or transfer operator. Measure theory and stochastic processes are not interchangeable in the general case. Applying ergodic theory to other situations, such as for example chaotic noise, can be considered an open research problem: https://royalsocietypublishing.org/rsif/article/19/189/20220095/90226/Ergodic-descriptors-of-non-ergodic-stochastic
  7. Silva; Danilenko (2023). "Introduction to ergodic theory". Ergodic theory A Volume in the Encyclopedia of Complexity and Systems Science (Second ed.). p. 1.
  8. Silva; Danilenko (2023). "Ergodicity and mixing properties". Ergodic theory A Volume in the Encyclopedia of Complexity and Systems Science (Second ed.). p. 35.
  9. Typically the number of variables of the microstate is much larger than the number of constants of motion, and typically integrability also means being non-chaotic.
  10. The reduced set of effective variables is still too large and there is a closure problem, again hinting at chaotic regimes.
  11. The Saturn system is an n-body problem: the pushes and pulls of the moons have many effects on the rings, and the dynamics is hyperbolic ("cassini mission").
  12. All hyperbolic systems such as billiards typically have non-zero entropy ("scholarpedia dynamic billiards"). The initial intuition behind the Kolmogorov-Sinai entropy was that there were two classes of systems, the probabilistic ones with non-zero entropy and the deterministic ones with zero entropy; this is actually not true, and hyperbolicity is linked to entropy. This also leads to the theory of deterministic chaos, which is "rigid" in structure but still unpredictable; see Yakov Sinai: Now everything has been started? The origin of deterministic chaos, YouTube, The Abel Prize, 7 Feb 2020.
  13. David Ruelle. "Introduction". The thermodynamic formalism. p. 1-2.
  14. Peter Walters (1982). "1.7 mixing". An introduction to ergodic theory.
  15. David Ruelle. "Introduction". The thermodynamic formalism. p. 8.
  16. Achim Klenke, "Probability Theory: A Comprehensive Course" (2013) Springer Universitext ISBN 978-1-4471-5360-3 DOI 10.1007/978-1-4471-5361-0 (See Chapter One)
  17. This is the simple case of coin tosses; more generally there can be one symbol for each available extraction.
  18. The 0. is introduced to explicitly distinguish the time direction, i.e. the order of coin tosses, which here is big-endian: the first toss is the first digit on the left after the dot.
  19. @see the example here
  20. Peter Walters (1982). "preliminaries". An introduction to ergodic theory. p. 2.
  21. Walters 1982,§0.1, p. 2
  22. Gallavotti, Giovanni (1995). "Ergodicity, ensembles, irreversibility in Boltzmann and beyond". Journal of Statistical Physics. 78 (5–6): 1571–1589. arXiv:chao-dyn/9403004. Bibcode:1995JSP....78.1571G. doi:10.1007/BF02180143. S2CID 17605281.
  23. This typically is meaningful in the multidisciplinary scope of dynamical systems in general, note that comparing information theory with let's say ergodic mixing can be non trivial. From the cited article: sibling branches... , families of different nature..., different tools..., common features... Silva; Danilenko (2023). "An introduction to ergodic theory". Ergodic theory A Volume in the Encyclopedia of Complexity and Systems Science (Second ed.). p. 1.
  24. Feller, William (1 August 2008). An Introduction to Probability Theory and Its Applications (2nd ed.). Wiley India Pvt. Limited. p. 271. ISBN 978-81-265-1806-7.
  25. Silva; Danilenko (2023). "Entropy in ergodic theory, Determinism and Zero-Entropy". Ergodic theory A Volume in the Encyclopedia of Complexity and Systems Science (Second ed.). p. 180.
  26. Plancherel, M. (1913). "Beweis der Unmöglichkeit ergodischer mechanischer Systeme". Annalen der Physik. 42: 1061–1063. doi:10.1002/andp.19133471509.
  27. Süzen, M. (2026). "Anomalous diffusion in convergence to effective ergodicity". Physica Scripta. 101 (10) 105913. arXiv:1606.08693. doi:10.1088/1402-4896/ae483e.
  28. Stöckmann, Hans-Jürgen (1999). Quantum Chaos: An Introduction. Cambridge: Cambridge University Press. doi:10.1017/cbo9780511524622. ISBN 978-0-521-02715-1.
  29. Heller, Eric J. (1984-10-15). "Bound-State Eigenfunctions of Classically Chaotic Hamiltonian Systems: Scars of Periodic Orbits". Physical Review Letters. 53 (16): 1515–1518. Bibcode:1984PhRvL..53.1515H. doi:10.1103/PhysRevLett.53.1515.
  30. Kaplan, L (1999-03-01). "Scars in quantum chaotic wavefunctions". Nonlinearity. 12 (2): R1–R40. arXiv:chao-dyn/9810013. doi:10.1088/0951-7715/12/2/009. ISSN 0951-7715. S2CID 250793219.
  31. Kaplan, L.; Heller, E.J. (April 1998). "Linear and Nonlinear Theory of Eigenfunction Scars". Annals of Physics. 264 (2): 171–206. arXiv:chao-dyn/9809011. Bibcode:1998AnPhy.264..171K. doi:10.1006/aphy.1997.5773. S2CID 120635994.
  32. Heller, Eric Johnson (2018). The semiclassical way to dynamics and spectroscopy. Princeton: Princeton University Press. ISBN 978-1-4008-9029-3. OCLC 1034625177.
  33. Keski-Rahkonen, J.; Ruhanen, A.; Heller, E. J.; Räsänen, E. (2019-11-21). "Quantum Lissajous Scars". Physical Review Letters. 123 (21) 214101. arXiv:1911.09729. Bibcode:2019PhRvL.123u4101K. doi:10.1103/PhysRevLett.123.214101. PMID 31809168. S2CID 208248295.
  34. Luukko, Perttu J. J.; Drury, Byron; Klales, Anna; Kaplan, Lev; Heller, Eric J.; Räsänen, Esa (2016-11-28). "Strong quantum scarring by local impurities". Scientific Reports. 6 (1) 37656. arXiv:1511.04198. Bibcode:2016NatSR...637656L. doi:10.1038/srep37656. ISSN 2045-2322. PMC 5124902. PMID 27892510.
  35. Keski-Rahkonen, J.; Luukko, P. J. J.; Kaplan, L.; Heller, E. J.; Räsänen, E. (2017-09-20). "Controllable quantum scars in semiconductor quantum dots". Physical Review B. 96 (9) 094204. arXiv:1710.00585. Bibcode:2017PhRvB..96i4204K. doi:10.1103/PhysRevB.96.094204. S2CID 119083672.
  36. Keski-Rahkonen, J; Luukko, P J J; Åberg, S; Räsänen, E (2019-01-21). "Effects of scarring on quantum chaos in disordered quantum wells". Journal of Physics: Condensed Matter. 31 (10) 105301. arXiv:1806.02598. Bibcode:2019JPCM...31j5301K. doi:10.1088/1361-648x/aaf9fb. ISSN 0953-8984. PMID 30566927. S2CID 51693305.
  37. Keski-Rahkonen, Joonas (2020). Quantum Chaos in Disordered Two-Dimensional Nanostructures. Tampere University. ISBN 978-952-03-1699-0.
  38. Turner, C. J.; Michailidis, A. A.; Abanin, D. A.; Serbyn, M.; Papić, Z. (July 2018). "Weak ergodicity breaking from quantum many-body scars". Nature Physics. 14 (7): 745–749. arXiv:1711.03528. Bibcode:2018NatPh..14..745T. doi:10.1038/s41567-018-0137-5. ISSN 1745-2481. S2CID 256706206.
  39. Aaronson, Jon (1997). An introduction to infinite ergodic theory. Mathematical Surveys and Monographs. Vol. 50. Providence, Rhode Island: American Mathematical Society. p. 21. doi:10.1090/surv/050. ISBN 0-8218-0494-4. MR 1450400.
  40. Walters 1982, p. 32.
  41. Walters 1982, p. 29.
  42. "Example of a measure-preserving system with dense orbits that is not ergodic". MathOverflow. September 1, 2011. Retrieved May 16, 2020.
  43. Walters 1982, p. 152.
  44. Walters 1982, p. 153.
  45. https://arxiv.org/abs/1909.04896
  46. https://ncatlab.org/nlab/show/ergodic+decomposition+theorem
  47. Walters 1982, p. 159.
  48. Walters 1982, p. 42.
  49. "Different uses of the word 'ergodic'". MathOverflow. September 4, 2011. Retrieved May 16, 2020.

References