Subexponential distribution (light-tailed)

In probability theory, in the context of light-tailed distributions, one definition of a subexponential distribution is a probability distribution whose tails decay at an exponential rate or faster. A real-valued distribution $\mathcal{D}$ is called subexponential if, for a random variable $X \sim \mathcal{D}$,

$\mathbb{P}(|X| \geq x) = O(e^{-Kx})$, for large $x$ and some constant $K > 0$.

Note that this is almost the opposite of the more established meaning of subexponential in the context of heavy-tailed distributions. There, "sub" means that the tail decays more slowly than an exponential; here, it means that the tail is at least as light as an exponential.

The subexponential norm $\|\cdot\|_{\psi_1}$ of a random variable is defined by

$\|X\|_{\psi_1} := \inf\{K > 0 \mid \mathbb{E}(e^{|X|/K}) \leq 2\},$ where the infimum is taken to be $+\infty$ if no such $K$ exists.

This is an example of an Orlicz norm. An equivalent condition for a distribution $\mathcal{D}$ to be subexponential is then that $\|X\|_{\psi_1} < \infty$.[1]:§2.7
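As an illustration of the norm's definition, the following minimal sketch computes $\|X\|_{\psi_1}$ numerically for an exponentially distributed $X$. It uses the closed form $\mathbb{E}(e^{t|X|}) = r/(r-t)$ for $t < r$, where $r$ is the rate. The helper names `mgf_abs_exponential` and `psi1_norm`, the bisection bounds, and the iteration count are assumptions made for this example and do not come from the cited reference.

```python
import math


def mgf_abs_exponential(t, rate=1.0):
    """E[exp(t*|X|)] for X ~ Exponential(rate); infinite when t >= rate."""
    if t >= rate:
        return math.inf
    return rate / (rate - t)


def psi1_norm(mgf_abs, lo=1e-6, hi=1e6, iters=200):
    """Bisect for the smallest K in [lo, hi] with mgf_abs(1/K) <= 2.

    Relies on mgf_abs being non-decreasing in t, so the condition
    E[exp(|X|/K)] <= 2 is monotone in K.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mgf_abs(1.0 / mid) <= 2.0:
            hi = mid  # mid satisfies the constraint; try a smaller K
        else:
            lo = mid  # mid is too small
    return hi


norm_exp = psi1_norm(mgf_abs_exponential)
print(norm_exp)
```

For rate $r$, the condition $r/(r - 1/K) \leq 2$ holds exactly when $K \geq 2/r$, so the bisection should return a value close to $2/r$ (close to 2 for rate 1).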

Subexponentiality can also be expressed in the following equivalent ways:[1]:§2.7

  1. $\mathbb{P}(|X| \geq x) \leq 2e^{-Kx}$ for all $x \geq 0$ and some constant $K > 0$.
  2. $\mathbb{E}(|X|^p)^{1/p} \leq Kp$ for all $p \geq 1$ and some constant $K > 0$.
  3. For some constant $K > 0$, $\mathbb{E}(e^{\lambda |X|}) \leq e^{K\lambda}$ for all $0 \leq \lambda \leq 1/K$.
  4. $\mathbb{E}(X)$ exists and, for some constant $K > 0$, $\mathbb{E}(e^{\lambda (X - \mathbb{E}(X))}) \leq e^{K^2 \lambda^2}$ for all $-1/K \leq \lambda \leq 1/K$.
  5. $\sqrt{|X|}$ is sub-Gaussian.
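The moment condition (item 2) can be checked numerically in simple cases. The sketch below evaluates the ratio $\mathbb{E}(|X|^p)^{1/p} / p$ on a grid of $p$ for two examples chosen here for illustration: $X \sim \mathrm{Exponential}(1)$, where $\mathbb{E}(|X|^p) = \Gamma(p+1)$, and $X = Z^2$ with $Z$ standard normal, where $\mathbb{E}(|X|^p) = 2^p\,\Gamma(p + 1/2)/\sqrt{\pi}$. In the second example $\sqrt{|X|} = |Z|$ is sub-Gaussian, as in item 5. The function names and the grid are assumptions, and a finite grid only illustrates the bound rather than proving it.

```python
import math


def ratio_exponential(p):
    """(E|X|^p)^(1/p) / p for X ~ Exponential(1), via E|X|^p = Gamma(p + 1)."""
    return math.exp(math.lgamma(p + 1.0) / p) / p


def ratio_squared_normal(p):
    """(E|X|^p)^(1/p) / p for X = Z^2, Z standard normal.

    Uses E|Z|^(2p) = 2^p * Gamma(p + 1/2) / sqrt(pi).
    """
    log_moment = p * math.log(2.0) + math.lgamma(p + 0.5) - 0.5 * math.log(math.pi)
    return math.exp(log_moment / p) / p


# Grid of exponents p = 1, 1.25, ..., 40 (an arbitrary choice for illustration).
ps = [1.0 + 0.25 * i for i in range(157)]

worst_exp = max(ratio_exponential(p) for p in ps)
worst_sq = max(ratio_squared_normal(p) for p in ps)
print(worst_exp, worst_sq)
```

A ratio that stays bounded over the grid is consistent with the existence of a constant $K$ in item 2. For the exponential case, $\Gamma(p+1) \leq p^p$ for $p \geq 1$, so $K = 1$ suffices.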

References

  1. High-Dimensional Probability: An Introduction with Applications in Data Science, Roman Vershynin, University of California, Irvine, June 9, 2020

Further reading

  • High-Dimensional Statistics: A Non-Asymptotic Viewpoint, Martin J. Wainwright, Cambridge University Press, 2019, ISBN 9781108498029.