In probability theory, a compound Poisson distribution is the probability distribution of the sum of a number of independent identically-distributed random variables, where the number of terms to be added is itself a Poisson-distributed variable. The result can be either a continuous or a discrete distribution.
Definition
Suppose that
$$N\sim \operatorname{Poisson}(\lambda),$$
i.e., N is a random variable whose distribution is a Poisson distribution with expected value λ, and that
$$X_{1},X_{2},X_{3},\dots$$
are identically distributed random variables that are mutually independent and also independent of N. Then the probability distribution of the sum of $N$ i.i.d. random variables
$$Y=\sum_{n=1}^{N}X_{n}$$
is a compound Poisson distribution.
If N = 0, the sum has no terms, so the value of Y is 0. Hence the conditional distribution of Y given that N = 0 is a degenerate distribution.
The compound Poisson distribution is obtained by marginalising the joint distribution of (Y,N) over N, and this joint distribution can be obtained by combining the conditional distribution Y | N with the marginal distribution of N.
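The definition translates directly into a sampling procedure: draw N, then add up N independent draws of X. The following is a minimal sketch using only the Python standard library. The function names `sample_poisson` and `sample_compound_poisson` and the exponential choice for X are illustrative assumptions, not part of the definition.

```python
import random

def sample_poisson(lam, rng):
    """Draw N ~ Poisson(lam) by counting unit-rate exponential arrivals in [0, lam]."""
    count = 0
    total = rng.expovariate(1.0)
    while total <= lam:
        count += 1
        total += rng.expovariate(1.0)
    return count

def sample_compound_poisson(lam, draw_x, rng):
    """Draw Y = X_1 + ... + X_N with N ~ Poisson(lam); the empty sum (N = 0) gives 0."""
    n = sample_poisson(lam, rng)
    return sum(draw_x(rng) for _ in range(n))

# Illustrative choice (an assumption): lambda = 2 and X exponential with rate 1.5.
rng = random.Random(0)
samples = [sample_compound_poisson(2.0, lambda r: r.expovariate(1.5), rng)
           for _ in range(5)]
```

Passing the summand sampler as a callable keeps the sketch agnostic about whether X is continuous or discrete, which mirrors the remark that the result can be either kind of distribution.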
Properties
The expected value and the variance of the compound distribution can be derived in a simple way from the law of total expectation and the law of total variance. Thus
$$\operatorname{E}(Y)=\operatorname{E}\left[\operatorname{E}(Y\mid N)\right]=\operatorname{E}\left[N\operatorname{E}(X)\right]=\operatorname{E}(N)\operatorname{E}(X),$$
$$\begin{aligned}\operatorname{Var}(Y)&=\operatorname{E}\left[\operatorname{Var}(Y\mid N)\right]+\operatorname{Var}\left[\operatorname{E}(Y\mid N)\right]=\operatorname{E}\left[N\operatorname{Var}(X)\right]+\operatorname{Var}\left[N\operatorname{E}(X)\right]\\[6pt]&=\operatorname{E}(N)\operatorname{Var}(X)+\left(\operatorname{E}(X)\right)^{2}\operatorname{Var}(N).\end{aligned}$$
Then, since E(N) = Var(N) if N is Poisson-distributed, these formulae can be reduced to
$$\operatorname{E}(Y)=\operatorname{E}(N)\operatorname{E}(X)=\lambda\operatorname{E}(X),$$
$$\operatorname{Var}(Y)=\operatorname{E}(N)\left(\operatorname{Var}(X)+(\operatorname{E}(X))^{2}\right)=\operatorname{E}(N)\operatorname{E}(X^{2})=\lambda\operatorname{E}(X^{2}).$$
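These two identities can be checked numerically when X is integer-valued with finite support, by marginalising over N as in the definition: the distribution of Y is a Poisson-weighted mixture of the n-fold convolutions of the distribution of X. The sketch below truncates the sum over N at `n_max`. The helper names and the example distribution of X are assumptions made for illustration.

```python
import math

def convolve(p, q):
    """Convolution of two pmfs given as lists indexed by value."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def compound_poisson_pmf(lam, x_pmf, n_max):
    """pmf of Y with the sum over N truncated at n_max; x_pmf[k] = P(X = k)."""
    total = [0.0] * ((len(x_pmf) - 1) * n_max + 1)
    conv = [1.0]              # N = 0: point mass at 0
    weight = math.exp(-lam)   # P(N = 0)
    for n in range(n_max + 1):
        for y, prob in enumerate(conv):
            total[y] += weight * prob
        conv = convolve(conv, x_pmf)   # pmf of X_1 + ... + X_{n+1}
        weight *= lam / (n + 1)        # P(N = n + 1)
    return total

# Hypothetical example: X takes the values 1, 2, 3 with probabilities 0.5, 0.3, 0.2.
lam = 1.5
x_pmf = [0.0, 0.5, 0.3, 0.2]
y_pmf = compound_poisson_pmf(lam, x_pmf, n_max=40)

mean_x = sum(k * p for k, p in enumerate(x_pmf))
second_moment_x = sum(k * k * p for k, p in enumerate(x_pmf))
mean_y = sum(y * p for y, p in enumerate(y_pmf))
var_y = sum(y * y * p for y, p in enumerate(y_pmf)) - mean_y ** 2
# mean_y should agree with lam * mean_x, and var_y with lam * second_moment_x,
# up to truncation and rounding error.
```

The truncation level 40 is an arbitrary choice that makes the neglected Poisson tail negligible for this value of λ; a larger λ would need a larger `n_max`.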
The probability distribution of Y can be determined in terms of characteristic functions:
$$\varphi_{Y}(t)=\operatorname{E}(e^{itY})=\operatorname{E}\left(\left(\operatorname{E}(e^{itX}\mid N)\right)^{N}\right)=\operatorname{E}\left((\varphi_{X}(t))^{N}\right),$$
and hence, using the probability-generating function of the Poisson distribution, we have
$$\varphi_{Y}(t)=e^{\lambda(\varphi_{X}(t)-1)}.$$
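As a sanity check on this closed form, the first two moments of Y can be recovered from the characteristic function by finite differences at t = 0 and compared with λE(X) and λE(X²). The sketch below assumes, purely for illustration, that X is exponential with rate β, whose characteristic function is β/(β − it). The function names and the step size `h` are likewise assumptions.

```python
import cmath

def cf_exponential(t, beta):
    """Characteristic function of an exponential distribution with rate beta."""
    return beta / (beta - 1j * t)

def cf_compound_poisson(t, lam, cf_x):
    """phi_Y(t) = exp(lam * (phi_X(t) - 1))."""
    return cmath.exp(lam * (cf_x(t) - 1.0))

lam, beta, h = 2.0, 1.5, 1e-4

def phi(t):
    return cf_compound_poisson(t, lam, lambda s: cf_exponential(s, beta))

# phi'(0) = i E(Y) and phi''(0) = -E(Y^2), approximated by central differences.
mean = ((phi(h) - phi(-h)) / (2 * h)).imag
second_moment = -((phi(h) - 2 * phi(0.0) + phi(-h)) / h ** 2).real
variance = second_moment - mean ** 2
# For exponential X: E(X) = 1/beta and E(X^2) = 2/beta^2, so mean should be
# close to lam/beta and variance close to 2*lam/beta**2.
```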
An alternative approach is via cumulant generating functions:
$$K_{Y}(t)=\ln\operatorname{E}[e^{tY}]=\ln\operatorname{E}[\operatorname{E}[e^{tY}\mid N]]=\ln\operatorname{E}[e^{NK_{X}(t)}]=K_{N}(K_{X}(t)).$$
Via the law of total cumulance it can be shown that, if the mean of the Poisson distribution is λ = 1, the cumulants of Y are the same as the moments of $X_1$.
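The same conclusion can be read off the cumulant generating function, and the calculation also shows what happens for general λ. The sketch below assumes that the moment generating function $M_X(t)=\operatorname{E}[e^{tX}]$ exists in a neighbourhood of 0. The symbols $M_X$ and $\kappa_n$ are notation introduced here for the illustration.

```latex
\begin{aligned}
K_N(s) &= \lambda\,(e^{s}-1), \\
K_Y(t) &= K_N\bigl(K_X(t)\bigr)
        = \lambda\,\bigl(e^{K_X(t)}-1\bigr)
        = \lambda\,\bigl(M_X(t)-1\bigr)
        = \lambda \sum_{n=1}^{\infty} \operatorname{E}(X^{n})\,\frac{t^{n}}{n!}, \\
\kappa_n(Y) &= \lambda\,\operatorname{E}(X^{n}), \qquad n = 1, 2, \dots
\end{aligned}
```

Setting λ = 1 gives the statement about cumulants and moments, and the cases n = 1 and n = 2 reproduce the mean λE(X) and the variance λE(X²).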
Every infinitely divisible probability distribution is a limit of compound Poisson distributions.[1] Moreover, compound Poisson distributions are infinitely divisible by definition.
Discrete compound Poisson distribution
When $X_{1},X_{2},X_{3},\dots$ are positive integer-valued i.i.d. random variables with $P(X_{1}=k)=\alpha_{k}$ $(k=1,2,\ldots)$, this compound Poisson distribution is called a discrete compound Poisson distribution[2][3][4] (or stuttering-Poisson distribution[5]). We say that a discrete random variable $Y$ whose probability generating function satisfies
$$P_{Y}(z)=\sum_{i=0}^{\infty}P(Y=i)z^{i}=\exp\left(\sum_{k=1}^{\infty}\alpha_{k}\lambda(z^{k}-1)\right),\quad (|z|\leq 1)$$
has a discrete compound Poisson (DCP) distribution with parameters $(\alpha_{1}\lambda,\alpha_{2}\lambda,\ldots)\in\mathbb{R}^{\infty}$ (where $\sum_{i=1}^{\infty}\alpha_{i}=1$, with $\alpha_{i}\geq 0$, $\lambda>0$), which is denoted by
$$Y\sim\operatorname{DCP}(\lambda\alpha_{1},\lambda\alpha_{2},\ldots)$$
Moreover, if $X\sim\operatorname{DCP}(\lambda\alpha_{1},\ldots,\lambda\alpha_{r})$, we say $X$ has a discrete compound Poisson distribution of order $r$. When $r=1,2$, the DCP becomes the Poisson distribution and the Hermite distribution, respectively. When $r=3,4$, the DCP becomes the triple stuttering-Poisson distribution and the quadruple stuttering-Poisson distribution, respectively.[6] Other special cases include the shifted geometric distribution, the negative binomial distribution, the geometric Poisson distribution, the Neyman type A distribution, and the Luria–Delbrück distribution in the Luria–Delbrück experiment. For more special cases of the DCP, see the review paper[7] and references therein.
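The probability generating function characterization of the DCP distribution yields a simple recursion for the probabilities. Differentiating the generating function and comparing coefficients of $z^{n-1}$ gives $P(Y=0)=e^{-\lambda}$ and $n\,P(Y=n)=\sum_{k=1}^{n}k\,\lambda\alpha_{k}\,P(Y=n-k)$. The following minimal sketch implements this recursion for a finite-order DCP. The function name `dcp_pmf` and the convention that `rates[k-1]` holds $\lambda\alpha_k$ are assumptions made for the example.

```python
import math

def dcp_pmf(rates, n_max):
    """Return [P(Y=0), ..., P(Y=n_max)] for Y ~ DCP(rates[0], ..., rates[r-1]).

    rates[k-1] is lambda * alpha_k, so sum(rates) equals lambda.
    """
    lam = sum(rates)
    p = [math.exp(-lam)]
    for n in range(1, n_max + 1):
        s = sum(k * rates[k - 1] * p[n - k]
                for k in range(1, min(n, len(rates)) + 1))
        p.append(s / n)
    return p

# Order 1 reduces to the Poisson distribution; order 2 is the Hermite case.
poisson_probs = dcp_pmf([2.0], 6)
hermite_probs = dcp_pmf([0.7, 0.4], 60)
```

With a single rate the recursion collapses to $P(Y=n)=\lambda P(Y=n-1)/n$, which is the usual Poisson recursion, consistent with the order-1 case described above.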
Feller's characterization of the compound Poisson distribution states that a non-negative integer-valued random variable $X$ is infinitely divisible if and only if its distribution is a discrete compound Poisson distribution.[8] The negative binomial distribution is discrete infinitely divisible, i.e., if X has a negative binomial distribution, then for any positive integer n, there exist discrete i.i.d. random variables $X_{1},\ldots,X_{n}$ whose sum has the same distribution as X. The shifted geometric distribution is a discrete compound Poisson distribution since it is a special case of the negative binomial distribution.
This distribution can model batch arrivals (such as in a bulk queue[5][9]). The discrete compound Poisson distribution is also widely used in actuarial science for modelling the distribution of the total claim amount.[3]
When some $\alpha_{k}$ are negative, the distribution is called a discrete pseudo compound Poisson distribution.[3] We say that any discrete random variable $Y$ whose probability generating function satisfies
$$G_{Y}(z)=\sum_{i=0}^{\infty}P(Y=i)z^{i}=\exp\left(\sum_{k=1}^{\infty}\alpha_{k}\lambda(z^{k}-1)\right),\quad (|z|\leq 1)$$
has a discrete pseudo compound Poisson distribution with parameters $(\lambda_{1},\lambda_{2},\ldots)=:(\alpha_{1}\lambda,\alpha_{2}\lambda,\ldots)\in\mathbb{R}^{\infty}$, where $\sum_{i=1}^{\infty}\alpha_{i}=1$ and $\sum_{i=1}^{\infty}\left|\alpha_{i}\right|<\infty$, with $\alpha_{i}\in\mathbb{R}$, $\lambda>0$.
Compound Poisson Gamma distribution
If X has a gamma distribution, of which the exponential distribution is a special case, then the conditional distribution of Y | N is again a gamma distribution. The marginal distribution of Y is a Tweedie distribution with variance power 1 < p < 2 (the proof is by comparison of characteristic functions).[10] To be more explicit, if
$$N\sim\operatorname{Poisson}(\lambda),$$
and
$$X_{i}\sim\Gamma(\alpha,\beta)$$
i.i.d., then the distribution of
$$Y=\sum_{i=1}^{N}X_{i}$$
is a reproductive exponential dispersion model $ED(\mu,\sigma^{2})$
with
$$\begin{aligned}\operatorname{E}[Y]&=\lambda\frac{\alpha}{\beta}=:\mu,\\[4pt]\operatorname{Var}[Y]&=\lambda\frac{\alpha(1+\alpha)}{\beta^{2}}=:\sigma^{2}\mu^{p}.\end{aligned}$$
The mapping from the Tweedie parameters $\mu,\sigma^{2},p$ to the Poisson and gamma parameters $\lambda,\alpha,\beta$ is the following:
$$\begin{aligned}\lambda&=\frac{\mu^{2-p}}{(2-p)\sigma^{2}},\\[4pt]\alpha&=\frac{2-p}{p-1},\\[4pt]\beta&=\frac{\mu^{1-p}}{(p-1)\sigma^{2}}.\end{aligned}$$
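The mapping can be written as a small function and checked against the moment formulas E[Y] = λα/β and Var[Y] = λα(1+α)/β². This is a minimal sketch. The function name `tweedie_to_poisson_gamma` and the example parameter values are assumptions chosen for illustration.

```python
def tweedie_to_poisson_gamma(mu, sigma2, p):
    """Map Tweedie parameters (mu, sigma^2, p), 1 < p < 2, to (lambda, alpha, beta)."""
    if not 1.0 < p < 2.0:
        raise ValueError("the compound Poisson-gamma case requires 1 < p < 2")
    lam = mu ** (2 - p) / ((2 - p) * sigma2)
    alpha = (2 - p) / (p - 1)
    beta = mu ** (1 - p) / ((p - 1) * sigma2)
    return lam, alpha, beta

# Hypothetical parameter values, used only to check the round trip.
mu, sigma2, p = 2.0, 0.5, 1.5
lam, alpha, beta = tweedie_to_poisson_gamma(mu, sigma2, p)
mean_y = lam * alpha / beta                     # should recover mu
var_y = lam * alpha * (1 + alpha) / beta ** 2   # should recover sigma2 * mu**p
```

The check works because 1 + α = 1/(p − 1), so the factors of (2 − p) and (p − 1) cancel in both moments.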
Compound Poisson processes
A compound Poisson process with rate $\lambda>0$ and jump size distribution G is a continuous-time stochastic process $\{\,Y(t):t\geq 0\,\}$ given by
$$Y(t)=\sum_{i=1}^{N(t)}D_{i},$$
where the sum is by convention equal to zero as long as N(t) = 0. Here, $\{\,N(t):t\geq 0\,\}$ is a Poisson process with rate $\lambda$, and $\{\,D_{i}:i\geq 1\,\}$ are independent and identically distributed random variables, with distribution function G, which are also independent of $\{\,N(t):t\geq 0\,\}$.[11]
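A sample path of such a process can be sketched by generating the jump times of the Poisson process from exponential inter-arrival times and adding one jump $D_i$ at each of them. The code below is a minimal illustration using only the standard library. The function names, the time horizon argument, and the exponential jump sizes in the example are assumptions.

```python
import bisect
import random

def simulate_compound_poisson_process(rate, draw_jump, horizon, rng):
    """Return the jump times in [0, horizon] and the value of Y just after each jump."""
    times, values = [], []
    t, y = 0.0, 0.0
    while True:
        t += rng.expovariate(rate)   # exponential inter-arrival time of N(t)
        if t > horizon:
            break
        y += draw_jump(rng)          # add the jump D_i
        times.append(t)
        values.append(y)
    return times, values

def value_at(times, values, t):
    """Y(t): zero before the first jump, else the value after the last jump at or before t."""
    idx = bisect.bisect_right(times, t)
    return values[idx - 1] if idx > 0 else 0.0

# Illustrative run (assumed parameters): rate 3, exponential jumps with rate 2.
rng = random.Random(0)
times, values = simulate_compound_poisson_process(
    3.0, lambda r: r.expovariate(2.0), 10.0, rng)
y_mid = value_at(times, values, 5.0)
```

Storing only the jump times and post-jump values is enough to evaluate the path anywhere, since the process is piecewise constant between jumps.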
The discrete version of the compound Poisson process can be used in survival analysis for frailty models.[12]
Applications
A compound Poisson distribution, in which the summands have an exponential distribution, was used by Revfeim to model the distribution of the total rainfall in a day, where each day contains a Poisson-distributed number of events each of which provides an amount of rainfall which has an exponential distribution.[13] Thompson applied the same model to monthly total rainfalls.[14]
There have been applications to insurance claims[15][16] and x-ray computed tomography.[17][18][19]
References
- Lukacs, E. (1970). Characteristic functions. London: Griffin. ISBN 0-85264-170-2.
- Johnson, N.L., Kemp, A.W., and Kotz, S. (2005) Univariate Discrete Distributions, 3rd Edition, Wiley, ISBN 978-0-471-27246-5.
- Huiming, Zhang; Yunxiao Liu; Bo Li (2014). "Notes on discrete compound Poisson model with applications to risk theory". Insurance: Mathematics and Economics. 59: 325–336. doi:10.1016/j.insmatheco.2014.09.012.
- Huiming, Zhang; Bo Li (2016). "Characterizations of discrete compound Poisson distributions". Communications in Statistics - Theory and Methods. 45 (22): 6789–6802. doi:10.1080/03610926.2014.901375. S2CID 125475756.
- Kemp, C. D. (1967). "'Stuttering – Poisson' distributions". Journal of the Statistical and Social Enquiry of Ireland. 21 (5): 151–157. hdl:2262/6987.
- Patel, Y. C. (1976). "Estimation of the parameters of the triple and quadruple stuttering-Poisson distributions". Technometrics. 18 (1): 67–73.
- Wimmer, G.; Altmann, G. (1996). "The multiple Poisson distribution, its characteristics and a variety of forms". Biometrical Journal. 38 (8): 995–1011.
- Feller, W. (1968). An Introduction to Probability Theory and its Applications. Vol. I (3rd ed.). New York: Wiley.
- Adelson, R. M. (1966). "Compound Poisson Distributions". Journal of the Operational Research Society. 17 (1): 73–75. doi:10.1057/jors.1966.8.
- Jørgensen, Bent (1997). The theory of dispersion models. Chapman & Hall. ISBN 978-0412997112.
- S. M. Ross (2007). Introduction to Probability Models (ninth ed.). Boston: Academic Press. ISBN 978-0-12-598062-3.
- Ata, N.; Özel, G. (2013). "Survival functions for the frailty models based on the discrete compound Poisson process". Journal of Statistical Computation and Simulation. 83 (11): 2105–2116. doi:10.1080/00949655.2012.679943. S2CID 119851120.
- Revfeim, K. J. A. (1984). "An initial model of the relationship between rainfall events and daily rainfalls". Journal of Hydrology. 75 (1–4): 357–364. Bibcode:1984JHyd...75..357R. doi:10.1016/0022-1694(84)90059-3.
- Thompson, C. S. (1984). "Homogeneity analysis of a rainfall series: an application of the use of a realistic rainfall model". Journal of Climatology. 4 (6): 609–619. Bibcode:1984IJCli...4..609T. doi:10.1002/joc.3370040605.
- Jørgensen, Bent; Paes De Souza, Marta C. (January 1994). "Fitting Tweedie's compound poisson model to insurance claims data". Scandinavian Actuarial Journal. 1994 (1): 69–93. doi:10.1080/03461238.1994.10413930.
- Smyth, Gordon K.; Jørgensen, Bent (29 August 2014). "Fitting Tweedie's Compound Poisson Model to Insurance Claims Data: Dispersion Modelling". ASTIN Bulletin. 32 (1): 143–157. doi:10.2143/AST.32.1.1020.
- Whiting, Bruce R. (3 May 2002). Antonuk, Larry E.; Yaffe, Martin J. (eds.). "Signal statistics in x-ray computed tomography". Medical Imaging 2002: Physics of Medical Imaging. 4682. International Society for Optics and Photonics: 53–60. Bibcode:2002SPIE.4682...53W. doi:10.1117/12.465601. S2CID 116487704.
- Elbakri, Idris A.; Fessler, Jeffrey A. (16 May 2003). Sonka, Milan; Fitzpatrick, J. Michael (eds.). "Efficient and accurate likelihood for iterative image reconstruction in x-ray computed tomography". Medical Imaging 2003: Image Processing. 5032. SPIE: 1839–1850. Bibcode:2003SPIE.5032.1839E. CiteSeerX 10.1.1.419.3752. doi:10.1117/12.480302. S2CID 12215253.
- Whiting, Bruce R.; Massoumzadeh, Parinaz; Earl, Orville A.; O'Sullivan, Joseph A.; Snyder, Donald L.; Williamson, Jeffrey F. (24 August 2006). "Properties of preprocessed sinogram data in x-ray computed tomography". Medical Physics. 33 (9): 3290–3303. Bibcode:2006MedPh..33.3290W. doi:10.1118/1.2230762. PMID 17022224.