Conway–Maxwell–binomial distribution

Discrete probability distribution
Conway–Maxwell–binomial
Parameters n { 1 , 2 , } , {\displaystyle n\in \{1,2,\ldots \},} 0 p 1 , {\displaystyle 0\leq p\leq 1,} < ν < {\displaystyle -\infty <\nu <\infty }
Support x { 0 , 1 , 2 , , n } {\displaystyle x\in \{0,1,2,\dots ,n\}}
PMF 1 C n , p , ν ( n x ) ν p x ( 1 p ) n x {\displaystyle {\frac {1}{C_{n,p,\nu }}}{\binom {n}{x}}^{\nu }p^{x}(1-p)^{n-x}}
CDF i = 0 x Pr ( X = i ) {\displaystyle \sum _{i=0}^{x}\Pr(X=i)}
Mean Not listed
Median No closed form
Mode See text
Variance Not listed
Skewness Not listed
Excess kurtosis Not listed
Entropy Not listed
MGF See text
CF See text

In probability theory and statistics, the Conway–Maxwell–binomial (CMB) distribution is a three-parameter discrete probability distribution that generalises the binomial distribution in an analogous manner to the way that the Conway–Maxwell–Poisson distribution generalises the Poisson distribution. The CMB distribution can be used to model both positive and negative association among the Bernoulli summands.[1][2]

The distribution was introduced by Shmueli et al. (2005),[1] and the name Conway–Maxwell–binomial distribution was introduced independently by Kadane (2016)[2] and Daly and Gaunt (2016).[3]

Probability mass function

The Conway–Maxwell–binomial (CMB) distribution has probability mass function

Pr ( Y = j ) = 1 C n , p , ν ( n j ) ν p j ( 1 p ) n j , j { 0 , 1 , , n } , {\displaystyle \Pr(Y=j)={\frac {1}{C_{n,p,\nu }}}{\binom {n}{j}}^{\nu }p^{j}(1-p)^{n-j}\,,\qquad j\in \{0,1,\ldots ,n\},}

where n N = { 1 , 2 , } {\displaystyle n\in \mathbb {N} =\{1,2,\ldots \}} , 0 p 1 {\displaystyle 0\leq p\leq 1} and < ν < {\displaystyle -\infty <\nu <\infty } . The normalizing constant C n , p , ν {\displaystyle C_{n,p,\nu }} is defined by

C n , p , ν = i = 0 n ( n i ) ν p i ( 1 p ) n i . {\displaystyle C_{n,p,\nu }=\sum _{i=0}^{n}{\binom {n}{i}}^{\nu }p^{i}(1-p)^{n-i}.}

If a random variable Y {\displaystyle Y} has the above mass function, then we write Y CMB ( n , p , ν ) {\displaystyle Y\sim \operatorname {CMB} (n,p,\nu )} .

The case ν = 1 {\displaystyle \nu =1} is the usual binomial distribution Y Bin ( n , p ) {\displaystyle Y\sim \operatorname {Bin} (n,p)} .
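The definition can be made concrete with a short Python sketch that evaluates the mass function directly from the normalizing constant; the function name `cmb_pmf` is illustrative, not a standard library routine.

```python
from math import comb

def cmb_pmf(j, n, p, nu):
    """PMF of CMB(n, p, nu): proportional to comb(n, j)**nu * p**j * (1-p)**(n-j)."""
    weights = [comb(n, i) ** nu * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    return weights[j] / sum(weights)

# With nu = 1 the weights are exactly the binomial probabilities,
# so the pmf reduces to Bin(n, p).
```

For moderate n the direct summation is numerically stable; for very large n one would work with logarithms of the weights instead.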

Relation to Conway–Maxwell–Poisson distribution

The following relationship between Conway–Maxwell–Poisson (CMP) and CMB random variables [1] generalises a well-known result concerning Poisson and binomial random variables. If X 1 CMP ( λ 1 , ν ) {\displaystyle X_{1}\sim \operatorname {CMP} (\lambda _{1},\nu )} and X 2 CMP ( λ 2 , ν ) {\displaystyle X_{2}\sim \operatorname {CMP} (\lambda _{2},\nu )} are independent, then X 1 | X 1 + X 2 = n CMB ( n , λ 1 / ( λ 1 + λ 2 ) , ν ) {\displaystyle X_{1}\,|\,X_{1}+X_{2}=n\sim \operatorname {CMB} (n,\lambda _{1}/(\lambda _{1}+\lambda _{2}),\nu )} .
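This conditional relationship can be checked numerically. The sketch below (helper names are ours) computes the conditional law of a CMP variable given the sum, using the unnormalised CMP weights λ^j/(j!)^ν, and compares it with the CMB pmf.

```python
from math import comb, factorial

def cmp_weight(j, lam, nu):
    # Unnormalised CMP(lam, nu) mass: lam**j / (j!)**nu.
    return lam ** j / factorial(j) ** nu

def conditional_pmf(j, n, lam1, lam2, nu):
    # P(X1 = j | X1 + X2 = n) for independent CMP(lam1, nu) and CMP(lam2, nu);
    # the CMP normalizing constants cancel in the conditioning.
    weights = [cmp_weight(i, lam1, nu) * cmp_weight(n - i, lam2, nu)
               for i in range(n + 1)]
    return weights[j] / sum(weights)

def cmb_pmf(j, n, p, nu):
    weights = [comb(n, i) ** nu * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    return weights[j] / sum(weights)
```

Conditioning leaves only λ1^j λ2^(n−j) / (j!(n−j)!)^ν, which after normalization is exactly the CMB(n, λ1/(λ1+λ2), ν) mass.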

Sum of possibly associated Bernoulli random variables

The random variable Y CMB ( n , p , ν ) {\displaystyle Y\sim \operatorname {CMB} (n,p,\nu )} may be written [1] as a sum of exchangeable Bernoulli random variables Z 1 , , Z n {\displaystyle Z_{1},\ldots ,Z_{n}} satisfying

Pr ( Z 1 = z 1 , , Z n = z n ) = 1 C n , p , ν ( n k ) ν 1 p k ( 1 p ) n k , {\displaystyle \Pr(Z_{1}=z_{1},\ldots ,Z_{n}=z_{n})={\frac {1}{C_{n,p,\nu }}}{\binom {n}{k}}^{\nu -1}p^{k}(1-p)^{n-k},}

where k = z 1 + + z n {\displaystyle k=z_{1}+\cdots +z_{n}} . Note that E Z 1 p {\displaystyle \operatorname {E} Z_{1}\not =p} in general, unless ν = 1 {\displaystyle \nu =1} .
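For small n the joint distribution can be enumerated over all binary vectors, which both confirms that it sums to one and illustrates that the marginal mean of each summand differs from p when ν ≠ 1. The parameter values below are arbitrary.

```python
from itertools import product
from math import comb

n, p, nu = 4, 0.3, 2.0
C = sum(comb(n, i) ** nu * p ** i * (1 - p) ** (n - i) for i in range(n + 1))

def joint(z):
    # Joint mass of the exchangeable Bernoulli vector (Z_1, ..., Z_n):
    # depends on z only through k = sum(z).
    k = sum(z)
    return comb(n, k) ** (nu - 1) * p ** k * (1 - p) ** (n - k) / C

total = sum(joint(z) for z in product((0, 1), repeat=n))
EZ1 = sum(z[0] * joint(z) for z in product((0, 1), repeat=n))
# total is 1, but EZ1 differs from p here because nu != 1.
```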

Generating functions

Let

T ( x , ν ) = k = 0 n x k ( n k ) ν . {\displaystyle T(x,\nu )=\sum _{k=0}^{n}x^{k}{\binom {n}{k}}^{\nu }.}

Then, the probability generating function, moment generating function and characteristic function are given, respectively, by:[2]

G ( t ) = T ( t p / ( 1 p ) , ν ) T ( p / ( 1 p ) , ν ) , {\displaystyle G(t)={\frac {T(tp/(1-p),\nu )}{T(p/(1-p),\nu )}},}
M ( t ) = T ( e t p / ( 1 p ) , ν ) T ( p / ( 1 p ) , ν ) , {\displaystyle M(t)={\frac {T(\mathrm {e} ^{t}p/(1-p),\nu )}{T(p/(1-p),\nu )}},}
φ ( t ) = T ( e i t p / ( 1 p ) , ν ) T ( p / ( 1 p ) , ν ) . {\displaystyle \varphi (t)={\frac {T(\mathrm {e} ^{\mathrm {i} t}p/(1-p),\nu )}{T(p/(1-p),\nu )}}.}
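The probability generating function formula can be cross-checked against the direct definition E[t^Y]: multiplying numerator and denominator of G(t) by (1−p)^n recovers the pmf weights. A small sketch (our own function names):

```python
from math import comb

def T(x, n, nu):
    # T(x, nu) = sum_k x**k * comb(n, k)**nu, for fixed n.
    return sum(x ** k * comb(n, k) ** nu for k in range(n + 1))

def pgf(t, n, p, nu):
    # G(t) = T(t*p/(1-p), nu) / T(p/(1-p), nu).
    return T(t * p / (1 - p), n, nu) / T(p / (1 - p), n, nu)

def pgf_direct(t, n, p, nu):
    # E[t**Y] computed straight from the pmf.
    w = [comb(n, k) ** nu * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    return sum(t ** k * w[k] for k in range(n + 1)) / sum(w)
```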

Moments

For general ν {\displaystyle \nu } , there do not exist closed form expressions for the moments of the CMB distribution. The following neat formula is available, however.[3] Let ( j ) r = j ( j 1 ) ( j r + 1 ) {\displaystyle (j)_{r}=j(j-1)\cdots (j-r+1)} denote the falling factorial. Let Y CMB ( n , p , ν ) {\displaystyle Y\sim \operatorname {CMB} (n,p,\nu )} , where ν > 0 {\displaystyle \nu >0} . Then

E [ ( ( Y ) r ) ν ] = C n r , p , ν C n , p , ν ( ( n ) r ) ν p r , {\displaystyle \operatorname {E} [((Y)_{r})^{\nu }]={\frac {C_{n-r,p,\nu }}{C_{n,p,\nu }}}((n)_{r})^{\nu }p^{r}\,,}

for r = 1 , , n 1 {\displaystyle r=1,\ldots ,n-1} .
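The factorial-moment identity can be verified numerically by comparing both sides term by term; the sketch below uses our own helper names and arbitrary parameter values.

```python
from math import comb

def C(n, p, nu):
    # Normalizing constant C_{n,p,nu}.
    return sum(comb(n, i) ** nu * p ** i * (1 - p) ** (n - i) for i in range(n + 1))

def falling(j, r):
    # Falling factorial (j)_r = j (j-1) ... (j-r+1).
    out = 1
    for i in range(r):
        out *= j - i
    return out

def lhs(n, p, nu, r):
    # E[((Y)_r)^nu] computed directly from the pmf.
    return sum(falling(j, r) ** nu * comb(n, j) ** nu * p ** j * (1 - p) ** (n - j)
               for j in range(n + 1)) / C(n, p, nu)

def rhs(n, p, nu, r):
    # The closed-form expression C_{n-r,p,nu}/C_{n,p,nu} * ((n)_r)^nu * p^r.
    return C(n - r, p, nu) / C(n, p, nu) * falling(n, r) ** nu * p ** r
```

The identity follows from (j)_r C(n, j) = (n)_r C(n−r, j−r), which reindexes the sum on the left into the normalizing constant C_{n−r,p,ν}.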

Mode

Let Y CMB ( n , p , ν ) {\displaystyle Y\sim \operatorname {CMB} (n,p,\nu )} and define

a = n + 1 1 + ( 1 p p ) 1 / ν . {\displaystyle a={\frac {n+1}{1+\left({\frac {1-p}{p}}\right)^{1/\nu }}}.}

Then the mode of Y {\displaystyle Y} is a {\displaystyle \lfloor a\rfloor } if a {\displaystyle a} is not an integer. Otherwise, the modes of Y {\displaystyle Y} are a {\displaystyle a} and a 1 {\displaystyle a-1} .[3]
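The mode formula can be checked against a brute-force argmax of the (unnormalised) pmf; the function names below are illustrative.

```python
from math import comb, floor

def cmb_mode(n, p, nu):
    # Mode via the closed form a = (n + 1) / (1 + ((1 - p) / p)**(1 / nu)),
    # assuming a is not an integer (the generic case).
    a = (n + 1) / (1 + ((1 - p) / p) ** (1 / nu))
    return floor(a)

def cmb_argmax(n, p, nu):
    # Mode found directly as the argmax of the unnormalised pmf weights.
    w = [comb(n, k) ** nu * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    return max(range(n + 1), key=w.__getitem__)
```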

Stein characterisation

Let Y CMB ( n , p , ν ) {\displaystyle Y\sim \operatorname {CMB} (n,p,\nu )} , and suppose that f : Z + R {\displaystyle f:\mathbb {Z} ^{+}\rightarrow \mathbb {R} } is such that E | f ( Y + 1 ) | < {\displaystyle \operatorname {E} |f(Y+1)|<\infty } and E | Y ν f ( Y ) | < {\displaystyle \operatorname {E} |Y^{\nu }f(Y)|<\infty } . Then[3]

E [ p ( n Y ) ν f ( Y + 1 ) ( 1 p ) Y ν f ( Y ) ] = 0. {\displaystyle \operatorname {E} [p(n-Y)^{\nu }f(Y+1)-(1-p)Y^{\nu }f(Y)]=0.}
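The characterising identity can be verified for particular test functions by computing the expectation explicitly over the finite support; the helper name below is ours.

```python
from math import comb

def stein_expectation(f, n, p, nu):
    # E[p (n - Y)^nu f(Y + 1) - (1 - p) Y^nu f(Y)] under Y ~ CMB(n, p, nu);
    # the Stein characterisation says this is zero for suitable f.
    w = [comb(n, k) ** nu * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    Z = sum(w)
    return sum((p * (n - y) ** nu * f(y + 1) - (1 - p) * y ** nu * f(y)) * w[y]
               for y in range(n + 1)) / Z
```

The cancellation rests on (n−y) C(n, y) = (y+1) C(n, y+1), which shifts the first sum onto the second.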

Approximation by the Conway–Maxwell–Poisson distribution

Fix λ > 0 {\displaystyle \lambda >0} and ν > 0 {\displaystyle \nu >0} and let Y n C M B ( n , λ / n ν , ν ) {\displaystyle Y_{n}\sim \mathrm {CMB} (n,\lambda /n^{\nu },\nu )} . Then Y n {\displaystyle Y_{n}} converges in distribution to the C M P ( λ , ν ) {\displaystyle \mathrm {CMP} (\lambda ,\nu )} distribution as n {\displaystyle n\rightarrow \infty } .[3] This result generalises the classical Poisson approximation of the binomial distribution.
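The convergence can be observed numerically. Because the binomial coefficients raised to ν overflow floating point for large n, the CMB pmf below is evaluated in log space via `lgamma`; the CMP normalizing series is truncated, which is adequate for small λ. All names and the truncation level are our own choices.

```python
from math import factorial, lgamma, exp, log

def cmb_pmf(j, n, p, nu):
    # CMB pmf computed in log space to avoid overflow for large n.
    def logw(i):
        logbinom = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        return nu * logbinom + i * log(p) + (n - i) * log(1 - p)
    m = max(logw(i) for i in range(n + 1))
    weights = [exp(logw(i) - m) for i in range(n + 1)]
    return weights[j] / sum(weights)

def cmp_pmf(j, lam, nu, trunc=50):
    # CMP(lam, nu) pmf with the normalizing series truncated at `trunc` terms.
    w = [lam ** i / factorial(i) ** nu for i in range(trunc)]
    return w[j] / sum(w)

lam, nu = 2.0, 1.5
# cmb_pmf(j, n, lam / n**nu, nu) approaches cmp_pmf(j, lam, nu) as n grows.
```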

Conway–Maxwell–Poisson binomial distribution

Let X 1 , , X n {\displaystyle X_{1},\ldots ,X_{n}} be Bernoulli random variables with joint distribution given by

Pr ( X 1 = x 1 , , X n = x n ) = 1 C n ( n k ) ν 1 j = 1 n p j x j ( 1 p j ) 1 x j , {\displaystyle \Pr(X_{1}=x_{1},\ldots ,X_{n}=x_{n})={\frac {1}{C_{n}'}}{\binom {n}{k}}^{\nu -1}\prod _{j=1}^{n}p_{j}^{x_{j}}(1-p_{j})^{1-x_{j}},}

where k = x 1 + + x n {\displaystyle k=x_{1}+\cdots +x_{n}} and the normalizing constant C n {\displaystyle C_{n}^{\prime }} is given by

C n = k = 0 n ( n k ) ν 1 A F k i A p i j A c ( 1 p j ) , {\displaystyle C_{n}'=\sum _{k=0}^{n}{\binom {n}{k}}^{\nu -1}\sum _{A\in F_{k}}\prod _{i\in A}p_{i}\prod _{j\in A^{c}}(1-p_{j}),}

where

F k = { A { 1 , , n } : | A | = k } . {\displaystyle F_{k}=\left\{A\subseteq \{1,\ldots ,n\}:|A|=k\right\}.}

Let W = X 1 + + X n {\displaystyle W=X_{1}+\cdots +X_{n}} . Then W {\displaystyle W} has mass function

Pr ( W = k ) = 1 C n ( n k ) ν 1 A F k i A p i j A c ( 1 p j ) , {\displaystyle \Pr(W=k)={\frac {1}{C_{n}'}}{\binom {n}{k}}^{\nu -1}\sum _{A\in F_{k}}\prod _{i\in A}p_{i}\prod _{j\in A^{c}}(1-p_{j}),}

for k = 0 , 1 , , n {\displaystyle k=0,1,\ldots ,n} . This distribution generalises the Poisson binomial distribution in a way analogous to the CMP and CMB generalisations of the Poisson and binomial distributions. Such a random variable is therefore said[3] to follow the Conway–Maxwell–Poisson binomial (CMPB) distribution. This should not be confused with the rather unfortunate terminology Conway–Maxwell–Poisson–binomial that was used by [1] for the CMB distribution.

The case ν = 1 {\displaystyle \nu =1} is the usual Poisson binomial distribution and the case p 1 = = p n = p {\displaystyle p_{1}=\cdots =p_{n}=p} is the CMB ( n , p , ν ) {\displaystyle \operatorname {CMB} (n,p,\nu )} distribution.
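For small n the CMPB mass function can be evaluated by summing over the subsets in F_k directly, which also lets one check the equal-probability reduction to the CMB distribution; function names below are illustrative.

```python
from itertools import combinations
from math import comb

def cmpb_pmf(k, p_list, nu):
    # Pmf of W = X_1 + ... + X_n by explicit summation over the subsets in F_k.
    n = len(p_list)
    def weight(j):
        s = 0.0
        for A in combinations(range(n), j):
            members = set(A)
            prod = 1.0
            for i in range(n):
                prod *= p_list[i] if i in members else 1.0 - p_list[i]
            s += prod
        return comb(n, j) ** (nu - 1) * s
    weights = [weight(j) for j in range(n + 1)]
    return weights[k] / sum(weights)

def cmb_pmf(j, n, p, nu):
    w = [comb(n, i) ** nu * p ** i * (1 - p) ** (n - i) for i in range(n + 1)]
    return w[j] / sum(w)
```

When all p_j equal p, the inner sum collapses to C(n, k) p^k (1−p)^(n−k), giving the CMB(n, p, ν) weights.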

References

  1. ^ a b c d e Shmueli, G., Minka, T., Kadane, J.B., Borle, S. and Boatwright, P.B. "A useful distribution for fitting discrete data: revival of the Conway–Maxwell–Poisson distribution." Journal of the Royal Statistical Society: Series C (Applied Statistics) 54.1 (2005): 127–142.
  2. ^ a b c Kadane, J.B. "Sums of Possibly Associated Bernoulli Variables: The Conway–Maxwell–Binomial Distribution." Bayesian Analysis 11 (2016): 403–420.
  3. ^ a b c d e f Daly, F. and Gaunt, R.E. "The Conway–Maxwell–Poisson distribution: distributional theory and approximation." ALEA Latin American Journal of Probability and Mathematical Statistics 13 (2016): 635–658.