On Runs, Bivariate Poisson Mixtures and Distributions That Arise in Bernoulli Arrays

Distributional findings are obtained relative to various quantities arising in Bernoulli arrays {X_{k,j}, k ≥ 1, j = 1, ..., r+1}, where the rows (X_{k,1}, ..., X_{k,r+1}) are independently distributed as Multinomial(1, p_{k,1}, ..., p_{k,r+1}) for k ≥ 1, with the homogeneity across the first r columns assumption p_{k,1} = · · · = p_{k,r}. The quantities of interest relate to the measure of the number of runs of length 2, namely S_n = (S_{n,1}, ..., S_{n,r}) with S_{n,j} = Σ_{k=1}^n X_{k,j} X_{k+1,j}, and the totals T_n = Σ_{j=1}^r S_{n,j}. With various known results applicable to the marginal distributions of the S_{n,j}'s and to their limiting quantities S_j = lim_{n→∞} S_{n,j}, we investigate joint distributions in the bivariate (r = 2) case and the distributions of their totals T_n and T for r ≥ 2. In the latter case, we derive a key relationship between multivariate problems and univariate (r = 1) problems, opening up the path for several derivations and representations such as Poisson mixtures. In the former case, we obtain general expressions for the probability generating functions, the binomial moments and the probability mass functions through conditioning, an analysis of a resulting recursive system of equations, and again by exploiting connections with the univariate problem. More precisely, for cases where p_{k,j} = 1/(b+k) for j = 1, 2 with b ≥ 1, we obtain explicit expressions for the probability generating functions of S_n, n ≥ 1, and S, as well as a Poisson mixture representation: S_i | (V_1 = v_1, V_2 = v_2) ∼ ind. Poisson(v_i) with (V_1, V_2) ∼ Dirichlet(1, 1, b−1), which nicely captures both the marginal distributions and the dependence structure. From this, we derive the fact that S_1 | S_1 + S_2 = t is uniformly distributed on {0, 1, ..., t} for all b ≥ 1. We conclude with yet another mixture representation for p_{k,j} = 1/(b+k), j = 1, 2, with b ≥ 1, where we show that S | α ∼ p_α, α ∼ Beta(1, b), with p_α a bivariate mass function with Poisson(α) marginals given by p_α(s_1, s_2) = e^{−α} α^{s_1+s_2} (s_1 + s_2 + 1 − α) / (s_1 + s_2 + 1)!.

With known marginal distributions for the S_{n,j}'s and the S_j's for various configurations of the p_{k,j}'s, and a clearly negative pairwise dependence, further questions of interest concern the joint distribution of the vectors S_n = (S_{n,1}, ..., S_{n,r}) and S = lim_{n→∞} S_n. As well, the distributions of the totals T_n and T are of related interest. With such Poisson distributions and Poisson mixtures arising naturally in these univariate (r = 1) situations, it seems natural to investigate multivariate versions of such results. Said otherwise, in what sense, and for which configurations of the {p_{k,j}}'s, can analytical extensions of Result A and (1.1) be obtained?
In this paper, we obtain multivariate generalizations for the homogeneous along columns case (i.e., first r row components identically distributed) where p_{k,1} = · · · = p_{k,r}. As in Joffe et al. (2004) and Ait Aoudia and Marchand (2010), we first obtain by conditioning, in Section 2.1, a recursive system of equations involving the probability generating functions of the S_{n,j}'s. This permits us, in Section 3, to obtain a key result (Theorem 3.1) linking the multivariate (r ≥ 2) distributions of T_n and T to univariate (r = 1) analogs. This is especially useful given that results like (1.1) are available, and, hence, corollaries are derived. As an illustration, for p_{k,j} = λ/(λr+k−1), we show that the distribution of T is Poisson(λ) and, for p_{k,j} = a/(k−1+rb), we obtain a Poisson mixture representation for the distribution of T with a Beta mixing variable.
In Section 4, we obtain (Theorems 4.1 and 4.3), for the bivariate case with p_{k,j} = 1/(b+k), b ≥ 1, explicit expressions and representations for the distributions of S_n, n ≥ 1, and S. For instance, we show (Theorem 4.3 (b)) that the distribution of S is a mixture of two conditionally independent Poisson(V_i), i = 1, 2, distributions, with (V_1, V_2) having a Dirichlet distribution; some definitions and preliminary results on such mixtures are given earlier in Section 2.2. This represents a natural extension of (1.1) for a = 1, as one recovers the univariate result with the Beta marginals of the Dirichlet, together with the sought-after dependence structure as reflected by the dependence of the Dirichlet components V_1 and V_2. Yet another mixture representation is given in part (c) of Theorem 4.3. But it is quite different, as the mixing parameter is univariate and the dependence is reflected otherwise, through a bivariate distribution with Poisson but non-independent marginals.
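As a quick illustrative aid (our own sketch, not part of the paper's development), the Dirichlet–Poisson mixture just described is easy to probe by simulation; here b = 2 and the sample size are arbitrary choices, and we check the Beta-mixed Poisson marginal together with the negative dependence.

```python
import numpy as np

rng = np.random.default_rng(1)
b, m = 2, 200000

# (V1, V2) ~ Dirichlet(1, 1, b-1), then S_i | V ~ independent Poisson(V_i)
V = rng.dirichlet([1.0, 1.0, b - 1.0], size=m)[:, :2]
S = rng.poisson(V)

# Marginally V1 ~ Beta(1, b), so P(S1 = 0) = E[e^{-V1}], which equals 2/e when b = 2
p0 = float((S[:, 0] == 0).mean())
corr = float(np.corrcoef(S[:, 0], S[:, 1])[0, 1])
```

The pairwise dependence shows up as a negative empirical correlation, while each marginal remains a Beta mixture of Poisson distributions, in line with (1.1).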

Definitions, recurrences for pgf's and binomial moments
We work with the quantities X_k, S_{n,j}, S_j, T_n, T, S_n and S as defined in the Introduction. Ait Aoudia and Marchand (2010) studied the distribution of T_n for the bivariate case (r = 2) and the homogeneous (in k) case with p_{k,1} = p, p_{k,2} = 1 − p (and p_{k,3} = 0) for all k. We obtain here representations and relationships for the distributions of the vectors S_n and S, as well as those of the totals T_n and T, for various non-homogeneous in k configurations of the p_{k,j}'s, but with identically distributed components of X_k; in other words, under assumption (2.1): p_{k,1} = · · · = p_{k,r} =: p_k for all k ≥ 1. We pursue by setting S_{0,j} = 0 for all j and by introducing the auxiliary random variables W_{n,1}, ..., W_{n,r}, where W_{n,j} := S_{n−1,j} + X_{n,j}, n ≥ 1, j ∈ {1, ..., r}.
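The quantities just defined are straightforward to simulate. The sketch below is our own illustration (the function name, the choice p_k = 1/(b+k) with b = 1, and the replication count are all ours) of the array, the run counts S_{n,j}, and the totals T_n, under assumption (2.1).

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_runs(n, r, p, trials=50000):
    """Simulate the Bernoulli array and return (S_n, T_n) over many replications.

    Rows X_k ~ Multinomial(1; p_{k,1}, ..., p_{k,r+1}) with, under (2.1),
    p_{k,1} = ... = p_{k,r} = p(k) and column r+1 taking the leftover mass.
    S_{n,j} = sum_{k=1}^n X_{k,j} X_{k+1,j} counts the runs of length 2 in column j.
    """
    pk = np.array([p(k) for k in range(1, n + 2)])            # rows k = 1, ..., n+1
    probs = np.column_stack([np.tile(pk, (r, 1)).T, 1 - r * pk])
    cum = np.cumsum(probs, axis=1)
    u = rng.random((trials, n + 1))
    cols = (u[:, :, None] > cum[None, :, :]).sum(axis=2)      # chosen column per row
    S = np.stack([((cols[:, :-1] == j) & (cols[:, 1:] == j)).sum(axis=1)
                  for j in range(r)], axis=1)
    return S, S.sum(axis=1)

# bivariate case r = 2 with p_k = 1/(b+k), b = 1
S, T = sample_runs(n=8, r=2, p=lambda k: 1.0 / (1 + k))
mean_T = float(T.mean())
exact_mean = 2 * sum(1.0 / ((k + 1) * (k + 2)) for k in range(1, 9))  # 2 sum p_k p_{k+1}
```

Since E(S_{n,j}) = Σ_{k=1}^n p_k p_{k+1} by independence of the rows, the empirical mean of T_n can be checked against 2 Σ_{k=1}^n p_k p_{k+1}, a telescoping sum equal to 0.8 here.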
Proof. We condition on X_{n+1}. With the independence of the X_k's and the definitions of the sequences S_n and W_{n,j}, we obtain the first set of equations. Equations (2.2) follow along the same lines by conditioning again on X_{n+1}. Finally, the initial values for n = 1 follow from the definitions.
A rearrangement of the above system of equations is as follows. The probability generating function and the probability function of Z can be written in terms of its binomial moments and, for the bivariate case with non-negative integer-valued components Z_1 and Z_2, analogous relationships hold as long as the Taylor series expansion at (t_1, t_2) = (1, 1) of the probability generating function converges on an open set containing the origin.

Bivariate Poisson mixtures
We elaborate here on bivariate Poisson mixtures, which will arise in the limiting distribution of S_n in Section 4. We denote (γ)_k the ascending factorial, with (γ)_0 = 1 and (γ)_k = γ(γ+1) · · · (γ+k−1) for k ≥ 1.

Definition 2.3. We will say that the distribution of U = (U_1, U_2) is a bivariate Poisson mixture with mixing parameter F whenever there exists a bivariate random vector (V_1, V_2) ∼ F such that U_i | (V_1, V_2) = (v_1, v_2) ∼ ind. Poisson(v_i), i = 1, 2.

The next lemma brings into play the bivariate Dirichlet(a_1, a_2, a_3) distribution and the bivariate hypergeometric, or Humbert Φ_2, function given by

Φ_2(b_1, b_2; c; x, y) = Σ_{m,n ≥ 0} ((b_1)_m (b_2)_n / ((c)_{m+n} m! n!)) x^m y^n.

The connection between these two entities, which we exploit in the following lemma, is that the moment generating function of a Dirichlet(a_1, a_2, a_3) random vector V = (V_1, V_2) is given by (Lee, 1971)

E(e^{t_1 V_1 + t_2 V_2}) = Φ_2(a_1, a_2; a_1 + a_2 + a_3; t_1, t_2). (2.8)

This is obtained in a straightforward manner by expanding the exponential terms in the evaluation of the moment generating function, and it remains valid for cases where a_3 = 0 by a direct evaluation of E(t_1^{U_1} t_2^{U_2}). The probability function is obtained with the help of (2.5) and (2.6). (b) With the conditional Poisson representation, the result is immediate when a_3 = 0; for a_3 > 0, the result follows by a direct verification.

Distribution of the totals T_n and T

For studying the distribution of T_n, it suffices to consider the probability generating function G_n(t_1, ..., t_r) of S_n evaluated at t_1 = · · · = t_r. Simplifications will thus follow when applying Lemma 2.2. Moreover, in the particular cases where we have assumption (2.1), the components S_{n,1}, ..., S_{n,r} are equidistributed for a given n, and the same is true for the W_{n,j}'s, j = 1, ..., r. As a consequence, the quantities H_{n,j}(t, ..., t) will be, for fixed t and n ≥ 1, constant in j, j = 1, ..., r. And this also leads to further simplifications when applying Lemma 2.2.
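Section 2.2's Dirichlet–Φ_2 connection can be checked numerically. The truncated double series below is our own illustrative sketch (parameter values, truncation level and sample size are arbitrary choices), comparing Humbert's Φ_2 with a Monte Carlo evaluation of the Dirichlet moment generating function.

```python
import math
import numpy as np

def poch(g, k):
    """Ascending factorial (g)_k = g (g+1) ... (g+k-1), with (g)_0 = 1."""
    out = 1.0
    for i in range(k):
        out *= g + i
    return out

def humbert_phi2(b1, b2, c, x, y, terms=40):
    """Truncated double series for Humbert's Phi_2 function."""
    return sum(poch(b1, m) * poch(b2, n) * x**m * y**n
               / (poch(c, m + n) * math.factorial(m) * math.factorial(n))
               for m in range(terms) for n in range(terms))

# E exp(t1 V1 + t2 V2) = Phi_2(a1, a2; a1+a2+a3; t1, t2) for (V1, V2) ~ Dirichlet(a1, a2, a3)
a1, a2, a3 = 1.0, 1.0, 2.0
t1, t2 = 0.7, -0.3
rng = np.random.default_rng(2)
V = rng.dirichlet([a1, a2, a3], size=400000)
mc = float(np.exp(t1 * V[:, 0] + t2 * V[:, 1]).mean())
series = humbert_phi2(a1, a2, a1 + a2 + a3, t1, t2)
```

The agreement follows since the Dirichlet mixed moments E(V_1^m V_2^n) = (a_1)_m (a_2)_n / (a_1+a_2+a_3)_{m+n} reproduce the Φ_2 series term by term.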
Our key finding, which we now proceed to describe, establishes a link between a multivariate problem with r ≥ 2 and a univariate problem where r = 1. This will be especially useful given the known results in the literature applicable to the distribution of T_n and T for r = 1.
We note here the general relationship between the probability generating function of the Poisson mixture U and the moment generating function of the mixing variable V, which also illustrates that the Poisson mixtures in Definition 2.3 are identifiable.
Theorem 3.1. Let ψ^r_{n,p_1,p_2,...,p_{n+1}}(·) be the probability generating function of T_n under assumption (2.1), and let ψ^1_{n,rp_1,rp_2,...,rp_{n+1}}(·) be the probability generating function of S_n = Σ_{k=1}^n X_k X_{k+1}, where X_k ∼ ind. Ber(rp_k), k ≥ 1. Then we have, for all n ≥ 1, r ≥ 2,

ψ^r_{n,p_1,p_2,...,p_{n+1}}(t) = ψ^1_{n,rp_1,rp_2,...,rp_{n+1}}(1 + (t−1)/r). (3.1)

Proof. Writing the recursions of Lemma 2.2 at t_1 = · · · = t_r = t gives a system for ψ_n(t) := G_n(t, ..., t) and h_n(t) := H_{n,j}(t, ..., t), valid for n ≥ 2, with the initial values ψ_1(t) = 1 + r p_1 p_2 (t − 1) and h_1(t) = 1 + p_1 (t − 1). Now, we set ψ_n(t) = a_n((t−1)/r), h_n(t) = b_n((t−1)/r), and p'_n = r p_n, so that the above system of equations becomes one in a_n and b_n, for n ≥ 2, u ∈ R. The last two systems of equations tell us that ψ^r_{n,p_1,p_2,...,p_{n+1}}(ru + 1) = a_n(u) = ψ^1_{n,rp_1,rp_2,...,rp_{n+1}}(u + 1) for all n ≥ 2, and the result follows. Finally, the results carry over to T in the same manner by making use of (2.3) and (2.4), noting that a radius of convergence greater than 1/r for the pgf of S implies a radius of convergence greater than 1 for the pgf of T.

Theorem 3.1 describes a powerful relationship between our r ≥ 2 problem of identifying the distribution of T_n, or of T, and a corresponding univariate (r = 1) problem for which a certain number of results already exist in the literature. We conclude this section with applications of Theorem 3.1.
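Theorem 3.1's reduction lends itself to a simulation check. The sketch below is ours (the evaluation point t = 0 and the configuration p_k = 1/(1+k) are arbitrary choices): it compares P(T_n = 0) in the bivariate array with E[(1 − 1/r)^{S_n}] computed from the univariate chain with success probabilities rp_k.

```python
import numpy as np

rng = np.random.default_rng(3)
n, r, trials = 6, 2, 200000
pk = 1.0 / (1.0 + np.arange(1, n + 2))                 # p_k = 1/(b+k) with b = 1

# multivariate side: T_n = S_{n,1} + S_{n,2} with p_{k,1} = p_{k,2} = p_k
cum = np.cumsum(np.column_stack([pk, pk, 1 - r * pk]), axis=1)
cols = (rng.random((trials, n + 1))[:, :, None] > cum[None]).sum(axis=2)
T = sum(((cols[:, :-1] == j) & (cols[:, 1:] == j)).sum(axis=1) for j in range(r))

# univariate side: S_n = sum_k X_k X_{k+1} with X_k ~ Bernoulli(r p_k)
X = rng.random((trials, n + 1)) < r * pk
S_uni = (X[:, :-1] & X[:, 1:]).sum(axis=1)

# Theorem 3.1: E[t^{T_n}] = E[(1 + (t-1)/r)^{S_n}]; at t = 0 this reads
# P(T_n = 0) = E[(1 - 1/r)^{S_n}]
lhs = float((T == 0).mean())
rhs = float(((1 - 1.0 / r) ** S_uni).mean())
```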
for k ≥ 1, with (3.2) and (3.3) yielding the binomial moments and the probability mass function of T_n, for j = 0, 1, ..., n. To conclude, observe that for p = 1/r we have rp_k = 1, so that P(S_n = n) = 1 and E(t^{S_n}) = t^n. Theorem 3.1 still applies, with (3.1) yielding E(u^{T_n}) = (1 + (u−1)/r)^n, i.e., T_n ∼ Bin(n, 1/r). This serves more as an illustration, as the result here for the distribution of T_n follows at once from the representation T_n = Σ_{k=1}^n I{X_k = X_{k+1}}, with the indicator variables I{X_k = X_{k+1}} independently distributed as Bernoulli(1/r). Theorem 3.1, along with expressions (2.3) and (2.4), yields immediate expressions for the binomial moments, the probability generating function and the probability mass function of T_n through the binomial moment identity (3.2). Similarly, we obtain the corresponding expressions for the distribution of T. Moreover, it is straightforward to verify from the above the following representation, which constitutes a multivariate (r > 1) generalization of (1.1).
Proof. We proceed by induction. A direct evaluation yields the case n = 1, which matches the given formulas. Now suppose the above formulas hold for n = 1, ..., m. A slight reorganization of Lemma 2.2 tells us that the evaluation of Σ_{j=1}^2 p_{m+2,j} H_{m+1,j} yields the desired expression for G_{m+1}. Similarly, an evaluation of G_{m+1} + p_{m+2,j} s_j H_{m+1,j} leads to the desired expression for H_{m+2,j}.

Lemma 4.2. Suppose that ψ is the probability generating function of a random vector (Z_1, Z_2) on N² satisfying equation (4.1), and that the Taylor series development of ψ at (1, 1) converges on an open set containing the origin. Then ψ is given by

Proof. With the series representation (2.5) and the uniqueness of the coefficients, it suffices to show the corresponding identity for the coefficients. Now, equation (4.1) implies a relationship holding for all (t_1, t_2), and the above is equivalent to the claimed form.

The key result that follows concerns the limiting distribution of S_n for the homogeneous bivariate case p_{k,1} = p_{k,2} = 1/(b+k). A first explicit form (equation (4.8)) for the probability generating function is easily derived from Theorem 4.1. A second explicit form is obtained via Lemma 4.2, by verifying directly that the probability generating function of S satisfies (4.1). This permits us to write down the probability generating function of S in terms of the binomial moments of S_1, which can either be derived from our expressions or taken from known results in the univariate case.
with p_α the bivariate probability mass function on N² given by (4.7). (c) Given that the probability generating function of S is necessarily expressible as in (2.5), it will suffice, given part (a), to show that the same expansion holds under representation (4.6)-(4.7). In turn, it will suffice to show that the mixed binomial moments of (S_1, S_2) under the probability function p_α are given by α^{k+l}/(k+l)!, since this would imply, along with representation (4.6), that the mixed binomial moments of S are E(α^{k+l})/(k+l)!, which is (4.9). Finally, manipulations lead to (4.10), with this alternatively following from Corollary 3.4 as well. In contrast to the Dirichlet mixing (when b > 1), the dependence in representation (c) is reflected through the conditional distributions of (S_1, S_2), and the mixing variable α is univariate. Furthermore, it is readily verified that the conditional marginal distributions of S_i | α are Poisson(α), which is consistent with the univariate result in (1.1). We conclude with some observations on the probability function p_α in (4.7).
Remark 4.4. The bivariate probability function in (4.7) has a simple enough form that it may well have arisen in previous work, but we cannot identify such a source. In any case, it is most interesting that it arises here in a natural way from the Bernoulli array in the representation of S for r = 2 and the configuration p_k = 1/(b+k), b ≥ 1. The probability generating function, using the binomial moments at the end of the proof of Theorem 4.3 and (2.5), may be written as

ψ_α(t_1, t_2) = Σ_{k,l ≥ 0} α^{k+l} (t_1 − 1)^k (t_2 − 1)^l / (k + l)!.
As seen above, the marginals are Poisson(α) distributed. These distributions, as expanded on by Ait Aoudia and Marchand (2013), possess at least two other interesting properties: (i) the distribution of S_1 + S_2 (conditional on α) is given by the convolution of a Poisson(α) with a Bernoulli(α); (ii) the correlation coefficient between S_1 and S_2 (conditional on α) is equal to −α/2.
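These properties of p_α can be verified numerically from the closed form (4.7); the sketch below is ours (α = 0.4 and the truncation level M are arbitrary choices), checking the normalization, the exact Poisson(α) marginal, and the correlation −α/2.

```python
import math
import numpy as np

alpha, M = 0.4, 60   # mixing value in (0, 1); M truncates a negligible tail

def p_alpha(s1, s2, a=alpha):
    """Bivariate mass function (4.7): e^{-a} a^{s1+s2} (s1+s2+1-a) / (s1+s2+1)!."""
    m = s1 + s2
    return math.exp(-a) * a**m * (m + 1 - a) / math.factorial(m + 1)

P = np.array([[p_alpha(s1, s2) for s2 in range(M)] for s1 in range(M)])
total = float(P.sum())

# the marginal of S1 should be exactly Poisson(alpha)
marg1 = P.sum(axis=1)
poisson = np.array([math.exp(-alpha) * alpha**k / math.factorial(k) for k in range(M)])
marg_err = float(np.abs(marg1 - poisson).max())

# the correlation of (S1, S2) should be -alpha/2
s = np.arange(M)
mean1 = float((marg1 * s).sum())                       # = alpha
cov = float((P * np.outer(s, s)).sum()) - mean1**2     # marginals are symmetric
corr = cov / (float((marg1 * s * s).sum()) - mean1**2)
```

The Poisson marginal follows from the telescoping identity α^m (m+1−α)/(m+1)! = α^m/m! − α^{m+1}/(m+1)!, which also yields the normalization.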

Concluding Remarks
The main findings in this paper concern the number of runs of length 2 in Bernoulli arrays with independently distributed Multinomial rows and identically distributed row components. Exploiting the structure of the problem through a recurrence involving probability generating functions, and building on known marginal distribution results in the literature, we have explored the distributions of totals across columns and the joint distributions of column sums. Elegant representations have been obtained: (i) through Section 3's correspondence between multivariate and univariate problems to describe the distribution of a total, and (ii) with Section 4's bivariate distributions, where we have, for instance, obtained in a specific situation a bivariate Poisson mixture with a Dirichlet mixing parameter. Many other open and interesting problems can be envisioned. These include an analysis of the distributions of S_n and S for r > 2, closed-form distributional results in the absence of assumption (2.1), and an extended framework for probability models other than multinomial.

Example 3.3. (Case where p_k = a/(r(a+b+k−1))) In the setup for T_n with assumption (2.1), consider cases where p_k = a/(r(a+b+k−1)) with a > 0, b ≥ 0. The analysis for general a, b will cover many interesting particular cases, which we point out below. First, following Theorem 3.1, we consider the univariate sequence S_n with p_k = rp_k = a/(a+b+k−1). From Holst (2008), we have for k ∈ {1, ..., n}:

Corollary 3.4. For cases where p_k = a/(r(a+b+k−1)) with a > 0, b ≥ 0, the distribution of T admits the following Poisson mixture representation: T | L = l ∼ Poisson(al/r), L ∼ Beta(a, b).
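Corollary 3.4 lends itself to a direct Monte Carlo check; the sketch below is ours (a = b = 1, r = 2 are arbitrary choices, and T is approximated by T_n at n = 200, whose tail contribution is negligible). It compares the simulated array with draws from the stated Beta–Poisson mixture.

```python
import numpy as np

rng = np.random.default_rng(4)
a, b, r = 1.0, 1.0, 2
n, trials = 200, 50000
pk = a / (r * (a + b + np.arange(1, n + 2) - 1.0))     # p_k = a / (r (a+b+k-1))

# array side: T_n with p_{k,1} = p_{k,2} = p_k
cum = np.cumsum(np.column_stack([pk, pk, 1 - r * pk]), axis=1)
cols = (rng.random((trials, n + 1))[:, :, None] > cum[None]).sum(axis=2)
T = sum(((cols[:, :-1] == j) & (cols[:, 1:] == j)).sum(axis=1) for j in range(r))

# mixture side: T | L = l ~ Poisson(a l / r), L ~ Beta(a, b)
L = rng.beta(a, b, size=trials)
T_mix = rng.poisson(a * L / r)

p0_array = float((T == 0).mean())
p0_mix = float((T_mix == 0).mean())
```

For a = b = 1 the mixture side has the closed form P(T = 0) = ∫_0^1 e^{−l/2} dl = 2(1 − e^{−1/2}).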

We signal the following further applications. (I) With b = 0 and a = rλ, λ > 0, i.e., p_k = λ/(λr + k − 1), we obtain that T has a Poisson distribution with mean λ. When r = 1, this corresponds to result (1.1) with b = 0. (II) For the distributions of T_n and T with the configuration p_k = a'/(k − 1 + rb') with b' ≥ a', the results above also apply by taking a = ra' and b = r(b' − a'). Corollary 3.4 hence yields, for such p_k's, the representation T | L = l ∼ Poisson(a'l), L ∼ Beta(ra', r(b' − a')).

Distributions of S_n and S: bivariate case with p_k = 1/(b+k)

In this section, we obtain the probability generating functions of S_n, n ≥ 1, and S in the bivariate case (r = 2) with p_k = 1/(b+k), b ≥ 1. Moreover, by taking n → ∞ and by making use of a representation (Lemma 4.2) for the pgf of S in terms of the marginal binomial moments, we arrive at explicit forms for the probability generating and mass functions of S, as well as mixture representations (Theorem 4.3). This is achieved by first solving the recurrence given in Lemma 2.2, yielding explicit expressions for the probability generating functions G_n, H_{n+1,1} and H_{n+1,2} for n ≥ 1 (Theorem 4.1).

Theorem 4.1. Under assumption (2.1) with r = 2 and p_k = 1/(b+k), the probability generating functions G_n, H_{n+1,1} and H_{n+1,2}, n ≥ 1, are given by

Theorem 4.3. Under assumption (2.1) with r = 2 and p_k = 1/(b+k), b ≥ 1: (a) the probability generating and probability mass functions of S are given by

(b) Part (a), paired with Lemma 2.4, implies the given representation.

(d) It suffices to show that P(S_1 = s_1, S_2 = s_2) is a function of s_1 + s_2, s_1, s_2 ≥ 0. From part (c), we have P(S_1 = s_1, S_2 = s_2) = ∫_0^1 p_α(s_1, s_2) b(1−α)^{b−1} dα, with p_α given in (4.7), and this indeed depends on (s_1, s_2) only through the sum s_1 + s_2.

The bivariate Poisson mixture representation of S extends in a most interesting way the known marginal distribution representation for S_1 and S_2 (i.e., (1.1)), expressible as a Beta mixture of Poisson distributions. Here, S_1 and S_2 are clearly dependent, but the representation tells us that they are conditionally independent, with the dependence reflected through the dependence of the mixing components of the Dirichlet. Similarly, we obtain from part (b) of the Theorem and Lemma 2.4 the corresponding mixture representation.

We have, for all t = (t_1, ..., t_r), n ≥ 2, j ∈ {1, ..., r}, the recursive system of Lemma 2.2. Indeed, the Taylor series expansion about 1 of the probability generating function ψ_Z of a non-negative integer-valued random variable Z, having radius of convergence greater than 1, can be expressed in terms of the binomial moments. Furthermore, the above applies to T = lim_{n→∞} T_n by replacing T_n by T, S_n by S = lim_{n→∞} S_n, and taking n → ∞, for u ∈ [−1, 1] in (3.1), and with (3.2) and (3.3) applicable as long as the probability generating function of S has radius of convergence about 1 greater than 1/r. Write G_n(t, ..., t) = ψ_n(t) (say, for short) and H_{n,j}(t, ..., t) = h_n(t) (say), by virtue of assumption (2.1), and rewrite Lemma 2.2's system of equations accordingly.

Proof. (3.2) and (3.3) follow from (3.1), (2.3) and (2.4). There remains (3.1). As remarked upon above, we have for fixed t: G_n(t, ..., t) = ψ^r_{n,p_1,p_2,...,p_{n+1}}(t) = ψ_n(t).
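Part (d)'s uniform conditional distribution can also be observed by simulation from the part (b) representation; the sketch below is ours (b = 2 and the conditioning value t = 2 are arbitrary choices).

```python
import numpy as np

rng = np.random.default_rng(5)
b, m = 2, 500000

# part (b): (V1, V2) ~ Dirichlet(1, 1, b-1), then S_i | V ~ independent Poisson(V_i)
V = rng.dirichlet([1.0, 1.0, b - 1.0], size=m)[:, :2]
S = rng.poisson(V)

# condition on S1 + S2 = t and tabulate S1: should be uniform on {0, ..., t}
t = 2
sel = S.sum(axis=1) == t
freqs = np.bincount(S[sel, 0], minlength=t + 1) / sel.sum()
max_dev = float(np.abs(freqs - 1.0 / (t + 1)).max())
```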