Orthogonality and probability: beyond nearest neighbor transitions

In this article, we will explore why the Karlin-McGregor method of using orthogonal polynomials in the study of Markov processes was so successful for one-dimensional nearest neighbor processes, but failed beyond nearest neighbor transitions. We will proceed by suggesting and testing possible fixes.


Introduction
This paper was influenced by the approaches described in Deift [1] and questions considered in Grünbaum [5].
The Karlin-McGregor diagonalization can be used to answer recurrence/transience questions, as well as questions about harmonic functions, occupation times and hitting times, and a large number of other quantities obtained by solving various recurrence relations in the study of Markov chains; see [7], [8], [9], [10], [6], [15], [14], [3], [12]. However, with some exceptions (see [11]), those were nearest neighbor Markov chains on the half-line. Grünbaum [5] names two main drawbacks of the method: (a) "typically one cannot get either the polynomials or the measure explicitly", and (b) "the method is restricted to 'nearest neighbour' transition probability chains that give rise to tridiagonal matrices and thus to orthogonal polynomials". In this paper we attempt to give possible answers to the second of Grünbaum's concerns for general reversible Markov chains. In addition, we will consider possible applications of newer methods in orthogonal polynomials, such as the Riemann-Hilbert approach (see [1], [2] and [13]), and their probabilistic interpretations.
In Section 2, we will give an overview of the Karlin-McGregor method from a naive college linear algebra perspective. In 2.3, we will give a Markov chain interpretation to the result of Fokas, Its and Kitaev connecting orthogonal polynomials and Riemann-Hilbert problems. Section 3 deals with one-dimensional random walks with jumps of size ≤ m, i.e. the (2m + 1)-diagonal operators. There we consider diagonalizing with orthogonal functions. In 3.2, as an example, we consider a pentadiagonal operator and use the Plemelj formula and a two-sided interval to obtain the respective diagonalization. In Section 4, we use the constructive approach of Deift [1] to produce the Karlin-McGregor diagonalization for all irreducible reversible Markov chains. After that, we revisit the example from Section 3.

Eigenvectors of probability operators
Suppose P is the tridiagonal operator of a one-dimensional Markov chain on {0, 1, . . .} with forward probabilities p_k and backward probabilities q_k. Suppose λ is an eigenvalue and Q(λ) = (Q_0(λ), Q_1(λ), . . .) is the corresponding right eigenvector, normalized so that Q_0 = 1. Then λQ^T = PQ^T generates a recurrence relation for the Q_j, and each Q_j(λ) is a polynomial of degree j. The Karlin-McGregor method derives the existence of a probability distribution ψ such that the polynomials Q_j(λ) are orthogonal with respect to ψ. In other words, if π is the reversible measure with π_0 = 1 and <·, ·>_ψ is the inner product in L^2(dψ), then

π_j <Q_i, Q_j>_ψ = δ_{i,j}.

Also observe from the recurrence relation that the leading coefficient of Q_j is 1/(p_0 p_1 · · · p_{j-1}). Since the spectrum of P lies entirely inside the interval [−1, 1], so does the support of ψ. Hence, for |z| > 1, the generating function of the return probabilities,

Σ_{t≥0} z^{−t−1} (P^t)_{0,0} = ∫ dψ(λ)/(z − λ),

is well defined.
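As a minimal numerical sketch (not part of the original argument), the polynomials Q_j can be generated directly from the three-term recurrence; for the simple random walk reflecting at the origin (p_0 = 1, p_k = q_k = 1/2, no holding), they reduce to the Chebyshev polynomials of the first kind, as in the example below.

```python
import numpy as np

def km_polynomials(lam, p, q, r, n):
    """Evaluate Q_0(lam), ..., Q_{n-1}(lam) from the three-term recurrence
    lam * Q_k = q_k * Q_{k-1} + r_k * Q_k + p_k * Q_{k+1}, with Q_0 = 1."""
    Q = [1.0, (lam - r[0]) / p[0]]  # row 0: lam*Q_0 = r_0*Q_0 + p_0*Q_1
    for k in range(1, n - 1):
        Q.append(((lam - r[k]) * Q[k] - q[k] * Q[k - 1]) / p[k])
    return Q

# Simple random walk reflecting at the origin: p_0 = 1, p_k = q_k = 1/2.
n, lam = 8, 0.3
p = [1.0] + [0.5] * (n - 1)
q = [0.0] + [0.5] * (n - 1)   # q[0] is never used
r = [0.0] * n
Q = km_polynomials(lam, p, q, r, n)

# Here Q_j(lam) = cos(j * arccos(lam)), the Chebyshev identity of Section 2.
assert all(abs(Q[j] - np.cos(j * np.arccos(lam))) < 1e-12 for j in range(n))
```

The degree-j polynomial structure is visible in the recurrence: each step divides by p_k, which also produces the leading coefficient 1/(p_0 · · · p_{j-1}).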

Converting to a Jacobi operator
Due to the reversibility condition, π_k p_k = π_{k+1} q_{k+1}. Thus, conjugating by D = diag(√π_0, √π_1, . . .), the recurrence relation can be rewritten so that D P D^{−1} is a Jacobi (symmetric tridiagonal with b_k > 0) operator. Observe that this operator is self-adjoint. The above approach extends to all reversible Markov chains. Thus every reversible Markov operator is equivalent to a self-adjoint operator, and therefore has an all-real spectrum.
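For instance, here is a sketch with a hypothetical 4-state birth-death chain (not taken from the paper): the conjugation D P D^{−1} with D = diag(√π_k) produces a symmetric tridiagonal matrix.

```python
import numpy as np

# Hypothetical 4-state birth-death chain (rows sum to 1); the last state
# holds with probability 1/2 to keep the matrix stochastic.
P = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.5, 0.0, 0.5, 0.0],
              [0.0, 0.5, 0.0, 0.5],
              [0.0, 0.0, 0.5, 0.5]])

# Reversible measure pi, satisfying detailed balance pi_k p_k = pi_{k+1} q_{k+1}.
pi = np.array([1.0, 2.0, 2.0, 2.0])
D = np.diag(np.sqrt(pi))
J = D @ P @ np.linalg.inv(D)   # similar Jacobi (symmetric tridiagonal) operator

assert np.allclose(J, J.T)     # self-adjoint, hence all-real spectrum
```

Similarity preserves the spectrum, so the real eigenvalues of J are exactly those of P.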

Karlin-McGregor: a simple picture
It is a basic fact from linear algebra that if λ_1, . . . , λ_n are distinct real eigenvalues of an n × n matrix A, and u_1, . . . , u_n and v_1, . . . , v_n are the corresponding left and right eigenvectors, then A diagonalizes as follows:

A^t = Σ_{i=1}^n λ_i^t (v_i u_i)/(u_i v_i) = n ∫ λ^t (v(λ) u(λ))/(u(λ) v(λ)) dU_{σ(A)}(λ).

Here U_{σ(A)}(λ) is the uniform distribution over the spectrum σ(A).
It is important to observe that the above integral representation is only possible if u(λ) and v(λ) are well defined -each eigenvalue has multiplicity one, i.e. all distinct real eigenvalues. As we will see later, this will become crucial for Karlin-McGregor diagonalization of reversible Markov chains. The operator for a reversible Markov chain is bounded and is equivalent to a self-adjoint operator, and as such has a real bounded spectrum. However the eigenvalue multiplicity will determine whether the operator's diagonalization can be expressed in a form of a spectral integral.
Since σ(P) = σ(P*), we will extend the above diagonalization identity to the operator P in the separable Hilbert space ℓ². First, observe that u(λ) = (π_0 Q_0, π_1 Q_1, . . .) satisfies uP = λu due to reversibility. Hence, extending from the finite case to the infinite dimensional space ℓ², we obtain

(P^t)_{i,j} = π_j ∫ λ^t Q_i(λ) Q_j(λ) dψ(λ).

The above is the weak limit of the analogous identity for A_n, where A_n is the restriction of P to the first n coordinates, ⟨e_0, . . . , e_{n−1}⟩. The orthogonality follows if we plug in t = 0. Since π_0 Q_0 Q_0 = 1, ψ must integrate to one.
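The finite-truncation picture can be tested numerically. Below is a sketch (assuming the reflecting simple random walk of the following example): the discrete spectral measure of the symmetrized truncation A_n at e_0 integrates to one and reproduces the return probabilities (P^t)_{0,0}.

```python
import numpy as np

n, t = 12, 6
# Truncated reflecting simple random walk P and its symmetrization J = D P D^{-1}
P = np.zeros((n, n))
P[0, 1] = 1.0
for k in range(1, n):
    P[k, k - 1] = 0.5
    if k + 1 < n:
        P[k, k + 1] = 0.5
pi = np.array([1.0] + [2.0] * (n - 1))
D = np.diag(np.sqrt(pi))
J = D @ P @ np.linalg.inv(D)

lam, V = np.linalg.eigh(J)
w = V[0] ** 2                      # discrete spectral measure psi_n at e_0
assert abs(w.sum() - 1.0) < 1e-12  # psi_n integrates to one (the t = 0 case)

# (P^t)_{0,0} = ∫ λ^t dψ_n(λ), since P and J are similar with (D)_{00} = 1
assert abs(np.sum(w * lam ** t) - np.linalg.matrix_power(P, t)[0, 0]) < 1e-12
```

As n grows, the discrete measures ψ_n converge weakly to the spectral measure ψ of the infinite chain.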
Example. Simple random walk and Chebyshev polynomials. The Chebyshev polynomials of the first kind are the ones characterizing a one-dimensional simple random walk on the half-line, i.e. the one with generator

(Pf)(0) = f(1), (Pf)(k) = (f(k−1) + f(k+1))/2 for k ≥ 1,

where π(0) = 1 and π(1) = π(2) = · · · = 2. The Chebyshev polynomials satisfy the trigonometric identity

T_n(λ) = cos(n cos^{−1}(λ)).

Thus if X_n ∼ U_{{cos(n cos^{−1}(λ)) = 0}}, then Y_n = cos^{−1}(X_n) ∼ U_{{π/(2n) + πk/n, k = 0, 1, . . . , n−1}}, and Y_n converges weakly to Y ∼ U_{[0,π]}. Hence X_n converges weakly to the arcsine law

dψ(x) = dx/(π √(1 − x²)) on [−1, 1].

Also observe that if x = cos(λ), then T_n(x) = cos(nλ).

Riemann-Hilbert problem and a generating function

In preparation for the next step, let w(λ) be the probability density function associated with the spectral measure ψ, dψ(λ) = w(λ)dλ on the compact support, and let

C_ψ(z) = ∫ dψ(λ)/(λ − z)

denote the Cauchy transform with respect to the measure ψ. First let us quote the following theorem.
The Riemann-Hilbert problem, for an oriented smooth curve Σ, is the problem of finding m(z), analytic in C \ Σ, such that

m_+(x) = m_−(x) v(x) for x ∈ Σ,

where m_+ and m_− denote respectively the limit from the left and the limit from the right of the function m as we approach a point on Σ. The theorem of Fokas, Its and Kitaev states that a matrix built from the orthogonal polynomials and their Cauchy transforms is the unique solution to the Riemann-Hilbert problem with the above jump matrix v(x) and Σ that satisfies a prescribed normalization condition at infinity.
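For the reader's convenience, the standard form of the Fokas-Its-Kitaev problem (as treated, e.g., in Deift [1]) can be sketched as follows; here π_n denotes the monic orthogonal polynomial of degree n and γ_{n−1} the leading coefficient of the orthonormal polynomial, and the exact normalization constants should be checked against [1].

```latex
% Jump condition and normalization at infinity:
Y_+(x) = Y_-(x)\begin{pmatrix} 1 & w(x) \\ 0 & 1 \end{pmatrix}, \quad x \in \Sigma,
\qquad Y(z)\, z^{-n\sigma_3} \to I \quad (z \to \infty).

% Unique solution in terms of the monic orthogonal polynomials \pi_n:
Y(z) = \begin{pmatrix}
  \pi_n(z) & \dfrac{1}{2\pi i}\displaystyle\int_\Sigma \frac{\pi_n(s)\,w(s)}{s-z}\,ds \\[6pt]
  -2\pi i\,\gamma_{n-1}^2\,\pi_{n-1}(z) & -\gamma_{n-1}^2 \displaystyle\int_\Sigma \frac{\pi_{n-1}(s)\,w(s)}{s-z}\,ds
\end{pmatrix}.
```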
Suppose we are given the weight function w(λ) for the Karlin-McGregor orthogonal polynomials Q. If m^{(n)}(z) is the solution of the Riemann-Hilbert problem as in the above theorem, then for |z| > 1 the generating function of the chain can be read off from the entries of m^{(n)}(z).

Beyond nearest neighbor transitions
Observe that the Chebyshev polynomials were used to diagonalize a simple one-dimensional random walk reflecting at the origin. Let us now consider a random walk where jumps of sizes one and two are equiprobable: away from the origin, the walk moves to each of the four sites k − 2, k − 1, k + 1, k + 2 with probability 1/4, producing a pentadiagonal transition operator.
The above random walk with the reflector at the origin is reversible with π(0) = 1 and π(1) = π(2) = · · · = 2. The Karlin-McGregor representation with orthogonal polynomials will not automatically extend to this case. However, this does not rule out obtaining a Karlin-McGregor diagonalization with orthogonal functions. In the case of the above pentadiagonal Chebyshev operator, some eigenvalues will be of geometric multiplicity two, as P = s(P_ch), where s(x) = x² + x/2 − 1/2 and P_ch is the original tridiagonal Chebyshev operator.
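The identity P = s(P_ch) can be checked numerically away from the boundary. The sketch below (not from the paper) builds a finite truncation of P_ch and verifies that s(P_ch) places weight 1/4 at offsets ±1 and ±2 in an interior row, which is exactly the equiprobable one/two-step walk.

```python
import numpy as np

n = 10
# Tridiagonal Chebyshev operator: simple walk reflecting at the origin
Pch = np.zeros((n, n))
Pch[0, 1] = 1.0
for k in range(1, n - 1):
    Pch[k, k - 1] = Pch[k, k + 1] = 0.5

# s(x) = x^2 + x/2 - 1/2
S = Pch @ Pch + 0.5 * Pch - 0.5 * np.eye(n)

# Away from the reflector and the truncation edge, rows of s(Pch) carry
# weight 1/4 at offsets ±1, ±2 and 0 on the diagonal.
assert np.allclose(S[4, [2, 3, 5, 6]], 0.25)
assert abs(S[4, 4]) < 1e-12
```

The boundary rows of s(P_ch) differ, reflecting the interaction of the two-step jumps with the reflector at zero.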
Let A_n again denote the restriction of P to the first n coordinates, ⟨e_0, . . . , e_{n−1}⟩. Observe that if Q_n(λ) = · · · = Q_{n+m−1}(λ) = 0, then (Q_0(λ), . . . , Q_{n−1}(λ))^T is the corresponding right eigenvector of A_n. Thus the spectrum σ(A_n) consists of the roots of the system Q_n(λ) = · · · = Q_{n+m−1}(λ) = 0.
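For the tridiagonal case (m = 1) this criterion says that σ(A_n) is the zero set of Q_n; for the reflecting simple random walk, Q_n = T_n and the zeros are cos((2k + 1)π/(2n)). A quick numerical sketch (assuming that walk) confirms this.

```python
import numpy as np

n = 6
# Truncation A_n of the reflecting simple random walk to n coordinates
A = np.zeros((n, n))
A[0, 1] = 1.0
for k in range(1, n):
    A[k, k - 1] = 0.5
    if k + 1 < n:
        A[k, k + 1] = 0.5

eig = np.sort(np.linalg.eigvals(A).real)
# Zeros of Q_n = T_n, the degree-n Chebyshev polynomial of the first kind
zeros = np.sort(np.cos((2 * np.arange(n) + 1) * np.pi / (2 * n)))
assert np.allclose(eig, zeros)
```

The last row of A_n enforces 0.5 Q_{n−2}(λ) = λ Q_{n−1}(λ), which is precisely the recurrence with Q_n(λ) = 0 plugged in.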

Chebyshev operators
Let us now return to the example generalizing the simple random walk reflecting at the origin, where one-step and two-step jumps were equally likely. The characteristic equation

z⁴ + z³ − 4λz² + z + 1 = 0

for the recurrence relation c_{n+2} + c_{n+1} − 4λc_n + c_{n−1} + c_{n−2} = 0 can be easily solved by observing that if z is a solution then so are z̄ and 1/z. The solution in radicals is expressed as

z_{1,2} = r_1 ± i√(1 − r_1²) and z_{3,4} = r_2 ± i√(1 − r_2²).

Observe that r_1 and r_2 are the two roots of s(x) = λ, where s(x) = x² + x/2 − 1/2 is the polynomial for which P = s(P_ch).

In general, the following is true for all operators P that represent symmetric random walks reflecting at the origin and that allow jumps of up to m flights: there is a polynomial s(x) such that P = s(P_ch), and the roots z_j of the characteristic relation in λc = Pc lie on the unit circle with their real parts Re(z_j) solving s(x) = λ. The reason for the latter is the symmetry of the corresponding characteristic equation of order 2m, implying 1/z_j = z̄_j, and therefore the characteristic equation for λc = Pc can be rewritten as

s((z + 1/z)/2) = λ,

where (1/2)(z + 1/z) is the Zhukovskiy function. In our case, s(x) = (x + 1/4)² − 9/16, and for λ ∈ (−9/16, 0) there will be two candidates for µ_1(λ),

µ_+(λ) = r_1 = (−1 + √(9 + 16λ))/4 and µ_−(λ) = r_2 = (−1 − √(9 + 16λ))/4.

Taking the 0 ≤ arg z < 2π branch of the logarithm log z and applying the Plemelj formula, one obtains the corresponding boundary values

µ_+(λ) = lim_{z→λ, Im(z)>0} µ_1(z) and µ_−(λ) = lim_{z→λ, Im(z)<0} µ_1(z).

Now, having defined µ_1(z), we can propose the limits of integration to be a contour in C consisting of

[−9/16, 0]_+ = lim_{ε↓0} {z = x + iε : x ∈ [−9/16, 0]} and [−9/16, 0]_− = lim_{ε↓0} {z = x − iε : x ∈ [−9/16, 0]},

together with the [0, 1] segment. The resulting diagonalization integrates over this two-sided contour, with u(λ) defined as before. Let us summarize this section as follows.
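The root structure described above can be verified numerically; the sketch below (an illustration, not from the paper) takes a point λ in (−9/16, 0), solves the characteristic equation, and checks that all four roots lie on the unit circle with real parts r_1 and r_2 solving s(x) = λ.

```python
import numpy as np

lam = -0.3                                  # a point in (-9/16, 0)
roots = np.roots([1, 1, -4 * lam, 1, 1])    # z^4 + z^3 - 4λz^2 + z + 1 = 0

# All four roots lie on the unit circle (1/z_j = conj(z_j)) ...
assert np.allclose(np.abs(roots), 1.0)

# ... and their real parts are the two solutions r_1, r_2 of s(x) = λ
r1 = (-1 + np.sqrt(9 + 16 * lam)) / 4
r2 = (-1 - np.sqrt(9 + 16 * lam)) / 4
assert np.allclose(np.sort(roots.real), sorted([r2, r2, r1, r1]))
```

Each r in (−1, 1) contributes a conjugate pair z = r ± i√(1 − r²), which is why eigenvalues in (−9/16, 0) come with multiplicity two.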
If the structure of the spectrum does not allow a Karlin-McGregor diagonalization with orthogonal functions over [−1, 1], say when there are two values of µ_1(λ) for some λ, then one may use the Plemelj formula to obtain an integral diagonalization of P over the corresponding two-sided interval.

Spectral Theorem and why orthogonal polynomials work
The constructive proofs in the second chapter of Deift [1] suggest the reason why the Karlin-McGregor theory of diagonalizing with orthogonal polynomials works for all time reversible Markov chains. Using the same logical steps as in [1], we can construct a map M which assigns a probability measure dψ to a reversible transition operator P on a countable state space {0, 1, 2, . . .}. W.l.o.g. we can assume P is symmetric, as one can instead consider the conjugated Jacobi operator of Section 2. In the corresponding representation of the resolvent, a ≥ 0 and b are real constants, and dψ is a Borel measure. Until now we were reapplying the logical steps in Deift [1] to the case of reversible Markov chains. However, in the original, the second chapter of Deift [1] gives a constructive proof of the spectral theorem, where the diagonalizing map U is one-to-one and onto. Thus, if P is irreducible, then f_j = Q_j(P)e_0 is an orthonormal basis for the Karlin-McGregor diagonalization, and the change-of-basis map F satisfies F^T = F^{−1}. Deift [1] also provides a way of constructing the measure dψ. Since P_△ is a Jacobi operator, it admits the corresponding representation. To obtain the above expression for dψ, we used the fact that ⟨e_0, (P − zI)^{−1} e_0⟩ would be the same if there were no reflector at zero.
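A finite-dimensional sketch of the measure construction (assumption: a truncated Jacobi matrix stands in for P): the discrete spectral measure at e_0 matches the resolvent ⟨e_0, (P − zI)^{−1} e_0⟩.

```python
import numpy as np

# Finite symmetric (Jacobi) truncation J of a reversible chain; its spectral
# measure at e_0 is the discrete measure dψ_n = Σ_i <e_0, v_i>^2 δ_{λ_i}.
n = 8
J = np.diag(np.full(n - 1, 0.5), 1) + np.diag(np.full(n - 1, 0.5), -1)
lam, V = np.linalg.eigh(J)
w = V[0] ** 2

# Check via the resolvent: <e_0, (J - zI)^{-1} e_0> = ∫ dψ_n(λ)/(λ - z)
z = 2.0 + 1.0j
lhs = np.linalg.inv(J - z * np.eye(n))[0, 0]
rhs = np.sum(w / (lam - z))
assert abs(lhs - rhs) < 1e-12
```

In the infinite-dimensional setting the same resolvent identity defines dψ, which is how the constructive proof in Deift [1] recovers the measure.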

Applications of Karlin-McGregor diagonalization
Let us list some of the possible applications of the diagonalization.
• One can extract a sharp rate of convergence to a stationary probability distribution, if there is one; see Diaconis et al. [3].
• The generator
• One can use the Fokas, Its and Kitaev results, and benefit from the connection between orthogonal polynomials and Riemann-Hilbert problems.
• One can interpret random walks in random environment as a random spectral measure.