Large deviation principle for moment map estimation

We consider a family of positive operator valued measures associated with representations of compact connected Lie groups. For many independent copies of a single state and a tensor power representation we show that the observed probability distributions converge to the value of the moment map. For invertible states we prove that the measures satisfy the large deviation principle with an explicitly given rate function.


Introduction
This paper is concerned with probability distributions related to decompositions of tensor power representations into irreducibles. More precisely, given a representation π of a compact Lie group K on a finite dimensional Hilbert space H as well as a positive operator ρ on H with unit trace (a state), for every n the values $\operatorname{Tr} P_\lambda \rho^{\otimes n}$ determine a probability distribution on the set of dominant weights λ of K, where $P_\lambda$ denotes the projection onto the isotypic component of $H^{\otimes n}$ with highest weight λ.
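As a concrete illustration of such a distribution, the following sketch treats the simplest case K = U(2) acting on H = C^2 by the standard representation, with ρ diagonal with eigenvalues p and 1 − p. It assumes the standard Schur-Weyl facts that the isotypic component labelled by the two-row Young diagram (n − k, k) has multiplicity C(n, k) − C(n, k − 1) and that the corresponding character evaluates to the sum of p^j (1 − p)^(n − j) over j = k, …, n − k. The function name `isotypic_distribution` is illustrative and does not come from the paper.

```python
from math import comb


def isotypic_distribution(p, n):
    """Return {k: Tr P_lambda rho^{(x)n}} for lambda = (n - k, k).

    Assumption: K = U(2), standard representation, rho = diag(p, 1 - p).
    """
    q = 1.0 - p
    dist = {}
    for k in range(n // 2 + 1):
        # Multiplicity of the U(2)-irrep (n - k, k) in (C^2)^{(x)n}.
        mult = comb(n, k) - (comb(n, k - 1) if k > 0 else 0)
        # Schur polynomial s_{(n-k,k)}(p, q) = sum_{j=k}^{n-k} p^j q^(n-j).
        schur = sum(p**j * q ** (n - j) for j in range(k, n - k + 1))
        dist[k] = mult * schur
    return dist


if __name__ == "__main__":
    n, p = 200, 0.8
    dist = isotypic_distribution(p, n)
    mode = max(dist, key=dist.get)
    print("total probability:", sum(dist.values()))
    print("most likely normalised diagram:", ((n - mode) / n, mode / n))
```

For large n the most likely normalised diagram lies close to the ordered spectrum (p, 1 − p) of ρ (for p ≥ 1/2), which is the concentration phenomenon whose exponential rate is studied below.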
The asymptotic behaviour of such distributions has been studied by many authors with different motivations. In the context of random walks, it was shown in [13] that counting the multiplicities of irreducible representations in certain tensor power representations is equivalent to enumerating the number of walks for "reflectable" walk types, conditioned on staying within a Weyl chamber. This class of random walks was introduced by Gessel and Zeilberger in [12] as a generalisation of the classical ballot problem [2,25] to finite reflection groups. In [23], Tate and Zelditch analysed the multiplicities in high tensor powers and proved a central limit theorem as well as large deviation results by relating them to the (much simpler) weight multiplicity asymptotics. Moving away from the tracial state to positive operators arising from the representation of the complexified group, Postnova and Reshetikhin [20] generalised these asymptotic formulas to character distributions.
A different viewpoint is provided by interpreting these processes as restrictions of a random walk on a noncommutative space, the dual of the compact Lie group K, to a classical subalgebra [3,19]. More precisely, given a state on the group von Neumann algebra vN(K) one constructs a quantum Markov chain on the infinite tensor product $vN(K)^{\otimes\infty}$, and the probability distributions in question are obtained by restriction to the center Z(vN(K)), which can be identified with the $\ell^\infty$ space on the set of isomorphism classes of irreducible representations of K (see also [4]).
In the context of statistical mechanics, Cegła, Lewis and Raggio investigated the multiplicities arising from the isotypic decomposition of tensor products of representations of the group SU(2), and proved a large deviation principle [7]. Duffield [9] extended their result to an arbitrary compact semisimple Lie group using the Gärtner-Ellis theorem [10, Theorem II.6.1].
In quantum statistics, Alicki, Rudicki and Sadowski [1], and later Keyl and Werner [15], proposed an estimator for the spectrum of the density operator, which is based on the decomposition of the tensor powers of the defining representation of SU(d) (or U(d)).
Based on Duffield's result, Keyl and Werner found the rate function for the exponential decay to be the relative entropy between the normalised Young diagram labelling the irreducible representation and the nonincreasingly ordered spectrum of the state. In [14] Keyl refined the estimator to a continuous positive operator valued measure (POVM) estimating both the spectrum and the eigenvectors of an unknown state and proved a large deviation principle in that setting. The appearance of the relative entropy in the result of Keyl and Werner suggests that similar large deviation rate functions should be viewed as information quantities. In [24] a family of entanglement measures has been constructed based on the rate function corresponding to the standard representation of products of unitary groups. While studying a tripartite extension of the Matsumoto-Hayashi universal distortion-free entanglement concentration protocol [18], Botero and Mejía [5] found formulas for the probabilities induced by the isotypic decomposition […]
[…] a highest weight vector of norm 1. Then $K\cdot|v_\lambda\rangle\langle v_\lambda|$ can be identified (via the moment map) with the orbit of λ in $i\mathfrak{k}^*$ under the (complexified) coadjoint action. Weighted with the suitably normalised invariant measure on the orbit and tensored with the identity operator on the multiplicity space, these projections give rise to a POVM $E_{H^{\otimes m}}$ from the Borel σ-algebra of $i\mathfrak{k}^*$ to $B(H^{\otimes m})$. In the special case K = U(d) and π the standard representation, this measure is the same as the one proposed in [14].
If ρ is a state on H then we can form the sequence of probability measures $\mu_m(A) = \operatorname{Tr}\rho^{\otimes m}E_{H^{\otimes m}}(mA)$. These measures are interpreted as the probability distribution of the (rescaled) random classical outcome of the measurements that are described by the POVM $E_{H^{\otimes m}}$. We find that the measures $\mu_m$ converge weakly to the Dirac measure concentrated at J(ρ) and the convergence is exponentially fast.
To formulate a more precise statement, choose a Borel subgroup B in the complexification of K, let N be its maximal unipotent subgroup, let $\mathfrak{a} = i\mathfrak{t}$ where $\mathfrak{t}$ is the Lie algebra of the maximal torus $T = K\cap B$, and let $i\mathfrak{t}^*_+\subseteq i\mathfrak{t}^*$ be the closure of the positive Weyl chamber. Then every element $x\in i\mathfrak{k}^*$ can be written as $x = h\cdot x_0$ with a unique $x_0\in i\mathfrak{t}^*_+$ and a non-unique $h\in K$. Let
$$I_\rho(x) = \sup_{\alpha\in\mathfrak{a}}\,\sup_{n\in N}\ \langle x_0,\alpha\rangle - \ln\operatorname{Tr}\pi(n)^*\pi(\exp\alpha/2)\pi(h)^*\rho\,\pi(h)\pi(\exp\alpha/2)\pi(n). \tag{1.2}$$
We will show in Section 3.3 that $I_\rho$ is well defined and is a good rate function (i.e. has compact sub-level sets).

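Theorem 1.1 (Large deviation principle). Let H be a finite dimensional Hilbert space, K a compact connected Lie group, π : K → U(H) a unitary representation and $\mu_m$ as above.
(i) For every state ρ and closed subset $C\subseteq i\mathfrak{k}^*$ we have
$$\limsup_{m\to\infty}\frac{1}{m}\ln\mu_m(C)\le-\inf_{x\in C}I_\rho(x). \tag{1.3}$$
(ii) For every faithful state ρ and open subset $O\subseteq i\mathfrak{k}^*$ we have
$$\liminf_{m\to\infty}\frac{1}{m}\ln\mu_m(O)\ge-\inf_{x\in O}I_\rho(x). \tag{1.4}$$
For faithful states the theorem says that the measures $\mu_m$ satisfy the large deviation principle with rate function $I_\rho$. For general states we can only prove a weaker version of (1.4), replacing O on the right hand side with $O\cap M_\rho$ where $M_\rho$ is a dense subset of $\operatorname{dom} I_\rho$. In the examples below $I_\rho$ is continuous on its domain, therefore the stronger conclusion (1.4) still holds even if ρ is not invertible.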
In addition, we show that $I_\rho$ only vanishes at J(ρ), which identifies the weak limit as a Dirac measure:
Theorem 1.2 (Law of large numbers). For every state ρ the measures $\mu_m$ converge weakly to the Dirac measure concentrated at J(ρ).
If one is only interested in the way $\rho^{\otimes m}$ distributes the probability among the isotypic components, then it is possible to extract this coarse-grained information by taking the pushforward of the measure $\mu_m$ along the continuous function that sends $x\in i\mathfrak{k}^*$ to the unique element $x_0\in(K\cdot x)\cap i\mathfrak{t}^*_+$. According to the contraction principle, for invertible states these measures also satisfy the large deviation principle with rate function
$$\tilde I_\rho(x_0)=\inf_{h\in K}\,\sup_{\alpha\in\mathfrak{a}}\,\sup_{n\in N}\ \langle x_0,\alpha\rangle-\ln\operatorname{Tr}\pi(n)^*\pi(\exp\alpha/2)\pi(h)^*\rho\,\pi(h)\pi(\exp\alpha/2)\pi(n). \tag{1.6}$$
Our result reduces to the formula of Cegła, Lewis and Raggio [7] and to [14, Theorem 3.2] when K = U(d), $H=\mathbb{C}^d$ and π is the standard representation. From the latter the result of Keyl-Werner [15] follows by the contraction principle. Setting $K=U(1)^d$ we also recover Cramér's large deviation theorem [8] in the special case of finitely supported integer-valued random variables.
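The torus case can be made concrete numerically. The sketch below evaluates the Cramér rate function, i.e. the Legendre transform of the logarithmic moment generating function, for a finitely supported integer-valued random variable in dimension d = 1, and compares it with the closed-form relative entropy in the Bernoulli case. The names `log_mgf`, `cramer_rate` and `bernoulli_relative_entropy`, the search interval and the use of ternary search are choices made here for illustration only.

```python
from math import exp, log


def log_mgf(r, alpha):
    """ln sum_n r_n e^{n alpha}, evaluated stably (assumes all r_n > 0)."""
    shift = max(n * alpha for n in r)
    return shift + log(sum(p * exp(n * alpha - shift) for n, p in r.items()))


def cramer_rate(r, x, bound=50.0, iterations=200):
    """sup over alpha in [-bound, bound] of alpha * x - log_mgf(r, alpha).

    The objective is concave in alpha, so ternary search applies.
    """
    def objective(alpha):
        return alpha * x - log_mgf(r, alpha)

    lo, hi = -bound, bound
    for _ in range(iterations):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if objective(a) < objective(b):
            lo = a
        else:
            hi = b
    return objective((lo + hi) / 2)


def bernoulli_relative_entropy(x, p):
    """Relative entropy between (x, 1 - x) and (p, 1 - p), for 0 < x, p < 1."""
    return x * log(x / p) + (1 - x) * log((1 - x) / (1 - p))


if __name__ == "__main__":
    r = {0: 0.5, 1: 0.5}  # fair coin: weight n taken with probability r_n
    for x in (0.1, 0.3, 0.5, 0.9):
        print(x, cramer_rate(r, x), bernoulli_relative_entropy(x, 0.5))
```

The rate vanishes exactly at the mean of the distribution, in line with the statement that the rate function only vanishes at J(ρ).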
The key element of our proof can be viewed as a noncommutative generalisation of the exponential tilting method, applied directly to the quantum states before projecting onto the classical subalgebra (or pairing with the POVM). More precisely, we replace the state $\rho^{\otimes n}$ with the transformed state $(\pi(g)^*\rho\pi(g))^{\otimes n}$, where g is an element of the complexification of K. Consequently, the transformed measures are not obtained by multiplication with a suitable function and must be related to the original one more carefully, also taking into account the (complexified) coadjoint action of K. A substantial part of our work is the development of these techniques in Section 3.2.
The paper is organised as follows. In Section 2 we fix the notation and collect some facts related to the representations and structure of compact Lie groups and their complexifications, and to positive operator valued measures. Section 3 contains the proof of our main results: in Section 3.1 we define the POVM used in the estimation scheme, in Section 3.2 we introduce an action of the complexification of K on ik * , in Section 3.3 we give several formulas for the rate function and prove some of its key properties, in Section 3.4 we prove the large deviation upper bound, in Section 3.5 we prove weak convergence to the value of the moment map and in Section 3.6 we address the large deviation lower bound.

Related work.
Closely related work has been done independently by Cole Franks and Michael Walter [11].

Preliminaries
Throughout, Hilbert spaces are assumed to be finite dimensional and the inner product $\langle\cdot,\cdot\rangle$ is linear in the second argument. U(H) denotes the group of unitary operators on H. We denote by S(H) the set of positive semidefinite operators on H with trace equal to 1 (states). Every state ρ ∈ S(H) admits a purification, i.e. $\rho=\operatorname{Tr}_{\mathbb{C}^d}|\psi\rangle\langle\psi|$ for some unit vector $\psi\in H\otimes\mathbb{C}^d$, where $\operatorname{Tr}_{\mathbb{C}^d}$ is the partial trace and $|\psi\rangle\langle\psi|$ is the orthogonal projection onto the subspace spanned by ψ. Below we collect the necessary standard results on compact Lie groups and their complexifications. More details can be found in many textbooks, e.g. [17].

Complexification
Let K be a compact connected Lie group, $\mathfrak{k}=T_eK$ its Lie algebra ($T_e$ stands for the tangent space at the identity when applied to a Lie group and the derivative at the identity when applied to homomorphisms). The complexification $G=K_{\mathbb{C}}$ is a complex Lie group together with an inclusion $K\hookrightarrow G$, defined by the property that every smooth homomorphism from K to a complex Lie group extends uniquely to a holomorphic homomorphism from G. G is a reductive group and its Lie algebra is $\mathfrak{g}=\mathbb{C}\otimes_{\mathbb{R}}\mathfrak{k}$. We identify $\mathfrak{k}^*$ with the subspace of $\mathfrak{g}^*$ that consists of functionals that are real on $\mathfrak{k}$. We use angle brackets $\langle\cdot,\cdot\rangle$ for the pairing between a vector space and its dual.
The group multiplication gives a diffeomorphism K × P → G where $P=\exp(i\mathfrak{k})$. The (global) Cartan involution is a group homomorphism Θ : G → G that fixes K and acts on p ∈ P as $\Theta(p)=p^{-1}$. K is the fixed point set of Θ. For g ∈ G we define $g^*=\Theta(g)^{-1}$. The exponential map provides a diffeomorphism between $i\mathfrak{k}$ and P. A finite dimensional unitary representation π : K → U(H) extends uniquely to a homomorphism G → GL(H) of complex Lie groups. The extension, denoted with the same symbol, satisfies $\pi(g^*)=\pi(g)^*$.
Example 2.1. Let $K=U(1)^d$ be a torus. Then $\mathfrak{k}$ can be identified with $i\mathbb{R}^d$, $G=(\mathbb{C}^\times)^d$ and $\mathfrak{g}$ is the Lie algebra $\mathbb{C}^d$. $i\mathfrak{k}\simeq\mathbb{R}^d$ and $P\simeq\mathbb{R}^d_{>0}$. Θ sends a d-tuple $g=(g_1,\dots,g_d)\in G$ to $(\bar g_1^{-1},\dots,\bar g_d^{-1})$ and $g^*$ is the (componentwise) conjugate of g.
Example 2.2. Let K = U(d) be the group of d × d unitary matrices. Then $\mathfrak{k}$ consists of skew-hermitian matrices, $G=GL(d,\mathbb{C})$ and $\mathfrak{g}$ is the Lie algebra of all d × d complex matrices. We identify $\mathfrak{g}^*$ with $\mathfrak{g}$ via the pairing $(x,\xi)\mapsto\operatorname{Tr}(x\xi)$. Under this identification $\mathfrak{k}^*$ corresponds to the space of skew-hermitian matrices. $i\mathfrak{k}$ is the space of hermitian matrices and P is the set of positive definite matrices. Θ sends a matrix to the conjugate transpose of its inverse, and $g^*$ is the conjugate transpose of g.

Moment map
Let π : K → U(H) be a representation on a finite dimensional Hilbert space. We consider the map $J:S(H)\to i\mathfrak{k}^*$ defined as
$$\langle J(\rho),\xi\rangle=\operatorname{Tr}\left(T_e\pi(\xi)\rho\right).$$
We identify the projective space PH with the set of rank one orthogonal projections. For $v\in H\setminus\{0\}$ we denote the corresponding projection (or equivalence class) by [v]. In particular, the restriction of J to PH is
$$\langle J([v]),\xi\rangle=\frac{\langle v,T_e\pi(\xi)v\rangle}{\|v\|^2}.$$
PH is a symplectic manifold with the Fubini-Study symplectic form and the action of K is Hamiltonian with $\frac{1}{2\pi i}J$ as a moment map [16, 2.7]. In the special case when $\pi:K\to U(H_\lambda)$ is an irreducible representation with highest weight λ we will use the notation $J_\lambda$ for the above map. Its value on the ray through a highest weight vector $v_\lambda$ is $J_\lambda([v_\lambda])=\lambda$. The restriction of $J_\lambda$ to $K\cdot[v_\lambda]$ is injective.
Remark 2.3. Let ρ ∈ S(H) and $\psi\in H\otimes\mathbb{C}^d$ be any purification of ρ. Consider the representation $K\to U(H\otimes\mathbb{C}^d)$ given by $k\mapsto\pi(k)\otimes I$. Then
$$\langle J_{H\otimes\mathbb{C}^d}([\psi]),\xi\rangle=\langle\psi,(T_e\pi(\xi)\otimes I)\psi\rangle=\operatorname{Tr}\left(T_e\pi(\xi)\rho\right)=\langle J(\rho),\xi\rangle.$$
Thus in general $\frac{1}{2\pi i}J\circ\operatorname{Tr}_{\mathbb{C}^d}$ is a moment map for K acting on the larger projective space.
K acts on S(H) as $k\cdot\rho:=\pi(k)\rho\pi(k)^*$. For k ∈ K we have
$$\begin{aligned}\langle J(k\cdot\rho),\xi\rangle&=\operatorname{Tr}\left(T_e\pi(\xi)\pi(k)\rho\pi(k)^*\right)=\operatorname{Tr}\left(\pi(k^{-1})T_e\pi(\xi)\pi(k)\rho\right)\\&=\operatorname{Tr}\left(T_e\pi(k^{-1}\cdot\xi)\rho\right)=\langle J(\rho),k^{-1}\cdot\xi\rangle=\langle k\cdot J(\rho),\xi\rangle,\end{aligned}\tag{2.4}$$
i.e. J is equivariant with respect to the coadjoint action of K on $i\mathfrak{k}^*$.
Remark 2.4. PH is connected, therefore any two moment maps differ by a constant, and for any two equivariant moment maps the difference is fixed by the coadjoint action. Thus when K is semisimple, $\frac{1}{2\pi i}J$ is the unique equivariant moment map.
EJP 26 (2021), paper 79.
Example 2.5. Let $K=U(1)^d$. Irreducible representations are one dimensional and are of the form $\pi_n(g_1,\dots,g_d)v=g_1^{n_1}g_2^{n_2}\cdots g_d^{n_d}v$ for $n=(n_1,\dots,n_d)\in\mathbb{Z}^d$. Let π : K → U(H) be an arbitrary representation and decompose H into isotypic subspaces as $H=\bigoplus_{n\in\mathbb{Z}^d}H_n\otimes\mathbb{C}^{d_n}$ ($d_n$ is nonzero for only finitely many terms). If $P_n$ denotes the orthogonal projection onto these subspaces, then J(ρ) is determined by the numbers $r_n=\operatorname{Tr}P_n\rho$ as
$$\langle J(\rho),\xi\rangle=\sum_{n\in\mathbb{Z}^d}r_n\sum_{i=1}^{d}n_i\xi_i$$
for $\xi=(\xi_1,\dots,\xi_d)\in\mathbb{C}^d\simeq\mathfrak{g}$.
Example 2.6. Let K = U(d) and π the standard representation on $\mathbb{C}^d$. Under the identification of $i\mathfrak{k}^*$ with the space of hermitian matrices (see Example 2.2) J maps every state ρ to itself.
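For a torus the moment map is simply the mean weight, which a few lines of code make explicit. In the sketch below a state enters only through the numbers r_n = Tr P_n ρ, stored as a dictionary from weight tuples n to probabilities; the function names `torus_moment_map` and `pair` and the sample data are hypothetical and chosen for illustration.

```python
def torus_moment_map(r):
    """Return J(rho) as the vector sum_n r_n * n in R^d.

    r maps weights n (tuples of d integers) to r_n = Tr P_n rho.
    """
    weights = list(r)
    d = len(weights[0])
    return tuple(sum(r[n] * n[i] for n in weights) for i in range(d))


def pair(x, xi):
    """Pairing <x, xi> = sum_i x_i * xi_i."""
    return sum(a * b for a, b in zip(x, xi))


if __name__ == "__main__":
    # Illustrative data: three weights of a representation of U(1)^2.
    r = {(1, 0): 0.5, (0, 1): 0.25, (-1, 2): 0.25}
    x = torus_moment_map(r)
    print("J(rho) =", x)
    print("<J(rho), xi> for xi = (1, 1):", pair(x, (1, 1)))
```

The pairing computed this way agrees term by term with the double sum over n and i in the torus example.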

Borel subgroups
A Borel subgroup B ≤ G is a maximal solvable subgroup. Any two such subgroups are conjugate by an element of K. From now on we fix a Borel subgroup B and use the following notations: $T=K\cap B$ is a maximal torus in K with Lie algebra $\mathfrak{t}$, $\mathfrak{a}=i\mathfrak{t}$ and $A=\exp(\mathfrak{a})$, so that $\exp:\mathfrak{a}\to A$ is a diffeomorphism, N is the maximal unipotent subgroup of B and T normalises N. For any element g ∈ G we have the Iwasawa decomposition g = kan with uniquely determined elements k ∈ K, a ∈ A and n ∈ N. We write α(g) for the element of $\mathfrak{a}$ that exponentiates to a. The map $\alpha:G\to\mathfrak{a}$ is smooth. In a similar way we write k(g) for the K-component of g in the Iwasawa decomposition.
Example 2.7. Let $K=U(1)^d$. […] The Iwasawa decomposition is the same as the componentwise polar decomposition, more precisely […]
Example 2.8. Let K = U(d). […] The Iwasawa decomposition is essentially the QR decomposition, but with the triangular part decomposed into its diagonal part and another upper triangular matrix with 1 entries on the main diagonal.
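For matrix groups the relation between the Iwasawa and QR decompositions can be checked directly. The sketch below assumes G = GL(d, C) with K = U(d), A the positive diagonal matrices and N the upper triangular matrices with unit diagonal, and obtains g = kan from classical Gram-Schmidt orthogonalisation of the columns of g. The function names `iwasawa` and `matmul` are illustrative, and no attention is paid to numerical stability for ill-conditioned input.

```python
from math import log, sqrt


def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    d = len(x)
    return [[sum(x[i][l] * y[l][j] for l in range(d)) for j in range(d)]
            for i in range(d)]


def iwasawa(g):
    """Return (k, a, n) with g = k * diag(a) * n.

    k is unitary, a is a list of positive reals, n is upper unitriangular.
    Assumes g is an invertible square matrix (list of rows).
    """
    d = len(g)
    cols = [[complex(g[i][j]) for i in range(d)] for j in range(d)]
    q = []                                   # orthonormal columns of k
    r = [[0j] * d for _ in range(d)]         # upper triangular factor
    for j, v in enumerate(cols):
        w = list(v)
        for i, u in enumerate(q):
            r[i][j] = sum(ui.conjugate() * vi for ui, vi in zip(u, v))
            w = [wi - r[i][j] * ui for wi, ui in zip(w, u)]
        norm = sqrt(sum(abs(wi) ** 2 for wi in w))
        r[j][j] = norm
        q.append([wi / norm for wi in w])
    k = [[q[j][i] for j in range(d)] for i in range(d)]
    a = [r[j][j] for j in range(d)]
    n = [[r[i][j] / a[i] for j in range(d)] for i in range(d)]
    return k, a, n


if __name__ == "__main__":
    g = [[1 + 1j, 2], [3, 4 - 2j]]
    k, a, n = iwasawa(g)
    print("alpha(g) =", [log(x) for x in a])  # diagonal entries of alpha(g)
    print("n =", n)
```

Here the logarithms of the diagonal entries of a play the role of α(g), and the unitary factor plays the role of k(g).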
[…] for every b ∈ B or equivalently, v = π(n)v for every n ∈ N and v is an eigenvector of A. Let g ∈ G and let g = kan be its Iwasawa decomposition. Then […]
Let $i\mathfrak{t}^*_+\subseteq i\mathfrak{k}^*$ be the closure of the dominant Weyl chamber, using an Ad-invariant inner product to identify $i\mathfrak{t}^*$ with a subspace of $i\mathfrak{k}^*$. Then every coadjoint orbit in $i\mathfrak{k}^*$ intersects $i\mathfrak{t}^*_+$ in a unique point. For $x\in i\mathfrak{k}^*$ let $K_x\le K$ be the stabiliser subgroup with respect to the coadjoint action. Writing $x=h\cdot x_0$ with $x_0\in i\mathfrak{t}^*_+$ and h ∈ K, the subgroup of G generated by $K_x$ and $hBh^{-1}$ is a parabolic subgroup and its intersection with K is $K_x$.

Positive operator valued measures
Let $(X,\mathcal{X})$ be a measurable space and H a Hilbert space. A positive operator valued measure is a σ-additive map $E:\mathcal{X}\to B(H)$ such that $E(A)\ge0$ for every $A\in\mathcal{X}$ and $E(X)=I$.
We will construct positive operator valued measures in the following way. Let $\mu_0$ be a measure on $(X,\mathcal{X})$ and let $f:X\to B(H)$ be a measurable function whose values are positive semidefinite and which satisfies $\int_Xf\,d\mu_0=I$. Then
$$A\mapsto\int_Af\,d\mu_0$$
defines a positive operator valued measure, which will be denoted $f\mu_0$.
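Given a state ρ ∈ S(H) and a positive operator valued measure $E:\mathcal{X}\to B(H)$ we can form a probability measure $\mu:\mathcal{X}\to[0,1]$ as $\mu(A)=\operatorname{Tr}E(A)\rho$. In particular, when $E=f\mu_0$ with $\mu_0$ and f as above, then
$$\mu(A)=\int_A\operatorname{Tr}\left(f(x)\rho\right)d\mu_0(x),$$
so that µ has density $x\mapsto\operatorname{Tr}f(x)\rho$ with respect to $\mu_0$.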

Moment map estimation
In this section we define precisely our estimation scheme (Section 3.1) and prove our main results, a large deviation principle and a law of large numbers for the induced measures. The proofs are divided into separate sections: in Section 3.2 we introduce an action of G on ik * as well as a function (Definitions 3.2 and 3.5) which encode the G-actions on the highest weight orbits of irreducible K-representations and extend them in a continuous and scale-equivariant way; in Section 3.3 we present various expressions for the rate function and prove that it is a good rate function; Proposition 3.24 in Section 3.4 proves part i of Theorem 1.1; Section 3.5 contains the proof of Theorem 1.2; part ii of Theorem 1.1 is proved in Section 3.6.
From now on K will be an arbitrary but fixed compact connected Lie group, B a Borel subgroup of its complexification G, π : K → U (H) a finite dimensional representation and ρ ∈ S(H). In this generality we can prove a large deviation upper bound and a law of large numbers, whereas for the matching lower bound we will make the additional assumption supp ρ = H.

The measurements
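Let $H_\lambda$ be an irreducible representation of K with highest weight λ. The orbit of the highest weight ray in $PH_\lambda$ has a unique K-invariant probability measure $\nu_\lambda$. Every representation $K\to U(\mathcal{K})$ on a Hilbert space $\mathcal{K}$ can be decomposed as
$$\mathcal{K}\simeq\bigoplus_\lambda H_\lambda\otimes\operatorname{Hom}_K(H_\lambda,\mathcal{K}),$$
where the sum is over the integral weights in $i\mathfrak{t}^*_+$. We define $p_{\lambda,\mathcal{K}}:PH_\lambda\to B(H_\lambda\otimes\operatorname{Hom}_K(H_\lambda,\mathcal{K}))\subseteq B(\mathcal{K})$ to be the function which sends the equivalence class of the unit vector v to $|v\rangle\langle v|\otimes\operatorname{id}_{\operatorname{Hom}_K(H_\lambda,\mathcal{K})}$. With the notation of Section 2.4, we take X to be the orbit of the highest weight ray with its Borel σ-algebra as $\mathcal{X}$, the probability measure $\nu_\lambda$ plays the role of $\mu_0$ and $(\dim H_\lambda)p_{\lambda,\mathcal{K}}$ is the measurable function f.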
Next we glue these together into a POVM on $i\mathfrak{k}^*$ by taking the pushforward and summing over the isomorphism classes of irreducible representations. Let $J_\lambda:PH_\lambda\to i\mathfrak{k}^*$ be the map as in Section 2.2. We define the positive operator valued measure
$$E_{\mathcal{K}}=\sum_\lambda(J_\lambda)_*\big((\dim H_\lambda)\,p_{\lambda,\mathcal{K}}\,\nu_\lambda\big).$$
For every m ∈ N, $E_{H^{\otimes m}}$ corresponds to a measurement that can be performed on m copies of the state ρ, with values in $i\mathfrak{k}^*$. As we will see, the typical values behave in an extensive way. For this reason we will include a $\frac{1}{m}$ rescaling in the probability measures.
Explicitly, for Borel sets $A\subseteq i\mathfrak{k}^*$ we define
$$\mu_m(A)=\operatorname{Tr}\rho^{\otimes m}E_{H^{\otimes m}}(mA).$$

Extension of the coadjoint action
The aim of this section is to define a continuous action of G on ik * such that the restriction of J λ to the highest weight orbit is G-equivariant for every dominant weight λ, and the action commutes with scaling by nonnegative real numbers.
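Consider an irreducible representation $\pi_\lambda:K\to U(H_\lambda)$ and let $v_\lambda$ be a highest weight vector of norm 1. While the action of K keeps the norm fixed, elements of G will change the norm. For g ∈ G and an element $|h\cdot v_\lambda\rangle\langle h\cdot v_\lambda|$ in the orbit, let us write gh = kan for the Iwasawa decomposition.
We have
$$\begin{aligned}\pi_\lambda(g)\,|h\cdot v_\lambda\rangle\langle h\cdot v_\lambda|\,\pi_\lambda(g)^*&=\pi_\lambda(kan)\,|v_\lambda\rangle\langle v_\lambda|\,\pi_\lambda(kan)^*\\&=e^{2\langle\lambda,\alpha(gh)\rangle}\,|k\cdot v_\lambda\rangle\langle k\cdot v_\lambda|\\&=e^{2\langle\lambda,\alpha(gh)\rangle}\,|k(gh)\cdot v_\lambda\rangle\langle k(gh)\cdot v_\lambda|,\end{aligned}\tag{3.4}$$
in the last line emphasizing the dependence k = k(gh).
The restriction of $J_\lambda$ to $K\cdot[v_\lambda]$ is injective and K-equivariant, therefore we can use it to define a G-action on $K\cdot\lambda=J_\lambda(K\cdot[v_\lambda])$ and the constant factor in eq. (3.4) gives rise to a function $(g,h\cdot\lambda)\mapsto2\langle\lambda,\alpha(gh)\rangle$ on K · λ.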
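In what follows we extend both the G-actions and the functions giving the constant factor to $i\mathfrak{k}^*$ in a continuous way. The extensions will still be additive in the above sense (i.e. on $h\cdot i\mathfrak{t}^*_+$ for every h ∈ K) as well as positively homogeneous of degree 1. Note that there is at most one such extension since the positive Weyl chamber is the cone generated by the dominant weights. For the action, the only possibility is that g maps $h\cdot x_0$ to $k(gh)\cdot x_0$ when h ∈ K and $x_0\in i\mathfrak{t}^*_+$. The next proposition shows that this is indeed well defined and determines an action of G on $i\mathfrak{k}^*$.
Proposition 3.1. Let $x_0\in i\mathfrak{t}^*_+$, $g,g_1,g_2\in G$ and $h,h_1,h_2\in K$.
(i) If $h_1\cdot x_0=h_2\cdot x_0$ then $k(gh_1)\cdot x_0=k(gh_2)\cdot x_0$.
(ii) $k(g_2k(g_1h))=k(g_2g_1h)$.
Proof. (i) Let $gh_1=k_1a_1n_1$ and $gh_2=k_2a_2n_2$ be the Iwasawa decompositions. Then
$$k_2^{-1}k_1=a_2n_2\,h_2^{-1}h_1\,n_1^{-1}a_1^{-1}.$$
Since $h_2^{-1}h_1\in K_{x_0}$, the product is in the subgroup generated by B and $K_{x_0}$. This is a parabolic subgroup whose intersection with K is $K_{x_0}$, therefore $k_2^{-1}k_1\in K_{x_0}$, i.e. $k_1\cdot x_0=k_2\cdot x_0$.
(ii) Let $g_1h=k_1a_1n_1$ and $g_2k_1=k_2a_2n_2$ be the Iwasawa decompositions (so $k_1=k(g_1h)$ and $k_2=k(g_2k_1)=k(g_2k(g_1h))$). Then
$$g_2g_1h=g_2k_1a_1n_1=k_2a_2n_2a_1n_1=k_2\,(a_2a_1)\,(a_1^{-1}n_2a_1)\,n_1.$$
The last two factors are in N since A normalises N, therefore $k_2=k(g_2g_1h)$.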
This proposition justifies the following definition and also shows that it defines an action of G on $i\mathfrak{k}^*$. Notice that we use the same notation for the newly introduced map as for the (complexified) coadjoint action of K. This will not lead to confusion as the two agree on K.
Definition 3.2. Let g ∈ G and $x\in i\mathfrak{k}^*$, and write $x=h\cdot x_0$ with h ∈ K and $x_0\in i\mathfrak{t}^*_+$. Then we set $g\cdot x:=k(gh)\cdot x_0$.
By construction, the restriction of $J_\lambda$ to the highest weight orbit is G-equivariant for every dominant weight λ.
For x ∈ ik * we will denote by G x the stabiliser subgroup with respect to this action. If x = h · x 0 with x 0 ∈ it * + and h ∈ K then G x is the (parabolic) subgroup generated by K x and hBh −1 and satisfies G x ∩ K = K x .
Since the G-action is defined in terms of the K-action, the G-orbits and the K-orbits are clearly the same. On the other hand, for general $x,x'\in i\mathfrak{k}^*$ the orbits $K_x\cdot x'$ and $G_x\cdot x'$ are not the same. We will need the following condition for equality.
Lemma 3.3. Let h ∈ K and $x,x'\in h\cdot i\mathfrak{t}^*_+$. Then $K_x\cdot x'=G_x\cdot x'$. In particular, for every $x\in h\cdot i\mathfrak{t}^*_+$ there is a neighbourhood in $h\cdot i\mathfrak{t}^*$ where the orbits under $K_x$ and $G_x$ agree.
Proof. It is clear that $K_x\cdot x'\subseteq G_x\cdot x'$. For the other direction, let $x_0=h^{-1}\cdot x$ and $x_0'=h^{-1}\cdot x'$ (so $x_0,x_0'\in i\mathfrak{t}^*_+$) and let $g\in G_x$. We have
$$g\cdot x'=k(gh)\cdot x_0'=h\,k(h^{-1}gh)\cdot x_0'. \tag{3.7}$$
We have $h^{-1}gh\in G_{x_0}$, therefore $k(h^{-1}gh)\in K_{x_0}$ (since B fixes $x_0$). This implies $g\cdot x'=\big(h\,k(h^{-1}gh)\,h^{-1}\big)\cdot x'\in K_x\cdot x'$. The second statement follows from the fact that a sufficiently small neighbourhood of x in $h\cdot i\mathfrak{t}^*$ only intersects those Weyl chambers whose closure contains x, and these can be moved into $h\cdot i\mathfrak{t}^*_+$ with an element of $K_x$.
We turn to the constant factor in (3.4). Again, there is at most one extension, which should map $(g,h\cdot x_0)$ to $e^{2\langle x_0,\alpha(gh)\rangle}$ and our next goal is to verify that this is well defined.
[…]
Proof. $K_{x_0}$ is generated by T and the infinitesimal generators $\mathfrak{k}\cap(\mathfrak{g}_\omega+\mathfrak{g}_{-\omega})$ where $\omega\in i\mathfrak{k}^*$ are those simple roots which are orthogonal to $x_0$ (with respect to any Ad-invariant inner product) and $\mathfrak{g}_\omega$ is the corresponding root space.
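Let u ∈ T. T normalises N and commutes with A, therefore if g = kan is the Iwasawa decomposition of g, then $gu=(ku)\,a\,(u^{-1}nu)$ is the Iwasawa decomposition of gu, so α(gu) = α(g).
Let $\eta\in\mathfrak{g}_\omega$. We wish to know how the α-component of g changes under right multiplication in the direction of η. The Iwasawa decomposition gives a diffeomorphism between K × A × N and G, therefore there exist unique $\xi\in\mathfrak{k}$, $\beta\in\mathfrak{a}$ and $\nu\in\mathfrak{n}$ such that
$$\frac{d}{ds}\,k\exp(\alpha)\,n\exp(s\eta)\Big|_{s=0}=\frac{d}{ds}\,k\exp(s\xi)\exp(\alpha+s\beta)\,n\exp(s\nu)\Big|_{s=0},$$
where $g=k\exp(\alpha)n$. Let $L_g:G\to G$ be the left translation by g and $T_eL_g$ its derivative at the identity. We calculate both sides of the above equation as
$$T_eL_g\,\eta=T_eL_g\big(\operatorname{Ad}_{(\exp(\alpha)n)^{-1}}\xi+\operatorname{Ad}_{n^{-1}}\beta+\nu\big).$$
From this we read off
$$\operatorname{Ad}_{\exp(\alpha)n}\eta=\xi+\operatorname{Ad}_{\exp(\alpha)}\beta+\operatorname{Ad}_{\exp(\alpha)n}\nu=\xi+\beta+\operatorname{Ad}_{\exp(\alpha)n}\nu$$
since A acts trivially on $\mathfrak{a}$. This is the Iwasawa decomposition (on the Lie-algebra level), since $\operatorname{Ad}_{\exp(\alpha)n}\nu\in\mathfrak{n}$ (we use that A normalises N), therefore $\beta=\frac{d}{ds}\alpha(g\exp(s\eta))\big|_{s=0}$ is the $\mathfrak{a}$-component of the Iwasawa decomposition of $\operatorname{Ad}_{\exp(\alpha)n}\eta$. We have […]
For the second statement, note that the condition $h_1\cdot x_0=h_2\cdot x_0$ is equivalent to $h_1^{-1}h_2\in K_{x_0}$. Using the first part with $gh_1$ and $u=h_1^{-1}h_2$ […]
This justifies the following definition.
Definition 3.5. Let g ∈ G and $x\in i\mathfrak{k}^*$, and write $x=h\cdot x_0$ with h ∈ K and $x_0\in i\mathfrak{t}^*_+$. Then we set $\chi_x(g):=e^{2\langle x_0,\alpha(gh)\rangle}$.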
[…] In particular, the restriction of $\chi_x$ to $G_x$ is a one dimensional character. From $\chi_x(e)=1$ and multiplicativity, $\chi_x(g_2g_1)=\chi_{g_1\cdot x}(g_2)\,\chi_x(g_1)$, it follows that $\chi_x(g^{-1})=\chi_{g^{-1}\cdot x}(g)^{-1}$.

Rate function and some properties
We now give the definition of the rate function appearing in Theorem 1.1, then we prove its equivalence with several other expressions.
[…] The maximum over n is attained at n = e, as can be seen by computing the trace in a basis where $T_e\pi(\alpha)$ is diagonal and π(N) consists of upper triangular matrices with 1 on their diagonals. The formula is K-invariant, therefore the infimum over K in (1.6) can be omitted, leading to the formula […] The character $\operatorname{Tr}\pi(\cdot)$ is K-invariant and $\langle x_0,\alpha\rangle$ is maximal within the K-orbit of α when α is in the image of the dominant Weyl chamber under the identification of $i\mathfrak{t}$ with $i\mathfrak{t}^*$ via an invariant inner product. Therefore the supremum can be restricted to this subset of $\mathfrak{a}$.
For a more precise asymptotic formula for the multiplicities see [23, Theorem 9].
Example 3.16 (Cramér, [8]). Let $K=U(1)^d$ and π : K → U(H) a finite dimensional unitary representation. Let ρ be arbitrary and write $r_n=\operatorname{Tr}P_n\rho$ where $P_n$ is the orthogonal projection corresponding to the irreducible representation labelled by $n\in\mathbb{Z}^d$ (see Example 2.5). If $X_1,X_2,\dots,X_m$ are independent and identically distributed discrete vector random variables that take the value n with probability $r_n$, then $\mu_m$ is the distribution of $\frac{1}{m}(X_1+X_2+\dots+X_m)$. To compute the rate function we use $\pi(\exp\alpha)=\sum_{n\in\mathbb{Z}^d}e^{\langle n,\alpha\rangle}P_n$:
$$I_\rho(x)=\sup_{\alpha\in\mathbb{R}^d}\ \langle x,\alpha\rangle-\ln\sum_{n\in\mathbb{Z}^d}r_ne^{\langle n,\alpha\rangle}.$$
[…] $n\in N$. With N the set of upper triangular unipotent matrices (see Example 2.8), it is possible to choose n in such a way that $\pi(n)^*\sigma\pi(n)$ is diagonal. As in Example 3.14, this is where the minimum is attained. To find the diagonal entries, note that the principal minors are invariant under this action of N. If $\operatorname{pm}_j(\sigma)$ denotes the determinant of the upper left j × j submatrix of σ (with the convention $\operatorname{pm}_0(\sigma)=1$), then the resulting diagonal entries are $\operatorname{pm}_j(\sigma)/\operatorname{pm}_{j-1}(\sigma)$ for j = 1, …, d. Let $\alpha_1,\dots,\alpha_d\in\mathbb{R}$ and $x_{0,1},\dots,x_{0,d}\in\mathbb{R}$ be the diagonal entries of $\alpha\in\mathfrak{a}\simeq\mathbb{R}^d$ and $x_0\in i\mathfrak{t}^*_+$. With $\sigma=\pi(\exp(\alpha/2))\pi(h)^*\rho\pi(h)\pi(\exp(\alpha/2))$ the rate function formula (3.21c) becomes […] The supremum can be found by differentiation, which gives […]
[…] as before, we choose N to be the group of pairs of upper triangular unipotent matrices. Let $x_{1,0},\alpha_1\in\mathbb{R}^{d_1}$, $x_{2,0},\alpha_2\in\mathbb{R}^{d_2}$, identified with diagonal matrices of sizes $d_1\times d_1$ and $d_2\times d_2$ and with $x_{1,0}$ and $x_{2,0}$ nonincreasing. For $(h_1,h_2)\in K$, the pair $(h_1\cdot x_{1,0},h_2\cdot x_{2,0})$ can be viewed as an element of $i\mathfrak{k}$. As in Example 3.17, the supremum over $(n_1,n_2)\in N$ in (3.21c) is attained when $n_1^*\exp(\alpha_1/2)h_1^*\psi h_2\exp(\alpha_2/2)n_2$ is diagonal, and the diagonal form can be determined using the invariance of the principal minors under the N-action.
The rate function is therefore […] The supremum is infinite if $x_{1,0}$ differs from $x_{2,0}$ up to trailing zeros, or any of the entries is negative, or if the vectors do not sum to one. Otherwise it evaluates to […]
Next we prove some properties of $I_\rho$.
[…] (3.37)
$G_x$ is a smooth manifold, the expression $\ln\chi_x(g)-\ln\operatorname{Tr}\pi(g)^*\rho\pi(g)$ is a smooth function of $g\in G_x$ and is zero for g = e. It follows that if e is not a critical point then the supremum is strictly positive. The tangent space decomposes as […] If e is a critical point then both derivatives vanish for every β and ν, thus J(ρ) and x agree on $h\cdot\mathfrak{b}$. Since they both vanish on $\mathfrak{k}$, this means x = J(ρ).
Recall that the domain of an extended real valued function f : X → (−∞, ∞] is the set dom f = {x ∈ X|f (x) < ∞}. We now show that the domain of the rate function is precompact.
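Proposition 3.21. Let $\Delta\subseteq i\mathfrak{t}^*$ be the convex hull of the set of weights appearing in the decomposition of H with respect to the action of T. Then for every $x\notin K\cdot\Delta$ we have $I_\rho(x)=\infty$.
Proof. Suppose that $x\notin K\cdot\Delta$ and write $x=h\cdot x_0$ with $x_0\in i\mathfrak{t}^*_+$, h ∈ K. Since Δ is compact and convex, there is a hyperplane in $i\mathfrak{t}^*$ separating Δ and $x_0$, so there is an element $\beta\in\mathfrak{a}$ such that $\langle x_0,\beta\rangle>\max_{x'\in\Delta}\langle x',\beta\rangle$.
We use (3.21a) and that $G_x$ contains $hAh^{-1}$, therefore for any $s\ge0$ we have
$$\begin{aligned}I_\rho(x)&\ge\ln\chi_x\big(h\exp(s\beta)h^{-1}\big)-\ln\operatorname{Tr}\pi(\exp(s\beta))\pi(h)^*\rho\,\pi(h)\pi(\exp(s\beta))\\&\ge2s\langle x_0,\beta\rangle-\ln e^{2s\max_{x'\in\Delta}\langle x',\beta\rangle}\\&=2s\Big(\langle x_0,\beta\rangle-\max_{x'\in\Delta}\langle x',\beta\rangle\Big).\end{aligned}$$
The coefficient of s is strictly positive, therefore letting s → ∞ shows that $I_\rho(x)=\infty$.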

Upper bound
We are in a position to prove the upper bound part of the large deviation principle. We prove a stronger statement involving any measurable set instead of only closed ones.
[…]
For any $x=\frac{1}{m}J_\lambda([v_\lambda])=h\cdot x_0$ and g ∈ G we can estimate the integrand using (3.12) as
$$[\ldots]\le\chi_{mx}(g^{-1})\operatorname{Tr}\pi^{\otimes m}(g)^*\rho^{\otimes m}\pi^{\otimes m}(g)=\big(\chi_x(g^{-1})\operatorname{Tr}\pi(g)^*\rho\pi(g)\big)^m=e^{-m\left(-\ln\chi_x(g^{-1})-\ln\operatorname{Tr}\pi(g)^*\rho\pi(g)\right)}. \tag{3.50}$$
We take the infimum over g and then bound from above by the supremum over x ∈ F: […] Since $\nu_\lambda$ is a probability measure, this value is also an upper bound on each integral. For the highest weights λ appearing in the decomposition of $H^{\otimes m}$ we can estimate the dimension as $\dim H_\lambda\le(m+1)^{\dim H(\dim H-1)/2}$. This can be seen by first decomposing into U(H)-isotypic components (which are also K-invariant subspaces) and then into K-isotypic ones and using the dimension formula for U(H)-representations corresponding to partitions of m into at most dim H parts.
It remains to bound the number of isomorphism classes of K-representations appearing in $H^{\otimes m}$. These are distinguished by their highest weights, so we get an upper bound by counting the total number of different weights. The weights of $H^{\otimes m}$ are sums of m weights from H (with multiplicity), therefore their number is upper bounded by $(m+1)^{\dim H}$.
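The polynomial growth of the number of weights is easy to observe experimentally. The following sketch enumerates the distinct sums of m weights for a small list of integer weight vectors (one entry per basis vector of H, so the length of the list is dim H) and compares the count with (m + 1)^(dim H). The function name `tensor_power_weights` and the sample weights are illustrative choices, not data from the paper.

```python
def tensor_power_weights(weights, m):
    """Return the set of distinct sums of m elements of `weights`.

    weights: list of integer tuples, listed with multiplicity.
    """
    zero = tuple(0 for _ in weights[0])
    current = {zero}
    for _ in range(m):
        current = {tuple(a + b for a, b in zip(w, v))
                   for w in current for v in weights}
    return current


if __name__ == "__main__":
    # Weights of a three dimensional example with respect to a rank 2 torus.
    weights = [(1, 0), (0, 1), (-1, -1)]
    dim_h = len(weights)
    for m in (1, 2, 4, 8, 16):
        count = len(tensor_power_weights(weights, m))
        print(m, count, (m + 1) ** dim_h)
```

In this example the count grows quadratically in m, well below the bound, while the dimension of the tensor power grows exponentially; this gap is what makes the counting factors irrelevant on the exponential scale.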
Combining these estimates we get (3.52) as claimed.
In particular, Proposition 3.24 implies part i of Theorem 1.1.

Law of large numbers
As an application of the upper bound (Proposition 3.24) we now show that the measures µ m converge weakly to the Dirac measure located at the value of the moment map.
Proposition 3.25. Let $C\subseteq i\mathfrak{k}^*$ be closed. Then $I_\rho$ has a minimum on C.
Proof. $I_\rho$ is lower semicontinuous by Corollary 3.19 and infinite outside a compact set by Proposition 3.21, therefore it has a minimum on C.
[…]
For the lower bound we employ a variant of the "change of measure" or "exponential tilting" technique [8]. However, instead of multiplying the measures $\mu_m$ by a suitable function, we replace ρ with (the normalised version of) an element in its G-orbit so that we retain the form (3.3) and thus we can use Theorem 1.2.
The following lemma translates the rate of exponential decay of the probability of an open set to the decay of the probability density on the rescaled integral orbits. This equivalent characterisation will ease the comparison of the original and the tilted measures.
[…]
For the other direction, let $\epsilon>0$, $m_0\in\mathbb{N}_{>0}$, h ∈ K, and $\lambda_0$ a dominant integral weight such that […] $\operatorname{Tr}\rho^{\otimes m_0}p_{\lambda_0,\pi^{\otimes m_0}}(h\cdot[v_{\lambda_0}])\ge e^{m_0(L-\epsilon)}$.
For m ∈ N let $q=\lfloor m/m_0\rfloor$ and $r=m-qm_0$. The tensor product of highest weight vectors is also a highest weight vector for the product representation and the weights are added. For h ∈ U and large m we have $h\cdot(q\lambda_0+r\lambda_1)\in mO$, therefore […] This is true for every $\epsilon>0$, therefore L is also a lower bound.
The next proposition introduces the tilted measures and compares them with the original one on open sets. We remark that, by the symmetry between ρ and ρ′, a similar inequality holds in the reverse direction.
Consider the set
$$M_\rho=\left\{\,g\cdot J\!\left(\frac{\pi(g)^*\rho\pi(g)}{\operatorname{Tr}\pi(g)^*\rho\pi(g)}\right)\ \middle|\ g\in G\right\}.$$
[…] holds and $M_\rho\subseteq\operatorname{dom}I_\rho$. We now show that $M_\rho$ is a dense subset of $\operatorname{dom}I_\rho$.
[…] The sequence $x_j=J(\rho_j)$ converges to x and satisfies $x_j\in h\cdot i\mathfrak{t}^*$, therefore for large j we have $g_j\cdot x_j\in K_x\cdot x_j$ by Lemma 3.3. $K_x$ acts by isometries (for any Ad-invariant inner product) and fixes x, therefore $g_j\cdot x_j\to x$.
When I ρ is continuous on its domain, Propositions 3.29 and 3.30 imply (1.4) for every open set. In general we have not been able to prove continuity, although we conjecture that it is indeed true. However, for states with full support Proposition 3.23 is sufficiently strong to finish the proof as follows.
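Proof of part ii of Theorem 1.1. Let $x\in O\cap\operatorname{dom}I_\rho$ and write $x=h\cdot x_0$ with h ∈ K and $x_0\in i\mathfrak{t}^*_+$. Let $y\in\operatorname{relint}((h\cdot i\mathfrak{t}^*_+)\cap\operatorname{dom}I_\rho)$ be arbitrary. Then for s ∈ [0, 1) we have $sx+(1-s)y\in\operatorname{relint}((h\cdot i\mathfrak{t}^*_+)\cap\operatorname{dom}I_\rho)$, so by Proposition 3.23 $I_\rho$ is continuous at these points. From Propositions 3.29 and 3.30 we conclude that
$$\liminf_{m\to\infty}\frac{1}{m}\ln\mu_m(O)\ge-I_\rho(sx+(1-s)y)$$
for every s < 1 sufficiently close to 1 (so that $sx+(1-s)y\in O$). We take the limit s → 1 and note that the function $s\mapsto I_\rho(sx+(1-s)y)$ is finite, convex and lower semicontinuous on [0, 1], therefore also continuous. Thus $-I_\rho(x)$ is a lower bound.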