On the transition between the disordered and antiferroelectric phases of the 6-vertex model

The symmetric six-vertex model with parameters $a,b,c>0$ is expected to exhibit different behavior in the regimes $a+b<c$ (antiferroelectric), $|a-b|<c\leq a+b$ (disordered) and $|a-b|>c$ (ferroelectric). In this work, we study the way in which the transition between the regimes $a+b=c$ and $a+b<c$ manifests. When $a+b<c$, we show that the associated height function is localized and its extremal periodic Gibbs states can be parametrized by the integers in such a way that, in the $n$-th state, the heights $n$ and $n+1$ percolate while the connected components of their complement have diameters with exponentially decaying tails. When $a+b=c$, the height function is delocalized. The proofs rely on the Baxter-Kelland-Wu coupling between the six-vertex and the random-cluster models and on recent results for the latter. An interpolation between free and wired boundary conditions is introduced by modifying cluster weights. Using triangular lattice contours ($\mathbb{T}$-circuits), we describe another coupling for height functions that in particular leads to a novel proof of the delocalization at $a=b=c$. Finally, we highlight a spin representation of the six-vertex model and obtain a coupling of it to the Ashkin-Teller model on $\mathbb{Z}^2$ at its self-dual line $\sinh 2J = e^{-2U}$. When $J<U$, we show that each of the two Ising configurations exhibits exponential decay of correlations while their product is ferromagnetically ordered.


Introduction
The six-vertex model is a classical model in statistical mechanics, which was initially introduced by Pauling [Pau35] in 1935 to study the structure of ice in three dimensions.
A two-dimensional version as well as ferroelectric (Slater [Sla41]) and antiferroelectric (Rys [Rys63]) variants were later introduced; see [Bax82, Chapter 8], [LW72], and [Res10] for introductory texts. In this work we discuss the two-dimensional six-vertex model, whose configurations are orientations of edges of the square grid (graphically indicated by arrows on the edges) which satisfy the ice rule: at each vertex, there are exactly two outgoing and two incoming arrows, yielding six possible local configurations. The local possibilities are assigned nonnegative weights $a_1, a_2, b_1, b_2, c_1, c_2$ as in Figure 1 and, in finite domains, the probability of each arrow configuration is proportional to the product of local weights (see Section 2.3). The model is typically studied under the assumption that the weights are invariant to reversal of all arrows, that is, $a_1 = a_2$, $b_1 = b_2$, $c_1 = c_2$ (zero external electric field), with the three weights termed $a, b, c$. Following Yang-Yang [YY66], Lieb [Lie67d,Lie67b,Lie67a,Lie67c] and Sutherland [Sut67], who found an expression for the free energy using the Bethe ansatz [Bet31], it is predicted that the behaviour of the model is governed by the value of
$$\Delta := \frac{a^2 + b^2 - c^2}{2ab},$$
with the following distinguished regimes:
• Antiferroelectric phase: $a + b < c$, equivalently $\Delta < -1$.
• Disordered phase: $|a - b| < c \leq a + b$, equivalently $-1 \leq \Delta < 1$.
• Ferroelectric phase: $|a - b| > c$, equivalently $\Delta > 1$.
Figure 1: The six possible arrow orientations at a vertex which satisfy the ice rule (exactly two outgoing and two incoming arrows) and their associated weights $a_1, a_2, b_1, b_2, c_1, c_2$.
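The classification by $\Delta$ can be checked numerically. The following is a minimal sketch, assuming the regime boundaries as stated in the abstract; the function names `delta` and `regime` are illustrative and do not come from the paper.

```python
def delta(a: float, b: float, c: float) -> float:
    """Return Delta = (a^2 + b^2 - c^2) / (2ab) for positive weights."""
    if min(a, b, c) <= 0:
        raise ValueError("weights a, b, c must be positive")
    return (a * a + b * b - c * c) / (2 * a * b)


def regime(a: float, b: float, c: float) -> str:
    """Classify (a, b, c) by the value of Delta (hypothetical helper)."""
    d = delta(a, b, c)
    if d < -1:
        return "antiferroelectric"      # a + b < c
    if d > 1:
        return "ferroelectric"          # |a - b| > c
    if d == 1:
        return "boundary |a - b| = c"   # not assigned to a regime in the text
    return "disordered"                 # |a - b| < c <= a + b


if __name__ == "__main__":
    for weights in [(1, 1, 3), (1, 1, 2), (1, 1, 1), (3, 1, 1)]:
        print(weights, delta(*weights), regime(*weights))
```

The comparison `d == 1` is exact only for weights whose $\Delta$ is exactly representable; for general floating-point input a tolerance would be needed.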
The degenerate case c = 0 is known as the corner percolation model [Pet08]. The goal of this work is to discuss the behavior of natural probabilistic observables in the different regimes. Our focus here is on the antiferroelectric phase (a + b < c) and on the boundary (a + b = c) between the antiferroelectric and disordered phases (see [BCG16,CGST20,Agg18] for recent progress on the ferroelectric phase). Our main results are:
• Height function of the six-vertex model: The variance of the height function with flat boundary conditions is bounded when a + b < c and grows as a logarithm of the distance to the boundary of the domain when a + b = c. The extremal Gibbs states that are invariant under parity-preserving translations are characterized. When a + b = c no such states exist. When a + b < c, the states are parameterized by the integers, with the n-th state having unique infinite connected components (in diagonal connectivity) for heights n and n+1, while all other heights together form connected components whose diameters have exponentially decaying tails.
• Spin representation of the six-vertex model: A spin representation introduced by Rys [Rys63], of a mixed Ashkin-Teller type, is analyzed. Constant boundary conditions for each of the two Ising spins induce order in both spins when a + b < c while leading to a disordered state when a + b = c.
• Gibbs states of the six-vertex model: The Gibbs states which exhibit infinitely many disjoint oriented circuits of alternating vertical and horizontal edges surrounding the origin (flat states) are classified. There are exactly two such extremal states when a + b < c, and in each the orientation of each edge has a non-uniform distribution. There is a unique such state when a + b = c.
• Self-dual Ashkin-Teller model on $\mathbb{Z}^2$: We introduce a coupling of the six-vertex model in the regime $a = b$ (the F-model), the symmetric Ashkin-Teller model on the self-dual curve $\sinh 2J = e^{-2U}$ and an associated graphical representation. It is then proved that when $J < U$ each of the two Ising configurations exhibits exponential decay of correlations while their product is ferromagnetically ordered. This is in contrast to the behavior for $J = U$ (the critical 4-state Potts model) and is in agreement with predictions in the physics literature.
The basic tool underlying the results is the Baxter-Kelland-Wu coupling [BKW76]. In the regime $a + b \leq c$ it gives a probabilistic coupling of the six-vertex model with the critical FK model with parameter $q = 4\Delta^2$ (the coupling extends as a complex measure to other values of the parameters). This allows the transfer of known results on the FK model to the six-vertex setting, with the most relevant fact being that the phase transition in the FK model is of first order when $q > 4$ and of second order when $q = 4$. Our results on the fluctuations of the height function follow from this correspondence in a relatively direct manner. However, the characterization of Gibbs measures for the height function when $a + b < c$ requires additional tools. The challenge is to show that the model with flat boundary conditions (with values 0 and 1) converges in the thermodynamic limit irrespective of the sequence of domains used to exhaust $\mathbb{Z}^2$. This is achieved via a new technique involving $\mathbb{T}$-circuits, triangular lattice contours suitably embedded in $\mathbb{Z}^2$, combined with a careful analysis of the percolative properties of the level sets of the height function (see overview in Section 2.7.1). Our understanding of the Gibbs states of the spin representation and of the six-vertex model itself is derived as a consequence, with the additional tool of an FKG inequality for the marginal of the spin configuration on one of the sublattices, which is established in the regime $a, b \leq c$. Lastly, as mentioned above, our results for the Ashkin-Teller model are based on a coupling of it with the six-vertex model with $a = b$ and an associated graphical representation that we introduce; this coupling extends previously discussed duality statements between the models. The graphical representation is proved to satisfy the FKG inequality in the regime $2a, 2b \leq c$, which is instrumental in deducing the exponential decay of correlations in each of the Ising configurations when $J < U$.
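As a small illustration of the parameter map in the Baxter-Kelland-Wu coupling, the sketch below computes $q = 4\Delta^2$ in the regime $a + b \leq c$ and reports the order of the FK phase transition as described above. The helper names are hypothetical.

```python
def fk_cluster_weight(a: float, b: float, c: float) -> float:
    """Cluster weight q = 4 * Delta^2 for weights with a + b <= c."""
    if min(a, b, c) <= 0:
        raise ValueError("weights a, b, c must be positive")
    if a + b > c:
        raise ValueError("this sketch covers only the regime a + b <= c")
    d = (a * a + b * b - c * c) / (2 * a * b)  # Delta <= -1 in this regime
    return 4 * d * d


def transition_order(q: float) -> str:
    """Order of the FK phase transition, for q >= 4 only."""
    if q < 4:
        raise ValueError("expected q >= 4")
    return "first order" if q > 4 else "second order"


if __name__ == "__main__":
    for weights in [(1, 1, 2), (1, 1, 3)]:
        q = fk_cluster_weight(*weights)
        print(weights, q, transition_order(q))
```

With $a = b = 1$, the boundary case $c = 2$ gives $q = 4$, while any $c > 2$ gives $q > 4$.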
We add that some of the ideas that we develop in the current article have already found further applications. In particular, the technique of $\mathbb{T}$-circuits that we introduce (Definition 6.3) yields a short proof of the delocalisation of the height function in the uniform case $a = b = c = 1$ (square ice), as we sketch in Section 9. Unlike the original proof [CPST21], our argument does not rely on Sheffield's seminal work [She05]. Also, our extension of the Baxter-Kelland-Wu coupling to the six-vertex model with the same weights in the bulk and on the boundary (Theorem 7, part 1) was used in the recent short proof by Ray and Spinka of the discontinuity of the phase transition in the Potts and random-cluster models when $q > 4$, originally proven in [DCGH+21] via the Bethe Ansatz.

Results
In this section we describe our results in detail.
We use the following definitions. The square lattice $\mathbb{Z}^2$ is embedded in $\mathbb{R}^2$ so that its edges are parallel to the coordinate axes and its faces are centered at points with integer coordinates. Let $F(\mathbb{Z}^2)$ denote the set of faces of $\mathbb{Z}^2$. A face centered at $(i, j)$ is called even if $i + j$ is even, and odd if $i + j$ is odd. For $u, v \in F(\mathbb{Z}^2)$ we write $|u - v|$ for the $\ell^1$ distance between the centers of $u$ and $v$. A finite subgraph $D \subset \mathbb{Z}^2$ is called a domain if there exists a simple cyclic path $P$ in $\mathbb{Z}^2$ such that $D$ coincides with the part of $\mathbb{Z}^2$ surrounded by $P$, including $P$ itself. The path $P$ is then termed the boundary of $D$ and is denoted by $\partial D$. For $N \in \mathbb{N}$, let $\Lambda_N$ be the domain whose vertices border the faces centered at all pairs of integers $(i, j)$ that satisfy $|i \pm j| \leq N - 1$. As is common, we write cluster for connected component. By parity-preserving translations we mean translations by vectors in $\{(i, j) \in \mathbb{Z}^2 : i + j \text{ is even}\}$. By all translations we mean translations by vectors in $\mathbb{Z}^2$.
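To make the definition of $\Lambda_N$ concrete, the following sketch enumerates the centers of the faces with $|i \pm j| \leq N - 1$ and splits them by parity. It lists faces only, not the vertices of the domain, and the function names are illustrative.

```python
def faces_of_lambda(N: int) -> list:
    """Centers (i, j) of the faces of Lambda_N, i.e. |i + j|, |i - j| <= N - 1."""
    if N < 1:
        raise ValueError("N must be a positive integer")
    r = N - 1
    # |i + j| <= r and |i - j| <= r force |i|, |j| <= r, so this range suffices.
    return [
        (i, j)
        for i in range(-r, r + 1)
        for j in range(-r, r + 1)
        if abs(i + j) <= r and abs(i - j) <= r
    ]


def is_even_face(face: tuple) -> bool:
    """A face centered at (i, j) is even when i + j is even."""
    i, j = face
    return (i + j) % 2 == 0


if __name__ == "__main__":
    for N in (1, 2, 3):
        faces = faces_of_lambda(N)
        n_even = sum(is_even_face(f) for f in faces)
        print(N, len(faces), n_even, len(faces) - n_even)
```

The face set is a diamond around the origin, and for small $N$ the total count agrees with $N^2 + (N-1)^2$.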

Height function representation
A function $h : F(\mathbb{Z}^2) \to \mathbb{Z}$ is called a height function if it satisfies the following (see Figure 3):
• for any two adjacent faces $u, v$, $|h(u) - h(v)| = 1$;
• for any face $u$, the parity of $h(u)$ is the same as the parity of $u$.
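The two conditions above are easy to test on a finite patch of faces. The sketch below is an illustration only: it represents a height function as a dictionary from face centers to integers and checks adjacency just between faces that are both present in the patch.

```python
def is_height_function(h: dict) -> bool:
    """Check the two defining conditions on a finite patch of faces.

    h maps face centers (i, j) to integer heights. Faces are adjacent when
    their centers differ by one unit in exactly one coordinate.
    """
    for (i, j), value in h.items():
        # Parity condition: h(u) has the same parity as i + j.
        if (value - (i + j)) % 2 != 0:
            return False
        # Gradient condition, checked once per pair (right and upper neighbour).
        for neighbour in ((i + 1, j), (i, j + 1)):
            if neighbour in h and abs(h[neighbour] - value) != 1:
                return False
    return True


def flat_patch(size: int) -> dict:
    """The flat 0/1 configuration on a size x size patch: 0 on even, 1 on odd faces."""
    return {(i, j): (i + j) % 2 for i in range(size) for j in range(size)}


if __name__ == "__main__":
    h = flat_patch(4)
    print(is_height_function(h))   # flat configuration
    h[(0, 0)] = 4                  # breaks |h(u) - h(v)| = 1
    print(is_height_function(h))
```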
Gradients of height functions are in a natural bijection with six-vertex configurations as described in Figure 2 (see also Figure 3). Thus, a height function determines a unique six-vertex configuration and a six-vertex configuration determines a height function up to the global addition of an even integer. Given a height function $t$, the finite-volume height-function measure with parameters $a, b, c > 0$ on a domain $D$ with boundary conditions $t$ is supported on height functions that coincide with $t$ at all faces outside of $D$ and is defined by
$$\mathrm{HF}^{t}_{D,a,b,c}(h) = \frac{1}{Z^{t}_{\mathrm{hf},D,a,b,c}}\, a^{n_1(h)+n_2(h)}\, b^{n_3(h)+n_4(h)}\, c^{n_5(h)+n_6(h)},$$
where $Z^{t}_{\mathrm{hf},D,a,b,c}$ is a normalizing constant and $n_i(h)$ is the number of vertices of $D$ that are of type $i$ according to the correspondence described in Figure 2 (up to an additive constant).
For integer $n$ and domain $D$, let $\mathrm{HF}^{n,n+1}_{D,a,b,c}$ be the height-function measure on $D$ with boundary conditions given by a function that takes values in $\{n, n+1\}$ (each face according to its parity); see Figure 3. Note that if $f$ is sampled from $\mathrm{HF}^{0,1}_{D,a,b,c}$ then $f + 2n$ is distributed as $\mathrm{HF}^{2n,2n+1}_{D,a,b,c}$ and $-f + 2n$ is distributed as $\mathrm{HF}^{2n-1,2n}_{D,a,b,c}$. In the next theorem, we show that the variance of the height at a fixed face is uniformly bounded when $a + b < c$ and logarithmic in the distance to the boundary when $a + b = c$.
Theorem 1 (Fluctuations). Let $D$ be a domain, let $a, b, c > 0$ and let $h^{0,1}_{D,a,b,c}$ be sampled from $\mathrm{HF}^{0,1}_{D,a,b,c}$. Then there exist $c_1, C_1, C_2 > 0$, depending only on $a, b, c$, such that for every face $u$ of $D$,
We note that an analogue of (1) is proven in [DCGH+21] for periodic boundary conditions (i.e., when the height functions are defined on a torus). Analogues of (2) are known when: [Ken00] (the free fermion point, by using its relation with the dimer model),
A measure HF on height functions is called a Gibbs state for height functions with parameters $a, b, c > 0$ if the following holds: Let $h$ be sampled from HF. For any domain $D$, conditioned on the values of $h$ on the faces outside of $D$, the distribution of $h$ equals $\mathrm{HF}^{t}_{D,a,b,c}$, where $t$ is an arbitrary height function which agrees with $h$ outside of $D$. A Gibbs state is called extremal if it has a trivial tail $\sigma$-algebra.
The next theorem characterizes the extremal Gibbs states for the height function which are invariant under parity-preserving translations.
Theorem 2 (Gibbs states: height functions).
1) Let $a, b, c > 0$ satisfy $a + b < c$. For each integer $n$ and sequence of domains $\{D_k\}$ increasing to $\mathbb{Z}^2$, the sequence of finite-volume measures $\mathrm{HF}^{n,n+1}_{D_k,a,b,c}$ converges to a Gibbs state $\mathrm{HF}^{n,n+1}_{a,b,c}$, which does not depend on $\{D_k\}$. The limiting Gibbs states are extremal and invariant under parity-preserving translations, and each Gibbs state with these two properties equals $\mathrm{HF}^{n,n+1}_{a,b,c}$ for some integer $n$. Moreover, the following properties are satisfied:
• Under $\mathrm{HF}^{n,n+1}_{a,b,c}$, clusters (in augmented connectivity) of heights different from $n$ and $n+1$ have diameters with exponential tail decay. Precisely, there exist $M, \alpha > 0$ such that for all $N \in \mathbb{N}$,
$$\mathrm{HF}^{n,n+1}_{a,b,c}\big(\exists\, \gamma : (0,0) \to \partial\Lambda_N \text{ with } h(u) \notin \{n, n+1\} \text{ for every face } u \text{ of } \gamma\big) \leq M e^{-\alpha N}, \tag{3}$$
where we write $\gamma : (0,0) \to \partial\Lambda_N$ to mean a path in $F(\mathbb{Z}^2)$ starting at $(0,0)$ and ending at a face bordered by an edge of $\partial\Lambda_N$.
• Each $\mathrm{HF}^{n,n+1}_{a,b,c}$ is positively associated and the stochastic ordering relation $\mathrm{HF}^{m,m+1}_{a,b,c} \prec \mathrm{HF}^{n,n+1}_{a,b,c}$ holds for $m < n$.
2) Let a, b, c > 0 satisfy a + b = c. Then there are no extremal Gibbs states for the height function which are invariant under parity-preserving translations.
It is a straightforward consequence that, in the case $a + b < c$, $\mathrm{HF}^{n,n+1}_{a,b,c}$-a.s., there exist infinitely many disjoint level lines separating the heights $n$ and $n + 1$ surrounding the origin.
Remark. It follows from [She05,Chapter 8] that all ergodic height-function Gibbs states are in fact extremal (see also [CPST21, Section 5] for a short proof in the uniform case a = b = c = 1). With this fact, Theorem 2 gives a complete description of translation-invariant height-function Gibbs states when a + b ≤ c.

Spin representation (mixed Ashkin-Teller model)
A function $\sigma : F(\mathbb{Z}^2) \to \{-1, 1\}$ is called a spin configuration satisfying the ice rule if around each vertex there is a pair of diagonally-adjacent faces on which $\sigma$ agrees. The set of all such functions is denoted $E_{\mathrm{spin}}(\mathbb{Z}^2)$. Spin configurations satisfying the ice rule, regarded up to a global spin flip, are in a natural bijection with six-vertex configurations as described in Figure 2 and its caption (see also Figure 3). In other words, each $\sigma \in E_{\mathrm{spin}}(\mathbb{Z}^2)$ determines a unique six-vertex configuration while a six-vertex configuration determines a pair of spin configurations satisfying the ice rule, related by a global spin flip.
There is a direct correspondence between height functions $h$ and spin configurations $\sigma$ satisfying the ice rule, which is consistent with the bijections of these objects with six-vertex configurations and is defined by setting $\sigma(u) = 1$ if $h(u) \equiv 0, 1 \pmod 4$ and $\sigma(u) = -1$ if $h(u) \equiv 2, 3 \pmod 4$ (see Figure 2 and Figure 3). A height function thus determines a unique $\sigma \in E_{\mathrm{spin}}(\mathbb{Z}^2)$ while each $\sigma \in E_{\mathrm{spin}}(\mathbb{Z}^2)$ determines a height function up to the global addition of an integer divisible by 4.
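The modulo 4 mapping and the ice rule for spins can be illustrated on a finite patch. This is a sketch under the same dictionary representation of faces as an assumption; the ice rule is checked only at vertices all four of whose surrounding faces lie in the patch, and the function names are hypothetical.

```python
def spin_from_height(h: dict) -> dict:
    """sigma(u) = +1 if h(u) = 0, 1 (mod 4) and -1 if h(u) = 2, 3 (mod 4)."""
    return {face: (1 if value % 4 in (0, 1) else -1) for face, value in h.items()}


def satisfies_ice_rule(sigma: dict) -> bool:
    """At each vertex, some pair of diagonally-adjacent faces must agree.

    The vertex at the upper-right corner of face (i, j) is surrounded by the
    faces (i, j), (i+1, j), (i, j+1), (i+1, j+1).
    """
    for (i, j) in sigma:
        corners = [(i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)]
        if not all(face in sigma for face in corners):
            continue  # vertex not fully surrounded by faces of the patch
        diag_agree = sigma[(i, j)] == sigma[(i + 1, j + 1)]
        antidiag_agree = sigma[(i + 1, j)] == sigma[(i, j + 1)]
        if not (diag_agree or antidiag_agree):
            return False
    return True


if __name__ == "__main__":
    patch = [(i, j) for i in range(-3, 4) for j in range(-3, 4)]
    tilted = {(i, j): i + j for (i, j) in patch}   # a height function of slope 1
    sigma = spin_from_height(tilted)
    print(satisfies_ice_rule(sigma))
    # Adding 4 leaves the spins unchanged; adding 2 flips all of them.
    print(spin_from_height({f: v + 4 for f, v in tilted.items()}) == sigma)
    print(spin_from_height({f: v + 2 for f, v in tilted.items()})
          == {f: -s for f, s in sigma.items()})
```

The ice rule holds for the image of any height function because, around a vertex, at least one diagonal pair of faces carries equal heights.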
The finite-volume spin measure with parameters $a, b, c > 0$ on a domain $D$ with boundary condition $\tau \in E_{\mathrm{spin}}(\mathbb{Z}^2)$ is supported on $\sigma \in E_{\mathrm{spin}}(\mathbb{Z}^2)$ that coincide with $\tau$ at all faces outside of $D$ and is defined by
$$\mathrm{Spin}^{\tau}_{D,a,b,c}(\sigma) = \frac{1}{Z^{\tau}_{\mathrm{spin},D,a,b,c}}\, a^{n_1(\sigma)+n_2(\sigma)}\, b^{n_3(\sigma)+n_4(\sigma)}\, c^{n_5(\sigma)+n_6(\sigma)}, \tag{4}$$
where $Z^{\tau}_{\mathrm{spin},D,a,b,c}$ is a normalizing constant and $n_i(\sigma)$ is the number of vertices of $D$ that are of type $i$ according to the correspondence described in Figure 2 and its caption. In particular, if $t$ is a height function which maps to $\tau$ under the modulo 4 mapping described above then the measure $\mathrm{Spin}^{\tau}_{D,a,b,c}$ is the push-forward of the measure $\mathrm{HF}^{t}_{D,a,b,c}$ by this mapping. For $\alpha, \beta \in \{-, +\}$, we use the notation $\mathrm{Spin}^{\alpha\beta}_{D,a,b,c}$ to denote the measure $\mathrm{Spin}^{\tau}_{D,a,b,c}$ in which $\tau$ is the configuration having sign $\alpha$ on all even faces and having sign $\beta$ on all odd faces.
A measure Spin on $E_{\mathrm{spin}}(\mathbb{Z}^2)$ is called a Gibbs state for the spin representation with parameters $a, b, c > 0$ if the following holds: Let $\sigma$ be sampled from Spin. For any domain $D$, conditioned on the values of $\sigma$ on the faces outside of $D$, the distribution of $\sigma$ equals $\mathrm{Spin}^{\tau}_{D,a,b,c}$, where $\tau \in E_{\mathrm{spin}}(\mathbb{Z}^2)$ is an arbitrary configuration which agrees with $\sigma$ outside of $D$. Let $G^{\mathrm{spin}}_{a,b,c}$ denote the set of all extremal (i.e., tail trivial) Gibbs states that are invariant under parity-preserving translations.
In the next theorem, we study the Gibbs states of the spin representation. In the regime of the antiferroelectric phase $a + b < c$, we construct four distinct measures (the push-forwards of the height measures $\mathrm{HF}^{n,n+1}_{a,b,c}$ for different values of $n$ by the modulo 4 mapping) and show that, under these measures, the correlations of spins at faces of the same parity are uniformly positive. In the regime $a + b = c$, we construct a measure $\mathrm{Spin}_{a,b,c} \in G^{\mathrm{spin}}_{a,b,c}$ and show that it may be characterized by the absence of certain infinite clusters. In discussing clusters of even (or odd) faces we consider two faces of the same parity adjacent if they share a vertex.
Theorem 3.
1) Let $a, b, c > 0$ satisfy $a + b < c$. For each $\alpha, \beta \in \{-, +\}$ and sequence of domains $\{D_k\}$ increasing to $\mathbb{Z}^2$, the sequence of finite-volume measures $\mathrm{Spin}^{\alpha\beta}_{D_k,a,b,c}$ converges to a Gibbs state $\mathrm{Spin}^{\alpha\beta}_{a,b,c} \in G^{\mathrm{spin}}_{a,b,c}$, which does not depend on $\{D_k\}$. The four limiting measures are distinct. Moreover, the measure $\mathrm{Spin}^{\alpha\beta}_{a,b,c}$ satisfies the following properties:
• Samples from $\mathrm{Spin}^{\alpha\beta}_{a,b,c}$ exhibit a unique infinite cluster of even faces with sign $\alpha$ and a unique infinite cluster of odd faces with sign $\beta$, almost surely.
• For each even (odd) face $u$, $\mathrm{Spin}^{\alpha\beta}_{a,b,c}(\sigma(u))$ does not depend on $u$ (by invariance), is non-zero and of the same sign as $\alpha$ (as $\beta$). In addition, for any $u$ and $v$ of the same parity,
2) Let $a, b, c > 0$ satisfy $a + b = c$. There exists a Gibbs state $\mathrm{Spin}_{a,b,c}$ for the spin representation with the following properties:
• For any sequence of domains $\{D_k\}$ increasing to $\mathbb{Z}^2$ and any $\tau \in E_{\mathrm{spin}}(\mathbb{Z}^2)$ which is either constant on all even faces or constant on all odd faces, the sequence of finite-volume measures $\mathrm{Spin}^{\tau}_{D_k,a,b,c}$ converges to $\mathrm{Spin}_{a,b,c}$.
• The measure $\mathrm{Spin}_{a,b,c}$ is invariant under all translations and is extremal (in particular, $\mathrm{Spin}_{a,b,c} \in G^{\mathrm{spin}}_{a,b,c}$). In addition, it is invariant under a global sign flip applied on either all even faces or all odd faces. Consequently, $\mathrm{Spin}_{a,b,c}(\sigma(u)) = 0$ and $\mathrm{Spin}_{a,b,c}(\sigma(u)\sigma(v)) = 0$ for two faces $u$ and $v$ of different parity.
• There exist $C, \alpha > 0$ so that for two faces $u, v$ of the same parity,
• Samples from $\mathrm{Spin}_{a,b,c}$ exhibit no infinite cluster of faces having the same parity and the same spin, almost surely.
In contrast, each other element of $G^{\mathrm{spin}}_{a,b,c}$ exhibits, almost surely, at least one infinite cluster of each of the four types: even $+1$, even $-1$, odd $+1$, odd $-1$.
The next theorem verifies the Fortuin-Kasteleyn-Ginibre (FKG) inequality (which implies positive association [FKG71]) for marginals of the spin representation in the regime $a, b \leq c$ (all of the antiferroelectric phase and part of the disordered phase). Denote by $F^\bullet(\mathbb{Z}^2)$ and $F^\circ(\mathbb{Z}^2)$ the sets of even and odd faces of $\mathbb{Z}^2$, respectively. Given $\sigma \in \{-1, 1\}^{F(\mathbb{Z}^2)}$, denote by $\sigma^\bullet \in \{-1, 1\}^{F^\bullet(\mathbb{Z}^2)}$ and by $\sigma^\circ \in \{-1, 1\}^{F^\circ(\mathbb{Z}^2)}$ the restrictions of $\sigma$ to $F^\bullet(\mathbb{Z}^2)$ and to $F^\circ(\mathbb{Z}^2)$, respectively. We endow $\{-1, 1\}^{F^\bullet(\mathbb{Z}^2)}$ with the pointwise partial order: $\sigma^\bullet \leq \tilde\sigma^\bullet$ if $\sigma^\bullet(u) \leq \tilde\sigma^\bullet(u)$ for every $u \in F^\bullet(\mathbb{Z}^2)$.
Theorem 4 (Positive association: spin representation). Let $D$ be a domain and consider $\tau \in E_{\mathrm{spin}}(\mathbb{Z}^2)$ that is equal to 1 at all odd faces outside of $D$. Suppose that $a, b, c > 0$ satisfy $a, b \leq c$. Then the marginal of $\mathrm{Spin}^{\tau}_{D,a,b,c}$ on $\sigma^\bullet$ satisfies the FKG lattice condition. In particular, for any increasing functions $f, g : \{-1, 1\}^{F^\bullet(\mathbb{Z}^2)} \to \mathbb{R}$,
$$\mathrm{Spin}^{\tau}_{D,a,b,c}\big(f(\sigma^\bullet)\, g(\sigma^\bullet)\big) \geq \mathrm{Spin}^{\tau}_{D,a,b,c}\big(f(\sigma^\bullet)\big) \cdot \mathrm{Spin}^{\tau}_{D,a,b,c}\big(g(\sigma^\bullet)\big).$$
(loop O(n) model). In particular, the result in [CMW98], via the mapping between the spin representation and the standard Ashkin-Teller model described in Section 8 below (see also [HDJS13]), allows one to derive Theorem 4 when $a/c = b/c \leq \frac{1}{\sqrt{2}}$ (but apparently not in the full range $a, b \leq c$). Also, the proof of Theorem 4 is closely related to that of [GM21, Theorem 2.6].
The spin representation of the six-vertex model was considered already by Rys [Rys63]. In the terminology of [HDJS13], it can be called an infinite-coupling limit mixed Ashkin-Teller model. The term 'mixed' refers to the fact that the spin configurations $\sigma^\bullet$ and $\sigma^\circ$ are defined on two lattices that are dual to each other, while in the standard Ashkin-Teller model both spin configurations are defined on the same lattice (see Section 2.4). The term 'infinite-coupling limit' refers, in our case, to the ice rule constraint.
A follow-up work of Lis [Lis22] studies the case of two interacting Potts models (see also Owczarek and Baxter [OB87] where a more general Temperley-Lieb interactions model is introduced) and proves, in particular, that they too satisfy an FKG inequality.

Orientations of edges in the six-vertex model
In this section, we state an immediate consequence of Theorem 3 for the six-vertex model in its classical representation in terms of edge orientations. As stated in the introduction, a six-vertex configuration is an orientation of the edges of $\mathbb{Z}^2$ that satisfies the ice rule at every vertex (two incoming and two outgoing edges); see Figure 1. Given a six-vertex configuration $\vec{\tau}$, the finite-volume six-vertex measure $\mathrm{SixV}^{\vec{\tau}}_{E,a,b,c}$, on a finite subset of edges $E \subset E(\mathbb{Z}^2)$ with boundary conditions $\vec{\tau}$, is supported on six-vertex configurations that coincide with $\vec{\tau}$ at all edges outside of $E$ and is defined by
$$\mathrm{SixV}^{\vec{\tau}}_{E,a,b,c}(\vec{\omega}) = \frac{1}{Z^{\vec{\tau}}_{\mathrm{SixV},E,a,b,c}}\, a^{n_1(\vec{\omega})+n_2(\vec{\omega})}\, b^{n_3(\vec{\omega})+n_4(\vec{\omega})}\, c^{n_5(\vec{\omega})+n_6(\vec{\omega})},$$
where $Z^{\vec{\tau}}_{\mathrm{SixV},E,a,b,c}$ is a normalizing constant and $n_i(\vec{\omega})$ is the number of endpoints of edges in $E$ at which the six-vertex configuration $\vec{\omega}$ is of type $i$ according to Figure 1. Gibbs states for the six-vertex model are then defined in the standard way (as in the previous sections).
Our analysis classifies extremal (i.e., tail trivial) Gibbs states which are flat in an appropriate sense (these are expected to be the only Gibbs states for which the associated height function has zero slope but that is not proved here).
Corollary 2.1 (Gibbs states: six-vertex model). A Gibbs state is termed 'flat' if in a configuration sampled from that state, almost surely, there are infinitely many disjoint oriented circuits which surround the origin and consist of alternating vertical and horizontal edges.
1. When $a + b < c$, there are exactly two extremal flat Gibbs states $\mathrm{SixV}^{\circlearrowleft}_{a,b,c}$, $\mathrm{SixV}^{\circlearrowright}_{a,b,c}$. These states are invariant under parity-preserving translations and under ninety-degree rotations around the origin and differ from each other by a global edge-orientation flip. Moreover, if we denote by $A(e)$ the event that the edge $e \in E(\mathbb{Z}^2)$ is oriented so that the even face that it borders lies on its left, then $\mathrm{SixV}^{\circlearrowleft}_{a,b,c}(A(e))$ does not depend on $e$ (by invariance) and it holds that $\mathrm{SixV}^{\circlearrowleft}_{a,b,c}(A(e)) > \frac{1}{2}$ and
for all edges $e, f \in E(\mathbb{Z}^2)$. (Corresponding statements hold for $\mathrm{SixV}^{\circlearrowright}_{a,b,c}$ as it differs from $\mathrm{SixV}^{\circlearrowleft}_{a,b,c}$ by a global edge-orientation flip.)
2. When a + b = c, the six-vertex model has a unique flat Gibbs state. This state is extremal and invariant under all translations.

Ashkin-Teller model
The Ashkin-Teller model was originally introduced [AT43] as a generalization of the Ising model to a four-state system. The definition in terms of two coupled Ising models that we provide below is due to Fan [Fan72]. We consider the (symmetric) Ashkin-Teller model on the square grid. We will later describe a coupling of the Ashkin-Teller model with the spin representation of the six-vertex model (Proposition 8.1) and in anticipation of this it is convenient to define the Ashkin-Teller model on the set of even faces $F^\bullet(\mathbb{Z}^2)$ of $\mathbb{Z}^2$ with diagonal connectivity (this graph is isomorphic to $\mathbb{Z}^2$) rather than on $\mathbb{Z}^2$ itself. Accordingly, we let $(\mathbb{Z}^2)^\bullet$ be the graph with vertex set $F^\bullet(\mathbb{Z}^2)$ and with edges between diagonally-adjacent faces. The Ashkin-Teller measure with parameters $J, U \in \mathbb{R}$ on a subgraph $\Omega$ of $(\mathbb{Z}^2)^\bullet$ is supported on pairs of spin configurations $(\tau, \tau') \in \{-1, 1\}^{V(\Omega)} \times \{-1, 1\}^{V(\Omega)}$ and is defined by
$$\mathrm{AT}_{\Omega,J,U}(\tau, \tau') = \frac{1}{Z_{\Omega,J,U}} \exp\Big[\sum_{\{u,v\} \in E(\Omega)} J\big(\tau(u)\tau(v) + \tau'(u)\tau'(v)\big) + U\,\tau(u)\tau(v)\tau'(u)\tau'(v)\Big], \tag{6}$$
where $Z_{\Omega,J,U}$ is a normalizing constant and the sum is taken over all edges in $\Omega$.
Proposition 8.1 shows that there is a coupling of the Ashkin-Teller measure with parameters $J, U$ and the spin representation measure (4) with parameters $a, b, c$, on suitable domains and with suitable boundary conditions, when the parameters satisfy the relations
$$\sinh 2J = e^{-2U}, \qquad a = b = 1, \qquad c = \coth 2J, \tag{7}$$
so that the configurations $(\tau, \tau')$ and $\sigma$ satisfy at every even face the equality
$$\tau \tau' = \sigma. \tag{8}$$
Figure 5: A sketch of the predicted phase diagram of the Ashkin-Teller model. The self-dual curve $\sinh 2J = e^{-2U}$ is in bold: it is critical when $J \geq U$ (solid) but expected to be non-critical when $J < U$ (dashed), while the critical curve is expected to split at the point $J = U$ into two critical curves dual to each other (no prediction for their exact location). Phase I: $\tau, \tau', \tau\tau'$ are ferromagnetically ordered. Phase II: $\tau, \tau', \tau\tau'$ are disordered. Phase III: $\tau\tau'$ is ferromagnetically ordered, while $\tau, \tau'$ are disordered. Phase IV: $\tau\tau'$ is antiferromagnetically ordered, while $\tau, \tau'$ are disordered. The line $J = U$ (dotted) corresponds to the 4-state Potts model. The line $U = 0$ corresponds to two independent Ising models. Only the regime $J \geq 0$ is drawn as the regime $J \leq 0$ is equivalent to it, by flipping the signs of the Ising models on one partite class of the square grid.
The first equality in (7) describes the self-dual curve of parameters for the Ashkin-Teller model and was first found by Mittag and Stephen [MS71] (see Figure 5). The relation between the Ashkin-Teller and the eight-vertex model was noticed already by Fan [Fan72] and then made explicit by Wegner [Weg72] (see also [Kot85, Section III]). In the particular case given by (7), this turns into a correspondence between the Ashkin-Teller and six-vertex models (see, e.g., [Bax82, Section 12.9]). This correspondence is upgraded here to a coupling of the models together with an FK-Ising representation that is introduced (Section 7.1 and Section 8), which facilitates the transfer of results between the models. Thus we obtain a coupling of the six-vertex model with $a = b = 1$ and $c > 1$ and the self-dual Ashkin-Teller model (see Section 9 for the limiting $c = 1$ case).
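The parameter relations $\sinh 2J = e^{-2U}$, $a = b = 1$, $c = \coth 2J$ can be explored numerically. The sketch below is only an illustration with hypothetical helper names: it solves the self-dual relation for $U$, computes $c$, and evaluates two special points mentioned in the text, the 4-state Potts point $J = U = \frac{1}{4}\log 3$ and the decoupled Ising point $U = 0$.

```python
import math


def self_dual_U(J: float) -> float:
    """Solve sinh(2J) = exp(-2U) for U, given J > 0."""
    if J <= 0:
        raise ValueError("J must be positive")
    return -0.5 * math.log(math.sinh(2 * J))


def six_vertex_c(J: float) -> float:
    """Six-vertex weight c = coth(2J) paired with a = b = 1."""
    if J <= 0:
        raise ValueError("J must be positive")
    return 1.0 / math.tanh(2 * J)


if __name__ == "__main__":
    J_potts = 0.25 * math.log(3)                 # J = U on the self-dual curve
    J_ising = 0.5 * math.log(1 + math.sqrt(2))   # U = 0 on the self-dual curve
    for J in (0.2, J_potts, 0.4, J_ising):
        U, c = self_dual_U(J), six_vertex_c(J)
        # With a = b = 1, the condition a + b < c reads c > 2.
        print(f"J={J:.4f}  U={U:.4f}  c={c:.4f}  J<U: {J < U}  c>2: {c > 2}")
```

In this parametrization the Potts point corresponds to $c = 2$ (that is, $a + b = c$), and along the curve the condition $J < U$ coincides with $c > 2$, consistent with the antiferroelectric regime being the one treated in the next theorem.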
In the next theorem, we show that on the self-dual curve when $J < U$ (see Figure 5), correlations in $\tau$ and $\tau'$ decay exponentially (disordered regime) while correlations in the product $\tau\tau'$ are uniformly positive (ordered regime), in agreement with the predicted [Kno75,DBGK80] phase diagram of the Ashkin-Teller model (see also [HDJS13, Section 5] for a recent survey with explicit computations). For integer $k \geq 0$, let $\Omega_k$ be the induced subgraph of $(\mathbb{Z}^2)^\bullet$ on the faces of $\mathbb{Z}^2$ whose centers are at $(i, j) \in [-k-1, k+1]^2$. Define $\mathrm{AT}^{\mathrm{free},+}_{\Omega_k,J,U}$ to be the Ashkin-Teller measure conditioned on the event that $\tau = \tau'$ on the internal boundary of $\Omega_k$.
Theorem 5. Let $J, U > 0$ be such that $\sinh 2J = e^{-2U}$ and $J < U$. Then, the sequence of measures $\mathrm{AT}^{\mathrm{free},+}_{\Omega_k,J,U}$ converges (weakly) to a measure $\mathrm{AT}^{\mathrm{free},+}_{J,U}$ that is translation invariant and extremal. Further, there exist $C, c, \alpha > 0$ such that for any two vertices $u, v$ of $(\mathbb{Z}^2)^\bullet$,
We briefly survey some of the rigorous results on the phase transition of the Ashkin-Teller model near its self-dual curve. It is natural to search for possible phase transitions when changing the parameters along the lines in which $J/U$ is constant (this corresponds to changing the temperature when the term in the exponent of (6) is multiplied by an inverse temperature parameter). When doing so with $J > U > 0$, under plus boundary conditions, correlations of $\tau$, $\tau'$ and $\tau\tau'$ can be shown to undergo a sharp phase transition at the same curve of parameters $\gamma_c$: they decay exponentially fast in the distance when $(J, U)$ is below $\gamma_c$ and stay uniformly positive when $(J, U)$ is above $\gamma_c$. Provided that the transition under free boundary conditions also occurs at $\gamma_c$, this implies that $\gamma_c$ coincides with the self-dual curve $\sinh 2J = e^{-2U}$. Rigorous results on the critical behavior are available only at $J = U = \frac{1}{4}\log 3$ (critical 4-state Potts model), where all correlations are known to have power-law decay [DCST17], and at $J = \frac{1}{2}\log(1 + \sqrt{2})$, $U = 0$ (two independent critical Ising models), where correlations in $\tau$ and $\tau'$ decay as $|u - v|^{-1/4}$ [Ons44, MW14, Smi10, CHI15]. When $0 < J < U$ and the parameters are varied on a line with $J/U < \frac{1}{2}$ fixed, Pfister [Pfi82] (see also Häggström [Häg98]) proved that there exist three phases: a disordered phase and an ordered phase (for $\tau$, $\tau'$ and $\tau\tau'$) as well as an intermediate phase in which $\tau$ and $\tau'$ are disordered but $\tau\tau'$ is ordered (see Figure 5).
This behavior is predicted to persist, when $0 < J < U$, for all values of the ratio $J/U$, and Theorem 5 supports this prediction as it shows that on the part of the self-dual curve $\sinh 2J = e^{-2U}$ with $0 < J < U$ the model indeed has the properties of the intermediate phase. However, our results do not show that the intermediate phase extends beyond the self-dual curve.
The following corollary is a straightforward consequence of the positive association of the spin representation established in Theorem 4 and the coupling between the spin representation and the Ashkin-Teller models described in Proposition 8.1.
Corollary 2.2. Let $J > 0$ and $U \in \mathbb{R}$ be such that $\sinh 2J = e^{-2U}$. Then, for any $k$, the marginal of the measure $\mathrm{AT}^{\mathrm{free},+}_{\Omega_k,J,U}$ on the product $\tau\tau'$ of the spins satisfies the FKG lattice condition (in particular, it is positively associated).

Monotonicity in the boundary coupling constant
The starting point for our analysis of the six-vertex model is the Baxter-Kelland-Wu [BKW76] coupling of it with the random-cluster model. Originally, the coupling was stated for the six-vertex model on general planar graphs with no boundary condition. In Section 3 we describe an extended version of this coupling on domains in $\mathbb{Z}^2$ for the models under two different types of boundary conditions. In particular, the wired boundary condition in the critical random-cluster model (discussed in the original work [BKW76]) corresponds to a six-vertex model in which the parameters $a, b, c$ assigned to vertices on the boundary of the domain differ from the parameters in the bulk. We call attention to it as a useful extension of the model. As we describe below, on a class of domains, the height-function measures are stochastically ordered with respect to these boundary parameters and these parameters enable a sort of continuous interpolation between different boundary conditions for the six-vertex model.
Let $D$ be a domain in $\mathbb{Z}^2$. Denote by $\partial_V D$ the set of vertices belonging to exactly one face of $D$; see Figure 6. Given a height function $t$, the finite-volume height-function measure on $D$ with boundary conditions $t$ and parameters $a, b, c, c_b > 0$ is supported on height functions that coincide with $t$ at all faces outside of $D$ and is defined by
$$\mathrm{HF}^{t,c_b}_{D,a,b,c}(h) = \frac{1}{Z^{t,c_b}_{\mathrm{hf},D,a,b,c}}\, a^{n'_1(h)+n'_2(h)}\, b^{n'_3(h)+n'_4(h)}\, c^{n'_5(h)+n'_6(h)}\, c_b^{\,n_b(h)}, \tag{10}$$
where $Z^{t,c_b}_{\mathrm{hf},D,a,b,c}$ is a normalizing constant, $n'_i(h)$ is the number of vertices of $D \setminus \partial_V D$ that are of type $i$ according to the correspondence described in Figure 2 (up to an additive constant) and $n_b(h)$ is the number of vertices in $\partial_V D$ which are of type 5 or 6 according to the figure (the boundary weights of vertices of types 1-4 are fixed to one).
As in Section 2.1, we write $\mathrm{HF}^{n,n+1,c_b}_{D,a,b,c}$ when the boundary condition $t$ takes values in $\{n, n+1\}$. Denote by $\partial_{\mathrm{ext}} D$ (and call it the external boundary of $D$) the set of faces in $\mathbb{Z}^2 \setminus D$ that are adjacent to faces in $D$. We say that a domain $D$ in $\mathbb{Z}^2$ is of fixed parity if all faces in $\partial_{\mathrm{ext}} D$ have the same parity. According to this parity, the domain is then called even or odd.
where $D \setminus \partial_V D$ denotes the graph obtained from $D$ after removing all vertices of $\partial_V D$ (together with the edges incident to them).
Remark. It follows from the proof that varying $c_b$ from $0$ to $\infty$ allows one to interpolate continuously between $-1, 0$ and $0, 1$ boundary conditions.
This monotonicity with respect to $c_b$ follows from the well-known positive association for the height-function measure when $c \geq \max\{a, b\}$ (see [BHM00, Proposition 2.2] and Proposition 5.1 below). Similarly, the positive association stated in Theorem 4 for the marginals of the spin representation on the even and the odd sublattices implies that these marginals are stochastically ordered in $c_b$. More precisely, let $\mathrm{Spin}^{+;c_b}_{D,c}$ be supported on the set of spin configurations on $\mathbb{Z}^2$ that are equal to $+1$ outside of $D$ and defined by
where $Z^{+;c_b}_{\mathrm{spin},D,c}$ is a normalizing constant; $n'_i(\sigma)$ and $n_b(\sigma)$ are the counterparts for the spins of the corresponding quantities in (10) for the heights.

An FK model with a modified boundary-cluster weight
In Section 3 we further introduce a second coupling of the six-vertex model with the random-cluster model, in which the six-vertex model has the same parameters on the boundary and in the bulk but the random-cluster model is modified by assigning a special weight q b ≠ q to boundary clusters. This modified random-cluster model appears natural, as varying q b between 1 and q allows a continuous interpolation between wired and free boundary conditions. Indeed, following our work, it was used in [RS20] in a short proof of the discontinuity of the phase transition for q > 4. We describe the model and some of its basic properties in this section.
Let Ω be a subgraph of a square lattice and let E(Ω) denote the set of edges in Ω. Given q, q b > 0 and p ∈ [0, 1], the random-cluster measure $\mathrm{RC}^{q_b}_{\Omega,q,p}$ is supported on $\eta \in \{0,1\}^{E(\Omega)}$ and is defined by
$$\mathrm{RC}^{q_b}_{\Omega,q,p}(\eta) = \frac{1}{Z^{q_b}_{RC,\Omega,q,p}}\, q_b^{k_b(\eta)}\, q^{k_i(\eta)}\, p^{o(\eta)}\, (1-p)^{c(\eta)},$$
where $Z^{q_b}_{RC,\Omega,q,p}$ is a normalizing constant, $k_b(\eta)$ denotes the number of boundary clusters of η (i.e. connected components containing at least one boundary vertex), $k_i(\eta)$ denotes the number of interior (i.e. non-boundary) clusters, $o(\eta)$ denotes the number of open edges in η, and $c(\eta)$ denotes the number of closed edges in η.
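The definition above can be made concrete on a toy graph. The following is a minimal sketch, not taken from the paper: the function names (`cluster_counts`, `rc_weight`, `rc_measure`) and the three-vertex path used in the usage note are illustrative assumptions, and the set of boundary vertices is passed in by hand.

```python
from itertools import product


def cluster_counts(vertices, edges, eta, boundary):
    """Return (k_b, k_i): the numbers of boundary and interior clusters of eta.

    eta is a tuple of 0/1 values, one per edge (1 = open).
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (u, v), is_open in zip(edges, eta):
        if is_open:
            parent[find(u)] = find(v)

    all_roots = {find(v) for v in vertices}
    boundary_roots = {find(v) for v in boundary}
    return len(boundary_roots), len(all_roots) - len(boundary_roots)


def rc_weight(vertices, edges, eta, boundary, q, q_b, p):
    """Unnormalized weight q_b^{k_b} q^{k_i} p^{o} (1-p)^{c} of eta."""
    k_b, k_i = cluster_counts(vertices, edges, eta, boundary)
    n_open = sum(eta)
    n_closed = len(eta) - n_open
    return q_b**k_b * q**k_i * p**n_open * (1 - p) ** n_closed


def rc_measure(vertices, edges, boundary, q, q_b, p):
    """Normalized measure on {0,1}^E, obtained by brute-force enumeration."""
    weights = {
        eta: rc_weight(vertices, edges, eta, boundary, q, q_b, p)
        for eta in product((0, 1), repeat=len(edges))
    }
    z = sum(weights.values())
    return {eta: w / z for eta, w in weights.items()}
```

For instance, on the path with vertices `[0, 1, 2]`, edges `[(0, 1), (1, 2)]` and boundary `[0, 2]`, the all-closed configuration has two boundary clusters and one interior cluster. Setting `q_b = q` gives every cluster the same weight, and setting `q_b = 1` makes the number of boundary clusters irrelevant.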
In the classical definition of the random-cluster measure due to Fortuin and Kasteleyn [FK72] (see also [Gri06, Section 1.2]), one does not distinguish between boundary clusters and interior clusters. The boundary conditions are defined only by merging (wiring) certain boundary vertices, thus influencing the count of boundary clusters. If all boundary vertices are wired together, the boundary conditions are called wired, and if there is no wiring, the boundary conditions are called free. It is easy to see that q b = 1 corresponds to the wired boundary conditions, q b = q corresponds to the free boundary conditions, and the measures RC q b Ω,q,p with different values of q b ∈ [1, q] thus interpolate between wired and free boundary conditions (see Proposition 4.2 below).
In [BDC12] (see also [DCRT18,DCRT19] for alternative proofs), it was shown that when q ≥ 1, the random-cluster model undergoes a phase transition at
$$p_c(q) := \frac{\sqrt{q}}{\sqrt{q}+1}$$
in terms of the correlation length: independently of the boundary conditions, the model exhibits exponential decay of the size of clusters when p < p c (q), and the origin belongs to an infinite cluster with positive probability when p > p c (q). In particular, for all p ≠ p c (q), the infinite-volume limit does not depend on the boundary conditions.
We focus on the critical case p = p c (q). Here it was shown that the free and the wired measures are the same [DCST17] when q ∈ [1, 4], and different [DCGH+21, RS20] when q > 4. This raises a natural question: when q > 4, what is the limit for each particular value of q b ∈ [1, q]? In the next theorem, we partially answer it.
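Theorem 6. i) Let q > 4 and λ > 0 be such that $\sqrt{q} = e^{\lambda} + e^{-\lambda}$. Take any sequence $\Omega_k$ of increasing domains. Then
• for all $q_b \in [1, e^{-\lambda}\sqrt{q}]$, the limit of $\mathrm{RC}^{q_b}_{\Omega_k,q,p_c(q)}$ is the same and is equal to the wired random-cluster Gibbs measure;
• for all $q_b \in [e^{\lambda}\sqrt{q}, q]$, the limit of $\mathrm{RC}^{q_b}_{\Omega_k,q,p_c(q)}$ is the same and is equal to the free random-cluster Gibbs measure.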
ii) When q ∈ [1, 4], the infinite-volume limit of RC q b Ω k ,q,pc(q) is the same, for any q b ∈ [1, q], and is equal to the unique random-cluster Gibbs measure.
It is reasonable to expect that the limiting measure is wired for all q b < √ q and free for all q b > √ q (see Question 3 in Section 9).
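As a numerical illustration of the ranges in Theorem 6, the sketch below computes λ from q and evaluates the endpoint $e^{\lambda}\sqrt{q}$ of the free range together with the value $e^{-\lambda}\sqrt{q}$ that arises in the coupling of Section 3. The function name `theorem6_thresholds` is a hypothetical helper introduced here for illustration only.

```python
import math


def theorem6_thresholds(q):
    """Return (lam, lower, upper) for q > 4.

    lam > 0 solves sqrt(q) = e^lam + e^-lam = 2*cosh(lam);
    lower = e^-lam * sqrt(q) and upper = e^lam * sqrt(q).
    """
    if q <= 4:
        raise ValueError("this sketch assumes q > 4")
    sqrt_q = math.sqrt(q)
    lam = math.acosh(sqrt_q / 2)
    return lam, math.exp(-lam) * sqrt_q, math.exp(lam) * sqrt_q


if __name__ == "__main__":
    q = 6.25
    lam, lower, upper = theorem6_thresholds(q)
    # The two values straddle sqrt(q) and their product is q.
    print(lam, lower, math.sqrt(q), upper, q)
```

Since $e^{-\lambda}\sqrt{q} = 1 + e^{-2\lambda}$ and $e^{\lambda}\sqrt{q} = 1 + e^{2\lambda}$, the two values always satisfy $1 < e^{-\lambda}\sqrt{q} < \sqrt{q} < e^{\lambda}\sqrt{q} < q$, so the interval between them is exactly the range of $q_b$ left open by the theorem.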

Overview of the proofs
The proofs of our main results on the height function (Theorem 1, Theorem 2), spin (Theorem 3 and Theorem 4) and six-vertex representations (Corollary 2.1) are detailed below under the restriction that a = b = 1. This special case is called the F-model and was first considered by Rys [Rys63] (apparently named after Rys' advisor Fierz [Gaa79,Sim93]). We chose to present the proofs in this restricted setting in order to further highlight the main ideas and to avoid encumbering the notation but they extend to the general case directly, as detailed in Section 2.7.2. We note that the connection to the self-dual Ashkin-Teller model (Section 2.4) is available only in the restricted setting a = b = 1.
In this section we provide an overview of the proofs in the restricted setting and then discuss the required modifications to handle the general case.

Overview of the proofs for a = b = 1
Baxter-Kelland-Wu correspondence: A crucial tool in our analysis is a correspondence introduced by Baxter-Kelland-Wu (BKW) [BKW76], extending an earlier partition function relation by Temperley-Lieb [TL71]. BKW described a correspondence of the random-cluster model on a finite planar graph G with a six-vertex model on the medial graph of G. For domains in Z 2 , the random-cluster model corresponds to the F-model with the choice of parameters
$$p = p_c(q) = \frac{\sqrt{q}}{1+\sqrt{q}}, \qquad c = \sqrt{2+\sqrt{q}}.$$
This choice leads to a critical random-cluster model (criticality is proven in [BDC12] for q ≥ 1; see also [DCRT18,DCRT19] for alternative proofs). The BKW correspondence allows us to use recent results establishing the order of the phase transition in the random-cluster model with q ≥ 1: second order for q ∈ [1, 4] [DCST17] and first order for q > 4 [DCGH+21, RS20]. Our results apply in the regime c ≥ 2, corresponding to q ≥ 4, where the correspondence is given by a probabilistic coupling. For c < 2, the correspondence involves a complex measure, complicating the further transfer of results.
In the BKW correspondence it is important to take into account the effect of boundary conditions. Theorem 7 states the correspondence for q ≥ 4, as probabilistic couplings of the random-cluster model and the height function representations, for two different sets of boundary conditions.
In the work of BKW, the random-cluster model is considered on a general finite planar graph under free or wired boundary condition, and boundary vertices in the corresponding six-vertex model have degree two. On domains on Z 2 , we show that this six-vertex model can be defined in the standard way on full-plane arrow configurations satisfying the ice rule if a modified parameter c b is introduced on the boundary (see Section 2.5) and satisfies the relation
$$c_b = e^{\lambda/2}, \qquad \text{where } \sqrt{q} = e^{\lambda} + e^{-\lambda}.$$
We extend the correspondence to the six-vertex model with the standard boundary weight as long as the random-cluster model is modified so that connected components touching the boundary receive the cluster weight q b satisfying the relation
$$q_b = e^{-\lambda}\sqrt{q}.$$
The results on the modified random-cluster model are stated in Section 2.6. When q ≥ 1 and q b ∈ [1, q], we show that the modified random-cluster model is positively associated and thus the results known for the standard random-cluster model can be extended directly. This is used in our proofs for the case c = 2.
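The parameter relations of the BKW correspondence for a = b = 1 can be tabulated numerically. The sketch below assumes the relations $\sqrt{q} = e^{\lambda} + e^{-\lambda}$, $c = e^{\lambda/2} + e^{-\lambda/2}$ and $c_b = e^{\lambda/2}$ that appear in the proof of the coupling in Section 3, with the choice λ ≥ 0; the function name `bkw_parameters` is a hypothetical helper, not notation from the paper.

```python
import math


def bkw_parameters(q):
    """Map q >= 4 to the F-model parameters (a = b = 1) of the BKW coupling.

    Returns a dict with lam >= 0, the bulk weight c, the critical edge
    probability p_c and the modified boundary weight c_b.
    """
    if q < 4:
        raise ValueError("the coupling is probabilistic only for q >= 4")
    sqrt_q = math.sqrt(q)
    lam = math.acosh(sqrt_q / 2)          # sqrt(q) = 2*cosh(lam)
    c = 2 * math.cosh(lam / 2)            # c = e^{lam/2} + e^{-lam/2}
    return {
        "lam": lam,
        "c": c,
        "p_c": sqrt_q / (1 + sqrt_q),
        "c_b": math.exp(lam / 2),
    }


if __name__ == "__main__":
    for q in (4.0, 6.25, 16.0):
        print(q, bkw_parameters(q))
```

The identity $c^2 = e^{\lambda} + 2 + e^{-\lambda} = 2 + \sqrt{q}$ explains why q = 4 corresponds to c = 2 and why q > 4 corresponds to c > 2.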
Height representation: When coupling the height representation and the random-cluster model, the faces on which the height function is defined correspond to the vertices and the dual vertices of the random-cluster model (even faces to vertices and odd faces to dual vertices) in such a way that the height is constant on every primal and dual cluster. The fluctuations of the height function (Theorem 1) are then analyzed for c > 2 using the existence of an infinite cluster and for c = 2 using Russo-Seymour-Welsh (RSW) techniques. We point out that for c = 2 the random-cluster model is considered with the modified boundary weight q b but monotonicity of the model in the boundary weight (Proposition 4.2 and Proposition 4.1) allows us to extend the known RSW estimates to this case. The case c = 2 of Theorem 2 may be deduced from Theorem 1 and its proof. In order to prove Theorem 2 in the case c > 2, a more detailed analysis of the height functions is performed. More precisely, when q > 4, it is known [DCGH+21] that the critical random-cluster measures on finite domains with wired boundary conditions converge to an infinite-volume limit that exhibits an infinite cluster with finite 'holes' whose diameters have exponentially decaying tail probability. Assigning heights to the primal and dual clusters according to the BKW coupling, this translates to the convergence of the height-function measure with parameter c > 2 on even domains with 0, 1-boundary conditions (and a modified boundary weight) to an infinite-volume height-function measure HF 0,1 even,c that exhibits an infinite cluster of diagonally-adjacent faces of height 0 (with holes whose diameter has exponentially decaying tail probability). Similarly, the measure HF 0,1 odd,c is obtained as the limit over odd domains and exhibits an infinite cluster of height 1. However, it is a priori not clear whether these two measures are equal.
The argument proving that HF 0,1 even,c = HF 0,1 odd,c is one of the main novelties of the current article. Monotonicity of the heights implies that HF 0,1 odd,c stochastically dominates HF 0,1 even,c and we derive the equality of the measures by showing stochastic domination in the opposite direction.
As the first step, we study the set of even faces u where h(u) ≥ 2 and the set where h(u) ≤ 0 when h is sampled from HF 0,1 odd,c . The standard connectivity structure on the even faces is to link (i, j) with (i ± 1, j ± 1). An important trick in our argument is to augment this to a triangular lattice connectivity, denoted T-connectivity, in which (i, j) is linked with (i ± 1, j ± 1) and also with (i ± 2, j). As the T-connectivity is still planar, standard percolation techniques (uniqueness and non-coexistence of infinite clusters) imply that, in this connectivity, there is at most one infinite cluster of h(u) ≥ 2 and at most one infinite cluster of h(u) ≤ 0 and these may not coexist. From this we deduce that h(u) ≥ 2 cannot have an infinite cluster in the T-connectivity. The latter uses several ideas: symmetry between the set h(u) ≥ 2 and the set h(u) ≤ 0 when h is sampled from the (naturally-defined) HF 2,1 odd,c measure; monotonicity arguments; the non-coexistence statement. Consequently, as the T-connectivity is dual to itself, there are infinitely many disjoint T-circuits on which h(u) ≤ 0.
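The T-connectivity described above is easy to write down explicitly. The following sketch lists the six T-neighbours of an even face and checks that they form a triangular-lattice neighbourhood; the function names and the convention that a face (i, j) is even when i + j is even are assumptions made for this illustration.

```python
def is_even(face):
    """Parity convention assumed here: (i, j) is even when i + j is even."""
    i, j = face
    return (i + j) % 2 == 0


def diagonal_neighbors(face):
    """Standard connectivity on even faces: (i, j) ~ (i +- 1, j +- 1)."""
    i, j = face
    return [(i + 1, j + 1), (i - 1, j + 1), (i - 1, j - 1), (i + 1, j - 1)]


def t_neighbors(face):
    """T-connectivity: the diagonal neighbours together with (i +- 2, j).

    The six neighbours are returned in counterclockwise cyclic order.
    """
    i, j = face
    return [
        (i + 2, j), (i + 1, j + 1), (i - 1, j + 1),
        (i - 2, j), (i - 1, j - 1), (i + 1, j - 1),
    ]


def is_triangular_at(face):
    """Check that cyclically consecutive T-neighbours are T-adjacent."""
    nbrs = t_neighbors(face)
    return all(
        nbrs[(k + 1) % 6] in t_neighbors(nbrs[k]) for k in range(6)
    )
```

Every even face has six T-neighbours, all of them even, and consecutive neighbours are themselves linked, which is the local structure of the triangular lattice and the reason circuits in this connectivity block T-connected paths.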
As a second step, we define a new height function h ′ by The crucial observation in the argument (indeed, the main use of the T-connectivity) is the following: On any finite connected component of Z 2 \(C∪C ′ ) the boundary values in h ′ are necessarily larger or equal to those of h (see Figure 9). Thus, conditioned on C, the distribution of h ′ in this component stochastically dominates that of h. As there are infinitely many disjoint T-circuits with h(0) ≤ 0, this implies that HF 0,1 even,c stochastically dominates HF 0,1 odd,c , as we wanted to show.
A technical point for concluding Theorem 2 is that the height-function measure on finite domains needs to be taken with a modified boundary weight and it is a priori possible that its effect remains in the infinite-volume limit. However, after proving the equality of the two infinite-volume measures, an extra monotonicity argument (using Proposition 2.3) allows us to adjust the boundary weight (Corollary 6.4).
Spin and FK-Ising representations: As discussed in Section 2.2, it is useful to regard the spin representation σ as consisting of two coupled Ising configurations σ • and σ • . Here σ • (σ • ) is the restriction of σ to the even (odd) faces (a mixed Ashkin-Teller representation). It is natural to condition on one of the Ising configurations, say σ • , and consider the other configuration σ • which is then exactly an Ising model on a graph obtained from the even faces by contracting some of the diagonal edges in a manner specified by σ • . This Ising model is ferromagnetic when c ≥ 1 and then naturally has an associated FK-Ising bond configuration which we denote by ξ. When conditioning on σ • one can treat ξ using the standard tools, in particular, its known monotonicity (FKG) properties. In our arguments, however, it is also important to consider the unconditional measure of ξ, i.e., when averaging over σ • , and this measure is later referred to as an FK-Ising representation of the Ashkin-Teller model (related to a random-cluster representation introduced in [PV97]). As it turns out, the averaged measure satisfies monotonicity (FKG) properties when c ≥ 2 (see Proposition 7.4) but this is not used in proving the theorems of Section 2.2.
The monotonicity properties of the spin configuration σ • stated in Theorem 4 follow by checking the FKG lattice condition, but the calculation is non-trivial as the probability density of σ • involves a sum over all possibilities for σ • . This sum can be rewritten using a product of partition functions of Ising models on different domains and eventually the required inequality is proved using the monotonicity properties of the standard FK-Ising model.
For c > 2, Theorem 3 is derived from Theorem 2 using the fact that the spin representation is obtained from the modulo 4 of the height function. An additional ingredient is the equality (a standard relation for FK-Ising models) which relates the expectation of the spin σ(u) in the limiting measure to the probability that u is connected to infinity in the FK-Ising representation of the Ashkin-Teller model. This already proves the non-negativity of the spin expectations and strict positivity is then deduced from the properties in Theorem 2. For c = 2, Theorem 3 is derived by coupling the spin representation directly to the random-cluster model with modified boundary cluster weight q b and using the RSW estimates which are available there (again, using monotonicity in q b ). The description of infinite clusters in the extremal invariant Gibbs measures relies on percolation techniques and uses the monotonicity properties of Theorem 4.
Six-vertex representation: The properties of flat Gibbs states of the six-vertex model stated in Corollary 2.1 are derived from Theorem 3. Recall that a flat Gibbs state satisfies that sampled configurations contain infinitely many disjoint oriented circuits surrounding the origin and consisting of alternating vertical and horizontal edges. The connection to Theorem 3 is enabled by noting that the spins on either side of such an oriented circuit must take a constant value.
Self-dual Ashkin-Teller model: As discussed in Section 2.4, the self-dual Ashkin-Teller model is known to be in correspondence with the six-vertex model (see, e.g., [Bax82,Section 12.9]). This correspondence is upgraded here to a probabilistic coupling of the Ashkin-Teller model, the spin representation and the aforementioned FK-Ising representation of the Ashkin-Teller model. The correspondence maps the entire self-dual line to the six-vertex model with parameters a = b = 1 and c > 1 (the limiting case c = 1 is discussed in Section 9), with the regime J < U mapped to the regime c > 2. The coupling is described in Proposition 8.1 and we review its main features here.
The Ashkin-Teller configuration (τ, τ ′ ) is defined on the lattice of even faces (with diagonal connectivity). Under the coupling, the spin configuration on the even faces σ • equals the product of τ τ ′ , which already implies long-range order for the product when J < U (equation (8) of Theorem 5) and also the FKG inequality of Corollary 2.2, by using Theorem 3 and Theorem 4. Recall now that the FK-Ising representation ξ is a percolation configuration on the diagonal bonds between the odd faces, whence the dual percolation ξ * is on the bonds between even faces. Under the coupling, conditioned on σ • (which equals τ τ ′ ) and on ξ * , the configuration τ is obtained by assigning a uniform spin value to each cluster of ξ * , independently among the clusters; this also specifies τ ′ .
The coupling directly links the decay of correlations in τ to the connection probabilities in ξ * . Existence of an infinite cluster in ξ is deduced from Theorem 2. It is additionally shown (Proposition 7.4) that ξ * satisfies the FKG inequality in the regime c ≥ 2 (this should be compared with the fact that the spins σ • are shown in Theorem 4 to satisfy the inequality in the wider range c ≥ 1). By a general non-coexistence theorem ([DCRT19, Theorem 1.5]), all clusters in ξ * are finite. An exponential decay of their diameters (stated in (9)) is then derived from the BKW coupling and an exponential decay of dual clusters in the critical wired random-cluster measure for q > 4.

Extension to the case a ≠ b
We proceed to explain the modifications required to our arguments when the parameters of the six-vertex model satisfy a ≠ b.
The main required modification is to the coupling between the six-vertex model and the random-cluster model that is described in Section 3. In the case a = b = 1, Theorem 7 of that section provides a coupling of the six-vertex measure HF 0,1 D,c with the random-cluster measure with modified boundary-cluster weight RC q b D • ,q,pc(q) , when the parameters satisfy with the coupling supported on compatible configurations (i.e., assigning constant heights on primal and dual clusters). The theorem additionally provides a coupling of the six-vertex measure with modified boundary-vertex weight HF 0,1;c b D,c and the standard random-cluster measure RC 1 D • ,q,pc(q) when In the case a ≠ b there is still a coupling of the six-vertex model with a random-cluster model, as described by Baxter-Kelland-Wu [BKW76]. In this case, the relation between q and the parameters a, b, c is given by The random-cluster model obtained assigns different probabilities $p^h_c$, $p^v_c$ to open horizontal and open vertical edges, respectively, according to the formulas Thus one obtains, with the same proof, an analog of the coupling theorem, Theorem 7, with the same notion of compatible configurations and with the parameters related by (15) and (16) and additionally by where the relations determine λ only up to its sign but both choices lead to a valid coupling. Additionally, in the coupling in which the modified boundary-vertex weight c b is used, the weights assigned to boundary vertices of the first four types in Figure 1 are fixed to one.
Our arguments further use input on the critical properties of the random-cluster model as described in items (iv) and (v) of Proposition 4.2. To adapt these, we rely on the work of Duminil-Copin-Li-Manolescu [DCLM18] who extended many of the results known for the random-cluster model on the square lattice to the setting of the random-cluster model on isoradial graphs. In the language of that work, the above random-cluster model with edge probabilities (16) is the critical random-cluster model on the isoradial graph given by a rectangular grid (the criticality condition is $\frac{p_v}{1-p_v}\cdot\frac{p_h}{1-p_h} = q$). This allows us to use the following results from [DCLM18]: (i) for q ∈ [1, 4], the phase transition is of the second order and one has Russo-Seymour-Welsh estimates on crossings [DCLM18, Theorem 1.1]; (ii) for q > 4, the phase transition is of the first order and the wired infinite-volume measure exhibits a unique infinite cluster and exponential decay of dual clusters [DCLM18, Theorem 1.2].
A modification of a more minor nature concerns the formulas involving the FK-Ising representation of the Ashkin-Teller model ξ introduced in Section 7.1. As explained in Section 2.7.1 for the case a = b, we obtain ξ by conditioning on σ • and then letting ξ be the FK-Ising bond configuration corresponding to σ • (which, after the conditioning, is a ferromagnetic Ising model). Exactly the same construction of ξ is used in the case a ≠ b and thus the arguments involving ξ are still applicable, though the precise formulas governing the distribution of ξ are modified. Specifically, the construction of ξ conditioned on both σ • and σ • which is described in Figure 10 is used with the following modification: in vertices where both the diagonally-adjacent spins of σ • agree and the diagonally-adjacent spins of σ • agree, the probability with which the edge of ξ is present equals $\frac{c-a}{c}$ if the top-left face of the vertex is odd and equals $\frac{c-b}{c}$ if it is even. The rest of the arguments used for deriving our results apply directly to the case a ≠ b with only notational changes. We emphasize in particular that the arguments involving T-circuits do not rely on reflection symmetry. We also point out that the connection with the Ashkin-Teller model and the results of Section 2.4 are specific to the case a = b and are not extended here (the case a ≠ b of the six-vertex model corresponds to a staggered Ashkin-Teller model). Theorem 6, concerning the random-cluster model with modified boundary weights, and its proof directly extend to the critical random-cluster model with the edge probabilities (16).

Coupling between the six-vertex and the random-cluster models
The correspondence between the six-vertex and the random-cluster models has been known since the seminal paper by Temperley and Lieb [TL71] and was described geometrically by Baxter, Kelland, and Wu [BKW76] (BKW). As outlined in Sections 2.5 and 2.7, we upgrade the correspondence to a coupling and show how to describe the boundary condition for the six-vertex model by introducing the boundary parameter c b , instead of changing the set of configurations as in the work of BKW. Additionally, we extend the statement to the setup where the parameters in the six-vertex model are the same inside the domain and on its boundary. Following our work, this extension was used in [RS20] to provide a new proof of the first-order phase transition in the random-cluster model with q > 4.
We start by introducing the graphs where the random-cluster model will be defined. Let D be a domain. Recall that ∂D denotes the circuit formed by boundary edges of D, and ∂ ext D denotes the set of faces in Z 2 \ D that are adjacent to faces in D. For every face u ∈ ∂ ext D and every vertex z on ∂D belonging to u, we call the pair (u, z) a corner of D. The corner (u, z) is called even or odd according to the parity of u.
Define a graph on the set of all even faces and corners of D by drawing edges according to the rule:
• any two even faces of D having a common vertex are linked by an edge;
• even corner (u, z) and even face v of D are linked by an edge if z ∈ v;
• even corners (u, z) and (v, z ′ ) of D are linked by an edge if z = z ′ .
In this graph, we identify every two corners (u, z) and (u, z ′ ) such that zz ′ is an edge of D. We say that a height function h ∈ E 0,1 hf (D) is compatible with an edge-configuration η ∈ {0, 1} E(D • ) and write η ⊥ h, if it has a constant value at every cluster of η and η * (primal and dual clusters); see Figure 8. We say that cluster C of η and cluster C * of η * are adjacent, and denote this by C ∼ C * , if there exist u ∈ C and u * ∈ C * that correspond to two adjacent faces of D or to two corners of D that share a vertex.
For two adjacent clusters C and C * , we write C ≺ C * if C is surrounded by C * .
Recall the height-function measures HF 0,1 D,c and HF 0,1;c b D,c defined in Sections 2.1 and 2.5 (where we fix a = b = 1), the random-cluster measure RC q b D • ,q,p defined in Section 2.6, and that $p_c(q) := \frac{\sqrt{q}}{\sqrt{q}+1}$.
Then, the measures HF 0,1 D,c and RC q b D • ,q,pc(q) can be coupled in such a way that the joint law is supported on pairs of compatible configurations (h, η) and can be written in either of the two following ways: claim, it is enough to show that P edge (h, η) = P cluster (h, η) and that: Relation (20) follows immediately. Indeed, summing P edge over all edge configurations compatible with h, one obtains that every vertex of D contributes e λ/2 + e −λ/2 = c if the corresponding four heights agree on diagonals, and it contributes 1 otherwise. This coincides with the definition of P hf (h).
We now show (21). The height at the unique boundary cluster C * of η * equals 1 and the height at every boundary cluster C of η equals 0, whence the contribution of each such pair (C, C * ) to the LHS of (21) equals e −λ = q b / √ q. All height functions h compatible with η can be obtained by exploring the adjacency graph of clusters of η and η * starting from the boundary and at each step choosing independently whether the height is increasing or decreasing by one. Thus, every non-boundary cluster of η and η * contributes e λ + e −λ = √ q to the LHS of (21). Substituting this in (21), we get where in the first line we use that k i (η) = k(η) − k b (η) and k i (η * ) = k(η * ) − 1; in the second line we used the identity k(η * ) − k(η) − 1 = o(η) − |V (D • )| (follows from Euler's formula and can be checked by induction in o(η)) and the fact that $\sqrt{q} = \frac{p_c(q)}{1-p_c(q)}$.
It remains to show that (19) and (18) describe the same probability measure. For any pair of adjacent non-boundary clusters C and C * that satisfy C ≺ C * and any height function h compatible with η and such that h(C) − h(C * ) = 1, define h C,C * on the faces of Z 2 such that h C,C * (·) = h(·) − 2 on C and its interior, and h C,C * (·) = h(·) outside of C. It is immediate that P cluster (h C,C * , η) = e −2λ P cluster (h, η).
We now prove that the same is true for P edge ; see Figure 8 for an illustration. The edges of D separating C from C * form a cyclic path ℓ of alternating vertical and horizontal edges that does not visit twice the same edge (but can visit twice the same vertex of D). Consider the difference between the expression in (19) computed for h C,C * and for h: Only edges of D • corresponding to vertices on ℓ have a non-zero contribution to ∆ C,C * :
• if z has degree 2 in ℓ, then e z contributes λ/2 if e z ∈ η and −λ/2 if e * z ∈ η * ;
• if z has degree 4 in ℓ, then e z contributes λ if e z ∈ η and −λ if e * z ∈ η * .
Going along ℓ in a clockwise direction, we obtain that every left turn occurs at an edge of η (and contributes λ/2 to ∆ C,C * ) and every right turn occurs at an edge of η * (and contributes −λ/2 to ∆ C,C * ). Since ℓ is a non-self-intersecting curve oriented clockwise, it has 4 more right turns than left turns, whence ∆(h C,C * , h) = −2λ and P edge (h C,C * , η) = e −2λ P edge (h, η).
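The turn-counting step can be checked on small examples. The sketch below counts left and right turns along a closed rectilinear curve given by its corners and evaluates the resulting sum of ±λ/2 contributions. It is only an illustration under simplifying assumptions: the curve is a simple polygon listed by its corner points in clockwise order, the y-axis points upward, and the names `turn_counts` and `delta` are hypothetical.

```python
def turn_counts(corners):
    """Count (left, right) turns of a closed rectilinear curve.

    corners: list of (x, y) corner points in the order they are visited;
    the curve closes up from the last corner back to the first.
    """
    n = len(corners)
    left = right = 0
    for k in range(n):
        x0, y0 = corners[k - 1]
        x1, y1 = corners[k]
        x2, y2 = corners[(k + 1) % n]
        # Cross product of the incoming and outgoing directions:
        # positive means a left (counterclockwise) turn.
        cross = (x1 - x0) * (y2 - y1) - (y1 - y0) * (x2 - x1)
        if cross > 0:
            left += 1
        elif cross < 0:
            right += 1
    return left, right


def delta(corners, lam):
    """Sum of +lam/2 per left turn and -lam/2 per right turn."""
    left, right = turn_counts(corners)
    return (left - right) * lam / 2


if __name__ == "__main__":
    square = [(0, 0), (0, 1), (1, 1), (1, 0)]                  # clockwise
    l_shape = [(0, 0), (0, 2), (1, 2), (1, 1), (2, 1), (2, 0)]  # clockwise
    print(turn_counts(square), turn_counts(l_shape))
    print(delta(l_shape, lam=1.0))
```

For both example curves the number of right turns exceeds the number of left turns by 4, so the total contribution is −2λ regardless of the shape, in agreement with the factor $e^{-2\lambda}$ obtained for P cluster.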
The operation h → h C,C * can be described analogously when C * is surrounded by C. The combination of such operations can bring any height function h ∈ E 0,1 hf (D) to the 0, 1 height function that is equal to 0 at all even faces and to 1 at all odd faces. Since we showed that this operation has the same effect on P edge and P cluster , it is enough to show that the two probability measures are equal when h is a 0, 1 height function. In the latter case, we have: By Euler's formula, the right-hand sides of the above equations are the same up to a constant and this finishes the proof.
2) The second item is a straightforward consequence of the first item when one conditions all boundary edges to be open (where an edge of D • is a boundary edge if its endpoints are corners of D). Indeed, this sets wired boundary conditions for the random-cluster model and the contribution of the boundary edges to (20) equals e λ/2 = c b .
Since height functions are in correspondence with spin configurations (see Section 2.2), the coupling with the random-cluster model can also be stated for the spin representation. Similarly to above, we say that a spin configuration σ ∈ E ++ spin (D) and an edge-configuration η ∈ {0, 1} E(D • ) are compatible if σ is constant at each cluster of η and η * .
Corollary 3.1. In the notation of Theorem 7, the measures Spin ++ D,c and RC q b D • ,q,pc(q) (and the measures Spin +;e λ/2 D,c and RC 1 D • ,q,pc(q) if D is even) can be coupled in such a way that the joint law is supported on pairs of compatible configurations (σ, η) and can be written in either of the two following ways: For c = 2, the coupling becomes a uniform measure.
Corollary 3.2. 1) The measures HF 0,1 D,2 and RC 2 D • ,4,pc(4) can be coupled in such a way that the joint law is a uniform measure on pairs (h, η) of compatible configurations. In particular, the distribution of the height at a particular face u of D according to HF 0,1;1 D,2 is that of a simple random walk that starts at 0, at each step goes up or down by 1 uniformly, and makes in total as many steps as there are clusters of η and η * surrounding u, where η is distributed according to RC 2 D • ,4,pc(4) . If D is an even domain, the same holds for the measures HF 0,1;2 D,2 and RC 1 D • ,4,pc(4) . 2) Similarly, the measures Spin ++ D,2 and RC 1 D • ,4,pc(4) can be coupled in such a way that the joint law is a uniform measure on pairs (σ, η) of compatible configurations; and the same for Spin +;1 D,2 and RC 1 D • ,4,pc(4) if D is an even domain.
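The random-walk description in Corollary 3.2 gives an explicit conditional law for the height once the number of surrounding clusters is fixed. The sketch below computes the law of a simple random walk started at 0 after N steps; the function names are hypothetical and N plays the role of the number of clusters of η and η * surrounding the face u.

```python
from math import comb


def height_distribution(n_clusters):
    """Law of a simple random walk started at 0 after n_clusters steps.

    Returns a dict mapping each reachable height to its probability.
    """
    n = n_clusters
    return {2 * k - n: comb(n, k) / 2**n for k in range(n + 1)}


def height_variance(n_clusters):
    """Variance of the walk; equals the number of steps."""
    dist = height_distribution(n_clusters)
    mean = sum(h * p for h, p in dist.items())
    return sum((h - mean) ** 2 * p for h, p in dist.items())


if __name__ == "__main__":
    print(height_distribution(2))
    print(height_variance(5))
```

Conditionally on η, the variance of the height equals the number of surrounding clusters, which is why estimates on that number translate into estimates on the fluctuations of the height function at c = 2.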

Input from the random-cluster model
In this section, we discuss some fundamental properties of the random-cluster model RC q b Ω,q,p introduced in Section 2.6, with a priori different weights q b and q for boundary and non-boundary clusters. These properties are derived in a straightforward manner from the known results on the standard random-cluster model: classical results are described in [Gri06], and the relevant recent results were established in [BDC12, DCST17,
Define a partial order on {0, 1} E(Ω) as follows: η ≤ η ′ if η(e) ≤ η ′ (e) for any e ∈ E(Ω). An event A ⊂ {0, 1} E(Ω) is called increasing if its indicator is an increasing function with respect to this partial order.
It is well-known that when q ≥ 1, the standard random-cluster model is positively associated ([Gri06, Theorem (3.8)]): any two increasing events are positively correlated. Below we show that when q b ∈ [1, q], the measure P q b RC,Ω,q,p satisfies the FKG lattice condition, which in particular implies positive association [FKG71].
Proof. We write P instead of RC q b Ω,q,p for brevity. By [Gri06, Theorem (2.22)] it is sufficient to consider only pairs of configurations that differ on exactly two edges. In this case, the lattice condition takes the form
$$P[\eta^{ef}]\, P[\eta_{ef}] \ge P[\eta^{e}_{f}]\, P[\eta^{f}_{e}], \tag{25}$$
where e, f ∈ E(Ω), η ∈ {0, 1} E(Ω) , and all four configurations $\eta^{ef}$, $\eta_{ef}$, $\eta^{e}_{f}$, $\eta^{f}_{e}$ agree with η on E(Ω) \ {e, f}, with the edges in the superscript open and the edges in the subscript closed. The term counting the edges cancels out in (25) and it remains to take care of the number of clusters. We are going to use the following notation: In this notation, we need to show that: It is easy to see that (e, f ) = 2, then each of e and f connects two different boundary clusters. Using the inequalities q ≥ 1 and q b ∈ [1, q], it is then enough to show that This means that clusters in η ef containing endpoints of e and f can be denoted by C 1 , C 2 , C 3 , so that: e connects C 1 and C 2 ; f connects C 2 and C 3 ; C 1 and C 3 are boundary clusters; C 2 is an interior cluster. Clearly, in this case ∆ i (e)+∆ i (f )−∆ i (e, f ) = 1 and the proof is finished.
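The FKG lattice condition for pairs of configurations differing on two edges can be verified by brute force on a small graph. The following sketch does this on an n × n grid, taking the outer ring of vertices as the boundary; this choice of graph and boundary, the use of log-weights with a numerical tolerance, and all function names are assumptions made for the illustration and are not part of the proof.

```python
from itertools import combinations
from math import log


def grid_graph(n):
    """n x n grid graph; the boundary is the outer ring of vertices."""
    vertices = [(i, j) for i in range(n) for j in range(n)]
    edges = [((i, j), (i + 1, j)) for i in range(n - 1) for j in range(n)]
    edges += [((i, j), (i, j + 1)) for i in range(n) for j in range(n - 1)]
    boundary = [v for v in vertices if 0 in v or n - 1 in v]
    return vertices, edges, boundary


def log_weight(vertices, edges, boundary, mask, q, q_b, p):
    """log of q_b^{k_b} q^{k_i} p^{o} (1-p)^{c}; bit idx of mask = edge idx open."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    n_open = 0
    for idx, (u, v) in enumerate(edges):
        if mask >> idx & 1:
            n_open += 1
            parent[find(u)] = find(v)
    k = len({find(v) for v in vertices})
    k_b = len({find(v) for v in boundary})
    return (k_b * log(q_b) + (k - k_b) * log(q)
            + n_open * log(p) + (len(edges) - n_open) * log(1 - p))


def satisfies_fkg_lattice(n, q, q_b, p, tol=1e-9):
    """Check P[eta^{ef}] P[eta_{ef}] >= P[eta^e_f] P[eta^f_e] for all eta, e, f."""
    vertices, edges, boundary = grid_graph(n)
    m = len(edges)
    lw = [log_weight(vertices, edges, boundary, mask, q, q_b, p)
          for mask in range(1 << m)]
    for mask in range(1 << m):
        closed = [e for e in range(m) if not mask >> e & 1]
        for e, f in combinations(closed, 2):
            be, bf = 1 << e, 1 << f
            if lw[mask | be | bf] + lw[mask] < lw[mask | be] + lw[mask | bf] - tol:
                return False
    return True


if __name__ == "__main__":
    print(satisfies_fkg_lattice(3, q=5.0, q_b=2.0, p=0.6))
```

The check enumerates all $2^{12}$ configurations of the 3 × 3 grid, so it is only feasible for very small graphs; it is meant as a sanity check of the inequality for parameters with q ≥ 1 and q b ∈ [1, q], not as a substitute for the case analysis in the proof.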
Denote by RC wired Ω,q,p and RC free Ω,q,p the standard random-cluster measures on Ω ⊂ Z 2 with wired and free boundary conditions, respectively. As defined above,  i) One has RC 1 Ω,q,p = RC wired Ω,q,p and RC q Ω,q,p = RC free Ω,q,p . In particular, as Ω ↗ Z 2 , the infinite-volume limits RC 1 q,p and RC q q,p are well-defined and coincide with the wired and free Gibbs states for the random-cluster model.
iii) Let p ≠ p c (q) and Ω k be a sequence of domains increasing to Z 2 . Then the infinite-volume limit of RC q b Ω k ,q,p exists, is independent of q b and coincides with the unique Gibbs state for the random-cluster model with parameters q, p. We denote it by RC q,p .
iv) The statement of item iii) holds true also if q ∈ [1, 4] and p = p c (q). Also, the following Russo-Seymour-Welsh (RSW) type estimate holds for any vertex u ∈ Ω and some constants c, C > 0 independent of Ω: where N Ω is the number of connected components surrounding u.
v) Let q > 4. Then, under RC 1 Ω,q,pc , the size of any dual cluster has exponential tails. In particular, RC 1 q,pc -a.s. there exists a unique infinite cluster and, under RC q q,pc , the sizes of clusters exhibit exponential decay.
Proof. i) When q b = q, all clusters receive the same weight. There is no imposed connectivity on the boundary. Thus, this value of q b corresponds to free boundary conditions. When q b = 1, the number of boundary clusters has no influence on the distribution. This is equivalent to counting all of them as one cluster. Thus, this value of q b corresponds to wired boundary conditions.
ii) In the same way as for the standard random-cluster model ([Gri06, Theorem (3.21)]), the statement follows from the FKG inequality shown in Proposition 4.1.
iii) By [Gri06, Theorem (6.17)] and item i), for any $q \geq 1$ and $p \neq \frac{\sqrt{q}}{\sqrt{q}+1}$, the measures $\mathrm{RC}^{1}_{\Omega_k,q,p}$ and $\mathrm{RC}^{q}_{\Omega_k,q,p}$ have the same limit as $k$ tends to infinity. By item ii), for any $q_b \in [1, q]$, $\mathrm{RC}^{1}_{\Omega_k,q,p} \succeq \mathrm{RC}^{q_b}_{\Omega_k,q,p} \succeq \mathrm{RC}^{q}_{\Omega_k,q,p}$, whence the claim follows.
iv) By [BDC12], when $q \geq 1$, the random-cluster model exhibits a phase transition at $p = p_c(q)$ (see also [DCRT18,DCRT19] for alternative proofs). It was shown in [DCST17] that, when $q \in [1, 4]$, the phase transition is of the second order. In particular, this means that the Gibbs measure is unique. In the same way as in item iii), this implies that the limit of $\mathrm{RC}^{q_b}_{\Omega_k,q,p}$ is independent of $q_b \in [1, q]$. The estimate (26) is a standard consequence of the RSW theory developed in [DCST17]. We provide only a sketch of the proof. It is enough to consider only $q_b = q$, since for $q_b = 1$ the proof is completely analogous and then the statement can be extended to any $q_b \in (1, q)$ by the monotonicity shown in item ii). The following claim allows us to bound $N_\Omega$ from above and below by Bernoulli random variables.
Without loss of generality, we can assume that u = 0.

Claim 1.
Let $E_{\mathrm{open}}$ and $E_{\mathrm{closed}}$ be the events that there exists a circuit of open (resp. closed) edges in $\Omega \setminus \Lambda_{\mathrm{rad}(\Omega)/2}$ that goes around $0$. Then there exists a constant $c' > 0$ not depending on $\Omega$ such that we have Proof. The inequalities for $E_{\mathrm{closed}}$ and $E_{\mathrm{open}}$ are completely analogous, so we will show only the first one. The lower bound follows readily from the box-crossing property established in Theorems 2 and 3 of [DCST17] for $q \in [1, 4]$ under any boundary conditions. The upper bound is also a rather straightforward consequence of the Russo-Seymour-Welsh theory, but it is less standard, so we prefer to give details below. Let $r := \mathrm{dist}(0, \Omega^c)$. Let $F_1$ be the event that there exists a circuit of open edges contained in $\Lambda_{r/2} \setminus \Lambda_{r/4}$ and going around $0$. Let $F_2$ be the event that there exists an open path linking two different points on the boundary of $\Omega$ and passing through $\Lambda_{r/4}$. Since $F_1 \cap F_2 \cap E_{\mathrm{closed}} = \emptyset$, it is enough to show that there exists $c' > 0$ such that Events $F_1$ and $F_2$ are increasing, thus it is enough to show the statement for each of them separately. By the definition of $r$, there exists a vertex $z \in \Lambda_{r+1}$ that belongs to the boundary of $\Omega$. Then $\mathrm{RC}^{1}_{\Omega,q,p_c}(F_2)$ is greater than or equal to the probability of having an open circuit going around $z$ and crossing $\Lambda_{r/4}$ under $\mathrm{RC}_{q,p_c}$ (the unique infinite-volume measure). The latter, as well as $\mathrm{RC}^{1}_{\Omega,q,p_c}(F_1)$, can be bounded below as explained in the beginning of the proof.
To see how the estimate (26) follows from the claim, we refer the reader to the proof of [GM21, Theorem 1.2 (v)]. The only difference is that in our case one has two types of clusters: primal and dual. However, since Claim 1 takes care of both of them, this does not have any impact on the proof.

FKG for heights, proof of Proposition 2.3
In this section we discuss the positive association properties of the height function measures. These are deduced from straightforward applications of the FKG inequality (as done in [BHM00] for the uniform case a = b = c). We also deduce Proposition 2.3 as a corollary. In addition, the measures HF t D,a,b,c are stochastically increasing in t. Moreover, the FKG lattice condition is satisfied also for the measure HF t;c b D,a,b,c with c ≥ max{a, b}, c b ≥ 0. If c b ≥ 1, then HF t;c b D,a,b,c is stochastically increasing in t.
Proof of Proposition 5.1. By [FKG71, Proposition 1], it is enough to check for any two height functions f, g on D with the given boundary conditions that the FKG lattice condition is satisfied: where f ∨ g and f ∧ g denote the point-wise maximum and minimum respectively. We start the proof by the following claim.
Claim. If f (u) > g(u) for some u ∈ F (D), then on all four faces adjacent to u we have that f ∨ g coincides with f and f ∧ g coincides with g.
Proof. The functions f and g must have the same parity at u. Thus, f (u) > g(u) implies that f (u) − g(u) ≥ 2. Take any face v adjacent to u. Since f, g are height functions, Note that the Claim implies that, on any two adjacent faces u, v in D, each of the functions f ∨ g and f ∧ g coincides either with f or with g (or with both of them). We know that |f (u) − f (v)| = 1 and |g(u) − g(v)| = 1. Thus, the same holds for f ∨ g and f ∧ g, and hence these two functions are also height functions.
It remains to show for any vertex $z$ of $D$ that its contribution to the LHS of (27) is at least its contribution to the RHS of (27). Denote by $(u_i)_{i=1,2,3,4}$ the four faces of $D$ containing $z$ (in this cyclic order). If either $f$ or $g$ is at least as large as the other on all of the $(u_i)$, then this statement is trivial. Otherwise, by the claim, there is a pair of diagonally adjacent faces $u_i, u_j$ such that $f(u_i) > g(u_i)$ and $f(u_j) < g(u_j)$. In this case, $z$ is necessarily a $c$-type vertex (see Figure 2) for both $f \vee g$ and $f \wedge g$, and the statement follows from the assumption that $c \geq \max\{a, b\}$.
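The single-vertex step of this argument can be verified exhaustively. The sketch below enumerates local height configurations on the four faces around one vertex and checks the inequality $w(f \vee g)\, w(f \wedge g) \geq w(f)\, w(g)$ for the vertex weight. It rests on assumptions that are not taken from the paper: a vertex is treated as $c$-type when both diagonal pairs of faces carry equal heights, and the labels $a$ and $b$ for the two remaining types are assigned by an arbitrary convention (Figure 2 fixes the actual one); all function names are hypothetical.

```python
from itertools import product


def local_height_functions(base_values=(-2, 0, 2)):
    """Heights (h1, h2, h3, h4) on the faces around a vertex, in cyclic order.

    Cyclically adjacent faces differ by exactly 1 and h1 is even, so that
    any two configurations have the same parity on every face.
    """
    configs = []
    for h1 in base_values:
        for s1, s2, s3 in product((-1, 1), repeat=3):
            h2, h3, h4 = h1 + s1, h1 + s1 + s2, h1 + s1 + s2 + s3
            if abs(h4 - h1) == 1:
                configs.append((h1, h2, h3, h4))
    return configs


def vertex_weight(h, a, b, c):
    """Assumed convention: both diagonals equal -> c; else a or b."""
    h1, h2, h3, h4 = h
    if h1 == h3 and h2 == h4:
        return c
    return a if h1 != h3 else b


def lattice_condition_holds(a, b, c):
    """Check w(f v g) * w(f ^ g) >= w(f) * w(g) over all local pairs."""
    configs = local_height_functions()
    for f, g in product(configs, repeat=2):
        upper = tuple(map(max, f, g))
        lower = tuple(map(min, f, g))
        lhs = vertex_weight(upper, a, b, c) * vertex_weight(lower, a, b, c)
        rhs = vertex_weight(f, a, b, c) * vertex_weight(g, a, b, c)
        if lhs < rhs - 1e-12:
            return False
    return True
```

For instance, `lattice_condition_holds(1, 1, 2)` holds, while `lattice_condition_holds(2, 1, 1)` fails on the pair $f = (2,1,0,1)$, $g = (0,1,2,1)$, whose maximum and minimum are both $c$-type.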
The same argument applies also to the analogue of (27) for $\mathrm{HF}^{t;c_b}_{D,a,b,c}$ when $z \in D \setminus \partial_V D$. If $z \in \partial_V D$, then the contribution to both sides of (27) is trivially the same.
Monotonicity in $t$ for $\mathrm{HF}^{t}_{D,a,b,c}$ and $\mathrm{HF}^{t;c_b}_{D,a,b,c}$ follows in the standard manner (by enlarging $D$).
Proof of Proposition 2.3. Recall that E 0,1 hf (D) is the set of all height functions on D with 0, 1 boundary conditions. Let A be any increasing event on E 0,1 hf (D). We need to show that the derivative of HF 0,1;c b D,c (A) in c b is non-negative. Define Z and Z(A) by Then HF 0,1;c b D,c (A) = Z(A)/Z, and its derivative in c b can be written as where E denotes the expectation with respect to HF 0,1;c b D,c . The random variable n b is equal to the number of vertices z ∈ ∂ V D, such that the unique face of D containing z has height 1. Since the height at these faces can be either 1 or −1, the variable n b is increasing. Thus, the RHS of the last equality is positive by the FKG inequality (Proposition 5.1).
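The derivative computation can be spelled out as follows. This is a sketch under the assumption that the partition function depends on $c_b$ only through a factor $c_b^{n_b(h)}$; the symbol $w(h)$ for the remaining weight of $h$ is hypothetical notation introduced here for illustration.

```latex
\begin{align*}
Z &= \sum_{h \in E^{0,1}_{\mathrm{hf}}(D)} w(h)\, c_b^{\,n_b(h)},
\qquad
Z(A) = \sum_{h \in A} w(h)\, c_b^{\,n_b(h)}, \\
\frac{\partial Z(A)}{\partial c_b}
  &= \frac{1}{c_b} \sum_{h \in A} n_b(h)\, w(h)\, c_b^{\,n_b(h)}
   = \frac{Z}{c_b}\, E\big[ n_b \mathbf{1}_A \big], \\
\frac{\partial}{\partial c_b} \frac{Z(A)}{Z}
  &= \frac{1}{Z}\frac{\partial Z(A)}{\partial c_b}
     - \frac{Z(A)}{Z^2}\frac{\partial Z}{\partial c_b}
   = \frac{1}{c_b} \Big( E\big[ n_b \mathbf{1}_A \big]
     - E\big[ n_b \big]\, E\big[ \mathbf{1}_A \big] \Big).
\end{align*}
```

Under this assumption the derivative is $1/c_b$ times the covariance of the two increasing functions $n_b$ and $\mathbf{1}_A$, which is the quantity controlled by the FKG inequality.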
This implies the second inequality of the claim and the rest follows, since

6 The behavior of the height function
Throughout this section we assume that a = b = 1 and c ≥ 2. The proofs can be adapted to the general case a + b ≤ c in a straightforward way (see Section 2.7.2).

Fluctuations
In this section we prove Theorem 1.
Proof. Let $D$ be a domain. Define $D_{\mathrm{even}}$ and $D_{\mathrm{odd}}$ as the domains obtained from $D$ by removing from $D$ all its boundary faces that are even (resp. odd). It is easy to see that $D_{\mathrm{even}}$ is an even domain and $D_{\mathrm{odd}}$ is an odd domain (we assume here that $D_{\mathrm{even}}$ and $D_{\mathrm{odd}}$ are connected but this has no effect on the proofs). Take the unique $\lambda \geq 0$ such that $e^{\lambda/2} + e^{-\lambda/2} = c$. The following comparison inequalities follow from Proposition 2.3 or can be obtained along the same lines: Since $\mathrm{HF}^{2,1}_{D,c}$ is the image of $\mathrm{HF}^{0,-1}_{D,c}$ under the bijection $h(\cdot) \mapsto h(\cdot) + 2$ between $E^{0,-1}_{\mathrm{hf}}$ and $E^{2,1}_{\mathrm{hf}}$, it is enough to prove Theorem 1 for the measures $\mathrm{HF}^{0,1;e^{\lambda/2}}_{D_{\mathrm{even}},c}$ and $\mathrm{HF}^{0,1;e^{\lambda/2}}_{D_{\mathrm{odd}},c}$. We prove the statement only for $D_{\mathrm{even}}$ (the case of $D_{\mathrm{odd}}$ is analogous) and, to simplify the notation, we assume that $D$ is an even domain, so that $D_{\mathrm{even}} = D$. Clearly, where the variance and the expectation are with respect to the height-function measure $\mathrm{HF}^{0,1;e^{\lambda/2}}_{D,c}$. Since $E_{\mathrm{HF}}(h(0,0)) \in [0, 1]$ by (28), it is enough to estimate $E_{\mathrm{HF}}(h^2(0,0))$. Take $q := (e^{\lambda} + e^{-\lambda})^2$. For an edge-configuration $\eta$ on $D^\bullet$, denote the number of primal and dual clusters of $\eta$ surrounding the origin $(0,0)$ by $N(\eta)$. By the coupling stated in Theorem 7, given $\eta$ sampled according to $\mathrm{RC}^{1}_{\Omega^{\bullet}_{D},q,p_c(q)}$, the height at the origin is distributed as a simple random walk on $\mathbb{Z}$ starting at $0$, making $N(\eta)$ steps and at each step going up or down by $1$ with probability $e^{-\lambda}/\sqrt{q}$ (resp. $e^{\lambda}/\sqrt{q}$). Then, If $c = 2$, then $\lambda = 0$ and $q = 4$, whence the second term cancels out and the first term is treated in Item (iv) of Proposition 4.2. If $c > 2$, then $q > 4$ and, by Item (v) of Proposition 4.2, the size of any dual cluster in $\eta$ has exponential tails. In particular, it means that $\mathrm{RC}^{1}_{D^\bullet,q,p_c(q)}(N(\eta) > t) < e^{-\alpha t}$ for a certain constant $\alpha > 0$ depending only on $q$, whence the statement follows.
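The second moment of the biased walk described in this proof can be computed in closed form. Each step has mean $-\tanh\lambda$ and variance $1 - \tanh^2\lambda$, so conditionally on $N(\eta) = n$ one gets $E[h^2] = n(1 - \tanh^2\lambda) + n^2 \tanh^2\lambda$; averaging over $\eta$ presumably yields the two terms referred to above, the second of which vanishes at $\lambda = 0$. The sketch below checks this identity against exhaustive enumeration; the function names are hypothetical.

```python
import math
from itertools import product


def step_probabilities(lam):
    """Probabilities (up, down) of a single step, with sqrt(q) = e^lam + e^-lam."""
    sqrt_q = math.exp(lam) + math.exp(-lam)
    return math.exp(-lam) / sqrt_q, math.exp(lam) / sqrt_q


def second_moment_closed_form(lam, n):
    """E[h^2] after n steps: variance n(1 - t^2) plus squared mean (n t)^2."""
    t = math.tanh(lam)
    return n * (1 - t * t) + (n * t) ** 2


def second_moment_brute_force(lam, n):
    """E[h^2] after n steps, summing over all 2^n step sequences."""
    up, down = step_probabilities(lam)
    total = 0.0
    for steps in product((1, -1), repeat=n):
        prob = math.prod(up if s == 1 else down for s in steps)
        total += prob * sum(steps) ** 2
    return total
```

At $\lambda = 0$ the closed form reduces to $n$, so the second moment of the height is the expected number of surrounding clusters; for $\lambda > 0$ the quadratic term appears, which is why exponential tails of $N(\eta)$ are needed.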

Gibbs states
In this section we prove Theorem 2.
The main step in the proof is Proposition 6.1, which is proven by considering percolation on faces of particular heights. This is somewhat reminiscent of the approach used in [She05]. However, we emphasize that, unlike in [She05], here we consider percolation on a suitable triangular lattice ($\mathbb{T}^\bullet$ and $\mathbb{T}^\circ$, defined below). The latter has the benefit of being self-dual and we hope that this approach will turn out to be useful in future research.
Proposition 6.1. Fix c > 2 and n ∈ Z. Let D k be an increasing sequence of domains exhausting Z 2 . Then, the sequence of measures HF n,n+1 D k ,c converges to a Gibbs state HF n,n+1 c , which is extremal, invariant under parity-preserving translations, and satisfies the exponential tail decay (3).
The first step is to prove a similar statement under modified boundary conditions. Lemma 6.2. Let c > 2. Take λ > 0 such that e λ/2 + e −λ/2 = c. Let D k be an increasing sequence of even domains exhausting Z 2 . Then, the sequence HF 0,1;e λ/2 D k ,c converges to a Gibbs state HF 0,1;e λ/2 even,c , which is extremal and invariant under parity-preserving translations. Moreover, HF 0,1;e λ/2 even,c -a.s. faces of height 0 contain a unique infinite cluster (in the diagonal connectivity), and diameters of connected components of non-zero heights have exponential tails.
Similarly, for odd domains, the limit HF 0,1;e λ/2 odd,c exhibits a unique infinite cluster of height 1, and Proof. We prove the statement only for even domains, since the case of odd domains is completely analogous. As was already mentioned above, the centres of even faces of Z 2 form another square lattice that we denote by (Z 2 ) • .
Take $q := (c^2 - 2)^2$. Let $\mathrm{RC}^{1}_{q}$ be the wired infinite-volume random-cluster measure on $(\mathbb{Z}^2)^\bullet$ with parameters $q$ and $p_c(q) := \frac{\sqrt{q}}{\sqrt{q}+1}$. By Item (v) of Proposition 4.2, $\mathrm{RC}^{1}_{q}$-a.s. there is a unique infinite cluster and dual clusters are exponentially small, that is, there exists $\alpha > 0$ such that, for any $k \in \mathbb{N}$ and any $u^* \in (\mathbb{Z}^2)^\circ$, $\mathrm{RC}^{1}_{q}(\text{the dual cluster of } u^* \text{ has size} > k) < e^{-\alpha k}$.
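The parameters used here are tied together by $c = e^{\lambda/2} + e^{-\lambda/2} = 2\cosh(\lambda/2)$, which gives $c^2 - 2 = e^{\lambda} + e^{-\lambda}$ and hence $q = (c^2-2)^2 = (e^{\lambda} + e^{-\lambda})^2$. A minimal numerical sketch of this dictionary follows, assuming $a = b = 1$ and $c \geq 2$; the function name is hypothetical.

```python
import math


def parameters_from_c(c):
    """Return (lam, q, p_c) for the six-vertex weight c >= 2 with a = b = 1."""
    if c < 2:
        raise ValueError("this sketch assumes c >= 2")
    lam = 2 * math.acosh(c / 2)            # solves 2 cosh(lam / 2) = c
    q = (c * c - 2) ** 2                   # equals (e^lam + e^-lam)^2
    p_c = math.sqrt(q) / (1 + math.sqrt(q))
    return lam, q, p_c
```

At $c = 2$ this returns $\lambda = 0$, $q = 4$ and $p_c = 2/3$, and any $c > 2$ gives $q > 4$, matching the two regimes distinguished in the proofs.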
Define a random height function $h$ in the following way: sample $\eta \in \{0, 1\}^{E((\mathbb{Z}^2)^\bullet)}$ according to $\mathrm{RC}^{1}_{q}$, set $h$ to be $0$ on the unique infinite cluster of $\eta$, then sample $h$ in the holes of this cluster according to the coupling (18) in Theorem 7 for $c_b = e^{\lambda/2}$ and $q_b = 1$. Denote by $\mathrm{HF}^{0,1;e^{\lambda/2}}_{\mathrm{even},c}$ the distribution of $h$. Note that the measure $\mathrm{HF}^{0,1;e^{\lambda/2}}_{\mathrm{even},c}$ is well-defined since the values of $h$ in different holes of $\eta$ are independent (conditioned on $\eta$) and the size of each hole has exponential tails. Properties of $\mathrm{HF}^{0,1;e^{\lambda/2}}_{\mathrm{even},c}$ (extremality, invariance under parity-preserving translations and existence of an infinite cluster of height $0$) follow from the corresponding properties of $\mathrm{RC}^{1}_{q}$ (extremality, invariance under all translations and existence of an infinite cluster). It remains to show that $\mathrm{HF}^{0,1;e^{\lambda/2}}_{D_k,c}$ tends to $\mathrm{HF}^{0,1;e^{\lambda/2}}_{\mathrm{even},c}$. Fix any $\varepsilon > 0$ and take $n$ large enough so that $e^{-\alpha n} < \varepsilon$. Recall the definition of $D^\bullet$ for a domain $D$ given in Section 3. Since $\mathrm{RC}^{1}_{D^\bullet_k,q}$ tends to $\mathrm{RC}^{1}_{q}$, there exists $K > 0$ such that $D_K \supset \Lambda_{4n}$ and, for all $k > K$, the total variation distance between the restrictions of $\mathrm{RC}^{1}_{q}$ and $\mathrm{RC}^{1}_{D^\bullet_k,q}$ to $\Lambda^\bullet_{2n}$ is less than $\varepsilon$. Fix any $k > K$. Define $C_{n,k}$ to be the exterior-most circuit of open edges in $\eta$ contained in $\Lambda^\bullet_{2n}$ that goes around $\Lambda^\bullet_n$ (take $C_{n,k} := \emptyset$ if no such circuit exists).
Using the estimate on the total variation distance we get that where the sum is taken over all circuits C ⊂ Λ • 2n \ Λ • n . Also note that by (29) we have If η satisfies conditions in (31) and (32), then given C n,k the heights on Λ n sampled according to HF 0,1;e λ/2 even,c and HF 0,1;e λ/2 D k ,c have the same law. Putting this together with the estimates (30), (31) and (32) we get that the total variation distance between restrictions of HF 0,1;e λ/2 even,c and HF 0,1;e λ/2 D k ,c to Λ n is less than 5ε, whence convergence follows.
As we will show below, measures HF 0,1;e λ/2 even,c and HF 0,1;e λ/2 odd,c are in fact equal. The next step in the proof of Proposition 6.1 is to establish certain percolation statements for the faces of height 1 under HF 0,1;e λ/2 even,c and for the faces of height 0 under HF 0,1;e λ/2 odd,c .
Definition. Denote by T • (resp. T • ) the graph on the even (resp. odd) faces of Z 2 , where a face (i, j) is linked by an edge to the faces (i ± 1, j ± 1) and (i ± 2, j). It is easy to see that both T • and T • are isomorphic to the standard triangular lattice. As usual, a circuit in and v 1 , . . . , v n−1 distinct. The circuit is said to surround a vertex v / ∈ {v 1 , . . . , v n−1 } if the collection of edges (v i , v i+1 ), viewed as straight line segments in the ambient R 2 , disconnect v from infinity.
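The isomorphism with the triangular lattice can be made concrete. The sketch below lists the six neighbours $(i \pm 1, j \pm 1)$, $(i \pm 2, j)$ of a face and maps faces to lattice coordinates in which these neighbours become the six standard triangular-lattice offsets. The coordinate change $(i, j) \mapsto (\lfloor (i-j)/2 \rfloor, j)$ is one convenient choice made for illustration and is not taken from the paper; the function names are hypothetical.

```python
def t_neighbors(face):
    """Six neighbours of a face (i, j) in the graph on faces of its parity."""
    i, j = face
    return [(i + 1, j + 1), (i + 1, j - 1), (i - 1, j + 1), (i - 1, j - 1),
            (i + 2, j), (i - 2, j)]


def to_triangular_coordinates(face):
    """Map (i, j) to (x, y) = (floor((i - j) / 2), j).

    In these coordinates the neighbours of a face sit at the offsets
    (+-1, 0), (0, +-1), (1, -1), (-1, 1) of the triangular lattice.
    """
    i, j = face
    return ((i - j) // 2, j)
```

All six neighbours have the same parity as the starting face, so the even faces and the odd faces each carry their own copy of the triangular lattice.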
For K ∈ N and (i, j) ∈ Z 2 , denote by Λ K (i, j) the ball of radius K around (i, j): Let C 0 K be the exterior-most T • -circuit of height 0 in Λ K (1, 0) that surrounds the face (1, 0) (take C 0 K := ∅ if there is no such circuit). Similarly, let C 1 K be the exterior-most T • -circuit of height 1 in Λ K (0, 0) that surrounds the face (0, 0); see Figure 9. coincides with the distribution of C 1 K under HF 0,1;e λ/2 even,c shifted by 1 to the right. In addition, for any N ∈ N, Proof. For k ∈ N, let f k be distributed according to HF 0,1;e λ/2 Λ 2k (0,0),c . Define g k by: It is straightforward that g k is supported on height functions and the image of 0, 1 boundary conditions under this mapping are again 0, 1 boundary conditions, though on a slightly different domain. More precisely, the domain is 1 + Λ 2k (0, 0), which is the same as Λ 2k (1, 0). In conclusion, height function g k is distributed according to HF 0,1;e λ/2 Λ 2k (1,0),c . The domains Λ 2k (1, 0) form a sequence of odd domains, whence by Lemma 6.2 the weak limit of HF 0,1;e λ/2 Λ 2k (1,0),c is HF 0,1;e λ/2 odd,c . Also, by Lemma 6.2 the weak limit of HF 0,1;e λ/2 Λ 2k (0,0),c is HF 0,1;e λ/2 even,c . Thus, measure HF 0,1;e λ/2 odd,c is obtained from HF 0,1;e λ/2 even,c by the operation described in (34). Finally, it is easy to see that under the operation (34), circuit C 1 K is mapped into circuit C 0 K . It remains to show (33), which is equivalent to showing that HF 0,1;e λ/2 even,c -a.s. there are infinitely many disjoint T • -circuits of height 1 surrounding the origin. Lemma 6.2 implies that, for all n, measure HF n,n+1;e λ/2 even,c is extremal. Thus, the HF 0,1;e λ/2 even,c -probability to have infinitely many disjoint T • -circuits of height 1 surrounding the origin is either zero or one. Assume that this probability is zero. Then, by the self-duality of T • and the extremality of HF 0,1;e λ/2 even,c , at least one of the following occurs: where F • ≤n (resp. 
$F^{\circ}_{\geq n}$) denotes the set of odd faces of height at most (resp. at least) $n$. The bijection mapping $h \mapsto 2 - h$ implies that (36) is equivalent to This means that, $\mathrm{HF}^{0,1;e^{\lambda/2}}_{\mathrm{even},c}$-a.s., each of $F^{\circ}_{\leq -1}$ and $F^{\circ}_{\geq 1}$ contains an infinite $\mathbb{T}^\circ$-cluster. These clusters are unique. Indeed, the measure $\mathrm{HF}^{0,1;e^{\lambda/2}}_{\mathrm{even},c}$ is invariant under parity-preserving translations by Lemma 6.2. Then, the general argument of Burton and Keane [BK89] can be applied in the same way as in [CPST21, Theorem 4.9] to yield uniqueness of the infinite clusters.
Since $\mathrm{HF}^{0,1;e^{\lambda/2}}_{\mathrm{even},c}$ is positively associated and $F^{\circ}_{\geq 1}$ is increasing, the marginal of $\mathrm{HF}^{0,1;e^{\lambda/2}}_{\mathrm{even},c}$ on $F^{\circ}_{\geq 1}$ is also positively associated. Also, the complement of $F^{\circ}_{\geq 1}$ in $\mathbb{T}^\circ$ is the set $F^{\circ}_{\leq -1}$. In conclusion, the distribution of $F^{\circ}_{\geq 1}$ is invariant under all translations of $\mathbb{T}^\circ$, is positively associated, and in both $F^{\circ}_{\geq 1}$ and its complement there is almost surely a unique infinite cluster. This contradicts [DCRT19, Theorem 1.5]. We note that the latter theorem was established for pairs of dual edge-configurations, but it adapts in a straightforward manner to the setting of pairs of site-configurations on the triangular lattice used here.
We are now ready to finish the proof of Proposition 6.1. Proof of Proposition 6.1. Without loss of generality, one can assume that n = 0. The main step of the proof is to show that HF 0,1;e λ/2 even,c = HF 0,1;e λ/2 odd,c . Take any N ∈ N and ε > 0. By Lemma 6.3, there exists K ∈ N such that HF 0,1;e λ/2 even,c (C 1 K surrounds Λ N (0, 0)) > 1 − ε.
Fix such K. Consider any T • -circuit C 1 which surrounds Λ N (0, 0) and for which HF 0,1;e λ/2 even,c (C 1 K = C 1 ) > 0. Define C 0 as the T • -circuit obtained from C 1 after a shift by 1 to the right. Let D K be the set of faces in the connected component of Z 2 \ (C 1 ∪ C 0 ) containing Λ N (0, 0). Conditioned on the event that C 1 K = C 1 , the heights on the circuit C 0 are at least 0, whence HF 0,1;e λ/2 even,c (· | C 1 Similarly, conditioned on the event that C 0 K = C 0 , the heights at the faces of the circuit C 1 are at most 1, whence We now couple h 1 ∼ HF 0,1;e λ/2 even,c and h 0 ∼ HF 0,1;e λ/2 odd,c in the following way; see Figure 9. First, we explore C 1 K (h 1 ) from outside, for all (x, y) ∈ Z 2 in the exterior of C 1 K (h 1 ), we set h 0 ((x, y)) = 1 − h 1 ((x − 1, y)).
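The map $h_0((x, y)) = 1 - h_1((x-1, y))$ used in this coupling sends height functions to height functions. The sketch below illustrates this on finite patches, under the assumed parity convention that a height function takes even values on even faces ($x + y$ even) and odd values on odd faces; the function names are hypothetical.

```python
def is_height_function(h):
    """Check a dict (x, y) -> int: parity matches x + y, neighbours differ by 1."""
    for (x, y), value in h.items():
        if (value - (x + y)) % 2 != 0:
            return False
        for neighbour in ((x + 1, y), (x, y + 1)):
            if neighbour in h and abs(h[neighbour] - value) != 1:
                return False
    return True


def shift_and_flip(h1):
    """Return h0 with h0((x, y)) = 1 - h1((x - 1, y))."""
    return {(x + 1, y): 1 - value for (x, y), value in h1.items()}
```

The flat configuration with height $0$ on even faces and $1$ on odd faces is mapped to itself, consistent with the invariance of the limiting measure under $h(i, j) \mapsto 1 - h(i-1, j)$ stated in the proof of Theorem 2.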
This, in particular, implies that C 0 K (h 0 ) is obtained from C 1 K (h 1 ) by shift by one to the right. Also, conditioned on C 1 K (h 1 ) = C 1 , the distribution of h 1 and h 0 on D K is given by HF 0,1;e λ/2 even,c (· | C 1 K = C 1 ) and HF 0,1;e λ/2 odd,c (· | C 0 K = C 0 ), respectively. By (38) and (39), these two measures can be coupled in such a way that, for all (x, y) ∈ D K , h 0 ((x, y)) ≤ h 1 ((x, y)).
The opposite inequality follows from monotonicity in boundary conditions. Indeed, let $D_k$ be any increasing sequence of even domains exhausting $\mathbb{Z}^2$. Define the sequence $D'_k$ of odd domains obtained from $D_k$ after a shift by one to the right. By Proposition 2.3, Also, by the monotonicity in boundary conditions (Proposition 5.1), Taking limits on both sides, we get the desired inequality. In conclusion, $\mathrm{HF}^{0,1;e^{\lambda/2}}_{\mathrm{even},c} = \mathrm{HF}^{0,1;e^{\lambda/2}}_{\mathrm{odd},c}$.
We denote this measure by HF 0,1 c . Then, by (28), for any sequence of domains D ′′ k , the limit of HF 0,1;e λ/2 D ′′ k ,c exists and is given by HF 0,1 c . Properties of HF 0,1 c follow immediately from Lemma 6.2.
The following corollary is a straightforward consequence of the convergence proven in Lemma 6.2, the equality HF 0,1;e λ/2 even,c = HF 0,1;e λ/2 odd,c proven in Proposition 6.1 and the monotonicity in c b established in Proposition 2.3. We are now ready to finish the proof of Theorem 2.
Proof of Theorem 2. Fix c > 2. Existence, invariance under parity-preserving translations and extremality of Gibbs measures {HF n,n+1 c } n∈Z is established in Proposition 6.1. Also, this proposition gives existence of a unique infinite cluster on heights n and n+1 with exponentially small holes whose diameters satisfy (3). Invariance under transformation h(i, j) → 1 − h(i − 1, j) and stochastic ordering in n follow readily from the construction of {HF n,n+1 c } as the infinite-volume limit under (n, n + 1)-boundary conditions. Now take any extremal Gibbs state HF for the height-function measure with parameter c that is invariant under parity-preserving translations. It remains to show that HF = HF n,n+1 c , for some n ∈ Z. For any n ∈ Z, define the following events: F 2n := {∃ infinitely many disjoint T • -circuits of height 2n surrounding (0, 0)}, F 2n+1 := {∃ infinitely many disjoint T • -circuits of height 2n + 1 surrounding (0, 0)}.
Assume there exist m, n ∈ Z, such that m < n and HF(F m ) = HF(F n ) = 1. All faces that are adjacent to a face of height m have height at most m + 1. Thus HF-a.s., there exist infinitely many circuits (in the usual Z 2 connectivity, even and odd faces are alternating) of height at most m + 1. By positive association (Proposition 5.1), this implies that HF ⪯ HF m,m+1 c . Similarly, HF ⪰ HF n−1,n c . Then necessarily n = m + 1 and HF = HF m,m+1 c , which would finish the argument. Thus, we can assume that either HF(F m ) = 0 for all even values of m or HF(F m ) = 0 for all odd values of m. Without loss of generality, below we assume that for any n ∈ Z, Define the following events: By (40), for any n ∈ Z, HF(A 2n+2 ∪ B 2n−2 ) = 1.
Without loss of generality, assume that, for $n = 0$, the first alternative in (41) occurs, that is $\mathrm{HF}(A_2) = 1$ (the case $\mathrm{HF}(B_{-2}) = 1$ is completely analogous). Then, by [DCRT19, Theorem 1.5], applied here in the same way as in the proof of Lemma 6.3, Applying (41) for $n = 1$, we obtain that $\mathrm{HF}(A_4) = 1$. Continuing in the same way, we obtain $\mathrm{HF}(B_{2n}) = 0$ for all $n \in \mathbb{N}$, and hence, by monotonicity, for all $n \in \mathbb{Z}$. This implies that, for all $n \in \mathbb{Z}$, $\mathrm{HF} \succeq \mathrm{HF}^{2n-1,2n}_{c}$.
This leads to a contradiction, since then, for any L ∈ Z, This finishes the proof in the case c > 2. Now assume there exists an extremal translation-invariant Gibbs state HF for the height-function measure with parameter c = 2. What we showed above implies that, for some n ∈ Z, HF(F 2n ) = 1.
Then, by positive association, for every K, N , where the infimum is taken over all domains D containing Λ K . It follows from the proof of Theorem 1 that the limit in K of the right-hand side of the last inequality is bounded below by a positive constant that is independent of N . Contradiction.

Consequences for the FK model with modified boundary-cluster weight
In this section we prove Theorem 6.
Proof. Item (ii) of Theorem 6 follows from Item (iv) of Proposition 4.2 which is in turn implied by known results for the random-cluster model. It remains to show Item (i).
Let q > 4. By Theorem 7, the measure RC e −λ q D • k ,q,pc(q) can be coupled with HF 0,1 D k ,c , and RC 1 and RC 1 D • k ,q,pc(q) also have the same limit. By monotonicity in q b established in Proposition 4.2, we get that, for all q b ∈ [1, e −λ √ q], the limit of RC q b D • k ,q,pc(q) also exists and is equal to RC 1 q,pc(q) .

The behavior of the spin representation and the six-vertex model
Throughout this section we assume that $a = b = 1$ (see Section 2.7.2 for $a \neq b$).
The main tools in the proof are the FK-Ising-type representation ξ introduced in Section 7.1 and the height representation of the six-vertex model.
Let D be a domain on Z 2 . Recall a pair of dual graphs D • and D • defined on even and odd faces of D (Section 3). Recall that, for a spin configuration σ on Z 2 , the restrictions of σ to even and odd faces are denoted by σ • and σ • . Define

FK-Ising-type representation: Item 1 of Theorem 3, Corollary 2.1
The FK-Ising representation of the six-vertex model that we discuss in this section is directly related (see the remark after Proposition 8.1) to the random-cluster representation of the Ashkin-Teller model introduced by Pfister and Velenik [PV97] and is used in Section 8 to describe the coupling between the two models. For the Ashkin-Teller model, this representation made it possible to derive the Lebowitz inequality [CS00]. Here we choose to define this representation in terms of the six-vertex model in order to avoid confusion between different models and restrict the appearance of the Ashkin-Teller model to Section 8. We refer the reader to the works of Ray and Spinka [RS22] and Lis [Lis22], where the representations on the primal and the dual lattices are considered simultaneously. Given $\sigma^\bullet$ and $\sigma^\circ$, define a random edge-configuration $\xi$ as follows: if $\sigma(u) = \sigma(v)$ and $\sigma(u^*) = \sigma(v^*)$ for a pair of dual edges $e = uv$ and $e^* = u^*v^*$, then $\xi(e) = 1$ with probability $\frac{c-1}{c}$ and $\xi(e) = 0$ with probability $\frac{1}{c}$; see Figure 10. We call $\xi$ the FK-Ising representation of the six-vertex model. The measure $\mathrm{FKIs}_{D^\bullet,c}$ is defined as the distribution of $\xi$ when $\sigma$ is distributed according to $\mathrm{Spin}^{++}_{D,c}$. It is easy to see that $\sigma$ has constant value on clusters of $\xi$ and $\xi^*$ (in the terminology of Section 3, $\sigma$ and $\xi$ are compatible). In the next lemma, we state further properties of this coupling.
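The edge-by-edge sampling rule can be sketched in code. The random case (both pairs of spins agree) follows the probabilities stated above. The two deterministic cases are an inference from compatibility and are labeled as assumptions: if the spins at the endpoints of $e$ differ, $e$ cannot lie in a cluster of $\xi$, so $\xi(e) = 0$; if the spins at the endpoints of $e^*$ differ, $e^*$ cannot lie in a cluster of $\xi^*$, so $\xi(e) = 1$. The function name and signature are hypothetical.

```python
import random


def sample_xi_edge(s_u, s_v, s_u_star, s_v_star, c, rng=random):
    """Sample xi(e) for e = uv with dual edge e* = u*v*, given the four spins.

    Assumption: the deterministic branches are forced by compatibility of
    sigma with xi and xi*; the ice rule is assumed to exclude the case in
    which both pairs of spins disagree.
    """
    if s_u != s_v and s_u_star != s_v_star:
        raise ValueError("spins violate the assumed ice-rule constraint")
    if s_u != s_v:
        return 0                    # e separates two spin values: closed
    if s_u_star != s_v_star:
        return 1                    # e* is closed in xi*, so e is open
    return 1 if rng.random() < (c - 1) / c else 0
```

Passing a seeded `random.Random` instance as `rng` makes the sampling reproducible.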
Lemma 7.1. i) The joint law of $\sigma$ and $\xi$ can be written as:
ii) The measure $\mathrm{FKIs}_{D^\bullet,c}$ can be written in the following way:
iii) Let $\xi$ be distributed according to $\mathrm{FKIs}_{D^\bullet,c}$. Assign plus to all boundary clusters of $\xi$ and plus or minus with probability $1/2$ independently to all other clusters of $\xi$. Then, the obtained spin configuration has the same distribution as the marginal distribution of $\mathrm{Spin}^{++}_{D,c}$ on $\sigma^\bullet$.
Proof. i) By the definition of $\xi$, one has The indicators in the formula are equivalent to saying that $\sigma$ and $\xi$ are compatible.
In order to discuss the properties of the FK-Ising representation in the infinite volume for $c > 2$, we first state a straightforward consequence of Theorem 2 for the thermodynamic limits of spin measures. in such a way that the joint law is supported on pairs of compatible configurations $(\sigma, \xi)$ and satisfies the following properties:
• given $\sigma$, if $\sigma(u) = \sigma(v)$ and $\sigma(u^*) = \sigma(v^*)$ for a pair of dual edges $uv$ of $(\mathbb{Z}^2)^\bullet$ and $u^*v^*$ of $(\mathbb{Z}^2)^\circ$, then $\xi(uv) = 1$ with probability $(c-1)/c$;
• given $\xi$, if $C$ is a finite cluster of $\xi$, the value of $\sigma$ on $C$ is constant plus or minus with probability $1/2$, and the value of $\sigma$ is fixed to be plus on infinite clusters of $\xi$.
Moreover, the measure FKIs c is extremal, translation-invariant and there exists a unique infinite cluster in ξ almost surely. Lastly, the following relation holds for any two odd faces u, v of Z 2 : Proof. Existence of the limiting measure FKIs c , existence of the coupling and the itemized statements follow immediately from the finite-volume coupling of Lemma 7.1 and the limiting properties of Lemma 7.2. The coupling satisfies that, conditioned on σ, the edges of ξ are independent. Thus, extremality and translation invariance of FKIs c follow from the corresponding properties of Spin ++ c . We proceed to show the uniqueness of the infinite cluster. It is easy to see that FKIs c satisfies the finite energy property. Thus, the argument of Burton and Keane [BK89] can be applied to show that FKIs c -a.s. there is at most one infinite cluster in ξ. Assume in order to get a contradiction that there is no infinite cluster. Then the spin measure Spin Proof of Item 1 of Theorem 3. Convergence to the infinite-volume limit, as well as existence and uniqueness of the infinite clusters, follow from Lemma 7.2. The property in the second bullet follows from (44) and the fact that the spins at faces of the same parity are non-negatively correlated by Theorem 4 (the theorem is stated in finite volume but extends to the infinite volume limiting measure).
We point out that an alternative proof of the non-negative correlation of the spins at faces of the same parity could be obtained using positive association of FKIs D • ,c (Proposition 7.4 below), together with (44) and (45). However, such a proof would not apply in the entire regime a + b < c since we establish Proposition 7.4 only in the restricted range c ≥ max{2a, 2b}. This is in contrast to the FKG property of the spins (Theorem 4) which applies whenever c ≥ max{a, b}.
We proceed to prove Corollary 2.1. We will view each configuration $\vec{\omega}$ of the six-vertex model as an element in $\{1, -1\}^{E(\mathbb{Z}^2)}$, where $\vec{\omega}(e) = 1$ iff, when following $e$ in the direction that it is assigned in $\vec{\omega}$, the even face bordering $e$ is on the left. In particular, when $c = 2$, by Theorem 3, one has $\mathrm{Spin}^{++}_{c} = \mathrm{Spin}^{+-}_{c}$, and hence $\mathrm{SixV}^{\circlearrowleft}_{c} = \mathrm{SixV}^{\circlearrowright}_{c}$ is the unique flat Gibbs state. This finishes the proof for $c = 2$. Below we assume that $c > 2$. We will now prove that $\mathrm{SixV}^{\circlearrowleft}_{c}(A(e)) > 1/2$ for every edge $e \in E(\mathbb{Z}^2)$. This is equivalent to showing that, for any pair of adjacent faces $u$ (even) and $u^*$ (odd), Consider an even face $v$ adjacent to $u^*$ and the odd face $v^*$ adjacent to $u$ and $v$. Take $\lambda > 0$ such that $e^{\lambda/2} + e^{-\lambda/2} = c$. It follows from Lemma 6.2 that the weak limit of $\mathrm{Spin}^{++;e^{\lambda/2}}_{D,c}$ over even domains exists and is equal to $\mathrm{Spin}^{++}_{c}$. Take $q := (e^{\lambda} + e^{-\lambda})^2$, $p_c(q) := \frac{\sqrt{q}}{\sqrt{q}+1}$. By Corollary 3.1, when $D$ is an even domain, the measures $\mathrm{Spin}^{++;e^{\lambda/2}}_{D,c}$ and $\mathrm{RC}^{1}_{D^\bullet,q,p_c(q)}$ can be coupled in such a way that the joint law is given by (24) and thus: Taking a thermodynamic limit, one obtains Similarly, when $D$ is an odd domain, the measures $\mathrm{Spin}^{++;e^{\lambda/2}}_{D,c}$ and $\mathrm{RC}^{1}_{D^\circ,q,p_c(q)}$ are coupled and where $\mathrm{RC}^{*,1}_{q,p_c(q)}$ stands for the wired random-cluster measure on odd faces. By duality, Subtracting (48) from (47), we obtain It was proven in [DCGH+21] that $\mathrm{RC}^{1}_{q,p_c(q)} \neq \mathrm{RC}^{0}_{q,p_c(q)}$ when $q > 4$ (see Proposition 4.2). Positive association then implies that the LHS of (49) is strictly positive. Also, by translation invariance of $\mathrm{Spin}^{++}_{c}$, we have $\mathrm{Spin}^{++}_{c}(\sigma(u)\sigma(u^*)) = \mathrm{Spin}^{++}_{c}(\sigma(v)\sigma(v^*))$. Substituting this in (49) proves (46), since $\lambda > 0$.
To finish the proof for c > 2, it remains to prove the inequality (5). We use that SixV ⟲ c (A(e)) = SixV ⟲ c (A(f )) by invariance, approximate the inequality by its finite volume analogue on even domains and rewrite it in terms of spins: When Spin ++;e λ/2 D,c is coupled to RC 1 D • ,q,pc(q) , the spins are assigned to different clusters independently, whence the inequality follows readily.
The next two propositions describe properties of the FK-Ising representation of the six-vertex model required for the proof of Theorem 5 for the Ashkin-Teller model (Section 8). We want to emphasize that, unlike other proofs in this section, the proofs of the propositions below do not extend to the full parameter regime c ≥ a + b but rather only to its subset c ≥ max{2a, 2b} (we do not know whether positive association holds whenever c ≥ a + b, however, the FKG lattice condition may fail when either 2a > c or 2b > c). and ξ e f . Substitute (43) in this inequality. The term (c − 1) |ξ| 2 k(ξ 1 ) is just the usual FK-Ising measure (random-cluster with q = 2) and it is standard (and follows from Proposition 4.1) that it satisfies the FKG lattice condition. Thus, it is enough to show that Let P denote the Ising measure with parameter β = 1 2 log(c − 1) and plus boundary conditions on the graph obtained from D • after identifying all vertices belonging to the same connected component of (ξ e,f ) * . Also, denote the endpoints of e * (resp. f * ) by u e and v e (resp. u f and v f ). Then, the ratios can be written in terms of P as follows: The last inequality follows from the second Griffiths' inequality in the Ising model [Gri67, Theorem 2] (see also [PS19, Theorem 2.3]) and thus holds whenever β ≥ 0, that is c ≥ 2.
Proposition 7.5. Let $c > 2$. Then $\mathrm{FKIs}_{c}$-a.s. there exists an infinite cluster, while dual clusters are exponentially small: there exist $M, \alpha > 0$ such that, for any even faces $u, v$, Proof. Since $\mathrm{FKIs}_{c}$ is obtained as a limit of finite-volume measures, Proposition 7.4 implies that it satisfies the positive association inequality for any increasing events of finite support. By Corollary 7.3, the measure $\mathrm{FKIs}_{c}$ is extremal. Hence, approximating any increasing event by increasing events of finite support and using the martingale convergence theorem, we get that $\mathrm{FKIs}_{c}$ is positively associated.
Corollary 7.3 showed that ξ exhibits a unique infinite cluster almost surely. Denote it by C ∞ . Applying the argument of Burton and Keane [BK89] to ξ * , we obtain that either FKIs c -a.s. there is no infinite cluster in ξ * or FKIs c -a.s. there exists a unique infinite cluster in ξ * . The latter option, together with the translational invariance, positive association and existence of C ∞ , contradicts [DCRT19, Theorem 1.5]. Hence, FKIs c exhibits no infinite dual cluster.
Recall that an ℓ 1 box of size n centered at the origin is denoted by Λ n . We now show for some M ′ , α ′ > 0. Indeed, assume the event in (51), so that all clusters of ξ that intersect Λ n are finite. Then, by Corollary 7.3, the distribution of σ • on Λ n conditioned on this realisation of ξ is invariant under a global sign flip. By duality then, two opposite sides of Λ n are linked by a T • -crossing (recall definitions above Lemma 6.3) of minus spins with probability at least 1/2. However, averaging over all ξ, we get Spin ++ c , where this event has exponentially small probability -by the pushforward of (3) in Theorem 2 to spins. This gives (51).
It remains to show that (51) implies exponential decay of connectivities in ξ * . For any u ∈ ∂Λ n , let A u be the event that u is connected to the origin by a path in ξ * ∩ Λ n . Define A ′ u := A u + u and A ′′ u := A u + 2u. Combining the crossings and using the positive association (Proposition 7.4) and translational invariance of FKIs c , we obtain Note that the crossing described above does not intersect Λ n + (3/2) · n(1 + i). Since FKIs c is invariant under rotation by π/2, we obtain bounds on the existence of crossings 3u ↔ 3u + 3iu, 3u + 3iu ↔ 3iu, and 3iu ↔ 0, none of which intersects Λ n + (3/2) · n(1 + i). Combining these crossings and using the FKG inequality once again, we get The LHS of the above inequality is exponentially small by (51), hence so is FKIs c (A u ), and the proof is finished.
7.2 FKG for spins: proof of Theorem 4, Proposition 2.4
Theorem 4 will follow from the next proposition.
Proposition 7.6 (FKG lattice condition). Let D be a domain, c ≥ 1 and τ ∈ {1, −1} F (Z 2 ) be such that τ is a plus at all odd faces. Then, for every σ e , σ ′ Recall the pair of dual graphs D • and D • introduced in Section 3. The proof goes through the FK-Ising and the dual FK-Ising representations on these graphs -we use that each of the terms in (53) can be interpreted as the partition function of an FK model with free boundary conditions on the set of all pluses of σ • times the same on the set of all minuses of σ • , and we derive the claim from the FKG inequality for the FK model applied separately to these partition functions.
Proof of Proposition 7.6. By [Gri06, Theorem (2.22)], it is enough to show (52) for any two configurations which differ in exactly two places i.e., that for any σ • ∈ {−1, 1} F • (Z 2 ) , that coincides with τ outside of D, and for any u, v ∈ F • (D), where σ • εε ′ is the configuration coinciding with σ • except (possibly) at u and v, and such that σ • εε ′ (u) = ε and σ • εε ′ (v) = ε ′ . By definition, the marginal of Spin τ D,c on even spins can be written as: where the 1st equality holds since edges not belonging to ω(σ • ) ∪ θ(σ • ) are exactly those contributing c to the probability of a configuration; to obtain the 2nd equality, we develop c = (c − 1) + 1 (and the resulting ξ is in fact the FK-Ising representation defined in Section 7.1); the 3rd equality is obtained by exchanging the order of summation; the 4th equality uses the fact that every non-boundary cluster of ξ receives in σ • a constant spin plus or minus independently. Configuration ξ 1 is obtained from ξ by wiring (i.e., merging) all vertices corresponding to corners of D.
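The step "develop c = (c − 1) + 1" is a binomial-type expansion over subsets of edges: a product of m factors c equals the sum of (c − 1)^{|ξ|} over all subsets ξ of the m edges. The following minimal sketch checks this identity by brute force; the function name and the sample values are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations

def expand_edge_weights(num_edges: int, c: float) -> float:
    """Sum of (c - 1)^{|xi|} over all subsets xi of a set of `num_edges` edges."""
    total = 0.0
    for k in range(num_edges + 1):
        # Each subset of size k contributes (c - 1)^k.
        for _subset in combinations(range(num_edges), k):
            total += (c - 1) ** k
    return total

# Illustrative values (assumptions): 5 edges each carrying weight c = 3.
c_value = 3.0
num_edges = 5
product_of_weights = c_value ** num_edges
subset_sum = expand_edge_weights(num_edges, c_value)
print(product_of_weights, subset_sum)
```

Both printed numbers agree, which is the content of the second equality described above in the simplest possible setting (no constraints on ξ).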
The sum on the RHS of (54) is a partition function of the FK-Ising model on D • conditioned on all edges in ω(σ • ) being open. Performing the usual duality transformation and using that (follows from Euler's formula, can be proven by induction), we obtain

where Z ′ is a normalizing constant independent of σ • . The sum on the RHS of (55) is the partition function of the FK-Ising model on D • \ θ(σ • ) with free boundary conditions and parameter p = 2/(c+1). We denote it by Z FK (D • \ θ(σ • )). It is clear that where P (σ • ) and M (σ • ) are the subgraphs of D • spanned on the vertices having σ • -spin plus or minus, respectively. Then (55) takes the form: Before inserting this into (53), note that where by u ∼ v we mean that u and v are adjacent in D • . Thus, it is enough to show the following two inequalities:
We will show only the first inequality, as the second one is analogous. Each ratio in (57) can be linked to the probability of some event under the FK-Ising measure on P (σ • ++ ). Indeed, the graph P (σ • −+ ) is obtained from P (σ • ++ ) by removing the vertex u together with all edges that are incident to it. Let E u (resp. E v ) be the set of edges in P (σ • ++ ) incident to u (resp. v). Then, where we used that, when ξ * ∩ E u = ∅, the number of clusters in ξ * increases by one when it is viewed as a spanning subgraph of P (σ • ++ ) instead of a spanning subgraph of P (σ • −+ ) (since we need to count the singleton u). Let P FK denote the FK-Ising measure on P (σ • ++ ) with parameter 2/(c+1). In order to write the RHS of the last equation as a P FK -probability, it remains to substitute |E( Similarly, Substituting this in (57) and using that |E u | + |E v | − |E u ∪ E v | = 1_{u∼v} , we get that it is enough to show If u ≁ v, then the last inequality takes the form of the positive association inequality for the FK-Ising model, which is well known (see, e.g., [Gri06, Thm. (3.8)] and also Section 4 above).
Assume that u ∼ v. Dividing all probabilities in (58) by P FK (uv ∉ ξ * ) and rewriting them as conditional probabilities, we obtain that it is enough to show that Since the conditional probability P FK (· | uv ∉ ξ * ) is equal to the FK-Ising measure on P (σ • ++ ) \ {uv}, it is also positively associated. Thus, in order to finish the proof, it remains to show Recall that the parameter of the FK-Ising measure is equal to 2/(c+1). Pairing up edge configurations on P (σ • ++ ) that coincide everywhere except at uv, one obtains 2 · (c−1)/(c+1) · P FK (uv ∈ ξ * ) ≥ 2/(c+1) · P FK (uv ∉ ξ * ), whence the claim follows readily.
Proof of Proposition 2.4. The proof of Proposition 2.3 given in Section 5 can be adapted mutatis mutandis using the FKG inequality stated in Corollary 7.7.

Proof of Item 2 of Theorem 3
Proof. Let τ ∈ E spin (Z 2 ) be a constant plus at all odd faces. By Theorem 4, where by ⪯ even we mean the stochastic domination of the marginals on the spin configurations at even faces. We start by proving that Spin +− D,2 and Spin ++ D,2 converge to the same limit and then we show how this implies that the limit of Spin τ D,2 is also the same. By Corollary 3.2, measure Spin ++ D,2 can be obtained from the random-cluster measure RC 2 D • ,4,pc(4) by assigning plus to all boundary clusters and assigning plus or minus independently with probability 1/2 to all other clusters. By Proposition 4.2, the limit of RC 2 D • ,4,pc(4) , as D ↗ Z 2 , exists, is the unique random-cluster Gibbs measure RC 4,pc(4) with parameters q = 4, p = p c (4) and exhibits infinitely many primal and dual clusters surrounding the origin. Then, the infinite-volume limit of Spin ++ D,2 also exists, can be obtained from RC 4,pc(4) by assigning plus or minus independently with probability 1/2 to every cluster and thus exhibits infinitely many circuits of even (or odd) faces having constant spin plus (or minus). Denote this measure by Spin 2 .
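The passage from a random-cluster configuration to spins used here (plus on every boundary cluster, an independent fair sign on every other cluster) can be illustrated by a short sketch. It works on an arbitrary finite graph; the function names, the union-find bookkeeping and the toy input are assumptions made for illustration and are not the paper's construction on D •.

```python
import random

def find(parent, x):
    """Return the cluster representative of x, compressing paths on the way."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def sample_spins(vertices, open_edges, boundary, rng):
    """Assign +1 to clusters touching `boundary` and a fair random sign to the rest."""
    parent = {v: v for v in vertices}
    for u, v in open_edges:
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u != root_v:
            parent[root_u] = root_v  # merge the two clusters
    boundary_roots = {find(parent, v) for v in boundary}
    cluster_spin = {}
    spins = {}
    for v in vertices:
        root = find(parent, v)
        if root not in cluster_spin:
            cluster_spin[root] = 1 if root in boundary_roots else rng.choice([1, -1])
        spins[v] = cluster_spin[root]
    return spins

# Toy example (hypothetical data): clusters {0, 1}, {2, 3}, {4}, {5}; vertex 0 is on the boundary.
example_spins = sample_spins(range(6), [(0, 1), (2, 3)], {0}, random.Random(0))
print(example_spins)
```

By construction the spin is constant on each cluster, which is the compatibility property that the coupling relies on.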
Similarly, the infinite-volume limit of Spin +− D,2 is also equal to measure Spin 2 . Extremality of measure Spin 2 and invariance under all translations follow from the same properties of the random-cluster measure RC 4,pc(4) .
By (59), the above immediately implies that the limit of the marginal distribution of Spin τ D,c on the spin configurations at even faces exists and is equal to the corresponding marginal of Spin 2 . In particular, for any ε, N > 0, when D is large enough, where by σ • (C + ) = 1 and σ • (C − ) = −1 we mean that σ • is constant plus at C + and constant minus at C − . When this occurs, the ice-rule implies existence of a circuit C of odd faces between C + and C − , on which σ is constant plus or minus. Then there exists a simple cyclic path γ on Z 2 between C and C + such that all even faces bordering γ have constant spin plus and all odd faces bordering γ have constant spin (plus or minus). This implies that for any fixed n > 0, when N is large enough, the restriction of Spin τ D,c to the box Λ n is ε-close to the restriction of the measure Spin 2 to the same box. Letting ε tend to zero finishes the proof.

The Ashkin-Teller model on the self-dual curve
In this section we prove Theorem 5.
The main tool in the proof is the FK-Ising-type representation FKIs c introduced in Section 7.1, which allows us to transfer the results established in Theorem 3 for the spin representation of the six-vertex model to the Ashkin-Teller model.
In the next proposition we describe a coupling between the six-vertex and the Ashkin-Teller models; see Figure 11. Recall graphs D • and D • dual to each other and the notion of compatible spin and edge configurations (Section 3).
Proposition 8.1. Let D be a domain on Z 2 . Let J, U ∈ R be such that sinh 2J = e^{−2U} . Take c = coth 2J. Let (τ, τ ′ ), ξ, and (σ • , σ • ) be random variables distributed according to measures AT free,+ D • ,J,U , FKIs D • ,c , and Spin ++ D,c , respectively. Then, these random variables can be coupled in such a way that their joint law takes the form: where by τ ⊥ ξ * we mean that τ has a constant value on every cluster of ξ * , and similarly for other spin and edge configurations. As before, k(ξ * ) is the number of connected components in ξ * and ω(σ • ) is the set of edges of D • separating opposite spins in σ • .
Consider any u, v ∈ V (D • ). If τ (u) = τ (v) and τ ′ (u) = τ ′ (v), then uv contributes c + 1 to the RHS. If τ (u) ≠ τ (v) and τ ′ (u) ≠ τ ′ (v), then uv contributes c − 1 to the RHS. Otherwise, uv contributes (c − 1) · 1/(c−1) = 1 to the RHS. It remains to check that these contributions are the same in case of the Ashkin-Teller measure AT free,+ . Multiplying all these contributions by e^{U} and comparing to the above, we get that it is enough to check that e^{2J+2U} ?
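The comparison of edge contributions can be checked numerically. The sketch below assumes the standard Ashkin-Teller edge weight exp(J(τ u τ v + τ ′ u τ ′ v ) + U τ u τ v τ ′ u τ ′ v ), a form that is not restated in this section and is therefore an assumption here; the function name is illustrative. After multiplication by e^{U}, the three cases give e^{2J+2U}, e^{−2J+2U} and 1, and on the self-dual line with c = coth 2J these should equal c + 1, c − 1 and 1.

```python
import math

def edge_weight_ratios(J: float):
    """Return (c, both_agree, both_disagree, one_agrees) on the self-dual line, J > 0."""
    # Self-duality sinh 2J = e^{-2U} determines U from J.
    U = -0.5 * math.log(math.sinh(2 * J))
    c = 1.0 / math.tanh(2 * J)  # c = coth 2J
    # Assumed Ashkin-Teller edge weights, each multiplied by e^{U}:
    both_agree = math.exp(2 * J + 2 * U)      # tau and tau' both agree across the edge
    both_disagree = math.exp(-2 * J + 2 * U)  # tau and tau' both disagree
    one_agrees = 1.0                          # exactly one of tau, tau' agrees
    return c, both_agree, both_disagree, one_agrees

for J in (0.2, 0.5, 1.3):
    c, agree, disagree, mixed = edge_weight_ratios(J)
    print(J, agree - (c + 1), disagree - (c - 1), mixed - 1.0)
```

The printed differences vanish up to floating-point error, in line with the identities e^{2J}/sinh 2J = coth 2J + 1 and e^{−2J}/sinh 2J = coth 2J − 1.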
Corollary 8.2. In the notation of Proposition 8.1, assign 1 or −1 uniformly at random independently to every cluster of ξ * . Then, the obtained random spin configuration on D • has the same distribution as the marginal of AT free,+ D • on τ (or, equivalently, τ ′ ). In particular, the following holds: This limiting measure, which is coupled to the uniform six-vertex model (square-ice) studied in [She05, CPST21, DCHL+22, RS23], may be described explicitly as the case x = 1/2 of the following 4-state Lipschitz clock model. The 4-state Lipschitz clock model with parameter x > 0 on a graph G = (V, E) is supported on pairs of spin configurations (τ, τ ′ ) ∈ {−1, 1}^V × {−1, 1}^V which satisfy the Lipschitz condition: either τ u = τ v or τ ′ u = τ ′ v for every edge {u, v} ∈ E. It assigns probability proportional to x^{N (τ,τ ′ )} to each configuration in its support, where N (τ, τ ′ ) = |{{u, v} ∈ E : (τ u , τ ′ u ) ≠ (τ v , τ ′ v )}|. The case x = 1/2 of the 4-state Lipschitz clock model on Z 2 is further shown in [IR12, Section 4] to be in correspondence with an integrable 19-vertex (19V) model and a dilute Brauer model with loop weight n = 2.
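To make the definition of the 4-state Lipschitz clock model concrete, the following sketch enumerates its support on a small finite graph and computes the unnormalized weights x^{N(τ,τ′)}. Vertices are labelled 0, ..., n − 1; the function name and the example graphs are assumptions chosen for illustration.

```python
from itertools import product

def clock_weights(num_vertices, edges, x):
    """Map each Lipschitz pair (tau, tau') to its unnormalized weight x^N."""
    weights = {}
    for tau in product((1, -1), repeat=num_vertices):
        for tau_p in product((1, -1), repeat=num_vertices):
            # Lipschitz condition: on every edge, tau agrees or tau' agrees.
            if not all(tau[u] == tau[v] or tau_p[u] == tau_p[v] for u, v in edges):
                continue
            # N counts edges whose endpoints carry different pairs (tau, tau').
            n_diff = sum((tau[u], tau_p[u]) != (tau[v], tau_p[v]) for u, v in edges)
            weights[(tau, tau_p)] = x ** n_diff
    return weights

# A single edge at x = 1/2 (illustrative choice).
single_edge = clock_weights(2, [(0, 1)], 0.5)
print(len(single_edge), sum(single_edge.values()))

# A 4-cycle, the smallest cycle of Z^2.
four_cycle = clock_weights(4, [(0, 1), (1, 2), (2, 3), (3, 0)], 0.5)
print(len(four_cycle))
```

Dividing each weight by the sum of all weights gives the probability of the corresponding configuration on the chosen finite graph.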
The 4-state Lipschitz clock model at x = 1 on a bipartite graph is equivalent to the proper 4-coloring model (i.e., the zero-temperature 4-state antiferromagnetic Potts model) through the mapping which flips the signs of both τ and τ ′ on one of the bipartition classes. This model is predicted to exhibit exponential decay of correlations on Z 2 [FS95].
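The equivalence with proper 4-colorings can be verified exhaustively on a small bipartite graph. The sketch below takes the 4-cycle with bipartition classes {0, 2} and {1, 3}, flips both signs on the second class, and compares the image of the Lipschitz configurations with the set of proper colorings, where a color is a pair (τ u , τ ′ u ). The helper names and the choice of graph are illustrative assumptions.

```python
from itertools import product

def is_lipschitz(config, edges):
    """On every edge, the first or the second coordinate must agree."""
    return all(config[u][0] == config[v][0] or config[u][1] == config[v][1]
               for u, v in edges)

def is_proper(config, edges):
    """Proper coloring: adjacent vertices carry different pairs."""
    return all(config[u] != config[v] for u, v in edges)

def flip_class(config, flipped):
    """Flip the signs of both coordinates at the vertices listed in `flipped`."""
    return tuple((-t, -tp) if i in flipped else (t, tp)
                 for i, (t, tp) in enumerate(config))

edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
odd_class = {1, 3}
states = list(product((1, -1), repeat=2))
configs = list(product(states, repeat=4))

lipschitz = {cfg for cfg in configs if is_lipschitz(cfg, edges)}
proper = {cfg for cfg in configs if is_proper(cfg, edges)}
image = {flip_class(cfg, odd_class) for cfg in lipschitz}
print(image == proper, len(proper))
```

The check reflects the observation that, after the flip, two adjacent pairs coincide exactly when both τ and τ ′ disagreed across the edge, which is what the Lipschitz condition forbids.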
On the triangular lattice, the 4-state Lipschitz clock model is equivalent to integer-valued Lipschitz height functions (the loop O(2) model; see, e.g., [PS19,GM21]) by taking the height function modulo 4. However, this equivalence does not extend to the square lattice due to the appearance of vortices in the clock model (cycles of length 4 in Z 2 on which all four spin values appear).
As the Ashkin-Teller model with J ≥ U has a phase transition at the self-dual line, it is natural to conjecture that this is the case also for the limiting model.
Question 5. Prove that the 4-state Lipschitz clock model on Z 2 undergoes a phase transition at x = 1/2.
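The obstruction coming from vortices can be seen in a small computation. The sketch below encodes the four values of (τ, τ ′ ) as elements of Z/4 so that the Lipschitz condition becomes "neighbours differ by 0 or ±1 modulo 4" (one possible encoding is (+,+) → 0, (+,−) → 1, (−,−) → 2, (−,+) → 3; this encoding is an assumption made for illustration). It then lists the possible total increments around a cycle: a lift to an integer-valued height function around the cycle requires the total to be zero.

```python
from itertools import product

def step(a, b):
    """Signed increment in {-1, 0, 1} from clock value a to b, or None if not Lipschitz."""
    return {0: 0, 1: 1, 3: -1}.get((b - a) % 4)

def windings(cycle_length):
    """Set of total increments around a cycle over all Lipschitz clock configurations."""
    found = set()
    for values in product(range(4), repeat=cycle_length):
        steps = [step(values[i], values[(i + 1) % cycle_length])
                 for i in range(cycle_length)]
        if None in steps:
            continue  # some neighbouring pair violates the Lipschitz condition
        found.add(sum(steps))
    return found

print(windings(3))  # triangle: faces of the triangular lattice
print(windings(4))  # 4-cycle: faces of the square lattice
```

On a triangle only the total 0 occurs, so the clock values always lift to heights locally, whereas on a 4-cycle the totals ±4 also occur; these are the vortices on which all four spin values appear.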
T-connectivity and delocalization of the height function of square-ice. As discussed in Section 2.7, the use of the triangular lattice connectivity in proving the equality of the height function measures HF 0,1 even,c and HF 0,1 odd,c is one of the main novelties of this work. We hope that this argument would be of further use and outline here an additional consequence that may be obtained with it.
The height function of the square-ice model (the case a = b = c of the six-vertex model) has been shown to delocalize [CPST21] with an argument relying on the seminal results of Sheffield [She05]. The method that we use for showing that HF 0,1 even,c = HF 0,1 odd,c can give an alternative proof of delocalization, at least with zero boundary conditions, which does not rely on [She05]. Consider the height function of the square-ice model in a sequence of odd domains with zero boundary conditions which increases to Z 2 . Let L even be the set of parity-preserving translations of Z 2 . A simple consequence of the FKG inequality for the absolute value of the model [BHM00] is the following dichotomy [CPST21, Theorem 1.1]: either the height at the origin is not tight in this limit (that is, the height delocalizes) or a limiting Gibbs measure, denoted HF 0 , exists and is L even -invariant. Assume, in order to obtain a contradiction, that the latter alternative occurs. By symmetry, this also implies the existence of the limiting Gibbs measures HF k for each integer k (as the thermodynamic limit of odd/even domains with boundary value k according to whether k is even/odd). We will show that HF −1 = HF 1 which yields a contradiction since samples of HF 1 can be obtained by adding 2 to samples of HF −1 . Positive association of the heights (in finite volume) implies that HF 1 ⪰ HF 0 (writing ⪰ for stochastic domination). We will show that in fact This implies (63) (thus yielding the contradiction) since HF −1 is obtained from HF 1 by negating the sign of its samples while HF 0 is invariant to this operation. The measure HF 0 is L even -invariant and has positive association (since it is obtained in the thermodynamic limit). To highlight the main parts of the argument first, we now assume that HF 0 is L even -ergodic. This will allow us to conclude, via the method of T-connectivity, after which we will revisit the ergodicity assumption.
We let h be sampled from HF 0 and study the set of odd faces u where h(u) ≥ 1 and the set where h(u) ≤ −1, where we endow both sets with the T • -connectivity (Definition 6.2). As in the proof of Theorem 2 for the case a + b < c, each of the sets may have at most one infinite cluster (Burton-Keane argument) and, moreover, the two sets may not have an infinite cluster simultaneously (non-coexistence). By the symmetry of HF 0 under sign flip, we conclude that, in fact, neither has an infinite cluster. This allows us to apply the T-circuit argument (as outlined in Section 2.7) to deduce that HF 0 ⪰ HF 1 , yielding (64), and thus finishing the proof of delocalization.
We now prove that HF 0 is L even -ergodic. We first observe that if there exists an L even -ergodic measure µ which is also extremal, then µ = HF k for some k, whence all the HF k are L even -ergodic and extremal. To see this, let h be sampled from µ. For each integer k, let I ≥k be the event that h ≥ k has an infinite cluster (in the standard connectivity on all faces of Z 2 ) and define I <k analogously with h < k. Standard percolation arguments imply that for each integer k, either µ(I ≥k ) = 0 or µ(I <k ) = 0. In the first case, there are infinitely many circuits surrounding the origin where h < k, from which it follows that µ ⪯ HF k−1 . Thus it is impossible that µ(I ≥k ) = 0 for all k. Similarly, by the second case, it is impossible that µ(I <k ) = 0 for all k. The above facts imply that there is a k 0 such that µ(I ≥k 0 ) = 1 and µ(I ≥k 0 +1 ) = µ(I <k 0 ) = 0, whence there are infinitely many circuits surrounding the origin where h = k 0 . It follows that µ = HF k 0 .
It remains to show that there exists an L even -ergodic and extremal measure µ. This follows from the general results of [She05] (which show that every L even -ergodic measure is extremal) but may also be argued directly for the square-ice model as we now explain. Let h be sampled from HF 0 . There is at most one infinite cluster where h > 0 and at most one infinite cluster where h < 0 (the argument of Burton-Keane may be invoked for this as in the proof of Theorem 2). This further implies that there is zero probability that both h > 0 and h < 0 have infinite clusters since the construction of HF 0 implies that the signs of distinct infinite clusters of h ≠ 0 are uniform and independent. It follows that |h| is a Gibbs measure for the absolute value specification. Moreover, using the FKG for absolute values, the distribution of |h| is extremal as it is minimal in the absolute value specification. Since the signs of h on clusters where h is non-zero are uniform and independent, it follows that the extremal components of HF 0 differ only in the sign assigned to the infinite cluster where h is non-zero. In particular, if h ≠ 0 has no infinite cluster with probability one, then HF 0 is extremal, and otherwise h > 0 has an infinite cluster with positive probability and conditioning on this event yields an L even -ergodic and extremal Gibbs measure, as required for the argument.