Energy Landscape and Metastability of Stochastic Ising and Potts Models on Three-dimensional Lattices Without External Fields

In this study, we investigate the energy landscape of the Ising and Potts models on fixed and finite but large three-dimensional (3D) lattices where no external field exists and quantitatively characterize the metastable behavior of the associated Glauber dynamics in the very low temperature regime. Such analyses for the models with non-zero external magnetic fields have been extensively performed over the past two decades; however, models without external fields remained uninvestigated. Recently, the corresponding investigation has been conducted for the two-dimensional (2D) model without an external field, and in this study, we further extend these successes to the 3D model, which has a far more complicated energy landscape than the 2D one. In particular, we provide a detailed description of the highly complex plateau structure of saddle configurations between ground states and then analyze the typical behavior of the Glauber dynamics thereon. Thus, we achieve a quantitatively precise analysis of metastability, including the Eyring-Kramers law, the Markov chain model reduction, and a full characterization of metastable transition paths.

Metastable behaviors of stochastic Ising and Potts models. In this study, we consider the metastability of the stochastic Ising and Potts models evolving according to Metropolis-Hastings-type Glauber dynamics on a large, but fixed three-dimensional (3D) lattice. For such models, the Gibbs invariant measure is exponentially concentrated on monochromatic configurations (i.e., the configurations consisting of a single spin, which are the ground states of the Ising and Potts Hamiltonians) in the very low temperature regime. Hence, in such regimes, the dynamics exhibits metastable behavior between the monochromatic configurations: It starts from a monochromatic configuration, remains in a certain neighborhood of the starting configuration for an exponentially long time, and finally overcomes the energy barrier between monochromatic configurations to reach another monochromatic one.
Several mathematical questions persist regarding the metastable behavior explained above. For instance, in the transition from one monochromatic configuration to another, the mean transition time, the asymptotic law of the rescaled transition time, and the typical transition paths are all points of interest. We are also interested in the characterization of the energy barrier and the saddle configurations that realize this energy barrier via optimal paths between monochromatic configurations. The final issue is particularly important and challenging for the model considered in the present article and has remained open for a long time. It is also important to estimate the mixing time or spectral gap of the associated dynamics; this allows us to measure the effects of metastable behavior on the global mixing properties of the associated Markovian dynamics. In this article, we answer all these questions for the stochastic Ising and Potts models on finite three-dimensional lattices in the absence of external fields.
Model with non-zero external field. The first rigorous mathematical treatment of the metastable behavior of the Ising model was performed in [42,43], where the authors considered the Ising model on a two-dimensional (2D) lattice in the presence of a non-zero external field. These studies verified that the transition from a metastable monochromatic configuration to a stable one is essentially equivalent to the formation of a certain type of critical droplet. From this observation, precise information regarding the transition path was obtained, as well as large deviation-type estimates for the transition time and mixing behavior associated with the Metropolis-Hastings dynamics. This result was extended to the 3D Ising model in [1,6]. Similar results for four- or higher-dimensional models remain to be found, because the variational problems related to the analysis of the energy landscape and critical droplet are highly complicated.
In [17], the aforementioned analyses were further refined via the potential-theoretic approach developed in [15]. In [17], the authors obtained the Eyring-Kramers law for the transition time between monochromatic configurations, as well as the spectral gap of the associated dynamics. This new technology does not provide information on the transition path; however, it provides precise asymptotics for the mean metastable transition time and spectral gap. The same model on growing lattice boxes, rather than fixed ones, was investigated in [14], and the Kawasaki-type (instead of Glauber-type) dynamics for the same model were studied in [13].
Model without external field. When studying the metastability of the stochastic Ising model with a non-zero external field (as described above), the crucial object is the critical droplet, which provides a sharp saddle structure for the energy landscape. However, in the zero external field case, the critical droplet does not exist. Instead, the saddle structure is flat, structurally complex, and composed of a large set of saddle configurations. This is the crucial challenge in the zero external field case, which has left the problem unsolved for a long time.
In the present study, we solve this problem by comprehensively analyzing the energy landscape.
Recently, [40] analyzed for the first time the 2D Ising and Potts models in the absence of external fields. More precisely, they characterized (1) the energy barrier between ground states and (2) the deepest metastable valleys in the landscape. Using the energy landscape results and a general tool referred to as the pathwise approach to metastability (developed in [18,19,39,41]), they obtained large deviation-type results for the metastable behaviors of the 2D models in the absence of external fields.
In [25], which is a companion article of the present one, we refined the results of the previous studies for the 2D model using the potential-theoretic approach, thereby making the following contributions:
• the Eyring-Kramers law for metastable transitions between monochromatic configurations,
• the Markov chain model reduction of metastable behavior (cf. [26] for a comprehensive review on this method), and
• the full characterization of typical transition paths.
To this end, we derive a highly detailed analysis of the energy landscape and characterize all saddle configurations. In particular, we comprehensively and precisely describe the large and complicated saddle structure of the model. Our analysis is sufficiently accurate to allow the transition paths between ground states to be characterized explicitly.
Main achievement. In the current article, we extend all these analyses to the 3D Ising and Potts models by combining the pathwise approach and the potential-theoretic approach. The energy landscape of the 3D model is significantly more complicated than that of the 2D model. For both the 2D and 3D models, there are numerous saddle configurations between ground states, and they form a plateau structure. For the 2D model, at least the bulk part of this plateau structure is relatively simple, because each saddle configuration can only move forward or backward to reach another saddle configuration. In contrast, for the 3D model, we cannot expect such a simplification, because there exist certain configurations for which the legitimate movements between saddle configurations can occur in a substantially more complex manner. We refer to the figure on the front page for an example of a highly complicated saddle configuration in the 3D case (which should be characterized in some way to answer all the questions above). Readers who are familiar with the results on the non-zero external field model can notice from this figure that the saddle configurations for the zero external field model may not have a clear structure as in the non-zero external field case.
Approximation method to metastability. In our companion paper [25], we introduced a new approximation method to prove the Eyring-Kramers law and Markov chain model reduction. This method relies on the approximation of the equilibrium potential function (refer to Section 3.1 for the precise definition) in a Sobolev space defined via the Dirichlet norm associated with the Markov chain. It is robust and particularly suitable if the energy landscape is too complex to apply the potential-theoretic approach [15] via variational principles (the Dirichlet and Thomson principles), because it effectively avoids these variational principles via an approximation in the Sobolev space. We apply this method to the 3D model to achieve our main result.
The main mathematical difficulty of applying this method lies in the fact that we must construct a test function that accurately approximates the equilibrium potential function so that we can obtain the precise Sobolev norm. For this procedure, we need a comprehensive understanding of the whole energy landscape regarding the metastable transitions. Thus, compared with the 2D model, the corresponding construction for the 3D model is far more complicated. Overcoming this difficulty is the main contribution of the present study.

2.1. Models.
In this subsection, we introduce the stochastic Ising and Potts models on a fixed 3D lattice and review their basic features.
Ising and Potts models. We fix three positive integers $K \le L \le M$. Then, we denote by $\Lambda = [\![1, K]\!] \times [\![1, L]\!] \times [\![1, M]\!]$ the 3D lattice box. We use the notation $[\![a, b]\!] = [a, b] \cap \mathbb{Z}$ throughout this article. We impose either open or periodic boundary conditions upon the lattice box $\Lambda$. For the latter boundary condition, we can write
\[ \Lambda = \mathbb{T}_K \times \mathbb{T}_L \times \mathbb{T}_M , \]
where $\mathbb{T}_k = \mathbb{Z}/(k\mathbb{Z})$ represents the discrete one-dimensional torus.
For an integer $q \ge 2$, we use $S = \{1, \dots, q\}$ to represent the set of spins and $\mathcal{X} = S^{\Lambda}$ to represent the space of spin configurations in the 3D box $\Lambda$. We express a configuration $\sigma \in \mathcal{X}$ as $\sigma = (\sigma(x))_{x \in \Lambda}$, where $\sigma(x) \in S$ represents the spin of $\sigma$ at site $x \in \Lambda$. For $x, y \in \Lambda$, we write $x \sim y$ if they are neighboring sites; that is, $\|x - y\| = 1$, where $\|\cdot\|$ denotes the Euclidean distance in $\Lambda$. With this notation, we define the Hamiltonian $H : \mathcal{X} \to \mathbb{R}$ as
\[ H(\sigma) = \sum_{\{x, y\} \subset \Lambda :\, x \sim y} \mathbf{1}\{\sigma(x) \ne \sigma(y)\} - h \sum_{x \in \Lambda} \mathbf{1}\{\sigma(x) = 1\} , \]
where $h \in \mathbb{R}$ denotes the magnitude of the external magnetic field. Thus, the first summation on the right-hand side represents the spin-spin interactions, and the second one corresponds to the effect of the external magnetic field. We use $\mu_\beta(\cdot)$ to denote the Gibbs measure on $\mathcal{X}$ associated with the Hamiltonian $H$ at inverse temperature $\beta > 0$; that is,
\[ \mu_\beta(\sigma) = \frac{1}{Z_\beta} e^{-\beta H(\sigma)} , \quad \sigma \in \mathcal{X} , \tag{2.3} \]
where $Z_\beta = \sum_{\zeta \in \mathcal{X}} e^{-\beta H(\zeta)}$ is the partition function. The random spin configuration on box $\Lambda$ corresponds to the probability measure $\mu_\beta(\cdot)$ on $\mathcal{X}$; it is referred to as the Ising model if $q = 2$ and the Potts model if $q \ge 3$. Henceforth, we treat $q$ as a fixed parameter. Our primary concern is the metastability analyses of these models as $\beta \to \infty$ under Metropolis-Hastings dynamics, which will be defined precisely below.
Remark 2.1 (Results for non-zero external field). Comprehensive analyses of the energy landscape and the metastability of the Ising model with a non-zero external field, i.e., h ≠ 0, were performed in [17,42,43] for the 2D case, and in [1,6] for the 3D one. For these models, the characterization of the critical droplet comprehensively explains the metastable behavior. We remark that analysis for cases of more than three dimensions has yet to be undertaken, because the energy landscape is too complex to allow critical droplets to be characterized. Recently, the 2D Potts model with an external field toward one specific spin has been studied [7,8,9].
In this study, we consider the zero external field case (i.e., h = 0); thus, we henceforth assume that h = 0. This case differs from those involving non-zero external fields, in the sense that the energy landscape is not characterized by critical droplets. Instead, we must tackle a large and complex landscape of saddle configurations via complicated combinatorial and probabilistic arguments.
Ground states. For each $a \in S$, denote by $s_a \in \mathcal{X}$ the monochromatic configuration in which all spins are $a$, i.e., $s_a(x) = a$ for all $x \in \Lambda$. We write
\[ \mathcal{S} = \{s_1, \dots, s_q\} . \tag{2.4} \]
It is precisely upon $\mathcal{S}$ that the Hamiltonian $H(\cdot)$ attains its minimum $0$; hence, $\mathcal{S}$ represents the set of ground states of the model. Accordingly, we obtain the following characterization of the partition function $Z_\beta$ that appears in (2.3), as well as the Gibbs measure $\mu_\beta$ as $\beta \to \infty$.
Proof. The estimate (2.5) of the partition function comes directly from the expression of the partition function given right after (2.3) and the fact that $H(\sigma) \ge 3$ for $\sigma \notin \mathcal{S}$. The second assertion of the theorem is directly derived from the first one and the expression (2.3) of $\mu_\beta$.
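As an illustration only, the concentration of the Gibbs measure on the ground states can be checked by brute-force enumeration on a toy box. The following minimal Python sketch assumes open boundary conditions, zero external field, and a Hamiltonian that counts neighboring pairs with different spins. The function names and the 2 × 2 × 2 box are hypothetical choices made for this example, and the box is far smaller than those considered in this article.

```python
import itertools
import math

def open_box_edges(K, L, M):
    """Nearest-neighbor edges of a K x L x M box with open boundary conditions."""
    sites = list(itertools.product(range(K), range(L), range(M)))
    index = {x: i for i, x in enumerate(sites)}
    edges = []
    for (a, b, c) in sites:
        for (da, db, dc) in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            y = (a + da, b + db, c + dc)
            if y in index:
                edges.append((index[(a, b, c)], index[y]))
    return len(sites), edges

def hamiltonian(sigma, edges):
    """Assumed zero-field energy: number of neighboring pairs with different spins."""
    return sum(1 for (i, j) in edges if sigma[i] != sigma[j])

def gibbs_summary(K, L, M, q, beta):
    """Enumerate all q^(KLM) configurations (feasible only for tiny boxes)."""
    n, edges = open_box_edges(K, L, M)
    energies = [hamiltonian(s, edges)
                for s in itertools.product(range(1, q + 1), repeat=n)]
    Z = sum(math.exp(-beta * e) for e in energies)      # partition function
    num_ground = sum(1 for e in energies if e == 0)     # zero-energy configurations
    min_excited = min(e for e in energies if e > 0)     # lowest non-zero energy
    return Z, num_ground, min_excited

Z, num_ground, min_excited = gibbs_summary(2, 2, 2, q=2, beta=5.0)
# 1 / Z is the Gibbs mass of each single ground state when their energy is 0.
print(num_ground, min_excited, 1.0 / Z)
```

For this toy box, the enumeration finds exactly q zero-energy configurations, and every other configuration has energy at least 3, so the mass of each ground state is close to 1/q already at moderate β.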
Metropolis-Hastings dynamics and metastability. We give a continuous version of the Metropolis-Hastings dynamics, which is the standard heat-bath Glauber dynamics used for studying the metastability of the Ising model [42]. For $x \in \Lambda$ and $a \in S$, we use $\sigma^{x, a} \in \mathcal{X}$ to denote the configuration obtained from $\sigma$ by updating the spin at site $x$ to $a$. Then, the continuous version of the Metropolis-Hastings dynamics is defined as a continuous-time Markov chain $\{\sigma_\beta(t)\}_{t \ge 0}$ on $\mathcal{X}$, whose transition rates are given by
\[ r_\beta(\sigma, \zeta) = \begin{cases} e^{-\beta \max\{H(\zeta) - H(\sigma),\, 0\}} & \text{if } \zeta = \sigma^{x, a} \ne \sigma \text{ for some } x \in \Lambda \text{ and } a \in S , \\ 0 & \text{otherwise} . \end{cases} \]
We notice from this definition of the rate $r_\beta(\cdot, \cdot)$ that the Metropolis-Hastings dynamics tends to lower the energy, particularly when $\beta$ is large, because the jump rate from one configuration to another one with higher energy is exponentially small, whereas the jump rate to another one with lower or equal energy is 1. We let $P^\beta_\sigma$ and $E^\beta_\sigma$ represent the law and expectation, respectively, of the process $\sigma_\beta(\cdot)$ starting from $\sigma$.
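The behavior of the jump rates described above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in any reference: it assumes the zero-field Hamiltonian that counts disagreeing neighboring pairs, and the helper names (`update`, `rate`, `gibbs_weight`) and the four-site cycle are hypothetical choices made for the example.

```python
import math

def hamiltonian(sigma, edges):
    """Assumed zero-field energy: number of neighboring pairs with different spins."""
    return sum(1 for (i, j) in edges if sigma[i] != sigma[j])

def update(sigma, x, a):
    """Configuration obtained from sigma by setting the spin at site x to a."""
    return sigma[:x] + (a,) + sigma[x + 1:]

def rate(sigma, zeta, edges, beta):
    """Metropolis rate: 1 for downhill or flat single-site moves,
    exp(-beta * energy increase) for uphill ones, 0 if zeta is not a
    single-site update of sigma."""
    differing = [x for x in range(len(sigma)) if sigma[x] != zeta[x]]
    if len(differing) != 1:
        return 0.0
    energy_gain = hamiltonian(zeta, edges) - hamiltonian(sigma, edges)
    return math.exp(-beta * max(energy_gain, 0))

def gibbs_weight(sigma, edges, beta):
    """Unnormalized Gibbs weight exp(-beta * H(sigma))."""
    return math.exp(-beta * hamiltonian(sigma, edges))

# Toy graph (an assumption for the example): a cycle of four sites.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
beta = 2.0
sigma = (1, 1, 1, 1)          # monochromatic configuration, energy 0
zeta = update(sigma, 0, 2)    # one flipped spin, energy 2

uphill = rate(sigma, zeta, cycle_edges, beta)
downhill = rate(zeta, sigma, cycle_edges, beta)

# Reversibility check: weight(sigma) * r(sigma, zeta) == weight(zeta) * r(zeta, sigma).
lhs = gibbs_weight(sigma, cycle_edges, beta) * uphill
rhs = gibbs_weight(zeta, cycle_edges, beta) * downhill
print(uphill, downhill, math.isclose(lhs, rhs))
```

The uphill move is suppressed by the factor e^{-2β}, while the reverse move has rate 1, so the two sides of the reversibility check agree.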
From the detailed balance condition $\mu_\beta(\sigma) r_\beta(\sigma, \zeta) = \mu_\beta(\zeta) r_\beta(\zeta, \sigma)$ for $\sigma, \zeta \in \mathcal{X}$, we observe that the invariant measure for the Metropolis-Hastings dynamics $\sigma_\beta(\cdot)$ is $\mu_\beta(\cdot)$ and that $\{\sigma_\beta(t)\}_{t \ge 0}$ is reversible with respect to $\mu_\beta(\cdot)$. We also note that the Markov chain $\sigma_\beta(\cdot)$ is irreducible. In view of Theorem 2.2, we anticipate that the process $\sigma_\beta(\cdot)$ will exhibit metastable behavior between ground states, provided that $\beta$ is sufficiently large. More precisely, the process $\sigma_\beta(\cdot)$ starting from configuration $s \in \mathcal{S}$ remains in a certain neighborhood of $s$ for a sufficiently long time, and then undergoes a rare but rapid transition to another ground state. Our main concern is to precisely analyze such metastability of the stochastic Ising and Potts models under the Metropolis-Hastings dynamics (defined above) in the very low temperature regime; that is, when $\beta \to \infty$. We explain these results in the following subsection.
1 For two collections $(a_\beta)_{\beta > 0} = (a_\beta(K, L, M))_{\beta > 0}$ and $(b_\beta)_{\beta > 0} = (b_\beta(K, L, M))_{\beta > 0}$ of real numbers, we write $a_\beta = O_\beta(b_\beta)$ if there exists some $C = C(K, L, M) > 0$ such that $|a_\beta| \le C b_\beta$ for all $\beta > 0$ and $K, L, M$.
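Remark 2.3. We employ the continuous-time dynamics (as applied in numerous previous studies) because it offers a simpler presentation than the corresponding discrete dynamics (as used in [6,17,40]), for which the jump probability is given by
\[ p_\beta(\sigma, \zeta) = \begin{cases} \dfrac{1}{q|\Lambda|} e^{-\beta \max\{H(\zeta) - H(\sigma),\, 0\}} & \text{if } \zeta = \sigma^{x, a} \ne \sigma \text{ for some } x \in \Lambda \text{ and } a \in S , \\ 1 - \sum_{\zeta' \ne \sigma} p_\beta(\sigma, \zeta') & \text{if } \zeta = \sigma , \\ 0 & \text{otherwise} . \end{cases} \tag{2.7} \]
However, our computations can be applied to this model as well. See also Remark 2.15.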

2.2. Main results: large deviation-type results. Hereafter, we explain our results regarding the metastability of the stochastic Ising and Potts models. In the current subsection, we explain the large deviation-type results obtained for the metastable behavior.
Energy barrier between ground states. First, we introduce the energy barrier associated with the Ising and Potts models considered in this study. This is important for the analysis of metastable behaviors, in that the Metropolis-Hastings dynamics must overcome this energy barrier to make a transition from one ground state to another.
Note that $\Phi(s, s')$ does not depend on the selections of $s, s' \in \mathcal{S}$, owing to the model symmetry. Additionally, note that $\Gamma$ represents the energy barrier between ground states, because the dynamics must overcome this energy level to make a transition from one ground state to another.
To characterize the energy barrier, we must check the maximum energy of all paths connecting the ground states. Thus, the energy barrier is a global feature of the energy landscape, and characterizing it is a non-trivial task. For the current model, we can identify the exact value of the energy barrier. Recall that we assumed $K \le L \le M$. (2.8)
Remark 2.5. Our arguments show that this theorem holds for $K \ge 2829$, where the threshold 2829 may be sub-optimal (cf. Remark 8.4). However, the optimality of this threshold is a minor issue, because our main concern is the spin system on large boxes. Henceforth, we assume that $K$ satisfies this condition, i.e., $K \ge 2829$.
Theorem 2.4 is proved in Section 8.
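To make the notion of an energy barrier concrete, the following Python sketch computes it by exhaustive search on a toy box. It is only an illustration under stated assumptions: the barrier between two configurations is taken to be the minimum, over all paths of single-spin updates, of the maximal energy along the path; the Hamiltonian counts disagreeing neighboring pairs (zero external field); and the boundary is open. The function names and the 2 × 2 × 2 box are hypothetical. Such a box lies far below the threshold in Remark 2.5, so its value is not meant to be compared with Theorem 2.4.

```python
import heapq
import itertools

def open_box_edges(K, L, M):
    """Nearest-neighbor edges of a K x L x M box with open boundary conditions."""
    sites = list(itertools.product(range(K), range(L), range(M)))
    index = {x: i for i, x in enumerate(sites)}
    edges = []
    for (a, b, c) in sites:
        for (da, db, dc) in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            y = (a + da, b + db, c + dc)
            if y in index:
                edges.append((index[(a, b, c)], index[y]))
    return len(sites), edges

def hamiltonian(sigma, edges):
    """Assumed zero-field energy: number of neighboring pairs with different spins."""
    return sum(1 for (i, j) in edges if sigma[i] != sigma[j])

def communication_height(start, target, q, edges):
    """Minimax (bottleneck) Dijkstra search over single-spin updates: returns
    the smallest possible value of the maximal energy along a path."""
    best = {start: hamiltonian(start, edges)}
    heap = [(best[start], start)]
    while heap:
        height, sigma = heapq.heappop(heap)
        if sigma == target:
            return height
        if height > best[sigma]:
            continue  # stale heap entry
        for x in range(len(sigma)):
            for a in range(1, q + 1):
                if a == sigma[x]:
                    continue
                zeta = sigma[:x] + (a,) + sigma[x + 1:]
                new_height = max(height, hamiltonian(zeta, edges))
                if new_height < best.get(zeta, float("inf")):
                    best[zeta] = new_height
                    heapq.heappush(heap, (new_height, zeta))
    return None  # target not reachable (does not happen for these dynamics)

n, edges = open_box_edges(2, 2, 2)
s1, s2 = (1,) * n, (2,) * n
barrier = communication_height(s1, s2, 2, edges)
print(barrier)
```

The search is exponential in the number of sites and is therefore only usable for very small boxes; the combinatorial arguments in this article replace it for large ones.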
Remark 2.6. Several remarks regarding the previous theorem are in order.
(1) Note that Theorem 2.4 does not depend on the value of $q$, because in the transition from $s_a$ to $s_b$ for $a, b \in S$, no spins besides $a$ and $b$ play a significant role.
(2) Suppose temporarily that $\Gamma_d$ is the energy barrier, defined in the same way as above, subjected to Ising/Potts models defined on a $d$-dimensional lattice box of size
Comparison with non-zero external field case. We conclude this energy barrier discussion by comparing our results for the zero external field case with those for the non-zero external field case obtained in [42] and [6] for the Ising model (i.e., $q = 2$) in two or three dimensions, respectively. More precisely, these studies showed that the energy barrier is given by (under some technical assumptions regarding $h$) where $\Gamma_d$ represents the $d$-dimensional energy barrier, $\ell_h = \lceil 2/|h| \rceil$, $m_h = \lceil 4/|h| \rceil$, and $\delta_h \in \{0, 1\}$ is a constant depending only on $h$ (provided that the lattice is sufficiently large). We refer to [12, Chapter 17] for details. These energy barriers are characterized by the energy of the critical droplet, and their values do not depend on the size of the box but are determined solely by the magnitude $|h|$ of the external field. This is primarily because the size of the critical droplet is determined solely by $|h|$, and the size of the box plays no role provided that the box is sufficiently large to contain a single droplet. In contrast, the zero external field case does not feature such a critical droplet; hence, the magnitude of the energy barrier depends crucially on the box size. This is the key difference between the zero external field and non-zero external field cases.
Large deviation-type results based on pathwise approach. Here, we explain the large deviation-type analysis of the metastable behavior of the Metropolis-Hastings dynamics. These results can be obtained via the pathwise approach developed in [18], provided that we can analyze the model energy landscape to a certain degree of precision. We refer to the monograph [44] for an extensive summary of the pathwise approach. This approach allows us to analyze the metastability from three different perspectives: transition time, spectral gap, and mixing time. All these quantities are crucial for quantifying the metastable behavior. First, we explicitly define them as follows:
• For $A \subseteq \mathcal{X}$, we denote by $\tau_A = \inf\{t \ge 0 : \sigma_\beta(t) \in A\}$ the hitting time of the set $A$. If $A = \{\sigma\}$ is a singleton, we write $\tau_{\{\sigma\}} = \tau_\sigma$.
• For $s \in \mathcal{S}$, we write $\breve{s} = \mathcal{S} \setminus \{s\}$. Then, our primary concern is the hitting time $\tau_{\breve{s}}$ or $\tau_{s'}$ for $s' \in \breve{s}$ when the dynamics starts from $s \in \mathcal{S}$. We refer to this as the (metastable) transition time, because it expresses the time required for a transition to proceed from one ground state to another.
• The mixing time corresponding to the level $\epsilon \in (0, 1)$ is defined as
\[ t^{\mathrm{mix}}_\beta(\epsilon) = \min\Big\{ t \ge 0 : \max_{\sigma \in \mathcal{X}} \big\| P^\beta_\sigma[\sigma_\beta(t) \in \cdot\,] - \mu_\beta(\cdot) \big\|_{\mathrm{TV}} \le \epsilon \Big\} , \]
where $\|\cdot\|_{\mathrm{TV}}$ represents the total variation distance between measures (cf. [34, Chapter 4]).
• We denote by $\lambda_\beta$ the spectral gap of the Metropolis-Hastings dynamics defined in Section 2.1.
The 2D version of the following theorem was established in [40] using the refined pathwise approach developed in [19,39,41]. We extend their results to the 3D model.
Theorem 2.7. The following statements hold.
(1) (Transition time) For all $s, s' \in \mathcal{S}$ and $\epsilon > 0$, we have where $\mathrm{Exp}(1)$ is the exponential random variable with a mean value of 1.
(2) (Mixing time) For all $\epsilon \in (0, 1/2)$, the mixing time satisfies
(3) (Spectral gap) There exist two constants $0 < c_1 \le c_2$ such that
Remark 2.8. The above theorem holds under both open and periodic boundary conditions. Theorem 2.7 states that the metastable transition time, mixing time, and inverse spectral gap become exponentially large as $\beta \to \infty$, and their exponential growth rates are determined by the energy barrier $\Gamma$.
The robust methodology developed in [19,39,41] implies that characterizing the energy barrier between ground states and identifying all the deepest valleys suffice (up to several technical issues) to confirm the results presented in Theorem 2.7. In [40], the authors performed corresponding analyses of the energy landscape; then, they used this robust methodology to prove Theorem 2.7 for two dimensions. We perform the corresponding analysis of the energy landscape for the 3D model as well in Sections 6, 7, and 8. The proof of Theorem 2.7 is given in Section 8.3. Analysis of the energy landscape of the 3D model is far more difficult than that of the 2D one considered in [25] for several reasons. Details are presented at the beginning of Section 6.
Characterization of transition path. Our analysis of the energy landscape is sufficiently precise to characterize all the possible transition paths between ground states in a high level of detail. The transition paths are rigorously defined in Definition 9.13; we do not present explicit definitions here, because we would have to define a large amount of notation. The following theorem asserts that, with dominating probability, the Metropolis-Hastings dynamics evolves along one of the transition paths when a transition occurs from one ground state to another.
Theorem 2.9. For all $s \in \mathcal{S}$, we have
\[ \lim_{\beta \to \infty} P^\beta_s \big[\, \exists\, 0 < t_1 < \cdots < t_N < \tau_{\breve{s}} \text{ such that } (\sigma_\beta(t_n))_{n=1}^N \text{ is a transition path between } s \text{ and } \breve{s} \,\big] = 1 . \]
The characterization of the transition paths and the proof of this theorem are given in Section 9.4.

2.3. Main results: Eyring-Kramers law and Markov chain model reduction. The following results constitute more quantitative analyses of the metastable behavior obtained using potential-theoretic methods. In particular, we obtain the Eyring-Kramers law (which is a considerable refinement of (2.10)) and the Markov chain model reduction of metastable behavior in the sense of [2,3].
For these results, we require an accurate understanding of the energy landscape and the behavior of the Metropolis-Hastings dynamics on a large set of saddle configurations between ground states. We conduct these analyses in Sections 9 and 10.
We further remark that the quantitative results given below depend on the selection of boundary condition, in contrast to Theorems 2.7 and 2.9 (cf. Remark 2.8). For brevity, we assume periodic boundary conditions throughout this subsection. We can treat the open boundary case in a similar manner; the results and a sketch of the proof are presented in Section 11.
Eyring-Kramers law. The following result constitutes a refinement of (2.10) (and hence of (2.11)) that allows us to pin down the sub-exponential prefactor associated with the large deviation-type exponential estimates of the mean transition time between ground states.
Moreover, the constant $\kappa$ satisfies (2.13)
In particular, the quantity $E^\beta_s[\tau_{\breve{s}}]$ represents the mean time required to jump from $s$ to another ground state; hence, the first formula of (2.12) corresponds to the so-called Eyring-Kramers law for the Metropolis-Hastings dynamics.
Remark 2.11. Here, we make several comments regarding Theorem 2.10.
(1) Although we do not present the exact formula for the constant $\kappa$ in the theorem, it can be explicitly expressed in terms of potential-theoretic notions relevant to a random walk defined in a complicated space (cf. (3.10) and (3.11) for the formulas). This random walk is less explicit (cf. Proposition 9.9) than the corresponding random walk identified in [25, Proposition 6.22] for the 2D model, which reflects the complexity of the energy landscape of the 3D model compared with that of the 2D one. The proof of Theorem 2.10 is conducted via the potential-theoretic approach, which originates from [15]. Using this approach, we can estimate the mean transition time $E^\beta_s[\tau_{\breve{s}}]$ by obtaining a precise estimate of the capacity between ground states (cf. [2, Proposition 6.10]). This estimate is typically obtained from variational principles for capacities, such as the Dirichlet and Thomson principles. In contrast, we use the $H^1$-approximation technique developed in our companion article [25], which considerably simplifies the proof while retaining the gist of the logical structure needed to estimate the capacity.
To this end, we require precise analyses of the energy landscape and the behavior of the underlying metastable processes on a certain neighborhood of saddle configurations between metastable sets. In most other models for which the Eyring-Kramers law can be obtained via such robust strategies, the energy landscape is relatively simple; hence, analyzing the landscape rarely raises serious mathematical issues. However, in the current model, the saddle consists of a very large collection of saddle configurations, which form a complex structure. Analyzing this structure is a highly complicated task; moreover, it is difficult to assess the behavior of the dynamics in the neighborhood of this large set with adequate precision. The achievement of these tasks is one of the main contributions of this study. We emphasize here that the $H^1$-approximation technique, which is used in the proof of the main results in a critical manner, is particularly handy for models with complicated landscapes, such as the one considered in this study.
Markov chain model reduction of metastable behavior. Because the transitions between ground states occur successively, analyzing all these transitions together is also an important problem in the study of metastability. The general method used is Markov chain model reduction [2,3,4]. In this methodology, one proves that the metastable process (accelerated by a certain scale) converges, in a suitable sense, to a Markov chain on the set of metastable sets. For our model, the target Markov chain must be a Markov chain on the collection of ground states, because each ground state corresponds to a metastable set.
To explain this result in the context of our model, we introduce the trace process on ground states. In view of Theorem 2.10, we must accelerate the process by a factor $e^{\Gamma\beta}$ to observe transitions between ground states in the ordinary time scale; hence, let us denote by $\hat{\sigma}_\beta(t) = \sigma_\beta(e^{\Gamma\beta} t)$, $t \ge 0$, the accelerated process. Then, we define a random time $T(t)$, $t \ge 0$, as
\[ T(t) = \int_0^t \mathbf{1}\{\hat{\sigma}_\beta(u) \in \mathcal{S}\} \, du , \]
which measures the amount of time (up to $t$) the accelerated process spends on the ground states. Let $S(\cdot)$ be the generalized inverse of $T(\cdot)$; that is,
\[ S(t) = \sup\{u \ge 0 : T(u) \le t\} . \]
Then, the (accelerated) trace process $\{X_\beta(t)\}_{t \ge 0}$ on the set $\mathcal{S}$ of ground states is defined by
\[ X_\beta(t) = \hat{\sigma}_\beta(S(t)) \quad \text{for } t \ge 0 . \tag{2.14} \]
We observe that the trace process $X_\beta(\cdot)$ is obtained from the accelerated process $\hat{\sigma}_\beta(\cdot)$ by turning off the clock whenever it is not on a ground state; thus, the process $X_\beta(\cdot)$ extracts information regarding the hopping dynamics on ground states. It is well known that the trace process $X_\beta(\cdot)$ is a continuous-time, irreducible Markov chain on $\mathcal{S}$; see [2, Proposition 6.1] for a rigorous proof.
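The idea of turning off the clock outside the ground states can be illustrated on a single piecewise-constant trajectory. The Python sketch below is a minimal example under stated assumptions: a trajectory is represented as a list of (state, holding time) pairs, and the names `trace_on` and `value_at` as well as the state labels are hypothetical.

```python
def trace_on(segments, ground):
    """Trace of a piecewise-constant trajectory on the set `ground`.

    `segments` is a list of (state, holding_time) pairs in chronological
    order. Time spent outside `ground` is deleted, and consecutive visits
    to the same ground state are merged into one holding interval.
    """
    traced = []
    for state, duration in segments:
        if state not in ground:
            continue  # the clock is switched off outside the ground states
        if traced and traced[-1][0] == state:
            traced[-1] = (state, traced[-1][1] + duration)
        else:
            traced.append((state, duration))
    return traced

def value_at(traced, t):
    """State of the (right-continuous) traced trajectory at time t."""
    elapsed = 0.0
    for state, duration in traced:
        elapsed += duration
        if t < elapsed:
            return state
    return traced[-1][0]

# Hypothetical trajectory: excursions to the non-ground states "x" and "y".
segments = [("s1", 2.0), ("x", 0.5), ("s1", 1.0), ("y", 0.25), ("s2", 3.0)]
traced = trace_on(segments, ground={"s1", "s2"})
print(traced)                  # [('s1', 3.0), ('s2', 3.0)]
print(value_at(traced, 2.9))   # s1
print(value_at(traced, 3.0))   # s2
```

In this example, the excursions of total length 0.75 are removed, so the traced trajectory jumps from s1 to s2 at time 3.0 rather than at time 3.75.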
Here, in view of the second estimate of (2.12), we define the limiting Markov chain $\{X(t)\}_{t \ge 0}$ on $\mathcal{S}$, which expresses the asymptotic behavior of the accelerated process $\hat{\sigma}_\beta(\cdot)$ between the ground states as a continuous-time Markov chain with jump rate $r_X(s, s') = \kappa^{-1}$ for all $s, s' \in \mathcal{S}$.
(1) The law of the Markov chain $X_\beta(\cdot)$ converges to that of the limiting Markov chain $X(\cdot)$ as $\beta \to \infty$, in the usual Skorokhod topology.
(2) It holds that
The second part of this theorem implies that the accelerated process spends a negligible amount of time in the set $\mathcal{X} \setminus \mathcal{S}$. Therefore, the trace process $X_\beta(\cdot)$ of $\hat{\sigma}_\beta(\cdot)$ on the set $\mathcal{S}$, which is essentially obtained by neglecting the excursion of $\hat{\sigma}_\beta(\cdot)$ on the set $\mathcal{X} \setminus \mathcal{S}$, is indeed a reasonable object for approximating the process $\hat{\sigma}_\beta(\cdot)$. Combining this observation with the first part of the theorem implies that the limiting Markov chain $X(\cdot)$ describes the successive metastable transitions of the Metropolis-Hastings dynamics.
Remark 2.13. The proofs of Theorems 2.10 and 2.12 are based on the potential-theoretic argument, and we present the arguments in Section 3. We conjecture that these results also hold for the cases of more than three dimensions.
Remark 2.14. Temporarily, we denote by $E_s$ the expectation with respect to the law of the limiting Markov chain $X(\cdot)$ starting at $s \in \mathcal{S}$. Theorem 2.12 is consistent with Theorem 2.10, in that for any $s' \in \breve{s}$, we have $E_s[\tau_{s'}] = \kappa$.
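The identity in Remark 2.14 can be checked exactly for small q. The following Python sketch is an illustration under the assumption that the limiting chain jumps between every pair of distinct ground states at rate 1/κ; it solves the linear system for the mean hitting time of a fixed target state with exact fractions. The function names and the numerical values of q and κ are hypothetical.

```python
from fractions import Fraction

def solve(A, b):
    """Gauss-Jordan elimination over exact fractions."""
    n = len(A)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def mean_hitting_times(q, kappa):
    """Mean time to hit one fixed target state from each of the other q - 1
    states, for the chain with rate 1/kappa between all distinct states.

    For a non-target state i with unknown u_i (and u = 0 at the target):
        sum over j != i of (1/kappa) * (u_j - u_i) = -1.
    """
    rate = 1 / Fraction(kappa)
    m = q - 1  # number of non-target states
    A = [[rate if i != j else -(q - 1) * rate for j in range(m)]
         for i in range(m)]
    b = [Fraction(-1)] * m
    return solve(A, b)

kappa = Fraction(5, 2)  # arbitrary illustrative value
times = mean_hitting_times(3, kappa)
print(times)  # every entry equals kappa
```

The same value follows from a first-step argument: each holding time has mean κ/(q − 1), and the number of jumps needed to reach the target is geometric with mean q − 1.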
Remark 2.15 (Discrete Metropolis-Hastings dynamics). The only difference in the discrete dynamics defined by (2.7) is that it is $q|\Lambda|$ times slower than the continuous dynamics (in the average sense). Therefore, Theorems 2.4, 2.7, and 2.9 are valid for this dynamics without any modification. Theorems 2.10 and 2.12 hold provided that we replace the constant $\kappa$ with $\kappa' = q|\Lambda| \kappa$. The rigorous verification of the result proceeds in a similar way; thus, we do not repeat it here.
Outlook of proofs of main results. To prove Theorems 2.4 and 2.7, which fall into the category of pathwise-type metastability results, we investigate the energy landscape of the Ising/Potts models on the 3D lattice Λ, as described in Sections 6, 7, and 8. Along the investigation, we present proofs of Theorems 2.4 and 2.7 in Section 8. Then, we proceed to the proofs of Theorems 2.10 and 2.12, which require more accurate analyses of the energy landscape than the previous theorems. These detailed analyses are presented in Section 9, and as a byproduct we present the proof of Theorem 2.9 in Section 9.4. Then, we present the proofs of Theorems 2.10 and 2.12 in Section 10.
Non-reversible models. The stochastic system considered in this study is the continuous-time Metropolis-Hastings spin-updating dynamics, which is reversible with respect to the Gibbs measure $\mu_\beta(\cdot)$. In fact, as in our companion paper [25], we can consider various dynamics that have invariant measure $\mu_\beta(\cdot)$ but are non-reversible with respect to this measure. Since the approximation method and the pathwise approach used in the proof of the main results presented above are robust and can be used in the non-reversible setting as well, we can analyze the 3D version of the non-reversible models introduced in [25] for the 2D model and obtain similar results. However, for simplicity (as the analysis of the energy landscape of the 3D model is itself very complicated), we decided not to include the non-reversible content in the current article. Readers who are interested in non-reversible generalizations can refer to [25, Sections 2.2 and 5] for details.

Outline of the Proof
In this section, we provide a brief summary of the proofs of the main results. We emphasize again that in the remainder of this article (except in Section 11), we assume periodic boundary conditions; that is, $\Lambda = \mathbb{T}_K \times \mathbb{T}_L \times \mathbb{T}_M$. In addition, we always assume that $K$ satisfies the condition given in Remark 2.5.
We reduce the proofs of Theorems 2.10 and 2.12 (which are the final destinations of the current article) to an estimate of the capacity between ground states (cf. Theorem 3.1), and then we reduce the proof of this capacity estimate to the construction of a certain test function (cf. Proposition 3.2) which is a proper approximation of the equilibrium potential function defined in (3.4). The construction and verification of Proposition 3.2 are done in Section 10. This procedure takes advantage of all the information on the energy landscape analyzed in Sections 6-9.
A general strategy to prove such results, which also works in non-reversible cases, was developed in our companion article [25, Section 4]. Thus, we state here only the essential ingredients in a self-contained manner and refer the interested readers to [25, Section 4] for more detail.
3.1. Capacity estimate and proof of Theorems 2.10 and 2.12. The Dirichlet form D β (·) associated with the (reversible) Metropolis-Hastings dynamics σ β (·) is given by, for f : X → R, An alternative expression for the Dirichlet form is given as where ⟨·, ·⟩ µ β is the inner product on L 2 (µ β ) and L β is the generator of the original process, that is, For two disjoint and non-empty subsets P and Q of X , the equilibrium potential between P and Q is the function h β P, Q : X → R defined by By definition, it readily follows that h β P, Q ≡ 1 on P and h β P, Q ≡ 0 on Q. Then, we define the capacity between P and Q as It is well known that the equilibrium potential is the unique solution to the following equation: Next, we define the constant κ = κ(K, L, M ) that appears in Theorems 2.10 and 2.12.
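The potential-theoretic objects of this subsection can be illustrated on a toy chain. The following is a minimal sketch, assuming the standard forms for a reversible chain: the Dirichlet form D(f) = (1/2) Σ µ(x) r(x, y) (f(y) − f(x))², the equilibrium potential as the solution of h = 1 on P, h = 0 on Q and L h = 0 elsewhere, and the capacity as the Dirichlet form of the equilibrium potential. The function names and the three-state example chain are hypothetical and are not taken from the article.

```python
from fractions import Fraction

def dirichlet_form(mu, rate, f):
    """D(f) = 1/2 * sum_{x != y} mu[x] * rate[x][y] * (f[y] - f[x])**2."""
    n = len(mu)
    total = sum(mu[x] * rate[x][y] * (f[y] - f[x]) ** 2
                for x in range(n) for y in range(n) if x != y)
    return total / 2

def equilibrium_potential(rate, P, Q):
    """Solve h = 1 on P, h = 0 on Q and (L h)(x) = 0 elsewhere, exactly."""
    n = len(rate)
    free = [x for x in range(n) if x not in P and x not in Q]
    col = {x: i for i, x in enumerate(free)}
    m = len(free)
    # Row for x: sum_y r(x, y) * (h(x) - h(y)) = 0, known values moved to the right.
    A = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for x in free:
        for y in range(n):
            if y == x:
                continue
            A[col[x]][col[x]] += rate[x][y]
            if y in col:
                A[col[x]][col[y]] -= rate[x][y]
            elif y in P:
                A[col[x]][m] += rate[x][y]
    # Gauss-Jordan elimination over the rationals.
    for c in range(m):
        pivot = next(r for r in range(c, m) if A[r][c] != 0)
        A[c], A[pivot] = A[pivot], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(m):
            if r != c:
                A[r] = [vr - A[r][c] * vc for vr, vc in zip(A[r], A[c])]
    h = [Fraction(1) if x in P else Fraction(0) for x in range(n)]
    for x in free:
        h[x] = A[col[x]][m]
    return h

# Hypothetical reversible chain 0 - 1 - 2 (detailed balance: mu[x] r[x][y] = mu[y] r[y][x]).
mu = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]
rate = [[0, 2, 0], [1, 0, 1], [0, 2, 0]]
h = equilibrium_potential(rate, P={0}, Q={2})
capacity = dirichlet_form(mu, rate, h)
```

In this toy chain the two edges carry conductance 1/2 each, so the capacity agrees with the series formula 1/(2 + 2) = 1/4.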
We explain the strategy used to prove this theorem in Section 3.2. Here, we conclude the proofs of Theorems 2.10 and 2.12 by assuming Theorem 3.1.
Proof of Theorem 2.10. By [2, Proposition 6.10], we have the following formula for the mean transition time: Using Theorem 2.2 and the fact that h β s, s̆ (s) = 1 and h β s, s̆ ≡ 0 on s̆, we can rewrite the last summation as where the identity follows from the trivial bound |h β s, s̆ | ≤ 1 (cf. (3.4)). Summing up the computations above and applying Theorem 3.1, we obtain We next address the second estimate of (2.12). Assume that the process σ β (·) starts at s and that s′ ≠ s. We define a sequence of stopping times (J n ) ∞ n=0 by J 0 = 0 and In other words, (J n ) ∞ n=0 is the sequence of random times at which the process σ β (·) visits a new ground state. By (3.13) and the strong Markov property, we have for all n ≥ 0 that (3.14) Then, we define n(s′) = inf {n ≥ 0 : σ β (J n ) = s′} so that τ s′ = J n(s′) ; thus, we can write Note that because we have assumed s′ ≠ s, it holds that n(s′) ≥ 1. By symmetry, we observe that n(s′) is a geometric random variable with success probability 1/(q − 1) that is independent of the sequence (J n ) ∞ n=0 . Thus, we get from (3.14) and (3.15) that Finally, from (3.7), (3.8), (3.9), and (3.10), we can easily see that κ satisfies the asymptotics (2.13). This completes the proof.
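The counting step in the proof above uses only the mean of a geometric random variable. As a small sketch (the function names are hypothetical), the mean number of ground-state changes needed to hit a fixed target, when each change hits the target with probability 1/(q − 1), equals q − 1; the second function cross-checks this against a partial sum of the defining series.

```python
from fractions import Fraction

def mean_number_of_visits(q):
    """Mean of a geometric variable on {1, 2, ...} with success probability 1/(q-1)."""
    p = Fraction(1, q - 1)
    # m = 1 + (1 - p) * m  (one attempt, then restart on failure)  =>  m = 1 / p.
    return 1 / p

def geometric_mean_partial_sum(q, terms=500):
    """Partial sum of sum_{n >= 1} n * p * (1 - p)**(n - 1), as a numerical cross-check."""
    p = 1.0 / (q - 1)
    return sum(n * p * (1 - p) ** (n - 1) for n in range(1, terms + 1))
```

Combined with the independence of the number of visits from the sequence of visiting times, this factor q − 1 is what relates the two mean hitting times treated in the proof.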
Next, we consider Theorem 2.12. The general methodology used to prove this type of Markov chain model reduction, based on potential-theoretic computations, was developed in [2,3]. Our proof also uses the potential-theoretic approach; however, the computation is slightly simpler because the metastable sets are singletons. Before stating the proof, we remark that two alternative approaches are available for the Markov chain model reduction in the context of metastability: an approach based on the Poisson equation [31,33,45,46], and one based on the resolvent equation [30,37].
Proof of Theorem 2.12. We first consider part (1). We denote by r tr β : S × S → [0, ∞) the transition rate of the trace process X β (·). In view of the rate (2.15) of the limiting Markov chain, it suffices to prove that r tr β (s, s′) = (1 + o β (1)) (1/κ) for all s, s′ ∈ S. Since r β (s, s′) does not depend on the choice of s, s′ ∈ S by the symmetry of the model, it remains to prove that We denote by E β s the law of the trace process X β (·) starting at s. Then, 1 r tr β (s,s) where the factor e −Γβ is included because we accelerated the process by the factor e Γβ when defining the trace process; the integrand 1{σ β (t) ∈ S} arises because the trace process is obtained from the accelerated process by turning off the clock when the process resides outside S. Then, by [2, Proposition 6.10], we can write where the second identity follows from the fact that h β s, s̆ (s) = 1 and h β s, s̆ ≡ 0 on s̆. Therefore, by Theorems 2.2 and 3.1, we obtain

Inserting this into (3.17) yields (3.16).
Next, we address part (2). Denote by P β µ β the law of the Metropolis-Hastings dynamics σ β (·) for which the initial distribution is µ β . Then, for any u > 0, we obtain where the final identity holds because µ β is the invariant distribution. Therefore, by the Fubini theorem, which vanishes as β → ∞ by Theorem 2.2.
(2) It holds that (3.20)
Remark 3.3. The following statements are remarks on the previous proposition.
(1) Since the (square root of the) Dirichlet form can be regarded as an H 1 -seminorm, by (3.19), the test function h approximates h β S(A), S(B) in the H 1 -sense.
Finally, provided that Proposition 3.2 holds, we prove Theorem 3.1.
Hence, to prove the main results given in Theorems 2.10 and 2.12, it remains to prove Proposition 3.2. The proof is given in Section 10.

Neighborhood of Configurations
In this section, we introduce several notions of neighborhoods of configurations, which are analogues of the same concepts defined in [25, Section 6.1]. These notions are used crucially in the characterization of the energy landscape and in the construction of test objects.
For c ∈ R, a path (ω t ) T t=0 in X is called a c-path if it is a path in the sense of Section 2.2 and, moreover, satisfies H(ω t ) ≤ c for all t ∈ ⟦0, T⟧. Moreover, we say that this path is in P ⊆ X if ω t ∈ P for all t ∈ ⟦0, T⟧.
(2) Let Q ⊆ X . For σ ∈ X such that σ ∉ Q, we define N̂ (σ ; Q) = {ζ ∈ X : there exists a Γ-path in X \ Q connecting σ and ζ} .
With this notation, by the definition of Γ, it holds that N (s) ∩ N (s′) = ∅ and N̂ (s) = N̂ (s′) for any s, s′ ∈ S. Moreover, in the spirit of the large deviation principle, the only configurations relevant to the study of metastability are those in N̂ (S). Hence, it is crucial to understand the structure of the set N̂ (S); this is the content of Proposition 9.6.
We conclude this section with an elementary lemma which will be used in several instances of our discussion. The proof is well explained in [25, Lemma A.1], and thus we omit the details.
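Neighborhoods defined through c-paths can be computed by a breadth-first search on very small lattices. The sketch below is only an illustration under stated assumptions: it uses a 3 × 3 torus with q = 2 (far below the lattice sizes assumed in the article), takes the Hamiltonian to be the number of nearest-neighbor pairs with different spins, and treats the energy threshold c as a free parameter. The names `energy` and `neighborhood` are hypothetical.

```python
from collections import deque
from itertools import product

K, L = 3, 3  # tiny illustrative torus; each side must be at least 3

def energy(cfg):
    """Number of nearest-neighbor pairs with different spins on the K x L torus."""
    h = 0
    for k, l in product(range(K), range(L)):
        s = cfg[k * L + l]
        h += s != cfg[((k + 1) % K) * L + l]
        h += s != cfg[k * L + (l + 1) % L]
    return h

def neighborhood(start, c, forbidden=frozenset()):
    """Configurations reachable from `start` by single-spin flips (spins 0/1)
    through configurations of energy at most c that avoid `forbidden`."""
    seen = {start}
    queue = deque([start])
    while queue:
        cfg = queue.popleft()
        for i in range(K * L):
            nxt = cfg[:i] + (1 - cfg[i],) + cfg[i + 1:]
            if nxt not in seen and nxt not in forbidden and energy(nxt) <= c:
                seen.add(nxt)
                queue.append(nxt)
    return seen

zeros = (0,) * (K * L)
ones = (1,) * (K * L)
```

For instance, `neighborhood(zeros, 5)` contains only the ground state and its nine single-flip neighbors, whereas a threshold equal to the total number of edges (18 here) returns the whole configuration space; passing `forbidden=frozenset({ones})` mimics the restriction to paths in X \ Q.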

Review of Two-dimensional Model
In this section, we recall some crucial 2D results on the energy landscape from [25, Sections 6, 7 and Appendices B, C], which are needed in our investigation of the 3D model. Since all the results that appear in the current section are proved in [25], we refer to the proofs therein.

Notation.
Greek letters η and ξ are used to denote the spin configurations of the 2D model, while letters σ and ζ are used to denote the 3D configurations. We use the superscript 2D to stress the notation for the 2D model; for example, we denote by H 2D (·) the Hamiltonian of the 2D model to distinguish it from H(·), which denotes the Hamiltonian of the 3D model.
5.1. 2D stochastic Ising and Potts models with periodic boundary conditions. We denote by Λ 2D = T K × T L the 2D lattice with periodic boundary conditions. Recall that S = {1, 2, . . . , q} denotes the set of spins, and denote by X 2D = S Λ 2D the space of spin configurations on the 2D lattice. Then, the 2D Ising/Potts Hamiltonian function H 2D : We denote by s 2D a , a ∈ S the 2D monochromatic configurations of spin a, that is, s 2D a (x) = a for all x ∈ Λ 2D . Then, it is straightforward that the ground states of this Hamiltonian are also the monochromatic configurations, i.e., the collection S 2D of the ground states is given as Here, Z 2D β is the 2D partition function with the property that (cf. [25, In the 2D model, we also consider the continuous-time Metropolis-Hastings dynamics whose transition rate is defined as This 2D stochastic Ising/Potts model is thoroughly analyzed in our companion article [25]. The remainder of this section presents a review of our analysis.
Then, by replacing Γ that appears in Definition 4.1 with Γ 2D , we get two types of neighborhoods N 2D and N̂ 2D for the 2D model. In this subsection, we explain a class of natural optimal transition paths that achieve this energy level. These paths are called canonical paths. To define these paths, we first define the so-called canonical configurations. We note that the constructions given here are a brief survey of [25, Section 6.2].
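The 2D model of this subsection can be sketched in a few lines. The code below rests on two assumptions that should be checked against the original displays: the Hamiltonian counts nearest-neighbor pairs with different spins, and the Metropolis-Hastings rate of a single-spin update is exp(−β max(H(new) − H(old), 0)). The names `hamiltonian_2d` and `metropolis_rate` are hypothetical.

```python
import math

def hamiltonian_2d(cfg):
    """Count nearest-neighbor pairs with different spins; cfg[k][l] lives on a
    K x L torus with K, L >= 3 (so that each edge is counted exactly once)."""
    K, L = len(cfg), len(cfg[0])
    return sum((cfg[k][l] != cfg[(k + 1) % K][l]) + (cfg[k][l] != cfg[k][(l + 1) % L])
               for k in range(K) for l in range(L))

def metropolis_rate(cfg, site, new_spin, beta):
    """Assumed rate exp(-beta * max(H(new) - H(old), 0)) of updating one spin."""
    k, l = site
    new = [row[:] for row in cfg]
    new[k][l] = new_spin
    increase = hamiltonian_2d(new) - hamiltonian_2d(cfg)
    return math.exp(-beta * max(increase, 0))

mono = [[1] * 4 for _ in range(4)]   # monochromatic configuration of spin 1
flipped = [row[:] for row in mono]
flipped[0][0] = 2                    # one spin changed: four broken bonds
```

Under these assumptions, leaving a monochromatic configuration by one flip costs a factor exp(−4β), while the reverse move has rate 1, which is the detailed-balance relation with respect to the Gibbs weights exp(−βH).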
Canonical configurations. The following notation is used throughout the article (also for the 3D model).
Notation 5.1. Suppose that N ≥ 2 is a positive integer.
We first introduce the pre-canonical configurations which are illustrated in Figure 5.1.
• For ℓ ∈ T L and v ∈ ⟦0, L⟧, we denote by ξ a, b ℓ, v ∈ X 2D the configuration whose spins are b on and a on the remainder.
• For ℓ ∈ T L , v ∈ ⟦0, L − 1⟧, k ∈ T K , and h ∈ ⟦0, K⟧, we denote by ξ a, b, + ℓ, v; k, h ∈ X 2D the configuration whose spins are b on and a on the remainder. Similarly, ξ a, b, − ℓ, v; k, h ∈ X 2D is the configuration whose spins are b on and a on the remainder. The configurations defined here are 2D pre-canonical configurations.
Based on this definition, the 2D canonical and regular configurations are defined.
Definition 5.3 (2D canonical and regular configurations). Fix a, b ∈ S. The definitions are slightly different for the case of K < L and the case of K = L.
Then, the collection of canonical configurations is given as Similarly, is called a 2D regular configuration.
• (Case K = L) Define an operator Θ : X 2D → X 2D as a transpose operator, i.e., Denote temporarily by C̃ a, b, 2D the collection C a, b, 2D defined in the case of K < L above. Then for a, b ∈ S, we define the collections of 2D canonical configurations between s 2D a and Similarly, we may define the collections
Canonical paths. Now, we explain natural optimal paths between monochromatic configurations (illustrated in Figure 5.2) that consist of canonical configurations.
(1) For P, P′ ∈ S L with P ≺ P′, a sequence (A k ) K k=0 of subsets of Λ 2D is a standard sequence connecting T K × P and T K × P′ if there exists an increasing sequence (
(2) A sequence (A n ) KL n=0 of subsets of Λ 2D is a standard sequence connecting ∅ and Λ 2D if there exists an increasing sequence (P ℓ ) L ℓ=0 in S L such that A Kℓ = T K × P ℓ for all ℓ ∈ ⟦0, L⟧, and furthermore for each ℓ ∈ ⟦0, L − 1⟧ the subsequence (A k )
(4) Moreover, a sequence (ω n ) KL n=0 of 2D configurations is called a canonical path (cf. Figure 5.2) connecting s 2D a and s 2D b if there exists a pre-canonical path ( ω̃ n ) KL n=0 such that
(a) (Case K < L) ω n = ω̃ n for all n ∈ ⟦0, KL⟧,
(b) (Case K = L) ω n = ω̃ n for all n ∈ ⟦0, KL⟧ or ω n = Θ( ω̃ n ) for all n ∈ ⟦0, KL⟧.
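A simple path of the kind described above can be traced numerically. The sketch below fills the torus with spin b one site at a time, row after row, and records the energy after each flip; this row-by-row order is assumed to be one particular instance of the construction, chosen for convenience, and the Hamiltonian is assumed to count nearest-neighbor pairs with different spins. All names are hypothetical.

```python
def energy(cfg):
    """Number of nearest-neighbor pairs with different spins; cfg[k][l] on a
    K x L torus with K, L >= 3."""
    K, L = len(cfg), len(cfg[0])
    return sum((cfg[k][l] != cfg[(k + 1) % K][l]) + (cfg[k][l] != cfg[k][(l + 1) % L])
               for k in range(K) for l in range(L))

def row_by_row_energies(K, L, a=1, b=2):
    """Energies along the path turning the all-a configuration into the all-b
    configuration, flipping the sites of row T_K x {l} in order, for l = 0, 1, ..."""
    cfg = [[a] * L for _ in range(K)]
    energies = [energy(cfg)]
    for l in range(L):
        for k in range(K):
            cfg[k][l] = b
            energies.append(energy(cfg))
    return energies

energies = row_by_row_energies(3, 4)
```

For K = 3 and L = 4 the recorded energies start and end at 0 and peak at 8 = 2K + 2, the level attained whenever a partially filled row sits on top of a completed strip.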
It holds that H 2D (η) ≤ 2K + 2 for all η ∈ C a, b, 2D and Moreover, the following lemma is immediate.
Comment on depth of valleys. We conclude this subsection with an application of Definition 5.4 and Lemma 5.5 that is crucially used later to calculate the 3D valley depths.
Lemma 5.6. Let η ∈ X 2D and a ∈ S. For any standard sequence (A k ) KL k=0 of sets connecting ∅ and Λ 2D and for n ∈ ⟦0, KL⟧, we define ω n ∈ X 2D as In Lemma 5.6, we have ω KL = s 2D a ∈ S 2D which implies that every η ∈ X 2D is connected to each ground state in S 2D with maximum energy H 2D (η) + Γ 2D . This fact implies that the maximum depth of valleys in the 2D energy landscape is Γ 2D .
It can be further proved that only the valleys containing the ground states have maximum depth Γ 2D , and all the other valleys have depth strictly less than Γ 2D . Indeed, this is a necessary condition for the pathwise approach to metastability; however, this level of precision is not needed in our investigation of the 3D energy landscape. Thus, we do not pursue this direction further and refer the interested readers to [40, Theorem 2.1-(ii)].
5.3. Saddle structure. The crucial configurations in the description of the saddle structure of the 2D model are the so-called typical configurations, which turn out to be the elements of the extended neighborhood N̂ 2D (S 2D ) (cf. Proposition 5.8 below). We present in Figure 5.3 an illustration of the saddle structure explained in this subsection.
• For a, b ∈ S, the collection of bulk typical configurations (between s 2D a and s 2D b ) is defined by Then, we write B 2D = ⋃ a, b∈S B a, b, 2D .
• Next, define Then, for a ∈ S, the collection of edge typical configurations with respect to s 2D a is defined by
Then, the following crucial proposition provides the picture of the saddle structure of the 2D model. We shall provide a similar result for the 3D model in Proposition 9.6.
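The depth bound discussed above can be checked numerically on small examples. The sketch below overwrites an arbitrary configuration η with spin a along a row-by-row sequence of sets (assumed here to be one admissible standard sequence) and records the energies; the Hamiltonian is assumed to count nearest-neighbor pairs with different spins, and Γ 2D is taken to be 2K + 2. The function names and the random test configurations are hypothetical.

```python
import random

def energy(cfg):
    """Number of nearest-neighbor pairs with different spins; cfg[k][l] on a
    K x L torus with K, L >= 3."""
    K, L = len(cfg), len(cfg[0])
    return sum((cfg[k][l] != cfg[(k + 1) % K][l]) + (cfg[k][l] != cfg[k][(l + 1) % L])
               for k in range(K) for l in range(L))

def overwrite_energies(eta, a):
    """Energies of omega_n, which equals a on the first n sites (row by row)
    and eta on the remaining sites."""
    K, L = len(eta), len(eta[0])
    cfg = [row[:] for row in eta]
    energies = [energy(cfg)]
    for l in range(L):
        for k in range(K):
            cfg[k][l] = a
            energies.append(energy(cfg))
    return energies

def random_configuration(K, L, q, seed):
    """Reproducible random configuration with spins in {1, ..., q}."""
    rng = random.Random(seed)
    return [[rng.randint(1, q) for _ in range(L)] for _ in range(K)]
```

The reason the bound holds in this sketch is elementary: bonds inside the overwritten region are unbroken, bonds outside it are those of η, and the interface between the two regions contains at most 2K + 2 bonds.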
Gateway configurations. Next, we introduce the gateway configurations.
Intuitively, this set is the collection of saddle configurations between R a, b, 2D 2 and s 2D a . Then, we recall the 2D gateway configurations [25,Section B.5]. The gateway between s 2D a and s 2D b is denoted as which is a decomposition of G a, b, 2D . A configuration belonging to G a, b, 2D is called a gateway configuration between s 2D a and s 2D b .
Here, G a, b, 2D is named the collection of gateway configurations because of the following lemma, which indicates that it indeed contains the saddle configurations between s 2D a and s 2D b .
Lemma 5.10 ([25, Lemma B.10]). For a, b ∈ S, suppose that two 2D configurations η and We note that the construction of regular, canonical, typical, and gateway configurations, as well as canonical paths for the 2D model, will be extended to the 3D model in the remainder of the article.

Test function.
We also recall the 2D test function defined in [25,Section 7]. Although the construction therein was carried out for both Ising and Potts models, we only need the objects for the Ising model in this article. Hence, in this subsection, we assume that q = 2.
Recall that we always assume K ≤ L. We recall a constant from [25, (4.13)], which plays the role of κ in the current article and also satisfies In [25, Definition 7.2], a test function h 2D : X 2D → R (corresponding to h of the 3D model introduced in Proposition 3.2) is constructed as an H 1 -approximation of the equilibrium potential between two ground states. We note that this function is crucially used in the construction of the 3D test function h. In the proof of Proposition 3.2, some estimates of h 2D are crucially used. The next estimate is used in the proof of (3.20).
The next one is crucially used in the proof of (3.19).
(2) We have that
5.5. Auxiliary results. In this subsection, we summarize two auxiliary results of the 2D model that are crucially used in our arguments.
Bridges, crosses and a bound on 2D Hamiltonian. For a configuration η ∈ X 2D , a bridge, which is a horizontal or vertical bridge, is a row or column, respectively, in which all spins are the same. If a bridge consists of spin a ∈ S, we call this bridge an a-bridge. Then, we denote by B a (η) the number of a-bridges with respect to η. A cross (resp. a-cross) is the union of a horizontal bridge and a vertical bridge (resp. a-bridges). With this notation, we have the following lower bound.
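The notions of bridges and crosses translate directly into code. In the minimal sketch below, a configuration is stored as `cfg[k][l]`, a horizontal bridge is taken to be a row T K × {l} and a vertical bridge a column {k} × T L ; this orientation convention and the function names are assumptions made for illustration.

```python
def count_bridges(cfg, a):
    """Number of rows and columns of cfg (indexed as cfg[k][l]) whose spins all equal a."""
    K, L = len(cfg), len(cfg[0])
    horizontal = sum(all(cfg[k][l] == a for k in range(K)) for l in range(L))
    vertical = sum(all(cfg[k][l] == a for l in range(L)) for k in range(K))
    return horizontal + vertical

def has_cross(cfg, a):
    """True if cfg contains both a horizontal and a vertical a-bridge."""
    K, L = len(cfg), len(cfg[0])
    horizontal = any(all(cfg[k][l] == a for k in range(K)) for l in range(L))
    vertical = any(all(cfg[k][l] == a for l in range(L)) for k in range(K))
    return horizontal and vertical

mono = [[1] * 4 for _ in range(3)]     # K = 3, L = 4, all spins equal to 1
striped = [row[:] for row in mono]
for k in range(3):
    striped[k][0] = 2                  # the row l = 0 becomes a 2-bridge
```

A monochromatic configuration has K + L bridges and a cross, whereas the striped configuration has a single 2-bridge, three 1-bridges and no cross of either spin.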
Characterization of configurations with low energy. Let a ∈ S. For η ∈ X 2D and σ ∈ X (a 3D configuration), we write The following proposition characterizes all the 2D configurations with energy less than Γ 2D .
Then, η satisfies exactly one of the following properties. (5.14)

Canonical Configurations and Paths
Analyzing the energy landscape of the 3D model is far more complex than analyzing that of the 2D model; below, we briefly list the main differences that complicate the problem.
(1) In the 2D model, the energy of the gateway configuration is either Γ 2D or Γ 2D −2. Thus, a Γ 2D -path on the gateway configurations does not have the freedom to move. On the other hand, in the 3D model, the energy of the gateway configuration ranges from Γ − 2K − 2 to Γ. This implies that the behavior of a Γ-path around a gateway configuration of energy Γ − 2K − 2 (which is a regular configuration) cannot be characterized precisely.
and finally arrives at s 2D b . Remarkably, this path does not need to visit a configuration in R a, b, 2D 1 and in R a, b, 2D L−1 ; this fact essentially arises from the features of the 2D geometry. In the 3D model, we observe a similar phenomenon. To explain this, let us temporarily denote by R a, b v , v ∈ ⟦1, L − 1⟧, the collection of 3D configurations such that there are v consecutive K × L slabs of spins b and such that the spins at the remaining sites are a. Then, there exists an integer n = n K, L, M such that any Γ-path connecting s a and s b must successively visit configurations in In the 2D model, the number corresponding to this n = n K, L, M is 2. We conjecture that in the 3D model, n ∼ K^{1/2}; however, we cannot determine the exact value of n. This fact reveals the complex structure of the energy landscape in the 3D model. Instead, we prove below (cf. Propositions 6.14 and 8.1) that Fortunately, this bound suffices to complete our analysis without identifying the exact value of n.
(3) In the 2D model, the N 2D -neighborhoods are fully characterized in Proposition 5.14; in the 3D case, by contrast, we cannot obtain such a specific and simple result. We overcome the absence of this result by suitably applying the 2D result obtained in Proposition 5.14 to the analysis of the 3D model. Indeed, this absence is a crucial difficulty in extending the analysis to four- or higher-dimensional models.
(4) Because of the aforementioned complexity of the energy landscape, the transition may encounter a dead-end with energy Γ, even in the bulk part of the transition; this is not the case in the 2D model. Therefore, another technical challenge is that of carefully characterizing these dead-ends and appropriately excluding them from the computation.
As explained above, the energy landscape of the 3D model is more complex than that of the 2D one, and we are unable to present a complete description of the energy landscape for the former. Nevertheless, we analyze the landscape with the precision required to prove our main results.
In Section 6, we introduce canonical configurations and paths. Their definitions are direct generalizations of those in the 2D model. Then, we explain several applications of these canonical objects. We first collect some notation that is frequently used throughout the remainder of the article. In cases where we are only concerned with the shape of the cluster of spin b (e.g., in Figure 6.2), we omit the dotted box representing Λ.
Notation 6.1. We refer to Figure 6.1 for an illustration of the notation below. (Footnote 4: In fact, this figure and all the 3D figures below contradict our assumption that K ≥ 2829. However, since these figures only provide simple illustrations of complicated notions, this should cause no confusion.)
If L = M , we can similarly define a bijection Θ (23) on X switching the second and third coordinates. Finally, for the case of K = L = M , we can even define the bijection Θ (13) on X switching the first and third coordinates.
(if the orange boxes represent the sites with spin b as in Figure 6.1), since the 2D configurations at the 7-th floor are 2D canonical configurations ξ a, b, + 6, 7; 5, 2 and ξ a, b, − 3, 4; 2, 6 , respectively.
Then, for A ⊆ X , we define Υ(A) as  (1) We first introduce some building blocks in the definition of canonical and gateway configurations. For a, b ∈ S and P, Q ∈ S M with P ≺ Q, we define C a, b P, Q ⊆ X as where the 2D objects are defined in Section 5.2. Then, we set C a, b P, Q = Υ( C a, b P, Q ) . A configuration belonging to C a, b for some a, b ∈ S is called a canonical configuration between s a and s b .
In view of the definition above, the role of the map Υ is clear.
6.2. Energy of canonical configurations. One can compute the energy of canonical configurations readily by elementary computations, but we provide a more systematic approach which will be frequently used in later computations. To this end, we first introduce some notation. The energy of the one-dimensional (1D) configuration σ k, ℓ is denoted by In the following lemma, we decompose the 3D energy into lower-dimensional ones.
Lemma 6.6. For each σ ∈ X , it holds that Proof. We can write H(σ) as The first and second lines correspond to the first and second terms on the right-hand side of (6.10), respectively.
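The decomposition of the 3D energy into lower-dimensional energies can be checked on a random configuration. The sketch below assumes that the identity has the form H(σ) = Σ over floors m of H 2D (σ (m) ) + Σ over pillars (k, ℓ) of H 1D (σ k, ℓ ), with each Hamiltonian counting nearest-neighbor pairs with different spins on the corresponding torus; the data layout `cfg[k][l][m]` and all function names are hypothetical.

```python
import random
from itertools import product

def energy_1d(spins):
    """Broken bonds of a cyclic 1D configuration (length >= 3)."""
    n = len(spins)
    return sum(spins[i] != spins[(i + 1) % n] for i in range(n))

def energy_2d(floor):
    """Broken bonds of a 2D configuration floor[k][l] on a torus (sides >= 3)."""
    K, L = len(floor), len(floor[0])
    return sum((floor[k][l] != floor[(k + 1) % K][l]) + (floor[k][l] != floor[k][(l + 1) % L])
               for k in range(K) for l in range(L))

def energy_3d(cfg):
    """Broken bonds of a 3D configuration cfg[k][l][m] on a torus (sides >= 3)."""
    K, L, M = len(cfg), len(cfg[0]), len(cfg[0][0])
    return sum((cfg[k][l][m] != cfg[(k + 1) % K][l][m])
               + (cfg[k][l][m] != cfg[k][(l + 1) % L][m])
               + (cfg[k][l][m] != cfg[k][l][(m + 1) % M])
               for k, l, m in product(range(K), range(L), range(M)))

def decomposed_energy(cfg):
    """Sum of the 2D energies of all floors plus the 1D energies of all pillars."""
    K, L, M = len(cfg), len(cfg[0]), len(cfg[0][0])
    floors = sum(energy_2d([[cfg[k][l][m] for l in range(L)] for k in range(K)])
                 for m in range(M))
    pillars = sum(energy_1d(cfg[k][l]) for k in range(K) for l in range(L))
    return floors + pillars

rng = random.Random(0)
sigma = [[[rng.randint(1, 3) for _ in range(5)] for _ in range(4)] for _ in range(3)]
```

The two computations agree because every bond of the 3D torus is either horizontal (inside one floor) or vertical (inside one pillar).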
Based on the previous expression, we deduce the following proposition.
Proposition 6.7 (Energy of canonical configurations). The following properties hold.
Remark 6.8. In particular, we have H(σ) = Γ − 2K − 2 = 2KL for any σ ∈ R A, B i , i ∈ ⟦1, M − 1⟧. Hence, a Γ-path at a regular configuration can evolve in a non-canonical way, since there is still a margin of 2K + 2 before reaching the energy barrier Γ. Incorporating all these behaviors in the metastability analysis is a demanding part of the 3D model, and for this reason the regular configurations play a crucial role. We remark that in the 2D case [7,25,40], an optimal path at a regular configuration has no such freedom, which considerably simplifies the arguments.
6.3. Canonical paths. In this subsection, we define 3D canonical paths between ground states. They generalize the 2D paths recalled in Definition 5.4. Refer to Figure 6.3 for an illustration.
• for each i ∈ ⟦0, M − 1⟧, there exists a 2D canonical path ( for all t ∈ ⟦KLi, KL(i + 1)⟧.
If K < L < M , a path is called a canonical path if it is a pre-canonical path. If K = L < M , a path is called a canonical path if it is either a pre-canonical one or the image of a pre-canonical one with respect to the map Θ (12) . We can define canonical paths for the cases of K < L = M and K = L = M in a similar manner.
Remark 6.10. We emphasize that for a canonical path (ω t ) KLM t=0 , all configurations ω t , t ∈ 0, KLM , are canonical configurations, and hence any canonical path is a Γ-path by part (1) of Proposition 6.7.
Canonical paths provide optimal paths between two ground states, and hence we can confirm the following upper bound for the energy barrier.
Proposition 6.11. For s, s′ ∈ S, we have that Φ(s, s′) ≤ Γ.
Proof. By Remark 6.10, it suffices to take a canonical path connecting s and s′.
We prove Φ(s, s′) ≥ Γ in Section 8, which verifies Φ(s, s′) = Γ. This reverse inequality requires a much more complicated proof.
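The upper bound can be observed numerically on a small torus. The sketch below follows one simple floor-by-floor, row-by-row path from the all-a to the all-b configuration, which is assumed to be an instance of the canonical construction, and records the energy after each flip. It assumes the Hamiltonian counts nearest-neighbor pairs with different spins and uses Γ = 2KL + 2K + 2, as read off from the identity Γ − 2K − 2 = 2KL of Remark 6.8. Function names are hypothetical.

```python
from itertools import product

def energy_3d(cfg):
    """Broken bonds of a 3D configuration cfg[k][l][m] on a torus (sides >= 3)."""
    K, L, M = len(cfg), len(cfg[0]), len(cfg[0][0])
    return sum((cfg[k][l][m] != cfg[(k + 1) % K][l][m])
               + (cfg[k][l][m] != cfg[k][(l + 1) % L][m])
               + (cfg[k][l][m] != cfg[k][l][(m + 1) % M])
               for k, l, m in product(range(K), range(L), range(M)))

def floor_by_floor_energies(K, L, M, a=1, b=2):
    """Energies along the path that fills floor m = 0, 1, ... with spin b,
    each floor row by row (rows T_K x {l}), one site at a time."""
    cfg = [[[a] * M for _ in range(L)] for _ in range(K)]
    energies = [energy_3d(cfg)]
    for m in range(M):
        for l in range(L):
            for k in range(K):
                cfg[k][l][m] = b
                energies.append(energy_3d(cfg))
    return energies

def gamma(K, L):
    """Energy barrier 2KL + 2K + 2, as inferred from Remark 6.8 (assumption)."""
    return 2 * K * L + 2 * K + 2
```

Along this path, the configurations reached after each completed floor (except the last) have energy 2KL, and the maximum 2KL + 2K + 2 is attained while an intermediate floor is partially filled.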
6.4. Characterization of the deepest valleys. We show in this subsection, using the canonical paths, that the valleys in the energy landscape, except for the ones associated with the ground states, have depths less than Γ. Note that Theorem 2.4, although not yet proved, indicates that the valleys associated with the ground states have depth Γ. This characterization of the depths of the other valleys is essential, since we have to rule out the possibility of being trapped in a deeper valley in the course of a transition. This fact is crucially used in the application of the pathwise approach to metastability.
Notation 6.12. For convenience of notation, we call (ω t ) T t=0 a pseudo-path if either ω t ∼ ω t+1 or ω t = ω t+1 for all t ∈ ⟦0, T − 1⟧.
Proposition 6.13. For σ ∈ X \ S, we have Proof. The main idea of the proof is inherited from the proof of [40, Theorem 2.1]. Let us find two spins a, b ∈ S so that σ has spins a and b at some sites, which is clearly possible since σ ∉ S. Let us fix a canonical path (ω t ) KLM t=0 connecting s a and s b . Then, we write in a way that Now, we define a pseudo-path (cf. Notation 6.12) ( ω̃ t ) KLM t=0 connecting σ and s b as In other words, we update the spins in exactly the same manner as in the canonical path (ω t ) KLM t=0 . We claim that It is immediate that this claim concludes the proof. To prove this claim, we recall the decomposition obtained in Lemma 6.6 and write ω̃ t = ζ. Then, we can write H(ζ) − H(σ) as Let us first consider the first summation of (6.13). We suppose that t ∈ ⟦KLi, KL(i + 1)⟧ and write ω KLi = σ a, b P and ω KL(i+1) = σ a, b Q where P ≺ Q. Write Q \ P = {m }. Then, we have that since ζ (m) = s 2D b for m ∈ P and ζ (m) = σ (m) for m ∈ Q c . On the other hand, by Lemma 5.6, we have that By (6.14) and (6.15), we conclude that Now, we turn to the second summation of (6.13). Note that ζ k, ℓ is obtained from σ k, ℓ by flipping the spins in consecutive sites in Q to b. From this, we can readily deduce that Moreover, if x 0 = (k 0 , ℓ 0 , m 0 ), we can check that By (6.17) and (6.18), we get Now, the claim (6.12) follows from (6.13), (6.16), and (6.19).
6.5. Auxiliary result on saddle configurations. In the 2D case, in the analysis of the energy landscape, the collection R 2D 2 plays a significant role since to make an optimal transition (not exceeding the energy barrier 2K + 2), we may skip the collection R 2D 1 but must pass through R 2D 2 . Thus, the integer 2 worked as some kind of a threshold for metastable transitions. We expect a similar pattern in the 3D case, and we briefly explain this phenomenon in this subsection.
Let us define m K = K^{2/3}. We strongly believe that this quantity does not depend on M , but we do not have a proof for it at the moment. Note that this number was just 2 in the 2D case. In the 3D model, we do not know this number exactly, since non-canonical movements at the early stage of transitions are hard to characterize. However, the upper bound n K, L, M ≤ m K = K^{2/3} obtained from (6.21) is enough for our purpose, as we shall see later.
The main result of this subsection is the corresponding lower bound. This result will not be used in the proofs later, but emphasizes the complexity of the energy landscape near ground states.
We fix such an n and write σ = σ 1, 2 1, n . We now construct an explicit path from σ to s 1 without exceeding the energy Γ − 2. Note that σ 1, 2 1, n has spins 2 at T K × T L × ⟦1, n⟧ and spins 1 at all the other sites. In this proof, we regard T K = ⟦1, K⟧ and T L = ⟦1, L⟧ in order to simplify the explanation of the order of spin flips in a lexicographic manner.
Here, each n × n matrix represents {i} × ⟦1, n⟧ × ⟦1, n⟧ for 1 ≤ i ≤ K, in which the numbers represent the variation of the energy which should be read in ascending lexicographic order. From this path, we obtain where the maximum of the energy is obtained right after flipping the spin at (2, 1, n − 1), which is denoted by bold font in the matrices above.
• Next, starting from ζ, we change spins 2 to 1 in ⟦1, K⟧ × {i} × ⟦1, n⟧ in the ascending lexicographic order for i ∈ ⟦n + 1, L − 1⟧, from i = n + 1 to i = L − 1. Denote by ζ′ ∈ X the obtained spin configuration, which has spins 2 only on ⟦1, K⟧ × {L} × ⟦1, n⟧. In each step, the variation of the Hamiltonian is represented by the n × K matrix Since H(ζ) = 2KL, we can verify that where the maximum is obtained right after flipping the spin at (1, n + 1, 1) (cf. bold font +2).
• Finally, starting from ζ′, we change spins 2 to 1 in the ascending lexicographic order. The variation of the Hamiltonian is represented by Hence, the Hamiltonian monotonically decreases from H(ζ′) = 2K(n + 1) to arrive at H(s 1 ) = 0. Hence, we have Φ(ζ′, s 1 ) ≤ 2K(n + 1) . (6.25)
Therefore, by (6.23), (6.24), and (6.25), we have Φ(σ, s 1 ) ≤ 2KL + 2n^2 + 2n − 2 .
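The last stage of the path above is easy to simulate on a small torus. The sketch below starts from the configuration with spins 2 on a bar of K × 1 × n sites (all k, the last value of the second coordinate, the first n floors) and spins 1 elsewhere, and flips the bar back to spin 1 in ascending lexicographic order. The Hamiltonian is assumed to count nearest-neighbor pairs with different spins; the function names, the zero-based indexing, and the small values of K, L, M, n are hypothetical choices for illustration.

```python
from itertools import product

def energy_3d(cfg):
    """Broken bonds of a 3D configuration cfg[k][l][m] on a torus (sides >= 3)."""
    K, L, M = len(cfg), len(cfg[0]), len(cfg[0][0])
    return sum((cfg[k][l][m] != cfg[(k + 1) % K][l][m])
               + (cfg[k][l][m] != cfg[k][(l + 1) % L][m])
               + (cfg[k][l][m] != cfg[k][l][(m + 1) % M])
               for k, l, m in product(range(K), range(L), range(M)))

def final_stage_energies(K, L, M, n):
    """Energies while the bar {all k} x {L-1} x {0, ..., n-1} of spins 2 is
    flipped back to spin 1, with k as the outer index and the floor as the inner one."""
    cfg = [[[1] * M for _ in range(L)] for _ in range(K)]
    for k in range(K):
        for m in range(n):
            cfg[k][L - 1][m] = 2
    energies = [energy_3d(cfg)]
    for k in range(K):
        for m in range(n):
            cfg[k][L - 1][m] = 1
            energies.append(energy_3d(cfg))
    return energies

stage = final_stage_energies(4, 4, 5, 2)
```

In this example the energy starts at 2K(n + 1) = 24, never increases along the way, and ends at 0, in line with the monotonicity used to obtain (6.25).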

Gateway Configurations
In the analysis of the 3D model, a crucial notion is the concept of gateway configurations. The gateway configurations of the 3D model play a far more significant role than those of the 2D model.
We fix a proper partition (A, B) of S throughout this section.
7.1. Gateway configurations. We refer to Figure 7.1 for an illustration of gateway configurations defined below.
Definition 7.1 (Gateway configurations). For a, b ∈ S and P, Q ∈ S M with P ≺ Q, we define G a, b P, Q ⊆ C a, b P, Q as where G a, b, 2D is defined in Definition 5.9. Then, we define (cf. Notation 6.2) Then, recall m K from (6.20) and define, for i ∈ 0, M − 1 , Notice that the crucial difference between (7.1) and (6.4) is the fact that the second union in (7.1) is taken only over i ∈ m K − 1, M − m K . This is related to (6.21), and we give a more detailed reasoning in Section 7.2. A configuration belonging to G a, b for some a, b ∈ S is called a gateway configuration.
P, Q ), n ∈ {1, 2, 3}. A configuration σ ∈ G A, B is called a gateway configuration of type n, n ∈ {1, 2, 3}, if σ ∈ G a, b, [n] P, Q for some a ∈ A, b ∈ B and P, Q ∈ S M with P ≺ Q.
The following proposition follows directly from the definition of gateway configurations.
Proof. Let σ ∈ G a, b P, Q for some a ∈ A, b ∈ B and P, Q ∈ S M with P ≺ Q, Q \ P = {m 0 }, and |P | ∈ ⟦m K − 1, M − m K ⟧. Then, by Lemma 6.6, we can write since H 2D (σ (m) ) = 0 for all m ≠ m 0 and H 1D (σ k, ℓ ) = 2 for all k ∈ T K and ℓ ∈ T L . Hence, by definition, we have Since the Hamiltonian is invariant under Υ, the proof is completed.
7.2. Properties of gateway configurations. Next, we investigate several crucial properties of the gateway configurations which will be used frequently in the following discussions. The following notation will be useful in the remaining parts of the article.

Notation 7.4. For any integers
where K ∈ {C, G, R}. In particular, by (7.1) and (7.2), we can write In this section, we focus on the relation between gateway configurations and neighborhoods of regular configurations. We refer to Figure 7.2 for an illustration of the relations obtained in the current subsection.
The first one below states that we have to escape from a gateway configuration via a neighborhood of regular configurations, unless we touch a configuration with energy higher than Γ. (1) For a ∈ A, b ∈ B, and i ∈ m K − 1, M − m K , we suppose that σ ∈ G a, b i and ζ ∈ X \ G a, b i satisfy σ ∼ ζ and H(ζ) ≤ Γ. Then, we have ζ ∈ N (R a, b [i, i+1] ), and moreover σ is a gateway configuration of type 3.
(2) Suppose that σ ∈ G A, B and ζ ∈ X \ G A, B satisfy σ ∼ ζ and H(ζ) ≤ Γ. Then, we have , and moreover σ is a gateway configuration of type 3.
Proof. We first suppose that σ ∈ G a, b P, Q and ζ ∈ X \ G a, b P, Q for some a ∈ A, b ∈ B and P, Q ∈ S M with P ≺ Q and |P | ∈ ⟦m K − 1, M − m K ⟧. We write Q \ P = {m 0 }. Then, we claim that ζ ∈ N ({σ a, b P , σ a, b Q }), and σ is of type 3. Let us first show that σ is a gateway configuration of type 3. If σ is of type 1, then we have H(σ) = Γ − 2, H 2D (σ (m 0 ) ) = 2K, and σ (m 0 ) ∈ B a, b, 2D . To update a spin in σ without increasing the energy by 3 or more, it can be readily observed that we have to update a spin of σ at the m 0 -th floor to get ζ with H 2D (ζ (m 0 ) ) ≤ 2K + 2. In such a situation, Lemma 5.10 asserts that σ (m 0 ) ∉ B a, b, 2D and we get a contradiction. A similar argument can be applied if σ is of type 2, and hence we can conclude that σ is of type 3. Now, since σ is of type 3, we have H(σ) = Γ, H 2D (σ (m 0 ) ) = 2K + 2, and σ (m 0 ) ∈ Z a, b, 2D ∪ Z b, a, 2D (cf. (5.9)). In order not to increase the energy by flipping a site of σ, it is clear that we have to flip a spin at the m 0 -th floor (cf. Figure 7.1). This means that, by Lemma 5.10, . Now, we suppose first that ζ (m 0 ) ∈ N 2D (s 2D a ). Then, there exists a 2D (2K + 1)-path (ω t ) T t=0 in X 2D = S Λ 2D such that ω 0 = s 2D a and ω T = ζ (m 0 ) . Define a 3D path ( ω̃ t ) T t=0 as Then, ( ω̃ t ) T t=0 is a (Γ − 1)-path connecting σ a, b P and ζ, and thus we get ζ ∈ N (σ a, b P ). Similarly, we can deduce that ζ (m 0 ) ∈ N 2D (s 2D b ) implies ζ ∈ N (σ a, b Q ). This concludes the proof of the claim. Now, we return to the lemma. For part (1), suppose that σ ∈ G a, b P, Q for some a ∈ A, b ∈ B and P, Q ∈ S M with |P | = i ∈ ⟦m K − 1, M − m K ⟧ and P ≺ Q. If σ ∈ G a, b P, Q , then by the claim above, we get , and moreover σ is a gateway configuration of type 3. On the other hand, if σ ∈ Θ( G a, b P, Q ) for some permutation operator Θ that appears in Notation 6.2, then by the same logic as above, we obtain that , and that σ is a gateway configuration of type 3.
This completes the proof of part (1). Part (2) follows directly from part (1). Since ω t 0 −1 ∈ G A, B , ω t 0 ∉ G A, B , and ω t 0 −1 ∼ ω t 0 , by Lemma 7.5, we have

Next, we establish a relation between G
. This contradicts the fact that (ω t ) T t=0 is a path in X \ N (R A, B [0, M ] ).

Energy Barrier between Ground States
The main objective of the current section is to analyze the energy barrier and optimal paths between ground states. In this section, we fix a proper partition (A, B) of S. The main result of the current section is the following result regarding the energy barrier between the ground states.
Proposition 8.1. The following statements hold.
(2) Let (ω t ) T t=0 be a path in X \ G A, B connecting S(A) and S(B). Then, there exists t ∈ ⟦0, T⟧ such that H(ω t ) ≥ Γ + 1.
Part (1) of the previous proposition provides the bound opposite to that of Proposition 6.11 and hence completes the characterization of the energy barrier. Moreover, part (2) verifies that any optimal path connecting S(A) and S(B) must visit a gateway configuration between them. Before proceeding further, we formally conclude the proof of Theorem 2.4, assuming Proposition 8.1.
Proof of Theorem 2.4. The conclusion of the theorem holds by Proposition 6.11 and part (1) of Proposition 8.1.
We provide the proof of Proposition 8.1 in Sections 8.1 and 8.2. Then, in Section 8.3, we prove the large deviation-type results, namely Theorem 2.7, based on the analysis of the energy landscape carried out so far.

8.1. Preliminary analysis on energy landscape. The purpose of this subsection is to provide a lemma (cf. Lemma 8.3 below) regarding the communication height between two far away configurations, which will be the crucial tool in the proof of Proposition 8.1.
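To make the notion of communication height concrete, the following minimal sketch computes it by brute force for a toy 2D Ising model on a 3 × 3 torus, which is far below the regime K ≥ 2829 treated here. It assumes that the Hamiltonian counts the nearest-neighbor pairs carrying different spins and that paths proceed by single-spin updates; the function names `energy` and `communication_height` are illustrative and do not come from the paper.

```python
import heapq


def energy(config, K, L):
    """Number of nearest-neighbor pairs with different spins on the K x L torus.

    Assumes K, L >= 3 so that the 'down' and 'right' neighbors enumerate
    every edge of the torus exactly once.
    """
    h = 0
    for k in range(K):
        for l in range(L):
            s = config[k * L + l]
            h += s != config[((k + 1) % K) * L + l]   # vertical edge
            h += s != config[k * L + (l + 1) % L]     # horizontal edge
    return h


def communication_height(start, target, K, L, q=2):
    """Minimum over single-spin-update paths of the maximal energy on the path.

    Implemented as a bottleneck (minimax) variant of Dijkstra's algorithm
    over the full configuration space, so it is only feasible for tiny lattices.
    """
    best = {start: energy(start, K, L)}
    heap = [(best[start], start)]
    while heap:
        height, sigma = heapq.heappop(heap)
        if sigma == target:
            return height
        if height > best[sigma]:
            continue  # stale heap entry
        for i in range(K * L):
            for a in range(1, q + 1):
                if a == sigma[i]:
                    continue
                zeta = sigma[:i] + (a,) + sigma[i + 1:]
                h = max(height, energy(zeta, K, L))
                if h < best.get(zeta, float("inf")):
                    best[zeta] = h
                    heapq.heappush(heap, (h, zeta))
    return None


if __name__ == "__main__":
    s1, s2 = (1,) * 9, (2,) * 9
    print(communication_height(s1, s2, 3, 3))  # prints 8
```

For this toy lattice the search returns 8, which coincides with 2K + 2 for K = 3. This is only a sanity check on a small example and says nothing about the 3D barrier Γ, whose analysis requires the arguments of this section.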
Before proceeding to this result, we first introduce a lower bound on the Hamiltonian H which will be used frequently in the remaining computations of the current section. For σ ∈ X and a ∈ S, denote by D a (σ) ⊆ T K × T L the collection of monochromatic pillars in σ of spin a:

Proof. Since H 1D (σ k, ℓ ) = 0 if (k, ℓ) ∈ D(σ) and H 1D (σ k, ℓ ) ≥ 2 otherwise, we have that Hence, we can deduce (8.2) from Lemma 6.6. The conclusion on the equality condition is immediate from the argument above.

Now, we proceed to the main result of this subsection. For simplicity of notation, we write, for a ∈ S, so that we have the following natural decomposition of the set X 2D : Note that the set ∆ 2D is non-empty by the definition of N 2D . Recall m K ∈ N from (6.20). The following lemma, which is the main technical result in the analysis of the energy landscape, asserts that we have to overcome an energy barrier of Γ in order to change a 2D configuration at a certain floor from a neighborhood of a ground state to a neighborhood of another ground state.
Lemma 8.3. Suppose that a, b ∈ S. Moreover, let U and V be two disjoint subsets of T M satisfying |U |, |V | ≥ m K , and let σ ∈ X be a configuration satisfying Suppose that another configuration ζ ∈ X satisfies either ζ (m) ∈ V a 1 for some m ∈ U and a 1 ≠ a or ζ (m) ∈ V b 1 for some m ∈ V and b 1 ≠ b. Finally, we assume that σ satisfies Then, both of the following statements hold.
Proof. We first consider part (1). Let (ω t ) T t=0 be a path connecting σ and ζ. For convenience of notation, we define a collection (c m ) m∈U ∪V such that Then, we define where the existence of t ∈ 1, T − 1 such that H 2D (ω (m) t ) / ∈ V cm for some m ∈ U ∪ V is guaranteed by the conditions on σ and ζ. Now, we find m 0 ∈ U ∪ V such that By the definitions of V a and T 0 , we have that If H(ω T 0 ) ≥ Γ, there is nothing to prove. Hence, let us assume from now on that Then, by Lemma 8.2 with σ = ω T 0 and by recalling the definition (8.1) of d(σ), we have Since we get a contradiction to (8.9) if D n (ω T 0 ) = ∅ for all n ∈ S, there exists n 0 ∈ S such that D n 0 (ω T 0 ) = ∅. Suppose first that n 0 ∈ S \ {b}. For this case, we claim that Assume not, so that we have ω T 0 cannot be s 2D n 0 as b = n 0 . Therefore, we verified (8.12). Similarly, if n 0 ∈ S \ {a}, we obtain Since either (8.12) or (8.13) must happen, and since |U |, |V | ≥ m K , we get from (8.9) and (8.11) that 2 n∈S d n (ω T 0 ) + 2K + 2 > (2K + 2) + 4(m K − 1) , (8.14) and hence Thus, we have either Then for K satisfying the condition in Theorem 2.4, we have m K ≥ 200 and thus by the condition (8.6), we can take T 1 < T 0 such that We first suppose that n∈S\{a} d n (ω T 1 ) = h 2 K . Since (cf. (5.13)) ω (m) we can assert from (8.17) and (L2), (L3) of Proposition 5.14 that Therefore, by Lemma 8.2 with σ = ω T 1 , the definition of T 1 , and (8.18), we get where the last inequality holds for K ≥ 32. Of course, we get the same conclusion for the case of n∈S\{b} d n (ω T 1 ) = h 2 K by an identical argument. Therefore, we can conclude that H(ω T 1 ) > Γ, and thus part (1) is verified. Now, we turn to part (2). We now assume that, for some σ and ζ satisfying the assumptions of the lemma, there exists a path (ω t ) T t=0 in X \ G a, b connecting σ and ζ with H(ω t ) ≤ Γ for all t ∈ 0, T . 
(8.19) Without loss of generality, we can assume that the triple (σ, ζ, (ω t ) T t=0 ) that we selected has the smallest path length T among all such triples.
Recall T 0 from the proof of the first part. If D n (ω T 0 ) ≠ ∅ for some n ∈ S, we can repeat the argument of part (1) to deduce H(ω T 1 ) > Γ, where T 1 is defined in (8.16). This contradicts (8.19).
Next, we consider the case when D n (ω T 0 ) = ∅ for all n ∈ S. The contradiction for this case is more involved than that of the corresponding case of part (1). By Lemma 8.2, we have that From these observations, we can deduce the following facts: • By (8.22), (8.23), and Lemma 8.2, we have H(ω T 0 ) = Γ.
• By (8.21) and (8.23), we have ω (m) Moreover, the spins must be aligned so that (8.23) holds. Without loss of generality, we assume that m 0 ∈ U , since the case m 0 ∈ V can be handled in an identical manner. Starting from ω T 0 , suppose that we flip a spin at the m-th floor, m ≠ m 0 , without decreasing the 2D energy of the m 0 -th floor. Then, since each non-m 0 -th floor is monochromatic and (8.23) holds, the 3D energy of σ increases by at least four and we obtain a contradiction to the fact that (ω t ) T t=0 is a Γ-path. Thus, we must decrease the 2D energy of the m 0 -th floor before modifying the other floors. Define Then, by Proposition 5.14, it suffices to consider the following two cases: ∈ V a , then we obtain a contradiction from the minimality of the length of (ω t ) T t=0 , as we have a shorter path from ω T 2 to ζ where ω T 2 clearly satisfies the conditions imposed on σ.
Because there are exactly 2K such (k, ℓ), by Lemma 8.2, we have where in the first inequality we used the fact that H 2D (ω (m 0 ) T 2 ) = 2K. This contradicts the fact that (ω t ) T t=0 is a Γ-path. Therefore, we must have b′ = b, which implies along with (8.23) that ω T 2 ∈ G a, b . Hence, we get a contradiction as we assumed that (ω t ) T t=0 is a path in X \ G a, b .
Since we get a contradiction for both cases, we completed the proof of part (2).
Remark 8.4. We remark that (8.16) is exactly the place from which the lower bound 2829 of K in Theorem 2.4 originates.
The following is a direct consequence of the previous lemma which will be used later.
Corollary 8.5. Suppose that P, Q ∈ S M and |P | ∈ ⟦m K , M − m K⟧. Then for a, b ∈ S, we have Φ(σ a, b P , σ a, b Q ) = Γ. In particular, we have Φ(σ a, b P , s a ) = Γ.
Proof. We can apply Lemma 8.3 with σ = σ a, b P and ζ = σ a, b Q to get Φ(σ a, b P , σ a, b Q ) ≥ Γ. On the other hand, by taking a canonical path connecting s a and σ a, b P , we get Φ(s a , σ a, b P ) ≤ Γ. Similarly, we get Φ(s a , σ a, b Q ) ≤ Γ. Hence, we obtain Φ(σ a, b P , σ a, b Q ) ≤ max{Φ(σ a, b P , s a ), Φ(s a , σ a, b Q )} ≤ Γ.

We are now ready to prove Proposition 8.1. We first prove this proposition when q = 2. Then, the general case can be verified from this result via a projection-type argument.
Proof of Proposition 8.1: q = 2. Since q = 2, we only have two spins 1 and 2 and hence we let s = s 1 and s′ = s 2 . We fix an arbitrary path (ω t ) T t=0 connecting s and s′ , and take σ ∈ (ω t ) T t=0 such that σ 1 = KLM/2 + 1 . (8.27)
Since there is nothing to prove if H(σ) ≥ Γ + 1, we assume that H(σ) ≤ Γ. (8.28) Then, we claim that there exists t ∈ ⟦0, T⟧ such that H(ω t ) = Γ. Moreover, we claim that if (ω t ) T t=0 is a path in X \ G 1, 2 , there exists t ∈ ⟦0, T⟧ such that H(ω t ) = Γ + 1. It is clear that a verification of these claims immediately proves the case of q = 2.
We recall the decomposition (8.5) of X 2D and write so that T M can be decomposed into T M = P 1 ∪ P 2 ∪ R. Write p 1 = |P 1 |, p 2 = |P 2 |, and r = |R| so that the previous decomposition of T M implies We also write d 1 = d 1 (σ), d 2 = d 2 (σ), and d = d(σ) so that d = d 1 + d 2 . The following facts are crucially used: • By Lemma 8.2 and (8.28), it holds that where the first two bounds follow from (L2) and (L3) of Proposition 5.14, while the last one follows from (L1) of Proposition 5.14. • By inserting (8.31) to (8.30), we get We consider four cases separately based on the conditions on p 1 , p 2 , and r. Recall that we assumed K ≥ 2829; several arguments below require K to be large enough, and they indeed hold for K in this range.
(Case 2: p 1 ≥ 1, p 2 = 0, r ≥ 1 or p 1 = 0, p 2 ≥ 1, r ≥ 1) By symmetry, it suffices to consider the former case. As in (Case 1), we can apply the first bound in (8.31) to deduce Again by the first bound in (8.31), we have for all m ∈ P 1 , and thus we get Therefore, there exists m 0 ∈ R such that where in the second line we used p 1 = M − r. Thus, we have Inserting this into (8.32), we get Reorganizing and applying a similar estimate as in (8.34), we get Now, we analyze two sub-cases separately.
• p 1 ≤ (2K + 1)/8: Then, we can rewrite (8.40) as Multiplying r/K in both sides, we reorganize the previous inequality as Since p 1 ≤ (2K + 1)/8, we have Multiplying both sides by r/K and reorganizing, we get Since the right-hand side is negative for K ≥ 9, we get a contradiction.
(Case 4: p 1 = p 2 = 0) For this case, we have σ (m) ∈ ∆ 2D for all m ∈ T M . Hence, H 2D (σ (m) ) ≥ 2K for all m ∈ T M by (L1) of Proposition 5.14, and thus by (8.30) we get If this is KL, then all floors should have the same configuration, which is impossible since σ 1 = KLM/2 + 1 cannot be a multiple of M . If this is KL − 1, then the equality in (8.43) must hold and thus we have H 2D (σ (m) ) = 2K for all m ∈ T M . Hence, by (L1) of Proposition 5.14, σ (m) 1 , m ∈ T M , is a multiple of K, and thus σ 1 = m∈T M σ (m) 1 is also a multiple of K. This is impossible since σ 1 = KLM/2 + 1 is not a multiple of K.
It remains to consider the case of M = L. For this case, (8.43) becomes so that we have |E(σ)| ≤ K + 1 by (8.44). We now have three sub-cases. We note that H 2D (σ (m) ) is an even integer for each m ∈ T M , as q = 2.
• First, we assume that H 2D (σ (m) ) = 2K for all m ∈ T M . Then, as in the previous discussion on the case of M = L + 1, we get a contradiction since σ 1 must be a multiple of K for this case.
• Next, we assume that H 2D (σ (m) ) ≥ 2K + 2 for all m ∈ T M . Then, by (8.30), Hence, we have |E(σ)| = KL − d = 1. Write E(σ) = {(k 0 , ℓ 0 )}. By Lemma 5.13, we can deduce that the configuration σ (m) has at least L − 1 ≥ 3 monochromatic bridges, and thus we have at least one monochromatic bridge of the form T K × {ℓ} or {k} × T L that does not touch E(σ), so that it is a subset of either D 1 (σ) or D 2 (σ). Suppose first that this bridge is T K × {ℓ} for some ℓ ∈ T L . Then, the slab T K × {ℓ} × T M is monochromatic. Therefore, by interchanging the roles of the second and third coordinates, which is possible since K = L = M , the proof is reduced to one of (Case 1), (Case 2), and (Case 3) as there is a monochromatic floor so that either p 1 or p 2 is positive. This completes the proof. Similarly, if the monochromatic bridge is {k} × T L , then we interchange the roles of the first and third coordinates to complete the proof.
• Lastly, we assume that H 2D (σ (i 0 ) ) = 2K for some i 0 ∈ T M and H 2D (σ (j 0 ) ) ≥ 2K + 2 for some j 0 ∈ T M . By (8.30), we get and hence, we have |E(σ)| ≤ K (cf. (8.45)). Now, we consider two sub-sub-cases separately.
- |E(σ)| ≤ K − 1: First, suppose that K < L. By (L1) of Proposition 5.14, we have We further have This implies that all sites in the slab T K × {ℓ 1 } × T M have the same spin n under σ. Since L = M , we can interchange the roles of the second and third coordinates to reduce the proof to one of (Case 1), (Case 2) and (Case 3). This completes the proof. Next, if K = L, then since there further exists k 1 ∈ T K such that ({k 1 } × T L ) ∩ E(σ) = ∅, we can use the same argument as above to handle this case as well.
- |E(σ)| = K: The equality in (8.47) must hold, and thus we get H 2D (σ (j 0 ) ) = 2K + 2 and H 2D (σ (m) ) = 2K for all m ∈ T M \ {j 0 }. We first suppose that K < L. By (L1) of Proposition 5.14, we get σ (m) = ξ 1, 2 ℓ m , v m for some ℓ m ∈ T L and v m ∈ ⟦2, L − 2⟧ for all m ∈ T M \ {j 0 }. Then, since L is strictly bigger than K = |E(σ)|, we can always find a row in T K × T L which is a subset of either D 1 (σ) or D 2 (σ). Thus, by interchanging the roles of the second and third coordinates, which is possible since L = M , we find a monochromatic floor and the proof is reduced to one of (Case 1), (Case 2) and (Case 3). Next, we handle the case K = L, so that for all m ∈ T M \ {j 0 }, σ (m) = ξ 1, 2 ℓ m , v m or Θ(ξ 1, 2 ℓ m , v m ) (cf. Definition 5.3) for some ℓ m ∈ T L and v m ∈ ⟦2, L − 2⟧. First of all, assume that all of them are of the same direction. Without loss of generality, assume that σ (m) = ξ 1, 2 ℓ m , v m for all m ∈ T M \ {j 0 }. If σ (m 1 ) ≠ σ (m 2 ) for some m 1 , m 2 ∈ T M \ {j 0 }, then E(σ) must be exactly the line where they differ and hence we can write E(σ) = T K × {ℓ 0 } for some ℓ 0 ∈ T L . Then, by taking any ℓ ∈ T L \ {ℓ 0 }, we notice that T K × {ℓ} is not only monochromatic in σ (m) with m ∈ T M \ {j 0 }, but also a subset of either D 1 (σ) or D 2 (σ); hence, T K × {ℓ} × T M is a monochromatic slab. By interchanging the roles of the second and third coordinates, which is possible since L = M , we find a monochromatic floor and the proof is reduced to one of (Case 1), (Case 2) and (Case 3). On the contrary, suppose that σ (m 1 ) = σ (m 2 ) for all m 1 , m 2 ∈ T M \ {j 0 }. If there exists a row or column which is disjoint from E(σ), then we can argue as above. If not, then we can easily deduce that for the j 0 -th floor, which contradicts the assumption that H 2D (σ (j 0 ) ) = 2K + 2. Finally, we consider the case when σ (m) = ξ 1, 2 ℓ, v and σ (m′) = Θ(ξ 1, 2 ℓ′, v′ ) for some m, m′ ∈ T M \ {j 0 } simultaneously. 
In this case, we have Thus, we get a contradiction since where the second inequality holds since v, v′ ∈ ⟦2, K − 2⟧.
Now, we consider the general case of Proposition 8.1.
Proof of Proposition 8.1: general case. We fix a proper partition (A, B) of S and then fix a ∈ A and b ∈ B. Let (ω t ) T t=0 be a path connecting s a and s b . For each σ ∈ X , we denote by σ̃ the configuration obtained from σ by changing all spins in A to 1 and all spins in B to 2. Thus, σ̃ is an Ising configuration, i.e., a spin configuration for q = 2. Note that H(σ̃) ≤ H(σ). (8.48) Now, we consider the induced pseudo-path (ω̃ t ) T t=0 of (ω t ) T t=0 (cf. Notation 6.12). By the proof above for q = 2, there exists t 1 ∈ ⟦0, T⟧ such that H(ω̃ t 1 ) ≥ Γ. Thus, we get from (8.48) that H(ω t 1 ) ≥ H(ω̃ t 1 ) ≥ Γ, and we complete the proof for part (1). For part (2), suppose that (ω t ) T t=0 is a path such that H(ω t ) ≤ Γ for all t ∈ ⟦0, T⟧. (8.49) Then, by (8.48), we have H(ω̃ t ) ≤ Γ for all t ∈ ⟦0, T⟧. Thus, by the proof above for q = 2, there exists s ∈ ⟦0, T⟧ such that ω̃ s ∈ G 1, 2 . We now claim that ω s ∈ G A, B . It is immediate that this claim finishes the proof.
Then, we have ω s (x) ∈ A for x ∈ U 1 and ω s (x) ∈ B for x ∈ U 2 . (8.50) Now, we assume that ω s (x) ≠ ω s (y) for some x, y ∈ U 1 or x, y ∈ U 2 with x ∼ y. (8.51)
We now express the energy H(ω s ) as where the summation is carried over x, y satisfying x ∼ y. Note that the second summation is equal to H( ω s ) by (8.50). On the other hand, we can readily deduce from Figure 7.1 that the first summation of (8.52) is at least 4 if ω s is a gateway configuration of type 1, and at least 2 if ω s is a gateway configuration of type 2 or 3 (cf. Notation 7.2). Thus, by Proposition 7.3, we can conclude that the right-hand side of (8.52) is at least Γ + 2; i.e., we get H(ω s ) ≥ Γ + 2. This contradicts (8.49) and hence, we cannot have (8.51). This finally implies that there exist a 0 ∈ A and b 0 ∈ B such that and thus we have ω s ∈ G a 0 , b 0 ⊆ G A, B as claimed.
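The projection step used in this proof (merging every spin in A into 1 and every spin in B into 2) can only remove disagreeing neighbor pairs, so it never increases the energy. The following minimal sketch checks this numerically on random Potts configurations of a small 3D torus. It assumes that the Hamiltonian counts nearest-neighbor pairs with different spins; the names `potts_energy`, `project`, and `check_projection` are illustrative and not from the paper.

```python
import random


def potts_energy(config, dims):
    """Number of nearest-neighbor pairs with different spins on a 3D torus.

    Assumes each side length is at least 3, so that the three 'forward'
    neighbors of every site enumerate each edge exactly once.
    """
    K, L, M = dims

    def idx(k, l, m):
        return (k * L + l) * M + m

    h = 0
    for k in range(K):
        for l in range(L):
            for m in range(M):
                s = config[idx(k, l, m)]
                h += s != config[idx((k + 1) % K, l, m)]
                h += s != config[idx(k, (l + 1) % L, m)]
                h += s != config[idx(k, l, (m + 1) % M)]
    return h


def project(config, A):
    """Map spins in A to 1 and all remaining spins (the set B) to 2."""
    return tuple(1 if s in A else 2 for s in config)


def check_projection(dims=(3, 3, 3), q=4, A=frozenset({1, 2}), trials=200, seed=0):
    """Return True if the projected energy never exceeds the original energy."""
    rng = random.Random(seed)
    n = dims[0] * dims[1] * dims[2]
    for _ in range(trials):
        sigma = tuple(rng.randint(1, q) for _ in range(n))
        if potts_energy(project(sigma, A), dims) > potts_energy(sigma, dims):
            return False
    return True


if __name__ == "__main__":
    print(check_projection())  # prints True
```

The check always succeeds because two neighbors with equal spins remain equal after the projection; the random sampling merely illustrates the inequality and is not a substitute for the argument in the proof.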
8.3. Proof of Theorem 2.7. Theorem 2.7 is now a consequence of our analysis on the energy landscape and the general theory developed in [40,41].
Proof of Theorem 2.7. We have two results on the energy barrier: Theorem 2.4 and Proposition 6.13. The theory developed in [41] implies that these two are sufficient to conclude Theorem 2.7 (see footnote 6). This implication has been rigorously verified in [40] for the case of d = 2, and the argument extends to the case of d = 3 without modification. Hence, we do not repeat the argument here, and refer the readers to [40, Section 3] for a detailed proof.

Typical Configurations and Optimal Paths
In the previous sections, we proved large deviation-type results regarding the metastable behavior by analyzing the energy barrier in terms of canonical and gateway configurations. In order to get precise quantitative results such as Theorems 2.10 and 2.12 or to get a characterization of optimal paths, we need a more refined analysis of the energy landscape based on the typical configurations which will be introduced and analyzed in the current section.

Footnote 6. We remark that the second convergence of (2.11) is not a consequence of an analysis of the energy barrier, but of the first convergence of (2.11) and the symmetry of the model. This argument is also given in [40, Section 3] for d = 2, and an identical one works for d = 3.
We fix a proper partition (A, B) of S throughout the section.
9.1. Typical configurations. Let us start by defining the typical configurations. We consistently refer to Figure 9.1 for an illustration of our construction. For a, b ∈ S and i ∈ ⟦0, M⟧, we define
• Bulk typical configurations: We define, for a, b ∈ S, , and then define where the second identity holds because of Remark 9.1. A configuration belonging to B A, B is called a bulk typical configuration between S(A) and S(B).
• Edge typical configurations: We define

In this subsection, we analyze some properties of the edge and bulk typical configurations. In fact, we have to take K large enough (i.e., K ≥ 2829) in order to get the structural properties of edge and bulk typical configurations given in the current section.
The first property asserts that E A and E B are disjoint.
Proposition 9.5. The two sets E A and E B are disjoint.
Proof. By part (2) Since σ ∈ R A, B [0, m K −1] , there exists a Γ-path in X \ G A, B (which is indeed a part of a canonical path) connecting σ and S(A). Similarly, there exists a Γ-path in X \ G A, B connecting σ and S(B). By concatenating them, we can find a Γ-path (ω t ) T t=0 in X \ G A, B connecting S(A) and S(B). This contradicts part (2) of Proposition 8.1. Now, we analyze the crucial features of the typical configurations. Note that this is a 3D version of Proposition 5.8. (1) It holds that Proof. (1) It suffices to prove the first identity, as the second one follows similarly. One can observe that the set G A, B m K −1 ⊆ G A, B is disjoint with B A, B from (9.2), and the set The proof is completed by (9.4) and (9.5).
(2) We will first prove that Since it is immediate that S is a subset of the right-hand side, We now take a subset I A of I A so that we can decompose I A into the following disjoint union: Consequently, we get the following decomposition of E A : Notation 9.7. For σ ∈ I A , we denote by σ ∈ I A the unique configuration satisfying σ ∈ N (σ).
By part (1) of Lemma 8.3, for σ, σ ∈ R A, B m K , the two sets N (σ) and N (σ ) are disjoint. By a similar reasoning, we know that for any σ ∈ R A, B m K and a ∈ A, the sets N (σ) and N (s a ) are disjoint. Thus, we can assume that • (Graph) We define the graph structure and σ ∼ ζ for some ζ ∈ N (σ ) .
• (Markov chain) We first define a rate r A : (9.10) We now let (Z A (t)) t≥0 be the continuous-time Markov chain on V A with rate r A (·, ·). Note that the uniform distribution on V A is the invariant measure for the chain Z A (·), and indeed this chain is reversible with respect to this measure.
• (Potential-theoretic objects) Denote by L A , h A ·, · (·), and cap A (·, ·) the generator, equilibrium potential, and capacity with respect to the Markov chain Z A (·), respectively.
We now give three important propositions regarding the objects constructed above. These propositions play fundamental roles in the construction of the test function on the edge typical configurations.
We remark from (9.9) that S(A), R A, B m K ⊆ I A ⊆ V A . Potential-theoretic objects between these two sets are crucially used in our discussion. We define .
For n ∈ 1, q − 1 (with a slight abuse of notation) we can write We refer to e.g., [12] for the flow structure and the Thomson principle.

Proof.
We recall that an anti-symmetric function φ : V A ×V A → R is called a flow associated with the Markov chain Z A (·), provided that φ(x, y) = 0 if and only if {x, y} ∈ E A . For each flow φ, the associated flow norm is defined by For each x ∈ V A , the divergence of a flow φ at x is defined by Finally, for two disjoint non-empty subsets U, Then, by the Thomson principle (cf. [12,Theorem 7.37]), for any unit flow ψ from S(A) to We shall construct below a unit flow ψ from S(A) to R A, B m K that satisfies (9.14) Then, by combining (9.13) and (9.14), we have (recalling the definition (6.20) of m K ) Recalling the definition (9.11), this completes the proof. Now, it remains to construct a unit flow ψ from S(A) to R A, B m K satisfying bound (9.14). To this end, let us first fix a ∈ A and b ∈ B. Define a ∈ A, and b ∈ B. Then, we first define a flow ψ P, Q connecting σ a, b P = σ a, b P and σ a, b Q = σ a, b Q (cf. Notation 9.7). First, we set or , , 1 and ξ a, b , L−1 that appear in (9.17) with σ a, b P and σ a, b Q , respectively, to get a flow connecting σ a, b P and σ a, b Q . We remark that we may have σ a, b P = σ a, b Q . We deduce from the definition of the flow norm that where K 2 L(L − 2) is the number of edges that appear in (9.17). Next, we define where KL is the number of configurations in C a, b P, Q connected to σ a, b P , and 2M is the number of possible choices of P and Q. Consequently, the flow ψ is a unit flow from S(A) to R A, B m K . Thus, it suffices to verify (9.14). Since the support of ψ P, Q (which is the collection of edges on which ψ P, Q is non-zero) for different pairs (P, Q) are disjoint, we deduce from (9.18) that and therefore ψ satisfies (9.14).
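The way the Thomson principle turns a unit flow into a lower bound on a capacity can be illustrated on a toy network. The sketch below is not the graph (V A , E A ) of the text: it assumes a network made of two disjoint paths of 2 and 3 edges between a source and a sink, unit conductance on every edge, and capacity normalized as the effective conductance (without any invariant-measure factor). The function names are hypothetical.

```python
from fractions import Fraction


def flow_energy(theta, len_a=2, len_b=3):
    """Energy (squared flow norm) of the unit flow sending theta along the
    path with len_a edges and 1 - theta along the path with len_b edges,
    every edge having unit conductance."""
    theta = Fraction(theta)
    return len_a * theta ** 2 + len_b * (1 - theta) ** 2


def effective_conductance(len_a=2, len_b=3):
    """Exact effective conductance of the two paths in parallel."""
    return Fraction(1, len_a) + Fraction(1, len_b)


def thomson_lower_bound(theta, len_a=2, len_b=3):
    """Lower bound on the effective conductance given by one unit flow."""
    return 1 / flow_energy(theta, len_a, len_b)


if __name__ == "__main__":
    for theta in (Fraction(0), Fraction(1, 2), Fraction(3, 5), Fraction(1)):
        print(theta, thomson_lower_bound(theta), effective_conductance())
```

Every unit flow yields a valid lower bound, and the bound is sharp exactly for the optimal split θ = 3/5, where the energy equals 6/5 and its reciprocal 5/6 is the effective conductance. This mirrors the strategy of the proof, where a well-chosen test flow ψ gives the required bound without solving for the equilibrium potential.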
For simplicity, we write (cf. (9.9)) where h A is the equilibrium potential defined in Definition 9.8. This function is a fundamental object in the construction of the test function in Section 10.
Proof. We fix σ ∈ R A, B m K ∩ O A . It suffices to prove that any Γ-path (ω t ) T t=0 from σ to S(A) must visit N (R A, B m K ). Suppose first that the path (ω t ) T t=0 does not visit G A, B . Since σ ∈ R A, B m K , there exists a Γ-path in X \ G A, B connecting R A, B m K and σ, and therefore by concatenating this path with (ω t ) T t=0 , we get a Γ-path in X \ G A, B connecting R A, B m K and S(A). This contradicts part (2) of Lemma 8.3. Thus, the path (ω t ) T t=0 must visit G A, B and we let By part (2) of Lemma 7.5, we have avoiding G A, B , which contradicts part (2) of Lemma 8.3. Hence, we can conclude that ω t 0 −1 ∈ N (R A, B m K ), as desired.
Remark 9.11. The previous proposition implies that configurations σ that belong to R A, B m K ∩ O A are dead-ends attached to N (R A, B m K ) (cf. grey protuberances attached to green boxes in Figures 7.2 and 9.1).
The next proposition highlights the fact that the auxiliary process Z A (·) defined in Definition 9.8 approximates the behavior of the Metropolis-Hastings dynamics at the edge typical configurations.
Proposition 9.12. Define a projection map Π A : E A → V A by (cf. Notation 9.7) Then, there exists C = C(K, L, M ) > 0 such that (2) for σ 1 ∈ O A and σ 2 ∈ I A , we have Proof.
9.4. Analysis of 3D transition paths. In this section, we finally define the collection of transition paths between ground states that appear in Theorem 2.9.

Construction of Test Function
We fix in this section a proper partition (A, B) of S. The main purpose of the current section is to construct a test function h = h β A, B : X → R that satisfies the two requirements of Proposition 3.2.  • For σ ∈ E A , we recall the decomposition (9.8) of E A and define (10.1) • For σ ∈ E B , we similarly define The test function h is defined on G A, B P, Q by where {m} = Q \ P so that σ (m) is a 2D gateway configuration between s 2D a and s   The remainder of this section is devoted to proving parts (1) and (2) We first consider the second summation. Observe first that, by part (2) of Proposition 9.6, we have E A, B ∪ B A, B = N (S) and thus we get H(ζ) ≥ Γ + 1 if σ ∼ ζ. Hence, by (2.6) and Theorem 2.2, we get From the fact that 0 ≤ h ≤ 1 (cf. part (2) of Remark 10.3), we can conclude that the second summation is o β (1) e −Γβ . The third summation is trivially 0 by the definition of the test function on (E A, B ∪ B A, B ) c . Therefore, it remains to show that  (1)) times 2κ 2D e −Γ 2D β . Therefore, display (10.8) equals Inserting this to (10.7) (and recalling (3.8)), we get Next, we deal with the second and third summations of (10.6). By (10.1) and Proposition 9.12, the second summation equals Similarly, we get Therefore, by (10.10), (10.11), and (10.12), we can conclude that the left-hand side of (10.5) is equal to This concludes the proof.
We are left to consider ψ(σ) for σ ∈ E A, B ∪ B A, B = N (S). To this end, we decompose as ψ = ψ 1 + ψ 2 where In fact, we can show that ψ 2 (σ) is negligible. Proof. This follows directly by the same argument presented in the proof of Lemma 10.5. Now, to estimate ψ 1 (σ), let us first look at the bulk typical configurations that are not the edge typical configurations. for some m ∈ Q \ P with P ≺ Q (cf. Definition 10.2), where σ (m) and ζ (m) are considered as 2D Ising configurations. Then, by Theorem 2.2 and (5.2), the last display equals Since σ (m) is a 2D gateway configuration, by part (1) of Proposition 5.12, the last summation equals o β (e −Γ 2D β ). Therefore, we conclude that . It remains to prove that for all a ∈ A, b ∈ B, and P ∈ S M such that |P | ∈ m K + 1, M − m K − 1 , Since we constructed the test function h between σ a, b P and σ a, b Q (P ≺ Q) and between σ a, b Q and σ a, b P (Q ≺ P ) in the same manner, the two summations above cancel out with each other, and thus we obtain (10.18).
Finally, for the last statement of the lemma, it suffices to see that if σ ∈ N (R A, B i ) and Next, we turn to the edge typical configurations.
Lemma 10.9. The following statements hold.
(2) First, we prove (10.19). Note that h is constant on N (σ). Thus, By part (2) of Proposition 9.12 and the definition of h, this is equal to .
Finally, for the last statement, the last display implies that for all ζ ∈ N (σ), where the first inequality holds since 0 ≤ h A ≤ 1 and the second inequality holds by Proposition 9.9, (9.10), and the fact that the number of such ζ ∈ O A with σ ∼ ζ does not depend on β. This concludes the proof. and that |ψ 1 (σ)| ≤ Ce −Γβ for all σ ∈ R A, B m K where C is a constant independent of β.
Proof. First, we consider the first statement. Proposition 9.10 and the definition of h on R A, B m K imply that ψ 1 (σ) = 0 for all σ ∈ R A, B m K \ N (R A, B m K ). Hence, it suffices to prove that where the identity follows from the definition of b in (3.8). Combining this with (10.21) and (10.23), we can prove the first statement of the lemma. For the second statement, from the discussion before (10.20) it is inferred that we only need to prove for σ ∈ N (R A, B m K ). For such σ ∈ N (R A, B m K ), the previous proof implies that where we used the fact that 0 ≤ h ≤ 1. By (10.22) and Proposition 9.9, the first summation in the right-hand side is bounded by Ce −Γβ . By (10.24) and (10.25), the second summation in the right-hand side is also bounded by Ce −Γβ . Therefore, we conclude the proof of the second statement. Moreover, it holds that |ψ 1 (σ)| ≤ Ce −Γβ for all σ ∈ N (S(A))∪N (S(B)) where C is a constant independent of β.
Proof. We concentrate on the claim for N (S(A)), since the corresponding claim for N (S(B)) can be proved in the exact same way. By the property of capacities (e.g., [12, (7 This proves the first statement. As before, the fact that |ψ 1 | ≤ Ce −Γβ on N (S(A)) is straightforward from the observations made in the proof.
Finally, we present a proof of Proposition 3.2 by combining all computations above.
Proof of Proposition 3.2. It remains to prove that h satisfies part (1) since we already verified in the previous subsection that it satisfies part (2). By the discussion at the beginning of the subsection, it suffices to prove (10.16). By the definition of ψ given in (10.15)

Remarks on Open boundary Condition
Thus far, we have only considered the models under periodic boundary conditions. In this section, we consider the same models under open boundary conditions. The proofs for the open boundary case differ slightly from those of the periodic case; however, the fundamentals of the proofs are essentially identical. Hence, we do not repeat the details but focus solely on the technical points that produce the different forms of the main results.
Energy Barrier. We start by explaining that for the open boundary case, the energy barrier is given by Γ = KL + K + 1. (11.1) One can observe that the canonical path explained in Figure 6.3 becomes an optimal path (note that we should start from a corner of the box in this case) with height KL + K + 1 between ground states. This proves that the energy barrier Γ is at most KL + K + 1. Hence, it remains to prove the corresponding lower bound, i.e., Γ ≥ KL + K + 1. A rigorous proof of this has been developed in [40] for the 2D model, and the same argument applies to the 3D model using the arguments given in Section 8.
Sub-exponential prefactor. As mentioned earlier, the large deviation-type results (Theorems 2.7 and 2.9) hold under open boundary conditions without modification, except for the value of Γ. On the other hand, for the precise estimates (Theorems 2.10 and 2.12), the prefactor κ must be appropriately modified.
For simplicity, we assume that q = 2 and analyze the transition from s 1 to s 2 . To heuristically investigate the speed of this transition in the open boundary case via a comparison to the periodic one, it suffices to check the bulk part of the transition, because the edge part is negligible (as K → ∞) as in the periodic boundary case. The bulk transition must start from a configuration filled with m K lines of spin 2 at either the bottom or top of the lattice box Λ. In the periodic case, there are M choices for these starting clusters (of spins 2) of size KL × m K ; thus, we can observe that the speed of the transition is slowed by a factor of M/2 under this restriction. Now, let us suppose that we are at a configuration such that several floors of spin 2 are located at the bottom of the lattice, as in Figure 6.3. When we expand this cluster of spin 2 in the periodic case, there are 2 (namely, up and down) possible choices for the next floor to be filled; on the other hand, there is only one (namely, up) possible choice in the open boundary case. This further slows down the transition by a factor of 2. Next, when we expand the floor at the top of the cluster of spin 2, we may again look at the bulk part of the spin updates (cf. Definition 5.7). Thus, we suppose that there are two lines filled with spin 2 on that floor. There are L possible choices of the location in the periodic case, but just two possible choices in the open case. Thus, this gives us a factor of L/2. Moreover, we may choose one of two directions of growth of lines in the periodic case, which gives us additional factor of 2. Finally, there are K possible ways to form a protuberance in the periodic case; however, we now have only two (at the corners) possible choices. This further slows down the transition by a factor of K/2. Once the protuberance has been formed, we have only one direction in which to expand it, whereas we have two directions in the periodic case. 
This slows down the transition by a factor of 2. Summing up, the transition on the bulk is slowed by a factor of Turning this into a rigorous argument (via the same logic applied to the periodic case), we obtain the following Eyring-Kramers law with a modified (compared to the periodic case) prefactor. Recall that we assumed K ≤ L ≤ M . The constant κ can be defined in terms of new bulk and edge constants b (n) and e (n), in the exact same manner as done in Section 3.1.
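As a worked check of the heuristic count, one can multiply the six slow-down factors listed in the discussion above: M/2 for the location of the starting cluster, 2 for the direction in which floors are added, L/2 for the location of the lines on the growing floor, 2 for the direction of growth of the lines, K/2 for the location of the protuberance, and 2 for the direction in which the protuberance expands. The display below records only this arithmetic; it is not a statement of the rigorous prefactor.

```latex
\[
  \frac{M}{2}\cdot 2\cdot \frac{L}{2}\cdot 2\cdot \frac{K}{2}\cdot 2 \;=\; KLM .
\]
```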
Then, Theorem 2.12 also holds for open boundary conditions with a modified limiting Markov chain X′(·) with rate r X′ (s, s′) = (κ′) −1 for all s, s′ ∈ S.