Simplest random walk for approximating Robin boundary value problems and ergodic limits of reflected diffusions

A simple-to-implement weak-sense numerical method to approximate reflected stochastic differential equations (RSDEs) is proposed and analysed. It is proved that the method has first order of weak convergence. Together with the Monte Carlo technique, it can be used to numerically solve linear parabolic and elliptic PDEs with the Robin boundary condition. One of the key results of this paper is the use of the proposed method for computing ergodic limits, i.e. expectations with respect to the invariant law of RSDEs, both inside a domain in $\mathbb{R}^{d}$ and on its boundary. This enables efficient sampling from distributions with compact support. Both time-averaging and ensemble-averaging estimators are considered and analysed. A number of extensions are considered, including a second-order weak approximation, the case of an arbitrary oblique direction of reflection, and a new adaptive weak scheme to solve a Poisson PDE with Neumann boundary condition. The presented theoretical results are supported by several numerical experiments.

1. Introduction. This paper is devoted to weak approximation of stochastic differential equations (SDEs) with reflecting boundary conditions in a multidimensional domain G ⊂ R^d. Many models from physics, biology, engineering, and finance can be described using SDEs. In some of those scenarios reflected SDEs (RSDEs) can be used as a modelling tool. Let us list a few examples from different fields. Applications of reflected diffusion processes in stock management as well as in quality control were considered in [9]. Heavy traffic behaviour of queuing systems is modelled using the reflected Brownian motion in [5]. In [21] (see also [37]), it is demonstrated that the solution (known as the optimal portfolio process) of consumption-investment problems with transaction costs is governed by RSDEs. In [43], the authors used RSDEs to model the dynamics of reacting chemical species under the constraint that concentrations of species cannot be negative. Related issues in sampling measures and modelling natural phenomena in constrained spaces using reflecting boundaries arise in many different connections, for example in molecular modeling [4], biological models [35,39], continuum mechanics [25], chemistry [64,19], and statistical inference [1,22]. We expect our study therefore to be of wide interest.
The Feynman-Kac formula gives the probabilistic representation of solutions of parabolic and elliptic PDEs with Neumann/Robin boundary condition as the expectation of a functional of the reflected diffusion process. Solving such PDEs using deterministic methods, for example finite difference methods, requires approximating the solution in the whole domain, so the computational cost grows exponentially with the dimension. The use of the Monte Carlo method to find solutions of such PDEs is preferred when the solution is needed not in the whole domain but only at certain points. Further, in the case of Monte Carlo methods, independent trajectories can be simulated on parallel computers.
Another important application of RSDEs is in making use of a stochastic gradient system with reflection for drawing samples from high-dimensional distributions with compact support (see [14] and Section 4 here). This application is one of the main objectives of this paper in the setting of ergodic limits (Section 4).
Let G ⊂ R^d be a bounded domain with boundary ∂G, Q := [T_0, T) × G be a cylinder in R^{d+1}, and S be the lateral surface of Q. Let b : Q̄ → R^d and σ : Q̄ → R^{d×d}. Consider the RSDEs
dX(s) = b(s, X(s))ds + σ(s, X(s))dW(s) + ν(X(s))I_{∂G}(X(s))dL(s), (1.1)
where W(s) is a d-dimensional standard Wiener process defined on a filtered probability space (Ω, F, (F_s)_{s≥0}, P); ν(z), z ∈ ∂G, is the inward normal vector to the boundary ∂G; and I_{∂G}(z) is the indicator function of z ∈ ∂G. Further, L(s) is the local time of the process X(s) on the boundary ∂G adapted to the filtration (F_s)_{s≥0}. A local time is a scalar increasing process, continuous in s, which increases only when X(s) ∈ ∂G (see the precise definition in e.g. [46,36,27]):
L(s) = ∫_{T_0}^{s} I_{∂G}(X(s'))dL(s'), a.s.
We also note [46] that in the integral form of (1.1) the term ∫ ν(X(s))I_{∂G}(X(s))dL(s) is a d-dimensional bounded-variation process which increases only when X(s) ∈ ∂G.
The following questions are considered in this work:
• How to numerically (in the weak sense) solve RSDEs and the related linear parabolic equation with Robin boundary condition?
The proposed simple-to-implement weak approximation of (1.1) numerically solves a linear advection-diffusion equation with Neumann boundary condition, and its extension solves an advection-diffusion equation with a decay/growth term and with the nonhomogeneous Robin (in other words, third) boundary condition (see Section 3).
• How to compute ergodic limits in the domain G as well as on the boundary ∂G?
We introduce time-averaging and ensemble-averaging estimators to numerically calculate expectations with respect to the invariant density of a reflected diffusion X(t) which lies in Ḡ. We also propose estimators to compute integrals with respect to the normalised restriction of the invariant density of X(t) on ∂G (see Section 4).
• How to sample from distributions with compact support using Brownian dynamics?
This directly follows from the previous point. Drawing samples from compactly supported target distributions has many applications, especially in machine learning and molecular dynamics. The proposed algorithm applied to a stochastic gradient system (Brownian dynamics) with reflection efficiently samples from distributions whose support is a compact set Ḡ as well as from distributions whose support is a (d−1)-dimensional hyper-surface ∂G (see Subsection 4.1.4).
• How to numerically solve a linear elliptic equation with Robin boundary condition?
The probabilistic representation of the solution of the elliptic Robin problem involves integration of functionals of X(t) on [0, ∞). The weak method of Section 2 is applied and analysed in the case of the elliptic problem (see Section 5). The special case of the Poisson equation with Neumann boundary condition is treated separately, and for it a new adaptive time-stepping algorithm is proposed (see Subsection 5.2).
The approaches to numerically approximating the solutions of RSDEs driven by Wiener processes have taken three directions. The first two are penalty methods [63,68,55] and projection methods [47,18,55,7]. Introduce the projection map onto Ḡ: Π(x) = arg min_{y∈Ḡ} |x − y|, x ∈ R^d. We note that if x ∈ Ḡ then Π(x) = x. In projection schemes, the map Π(x) is applied at every step of a numerical scheme (e.g., the Euler method) approximating the RSDEs with the local-time term omitted. To construct penalty schemes, one replaces the reflection term in the RSDEs with β_λ(X(s))ds, where β_λ(x) := (x − Π(x))/λ, x ∈ R^d, and λ is a positive constant. We note that β_λ(x) = 0 for x ∈ Ḡ. The resulting SDEs are then approximated, e.g. by the Euler scheme (see a brief description of penalty and projection methods in [55, Section 5.6] and the references therein). The convergence of penalty schemes is mainly shown in the mean-square sense (see [63,68]). In this paper, we are interested in weak approximation.
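For a concrete domain the two maps above are straightforward to write down. The following sketch is our own illustration, assuming the domain is the closed unit ball in R^d (a choice made only for this example, not taken from the paper):

```python
import numpy as np

def proj(x):
    """Projection Pi(x) onto the closed unit ball {|y| <= 1}.
    For x inside the ball, Pi(x) = x; otherwise the nearest boundary point."""
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

def beta(x, lam):
    """Penalty term beta_lambda(x) = (x - Pi(x)) / lambda.
    It vanishes on the closed ball and replaces the local-time term
    (with the appropriate sign) in penalty schemes."""
    return (x - proj(x)) / lam
```

A projection scheme applies `proj` after every Euler step, while a penalty scheme adds the `beta` term to the drift of an unconstrained scheme.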
Liu [47] proposed a weak scheme by combining projection and orthogonal transformation of the diffusion matrix and obtained weak first-order convergence. Costantini, Pacchiarotti and Sartoretto [18] obtained weak order 1/2 through a projection scheme. Expectations of complex functionals of local time can be evaluated using their scheme.
Methods which are neither penalty methods nor projection methods we term reflection methods [6,52,33,34,12,55]. Milstein [52] (see also [55]) proposed a weak scheme of first order of accuracy to solve the Robin boundary value problem for parabolic PDEs. The scheme is not easy to implement as it requires changing the local coordinates when the discretized sample path reaches the proximity of the boundary (see its implementation in [10]). Gobet [33,34] suggested a reflected scheme in a half-space which locally approximates Ḡ and obtained weak order 1/2 of convergence for any oblique direction of reflection. The order improves to one for the co-normal reflection. Bossy, Gobet and Talay [12] analyzed an Euler scheme combined with a symmetrized procedure for simulation of reflected diffusions with oblique reflection and obtained first order of accuracy. Since their scheme is based on Gaussian increments, they propose to restart the simulation if a discretized sample path takes a very large increment outside the domain. Approximating ergodic limits inside the domain G using the scheme of [12] was considered in [16], where the corresponding convergence in the time step was proved with an order lower than 1. The other possible limitation of the method is that it is not known how to adapt it to compute expectations of integrals with respect to local time.
In this paper, we propose a new reflection method to numerically approximate RSDEs. The method does not require any orthogonal transformation of the diffusion matrix or change of local coordinates, and is thus easy to implement. This new method is based on the idea of symmetrized reflection on the boundary; however, we apply the weak Euler scheme, which uses bounded random variables, making sure that discretized sample paths cannot move outside the domain beyond the boundary by more than O(√h), where h is the time step. We note that although our method is based on symmetrized reflection like [12], our approach to proving convergence is entirely different: we use appropriate PDEs as in typical proofs of weak convergence [52,51,55,69]. Further, the path we take for the analysis allows us to compute expectations of functionals of local time accompanied by optimal error estimates. Moreover, our approach works for time-averaging estimation of ergodic limits, both in the domain G and on the boundary ∂G, and the corresponding first-order convergence in the time step is proved in both cases. We also extend our algorithm to approximate SDEs with reflection in any oblique inward direction with first order of convergence. In addition, we modify our algorithm by introducing a new procedure near the boundary which results in a second-order method.
As already highlighted earlier, our method can be used to solve elliptic PDEs with Robin boundary condition, where the two cases are considered separately: the first with a decay term (equations (5.1)-(5.2)), where we achieve first order accuracy; the second case (equations (4.1)-(4.2)) is without a decay term (the Poisson problem), which causes additional difficulties. We introduce an adaptive time-stepping scheme based on a novel idea of double time-discretization to solve the Poisson problem with first order accuracy.
In, for example, molecular dynamics (see [41] and references therein) and Bayesian statistics (see e.g., [72,3,42] and references therein), it is usually necessary to compute the expectation of a given function with respect to the invariant law of the diffusion (ergodic limit). There are two common approaches to computing ergodic limits. One is to simulate a numerical trajectory over a long period of time and take the average at discretized points (time-averaging estimation, see e.g. [69,56,51,16] and references therein). The other is to simulate many independent numerical realizations of the associated SDEs and average them at a sufficiently large finite time (ensemble-averaging estimation, see e.g. [56]).
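The two estimators can be made concrete with a small sketch. The code below is our own illustration, using an unconstrained one-dimensional Ornstein-Uhlenbeck process dX = −X dt + dW (an assumption made for this example only), whose invariant law is N(0, 1/2), so the ergodic limit of ϕ(x) = x² is 1/2:

```python
import numpy as np

def time_average(phi, h=0.01, n_steps=100_000, seed=0):
    """Time-averaging estimator: one long trajectory of the Euler scheme
    for dX = -X dt + dW, averaging phi over the discretization points."""
    rng = np.random.default_rng(seed)
    x, total = 0.0, 0.0
    for _ in range(n_steps):
        x = x - h * x + np.sqrt(h) * rng.standard_normal()
        total += phi(x)
    return total / n_steps

def ensemble_average(phi, T=5.0, h=0.01, n_paths=2000, seed=0):
    """Ensemble-averaging estimator: many independent trajectories,
    with phi averaged at a sufficiently large finite time T."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n_paths)
    for _ in range(int(T / h)):
        x = x - h * x + np.sqrt(h) * rng.standard_normal(n_paths)
    return float(np.mean(phi(x)))
```

Both calls with `phi = lambda x: x**2` return values close to 0.5; the time-averaging estimator uses one long path, the ensemble-averaging estimator many short ones.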
In this work we are interested in finding how close numerical time-averaging and ensemble-averaging estimators are to the corresponding expectations with respect to the RSDEs' stationary measure. In [51] it was shown how an appropriate Poisson equation can be used to obtain the long-time average of a functional of diffusion processes. In the same spirit, here we make use of the Poisson equation with Neumann boundary condition for the analysis of numerical time-averaging estimators. For proving accuracy of the ensemble-averaging estimators, we exploit a parabolic PDE with Neumann boundary condition.
In many applications it is required to sample from distributions with compact support Ḡ (see e.g. [11,38,17]). Although methods have been proposed to sample from distributions subject to certain constraints, most of them are not well studied theoretically, except e.g. [14,13], where [14] uses a projection scheme and [13] exploits a penalty method for sampling from log-concave distributions with compact support. Both works establish bounds on a distance (in total variation norm or Wasserstein distance) between the stationary measure of a Markov chain generated by a numerical method and the corresponding stationary measure of the underlying reflected Brownian dynamics. We are interested in establishing closeness (including convergence order) of estimators to the ergodic limits with respect to generic ergodic RSDEs, allowing us to sample from arbitrary distributions with compact support Ḡ.
Finally, in this paper, we also develop a methodology to sample from a targeted probability distribution which lies on the hyper-surface ∂G that can be considered as the boundary of a bounded domain G. There are geometric Monte Carlo methods for sampling from distributions lying on ∂G (see e.g. [32,15] and the references therein). Our approach is different and is based on weak approximation of the local time of the reflected diffusion X(t) on the boundary ∂G (see Subsection 4.1.4 for details).
2. Numerical method to approximate reflected SDEs. In this section, we first discuss the existence and uniqueness of solutions of RSDEs (1.1).
In the paper we will use the following function spaces. Let C^{p/2+, p+}(Q̄) (or C^{p+}(Ḡ)) be the Hölder space of functions u(t, x) (or u(x)) which, together with their partial derivatives in t of order up to p/2 and in x of order up to p, are Hölder continuous; when the Hölder exponent is not essential, we abbreviate the corresponding Hölder norm by writing |·|_p instead of |·|_{p+}. We also denote by C(Q̄) the set of functions which are continuous in Q̄.
We make the following assumptions regarding the problem (1.1) (cf. [36, p. 149]). We note that the unique solution of (1.1) exists when b(t, x) and σ(t, x) are just Lipschitz continuous in x and the domain G is either a convex set or C² smooth [70,46,66]. Assumptions 2.1-2.2 are used in subsequent sections to prove the optimal (highest possible) order of convergence for the proposed algorithm. At the same time, we note that the algorithm can be used in practice under weaker assumptions. Now, we construct a new algorithm which approximates the RSDEs (1.1). Let (t_0, x) ∈ Q. We introduce the uniform discretization of the time interval [t_0, T] so that t_0 < · · · < t_N = T, h := (T − t_0)/N and t_{k+1} = t_k + h.
We consider a Markov chain (t_k, X_k)_{k≥0} with X_0 = x approximating the solution X_{t_0,x}(t) of the RSDEs (1.1). Since X(t) cannot take values outside Ḡ, the Markov chain should remain in Ḡ as well. To this end, the chain has an auxiliary (intermediate) step every time it moves from the time layer t_k to t_{k+1}. We denote this auxiliary step by X̃_{k+1}. In moving from X_k to X̃_{k+1}, we apply the weak Euler scheme
X̃_{k+1} = X_k + h b(t_k, X_k) + h^{1/2} σ(t_k, X_k) ξ_{k+1}, (2.1)
where ξ_k = (ξ_k^1, . . . , ξ_k^d)^⊤ and ξ_k^i, i = 1, . . . , d, k = 0, . . . , N − 1, are mutually independent random variables taking values ±1 with probability 1/2. Taking this auxiliary step X̃_{k+1} while moving from X_k to X_{k+1} represents cautious behaviour and gives us an opportunity to check whether the realized value of X̃_{k+1} is inside the domain G or not (see Fig. 2.1). If X̃_{k+1} ∈ Ḡ then on the same time layer we assign
X_{k+1} = X̃_{k+1}. (2.2)
However, if the realized value of X̃_{k+1} goes outside of Ḡ then we need an additional construction so that X_{k+1} ∈ G (see Fig. 2.2). First, we find the projection of X̃_{k+1} onto ∂G, which we denote as X^π_{k+1}, and we calculate r_{k+1} = dist(X̃_{k+1}, X^π_{k+1}), the shortest distance between X̃_{k+1} and ∂G. Note that dist(X_k, X̃_{k+1}) = O(h^{1/2}); therefore, under Assumption 2.1 and for sufficiently small h, the projection X^π_{k+1} of X̃_{k+1} on ∂G is unique [12, Proposition 1]. Moreover, the projection X^π_{k+1} and the shortest distance r_{k+1} satisfy the equation X^π_{k+1} = X̃_{k+1} + r_{k+1} ν(X^π_{k+1}), where ν(X^π_{k+1}) is the inward normal vector to the boundary ∂G at the projection X^π_{k+1}. Thereafter, we add r_{k+1} ν(X^π_{k+1}) to X^π_{k+1} to arrive at a point which we take as X_{k+1}. This transition from the intermediate step X̃_{k+1} to X_{k+1} makes sure that X_{k+1} ∈ G. We also highlight that X̃_{k+1} and X_{k+1} are symmetric around X^π_{k+1} along the direction ν(X^π_{k+1}). Therefore, combining the above steps of calculating X^π_{k+1} from X̃_{k+1} and then X_{k+1} from X^π_{k+1}, we have
X_{k+1} = X̃_{k+1} + 2 r_{k+1} ν(X^π_{k+1}). (2.3)
Fig. 2.2: One-step transition in two dimensions from X̃_{k+1} to X_{k+1} using the projection X^π_{k+1} of X̃_{k+1} on ∂G.
We formally write our algorithm as Algorithm 2.1.
Step 1: Set X_0 = x, k = 0.
Step 2: Simulate ξ_{k+1} and find the auxiliary value X̃_{k+1} according to (2.1).
Step 3: If X̃_{k+1} ∈ Ḡ, set X_{k+1} = X̃_{k+1}; otherwise find the projection X^π_{k+1} of X̃_{k+1} onto ∂G and obtain X_{k+1} according to (2.3).
Step 4: If k + 1 = N then stop, else put k := k + 1 and return to Step 2.
REMARK 2.1. To approximate the RSDEs inside the domain G in Algorithm 2.1, we exploit a particular method with discrete random variables used for approximating the Wiener increments, the weak Euler scheme (2.1). Instead of (2.1), one can use any approximation with local weak order 2 and bounded increments (e.g., the walk over spheres as in [52,55] (see also [10]) or any other method with discrete random variables) and complement it with the symmetric reflection (2.3). Such versions of Algorithm 2.1 have the same convergence properties as the ones proved in this paper for Algorithm 2.1. We restrict ourselves here to the weak Euler scheme (2.1) because it is the simplest scheme and also for definiteness.
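As an illustration of the construction above, here is a minimal sketch of one step of the scheme. It is our own code, assuming the domain is the open unit ball in R^d, for which the boundary projection and the inward normal ν(z) = −z are available in closed form:

```python
import numpy as np

def reflected_weak_euler_step(x, t, b, sigma, h, rng):
    """One step of the symmetrized-reflection weak Euler scheme:
    a weak Euler predictor with +-1 increments, followed, if the
    predictor leaves the closed unit ball, by symmetric reflection
    about its boundary projection."""
    d = len(x)
    xi = rng.choice([-1.0, 1.0], size=d)          # weak Euler: Bernoulli +-1 variables
    x_tilde = x + h * b(t, x) + np.sqrt(h) * sigma(t, x) @ xi
    if np.linalg.norm(x_tilde) <= 1.0:            # auxiliary step stayed in the ball
        return x_tilde
    x_pi = x_tilde / np.linalg.norm(x_tilde)      # projection onto the sphere
    r = np.linalg.norm(x_tilde - x_pi)            # distance to the boundary
    nu = -x_pi                                    # inward normal at x_pi
    return x_tilde + 2.0 * r * nu                 # symmetric reflection about x_pi

# usage: reflected Brownian motion in the unit disc
rng = np.random.default_rng(0)
b0 = lambda t, x: np.zeros_like(x)
s0 = lambda t, x: np.eye(len(x))
x = np.array([0.9, 0.0])
for k in range(1000):
    x = reflected_weak_euler_step(x, k * 0.01, b0, s0, 0.01, rng)
```

Note that for the unit ball the reflected point has norm 1 − r, so the chain never leaves the closed ball, in line with the discussion above.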
3. Solving parabolic PDEs with Robin boundary condition. In Subsection 3.1, we introduce a parabolic PDE with Robin boundary condition along with assumptions required for the existence of its solution and with the link to the reflected diffusion process via the Feynman-Kac formula. Subsection 3.2 describes an extension of Algorithm 2.1 to solve the Robin parabolic problem. We state the main convergence theorem of the proposed algorithm in Subsection 3.3 and prove it in Subsection 3.4.
3.1. Probabilistic representations. Consider the parabolic PDE
∂u/∂t + (1/2)(a(t, x) : ∇²u) + (b(t, x) · ∇u) + c(t, x)u + g(t, x) = 0, (t, x) ∈ Q, (3.1)
with the terminal condition
u(T, x) = φ(x), x ∈ Ḡ, (3.2)
and the Robin boundary condition
∂u/∂ν + γ(t, z)u + ψ(t, z) = 0, (t, z) ∈ S, (3.3)
where ν = ν(z) is the direction of the inner normal to the surface ∂G at a point z ∈ ∂G, (:) denotes the Frobenius product of two matrices, (·) denotes the scalar product of two vectors, c = c(t, x), g = g(t, x), γ = γ(t, z) and ψ = ψ(t, z) are scalar functions, and a(t, x) = {a^{ij}(t, x)} is a d × d matrix. In addition to Assumptions 2.1-2.2, we also make smoothness assumptions (Assumptions 3.1-3.2) on the data of the problem; in particular, ψ(t, z) ∈ C^{1.5,3}(S̄).
ASSUMPTION 3.3. The symmetric matrix a = {a^{ij}} satisfies the condition of uniform ellipticity in Q̄, i.e., there exists a positive constant a_0 such that for all y ∈ R^d:
a_0 |y|² ≤ (a(t, x)y · y), (t, x) ∈ Q̄.
ASSUMPTION 3.4. The compatibility condition of order 1 is fulfilled for the problem (3.1)-(3.3); it is expressed via a sequence of functions v_k defined recursively through the data of the problem (see [40]).
It is known [40] that if Assumptions 2.1-2.2 and 3.1-3.4 are satisfied then the problem (3.1)-(3.3) has a unique solution u(t, x) ∈ C^{2,4}(Q̄) whose norm is bounded by a positive constant C(T) dependent on T. We also note that u(t, x) satisfies the PDE (3.1) in Q̄ under Assumptions 2.1-2.2 and 3.1-3.4. Let a matrix σ(t, x) be found from the equation
σ(t, x)σ^⊤(t, x) = a(t, x).
As is known [30,27,36], the probabilistic representation of the solution of problem (3.1)-(3.3) is given by
u(t_0, x) = E[φ(X_{t_0,x}(T)) Y_{t_0,x,1}(T) + Z_{t_0,x,1,0}(T)], (3.7)
where X_{t_0,x}(s), Y_{t_0,x,y}(s), Z_{t_0,x,y,z}(s), s ≥ t_0, is the solution of the Cauchy problem for the system of RSDEs
dX(s) = b(s, X(s))ds + σ(s, X(s))dW(s) + ν(X(s))I_{∂G}(X(s))dL(s), X(t_0) = x, (3.8)
dY(s) = c(s, X(s))Y(s)ds + γ(s, X(s))Y(s)I_{∂G}(X(s))dL(s), Y(t_0) = y, (3.9)
dZ(s) = g(s, X(s))Y(s)ds + ψ(s, X(s))Y(s)I_{∂G}(X(s))dL(s), Z(t_0) = z. (3.10)
3.2. Numerical method. In this subsection we modify our Algorithm 2.1 to construct a Markov chain (t_k, X_k, Y_k, Z_k)_{k≥0} with X_0 = x, Y_0 = 1, Z_0 = 0 to approximate the solution u(t_0, x) of (3.1)-(3.3) at (t_0, x) ∈ Q̄. We approximate the RSDEs (3.8) according to Algorithm 2.1 and complement it by an approximation of (3.9) and (3.10). If the intermediate step X̃_{k+1} introduced in Algorithm 2.1 belongs to Ḡ, then we use the Euler scheme:
Y_{k+1} = Y_k + h c(t_k, X_k)Y_k, (3.11)
Z_{k+1} = Z_k + h g(t_k, X_k)Y_k. (3.12)
If X̃_{k+1} goes outside Ḡ, the local-time increment is approximated by 2r_{k+1} and we set
Y_{k+1} = Y_k + h c(t_k, X_k)Y_k + 2r_{k+1} γ(t_{k+1}, X^π_{k+1})Y_k, (3.13)
Z_{k+1} = Z_k + h g(t_k, X_k)Y_k + 2r_{k+1} ψ(t_{k+1}, X^π_{k+1})Y_k, (3.14)
where X^π_{k+1} is the projection of X̃_{k+1} on ∂G and r_{k+1} = dist(X̃_{k+1}, X^π_{k+1}) is the shortest distance between X̃_{k+1} and ∂G. The approximation (3.13)-(3.14) is derived via numerical analysis (see Lemma 3.3 and Theorem 3.1) aimed at obtaining first-order weak convergence of the proposed algorithm. We write this modified algorithm as Algorithm 3.1.
Step 4: If k + 1 = N then stop, else put k := k + 1 and return to Step 2.
We note that Algorithm 3.1 can be applied to the Robin problem (3.1)-(3.3) in a layerwise manner. Recall (see [55] and also references therein) that layer methods are deterministic numerical methods for PDEs which are constructed using probabilistic representations of the PDEs' solutions together with the weak-sense approximation of SDEs. Based on Algorithm 3.1, we can write a layer method, which is a deterministic method with fictitious nodes, for (3.1)-(3.3). In the one-dimensional case such a layer method was proposed in [57] (see also [55, Section 8.4.4]). It was applied to a one-dimensional semilinear parabolic PDE with Neumann boundary condition. Thanks to the results of our paper, we now have a probabilistic representation of that layer method from [57] and hence can prove its global order of convergence. Furthermore, based on Algorithm 3.1, we can construct a layer method for a multi-dimensional semilinear Neumann problem which is simpler than the layer method proposed in [57] (see also [55, Section 8.5]) for the multi-dimensional case, which used the weak approximation from [52]. This connection of Algorithms 2.1 and 3.1 with layer methods also provides further intuition for the approximation of X(t) near the boundary: from the weak-sense perspective, we replace the directional derivative in the Robin boundary condition with a central finite difference using a fictitious node outside the domain.
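A one-dimensional sketch of a chain carrying (X_k, Y_k, Z_k) in the spirit of Algorithm 3.1 can be written as follows. This is our own illustration on the interval G = (0, 1); the exact form of the (Y, Z) updates used here (Euler increments h·c·Y, h·g·Y inside the domain and local-time increments 2r·γ·Y, 2r·ψ·Y at a reflection, following the reading that 2r plays the role of dL) is our plausible rendering of (3.11)-(3.14), not a verbatim transcription:

```python
import numpy as np

def robin_chain(x0, T, h, b, sig, c, g, gamma, psi, seed=0):
    """Simulate one path of (X, Y, Z) on G = (0, 1).
    Reflection is the 1D symmetrized one: an auxiliary point outside
    [0, 1] is mirrored about the nearest endpoint; 2*r approximates dL."""
    rng = np.random.default_rng(seed)
    n = int(round(T / h))
    x, y, z = x0, 1.0, 0.0
    for k in range(n):
        t = k * h
        xi = rng.choice([-1.0, 1.0])
        x_t = x + h * b(t, x) + np.sqrt(h) * sig(t, x) * xi    # auxiliary step
        y_t = y + h * c(t, x) * y                              # Euler update for Y
        z_t = z + h * g(t, x) * y                              # Euler update for Z
        if 0.0 <= x_t <= 1.0:
            x, y, z = x_t, y_t, z_t
        else:
            x_pi = 0.0 if x_t < 0.0 else 1.0                   # boundary projection
            r = abs(x_t - x_pi)
            x = x_t + 2.0 * r * (1.0 if x_pi == 0.0 else -1.0) # symmetric reflection
            y = y_t + 2.0 * r * gamma(t + h, x_pi) * y_t       # local time ~ 2r
            z = z_t + 2.0 * r * psi(t + h, x_pi) * y_t
    return x, y, z
```

The Monte Carlo estimate of u(t_0, x) is then the sample mean of φ(X_N)·Y_N + Z_N over independent paths. As a sanity check, with zero drift and diffusion and constant c the Y-component reproduces (1 + hc)^N ≈ e^{cT}.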
3.3. Finite-time convergence theorem. We state the main theorem of this section which gives the estimate for the weak-sense error of Algorithm 3.1 at finite time.
THEOREM 3.1. Under Assumptions 2.1-2.2 and 3.1-3.4, the following estimate holds for Algorithm 3.1:
|E[φ(X_N)Y_N + Z_N] − u(t_0, x)| ≤ Ch,
where u(t, x) is the solution of (3.1)-(3.3) and C is a positive constant independent of h.
We will prove this theorem in the next subsection. The scheme of the proof is roughly as follows (cf. [52,55]). We first prove two lemmas on weak local errors of Algorithm 3.1: Lemma 3.2 gives order O(h²) for the one-step approximation of the intermediate step X̃_{k+1} (i.e., of the Euler approximation) and Lemma 3.3 gives local order O(h^{3/2}) for X_{k+1} when X̃_{k+1} goes outside Ḡ. The number of steps with X̃_{k+1} ∈ Ḡ is obviously O(1/h). The sense of Lemma 3.4 is that the average number of steps with X̃_{k+1} ∉ Ḡ is O(1/√h). Appropriately combining the three lemmas, we get the first-order convergence stated in the theorem.
3.4. Proof of Theorem 3.1. This subsection is devoted to analysis of the error incurred while numerically solving the Robin problem (3.1)-(3.3) using Algorithm 3.1 and the probabilistic representation (3.7). In Subsection 3.4.1 we prove two lemmas regarding the one-step approximation corresponding to Algorithm 3.1. In Subsection 3.4.2 we prove a lemma on the average number of steps with X̃_{k+1} ∉ Ḡ. Theorem 3.1 itself is proved in Subsection 3.4.3. We introduce additional notation to be used in the future analysis. It is not difficult to see that in Algorithm 2.1, under Assumptions 2.1-2.2, r_{k+1} := dist(X̃_{k+1}, X^π_{k+1}) = O(h^{1/2}) whenever X̃_{k+1} ∈ Ḡ^c. Under Assumption 2.1, we can introduce a (d−1)-dimensional surface outside G, parallel to ∂G, whose distance r to ∂G is large enough so that for all k we have r_{k+1} ≤ r and r = O(h^{1/2}). We denote this surface as S_{−r}, and we denote the layer between the two surfaces ∂G and S_{−r} as G_{−r} and also Q_{−r} := [T_0, T) × G_{−r}.
3.4.1. One-step approximation. In this subsection, we will prove two lemmas. Lemma 3.2 estimates the error of the one-step approximation in moving from X_k at time layer t_k to X̃_{k+1} at time layer t_{k+1}, i.e. it is about the local weak error of the Euler scheme. Lemma 3.3 estimates the error of the one-step approximation at the same time layer t_{k+1} in moving from X̃_{k+1} to X_{k+1} given that X̃_{k+1} goes outside Ḡ.
We will use the following result. It is known [29, Proposition 1.17] that under Assumption 2.1 the solution u(t, x) ∈ C^{2,4}(Q̄) can be extended to a function u(t, x) ∈ C^{2,4}(Q̄ ∪ Q̄_{−r}). This extension of u(t, x) and its derivatives will be used in proofs where we need to expand u(t, x) around x^π ∈ ∂G when x ∈ Ḡ^c. We emphasize that u(t, x) and its derivatives are uniformly bounded for (t, x) ∈ Q̄ ∪ Q̄_{−r}.
LEMMA 3.2. Under Assumptions 2.1-2.2 and 3.1-3.4, the one-step error of Algorithm 3.1 associated with moving from X_k to X̃_{k+1} is of order O(h²), with a constant independent of h.
PROOF. This is a standard result (see e.g. [55]) for the weak Euler approximation (2.1), (3.11)-(3.12) used in Algorithm 3.1, but we include its proof here as we will refer to it later (in Section 4), where we will need to take into account the dependence of the error on time.
LEMMA 3.3. Under Assumptions 2.1-2.2 and 3.1-3.4, the one-step error of Algorithm 3.1 associated with moving, on the time layer t_{k+1}, from the intermediate value (X̃_{k+1}, Ỹ_{k+1}, Z̃_{k+1}) (with Ỹ_{k+1}, Z̃_{k+1} given by the Euler updates (3.11)-(3.12)) to (X_{k+1}, Y_{k+1}, Z_{k+1}) is estimated as
|E[u(t_{k+1}, X_{k+1})Y_{k+1} + Z_{k+1} − u(t_{k+1}, X̃_{k+1})Ỹ_{k+1} − Z̃_{k+1} | X̃_{k+1}, Y_k, Z_k]| ≤ C h r_{k+1}, (3.16)
where C is a positive constant independent of h.
PROOF. It is obvious that the left-hand side of (3.16) is equal to 0 when X̃_{k+1} ∈ Ḡ.

3.4.2. Lemma on the number of steps when X̃_k ∉ Ḡ. Consider the Markov chain (t_k, X̃_k) generated by Algorithm 3.1. Recall that X̃_k can take values outside Ḡ. If X̃_k ∈ Ḡ^c, to calculate X̃_{k+1} we first determine X_k according to (2.3) and then simulate ξ_{k+1} to find X̃_{k+1} as in (2.1).
Let P_h = P be the one-step transition operator for the Markov chain (t_k, X̃_k), k = 0, . . . , N:
P U(t_k, x) = E[U(t_{k+1}, X̃_{k+1}) | X̃_k = x].
Consider the boundary value problem associated with the Markov chain (t_k, X̃_k):
q(x) P U(t, x) − U(t, x) = −g(t, x), t < T, (3.19)
U(T, x) = 0, (3.20)
where g(t, x) ≥ 0 and q(x) > 0. The solution to this problem starting from (t, x) = (t_k, x) is given by [73,53,55]:
U(t_k, x) = E Σ_{i=k}^{N−1} g(t_i, X̃_i) Π_{j=k}^{i−1} q(X̃_j), (3.21)
where the empty product is equal to 1. The next lemma is related to an estimate of the number of steps which the Markov chain X̃_k spends in the layer G_{−r} that lies outside Ḡ. It is used in proving the main convergence theorem (Theorem 3.1).
LEMMA 3.4. Under Assumptions 2.1-2.2, for any constant K > 0 and for sufficiently small h, the following inequality holds:
E Σ_{k=1}^{N} r_0(X̃_k) I_{G_{−r}}(X̃_k) Π_{j=1}^{k−1} (1 + K r_0(X̃_j) I_{G_{−r}}(X̃_j)) ≤ C,
where r_0(x) denotes the distance of x from the boundary ∂G and C is a positive constant independent of h.
PROOF. Recall that r_0(x) is the distance of x from the boundary ∂G. If we take g(t, x) = r_0(x)I_{G_{−r}}(x) and q(x) = 1 + K r_0(x)I_{G_{−r}}(x) in (3.19), then the solution to (3.19)-(3.20) is exactly the left-hand side of the inequality being proved. Introduce an auxiliary function f(t, x). Since Algorithm 3.1 takes steps according to (2.1)-(2.3), we can write the one-step transition explicitly. We have the following approximation when x ∈ Ḡ_l ∪ G_{−r} (cf. [52,55]), where x^π_l is the projection of x onto the surface S_l; hence we obtain an expansion with positive constants K_1 and K_2 whose choice will be discussed later in the proof.
Applying the one-step operator P to U, for t < T we get the corresponding expansion. We note that if x as well as X_1 belong to Ḡ_l ∪ Ḡ_{−r}, then due to (3.25) and (3.26) we ascertain the required bound. Thereafter, we calculate (1 + Kr_0(x)I_{G_{−r}}(x))P U(t, x) − U(t, x) at points (t, x) lying in different regions, identified by the four cases discussed below, with the aim of finding g(t, x) satisfying the inequality (3.22). Introduce the region S_h = {x : dist(x, S_l) < K_3 h^{1/2}}, where K_3 is chosen so that for x ∈ G_l\S_h, in a one-step transition, none of the 2^d realizations of X_1 can cross S_l. We also note that for x ∈ S_h, in a one-step transition X_1 may or may not cross the surface S_l. In Case 1, we discuss the scenario when x ∈ Ḡ_{−r}. In Case 2, we choose x ∈ G_l\S_h so that X_1 cannot cross S_l and remains in G_l ∪ G_{−r}. In Case 3, we take x ∈ Ḡ\(Ḡ_l ∪ S_h) so that all realizations of X_1 also belong to Ḡ\Ḡ_l. We examine the scenario when x ∈ S_h in Case 4.
In this case ∆X = δ, where δ = b(t, x)h + σ(t, x)h^{1/2}ξ. Using (3.24) and (3.28), we obtain an estimate in which |O(h^k)| ≤ Ch^k with C > 0 independent of h, K_1 and K_2; β_1(K_2) is a function of K_2, and β_2(K_1, K_2) is a function of K_1 and K_2 (note that the functions β_i here are different from the ones in Case 1).
Here |O(h^k)| ≤ Ch^k with C > 0 independent of h, K_1 and K_2, and β_2(K_1, K_2) is a function of K_1 and K_2.
Now we first analyse Case 1. We take K_2 > (K + 1)/(4l), which ensures that term 1 is always greater than r_0(x). Then we choose K_1 so that not only is term 2 in Case 1 positive but also f(t, x) is positive in Case 2 as well as in Case 4. It is evident from the second and fourth cases that such a choice of K_1 depends on K_2. Case 3 trivially satisfies the required condition on f(t, x), and the lemma is proved.
COROLLARY 3.1. Under Assumptions 2.1-2.2, for any constant K > 0 and for sufficiently small h, the following inequality holds:
E Σ_{k=1}^{N} I_{G_{−r}}(X̃_k) ≤ C/√h, (3.29)
where C is a positive constant independent of h.
PROOF. Again consider the boundary value problem (3.19)-(3.20) related to the Markov chain (t_k, X̃_k) with the solution given by (3.21). If we choose g(t, x) = 1 and q(x) = 1 + r_0(x)I_{Ḡ_{−r}}(x), then the solution of the problem is the expected number of steps spent in G_{−r}, weighted as in (3.21). Keeping this in mind, one can easily check that if we take g(t, x) = f(t, x)/h, where f(t, x) is constructed according to the cases discussed in Lemma 3.4, then we obtain the bound (3.30), which together with the preceding observation yields (3.29).

3.4.3. Convergence theorem. In Lemma 3.2 we have shown that the order of the one-step approximation in moving from (t_k, X_k, Y_k, Z_k) to (t_{k+1}, X̃_{k+1}, Ỹ_{k+1}, Z̃_{k+1}) is O(h²), and in Lemma 3.3 we have obtained the order O(hr_{k+1}) of the one-step approximation in the case X̃_{k+1} ∉ Ḡ. Now, we combine these two lemmas along with Lemma 3.4 and Corollary 3.1 to obtain the weak order of convergence of Algorithm 3.1.
Now we apply Lemma 3.2 and Lemma 3.3 to estimate the global error. From (3.11) and (3.13), using the uniform boundedness of γ(t, x) and c(t, x), (t, x) ∈ Q̄, and recalling that r_{k+1} = O(h^{1/2}), we obtain bounds with some positive constants C_1 and C_2. Then, substituting (3.34) and applying Lemma 3.4 together with Corollary 3.1, we arrive at the statement of the theorem.
REMARK 3.1. Assumption 3.3 can be replaced by an appropriate hypoellipticity condition, but we do not pursue such a refinement of our results here. We also note that Milstein's algorithm [52] (see also [55]) does intrinsically require the strong ellipticity (i.e., Assumption 3.3) as its construction rests on changing local coordinates near the boundary.
REMARK 3.2. The smoothness requirements on the data of the problem can be relaxed [40,34,48]. Under these relaxed conditions, we can also prove the result of Theorem 3.1 (cf. Lemmas 4.9 and 4.11).
4. Computing ergodic limits. In Subsection 4.1, we set the scene required for introducing and analyzing time-averaging and ensemble-averaging estimators for computing ergodic limits. In Subsection 4.2, we introduce continuous time-averaging and ensemble-averaging estimators, and in Subsection 4.3, we present their numerical counterparts, which is computationally the main part of this section. Subsections 4.4 and 4.5 are devoted to error analysis of numerical time-averaging and ensemble-averaging estimators, respectively. Subsection 4.6 addresses the question of how close the stationary measure µ of the solution X(t) of the RSDEs (4.7) and a stationary measure µ_h of the Markov chain (X_k)_{k≥0} (constructed according to Algorithm 2.1) are.
Here we consider computing ergodic limits using the simple-to-implement Algorithm 2.1. At the same time, we note that for this purpose one can also use Milstein's algorithm [52] (see also [55,Chapter 6]). As we mentioned in the introduction, weak first-order convergence of this algorithm at finite time was proved in [52]. The theoretical results of this section on computing ergodic limits by Algorithm 2.1 can be transferred without any additional ideas required to the use of Milstein's algorithm for computing ergodic limits (see also Remark 3.1).

4.1. Ergodic RSDEs and Poisson PDE.
This subsection is divided into four parts. As mentioned in the introduction, the main tool for obtaining continuous time-averaging estimators and for the error analysis of numerical time-averaging estimators is the Poisson equation with Neumann boundary condition. We discuss the existence and uniqueness of its solution in Subsection 4.1.1. We consider ergodic limits with respect to the invariant density of the solution X(t) of RSDEs in a bounded domain G in Subsection 4.1.2 and integrals with respect to the normalised restriction of the invariant density of X(t) on the boundary ∂G in Subsection 4.1.3. We showcase our methodology by demonstrating how to sample from a given measure on G or on ∂G using Brownian dynamics (in other words, a stochastic gradient system, also called Langevin equations in statistics and machine learning) with reflection on the boundary in Subsection 4.1.4.

4.1.1. Poisson PDE with Neumann boundary condition. Consider the Neumann problem for the Poisson equation
We will need the following assumptions in addition to Assumption 2.1.

4.1.2. Ergodic limits in G. Consider the RSDEs
If Assumptions 2.1, 4.1 and 4.2 hold, then there exists a unique invariant probability measure of the process X(t) governed by the RSDEs (4.7). Moreover, this measure is absolutely continuous with respect to Lebesgue measure. We denote this invariant measure by µ and its density by ρ(x), x ∈ Ḡ, and we note that ρ(x) > 0 and ρ ∈ C²(Ḡ) (see [8] for more details). Indeed, ρ(x) is the solution of the stationary Fokker-Planck equation (4.3), and the restriction ρ(z), z ∈ ∂G, of the invariant density ρ(x) satisfies the boundary condition (4.4). We are interested in computing ergodic limits inside the domain, i.e., in calculating the following integral for some function ϕ ∈ C²(Ḡ): We consider the corresponding approximations in Subsection 4.3.

4.1.3. Ergodic limits on ∂G.
There are two aspects related to computing ergodic limits on the boundary ∂G using RSDEs as explained below.
(i) Consider the (restricted) function ρ(z) defined on ∂G. We denote κ := ∫_{∂G} ρ(z)dz and define ρ̂(z) := ρ(z)/κ. One can notice that ρ̂(z) is a probability density on the boundary ∂G. We can calculate, for some function ψ ∈ C³(∂G), the ergodic limit ψ̂ = ∫_{∂G} ψ(z)ρ̂(z)dz. To find ψ̂, we first need to calculate κ. (ii) Using L(t) as a random time change allows us to build a Markov process on the boundary ∂G. We obtain this process X̃(t) by putting X̃(t) = X(L⁻¹(t)), where L⁻¹(t) is the right-continuous inverse of L(t). Note that lim_{t→∞} L(t) = ∞ (cf. (4.29)). Denote by F̃ and F̃_t the smallest sigma-algebras which make {X̃(t), 0 ≤ t < ∞} and {X̃(s), 0 ≤ s ≤ t} measurable, respectively. Further, let P̃ be the probability measure defined by P̃(B) = P(X(L⁻¹(·)) ∈ B), where B ∈ F̃ (see [67]), where f is a bounded measurable function on ∂G, x ∈ ∂G, and Ẽ(·) is expectation with respect to the probability measure P̃. Under Assumptions 2.1 and 4.1-4.2, X̃(t) has a unique stationary distribution on ∂G [27, p. 174], which we denote by µ̃ and whose density we denote by ρ̃. Hence, computing ergodic limits on the boundary ∂G also means evaluating the integral ∫_{∂G} ψ(z)ρ̃(z)dz, which is the expectation with respect to the invariant law of X̃(t). We will discuss the relationship between the densities ρ̃(z) and ρ(z) in Subsection 4.2.1.
We consider approximations of ψ̂ and ψ̃ in Subsection 4.3.

4.1.4. Gradient system for sampling from a given measure with compact support. This subsection highlights one of the major applications of this paper. Consider the simplified RSDEs dX(s) = b(X(s))ds + σdW(s) + ν(X(s))I_{∂G}(X(s))dL(s), (4.14) where σ > 0 is a constant.
Under Assumptions 2.1 and 4.2, the above RSDEs have a stationary distribution whose density ρ(x) solves the following stationary Fokker-Planck equation with boundary condition (note that b̃(z) reduces to b(z) in (4.4), cf. [58, pp. 13-15]): Suppose we are given a probability density ρ ∈ C³(Ḡ), ρ(x) > 0 for x ∈ Ḡ, from which we want to sample or with respect to which we would like to compute some integrals. As we mentioned in the introduction, such problems often occur in statistics [72,3,42] and in molecular dynamics [41]. To solve this problem, we can use the RSDEs (4.14) analogously to how SDEs (Brownian dynamics and Langevin equations) are used for this task in R^d (see e.g. [65,56,41] and references therein). Indeed, if we take the drift (4.17) in (4.14), then the equations (4.15)-(4.16) are trivially satisfied. The RSDEs (4.14), (4.17) are a stochastic gradient system with reflection, or, in other words, Brownian dynamics with reflection. Long-time simulation of the gradient system (4.14), (4.17) can be used for sampling from a given distribution having density ρ(x) with compact support Ḡ.
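To make the reflection mechanism concrete, here is a minimal weak-sense sketch of Brownian dynamics with reflection, written for illustration only and not reproducing Algorithm 2.1 verbatim: we assume G is the unit disk with normal reflection, Bernoulli (±1) increments, and drift b = (σ²/2)∇ln ρ for the Gaussian-type density ρ(x) ∝ e^{−|x|²}; the symmetric reflection step places the chain inside Ḡ at the same distance r₀ from the boundary as the overshoot.

```python
import math
import random

def reflect_into_disk(x):
    """Symmetric reflection for the unit disk: a point that lands outside
    the closed disk is replaced by the point at the same distance r0
    inside the boundary along the inward normal."""
    nrm = math.hypot(x[0], x[1])
    if nrm <= 1.0:
        return x
    r0 = nrm - 1.0                  # overshoot distance past the boundary
    scale = (1.0 - r0) / nrm        # boundary projection is x/nrm; go r0 inward
    return (x[0] * scale, x[1] * scale)

def reflected_walk(n_steps, h, sigma=1.0, seed=0):
    """Weak Euler step with +/-1 (Bernoulli) increments plus symmetric
    reflection; the drift b(x) = -sigma**2 * x corresponds to
    rho(x) ~ exp(-|x|^2) via b = (sigma**2/2) * grad log rho."""
    rng = random.Random(seed)
    x = (0.0, 0.0)
    path = [x]
    s = sigma * math.sqrt(h)
    for _ in range(n_steps):
        y = (x[0] - sigma ** 2 * x[0] * h + s * rng.choice((-1.0, 1.0)),
             x[1] - sigma ** 2 * x[1] * h + s * rng.choice((-1.0, 1.0)))
        x = reflect_into_disk(y)
        path.append(x)
    return path

path = reflected_walk(5000, h=0.01)
```

Long-time averages along `path` then approximate integrals against the invariant density, in the spirit of the estimators of Subsection 4.3.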
As we have emphasized in the introduction, using the methodology developed in this paper we can also sample from the distribution having density ρ̂(z) with support on the boundary ∂G. To illustrate the normalised restricted density ρ̂(z), let us look at a simple example. Consider G = {x₁² + x₂² < 1}, ∂G = {z₁² + z₂² = 1}, and the density function ρ(x) proportional to e^{βx₁}, where I₀(β) is the modified Bessel function of order 0. Then the normalised restriction ρ̂(z) is given by ρ̂(z) = e^{βz₁}/(2πI₀(β)), z ∈ ∂G. Writing ρ̂(z) in polar coordinates gives ρ̂(θ) = e^{β cos θ}/(2πI₀(β)), θ ∈ [0, 2π), which is the probability density function of the von Mises distribution. The motivation for considering this particular example lies in the fact that this distribution has a number of applications in directional statistics (see e.g. [49]). Algorithm 2.1 described in Section 2, as well as the estimators of Subsection 4.3, can be used to sample from the von Mises distribution. The same approach is applicable to the von Mises-Fisher, Bingham and Kent distributions, or to any other distribution on a (d−1)-dimensional hypersurface ∂G (see a related numerical experiment, Experiment 7.3, in Subsection 7.2); such distributions are widely used in bioinformatics, computer vision, geology and astrophysics (see, e.g., [50]). To conclude, the application of Algorithm 2.1 to the gradient system with reflection (4.14), (4.17) allows us to efficiently sample from any given distribution with density ρ(x) defined on G and from its normalised restriction ρ̂(z) on ∂G.
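As a quick sanity check of this example (our own numerical check; I₀ is evaluated via its standard power series), one can verify that the normalised restriction e^{β cos θ}/(2πI₀(β)) integrates to one over [0, 2π):

```python
import math

def bessel_i0(beta, terms=60):
    """Modified Bessel function of order 0 via its power series
    I0(beta) = sum_{k>=0} (beta/2)**(2k) / (k!)**2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (beta / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def von_mises_density(theta, beta):
    """Normalised restriction in polar coordinates:
    exp(beta*cos(theta)) / (2*pi*I0(beta))."""
    return math.exp(beta * math.cos(theta)) / (2.0 * math.pi * bessel_i0(beta))

def trapezoid(f, a, b, n=2000):
    """Composite trapezoidal rule on [a, b] with n subintervals."""
    step = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * step) for i in range(1, n))
    return s * step

mass = trapezoid(lambda t: von_mises_density(t, beta=2.5), 0.0, 2.0 * math.pi)
```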

4.2.1. Time-averaging estimators. Time-averaging estimator for φ̄ from (4.8). Consider the Neumann problem It is not difficult to verify that the compatibility condition (4.6) is satisfied for this problem. Applying Itô's formula to u(x) along the solution X(t) of the RSDEs (4.7), rearranging the terms in (4.20), dividing by t, and using equation (4.19), we get where M(t) = ∫₀ᵗ ∇u(X(s)) · σ(X(s))dW(s) is a martingale with respect to (F_t)_{t≥0}. Then (4.18) implies We know from Itô's isometry that EM²(t) ≤ Ct, where C is some positive constant independent of t. As a consequence of the uniform boundedness of u(x) on Ḡ, we obtain Kronecker's lemma [23, Theorem 3.3] implies M(t)/t → 0 a.s. as t → ∞. Again, uniform boundedness of u(x), x ∈ Ḡ, yields that the last term on the right-hand side of (4.22) goes to 0 as t → ∞. Therefore, we have If we take expectations on both sides of (4.22), we get Consequently, it is natural to take (1/T)∫₀ᵀ ϕ(X(s))ds as a time-averaging estimator of φ̄. Based on (4.25), we can also say that One may notice that by combining (4.23) and the above bound on the bias, we get Time-averaging estimator for ψ̂ from (4.10). To obtain results related to ergodic limits on the boundary, we will consider the PDE problem (4.1)-(4.2) with φ₂(z) = 1/α(z) or (ψ(z) − ψ̂)/α(z), where α(z) depends on a(z) (a(z) being the restriction of a(x) to ∂G); hence we need a new assumption on a(x) to ensure that φ₂ ∈ C³(∂G) and therefore to guarantee that the solution u ∈ C⁴(Ḡ) (cf. Assumption 4.3).
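The construction above can be illustrated on a toy example of our own (not one of the paper's test problems): reflected Brownian motion on G = (0, 1), whose invariant law is uniform, so the ergodic limit of ϕ(x) = x² is 1/3. For simplicity we use Gaussian increments and symmetric reflection at both endpoints, and replace (1/T)∫₀ᵀ ϕ(X(s))ds by the discrete average (1/N)Σ_k ϕ(X_k).

```python
import random

def reflect_unit_interval(x):
    """Fold x back into [0, 1] (symmetric reflection at 0 and 1)."""
    while x < 0.0 or x > 1.0:
        x = -x if x < 0.0 else 2.0 - x
    return x

def time_average(phi, n_steps, h, seed=1):
    """Discrete time-averaging estimator (1/N) * sum phi(X_k) for reflected
    Brownian motion on [0, 1], started at X_0 = 0.5."""
    rng = random.Random(seed)
    x, acc = 0.5, 0.0
    sqh = h ** 0.5
    for _ in range(n_steps):
        x = reflect_unit_interval(x + sqh * rng.gauss(0.0, 1.0))
        acc += phi(x)
    return acc / n_steps

# ergodic limit: the integral of x^2 against the uniform law on [0, 1] is 1/3
est = time_average(lambda x: x * x, n_steps=200_000, h=0.01)
```

The total error here combines the O(h) discretization bias with the O(1/T) truncation and statistical errors discussed in the error analysis of this section.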

It is proved in Subsection 4.4 (Theorems 4.2 and 4.7 and Lemma 4.4) that the bias of all the above numerical time-averaging estimators is O(h + 1/T) and that the second moment of the error of the estimators (Theorems 4.3 and 4.6 and Lemma 4.5) is O(h² + 1/T).
Further, we take the expectation E(ϕ_N) as a discretized ensemble-averaging estimator for computing φ̄. For calculating ψ̂, we take the following as discretized ensemble-averaging estimators (they involve the indicators I_{Ḡᶜ}(X_{k+1}) and I_{(0,∞)}(κ_N)): In Subsection 4.5 we prove (Theorem 4.10) that the error of the estimator E(ϕ_N) is O(h + e^{−λT}) for some λ > 0, and (Theorem 4.12) that the error of the estimator (4.55) for ψ̂ is O(h + 1/T). The upper index of the sums in (4.55) is N − 2 due to the error analysis (see Remark 4.5). The error of (4.56) is the same as the bias of (4.54). The continuous time-averaging estimator (1/L(t))∫₀ᵗ ψ(X(s))dL(s) was introduced to calculate the expectation with respect to the invariant law of X̃(t), i.e. ψ̃. To approximate ψ̃, we take the discrete time-averaging estimator ψ̃_N and discretized ensemble-averaging estimators as follows:
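For contrast, the ensemble-averaging counterpart in the same toy 1-d setting (again our illustration, not the paper's algorithm): instead of one long trajectory, run M independent chains up to a moderate time T and average ϕ(X_N) across the ensemble; the bias in T then decays like e^{−λT} rather than 1/T.

```python
import random

def chain_endpoint(n_steps, h, rng):
    """Endpoint X_N of one reflected-Brownian-motion chain on [0, 1],
    started at X_0 = 0.5, with symmetric reflection at both endpoints."""
    x = 0.5
    sqh = h ** 0.5
    for _ in range(n_steps):
        x += sqh * rng.gauss(0.0, 1.0)
        while x < 0.0 or x > 1.0:
            x = -x if x < 0.0 else 2.0 - x
    return x

def ensemble_average(phi, m_chains, n_steps, h, seed=2):
    """Discretized ensemble-averaging estimator (1/M) * sum_m phi(X_N^(m))."""
    rng = random.Random(seed)
    return sum(phi(chain_endpoint(n_steps, h, rng))
               for _ in range(m_chains)) / m_chains

# invariant law is uniform on [0, 1], so E phi(X_N) tends to 1/3 for phi(x) = x^2
est = ensemble_average(lambda x: x * x, m_chains=1000, n_steps=500, h=0.01)
```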

4.4. Error analysis for numerical time-averaging estimators. In this subsection our aim is to establish the closeness of the numerical time-averaging estimators, obtained through the approximation of the RSDEs (4.7) by Algorithm 2.1, to their corresponding ergodic limits. Our approach to the error analysis of numerical time-averaging estimators is analogous to the one used in [51] in the case of the usual SDEs.
For brevity, we write , where X_k and X^π_k are as introduced in Algorithm 2.1. From Proposition 1.17 in [29], it is known that under Assumption 2.1 the solution u ∈ C⁴(Ḡ) can be extended to a function in C⁴(Ḡ ∪ Ḡ−r), where G−r was introduced at the beginning of Subsection 3.4. This also implies that u(x) and its derivatives up to fourth order are uniformly bounded on Ḡ ∪ Ḡ−r.
The next lemma concerns an estimate of the number of steps which the Markov chain X_k spends in the layer G−r. We note that, in comparison with the analogous Lemma 3.4, the estimate in the lemma below has an explicit (linear) dependence on the time T, which is important for the error analysis of the numerical time-averaging estimators introduced in the previous subsection.
and with an appropriate choice of K. Here w(x) is from (3.23) and l is the distance between S l and ∂G.
We first consider estimates for the bias of the estimator φ̄_N (see (4.51)) and for the second moment of its error, followed by the respective error estimates for κ_N, ψ_N and ψ̂_N (see (4.52)-(4.54)).
where φ̄_N is from (4.51), φ̄ is from (4.8), and C > 0 is independent of T and h.
where C is a positive constant independent of T and h.
PROOF. The proof consists of two steps. We first square both sides of (4.65) and use the estimates (4.62)-(4.63). The second step is to apply the following inequality, obtained by using (4.59) and (4.61): where R_{k+1} := R_{5,k+1} + R_{6,k+1} and C is a positive constant independent of T and h. The desired result then follows from Lemma 4.1 by combining the two steps above. Assumption 4.2 suffices for proving the previous two theorems. However, in the subsequent lemmas and theorems we will need the stronger Assumption 4.2′.
Now we proceed to error analysis for time-averaging estimators related to ergodic limits on the boundary. The next lemma is proved analogously to Theorem 4.2.
where κ_N is from (4.52), κ is from (4.9), ψ_N is from (4.53), ψ̄ is from (4.11), and C is a positive constant independent of T and h.
The next lemma is proved analogously to Theorem 4.3.
where C is a positive constant independent of T and h.
Using Chebyshev's inequality and (4.68), we have where C > 0 is independent of T and h.
where ψ̂_N is from (4.54), ψ̂ is from (4.10), and C is a positive constant independent of T and h.

By the Cauchy−Bunyakovsky−Schwarz inequality, we get
where C > 0 is independent of T and h.

4.5. Error analysis for numerical ensemble-averaging estimators. In this subsection our aim is to estimate the errors when we approximate the stationary averages φ̄ and ψ̂ using the discretized ensemble-averaging estimators introduced in Subsection 4.3. We split our discussion into two parts. First, we consider the error of the ensemble-averaging estimator for φ̄; the proof exploits the backward Kolmogorov (parabolic) equation (cf. the similar approach in [69] for the case of SDEs in R^d). In the second part, we estimate the error of the numerical ensemble-averaging estimator for ψ̂. We take t₀ = 0, therefore Q = [0, T) × G.
Consider the parabolic equation with non-zero terminal condition and homogeneous boundary condition:  [40,34]. Further, u(t, x) ∈ C(Q) [48]. The probabilistic representation of the solution is given by (see Section 3.1): where X(s) is the solution of the RSDEs (4.7). Introduce, for t ∈ [0, T), the norm where ∂ʲ = ∂^{|j|}/∂x₁^{j₁}···∂x_d^{j_d} and j is a multi-index. We make the following natural assumption on the solution of the Neumann problem (4.73)-(4.75). A bound of the form (4.76), but with the supremum over G, is given in Theorem 4.1 of [33] under Assumptions 2.1, 4.1 and 4.2. We also mention that such a decaying bound for the gradient of the solution was proved for the heat equation with homogeneous Neumann boundary condition in [62]. A bound similar to (4.76) was proved for the Cauchy case in [69].
where C > 0 is independent of T and h. The above bounds directly follow from Lemmas 3.2 and 3.3 together with Assumption 4.4, if we note that under Assumption 4.4 the terms appearing in R_{1,k+1} in Lemma 3.2 are bounded by Ch²e^{−λ(T−t_k)} and the terms R_{3,k+1} and R_{4,k+1} in Lemma 3.3 by Cr³_{k+1}e^{−λ(T−t_{k+1})}. Next, we state an appropriate lemma related to the average number of steps which the chain X_k spends in G−r. LEMMA 4.8. Under Assumptions 2.1 and 4.2, the following inequality holds for any λ > 0: where C is a positive constant independent of T and h.
PROOF. We take U (t, x) in Lemma 3.4 as (cf. (3.27)) where u(0, x) is the solution of (4.73)-(4.75) and C > 0 is independent of T and h.
PROOF. Analogously to the proof of Theorem 3.1, we have (cf. (3.32)): then, using (4.77) and (4.78), we obtain Notice that, due to the Taylor expansion of u(t_N, ·), Using Lemma 4.8, we obtain The proof of the next theorem directly follows from combining (4.50) and (4.81).
where C is a positive constant independent of T and h.
Now we move to estimating the error of the discretized ensemble-averaging estimators corresponding to (4.55). Notice that the error bound for the discretized ensemble-averaging estimator (4.56) has already been proved in Theorem 4.7.
Consider the parabolic equation with zero terminal condition and non-zero Neumann boundary condition:  [40,34]. Further, u(t, x) ∈ C(Q) [48]. The probabilistic representation of the solution is given by (cf. (3.7)): where X(s) is the solution of the RSDEs (4.7). We make the following natural assumption for the solution of (4.83)-(4.85).
ASSUMPTION 4.5. The following inequality holds uniformly for (t, x) ∈ [0, T ) ×Ḡ: where C is a positive constant independent of T .
With respect to Assumption 4.5, we note that the solution of the problem (4.83)-(4.85) converges, as T goes to infinity, to the solution of a stationary problem [28,45] (cf. the case of the problem (4.73)-(4.75), where the solution converges to a constant). Also note that the solution u(t, x) can be expanded asymptotically in powers of 1/(T − t) (see Theorem 3 and Section 7 in [28] for more details).
where u(0, x) is from (4.86) and C is a positive constant independent of T and h.
Then, using (4.87) and (4.66), we ascertain the stated bound. REMARK 4.6. For the discretized ensemble-averaging estimators introduced in Remark 4.3 for ψ̃, we can get the following error bound in the same manner as in Theorem 4.12 for ψ̂, where C > 0 is independent of T and h.

4.6. Error analysis for a stationary measure of Algorithm 2.1. We can express Algorithm 2.1 (its part for X_k) as the following recursive formula: where r₀ = dist(F(x, ξ), ∂G), ν(F(x, ξ)) = ν(F_π(x, ξ)), and F_π(x, ξ) is the projection of F(x, ξ) on ∂G, for all realizations of ξ, x ∈ Ḡ, and F(x, ξ) ∈ Ḡ ∪ Ḡ−r. Under Assumptions 2.1 and 4.2, F(x, ξ), F_π(x, ξ) and ν(F(x, ξ)) are continuous functions on Ḡ for every realization of ξ. Note that for y = F(x, ξ) ∈ Ḡ ∪ Ḡ−r, I_{Ḡᶜ}(y) is discontinuous, but r₀(y)I_{Ḡᶜ}(y) is continuous on Ḡ ∪ Ḡ−r provided Assumption 2.1 holds. This implies that H(x, ξ) ∈ Ḡ is a continuous function for all x ∈ Ḡ and for every realization of ξ. Consider a function g ∈ C(Ḡ), where C(Ḡ) denotes the class of continuous functions on Ḡ, and take a sequence {x_n}_{n∈N} such that x_n → x as n → ∞. From the above discussion, it is clear that g(H(x_n, ξ)) → g(H(x, ξ)) a.s. as n → ∞. Using the bounded convergence theorem, we obtain Eg(H(x_n, ξ)) → Eg(H(x, ξ)) as n → ∞. This shows that Pg(x) ∈ C(Ḡ), where Pg(x) is the one-step transition operator defined as Pg(x) = E(g(X₁)|X₀ = x). Hence (X_k)_{k≥0} is a Feller chain [61]. Since the state space Ḡ is compact, it follows from the Krylov-Bogoliubov theorem [20] that there exists a stationary measure µ_h of X_k. We note that µ_h is, as a rule, not unique.
In this subsection, our focus is on how close µ and µ h are. To this end, introduce the metric D between two probability measures µ 1 and µ 2 : where H = {f :Ḡ → R and | f | where C is a positive constant independent of h.
PROOF. Denote by E_x expectation conditional on the initial point X₀ = x, where X_k is from Algorithm 2.1 and f ∈ H. Therefore, we can write

Using Theorem 4.2, we get
where f̄ is the expectation of f with respect to the invariant measure µ and (in comparison with Theorem 4.2) the constant C does not depend on f, as here we consider only f ∈ H. Letting N → ∞ in the above relation, we obtain the stated result.
5. Solving elliptic PDEs with Robin boundary condition. In Subsection 5.1, we introduce elliptic PDEs with Robin boundary condition and discuss their link to reflected diffusion processes via the Feynman-Kac formula. We use Algorithm 3.1 to numerically solve elliptic PDEs with decay. However, the same algorithm does not work when employed to solve the Poisson PDE with Neumann boundary condition (we highlight the reason for this later). To deal with this, we propose a new adaptive time-stepping algorithm in Subsection 5.2. We state the two convergence theorems for the two cases (with decay and without decay) in Subsection 5.3 and prove them in Subsection 5.4.

5.1. Probabilistic representation.
Consider the following elliptic equation with Robin boundary condition where A was introduced in Subsection 4.1.1. We make the following assumptions in addition to Assumptions 2.1, 4.1 and 4.2.

5.2. Numerical method for the Poisson equation. We will numerically solve the elliptic PDE (5.1) with Robin boundary condition (5.2) using Algorithm 3.1 and will prove its first order of convergence as T → ∞ in Subsection 5.4. However, if c(x) = 0 and γ(z) = 0, i.e. if we consider the Poisson problem (4.1)-(4.2) and use Algorithm 3.1 to solve it numerically, then the algorithm's error increases linearly with N for fixed h ∈ (0, 1), i.e., Algorithm 3.1 diverges in this case as T → ∞. Therefore, this case needs a different approach and a new algorithm. The new algorithm (Algorithm 5.1) constructed in this subsection is based on a double partitioning of the time interval [0, T]; it is convergent in h and T, while its computational cost grows, as the tolerance decreases, only slightly faster than linearly. This idea is of potential interest in other settings in stochastic numerics.
We discretize the interval [0, T] as follows, where N_j is the number of steps and h_j is the time step size in the interval (T_{j−1}, T_j]. The h_j and N_j are given by where 0 < ε ≤ β ≤ 1, h > 0 is a fixed, sufficiently small number, and Υ > 0 is a constant chosen independently of T. One can see that ∆_jT ≤ Υ/j^ε. Now we define a constant Λ as the smallest natural number for which the following inequality is satisfied: We note that Λ is well defined because ∆_jT ≥ Υ/j^ε − h_j and because of the condition 0 < ε ≤ β ≤ 1.
The total number of steps until time T j is equal to and set N 0 = 0 and N = N Λ . Note that ∆ j T is independent of h.
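The bookkeeping of the double partition can be sketched as follows. Since the exact formulas for ∆_jT and h_j are not reproduced above, we assume for illustration the extreme case ∆_jT = Υ/j^ε (the text only requires ∆_jT ≤ Υ/j^ε) and pick Λ as the smallest natural number whose coarse intervals cover [0, T], with T chosen from the error balance h = e^{−λT}; the parameter values below are hypothetical.

```python
import math

def double_partition(T, upsilon, eps):
    """Smallest Lambda such that sum_{j<=Lambda} Delta_j T >= T, where
    Delta_j T = upsilon / j**eps (an assumed instantiation)."""
    deltas, total, j = [], 0.0, 0
    while total < T:
        j += 1
        deltas.append(upsilon / j ** eps)
        total += deltas[-1]
    return j, deltas

lam, h = 1.0, 1e-4
T = math.log(1.0 / h) / lam        # error balance h = exp(-lam*T)
Lam, deltas = double_partition(T, upsilon=1.0, eps=0.1)
```

One can check numerically that Λ grows like |ln h|^{1/(1−ε)}, in line with the cost discussion at the end of Subsection 5.4.2.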

and Goto Step 3, else Stop.
Step 5: Put k := k + 1 and if k < N j then return to Step 3 else return to Step 2.
where Z N is calculated according to Algorithm 3.1 approximating the solution u(x) of (5.1)-(5.2), and C and λ are positive constants independent of T and h.
The proofs follow the same procedure as in Subsection 3.4: we first obtain error estimates for the local approximation and prove a lemma on the number of steps the Markov chain X_k spends in Ḡ−r, and then, based on these, we prove the main convergence theorems. The computational complexity of Algorithm 5.1, together with an optimal choice of ε and β, is discussed at the end of Subsection 5.4.2.
We need the following notation, where C is a positive constant independent of h and T. With the change of notation introduced at the beginning of the current subsection, we can obtain (5.12) and (5.13) by following exactly the same procedure as in the proofs of Lemmas 3.2 and 3.3, respectively. The independence of C from T follows trivially from the fact that we are dealing with an elliptic equation.
The next lemma gives a bound related to the number of steps which the chain X_k spends in Ḡᶜ. LEMMA 5.3. Under Assumptions 2.1, 4.1 and 4.2, the following bound holds for λ > 0 and sufficiently small h > 0: where C is a positive constant independent of T and h.
PROOF OF THEOREM 5.1. Using the notation introduced earlier in this section, we begin the analysis with an inequality analogous to (3.32) from the proof of Theorem 3.1; combining it with (5.12) and (5.13), we get From (3.11) and (3.13), we have Under the assumption γ(z) ≤ 0 and for sufficiently small h > 0, we have where λ = min_{x∈Ḡ}|c(x)| > 0. Then, substituting (5.18) into (5.17), we obtain I_{Ḡᶜ}(X_{k+1}) a.s., (5.21) where C is a positive constant independent of T and h.
Recall that the layer Ḡ−r was introduced in Subsection 3.4. We now state a lemma related to the number of steps of X_k in G−r. For that purpose, we first consider a surface S_{l_j} which lies in G and is parallel to the boundary ∂G. The distance between ∂G and S_{l_j} is where R is the radius of the uniform interior sphere inside G. We take R/2 in (5.22) so that a surface S_{l_j} parallel to ∂G exists even when R is not sufficiently large. We assume that h > 0 is sufficiently small so that l_j ≫ h_j^{1/2}. By virtue of Assumption 2.1, the l_j-neighbourhood of any point x ∈ S_{l_j} entirely belongs to G. The layer between S_{l_j} and ∂G will be denoted by G_{l_j}. LEMMA 5.4. Under Assumptions 2.1 and 4.2, the following inequality holds where C is a positive constant independent of ∆_jT and h.
PROOF. Let us fix j ∈ {1, . . . , Λ} and hence consider t ∈ [T_{j−1}, T_j]. Define a function U(t, x) as where w(x) is from (3.23) but with the distance taken from the surface S_{l_j}. Introduce the region S_{h_j} = {x : dist(x, S_{l_j}) < K₁h_j^{1/2}}. We choose K₁ so that, for any point x ∈ G_{l_j}\S_{h_j}, there is zero probability that in a one-step transition any of the 2^d realizations of X_{j,1} cross the surface S_{l_j}, where X_{j,1} is constructed according to Algorithm 5.1 given X_{j,0} = x. Note that the region S_{h_j} contains the points x starting from which one-step realizations of X_{j,1} may or may not cross the surface S_{l_j}. As in Lemma 3.4, we calculate PU(t, x) − U(t, x) at points (t, x) lying in the different regions identified by four cases: (I) x ∈ G\(G_{l_j} ∪ S_{h_j}), when X_{j,1} remains in G\G_{l_j}; (II) x ∈ S_{h_j}; (III) x ∈ G_{l_j}\S_{h_j}; (IV) x ∈ Ḡ−r. We analyze PU(t, x) − U(t, x) in these four cases analogously to the proof of Lemma 3.4 to obtain the desired bound.
The next lemma gives an error estimate for Algorithm 5.1 applied to (4.1)-(4.2) accumulated over an interval (T j−1 , T j ].
where C > 0 is independent of h j and ∆ j T .
PROOF. We can write Using Lemma 5.5, we get From (5.22), we know that either l_j = (∆_jT)^{1/2} or l_j = (R/2) ∧ 1. Therefore, we have Now consider the parabolic problem (4.73)-(4.75), whose solution we denote by v(t, x) instead of u(t, x) and whose terminal condition at t = T_Λ is u(x) instead of ϕ(x). In the current setting we have assumed that the solution v(t, x) satisfies the following bound, where C > 0 is independent of T_Λ. Then, analogously to the previous lemma, we can prove where C > 0 is independent of T_Λ (hence also of T) due to (5.26). Note that to obtain (5.27) on the interval (T_{Λ−1}, T_Λ], we use the expression for Eu(X_N) (cf. Lemma 4.9). Therefore, we have (cf. (5.25)) which together with (4.50) (with the appropriate change of notation) gives where ū is from (5.3), and C and λ are positive constants independent of T. Using (5.3), (5.25), (5.28) and the fact that 1 < ε/2 + β ≤ 3/2, we obtain We now discuss the cost of Algorithm 5.1. Recall that the cost of Algorithms 2.1 and 3.1, in all applications considered in this paper, is proportional to 1/h. First, let us define (cf. (5.11)) the tolerance of Algorithm 5.1 as tol := (h + e^{−λT})/2. Taking h = e^{−λT} (i.e., equal contributions from the two sources of the total error), we get tol = h and T = −ln h/λ. Further, we choose Λ to guarantee T ≤ T_Λ ≤ Σ_{j=1}^{Λ} Υ/j^ε. Let ε < 1. Then T ∼ ΥΛ^{1−ε}. Therefore, we need Λ ∼ |ln h|^{1/(1−ε)} (we dropped λ and Υ because here our only concern is how the cost grows as the tolerance tol decreases via decrease of h). The cost of Algorithm 5.1 is appropriately measured via the number of steps of a single trajectory of the algorithm, which is equal to We need to choose ε and β so that the factor β/(1−ε) in the cost is small while the constant 1/(ε/2 + β − 1) in the error (5.29) is not too big, under the constraints ε/2 + β − 1 > 0 and 0 < ε ≤ β ≤ 1. In this case it is optimal to take β = 1 and small ε. For example, ε = 1/10 gives β/(1−ε) = 10/9.
Hence, we infer that, with an appropriate choice of ε and β, the cost of Algorithm 5.1 grows only slightly faster than linearly in 1/h.
We note that instead of using the two-level partitioning of the time interval [0, T] we could use the following single partition of [0, T]: T = Σ_{k=1}^{Λ} h/k^β with 0 < β < 1. For this single partition, by arguments similar to those in the proof of Theorem 5.2, we would only prove Here it is again natural to define the tolerance tol := (h^{3/2} + e^{−λT})/2 and again to take h^{3/2} = e^{−λT}. Then the cost in terms of tol is of order 1/tol^{2/(3(1−β))}, 2/3 < β < 1, which is significantly worse than linear growth in 1/tol (e.g., for β = 2/3, we have Cost ∼ |ln tol|³/tol²). This emphasizes the importance of the idea of double partitioning of the time interval in Algorithm 5.1.
6. Extensions. In Subsection 6.1, we present an algorithm to approximate the RSDEs (3.8) with second order of weak convergence, based on adaptive time-stepping near the boundary. In Subsection 6.2, we generalise Algorithm 2.1 to approximate, in the weak sense, RSDEs whose reflection is in an inward oblique direction.
6.1. Second-order approximation. In this subsection we modify Algorithm 2.1 to construct a second-order weak approximation of the solution X(t) of the RSDEs (1.1) on an interval [T₀, T] (for simplicity, we do not include here a second-order approximation of (3.9)-(3.10)). To obtain a second-order method for (1.1), we need an approximation of weak local order 3 for SDEs without reflection, which we will use inside the domain G, and we need a more accurate approximation of the reflection than in Algorithm 2.1. To achieve the latter, we will introduce an adaptive (random) time step so that the auxiliary chain X_k belongs to Ḡ ∪ Ḡ−r with r = Θ(h), where the notation Ḡ−r was introduced in Subsection 3.4.2 and the Landau Big-Theta notation Θ(h) means that there are constants C₁, C₂ > 0 independent of h such that C₁h ≤ r ≤ C₂h. When X_k ∈ Ḡ−r, we will use the same reflection as in Algorithm 2.1, but because the layer Ḡ−r is of size Θ(h) in this subsection, the one-step error of the reflection will be O(h³). Note that in Algorithm 2.1, X_k ∈ Ḡ ∪ Ḡ−r with r = O(h^{1/2}) and the one-step error of the reflection was O(h^{3/2}).
Let us write a generic approximation of weak local order 3 in a form suitable for this section: where h > 0 is a time step, ξ is a random vector whose components are bounded, mutually independent random variables, and δ is such that for all x ∈ Ḡ, with X_{t,x}(t + h) being the solution of the SDEs (1.1) with the reflection term excluded, and f being an arbitrary sufficiently smooth function with bounded derivatives. Note that the δ and ξ in (6.1) used in this subsection are different from the δ and ξ associated with Algorithm 2.1, which have been used everywhere else in this paper except in this Subsection 6.1. There are various numerical approximations satisfying (6.1)-(6.2) (see e.g. [55] and references therein). Introduce a layer S_{t,h} ⊂ Q̄ such that if (t, x) ∈ Q̄\S_{t,h} then X from (6.1) belongs to Ḡ, and if (t, x) ∈ S_{t,h} then at least one realization ⁽ⁱ⁾X of X does not belong to Ḡ. Consequently, if (t, x) ∈ Q̄\S_{t,h}, we can use (6.1) to approximate (1.1).
If X ∈Ḡ, we set X = X , otherwise where as usual r 0 = dist(X , X π ) and X π is the projection of X on ∂G.
Having the above two ingredients, we now can construct a Markov chain X k approximating the solution of (1.1) in the weak sense, which is formulated as Algorithm 6.1.
Step 6: If τ_{k+1} ≥ T − h/2, then the random number of steps is κ = k + 1, the final state of the chain is X_κ = X_{k+1}, and STOP; else k := k + 1 and Goto Step 2.
Using the ideas introduced to study first order convergence of Algorithm 2.1 in Section 3, one can prove convergence of Algorithm 6.1 with weak order 2, i.e.
where X_{t,x}(T), T₀ ≤ t ≤ T, is the solution of (1.1) and C > 0 is a constant independent of h. It can be shown that the average number of steps of Algorithm 6.1 is O(1/h): the Markov chain (τ_k, X_k) on average spends O(1/h) steps in the domain Q̄\S_{τ_k,h_k} and also O(1/h) steps with (τ_k, X_k) ∈ S_{τ_k,h_k}. The proof of the latter exploits the rule by which θ_k is chosen in the algorithm and also the fact that, thanks to the definition of S_{t,h}, if (τ_k, X_k) ∈ S_{τ_k,h_k} then at least one realisation of X_{k+1} lies outside Ḡ.

6.2. Oblique reflection. Consider the RSDEs dX(s) = b(s, X(s))ds + σ(s, X(s))dW(s) + η(X(s))I_{∂G}(X(s))dL(s), where η is a unit vector field of class C⁴, and we assume that there exists a constant c₀ > 0 such that for all z ∈ ∂G. The existence and uniqueness of a strong solution of SDEs with reflection in an oblique direction was proved in [46] under weaker assumptions than Assumptions 2.1-2.2, but we will continue using Assumptions 2.1-2.2 to ensure first order of weak convergence of our algorithms. We note that η in this subsection is an arbitrary oblique direction, while in Section 4 η was the co-normal; this should not lead to any confusion. For applications of (6.3) see e.g. [4].
We denote by r_{k+1} the distance of X_{k+1} from the boundary ∂G along the direction η(X^π_{k+1}), i.e. r_{k+1} = dist(X_{k+1}, X^π_{k+1}). If X_{k+1} goes outside Ḡ, the following symmetric step is taken: This algorithm is presented in formalized form as Algorithm 6.2. One can prove its first order of weak convergence analogously to the proofs for Algorithm 3.1 considered in earlier sections.
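The symmetric oblique step admits a closed form in the half-space G = {x₂ > 0} (an illustrative choice of ours; the paper works with bounded domains): the ray from an outside point X_{k+1} in the direction η meets ∂G at X^π_{k+1}, and the chain is placed at the point symmetric to X_{k+1} about X^π_{k+1} along η, namely 2X^π_{k+1} − X_{k+1}.

```python
def oblique_symmetric_step(x, eta):
    """Symmetric reflection step along the oblique unit direction eta for the
    half-space {x2 > 0}; eta must satisfy eta[1] > 0 (points into the domain).
    Returns x unchanged if it already lies in the closed half-space."""
    if x[1] >= 0.0:
        return x
    t = -x[1] / eta[1]                  # ray x + t*eta meets {x2 = 0} at t > 0
    z = (x[0] + t * eta[0], 0.0)        # projection of x on the boundary along eta
    # point symmetric to x about z along eta, at the same distance r_{k+1}
    return (2.0 * z[0] - x[0], 2.0 * z[1] - x[1])

new_state = oblique_symmetric_step((1.0, -0.4), (0.6, 0.8))
```

With η = ν (the normal), the step reduces to the symmetric normal reflection used in Algorithm 2.1.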
Step 4: If k + 1 = N then stop, else put k := k + 1 and return to Step 2.
7. Numerical experiments. In this section we perform numerical experiments to support the theoretical results obtained in Sections 3-5.
To evaluate the expectation EΓ, where Γ is a generic random variable with some finite moments, we use the Monte Carlo technique in the usual fashion: The 95% confidence interval for EΓ is Γ̄_M ± 2(D̄_M/M)^{1/2}.
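In code, the estimator and its confidence interval read as follows (a sketch with a toy Γ = ξ², ξ ~ N(0, 1), so that EΓ = 1; D̄_M denotes the sample variance):

```python
import random

def monte_carlo_ci(sampler, m, seed=3):
    """Sample mean, sample variance and the 95% half-width 2*sqrt(D/M)."""
    rng = random.Random(seed)
    vals = [sampler(rng) for _ in range(m)]
    mean = sum(vals) / m
    var = sum((v - mean) ** 2 for v in vals) / (m - 1)   # sample variance D_M
    half = 2.0 * (var / m) ** 0.5                        # CI half-width
    return mean, var, half

# toy Gamma = xi**2 with xi ~ N(0, 1), hence E[Gamma] = 1
mean, var, half = monte_carlo_ci(lambda rng: rng.gauss(0.0, 1.0) ** 2, m=20000)
```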
with terminal condition and Neumann boundary condition The solution u(t, x₁, x₂) of the above problem is known in closed form. The exact solution at (t, x₁, x₂) = (0, 0, 0) with R = 2 and T = 1 is 34.1970 (4 d.p.). We note that the construction of the model problem in this experiment follows the same path as in [54] (see also [55, Chapter 6]).
We consider the absolute error e = |u(0, 0, 0) − ǔ_M|, where ǔ_M is the Monte Carlo estimator for ū(0, 0, 0) = E[u(T, X_N)Y_N + Z_N], which approximates u(0, 0, 0). Here (X_N, Y_N, Z_N) is obtained by Algorithm 3.1 applied to the problem (7.2)-(7.4). The results are presented in Table 7.1 and Figure 7.1, and demonstrate that the numerical integration error incurred in solving the parabolic problem (7.2)-(7.4) is of order O(h), consistent with the prediction of Theorem 3.1. There are three types of errors incurred while computing φ̄ and ψ̂ via the numerical time-averaging estimators φ̄_N and ψ̂_N [56,51]: (i) the error due to discretization of the RSDEs (4.7) (the numerical integration error), which is estimated by Ch; (ii) the error incurred because integration is over a finite time T (the time truncation error), which is estimated by C/T; and (iii) the statistical error. The statistical error is also controlled by the final time T (see e.g. [51] for further discussion of the statistical error).
It is not difficult to verify that the density function ρ(x 1 , x 2 ) satisfies the stationary Fokker-Planck equation (4.3) with Neumann boundary condition (4.4) corresponding to the RSDEs (7.6).