Variability of paths and differential equations with $BV$-coefficients

We define compositions $\varphi(X)$ of H\"older paths $X$ in $\mathbb{R}^n$ and functions of bounded variation $\varphi$ under a relative condition involving the path and the gradient measure of $\varphi$. We show the existence and properties of generalized Lebesgue-Stieltjes integrals of compositions $\varphi(X)$ with respect to a given H\"older path $Y$. These results are then used, together with Doss' transform, to obtain existence and, in a certain sense, uniqueness results for differential equations in $\mathbb{R}^n$ driven by H\"older paths and involving coefficients of bounded variation. Examples include equations with discontinuous coefficients driven by paths of two-dimensional fractional Brownian motions.


Introduction
We prove new results on the existence and regularity of generalized Lebesgue-Stieltjes integrals $\int_0^t \varphi(X_u)\,dY_u$, $t \in [0, T]$, (1.1) as in [81,98,99], where $X : [0, T] \to \mathbb{R}^n$ and $Y : [0, T] \to \mathbb{R}$ are Hölder continuous functions with sum of Hölder orders greater than one and $\varphi : \mathbb{R}^n \to \mathbb{R}$ is a function locally of bounded variation, [5,102], possibly discontinuous. We then employ these results to study equations in $\mathbb{R}^n$ of the form $X_t = \mathring{x} + \int_0^t \sigma(X_u)\,dY_u$, $t \in [0, T]$, (1.2) started at $\mathring{x} \in \mathbb{R}^n$, where $Y$ is a given path in $\mathbb{R}^n$, Hölder of order $\gamma > \frac{1}{2}$, and $\sigma$ is a (bounded) matrix valued function of locally bounded variation. We implement a Doss transform, [27,95], and use it to construct Hölder continuous solutions $X$ to (1.2), unique in a certain class. This produces first results for discontinuous coefficients $\sigma$ in dimensions $n \ge 2$. The main difficulties are to provide a meaningful definition of the compositions $\varphi(X)$ (resp. $\sigma(X)$) and to show that they are regular enough for the integrals in (1.1) or (1.2), respectively, to make sense and for the Doss transformation method [27,95] to work. Our main tool is a quantitative condition which ensures that $X$ spends little time in regions where the gradient measure of $\varphi$ (resp. $\sigma$) is very concentrated.

Related literature
To study equations of type (1.2) for deterministic integrators $Y$ of low regularity or for probabilistic integrators $Y$ lacking semimartingale or other good distributional properties, Stieltjes type integrals, [81,97,98,99], and much more generally, the theory of rough paths, [47,73,74,75], have become established tools. However, rather little is known about equations with irregular diffusion coefficients $\sigma$, and we are only aware of the few references mentioned below. In view of possible applications it seems particularly desirable to obtain results for discontinuous diffusion coefficients. They become necessary if one wants to model sharp interfaces between different media at which the solution $X$ abruptly changes its speed. If $n = 1$ and $X$ solves (1.2) with $\sigma = a + b\,\mathbf{1}_{(-c,c)}$, where $a > 0$, $b > 0$ and $c \in \mathbb{R}$, then the movement of $X$, dictated by the driver $Y$, is faster inside the sharply bounded strip $(-c, c)$ than outside. If $n = 2$ and $Y = (Y^1, Y^2)$, then a $(2 \times 2)$-matrix $\sigma$, with each entry being such a discontinuous function, could be used to determine polygonal regions inside of which the accelerating effect of $Y^1$ or $Y^2$ on the respective components $X^1$ and $X^2$ of $X = (X^1, X^2)$ is amplified or damped; this may be of interest for mixed market models [10,11,22].
Stochastic differential equations with respect to Brownian motion involving non-Lipschitz (drift or diffusion) coefficients can be discussed in several different ways, [83]. Most classical and recent results for singular, irregular, or degenerate coefficients and for notions of uniqueness, such as [25,33,37,64,69,91,101], as well as results specific to the one-dimensional case, [29,30,68,79], are built upon the connection to diffusion theory and partial differential equations. For equations (1.2) driven by rough deterministic or fractional Gaussian signals Y, such tools are not available.
For an integrator $Y$ that is Hölder of order $\gamma > \frac{1}{2}$, Peano existence for solutions to (1.2) in $\mathbb{R}^n$ is well known for coefficients $\sigma$ that are $s$-Hölder continuous provided that $\gamma > \frac{1}{1+s}$. Moreover, Picard existence and uniqueness holds if the coefficient is $C^{1,s}$ (with the same $s$), [47,75]. See [76,81,98,99] for the more classical Lipschitz resp. $C^2$-cases. In [70], new results have been obtained for equations (1.2) in $\mathbb{R}^n$ for the case $\gamma < \frac{1}{1+s}$. For $n = 1$ and continuous coefficients $\sigma$ whose reciprocal is integrable on compact intervals around zero, the authors of [70] constructed solutions to (1.2) by means of a Lamperti transform [66], see [70, Theorem 3.7]. For $n \ge 1$ they can solve (1.2) if the components of the coefficient $\sigma$ are bounded from below by $|x|^s$, their gradients are Hölder continuous of order $\frac{1}{\gamma} - 1$ away from zero, and the integral is understood in terms of Riemann sum approximations, [70, Theorem 4.15]. The first results on the existence of Stieltjes integrals with discontinuous coefficients (in the case $n = 1$) were obtained in [20] (see also [19, Chapter 5]). There the authors proved the existence of (1.1) if $\varphi$ is of locally finite variation and $X$ is a sufficiently active path, [20, Theorem 3.1 and Remark 3.3]. For random $X$, an integrability assumption on its probability densities ensures this condition. They also prove a change of variable formula and several results on the approximation of (1.1) by Riemann-Stieltjes sums. A first study for differential equations (1.2) was provided in [43], where the authors prove existence and uniqueness of solutions to (1.2) for $n = 1$ if $Y$ is a fractional Brownian motion with Hurst index greater than $\frac{1}{2}$ and the coefficient is a (scaled) Heaviside step function. The authors of [43] used a Lamperti transform and smoothing arguments. Merging the assumptions from [20] and the transform used in [43], the authors of [89] were able to prove existence and uniqueness for (1.2) in the case $n = 1$ and in a probabilistic setup.
Finally, we mention [96], where an alternative existence proof for integrals of type (1.1) was given using Riemann-Stieltjes approximations and suitable controls (avoiding fractional calculus), extending the results of [20].
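The threshold $\gamma > \frac{1}{1+s}$ quoted above for Peano existence can be motivated by a short count of Hölder orders. The following is only a heuristic sketch, under the assumption that a solution $X$ inherits the Hölder order $\gamma$ of the driver $Y$; it is not a proof.

```latex
% If X is gamma-Hoelder and sigma is s-Hoelder, then sigma(X) is (gamma s)-Hoelder:
\[
|\sigma(X_t) - \sigma(X_u)| \le c\,|X_t - X_u|^{s} \le c'\,|t-u|^{\gamma s}.
\]
% The integral of sigma(X) against the gamma-Hoelder path Y is of Stieltjes (Young) type
% as soon as the sum of the two Hoelder orders exceeds one:
\[
\gamma s + \gamma > 1 \quad\Longleftrightarrow\quad \gamma > \frac{1}{1+s}.
\]
```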

Brief description of our approach
In the present paper we use a quantitative condition on the given individual path $X$ w.r.t. the given coefficient $\varphi$ (or $\sigma$), which we call $(s, p)$-variability, Definition 2.1. It may be seen as a deterministic version of the probabilistic Assumption 2.1 of [89], and as a higher dimensional analog of a condition in [96, Corollary 3]. Our first main result is Theorem 2.12, where we state that the composition $\varphi(X)$ of $\varphi$ with a Hölder path that is $(s, 1)$-variable w.r.t. $\varphi$ is well defined and a member of a certain fractional Sobolev space, ensuring the existence of (1.1). The stronger assumption of $(s, p)$-variability with large $p$ guarantees that (1.1) is Hölder continuous. A key step to obtain these results is a multiplicative estimate for Gagliardo seminorms of $\varphi(X)$, Proposition 4.29, that can be viewed as a generalization of [20, Proposition 4.6] ([89, Proposition 4.1]) to higher space dimensions. To obtain this estimate, one bounds differences of type $|\varphi(X_t) - \varphi(X_u)|$ in terms of fractional maximal functions of the total variation $\|D\varphi\|$ of the gradient measure of $\varphi$, Proposition C.1; this is a fractional version of a prominent argument, [23, Lemma A.3]. Then one estimates further using the fact that the fractional maximal functions of order $1 - s$ are trivially bounded by Riesz potentials of $\|D\varphi\|$ of order $1 - s$, evaluated at $X_t$ (resp. $X_u$). The $(s, p)$-variability condition just means that these functions have the desired integrability in time. The $(s, 1)$-variability of $X$ w.r.t. $\varphi$ is tantamount to saying that the total variation $\|D\varphi\|$ of the gradient measure of $\varphi$ and the occupation measure $\mu_X^{[0,T]}$ on $[0, T]$ have a finite mutual Riesz energy of order $1 - s$, see Remark 4.2.
Phenomenologically this means that these two measures are sufficiently dispersed with respect to each other to make the singular (repulsive) interaction kernel of order $-n + 1 - s$ integrable, which is a polarized version of well known arguments, see [31,67,77] for background and [49] for a related application. Mutual Riesz energies are not necessarily easy to handle, but they encode dimensional properties in a neat way, and this makes it easy to connect to the well known scaling properties of gradient measures of $BV$-functions, [5, Section 3.9], and the well known scaling properties of fractal curves, [31], such as realizations of prominent stochastic processes, [7,31,58,94]. If in a certain region of space $\varphi$ has a jump or strong oscillation so that $\|D\varphi\|$ is too concentrated, this can be compensated if $X$ moves so fast in that part of space that the Hausdorff dimension of $\mu_X^{[0,T]}$ is sufficiently high to guarantee sufficient integrability, see Corollary 4.16. The idea that increased activity of a path can compensate low coefficient regularity is also central in regularization by noise, [18,25,35,37,42,45,48,64], see Remark 4.3. It is closely related to the notion of irregularity studied in [18] and [41], see Subsection 4.5. However, while irregularity is a property of the path alone, variability is a property of a path relative to a given coefficient. We begin our discussion of differential equations (1.2) by showing that a uniform boundedness condition on the Riesz potential of order $1 - s$ of the total variations of the gradient measures naturally takes us back to the case of $s$-Hölder coefficients with $\gamma > \frac{1}{1+s}$, so that a well known Peano existence argument applies, Theorem 3.6. This is the extreme case, where the activity of the solution path is not used.
To obtain existence results taking into account the activity of the solution path, we implement a Doss transform for $BV$-coefficients $\sigma$ under the main assumptions that $\sigma$ is invertible, Assumption 3.12, its inverse has curl-free columns, Assumption 3.15, and an angle condition holds, (3.10). The curl-free condition in particular is quite restrictive, but as in classical implementations of the Doss method, [95], it is inevitable. In the absence of other existence results for equations with $BV$-coefficients it seems reasonable to establish a $BV$-variant of Doss' transformation under these assumptions. They guarantee the existence of a Lipschitz function $f$ so that, roughly speaking, a solution $X$ is obtained as an image of the driver $Y$ under $f$. This uses the fact that dimensional lower bounds for $Y$ are stable under Lipschitz transformations and produces our second main result, Theorem 3.24, which states the existence of Hölder continuous solutions $X$ to (1.2) with $BV$-coefficients $\sigma$. A one-dimensional version of this result, partially under less restrictive assumptions, is formulated in Theorem 3.8. In Theorem 3.25 we assume that the occupation measure of $Y$ satisfies a kind of weighted upper regularity condition and that the gradient measures of the coefficient obey a specific moment condition 'at the starting point'. Under these assumptions we again obtain the existence of Hölder solutions. These theorems are purely deterministic, and the regularization effect of the irregular path is rather mild. Our third main result is Corollary 3.26. It is a probabilistic variant of Theorem 3.25, in which we assume that $Y$ is a stochastic process satisfying the weighted upper regularity condition in a mean value sense and obtain Hölder continuous solutions for almost every realization of $Y$. It may be applied to fractional Brownian motions $Y$ in $\mathbb{R}^n$ with $n \ge 2$ and Hurst index $H > \frac{1}{2}$. One can regard Corollary 3.26 as a (partial) extension of the probabilistic [89, Theorem 2.1].
Our fourth main result is a related uniqueness result, Theorem 3.28. It shows that Assumptions 3.12 and 3.15 guarantee uniqueness in the class of variability solutions.
It would be desirable to replace the Doss transform by standard fixed point arguments. The main open problem to be settled is to prove that, under reasonable assumptions, the integral process itself will be variable. Another goal for future research is to target equations that, in addition to a $BV_{loc}$-diffusion coefficient, involve a drift vector field of low regularity. First results on variability and compositions involving discontinuous paths can be found in [55].

Structure of the article
The structure of the article is as follows: In Section 2 we introduce the notion of (s, p)-variability, define compositions ϕ(X) (σ(X), respectively), and state our results on existence and properties of (1.1). We also provide a change of variable formula and a result on Riemann sum approximation. Section 3 contains our results on existence and uniqueness of variability solutions to (1.2). In Section 4 we provide a systematic discussion of (s, 1)-variability, some of its immediate consequences, conditions sufficient to ensure it, and some probabilistic examples. We briefly compare variability to irregularity, verify the mentioned multiplicative estimate, the properties of (1.1) and the change of variable formula; we also point out links to currents. The Doss transformation and the claimed existence and uniqueness results for (1.2) are proved in Section 5. Basic facts on Riesz kernels, mollification, maximal functions, and fractional calculus are collected in the appendices.
By $|\cdot|$ we denote the Euclidean norm in $\mathbb{R}^n$. We write $B(x, r)$ for the open ball of radius $r > 0$ centered at $x \in \mathbb{R}^n$. The symbol $\mathcal{L}^n$ stands for the $n$-dimensional Lebesgue measure and the symbol $\mathcal{H}^d$ for the $d$-dimensional Hausdorff measure on $\mathbb{R}^n$. For spaces of $\mathbb{R}^m$-valued functions we use notations like $BV(\mathbb{R}^n)^m$ (to stay close to reference [5]) or $L^1_{loc}(\mathbb{R}^n, \mathbb{R}^m)$ (because it is more practical for other function spaces). For $m = 1$, we suppress $\mathbb{R}^m$ from notation and write $L^1_{loc}(\mathbb{R}^n)$. For a Borel measure $\nu$ on $\mathbb{R}^n$, we denote its (topological) support by $\operatorname{supp} \nu$.

Compositions of paths with BV -functions
Recall that a function $\varphi \in L^1_{loc}(\mathbb{R}^n)$ is of locally bounded variation, denoted $\varphi \in BV_{loc}(\mathbb{R}^n)$, if its distributional partial derivatives $D_i\varphi$ are signed Radon measures, $i = 1, ..., n$. We write $D\varphi = (D_1\varphi, ..., D_n\varphi)$ for its $\mathbb{R}^n$-valued gradient measure, and $\|D\varphi\|$ for the total variation of $D\varphi$. If $\varphi \in L^1(\mathbb{R}^n)$ and $\|D\varphi\|(\mathbb{R}^n) < +\infty$, then $\varphi$ is said to be of bounded variation, $\varphi \in BV(\mathbb{R}^n)$.
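Let $T > 0$. We consider continuous paths from $[0, T]$ into $\mathbb{R}^n$, that is, continuous functions $X = (X^1, \dots, X^n) : [0, T] \to \mathbb{R}^n$. The following definition is our key tool to provide a meaningful and sufficiently regular definition of the composition $\varphi \circ X$ of a $BV_{loc}$-function $\varphi$ and a path $X$. As usual, $L^p(0, T)$ denotes the Lebesgue space of classes of $p$-integrable functions on $(0, T)$.
Definition 2.1. Let $\varphi \in BV_{loc}(\mathbb{R}^n)$, $p \in [1, +\infty]$ and $s \in (0, 1)$. We say that a path $X : [0, T] \to \mathbb{R}^n$ is $(s, p)$-variable with respect to $\varphi$ if there is a relatively compact open neighborhood $U$ of $X([0, T])$ such that $\int_U |X_\cdot - z|^{-n+1-s}\,\|D\varphi\|(dz) \in L^p(0, T)$. (2.1) We write $V(\varphi, s, p)$ for the class of paths $X$ that are $(s, p)$-variable w.r.t. $\varphi$ and use the short notation $V(\varphi, s) := V(\varphi, s, 1)$.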
Note that $V(\varphi, s, p) \subset V(\varphi, s, q)$ for any $q < p$ and $V(\varphi, s, p) \subset V(\varphi, r, p)$ for any $r < s$. The $(s, p)$-variability condition (2.1) is a quantitative and relative condition on the path $X$ and the function $\varphi$. Roughly speaking, it ensures that $X$ varies sufficiently around sites where $\varphi$ has strong oscillations or jumps, encoded in the requirement that the Riesz potential of order $1 - s$ of the restriction of $\|D\varphi\|$ to $U$, evaluated along $X$, is in $L^p(0, T)$, see Section 4 for a systematic discussion. The use of an open neighborhood $U$ of $X([0, T])$ in (2.1) simplifies several arguments (e.g. mollification). For functions with values in $\mathbb{R}^m$ we take a component-wise point of view.
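For $p = 1$ the condition can be rewritten as an energy. The display below is a sketch, assuming that (2.1) asks for the Riesz potential of order $1 - s$ of the restriction of $\|D\varphi\|$ to $U$, with kernel $|x - z|^{-n+1-s}$ and normalizing constants omitted, to be integrable along $X$. Here $\mu_X^{[0,T]}$ denotes the occupation measure of $X$, that is, the image of the Lebesgue measure on $[0, T]$ under $X$.

```latex
% Occupation measure of X on [0,T]:
\[
\mu_X^{[0,T]}(A) := \mathcal{L}^1\big(\{t\in[0,T] : X_t \in A\}\big), \qquad A \subset \mathbb{R}^n \ \text{Borel}.
\]
% Change of variables (image measure) turns the time integral into a mutual Riesz energy:
\[
\int_0^T \int_U |X_t - z|^{-n+1-s}\,\|D\varphi\|(dz)\,dt
= \int_{\mathbb{R}^n}\int_U |x - z|^{-n+1-s}\,\|D\varphi\|(dz)\,\mu_X^{[0,T]}(dx).
\]
```

In this form the condition is symmetric in the two measures: it only asks how strongly $\|D\varphi\|$ restricted to $U$ and $\mu_X^{[0,T]}$ concentrate near each other.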
Recall the following classical definition.
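A sketch of the standard formulation, following [5, Definition 3.63] and assumed to be the one intended here: a function $\varphi \in L^1_{loc}(\mathbb{R}^n, \mathbb{R}^m)$ has an approximate limit at $x \in \mathbb{R}^n$ if there is $\lambda \in \mathbb{R}^m$ such that

```latex
\[
\lim_{r \downarrow 0} \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |\varphi(y) - \lambda|\,dy = 0 .
\]
```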
In this situation, the unique value λ ϕ (x) is called the approximate limit of ϕ at x. The set of points x ∈ R n for which this property does not hold is called approximate discontinuity set (or exceptional set) and is denoted by S ϕ .
The set $S_\varphi$ does not depend on the choice of the representative for $\varphi$. If $\tilde\varphi$ is a representative of $\varphi \in L^1_{loc}(\mathbb{R}^n, \mathbb{R}^m)$ then a point $x \in \mathbb{R}^n \setminus S_\varphi$ with $\tilde\varphi(x) = \lambda_\varphi(x)$ is called a Lebesgue point of $\tilde\varphi$, and the set of all Lebesgue points of $\tilde\varphi$ is called the Lebesgue set of $\tilde\varphi$. See for instance [5, Definition 3.63]. The set $S_\varphi$ is Borel and of zero Lebesgue measure, [5, Proposition 3.64]. If $\varphi = (\varphi_1, ..., \varphi_m) \in BV(\mathbb{R}^n)^m$ then by the Federer-Vol'pert theorem, [5, Theorem 3.78], the set $S_\varphi$ is countably $\mathcal{H}^{n-1}$-rectifiable.
We say that a Borel function ϕ : Using Definition 2.3 and the equivalence of norms on R n it is easy to see that In particular, if for any i the function ϕ i : R n → R is a Lebesgue representative of ϕ i then ( ϕ 1 , ..., ϕ m ) is a Lebesgue representative of ϕ, and we refer to such representatives as component-wise Lebesgue representatives.
The following observation will be proved in Section 4.
Lemma 2.4. Let ϕ = (ϕ 1 , ..., ϕ m ) ∈ BV loc (R n ) m and X ∈ V (ϕ, s) for some s ∈ (0, 1). Then for any component-wise Lebesgue representatives ϕ (1) and ϕ (2) of ϕ we have Lemma 2.4 could be rephrased by saying that under the (s, 1)-variability condition the equivalence class ϕ has a well defined trace on the range X([0, T ]) ⊂ R n of X, endowed with a suitable measure. See [2] or [57] for pointwise redefinitions of functions and traces to closed subsets of R n in other contexts.
Definition 2.5. Let $\varphi \in BV_{loc}(\mathbb{R}^n)^m$ and suppose that $X \in V(\varphi, s)$ for some $s \in (0, 1)$. We define the composition $\varphi \circ X$ to be the $L^1$-equivalence class of $t \mapsto \tilde\varphi(X_t)$ on $[0, T]$, where $\tilde\varphi$ is a component-wise Lebesgue representative of $\varphi$. Given $p \in [1, +\infty]$ we say that $\varphi$ is $p$-integrable w.r.t. $X$, in symbols $\varphi \in L^p(X, \mathbb{R}^m)$, if $\varphi \circ X$ is an element of $L^p(0, T, \mathbb{R}^m)$. In the case $m = 1$ we write $L^p(X)$ instead of $L^p(X, \mathbb{R})$.
Thanks to Lemma 2.4 the composition and the notion of p-integrability w.r.t. X are well defined. The component-wise choice of representatives is not essential, but it is convenient in conjunction with Definitions 2.1 and 2.2.
We discuss (s, 1)-variability in some examples.
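Example 2.6. If $\varphi : \mathbb{R}^n \to \mathbb{R}$ is locally Lipschitz, then for any $s \in (0, 1)$ any path $X$ is in $V(\varphi, s, \infty)$. This follows from the fact that $\|D\varphi\| = |\nabla\varphi| \cdot \mathcal{L}^n$ with $|\nabla\varphi| \in L^\infty(U)$ on any relatively compact open set $U \subset \mathbb{R}^n$.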
Example 2.7. Let C ⊂ [0, 1] be the classical middle third Cantor set and ν C the unique self-similar probability measure with support C , see [31]. Let ϕ C : Then ϕ C ∈ BV loc (R n ), and on [0, 1] n we have Dϕ C = Dϕ C = ν C ⊗ H n−1 . Writing d C = log 2 log 3 for the Hausdorff dimension of C , we find that for s ∈ (0, d C ) any path X in R n is in V (ϕ C , s, ∞). Now suppose s ∈ (d C , 1). The constant path X ≡ ( 1 2 , 0, ..., 0) in R n is in V (ϕ C , s, ∞), but the constant path X ≡ (0, 0, ..., 0) is not in V (ϕ C , s). For n = 1 any smooth function X : (0, T ) → (0, 1) with a finite number of critical points is in V (ϕ C , s, ∞). For n = 2 a smooth curve X : (0, T ) → (0, 1) 2 , parametrized to have unit speed, does not have to be in V (ϕ C , s). On the other hand, a path of Brownian motion is in V (ϕ C , s, ∞) with probability one. For n ≥ 3 paths of fractional Brownian motions with Hurst index H ∈ (0, 1 n−1+s ) are in V (ϕ C , s, ∞) with probability one. See Example 4.18 and Subsection 4.4 for details.
The function $\varphi_C$ in Example 2.7 is Hölder continuous. The next example discusses variability with respect to discontinuous functions.
Let s ∈ (0, 1) be arbitrary. If a smooth unit speed curve X : [0, T ] → R n hits ∂O in finitely many points then we have X ∈ V (1 O , s), but if n ≥ 2 and X spends L 1 -positive time in ∂O then it cannot be an element of V (1 O , s), see Example 4.19. For n = 1 or n = 2 the path of a Brownian motion is in V (1 O , s, ∞) with probability one. For arbitrary n ≥ 1, the path of a fractional Brownian motion with with probability one. For arbitrary n ≥ 1 it also follows that if H ∈ (0, 1 s ) and the fractional Brownian motion is started in (∂O) c then it is in V (1 O , s) with probability one, see Subsection 4.4.
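The first claim can be illustrated in the simplest setting. The following sketch assumes $n = 1$, an interval $O = (a, b)$ and a neighborhood $U$ containing $a$ and $b$; these choices, and the crossing time $t_0$, are made only for illustration. In this case $\|D\mathbf{1}_O\| = \delta_a + \delta_b$, and for $n = 1$ the kernel of order $1 - s$ is $|x - z|^{-s}$.

```latex
% Riesz potential of order 1-s of ||D 1_O|| along the path (constants omitted):
\[
\int_U |X_t - z|^{-s}\,\|D\mathbf{1}_O\|(dz) = |X_t - a|^{-s} + |X_t - b|^{-s}.
\]
% If X has unit speed and X_{t_0} = a, then |X_t - a| = |t - t_0| near t_0, hence
\[
\int_{t_0-\varepsilon}^{t_0+\varepsilon} |X_t - a|^{-s}\,dt
= \int_{t_0-\varepsilon}^{t_0+\varepsilon} |t - t_0|^{-s}\,dt
= \frac{2\,\varepsilon^{1-s}}{1-s} < \infty \qquad (0 < s < 1).
\]
```

The singularity at a crossing time is therefore integrable for every $s \in (0, 1)$. If the path rested at $a$ on a set of positive measure, the potential would be infinite there.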

Existence and properties of Stieltjes integrals
As mentioned, we are interested in generalized Lebesgue-Stieltjes integrals defined in terms of fractional calculus, [84], introduced in [98,99] and used e.g. in [81].
We introduce suitable function spaces to discuss the existence and the continuity properties of the integral. Let p ∈ [1, +∞) and 0 < θ < 1. The Gagliardo seminorm of order θ with exponent p of a measurable function f : (0, T ) → R m is defined as Recall that for m = 1 we agreed to suppress R m from notation and simply write W θ,p (0, T ), which we do similarly for the spaces in the sequel. The Hölder seminorm of order 0 < θ < 1 of a measurable function f : and we write C θ ([0, T ], R m ) to denote the space of Hölder continuous functions f : Remark 2.9. The spaces W θ,p (0, T ) and C θ ([0, T ]) are classical Besov spaces of type B θ p,p and B θ ∞,∞ , see for instance [90]. Because they appear naturally in connection with Stieltjes integrals we also consider the following more specific types of spaces, which (in this or a similar form) were introduced in [81]. Accepting a slight abuse of notation we use the symbol W θ,∞ (0, T, R m ) to denote the space of all measurable f : (0, T ) → R m such that It is well known and easily seen that We write W θ,p 0 (0, T, R m ) for the space of measurable functions f : (0, T ) → R m such that We emphasize that in the present paper the symbols W θ,∞ (0, T ) and W θ,p 0 (0, T ) do not have the standard meaning.
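For orientation, the two seminorms named above have the following standard form. This is a sketch: the bracket notation $[f]_{\theta,p}$ and $[f]_{C^\theta}$ is introduced here for illustration, and we assume the formulas agree with the intended ones up to normalization.

```latex
% Gagliardo seminorm of order theta with exponent p, and Hoelder seminorm of order theta:
\[
[f]_{\theta,p} := \left( \int_0^T\!\!\int_0^T \frac{|f(t)-f(u)|^p}{|t-u|^{1+\theta p}}\,du\,dt \right)^{1/p},
\qquad
[f]_{C^\theta} := \sup_{0 \le u < t \le T} \frac{|f(t)-f(u)|}{(t-u)^{\theta}} .
\]
```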
The following definition is due to [98,99], see also [81]. By D θ 0+ and D 1−θ T − we denote the (left and right sided) fractional Weyl-Marchaud derivatives of orders θ and 1 − θ, respectively, see formulas (D.1) and (D.2) in Appendix D. Background information on fractional derivatives can be found in [84].
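For orientation, the definition can be sketched as follows. The formulas are recalled from the standard literature on fractional calculus, [84, 98], and are assumed to match (2.6), (D.1) and (D.2); we write $g_{T-}(u) := g(u) - g(T)$.

```latex
% Generalized Lebesgue-Stieltjes integral in terms of fractional derivatives:
\[
\int_0^T f\,dg := (-1)^{\theta} \int_0^T D^{\theta}_{0+} f(u)\, D^{1-\theta}_{T-} g_{T-}(u)\,du ,
\]
% Left sided Weyl-Marchaud derivative of order theta:
\[
D^{\theta}_{0+} f(u) = \frac{1}{\Gamma(1-\theta)} \left( \frac{f(u)}{u^{\theta}} + \theta \int_0^u \frac{f(u)-f(v)}{(u-v)^{\theta+1}}\,dv \right),
\]
% Right sided Weyl-Marchaud derivative of order 1-theta:
\[
D^{1-\theta}_{T-} g_{T-}(u) = \frac{(-1)^{1-\theta}}{\Gamma(\theta)} \left( \frac{g(u)-g(T)}{(T-u)^{1-\theta}} + (1-\theta) \int_u^T \frac{g(u)-g(v)}{(v-u)^{2-\theta}}\,dv \right).
\]
```

The two prefactors multiply to $(-1)^{\theta}(-1)^{1-\theta} = -1$, so the resulting expression is real.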
The right hand side of (2.6) is a real number; the complex prefactor (used in [98] to ensure natural formulas) compensates with another complex prefactor in the right sided Weyl-Marchaud derivative, cf. (D.2). The definition is correct: for f and g that satisfy the respective hypotheses, the value of the integral in (2.6) is independent of the choice of θ, [98,Proposition 2.1]. If f and g are as in the definition and g has bounded variation, then (2.6) equals to the classical Lebesgue-Stieltjes integral of f w.r.t. g, [98,Theorem 2.4]. The following duality estimate and restriction property are well known, see [98,99] or [81]. By Γ we denote the Euler gamma function. Proposition 2.11. Assume that f ∈ W θ,1 0 (0, T ) and g ∈ W 1−θ,∞ T (0, T ) for some θ ∈ (0, 1). Then the integral in (2.6) admits the bound Hence, for every t ∈ [0, T ] the restriction 1 [0,t] f belongs to W θ,1 0 (0, T ) and the integral Our first new contribution is the following Theorem 2.12 on the regularity and integrability of the composition ϕ • X of a Hölder path X with a BV -function ϕ, the existence of the generalized Lebesgue-Stieltjes integral of ϕ(X) with respect to a given Hölder path Y in the sense of (2.6), and the Hölder regularity of this integral, seen as a function of t ∈ [0, T ].
(iii) Moreover, if X is (s, p)-variable with respect to ϕ for some p ∈ (1, +∞], then for and · 0 ϕ(X)dY In Subsections 4.6 and 4.7 of Section 4 we provide a proof of Theorem 2.12, along with quantitative estimates for the integral (2.8) involving (2.1).
Variability permits the following change of variable formula.
for $\mathcal{L}^1$-a.e. $t \in [0, T]$, provided that $\mathring{x} \in \mathbb{R}^n \setminus S_F$. If in addition $F$ is continuous, then (2.10) holds for all $t \in [0, T]$ and no matter where $\mathring{x} \in \mathbb{R}^n$ is located.
The proof of this theorem can be found in Subsection 4.9 of Section 4.
The following result on the coincidence of (2.8) and the corresponding Riemann-Stieltjes integral under the (s, p)-variability condition with large enough p is immediate from Proposition 4.29 and Lemma 4.31 below together with [98, Theorems 4.1.1 and 4.2.1]. We provide this result for systematic reasons while it will not be used in the sequel.

Systems of differential equations
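In this section we discuss equations of the form $X_t = \mathring{x} + \int_0^t \sigma(X_u)\,dY_u$, $t \in [0, T]$, (3.1) where $T > 0$, $\mathring{x} \in \mathbb{R}^n$, $\sigma : \mathbb{R}^n \to \mathbb{R}^{n \times n}$ is a coefficient function $\sigma = (\sigma_{jk})_{1 \le j,k \le n}$ and $Y = (Y^1, ..., Y^n) : [0, T] \to \mathbb{R}^n$ is a given Hölder path. As usual, (3.1) is to be understood in the sense that all components $X^j$ of $X = (X^1, ..., X^n) : [0, T] \to \mathbb{R}^n$ satisfy the equations $X^j_t = \mathring{x}_j + \sum_{k=1}^n \int_0^t \sigma_{jk}(X_u)\,dY^k_u$, $t \in [0, T]$, where $\mathring{x} = (\mathring{x}_1, ..., \mathring{x}_n)$. Each integral in these sums is a generalized Lebesgue-Stieltjes integral as in (2.8).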
Examples 3.2. Let $n = 1$ and let $\sigma = \varphi_C$ be as in Example 2.7 (with $n = 1$). If $Y$ is a typical realization of fractional Brownian motion with Hurst index $H > \frac{\log 3}{\log 3 + \log 2}$ then by Theorem 3.6 below, variability solutions for $\sigma$ and $Y$ started at zero exist. Also the zero function $X \equiv 0$ is a variability solution.
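The Hurst threshold in this example is the Peano threshold $\frac{1}{1+s}$ evaluated at $s = d_C = \frac{\log 2}{\log 3}$. This is a short check; it uses the value of $d_C$ from Example 2.7 and the known fact that the Cantor function is Hölder continuous of order $d_C$.

```latex
\[
\frac{1}{1+d_C} = \frac{1}{1+\frac{\log 2}{\log 3}} = \frac{\log 3}{\log 3+\log 2} = \frac{\log 3}{\log 6} \approx 0.613 .
\]
```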
Not every solution is a variability solution, as the following example shows.
Examples 3.3. If $n = 1$, $\kappa \in (0, 1)$, $\sigma(x) = |x|^\kappa$ for $x \in (-1, 1)$ and $\sigma(x) \equiv 1$ for $x \in \mathbb{R} \setminus (-1, 1)$, and $Y \in C^\gamma([0, T])$ is nowhere constant, then by Theorem 3.8 there are (non-constant) variability solutions for $\sigma$ and $Y$ started at zero. Also the zero function $X \equiv 0$ is a solution, but it can be a variability solution only if $\gamma + \kappa > 1$: since in a neighborhood of zero $\|D\sigma\|(dx) = \kappa |x|^{\kappa-1}\,dx$, the zero solution is a variability solution only if we can find $s \in (1 - \gamma, 1)$ such that $|x|^{-s+\kappa-1}$ is integrable around the origin.
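The integrability condition at the end of Examples 3.3 can be checked directly. This is a sketch for $n = 1$, where the kernel of order $1 - s$ is $|z|^{-s}$; constant factors in the density of $\|D\sigma\|$ are irrelevant for integrability and are omitted.

```latex
% Riesz potential of ||D sigma|| at the origin, i.e. along the constant path X = 0:
\[
\int_{-1}^{1} |z|^{-s}\,|z|^{\kappa-1}\,dz = 2\int_0^1 z^{\kappa-s-1}\,dz
= \begin{cases} \dfrac{2}{\kappa-s}, & s < \kappa,\\[1ex] +\infty, & s \ge \kappa. \end{cases}
\]
% An admissible s must satisfy 1-gamma < s < kappa, which is possible iff gamma + kappa > 1:
\[
(1-\gamma,\,1)\cap(0,\kappa) \neq \emptyset
\quad\Longleftrightarrow\quad \kappa > 1-\gamma
\quad\Longleftrightarrow\quad \gamma+\kappa>1 .
\]
```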
Remark 3.4. In [70] the authors considered monotone and continuous σ having a power-type non-linearity at the origin, similar to the one in the preceding example. The assumptions of [70, Theorem 3.6] ensure that their solution candidate is actually an (s, 1)-variable solution. As pointed out in [70], one cannot expect unique solutions, because the zero function is also a solution.

Upper regularity and Hölder continuity of coefficients
To put our results in perspective we begin with a case that encodes a known result.
Recall that, given $d \ge 0$, a Borel measure $\mu$ on $\mathbb{R}^n$ is said to be upper $d$-regular on a Borel set $B \subset \mathbb{R}^n$ if there are constants $c > 0$ and $r_0 > 0$ such that $\mu(B(x, r)) \le c\,r^d$ for all $x \in B$ and $0 < r < r_0$. We call a function $\varphi \in BV_{loc}(\mathbb{R}^n)$ upper $d$-regular on $B$ if $\|D\varphi\|$ is upper $d$-regular on $B$. We say that a function $\varphi = (\varphi_1, ..., \varphi_m) \in (BV_{loc}(\mathbb{R}^n))^m$ is upper $d$-regular on $B$ if each of its components $\varphi_i$ is. If $B = \mathbb{R}^n$ we simply say upper $d$-regular.
The following is verified in Proposition 4.9.
Proposition 3.5. Let σ = (σ jk ) 1≤j,k≤n be such that σ jk ∈ BV loc (R n ) for all j and k, let s ∈ (0, 1), and assume that σ is upper d-regular with d > n − 1 + s. Then σ has a (unique) Borel version which is Hölder continuous of order s and extends any component-wise Lebesgue representative. Moreover, any path is (s, ∞)-variable w.r.t. σ.
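The role of the condition $d > n - 1 + s$ can be seen from a dyadic decomposition. The following is a sketch, not the proof given in Proposition 4.9: it assumes upper $d$-regularity in the form $\mu(B(x, r)) \le c\,r^d$ for $x \in B$ and $0 < r < r_0$, takes $\mu = \|D\sigma_{jk}\|$, and fixes $x \in B$ and a radius $0 < r_1 < r_0$; the annuli $A_k$ are auxiliary notation.

```latex
% Annuli A_k := { z : 2^{-k-1} r_1 <= |x-z| < 2^{-k} r_1 }, k = 0, 1, 2, ...
% On A_k the kernel is at most (2^{-k-1} r_1)^{-(n-1+s)}, and A_k is contained in B(x, 2^{-k} r_1).
% The point x itself carries no mass, because mu(B(x,r)) <= c r^d -> 0 as r -> 0.
\[
\int_{B(x,r_1)} |x-z|^{-n+1-s}\,\mu(dz)
\le \sum_{k=0}^{\infty} \big(2^{-k-1} r_1\big)^{-(n-1+s)}\, \mu\big(B(x, 2^{-k} r_1)\big)
\le c\,2^{\,n-1+s}\, r_1^{\,d-(n-1+s)} \sum_{k=0}^{\infty} 2^{-k\,(d-(n-1+s))} .
\]
```

The geometric series converges exactly when $d > n - 1 + s$, and the resulting bound does not depend on $x \in B$. This is the kind of uniform bound on the Riesz potential that makes every path $(s, \infty)$-variable.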
Concerning (3.1) we are led back to a well known Peano type existence result.
We also consider upper regularity conditions for paths. If for a given number s > 0, a Borel set B ⊂ R n and a path Y : then there are constants c > 0 and r 0 > 0 such that Remark 3.7.
(i) No non-constant ϕ ∈ BV loc (R n ) can be upper d-regular with d > n, as a Lebesgue differentiation argument shows.

Solutions in dimension one
For the case n = 1 we obtain the following slight modification of the constructive existence results [43, Theorem 3.3] and [89, Theorem 2.1], see Example 3.9 below. It allows one to compensate for the failure of σ to be Hölder of sufficiently high order in some space region by sufficient activity of Y in some other space region, stated in terms of (3.3).
is absolutely continuous and strictly increasing on R.
is a variability solution for $\sigma$ and $Y$ started at $\mathring{x}$.
In the case that in Theorem 3.8 no upper regularity of σ is used but Y is assumed to be upper d Y -regular, the Hölder continuity and upper regularity of Y together can produce sufficient regularity: The hypotheses in Theorem 3.8 force the condition and for s ∈ ( 1 γ − 1, d Y ) the theorem yields solutions. If the Hölder order γ of Y is higher (lower), lower (higher) upper regularity of Y is needed.
T ] of arbitrary Hurst index H ∈ (0, 1) is almost surely γ-Hölder for any γ < H. On the other hand, it is upper 1-regular almost surely, as can for instance be concluded from the joint continuity of its local times, see [13, p. 1271], [14,Theorem] or [92], which entails their local boundedness, see Proposition 4.14 below. If H > 1 2 then for any γ ∈ ( 1 2 , H) condition (3.5) holds and for almost every realization (seen as a deterministic path) in place of Y , Theorem 3.8 yields variability solutions for any starting pointx and no matter whether σ is upper regular of any order or not. This recovers the fractional Brownian motion special case of [89, Theorem 2.1], which roughly speaking made use of the facts that, for any γ < H, the paths of B H are γ-Hölder almost surely and that . Let g be as in Theorem 3.8 and suppose that Y is such that on each of the intervals [0, 1 2 it equals a scaled down copy of Z, and on [g(1), +∞) it equals a typical path of fractional Brownian motion with Hurst index H started at zero. Then Y is Hölder of any order γ < H ∧ δ, upper d-regular on (g(−1), g(1)) and upper 1-regular on R \ (g(−1), g (1)). If d = 1 2 and δ = H = 3 4 then Theorem 3.8 yields variability solutions for all starting points, note that on (−2, 2) we can use the upper d σ -regularity of σ and on R \ (−g(1), g(1)) the upper 1-regularity of Y .

Doss' transformation
To understand analogs of Theorem 3.8 for $n \ge 2$ we implement a multidimensional Doss transformation, [27,95]. Let $\sigma : \mathbb{R}^n \to \mathbb{R}^{n \times n}$ be a matrix coefficient as before, and let $\sigma_j = (\sigma_{j1}, ..., \sigma_{jn})$ denote its $j$th row. Suppose that $f = (f_1, \dots, f_n) : \mathbb{R}^n \to \mathbb{R}^n$ is a function with components $f_k : \mathbb{R}^n \to \mathbb{R}$ which satisfies the deterministic equation $\nabla f = \sigma(f)$, (3.7) where we have used the symbol $\nabla f$ for the Jacobian matrix of $f$ to avoid confusion with the symbol for gradient measures.
The jth row of ∇f is the gradient ∇f j = (∂ 1 f j , ..., ∂ n f j ) of f j , and (3.7) states that ∇f j = σ j (f ) for all j. If we set provided that we can justify the use of the change of variable formula (2.10) with f j in place of F and (3.7). As in the classical case, [95], one needs strong assumptions to find solutions f to (3.7), and in our case an additional difficulty is that the components of σ are BV loc only. Our first main assumption on σ is as follows.
Remark 3.13. Since all paths are continuous and T is finite, it suffices to have σ with the respective properties defined on a bounded domain in R n . To save notation and have shorter proofs we formulate Assumption 3.12 as stated.
The following lemma is a direct consequence of Assumption 3.12 and the Cayley-Hamilton theorem, see Subsection 5.1.
Lemma 3.14. Let Assumption 3.12 be satisfied. Then there exists a matrixσ = (σ jk ) 1≤j,k≤n of functionsσ jk ∈ BV loc (R n ) ∩ L ∞ (R n ) so that σσ =σσ = I L n -a.e. and this matrix is unique up to L n -a.e. equivalence. Moreover, there exists someε > 0 such that det(σ) >ε L n -a.e. on R n . By Assumption 3.12 and Lemma 3.14 our second main assumption makes sense. We write D i to denote the partial differentiation in direction of the ith coordinate in the sense of tempered distributions.
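Assumption 3.15 (curl-free assumption). For all $i$, $j$, and $k$ we have $D_i \hat{\sigma}_{kj} = D_j \hat{\sigma}_{ki}$ in the sense of tempered distributions, with $\hat{\sigma}$ the inverse matrix from Lemma 3.14, that is, $\int_{\mathbb{R}^n} \hat{\sigma}_{kj}\,\partial_i \varphi\,dx - \int_{\mathbb{R}^n} \hat{\sigma}_{ki}\,\partial_j \varphi\,dx = 0$ for any $i$, $j$ and $k$ and any Schwartz function $\varphi \in \mathcal{S}(\mathbb{R}^n)$.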
e. on R for some ε > 0 and for all i = 1, 2, . . . , n. Then Assumption 3.12 is clearly satisfied. If also Assumption 3.15 is satisfied, it follows that σ ii depends only on x i . Consequently, we are led back to one-dimensional equations as discussed in Subsection 3.2.
Assumptions 3.12 and 3.15 allow the following result for solutions to (3.7).
Proposition 3.18 (angle condition). Suppose that σ satisfies Assumption 3.12. For n ≥ 2 suppose also that it satisfies Assumption 3.15 and that there exists δ > −1 such that for L n -a.e. x ∈ R n we have Then there exists a bi-Lipschitz function f : R n → R n which solves (3.7).
Condition (3.10) ensures the applicability of a global inversion result, Proposition 5.4, which yields a global Lipschitz solution f to (3.7).
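The role of (3.7) in the Doss transformation can be seen from a formal chain rule computation. This is only a sketch: it assumes that $f$ is smooth enough for the classical chain rule, whereas the text relies on the change of variable formula (2.10), and the auxiliary path $Z$ is notation introduced here for illustration.

```latex
% Let Z : [0,T] -> R^n have the same increments as Y (dZ = dY), and put X := f(Z).
% By (3.7), \partial_k f_j = \sigma_{jk}(f), so the chain rule gives for j = 1, ..., n:
\[
dX^j_t = \sum_{k=1}^n \partial_k f_j(Z_t)\,dZ^k_t
= \sum_{k=1}^n \sigma_{jk}\big(f(Z_t)\big)\,dY^k_t
= \sum_{k=1}^n \sigma_{jk}(X_t)\,dY^k_t .
\]
```

Formally, $X = f(Z)$ therefore solves the component-wise equations with starting point $f(Z_0)$, which is why a solution can be obtained as an image of the driver under $f$.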
Example 3.21. Let $n = 2$, $0 < a < b$ and consider the cone satisfies Assumptions 3.12 and 3.15 and (3.10). A solution to (3.7) is given by

Example 3.22. Let $n = 2$, let $\varphi_C : \mathbb{R} \to \mathbb{R}$ be as in Example 2.7 (with $n = 1$), and satisfies Assumptions 3.12 and 3.15 and (3.10), the fact that $\sigma_{22} \in BV_{\mathrm{loc}}(\mathbb{R}^n)$ follows using [5, Theorem 3.16]. For this choice of $\sigma$ the function

Remark 3.23. The existence of almost everywhere defined local inverses of Sobolev functions is treated in [39,40]. In

Solutions in arbitrary dimensions
The following is a version of Theorem 3.8 for arbitrary n ≥ 1.
Theorem 3.24. Suppose that $\sigma$ is as in Assumption 3.12 and $f : \mathbb{R}^n \to \mathbb{R}^n$ is a bi-Lipschitz function which solves (3.7). Let $s \in (0, 1)$, (3.11) and suppose the same is true for $B^c$ in place of $B$. Then the path

By Proposition 3.18, the first sentence in Theorem 3.24 could be replaced by requiring $\sigma$ to satisfy Assumption 3.12 and, if $n \ge 2$, also Assumption 3.15 and (3.10). Note that for $n = 1$, Theorem 3.8 gives the same result under less restrictive assumptions on $\sigma$.
If only the upper regularity of σ is used, Theorem 3.24 complements Theorem 3.6 by constructing a solution. If also the upper regularity of Y is used one can, in some cases, obtain solutions for discontinuous σ. However, the upper regularity condition on Y is quite restrictive: Already if n = 2 and σ has jumps one cannot hope to use Theorem 3.24 to obtain solutions to (3.1) when Y is a typical path of a fractional Brownian motion B H with Hurst index H >

(3.13)
and for all $j$ and $k$ we have (3.14). Then (3.12) defines a variability solution $X \in C^\gamma([0,T],\mathbb{R}^n) \cap V(\sigma, s)$ for $\sigma$ and $Y$ started at $\mathring{x}$.
In comparison with (3.6) conditions (3.3) and (3.13) are quite restrictive. They ensure a rather weak regularization solely due to dimensional effects. Condition (3.6) encodes a strong additional regularization by randomness. Also Theorem 3.25 becomes efficient in a probabilistic context. The following corollary may be seen as a generalization of [43,Theorem 3.3], and [89, Theorem 2.1].

(3.15)
and for all $j$ and $k$ we have (3.14). Then for $\mathbb{P}$-a.e. $\omega \in \Omega$ the path

At any time $t > 0$ the averaging effect of the process $Y$ leads to a regularization. The moment condition (3.14) rules out overly bad behavior of $\sigma$ at the starting point $\mathring{x}$ at time $t = 0$, when this effect is not yet present.

To apply Corollary 3.26 suppose that $s \in (0, 1)$ and that $\varepsilon$ above also satisfies $0 < \varepsilon < 1 - s$. If $\sigma$ satisfies Assumptions 3.12 and 3.15, (3.10), and $\mathring{x}$ is such that (3.17) holds, then for $\mathbb{P}$-a.e. realization of $Y$ the path $X$ as in Corollary 3.26 is a variability solution for $\sigma$ and $Y$ started at $\mathring{x}$, and $X \in C^\gamma([0,T],\mathbb{R}^2) \cap V(\sigma, s)$ for any $\gamma < H$. Note that if all $\sigma_{jk}$ are actually in $BV(\mathbb{R}^n)$ then (since obviously $s < \frac{1}{H}$) condition (3.17) is automatically satisfied for $\mathcal{H}^{n-1}$-a.e. $\mathring{x} \in \mathbb{R}^2$, see Remark 4.6.

A uniqueness result
We establish a uniqueness result for variability solutions.
Theorem 3.28. Suppose that $\sigma = (\sigma_{jk})_{1 \le j,k \le n}$ satisfies Assumptions 3.12 and 3.15, $Y \in C^\gamma([0,T],\mathbb{R}^n)$ for some $\gamma \in (0, 1)$, and $\mathring{x} \in \mathbb{R}^n$. Then there exists at most one variability solution of Hölder order greater than $\frac{1}{2}$ for $\sigma$ and $Y$ started at $\mathring{x}$.

The proof of Theorem 3.28 shows that the only solution candidate is (3.12). We can conclude the following results.
Corollary 3.29. Suppose that Assumptions 3.12 and 3.15 hold and let $f$ be the solution to (3.7). If the assumptions of Theorem 3.24 (or Theorem 3.8 in the case $n = 1$) are satisfied, then for any $\mathring{x} \in \mathbb{R}^n$ there exists a unique variability solution $X \in C^\gamma([0,T],\mathbb{R}^n)$ for $\sigma$ and $Y$ starting at $\mathring{x}$, given by (3.12).
Corollary 3.30. Suppose that Assumptions 3.12 and 3.15 hold and let $f$ be the solution to (3.7). If the assumptions of Corollary 3.26 are satisfied, then for any $\mathring{x} \in \mathbb{R}^n$ and $\mathbb{P}$-a.e. $\omega \in \Omega$ there exists a unique variability solution $X(\omega) \in C^\gamma([0,T],\mathbb{R}^n)$ for $\sigma$ and $Y(\omega)$ starting at $\mathring{x}$, given by (3.16).
Remark 3.31. Following the ideas of [89], we could replace det(σ) > ε in Assumption 3.12 with det(σ) ≥ 0. In this case we would obtain a uniqueness result similar to Theorem 3.28 up to the first time when det(σ(X t )) = 0. However, the existence of the solution becomes more complicated.

Variability and consequences
We provide a potential theoretic interpretation of the $(s, 1)$-variability. This allows us to prove that $\varphi \circ X$ is a well-defined class in $W^{\beta,1}_0(0, T)$.

Riesz potentials and occupation measures
The Riesz potential of order $0 < \gamma < n$ of a nonnegative Borel measure $\nu$ on $\mathbb{R}^n$ is defined by
$$U^\gamma \nu(x) = c_\gamma \int_{\mathbb{R}^n} |x - y|^{\gamma - n}\,\nu(dy), \quad x \in \mathbb{R}^n,$$
where $c_\gamma > 0$ is a well-known constant depending only on $n$ and $\gamma$, see for instance [67, Chapter I, Section 3] or [17, Section V.4]. The mutual Riesz energy of order $0 < \gamma < n$ of two nonnegative Borel measures $\nu_1$ and $\nu_2$ on $\mathbb{R}^n$ is defined as
$$\int_{\mathbb{R}^n} U^\gamma \nu_1\,d\nu_2 = c_\gamma \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} |x - y|^{\gamma - n}\,\nu_1(dx)\,\nu_2(dy).$$
This quantity takes values in $[0, +\infty]$, is symmetric in $\nu_1$ and $\nu_2$, and we have

(i) Discussions of absolutely continuous occupation measures in terms of their densities, referred to as local times, are a classical subject in probability theory, [12,13,44,94]. As mentioned in [44, Section 3], applications of occupation densities to nonrandom functions are much less discussed. Results on possibly singular occupation measures are rather sparse and have mainly been used to obtain results on dimensions of images or graphs of stochastic processes, see for instance [31, Section 16] and [93, Section 4] and the references cited there.
(ii) In the special case that $X$ is an absolutely continuous curve parametrized to have unit speed the right hand side of (4.3) is just the line integral of $g$ along $X$, and formula (4.3) may be seen as a special case of the area formula for the path $X$, [32, Theorem 3.2.6]. Another interpretation of formulas of type (4.3) in the spirit of geometric measure theory has been established in [16]. There the authors proved an occupation time formula for $\mathbb{R}^n$-valued semimartingales which has features of a coarea formula for $C^2(\mathbb{R}^n)$-functions. They observed that the fluctuations of the semimartingale can lead to the existence of certain 'transversal' (in a Rokhlin sense) densities which may be seen as generalizations of local times of one-dimensional semimartingales, see [16, Theorems 1 and 3].
The definitions (4.1) and (4.2) and the identity (4.3) now immediately allow us to rewrite (2.1) in terms of Riesz energies. Given a function $\varphi \in BV_{\mathrm{loc}}(\mathbb{R}^n)$ and a set $U \subset \mathbb{R}^n$ we use $\mu_{\varphi,U} := \|D\varphi\||_U$ to abbreviate the restriction of $\|D\varphi\|$ to $U$.
In particular, $X$ is $(s, 1)$-variable with respect to $\varphi$ if and only if there is a relatively compact open neighborhood $U$ of $X([0, T])$ such that the mutual Riesz energy of order $1 - s$ of $\mu_{\varphi,U}$ and $\mu_X^{[0,T]}$ is finite.

Remark 4.3. (i) Both ideas, the variability of curves and the use of Riesz energies, also appear in connection with regularization properties of operators in harmonic analysis, see for instance [88] for the first idea and [46,50] for the second.
(ii) Occupation measures appear naturally in connection with regularization by noise, [18,25,35,37,42,45,48,64,80,91]. The addition (in the simplest case) of a 'fast moving' perturbation to an equation can provoke well-posedness for otherwise non well-posed equations. A typical key step is to observe the Hölder (or even C 1+α -) regularity of integrals of functions (with certain continuity or integrability properties) w.r.t. occupation measures, seen as a function of the starting point of the path. See for instance [35, Section 2.1 and in particular, Theorems 2.5 and 2.6].
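The rewriting of $(s,1)$-variability in terms of Riesz energies, discussed before Remark 4.3, can be spelled out as follows. This is a sketch under two assumptions that are not visible in this excerpt: that the Riesz kernel of order $\gamma$ is normalized as $c_\gamma |x-y|^{\gamma-n}$, and that (4.3) is the occupation time formula $\int g\,d\mu_X^{[0,T]} = \int_0^T g(X_t)\,dt$.

```latex
% Sketch: mutual Riesz energy of order 1-s of \mu_{\varphi,U} and the occupation measure.
% Assumptions: kernel c_\gamma |x-y|^{\gamma-n}; occupation time formula for \mu_X^{[0,T]}.
\begin{align*}
  \int_{\mathbb{R}^n} U^{1-s}\mu_{\varphi,U}\, d\mu_X^{[0,T]}
  &= \int_0^T U^{1-s}\mu_{\varphi,U}(X_t)\, dt \\
  &= c_{1-s} \int_0^T \int_U \frac{\|D\varphi\|(dy)}{|X_t - y|^{\,n-1+s}}\, dt .
\end{align*}
% Finiteness of the right-hand side for some relatively compact open U \supset X([0,T])
% is then the (s,1)-variability of X with respect to \varphi.
```

The integrand is large only where $X_t$ comes close to regions in which $\|D\varphi\|$ is concentrated, which matches the heuristic that $X$ should spend little time there.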
The following consequence of Remark 4.2 allows us to define compositions of paths and $BV$-functions. It views (4.5) as a condition on $X$ relative to $\varphi$. Recall the meaning of the symbols $S_\varphi$ and $L^\infty(X)$ from Definitions 2.3 and 2.5.
Proposition 4.4. Let $\varphi \in BV_{\mathrm{loc}}(\mathbb{R}^n)$ and suppose that $X$ is $(s, 1)$-variable with respect to $\varphi$ for some $s \in (0, 1)$. Then the discontinuity set $S_\varphi$ of $\varphi$ is a zero set for the occupation measure $\mu_X^{[0,T]}$, that is, (4.6) holds. In particular, we have (4.7) and (4.8).

Proof of Lemma 2.4. If $\varphi^{(1)}$ and $\varphi^{(2)}$ both are (component-wise) Lebesgue representatives of $\varphi = (\varphi_1, \dots, \varphi_m) \in BV_{\mathrm{loc}}(\mathbb{R}^n)^m$ and $X$ is $(s, 1)$-variable w.r.t. each $\varphi_i$, then by (2.2), (2.3) and (4.7) we have

Recall that for any $\gamma \ge 0$ the upper $\gamma$-density $\Theta^{*\gamma}\nu(x)$ of a Borel measure $\nu$ on $\mathbb{R}^n$ at a point $x \in \mathbb{R}^n$ is defined by

as in (4.5). Since the integral is finite we can find a Borel set $N \subset \mathbb{R}^n$ such that $\mu$ for all such $x$. Since $\mu_{\varphi,U} = \|D(\varphi|_U)\|$, where $\varphi|_U \in BV(U)$ is the restriction of $\varphi$ to $U$, (4.10) trivially implies and, with suitable $r_0(x) > 0$, for all $x \in U \setminus N$. However, from (4.11) and (4.12) it then follows that $\varphi|_U$ has an approximate limit at all $x \in U \setminus N$, see [5, Remark 3.82]. In other words, $S_\varphi \cap U \subset N$. Since $\mu_X^{[0,T]}(\mathbb{R}^n \setminus U) = 0$, (4.6) follows. By monotone convergence (4.3) extends to all nonnegative Borel functions $g : \mathbb{R}^n \to [0, +\infty]$, and for $g = 1_{S_\varphi}$ we obtain (4.7). Estimate (4.8) follows from (2.2): For any $x \in \mathbb{R}^n \setminus S_\varphi$ we have

Remark 4.5. Condition (4.5) may also be seen as a requirement on $\varphi$ relative to the path $X$: The measure $\mu$

Remark 4.6. The density comparison theorem also implies that for any $\varphi \in BV_{\mathrm{loc}}(\mathbb{R}^n)$ the set of points $x \in \mathbb{R}^n$ at which $\Theta^{*(n-1)}\|D\varphi\|(x) = +\infty$ is an $\mathcal{H}^{n-1}$-null set, see [5, proof of Lemma 3.75].
The following example gives a coarse sufficient condition for (4.5).

Upper regularity and bounded potentials
The first maximum principle, [67, Chapter I, Theorem 1.10], states that, for any s ∈ (0, 1) and any Borel measure ν, a bound valid with some M > 0 for ν-a.e. x ∈ R n , implies (4.14) for all x ∈ R n . X -a.e. x ∈ R n . By the maximum principle it then holds for all x ∈ R n and as a consequence, (4.5) holds with any ϕ ∈ BV loc (R n ) and all relatively compact open U . The second part of the statement follows as in Remark 4.5.
Note that (4.15) implies (4.13), but in contrast to Example 4.7 no assumption is made on the measures µ ϕ,U .
We prepare a counterpart of Proposition 4.8 in terms of Dϕ . The gradient measure Dϕ of a function ϕ ∈ BV (R n ) does not charge any H n−1 -null set, [  Proof. Again by the maximum principle (4.17) holds for all x ∈ R n . Taking into account (4.16) we can conclude that if ϕ satisfies (4.17) for all compact K, then ϕ cannot have jumps: Since the set J ϕ of approximate jump points has Hausdorff dimension n − 1 it has zero H n−1+s -measure, and as before we can see such sets cannot be charged by Dϕ. The statement on Hölder continuity follows using large closed balls in place of K, a simple cut-off argument and Corollary C.3. Remark 4.11. Given a Borel measure ν on R n the pointwise lower Hausdorff dimension of ν at x ∈ R n is defined as where Θ * γ ν(x) is as in (4.9). Its lower Hausdorff dimension is defined as dim H ν = ess inf x∈R n dim H ν(x), and it is well known that dim H ν = inf{dim H A : A ⊂ R n Borel and ν(A) > 0}. See for instance [9,78]. Conditions (4.15) and (4.17) imply that dim H µ [0,T ] X ≥ n − 1 + s and, respectively, dim H Dϕ ≥ n − 1 + s.
The following is well known.
Proposition 4.12. Let $\nu$ be a finite Borel measure on $\mathbb{R}^n$ with compact support. If with some $M > 0$ we have $\int_{\mathbb{R}^n} |x - y|^{-s}\,\nu(dy) < M$ for all $x \in \operatorname{supp}\nu$, then there is some $c > 0$ such that $\nu(B(x, r)) \le c\,r^s$ for all $x \in \operatorname{supp}\nu$ and $r > 0$. If there is some $r_0 > 0$ such that $\nu(B(x, r)) \le c\,r^d$ for all $x \in \operatorname{supp}\nu$ and $0 < r < r_0$, then for any $s < d$ there is some $M > 0$ such that $\int_{\mathbb{R}^n} |x - y|^{-s}\,\nu(dy) < M$ for all $x \in \operatorname{supp}\nu$.
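The second implication in Proposition 4.12 can be illustrated by a dyadic shell decomposition. The following is a sketch, not necessarily the argument of [77, p. 109]; it assumes $0 < s < d$ and that the growth bound $\nu(B(x,r)) \le c\,r^d$ has already been extended to all $r > 0$ by enlarging $c$, which is possible because $\nu$ is finite.

```latex
% Sketch: growth bound \nu(B(x,r)) <= c r^d for all r > 0, with 0 < s < d,
% gives a uniformly bounded potential. B(x,r) denotes the open ball.
% Note \nu(\{x\}) = 0 by the growth bound, so the shells cover \nu-almost all of R^n.
\begin{align*}
  \int_{\mathbb{R}^n} |x-y|^{-s}\,\nu(dy)
  &= \sum_{k \in \mathbb{Z}} \int_{\{2^{-k-1} \le |x-y| < 2^{-k}\}} |x-y|^{-s}\,\nu(dy)
   \le \sum_{k \in \mathbb{Z}} 2^{(k+1)s}\,\nu\bigl(B(x, 2^{-k})\bigr) \\
  &\le c\,2^{s} \sum_{k \ge 0} 2^{-k(d-s)}
     + \nu(\mathbb{R}^n)\,2^{s} \sum_{k < 0} 2^{ks} \\
  &= \frac{c\,2^{s}}{1 - 2^{-(d-s)}} + \frac{\nu(\mathbb{R}^n)}{1 - 2^{-s}} .
\end{align*}
% Small scales (k >= 0) use the growth bound; large scales (k < 0) use finiteness of \nu.
```

The bound is independent of $x \in \operatorname{supp}\nu$, so any $M$ strictly larger than the right-hand side works.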
We briefly sketch the folklore proof for the convenience of the reader.
Proof. The first statement is immediate from (C.2). To see the second claim note that since $\nu$ is finite, we can readjust $c$ to obtain $\nu(B(x, r)) \le c\,r^d$ for all $x \in \operatorname{supp}\nu$ and $r > 0$. If $R > 0$ is such that $B(0, R/2)$ contains $\operatorname{supp}\nu$, then a classical argument, [77, p. 109], shows that for any $x \in \operatorname{supp}\nu$ we have

of $X$ is absolutely continuous with density $L^X_T \in L^p(\mathbb{R}^n)$ for some $p \in [1, \infty]$. Then $X$ is upper $\frac{n}{q}$-regular, where $\frac{1}{p} + \frac{1}{q} = 1$ with the agreement $\frac{1}{+\infty} = 0$. In particular, if the local times are bounded, $L^X_T \in L^\infty(\mathbb{R}^n)$, then $X$ is upper $n$-regular.

Proof. We have

Remark 4.15. Existence and regularity of local times are well-studied in the case of Gaussian processes, see e.g. [7,12,13,44]. A key property is the so-called local nondeterminism which guarantees that, roughly speaking, the increments of the process are not too degenerate.
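The statement on occupation densities in $L^p$ that precedes Remark 4.15 can be checked with Hölder's inequality. The sketch below assumes that upper $d$-regularity of $X$ means a bound $\mu_X^{[0,T]}(B(x,r)) \le c\,r^d$ (the definition is not part of this excerpt); $\omega_n$ denotes the Lebesgue measure of the unit ball, a symbol introduced here for illustration.

```latex
% Sketch: occupation density L^X_T in L^p implies upper (n/q)-regularity, 1/p + 1/q = 1.
% Assumption: upper d-regularity means \mu_X^{[0,T]}(B(x,r)) <= c r^d.
\begin{align*}
  \mu_X^{[0,T]}\bigl(B(x,r)\bigr)
  = \int_{B(x,r)} L^X_T(y)\,dy
  \le \|L^X_T\|_{L^p(\mathbb{R}^n)}\, \bigl|B(x,r)\bigr|^{1/q}
  = \omega_n^{1/q}\, \|L^X_T\|_{L^p(\mathbb{R}^n)}\; r^{\,n/q} .
\end{align*}
% For p = +\infty (q = 1) the exponent is n; for p = 1 (q = +\infty) it is 0,
% which only reflects that the occupation measure has total mass T.
```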
In some situations it may be useful to localize the conditions on ϕ and X.
Corollary 4.16. Let $\varphi \in BV_{\mathrm{loc}}(\mathbb{R}^n)$, let $X$ be a path and $s \in (0, 1)$. If $B \subset \mathbb{R}^n$ is a Borel set such that (4.15) holds for $\mu$

Proof. The claim follows from

We discuss details of Examples 2.7 and 2.8.
Example 4.18. Let ϕ C : R n → R be as in Example 2.7. If s ∈ (0, d C ), then the fact that Dϕ C (B(x, r)) ≤ cr n−1+d C together with Corollary 4.13 (ii) shows that any path X is in V (ϕ C , s, ∞). From now on assume that s ∈ (d C , 1). For a constant path X ≡ x with fixed x ∈ R n we have µ where δ x is the point mass probability measure at x. If x = ( 1 2 , 0, ..., 0), then x has distance 1 6 to supp Dϕ C , and hence U 1−s Dϕ C (x) is bounded and X ∈ V (ϕ C , s, ∞). On the other hand, if x = (0, ..., 0), then for any open neighborhood U of the origin, as follows from Dϕ C (B(y, r)) ≥ cr n−1+d C and standard arguments ( [77, p. 109]). Consequently X is not in V (ϕ C , s) in this case. For a non-constant smooth function X on R µ [0,T ] X ((x − r, x + r)) ≤ cr implies that X ∈ V (ϕ C , s, ∞), in the case n = 1. For n = 2 we have Dϕ C (B(y, r)) ≥ cr 1+d C , so that, again by standard arguments, U 1−s Dϕ C (y) = +∞ for any y ∈ which is summable over k = 0, 1, 2, ... For n = 1 the integral over t ∈ [−1, 1] is even bounded. It follows that the mutual energy of µ and ν is finite, and hence for such O, X and s we have X ∈ V (1 O , s). If n = 2 and X spends L 1 -positive time in ∂O, then we can find 0 ≤ a < b ≤ T such that X([a, b]) ⊂ ∂O and µ    |ξ| s−1 dξ already implies square integrability, giving (4.13). (Alternatively, a Riemann-Lebesgue argument shows that integrability implies (4.15), which in turn implies (4.13).) The integral in (4.5) is a polarized version of (4.18),

Fourier transform and trading of regularity
Here $\hat{\mu}_{\varphi,U}$ denotes the Fourier transform of the measure $\mu_{\varphi,U}$. Formulas (4.5) and (4.19) suggest a trading of regularity between the measures.
For N = 1 conditions (4.21) and (4.22) (applied to the components of the coefficient) are exactly (3.13) and (3.14). In Corollary 4.25 below the mechanism of Corollary 4.21 is used efficiently in a probabilistic context.

Probabilistic examples
We connect to probabilistic examples. A condition of type (3.6) can ensure variability w.r.t. any BV -function.
Proof. Using Fubini's theorem and the above bound,

For Brownian motion, the special case $H = \frac{1}{2}$, we have (4.24) in dimensions $n = 1$ and $n = 2$ for any $s \in (0, 1)$. For $n = 3$ a fractional Brownian motion with Hurst index $H < \frac{1}{2}$ satisfies (4.24) for $0 < s < \frac{1}{H} - 2$. A higher dimension $n$ of the ambient space forces a higher irregularity of the path in order to have an occupation measure supported on a set of sufficiently small codimension. The arguments for (4.25) are rather standard, [31,58]. The expectation equals The first summand is bounded by The second summand does not exceed

The above idea can be generalized to link $(s, 1)$-variability to upper regularity of the underlying probability measures. Suppose that $s \in (0, 1)$ and that there exists some $g \in L^1(0, T)$ such that for $\mathcal{L}^1$-a.e. $t \in (0, T)$ we have
$$\nu_t(B(x, r)) \le g(t)\,r^d, \quad x \in \mathbb{R}^n,\ r > 0, \qquad (4.26)$$
with some $d > n - 1 + s$. Then (4.23) holds.

Proof. By Fubini's theorem and as in Proposition 4.12 it follows that for $\mathcal{L}^1$-a.e. $t \in (0, T)$ the inner integral is bounded by

A probabilistic version of (4.21) gives similar results.

Proof. Under condition (4.27) identity (4.20) and Fubini's theorem give
which by (4.22) and the arguments in the proof of Corollary 4.21 is seen to be finite.
Example 4.26. For $n \ge 2$, the fractional Brownian motion with $H > \frac{1}{n}$ satisfies (4.27) with $N = 1$, any $0 < \varepsilon < n - \frac{1}{H}$ and $\delta_1 = \frac{1}{H} + \varepsilon$: We have Substituting $u = \frac{1}{2}|y|^2 t^{-2H}$ we see that for any $y \in \mathbb{R}^n \setminus \{0\}$ the inner integral equals Using the convolution identity for Riesz kernels,

The following generalization of this example is immediate. Suppose that $s \in (0, 1)$, $\varepsilon$, $c$, $x_k$ and $\delta_k$ are as in Corollary 4.25, and that there exists some $g \in L^1(0, T)$ such that for $\mathcal{L}^1$-a.e. $t \in (0, T)$ we have

Similar arguments can be applied also to more general bridges conditioned, e.g., on several time marginals, cf. [86, Remark 3.6(ii)].
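The substitution used in Example 4.26 can be carried out explicitly. The sketch below assumes that the inner integral is the time integral over $(0,T)$ of the centered Gaussian density $p_t$ of $B^H_t$ in $\mathbb{R}^n$ with covariance $t^{2H} I_n$; this identification is an assumption, since the display is not part of this excerpt. $\Gamma(a, z)$ denotes the upper incomplete gamma function.

```latex
% Sketch: time integral of the fBm density via u = |y|^2 / (2 t^{2H}).
% Assumption: p_t(y) = (2\pi t^{2H})^{-n/2} \exp(-|y|^2 / (2 t^{2H})).
\begin{align*}
  \int_0^T p_t(y)\,dt
  &= \frac{(2\pi)^{-n/2}}{2H} \Bigl(\frac{|y|^2}{2}\Bigr)^{\frac{1}{2H} - \frac{n}{2}}
     \int_{|y|^2/(2T^{2H})}^{\infty} u^{\frac{n}{2} - \frac{1}{2H} - 1} e^{-u}\,du \\
  &= \frac{\pi^{-n/2}\, 2^{-\frac{1}{2H}}}{2H}\;
     \Gamma\Bigl(\frac{n}{2} - \frac{1}{2H},\, \frac{|y|^2}{2T^{2H}}\Bigr)\;
     |y|^{-(n - \frac{1}{H})} .
\end{align*}
% For H > 1/n the exponent n/2 - 1/(2H) is positive, so the incomplete gamma factor is
% bounded by \Gamma(n/2 - 1/(2H)) uniformly in y, and the time integral is dominated
% by a multiple of the Riesz kernel |y|^{1/H - n} of order 1/H.
```

This is where the condition $H > \frac{1}{n}$ enters: it makes the integral converge at $u \to 0$, that is, for $y$ near the origin.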

A brief discussion of variability versus irregularity
In [18] the authors studied ODEs of type $dx_t = b(t, x_t)\,dt + dw_t$, where $b$ is an irregular vector field and $w$ is a continuous fast moving perturbation. Although our goal and methods are different from theirs, our point of view upon irregular paths and occupation measures follows a similar spirit. In cases where $b = b(x)$ is a bounded continuous function and $w$ is 'active enough' they can prove existence and uniqueness of a continuous solution $x$, [18, Theorem 1.9]. If $b$ is a distribution only, $b(x_\cdot)$ must be defined appropriately, and in [18] this was done for paths $x$ that differ from the sufficiently fast moving perturbation $w$ only by a Hölder signal, [18, Definition 1.10 and Theorem 1.11]. The needed activity of $w$ is encoded in the boundedness and (temporal) Hölder continuity in a certain (spatial) Hölder norm of the image . Increased activity of $w$ implies higher regularity (diffusivity) of its occupation measure, and hence better decay of its Fourier transform (larger $\varrho$), which encodes a stronger regularization effect of $T^w$. A refined and very systematic analysis of $(\varrho, \gamma)$-irregularity is provided in [41]. Condition (4.28) is a condition for single paths, and it is later connected to the individual coefficient $b$ via the mapping properties of $T^w$. In contrast, (2.1) is a condition on $X$ relative to a given $BV$-function $\varphi$. In (4.18) and (4.19) the interval $[0, T]$ is fixed, and the decay of the Fourier transforms at infinity is quantified in terms of integrability properties. It might be interesting to investigate quantities that 'interpolate' between (2.1) and (4.28), for instance a version of (4.19) that incorporates time dependencies. It might also be interesting to see whether a concept of variability relative to both a low regularity diffusion coefficient $\sigma$ and a low regularity drift vector field $b$ could be useful.

[20, Proposition 4.6], which ensures that compositions $\varphi \circ X$ are elements of $W^{\beta,p}(0, T)$.
Compositions of $BV_{\mathrm{loc}}$-functions and Hölder paths

We first provide a bound for the Gagliardo seminorm part in the norm $\|\varphi \circ X\|_{W^{\beta,p}(0,T)}$.

Proposition 4.29. Let $\varphi \in BV_{\mathrm{loc}}(\mathbb{R}^n)$. Let $X : [0, T] \to \mathbb{R}^n$ be a path which is Hölder continuous of order $\alpha \in (0, 1]$. Suppose that $s \in (0, 1)$, $p \in [1, +\infty)$ and $X \in V(\varphi, s, p)$. Then for any $\beta \in (0, \alpha s)$ there is a constant $c > 0$, depending only on $\alpha$, $\beta$, $n$, $p$ and $s$, such that

(4.29)

If $\varphi$, $X$, $\alpha$ and $s$ are as before but $X \in V(\varphi, s, \infty)$, then for any $\beta \in (0, \alpha s)$ there is a constant $c > 0$, depending only on $\alpha$, $\beta$, $n$ and $s$, such that ess sup (4.30)

The proof makes use of maximal functions and some of their basic properties; the necessary definitions and results can be found in Appendix C.
Proof. By (4.7) we have for any nonnegative Borel function g : by (4.31), and viewing this as a nonnegative function of t, also (4.32) Let µ := Dϕ | U denote the restriction of Dϕ to U . By Proposition C.1 and the α-Hölder continuity of X the right hand side of (4.32) is seen to be bounded by here M 1−s,R µ denotes the fractional maximal function of µ of order 1 − s (and truncated at radius R > 0), see (C.1). The trivial estimate (C.2) implies for any t ∈ [0, T ] and with c > 0 depending only on n and s, and therefore Using Fubini's theorem we obtain the same upper bound for the summand with µ(X τ ) in place of µ(X t ). Combining the estimates and using the symmetry of the integrand in the Gagliardo seminorm, we arrive at (4.29). The estimate (4.30) follows similarly.  3] it has been shown that for discontinuous ϕ ∈ BV (R) one cannot expect ϕ(X) to have finite p-variation for any p ≥ 1, if X visits a point of discontinuity of ϕ infinitely many times. In particular, ϕ(X) cannot be Hölder continuous of any order in this case. This motivates to use Sobolev norms and generalized Stieltjes type integrals rather than p-variation and Young integrals.
If X ∈ V (ϕ, s, p), a pinning argument shows that ϕ must be in L p (X). This entails that, for appropriate β and p, the composition ϕ • X is in W β,p (0, T ) and, for large enough p, even Hölder continuous.
Lemma 4.31. Let $\varphi \in BV_{\mathrm{loc}}(\mathbb{R}^n)$. Let $X : [0, T] \to \mathbb{R}^n$ be a path which is Hölder continuous of order $\alpha \in (0, 1)$. Suppose that $s \in (0, 1)$, $p \in [1, +\infty]$ and $X \in V(\varphi, s, p)$. Then $\varphi \in L^p(X)$ and $\varphi \circ X \in W^{\beta,p}(0, T)$ for any $\beta \in (0, \alpha s)$. Moreover, if $\alpha s > \frac{1}{p}$, then $\varphi \circ X$ has a (unique) Borel version which is Hölder continuous of any order smaller than $\alpha s - \frac{1}{p}$. (We again use the agreement that $\frac{1}{+\infty} := 0$.)

Proof. Suppose first that $p \in [1, +\infty)$. Choose $t_0 \in [0, T]$ such that $X_{t_0} \in \mathbb{R}^n \setminus S_\varphi$. By Proposition 4.4 such $t_0$ clearly exists. Then we have for a.e. $t \in [0, T]$. Using (4.33) we obtain In order to show that $\varphi \circ X \in L^p(0, T)$ it now suffices to prove that there is some (4.35) and
$$\int_U \frac{\|D\varphi\|(dy)}{|X_{t_0} - y|^{n-1+s}} < +\infty. \qquad (4.36)$$
Let $N \subset [0, T]$ be the set of all $t$ such that $X_t \in S_\varphi$. As seen before, $N$ is a Lebesgue null set, and by Definition 2.5 and (2.

Remark 4.32. By Remark 4.30 and Lemma 4.31 one cannot expect $X$ to be $(s, p)$-variable with respect to $\varphi$ for $p > \frac{1}{\alpha s}$ if $X$ visits discontinuity points of $\varphi$ infinitely often.

In the rest of this subsection we derive an estimate for the weighted $L^p$-term in the norm $\|\varphi \circ X\|_{W^{\beta,p}_0(0,T)}$. Only the special case $p = 1$ will be used later on.

Proposition 4.33. Let $\varphi \in BV_{\mathrm{loc}}(\mathbb{R}^n)$. Let $X : [0, T] \to \mathbb{R}^n$ be a path which is Hölder continuous of order $\alpha \in (0, 1]$ and $X \in V(\varphi, s, p)$ for some $s \in (0, 1)$ and $p \in [1, +\infty)$. Then for any $\beta \in (0, \alpha s \wedge \frac{1}{p})$ there is a constant $c > 0$, depending only on $\alpha$, $\beta$, $n$, $p$ and $s$, such that

where $c > 0$ is a constant depending only on $\beta$ and $q$.
Lemma 4.34 is a slight adaptation of the following result in [21] and [28]: Suppose $D \subset \mathbb{R}^d$ is a bounded Lipschitz domain and let $\delta_D(x) = \inf\{|y - x| : y \in D^c\}$ denote the distance of $x$ to the complement $D^c$. Then, by [28, Equation (17)], we have, for any $q > 0$ and $\alpha \in (0, 1)$, that (4.39) holds for all $u \in C_c(D)$.
Proof. An application of (4.39) to the case $d = 1$, $\alpha = \beta q$ and $D = (0, T)$ yields

For each $n$ let $u_n$ be the continuous function on $[0, T]$ such that $u_n = u$ on $[\frac{1}{n}, T - \frac{1}{n}]$, $u_n$ is linear on each of the intervals $[\frac{1}{2n}, \frac{1}{n}]$ and $[T - \frac{1}{n}, T - \frac{1}{2n}]$, and $u_n \equiv 0$ on $[0, \frac{1}{2n}] \cup [T - \frac{1}{2n}, T]$. Then obviously $u_n \in C_c(0, T)$ and we have

On $[0, \frac{1}{n}]$ the function $u_n$ obeys the Lipschitz bound $|u_n(t) - u_n(\tau)| \le nS|t - \tau|$, which implies

This goes to zero as $n \to \infty$. Similarly, using (4.40), and writing $c$ for positive constants depending only on $q$ and $\beta$ and possibly changing from line to line,

Treating the regions involving intervals $[T - \frac{1}{n}, T]$ similarly concludes the proof.
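The cut-off approximations $u_n$ used in the proof can be written down explicitly near the left endpoint. This is a sketch assuming that $u$ is continuous and bounded on $[0,T]$ and that $n > 2/T$; the constant $S$ is not defined in this excerpt, and $S := 2 \sup_{[0,T]} |u|$ is one admissible choice made here for illustration.

```latex
% Sketch: explicit form of u_n on [0, T - 1/n] and its Lipschitz constant on [0, 1/n].
% Assumptions: u continuous and bounded on [0,T], n > 2/T, S := 2 \sup |u| (our choice).
\[
  u_n(t) =
  \begin{cases}
    0, & 0 \le t \le \tfrac{1}{2n}, \\[2pt]
    (2nt - 1)\, u\bigl(\tfrac{1}{n}\bigr), & \tfrac{1}{2n} \le t \le \tfrac{1}{n}, \\[2pt]
    u(t), & \tfrac{1}{n} \le t \le T - \tfrac{1}{n},
  \end{cases}
\]
\[
  |u_n(t) - u_n(\tau)| \le 2n\, \bigl|u\bigl(\tfrac{1}{n}\bigr)\bigr|\, |t - \tau|
  \le n\, S\, |t - \tau|, \qquad t, \tau \in \bigl[0, \tfrac{1}{n}\bigr].
\]
% The linear piece vanishes at t = 1/(2n) and equals u(1/n) at t = 1/n,
% so u_n is continuous; the right endpoint is treated symmetrically.
```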
A function $\varphi \in L^1_{\mathrm{loc}}(\mathbb{R}^n, \mathbb{R}^m)$, $\varphi = (\varphi_1, \dots, \varphi_m)$, is locally of bounded variation, denoted $\varphi \in BV_{\mathrm{loc}}(\mathbb{R}^n)^m$, if locally its distributional partial derivatives $D_i\varphi$ are $\mathbb{R}^m$-valued vector measures in the sense of [5, Definition 1.4 (a)]. We write again $\|D\varphi\|$ for the total variation of the gradient measure $D\varphi$ of $\varphi$. Elementary norm comparison in $\mathbb{R}^m$ implies that where $\|D\varphi_i\|$ is the total variation of the gradient measure $D\varphi_i$ of $\varphi_i$. We record a consequence of the chain rule for $BV$-functions, [5, Theorem 3.96]. See [4,71] for more general chain rules.

..., $\varphi_m)$ and $\Phi : \mathbb{R}^m \to \mathbb{R}$ is a $C^1$-function with bounded gradient and $\Phi(0) = 0$, then $\Phi \circ \varphi \in BV_{\mathrm{loc}}(\mathbb{R}^n)$ and

Proof. By [5, Theorem 3.96 and its proof] we have $\Phi \circ \varphi \in BV_{\mathrm{loc}}(\mathbb{R}^n)$ and (4.43). Together with (4.42) this implies that for any compact $K \subset \mathbb{R}^n$ we have Now the second statement follows from (4.4).
Proof of Proposition 4.33. For any $N \ge 2$ let $\Phi_N \in C^1(\mathbb{R})$ be an increasing function with $\|\Phi_N'\|_{\sup} \le 1$ and such that $\Phi_N(y) = -N$ for $y \le -N$, $\Phi_N(y) = N$ for $y \ge N$ and $\Phi_N(y) = y$ for $-(N - 1) < y < N - 1$. Then, by Lemma 4.35 (with $m = 1$) and the hypotheses of Proposition 4.33, we have $\Phi_N(\varphi) \in BV_{\mathrm{loc}}(\mathbb{R}^n)$ and $X$ is $(s, p)$-variable w.r.t. each $\Phi_N(\varphi)$. Suppose that (4.37) holds for all $\Phi_N(\varphi)$ in place of $\varphi$. Then, by (4.43), and (4.37) for $\varphi$ follows using Fatou's lemma. Consequently it suffices to prove (4.37) under the assumption that $\sup_{x \in \mathbb{R}^n} |\varphi(x)| \le N$, and we do so in the sequel.
Let (η ε ) ε>0 be a (radially symmetric) flat mollifier as in Appendix B. For each ε > 0 the composition ϕ ε •X of the mollified function ϕ ε := ϕ * η ε with the path X is continuous on [0, T ], so that by (4.38), with a constant c > 0 depending only on β and p. By (B.2) we have lim ε→0 ϕ ε (y) = ϕ(y), y ∈ R n \ S ϕ , (4.45) where S ϕ is the approximate discontinuity set of ϕ. Since is finite, we can use (4.45) together with Proposition 4.4 and bounded convergence to conclude that where c(n, s) > 0 is a constant depending only on n and s. Let next 0 < τ < t ≤ T be distinct and such that X t , X τ ∈ R n \ S ϕ . Then Proposition C.1 and bound (4.33) imply Combining with (4.47) gives Therefore we have for any such t and τ , and as at the end of the proof of Proposition 4.29 we see that, thanks to (s, p)-variability, the right hand side is integrable over [0, T ] 2 . Since by (4.45) we have for any such t and τ , symmetry and dominated convergence imply Applying Fatou's lemma to the left hand side of (4.44) and using (4.46) and (4.49), we obtain Using Proposition 4.29 on the first integral on the right hand side and readjusting constants, we arrive at (4.37).

Proof of existence and regularity of the integral
Now the proof of Theorem 2.12 follows easily.
Proof of Theorem 2.12. To show that ϕ • X ∈ W β,1 0 (0, T ) as claimed in (i) we have to show that Here, the finiteness of the weighted L 1 -term follows from Proposition 4.33 and Lemma 4.31, and the finiteness of the Gagliardo seminorm follows from Proposition 4.29. Since T by (2.5), provided that 1 − β < γ. Thus the existence of the integral (2.8) as claimed in (ii) follows from Proposition 2.11 by choosing β ∈ (1 − γ, αs). To conclude the Hölder regularity claimed in (iii) we can follow [81, Proposition 4.1 (II)] and [100, Proposition 6.2 (i)] and note that, with β and γ as stated and 0 ≤ τ < t ≤ T , we have If X ∈ V (ϕ, s, p), then ϕ • X ∈ W β,p (0, T ) by Propositions 4.29 and 4.33 and Lemma 4.31. For p = +∞ we have as in [81]. For 1 ≤ p < +∞ we can proceed similarly as in [100] and use Hölder's inequality to see that Using β < β ′ < αs and 1 p + 1 q = 1 we also have The following quantitative estimates are a byproduct of the above proof, Propositions 4.29 and 4.33 and (2.7).  If ϕ, X, and Y satisfy the hypotheses of Theorem 2.12 (ii), then for any 1 − γ < β < αs and any t ∈ [0, T ] we have · 0 ϕ(X u )dY u with straightforward modification for p = +∞.

Interpretation as currents
Although it will not be used in the sequel, we briefly comment on an alternative interpretation of (4.51) which is close to the concept of stochastic currents investigated in [38] and [36]. Given a path $X : [0, T] \to \mathbb{R}^n$ and a number $s \in (0, 1)$, set $[\varphi]_{X,s} := \bigl\| U^{1-s}\|D\varphi\| \bigr\|_{L^1(X)}$, $\varphi \in BV(\mathbb{R}^n)$.
Obviously this defines a seminorm on $BV(\mathbb{R}^n)$. Recall that a sequence $(\varphi_n)_n \subset BV(\mathbb{R}^n)$ is said to strictly converge to $\varphi \in BV(\mathbb{R}^n)$ if $\lim_n \varphi_n = \varphi$ in $L^1(\mathbb{R}^n)$ and $\lim_n \|D\varphi_n\|(\mathbb{R}^n) = \|D\varphi\|(\mathbb{R}^n)$, [5, Definition 3.14].
Proposition 4.37. For any path X and any s ∈ (0, 1) the seminorm [·] X,s is lower semicontinuous on BV (R n ) w.r.t. strict convergence. Moreover, is a subspace of BV (R n ), closed w.r.t. strict convergence.
Although it is too strict for most applications, let us mention that the norm $\|\varphi\|_{BV} := \|\varphi\|_{L^1(\mathbb{R}^n)} + \|D\varphi\|(\mathbb{R}^n)$ makes $BV(\mathbb{R}^n)$ a Banach space, and convergence in this norm implies strict convergence. Consequently (4.52) is also closed w.r.t. $\|\cdot\|_{BV}$, hence itself Banach with this norm.
We write $V_{X,s}(\mathbb{R}^n, \mathbb{R}^n)$ for the Banach space of all $\varphi = (\varphi_1, \dots, \varphi_n) \in (BV(\mathbb{R}^n))^n$ with $[\varphi_i]_{X,s} < +\infty$ for all $i$ with norm $\|\varphi\|_{X,s} := \sum_{i=1}^n \bigl( \|\varphi_i\|_{BV(\mathbb{R}^n)} + [\varphi_i]_{X,s} \bigr)$. The following interpretation of the integral as a bounded linear functional on $V_{X,s}(\mathbb{R}^n, \mathbb{R}^n)$ is a special case of (4.51) and seems close to [36, Remark 12].

Proof of the change of variable formula
We provide a proof of Theorem 2.13 which follows by mollification of the coefficient and taking the limit. The following result yields convergence in W β,1 0 (0, T ).
Lemma 4.39. Let ϕ ∈ BV loc (R n ) and X ∈ C α ([0, T ], R n ). Suppose that s ∈ (0, 1), p ∈ [1, +∞), and X ∈ V (ϕ, s, p). Set ϕ ε = ϕ * η ε , where (η ε ) ε>0 is a mollifier. Then for any β ∈ (0, αs) we have Proof. Compare also with [34]. Consider first the seminorm part Since X ∈ V (ϕ, s, p), (B.2) and Proposition 4.4 imply that for a.e. t ∈ [0, T ]. Thus it suffices to find an integrable upper bound in order to apply dominated convergence theorem. For this we use For the first summand on the right hand side we can use Proposition 4.29. For the second, Proposition C.1, the bound (4.33), and Corollary B.3 imply that for ε sufficiently small and L 1 -a.e. t, τ ∈ [0, T ], relatively compact open as in Definition 2.1. The quantity (4.54) goes to zero as ε → 0 by the dominated convergence theorem. As in (4.44) and (4.50), we obtain (4.57) Recycling the pinning argument, we choose t 0 ∈ [0, T ] such that X t 0 ∈ R n \ S ϕ to see that and treat the difference |ϕ ε (X t ) − ϕ ε (X t 0 )| using the maximal inequalities and potential bounds as in the proof of Proposition 4.29, which gives an integrable upper bound. By (4.55) and dominated convergence we have and combining with the above, we see that also (4.57) goes to zero as ε → 0.
We prove Theorem 2.13.

Existence and uniqueness proofs
In this section we prove Theorems 3.8, 3.24, 3.25 and 3.28 and Corollary 3.26.
We make repeated use of the elementary facts that a function $f = (f_1, \dots, f_m)$ is in the space $W^{\beta,p}(0, T, \mathbb{R}^m)$ if and only if all its components $f_i$ are in $W^{\beta,p}(0, T)$, and that, by the norm equivalence in $\mathbb{R}^m$, the norm $\|f\|_{W^{\beta,p}(0,T,\mathbb{R}^m)}$ is comparable to $\sum_{i=1}^m \|f_i\|_{W^{\beta,p}(0,T)}$ (similarly for other function spaces). We apply the fact that estimates like (2.7) or (2.9) remain valid for vector valued functions of compatible dimensions at the expense of having a different multiplicative constant in front. We also use the symbol $L^p(X, \mathbb{R}^m)$ for a vector valued version of $L^p(X)$.

Invertibility and BV -regularity
In this subsection we verify Lemma 3.14. Recall that by the Cayley-Hamilton theorem the inverse $A^{-1}$ of an invertible $(n \times n)$-matrix $A$ satisfies where $I_n$ is the $(n \times n)$-identity matrix and, for $k = 0, \dots, n - 1$, one has with $B_k$ denoting the $k$-th complete exponential Bell polynomial and $s_l = \operatorname{Tr}(A^l)$ being the trace of $A^l$. We prove Lemma 3.14.
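Proof of Lemma 3.14. Suppose that $\sigma$ satisfies Assumption 3.12. Recall that we always consider fixed Lebesgue representatives of the components $\sigma_{ij}$. Let $N \subset \mathbb{R}^n$ be an $\mathcal{L}^n$-null set such that
$$\det(\sigma(x)) > \varepsilon \quad \text{and} \quad |\sigma_{ij}(x)| \le \|\sigma\|_{L^\infty(\mathbb{R}^n, \mathbb{R}^{n \times n})} \qquad (5.2)$$
for all $x \in \mathbb{R}^n \setminus N$ and any $i$ and $j$. For such $x$ the matrix $\sigma(x)$ is invertible, and we set $\hat{\sigma}(x) := (\sigma(x))^{-1}$. For $x \in N$ we can set $\hat{\sigma}(x) := 0$. For $x \in \mathbb{R}^n \setminus N$ the matrices $\hat{\sigma}(x)$ and $\sigma(x)$ satisfy (5.1) in place of $A^{-1}$ and $A$. In particular, since for all square matrices $A$ the elements of the matrix products $A^k$, the traces $\operatorname{tr}(A^k)$, and the determinant $\det(A)$ are polynomials of the elements of $A$, we observe that the coefficients $\hat{\sigma}_{ij}(x)$ of $\hat{\sigma}(x)$ are rational functions of the coefficients $\sigma_{ij}(x)$ of the form
$$\hat{\sigma}_{ij} = \frac{P_{ij}(\sigma)}{\det(\sigma)}, \qquad (5.3)$$
where for each $i$ and $j$ the function $P_{ij}(\sigma)$ is a polynomial of degree $n - 1$ in the coefficients $\sigma_{kl}$, $k, l = 1, \dots, n$, of $\sigma$. By the boundedness of the $\sigma_{kl}$ we have $P_{ij}(\sigma) \in L^\infty(\mathbb{R}^n)$ and by (5.2) and (5.3) also $\hat{\sigma}_{ij} \in L^\infty(\mathbb{R}^n)$. For fixed $i$ and $j$ let $p := P_{ij} - P_{ij}(0)$ and let As in Lemma 4.35 the chain rule, [5, Theorem 3.96], now shows that $\hat{\sigma}_{ij} \in BV_{\mathrm{loc}}(\mathbb{R}^n)$. To see the last statement of the lemma, note that for $x \in \mathbb{R}^n \setminus N$ we have $\det \sigma(x) \le n!\,\|\sigma\|^n_{L^\infty(\mathbb{R}^n, \mathbb{R}^{n \times n})}$, so that
$$\det(\hat{\sigma}(x)) = \frac{1}{\det(\sigma(x))} \ge \frac{1}{n!\,\|\sigma\|^n_{L^\infty(\mathbb{R}^n, \mathbb{R}^{n \times n})}}.$$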
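For orientation, the Cayley-Hamilton representation of the inverse can be written out in the case $n = 2$. The sketch below uses only the characteristic equation of a $2 \times 2$ matrix; writing $\hat{\sigma}$ for the a.e. inverse of $\sigma$ is a notational assumption made for this illustration.

```latex
% Sketch (n = 2): Cayley-Hamilton gives A^2 - tr(A) A + det(A) I_2 = 0 for any 2x2 matrix A.
% If det(A) != 0, multiplying by A^{-1} and rearranging yields the inverse.
\[
  A^{-1} = \frac{1}{\det(A)} \bigl( \operatorname{tr}(A)\, I_2 - A \bigr),
  \qquad\text{so that}\qquad
  \hat{\sigma}
  = \frac{1}{\det(\sigma)}
    \begin{pmatrix} \sigma_{22} & -\sigma_{12} \\ -\sigma_{21} & \sigma_{11} \end{pmatrix}
  \quad \mathcal{L}^2\text{-a.e.}
\]
% Each entry is a polynomial of degree n - 1 = 1 in the entries of \sigma divided by
% det(\sigma), and det(\hat{\sigma}) = 1 / det(\sigma).
```

With $\det(\sigma) > \varepsilon$ and bounded entries, boundedness of $\hat{\sigma}$ and the lower bound on $\det(\hat{\sigma})$ can be read off directly from this formula.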

Solutions to the deterministic equation
In this subsection we provide a proof for Proposition 3.18. Assumptions 3.12 and 3.15 allow us to conclude that $\hat\sigma$ has a Lipschitz potential.
Proposition 5.1. Suppose $\sigma$ satisfies Assumptions 3.12 and 3.15. Then there exists a Lipschitz function $g : \mathbb{R}^n \to \mathbb{R}^n$ such that its Jacobian matrix $\nabla g$, defined a priori in distributional sense, satisfies
\[ \nabla g = \hat\sigma \quad \mathcal{L}^n\text{-a.e.} \tag{5.5} \]
In particular, $g \in W^{1,\infty}(O)$ for any bounded domain $O \subset \mathbb{R}^n$.
We record a short helpful argument to conclude local continuity and boundedness. By $\mathcal{S}(\mathbb{R}^n,\mathbb{R}^n)$ and $\mathcal{S}'(\mathbb{R}^n,\mathbb{R}^n)$ we denote the spaces of $\mathbb{R}^n$-valued Schwartz functions and tempered distributions, respectively.
where $\Delta$ denotes the Laplacian on $\mathbb{R}^n$ and $(-\Delta)^{-1}$ the Newton potential. Then $\psi \in \mathcal{S}(\mathbb{R}^n,\mathbb{R}^n)$ and by the $L^{p'}$-boundedness of the Riesz transform, [87, Chapter II, Section 4, Theorem 3 and Chapter III, Section 1], and the fractional Sobolev inequality, [87, Chapter V, Section 1, Theorem 1]. Consequently $G \in L^q(\mathbb{R}^n)$ by duality.

Proof. In the case that $n = 1$ we can simply integrate $\hat\sigma$ to find a locally bounded Lipschitz function $g$ satisfying (5.5). Suppose $n \ge 2$. By Assumption 3.15 and a distributional version of Poincaré's Lemma, there exists $g = (g_1, \dots, g_n) \in \mathcal{S}'(\mathbb{R}^n,\mathbb{R}^n)$ such that $\nabla g = \hat\sigma$ holds in distributional sense. That is, the $j$th row $\nabla g_j = (D_1 g_j, \dots, D_n g_j)$ of $\nabla g$ equals the $j$th row $(\hat\sigma_{j1}, \dots, \hat\sigma_{jn})$ of $\hat\sigma$ in $\mathcal{S}'(\mathbb{R}^n,\mathbb{R}^n)$. See [56, Chapter 4, Section 3, Proposition 9] or [85, Chapter II, Section 6, Théorème VI] (the result is stated for the dual of smooth compactly supported functions, but the proof does not change for Schwartz functions and tempered distributions). For any $j$ and $k$ we have $D_k g_j = \hat\sigma_{jk} \in L^1(\mathbb{R}^n) \cap L^\infty(\mathbb{R}^n)$ by Assumption 3.12 and Lemma 3.14. Now suppose that $O \subset \mathbb{R}^n$ is a bounded domain. By Lemma 5.2 and the boundedness of $O$ we can conclude that $g \in W^{1,q}(O,\mathbb{R}^n)$ for arbitrarily large $q < +\infty$, and Remark 5.3 implies that $g$ is continuous on $O$, and hence also bounded on $O$. As a consequence, we have $g \in W^{1,\infty}(O,\mathbb{R}^n)$, and a standard mollifier argument yields $|g(x) - g(y)| \le \|\nabla g\|_{L^\infty(\mathbb{R}^n,\mathbb{R}^{n\times n})}\,|x - y|$ for all $x, y \in O$, see [52, Theorem 4.1].
Our proof of Proposition 3.18 is based on an inverse function theorem for Sobolev functions proved in [62,Theorem 1]. We quote a special case of this result sufficient for our purposes. See also [63].
Remark 5.5. Condition (5.7) is usually rephrased by saying that $g$ is $\kappa$-quasiregular or of bounded distortion, see for instance [54, Section 1.2] or [82]. A result similar to Proposition 5.4 is [24, Theorem 1]. There condition (5.6) is replaced by an $\mathcal{L}^n$-a.e. nonnegativity condition on the sums of principal minors of $\sigma$.
Proof of Proposition 3.18. By Assumption 3.12, Lemma 3.14, Assumption 3.15, and Proposition 5.1 there exists a Lipschitz function $g \in W^{1,n}_{loc}(\mathbb{R}^n,\mathbb{R}^n)$ such that (5.5) holds. In the case $n = 1$, Assumption 3.12 and (5.5) imply that $\|\sigma\|^{-1}_{L^\infty(\mathbb{R})} \le \hat\sigma(x) = g'(x)$ for $\mathcal{L}^1$-a.e. $x \in \mathbb{R}$, and hence $g$ is strictly monotone and bi-Lipschitz with inverse $f = g^{-1}$ having Lipschitz constant bounded by $\|\sigma\|_{L^\infty(\mathbb{R})}$.

Assume $n \ge 2$. Again by Assumption 3.12 and Lemma 3.14 we have
\[ \det(\hat\sigma(x)) \ge \frac{1}{n!\,\|\sigma\|^n_{L^\infty(\mathbb{R}^n,\mathbb{R}^{n\times n})}} \]
for $\mathcal{L}^n$-a.e. $x \in \mathbb{R}^n$, and by (5.5) also $|\nabla g(x)|^n \le \|\hat\sigma\|^n_{L^\infty(\mathbb{R}^n,\mathbb{R}^{n\times n})}$. Hence (5.7) holds with $\kappa = n!\,\|\sigma\|^n_{L^\infty(\mathbb{R}^n,\mathbb{R}^{n\times n})}\,\|\hat\sigma\|^n_{L^\infty(\mathbb{R}^n,\mathbb{R}^{n\times n})}$. Condition (5.6) is immediate from (3.10). Consequently $g$ is a homeomorphism by Proposition 5.4, and we denote its inverse by $f = g^{-1}$. For any bounded domain $O$ the restriction $g|_O$ is a homeomorphism with inverse $f|_{g(O)}$, and by Proposition 5.1 also $g|_O \in W^{1,n-1}(O,\mathbb{R}^n)$. This implies that $f \in W^{1,1}_{loc}(g(O),\mathbb{R}^n)$, see [54, Theorem 5.2]. Consequently the chain rule can be applied to $g \circ f = \mathrm{id}$, and taking into account (4.1) we obtain
\[ I = \nabla(g \circ f)(y) = (\nabla g)(f(y))\,\nabla f(y) = \sigma^{-1}(f(y))\,\nabla f(y) \quad \text{for } \mathcal{L}^n\text{-a.a. } y \in \mathbb{R}^n. \]
The invertibility of $\sigma$ yields $\nabla f = \sigma(f)$ $\mathcal{L}^n$-a.e., and since $\sigma \in L^\infty(\mathbb{R}^n,\mathbb{R}^{n\times n})$, it follows that $f$ is Lipschitz with Lipschitz constant bounded by $\|\sigma\|_{L^\infty(\mathbb{R}^n,\mathbb{R}^{n\times n})}$.
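The choice of $\kappa$ can be traced through a short computation. The sketch below assumes that (5.7) is a distortion inequality of the form $|\nabla g|^n \le \kappa \det \nabla g$ a.e.; this form is inferred from Remark 5.5 and is not quoted from the source. Here $\hat\sigma$ denotes the a.e. inverse of $\sigma$, so that $\nabla g = \hat\sigma$ by (5.5), the symbol $\|\cdot\|_\infty$ abbreviates the relevant $L^\infty$-norm, and the matrix norm $|\cdot|$ is assumed to be dominated by it.

```latex
% Assumed form of (5.7): |\nabla g(x)|^n \le \kappa \det \nabla g(x) for a.e. x.
\[
|\nabla g(x)|^n
= |\hat\sigma(x)|^n
\le \|\hat\sigma\|_\infty^{\,n}
= \underbrace{n!\,\|\sigma\|_\infty^{\,n}\,\|\hat\sigma\|_\infty^{\,n}}_{=\,\kappa}
\cdot \frac{1}{n!\,\|\sigma\|_\infty^{\,n}}
\le \kappa \,\det \hat\sigma(x)
= \kappa \,\det \nabla g(x).
\]
% One-dimensional case: g' \ge \|\sigma\|_\infty^{-1} a.e. gives, for x < y,
\[
g(y) - g(x) = \int_x^y g'(u)\,du \ge \frac{y - x}{\|\sigma\|_\infty},
\qquad\text{hence}\qquad
|f(a) - f(b)| \le \|\sigma\|_\infty\,|a - b|
\]
% for all a, b in the range of g, where f = g^{-1}.
```

Both computations use only the lower bound on $\det\hat\sigma$ and the boundedness of $\sigma$ and $\hat\sigma$.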

Existence of solutions in the invertible case
We observe the following stability property of conditions of upper regularity type.
and consequently the claimed estimate holds with $M = \mathrm{Lip}(g)^d$.
In the next lemma we justify the application of the change of variable (2.10).
Let $g$ and $f$ be as in Lemma 5.6. Let $s \in (0, 1)$, (5.8) and similarly for B c . Then the path X :

Proof. If $Y$ satisfies (5.8), then by Lemma 5.6 we have and a similar conclusion is true for B c . Therefore the hypotheses on $\sigma$, together with Corollary 4.16 and Remark 4.2, imply that $X$ is $(s, 1)$-variable w.r.t. $\sigma$. If $f$ is bi-Lipschitz on $\mathbb{R}^n$ it is proper, and applying [5, Theorem 3.16] component-wise, we may conclude that for each $j$ and $k$ the function $\partial_k f_j = \sigma_{jk}(f)$ is in $BV_{loc}(\mathbb{R}^n)$, see also [53]. By the same theorem and the bijectivity of $f$ we also have with the notation $g_\#$ from [5, Theorem 3.16], which is the pushforward operation on measures, however, defined differently when operating on functions. Using |f (Y − g(z)|, the fact that $g$ is proper, and the fact that That is, Y g(x) is $(s, 1)$-variable w.r.t. $\partial_k f_j$. Consequently, by Theorem 2.13, The stated Hölder regularity is clear since $f$ is Lipschitz. Finally, if $n = 1$ and $g$ is as in Theorem 3.8, we can use the fact that $f'(Y_t) = \sigma(X_t)$ for $\mathcal{L}^1$-a.e. $t \in [0, T]$ and arrive at the same conclusion.
Proof. By Lemma 5.6 and the hypotheses we have Consequently, Corollary 4.25 implies that for $\mathbb{P}$-a.e. $\omega \in \Omega$ the path $X(\omega)$ is $(s, 1)$-variable w.r.t. $\sigma$. One can now follow the arguments in the proof of Lemma 5.7.
Proofs of Theorems 3.8 and 3.24. Statement (i) in Theorem 3.8 is clear; the first part of statement (ii) follows as in [89, Theorem 2.1], and the second part of (ii) is easily seen using Lemma 5.6. By Proposition 3.18 the hypotheses on $\sigma$ in Theorem 3.24 also guarantee the existence of $f$ and $g$ as required by Lemma 5.6. Consequently Lemma 5.7 applies and yields the desired statement.

Uniqueness of solutions in the invertible case
In this subsection we prove Theorem 3.28. We follow the basic idea of [  Moreover, the integrals can be understood also as Riemann-Stieltjes integrals.
To prove Proposition 5.11 we can follow the strategy of [98,Theorem 4.4.2] and proceed by Riemann sum approximation. We make use of the following elementary observation. We prove Proposition 5.11.
Proof. To show that the integral is well-defined we first claim that for any β < αs we have σ δ (X · )σ(X · ) W β,1 0 (0,T,R n×n ) < ∞. (5.16) To see this, note thatσ k is bounded on a locally compact neighborhood O containing X([0, T ]), and since X is a variability solution, Proposition 4.33 implies that Using (4.8) we obtain Proposition 4.29 implies the boundedness of the first integral on the right hand side, and the second integral is bounded by Hölder continuity of X and differentiability ofσ δ . Thus we have (5.16), and as a consequence, the integral in (5.14) is well-defined. To verify equation (5.14) we first note that, since g δ andσ δ are smooth and X is Hölder continuous of order α > 1 2 , Proposition 5.10 yields The integrals can be approximated by Riemann-Stieltjes sums where (π m ) m is a refining sequence of finite partitions of [0, t] with subinterval endpoints t (m) j and mesh |π m | = max j |t we have lim Π m X (u)σ(X u )dY u .
Proof. Recall (5.4), i.e., that each component $\hat\sigma_{ij}$ of $\hat\sigma$ can be written in the form $\hat\sigma_{ij} = \Phi \circ \sigma$ with a suitable function $\Phi \in C^1(\mathbb{R}^{n\times n})$ (depending on $i$ and $j$) with $\Phi(0) = 0$ and bounded gradient. Consequently $X \in V(\hat\sigma_{ij}, s, p)$ for each $i$ and $j$ by Lemma 4.35, and this means that $X \in V(\hat\sigma, s, p)$.
We prove Theorem 3.28.
Since g is invertible with inverse f satisfying (3.7) (cf. Proof of Theorem 3.24), this implies that X t = X t = f (Y t ), t ∈ [0, T ].

and it can also be obtained by a quick subordination argument. Statement (ii) is a direct consequence of (i) and Fubini's theorem.
(ii) If ν 1 and ν 2 are finite nonnegative Borel measures with compact support, then

B Mollification results
We collect some useful known approximation results used in the main text. We begin with an approximation lemma for Riesz potentials that is a slight variant of [67, Section I.3, Theorem 1.11 and its proof]. As usual we say that $(\eta_\varepsilon)_{\varepsilon>0}$ is a (radially symmetric) mollifier if $\eta \in C_c^\infty(\mathbb{R}^n)$ is a nonnegative radial function, compactly supported inside the unit ball and such that $\int_{\mathbb{R}^n} \eta(x)\,dx = 1$, and $\eta_\varepsilon(x) := \varepsilon^{-n}\eta(\varepsilon^{-1}x)$, $x \in \mathbb{R}^n$, for any $\varepsilon > 0$, cf. [5, p. 41]. We say that a mollifier $(\eta_\varepsilon)_{\varepsilon>0}$ is a flat mollifier if for all $x \in \mathbb{R}^n$ with $|x| \le \frac{1}{2}$ we have $\eta(x) = c_\eta$ with a suitable constant $c_\eta > 0$. Having $\eta$ constant in a small ball around the origin makes it easy to see the following.
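Flat mollifiers in this sense exist; one possible construction is sketched below. The auxiliary functions $\theta$ and $h$ are hypothetical names introduced here for illustration and do not appear in the source.

```latex
% Auxiliary smooth cut-off built from the standard function e^{-1/t}:
\[
\theta(t) := \begin{cases} e^{-1/t}, & t > 0,\\ 0, & t \le 0, \end{cases}
\qquad
h(r) := \frac{\theta(1 - r)}{\theta(1 - r) + \theta\bigl(r - \tfrac12\bigr)}, \quad r \ge 0 .
\]
% The denominator never vanishes, h = 1 on [0, 1/2], 0 < h < 1 on (1/2, 1), h = 0 on [1, \infty).
\[
\eta(x) := c_\eta\, h(|x|), \qquad
c_\eta := \Bigl(\int_{\mathbb{R}^n} h(|x|)\,dx\Bigr)^{-1},
\]
% and the rescaled functions keep total mass one by the substitution y = x / \varepsilon:
\[
\int_{\mathbb{R}^n} \eta_\varepsilon(x)\,dx
= \int_{\mathbb{R}^n} \varepsilon^{-n}\eta(\varepsilon^{-1}x)\,dx
= \int_{\mathbb{R}^n} \eta(y)\,dy = 1 .
\]
```

Since $h$ is constant near $r = 0$, the function $\eta$ is smooth at the origin even though $x \mapsto |x|$ is not. The support of this $\eta$ is the closed unit ball; if a support strictly inside the open unit ball is wanted, one can replace $1 - r$ by $\rho - r$ with some $\rho \in (\frac12, 1)$.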
The proof is as in [67, Section I.3, Theorem 1.11], but since we use a slightly different mollifier, we repeat the short arguments for (i) for convenience.
The function Φ defines a continuous function of |x − y| which is zero if |x − y| = 0 and tends to 1 for |x − y| → +∞, and denoting the maximum of this function by c(n, γ), we obtain (i). For (ii) one can follow the proof in [67, p. 73].
We obtain the following consequence for potentials.
Proof. Given $x \in \mathbb{R}^n$ write $\psi_x(y) := |x - y|^{-n+\gamma}$. Then by Lemma B.2, and an application of Lemma B.1 yields the desired bound.
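In order to prove the claim, we quote [1, Lemma 4.1 and its proof]. For any $0 < s < +\infty$, any $R > 0$, and any locally integrable function $f : \mathbb{R}^n \to \mathbb{R}$, the fractional sharp maximal function $f^\#_{s,R}$ of $f$ is defined by
\[ f^\#_{s,R}(x) := \sup_{0 < r < R} r^{-s}\, \frac{1}{\mathcal{L}^n(B(x,r))} \int_{B(x,r)} |f - f_{B(x,r)}|\,dy, \quad x \in \mathbb{R}^n. \]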
Lemma C.2. Let $f : \mathbb{R}^n \to \mathbb{R}$ be locally integrable and $0 < s < +\infty$. Then there is a constant $c(n, s) > 0$ such that for all Lebesgue points $x, y \in \mathbb{R}^n$ of $f$ we have
\[ |f(x) - f(y)| \le c(n, s)\,|x - y|^s \bigl( f^\#_{s,4|x-y|}(x) + f^\#_{s,4|x-y|}(y) \bigr). \]

Proof. For any $x \in \mathbb{R}^n$ and $r > 0$ we have the 1-1-Poincaré inequality

It is trivial to see that for any Borel measure $\nu$ on $\mathbb{R}^n$, any $\gamma \in (0, 1)$, and any $R \in (0, +\infty]$ we have
\[ M_{\gamma,R}\,\nu(x) \le c\, U^\gamma \nu(x), \quad x \in \mathbb{R}^n, \tag{C.2} \]
with a constant $c > 0$ depending only on $n$ and $\gamma$. Together with Proposition C.1 we obtain the following immediate consequence.
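The comparison (C.2) can be verified in one line under the usual definitions, which are not restated in this section and are therefore assumptions here: $M_{\gamma,R}\,\nu(x) = \sup_{0<r<R} r^{\gamma-n}\,\nu(B(x,r))$ for the fractional maximal function and $U^\gamma\nu(x) = \int_{\mathbb{R}^n} |x-y|^{\gamma-n}\,\nu(dy)$ for the Riesz potential of a nonnegative measure $\nu$, with open balls $B(x,r)$. The paper's normalizations may differ by constants depending on $n$ and $\gamma$.

```latex
\[
r^{\gamma - n}\,\nu(B(x,r))
= \int_{B(x,r)} r^{\gamma-n}\,\nu(dy)
\le \int_{B(x,r)} |x-y|^{\gamma-n}\,\nu(dy)
\le U^{\gamma}\nu(x),
\qquad 0 < r < R,
\]
% because |x - y| < r and \gamma - n < 0 on B(x,r).
% Taking the supremum over 0 < r < R gives M_{\gamma,R}\nu(x) \le U^{\gamma}\nu(x).
```

With these normalizations the constant in (C.2) can be taken equal to one.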
Corollary C.3. Let $\varphi \in BV_{loc}(\mathbb{R}^n)$ and $s \in (0, 1)$. If $\sup_{x \in \mathbb{R}^n} U^{1-s}\|D\varphi\|(x) < +\infty$, then $\varphi$ has a Borel version that is Hölder continuous of order $s$, and any Lebesgue representative of $\varphi$ coincides with this version on the Lebesgue set of $\varphi$.
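The mechanism behind the corollary can be summarized in a short chain of estimates. This is a sketch under stated assumptions, not the authors' proof: it uses the Poincaré inequality for $BV$ functions on balls in the form $\int_{B(x,r)} |\varphi - \varphi_{B(x,r)}|\,dy \le c_P\, r\, \|D\varphi\|(B(x,r))$, the definitions of $M_{\gamma,R}$ and $U^\gamma$ recalled above, and the hypothetical abbreviations $K := \sup_x U^{1-s}\|D\varphi\|(x)$ and $\omega_n := \mathcal{L}^n(B(0,1))$.

```latex
% Step 1: bound the fractional sharp maximal function by the Riesz potential.
\[
r^{-s}\,\frac{1}{\omega_n r^n}\int_{B(x,r)} |\varphi - \varphi_{B(x,r)}|\,dy
\;\le\; \frac{c_P}{\omega_n}\, r^{(1-s)-n}\,\|D\varphi\|(B(x,r))
\;\le\; \frac{c_P}{\omega_n}\, M_{1-s,R}\|D\varphi\|(x)
\;\le\; \frac{c_P\,c}{\omega_n}\, U^{1-s}\|D\varphi\|(x)
\;\le\; \frac{c_P\,c}{\omega_n}\, K .
\]
% Step 2: insert this into Lemma C.2 at Lebesgue points x, y of \varphi.
\[
|\varphi(x) - \varphi(y)|
\;\le\; c(n,s)\,|x-y|^s\Bigl(\varphi^\#_{s,4|x-y|}(x) + \varphi^\#_{s,4|x-y|}(y)\Bigr)
\;\le\; \frac{2\,c(n,s)\,c_P\,c}{\omega_n}\, K\,|x-y|^s .
\]
```

The estimate holds on the Lebesgue set of $\varphi$, which has full measure and is therefore dense, so $\varphi$ restricted to this set extends uniquely to a Hölder continuous function of order $s$ on $\mathbb{R}^n$.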

D Elements of fractional calculus
We briefly recall the definitions of fractional integrals and derivatives, [84], and the generalized Lebesgue-Stieltjes integral introduced by Zähle in [98] and used in [81,99]. For fixed T < ∞, the fractional left and right Riemann-Liouville integrals of order θ > 0 of a function f ∈ L 1 (0, T ) are denoted by