How simplifying and flexible is the simplifying assumption in pair-copula constructions ... analytic answers in dimension three and a glimpse beyond

Motivated by the increasing popularity and the seemingly broad applicability of pair-copula constructions underlined by numerous publications in the last decade, in this contribution we tackle the unavoidable question of how flexible and simplifying the commonly used ‘simplifying assumption’ is from an analytic perspective and provide answers to two related open questions posed by Nagler and Czado in 2016. Aiming at a simplest possible setup for deriving the main results we first focus on the three-dimensional setting. We prove that the family of simplified copulas is flexible in the sense that it is dense in the set of all three-dimensional copulas with respect to the uniform metric d∞ – considering stronger notions of convergence like the one induced by the metric D1, by weak conditional convergence, by total variation, or by Kullback-Leibler divergence, however, the family even turns out to be nowhere dense and hence insufficient for any kind of flexible approximation. Furthermore, returning to d∞ we show that the partial vine copula is never the optimal simplified copula approximation of a given, non-simplified copula C, and derive examples illustrating that the corresponding approximation error can be strikingly large and exceed 28% of the diameter of the metric space. Moreover, the mapping ψ assigning each three-dimensional copula its unique partial vine copula turns out to be discontinuous with respect to d∞ (but continuous with respect to D1 and to weak conditional convergence), implying a surprising sensitivity of partial vine copula approximations. The afore-mentioned main results concerning d∞ are then extended to the general multivariate setting.


Introduction
Pair-copula constructions (most well-known in the context of vine copulas) are a very popular bottom-up approach for constructing high-dimensional copulas out of several bivariate ones; they have a handy graphical representation and can be considered as an ordered sequence of trees. Aiming at a significant reduction of complexity it is usually assumed that the so-called simplifying assumption, saying that the copulas of the conditional distribution functions do not depend on the conditioning variables, holds.
Considering the enormous number of scientific contributions working with and applying simplified pair-copulas (see, e.g., [5,6,7,32,36,38,37]) it is quite surprising that, apart from a few critical voices (see, e.g., [2,8,15]), no analytic and systematic study on the approximation quality and flexibility of these concepts seems to have been published so far.
An extensive literature search suggests that the publication coming closest to such a study was written by Spanhel and Kurz [34], who focus mainly on partial vine copulas (special simplified pair-copulas whose conditional distribution functions follow a certain intuitive construction principle) and show that partial vine copulas are optimal w.r.t. Kullback-Leibler divergence if the minimization is performed sequentially, but not necessarily if the estimation is done jointly. As stated in [34], this "implies that it may not be optimal to specify the true copulas in the first tree" of a simplified pair-copula approximation.
Motivated by the broad applicability of pair-copula constructions, in this contribution we study flexibility and the extent of simplification imposed by the simplifying assumption from an analytic perspective. Aiming at a simplest possible setup for deriving the main results we first focus on the three-dimensional setting. For the sake of generality of our findings, however, in Section 7 we extend some of our main results to the general multivariate setting and discuss the notion of so-called universally simplified copulas. Various examples and graphics illustrate both the obtained results and the ideas underlying the proofs.

Notation and preliminaries
Throughout this paper we will write I := [0, 1] and let d ≥ 2 be an integer, which will be kept fixed. Bold symbols will be used to denote vectors, e.g., x = (x 1 , . . . , x d ) ∈ R d . The d-dimensional Lebesgue measure will be denoted by λ d , in case of d = 1 we will also simply write λ. We will let C d denote the family of all d-dimensional copulas, M will denote the comonotonicity copula, Π the independence copula and, for d = 2, W will denote the countermonotonicity copula (we omit the index indicating the dimension since no confusion will arise). For every C ∈ C d the corresponding d-stochastic measure will be denoted by μ C , i.e. μ C ([0, u]) = C(u) for all u ∈ I d , and P C will denote the family of all d-stochastic measures. For more background on copulas and d-stochastic measures we refer to [10,29]. For every metric space (S, δ) the Borel σ-field on S will be denoted by B (S).
In what follows Markov kernels will play a prominent role: A Markov kernel from R to B(R d−1 ) is a mapping K : R × B(R d−1 ) → I such that for every fixed E ∈ B(R d−1 ) the mapping y → K(y, E) is (Borel-)measurable and for every fixed y ∈ R the mapping E → K(y, E) is a probability measure. Given a real-valued random variable Y and a (d − 1)-dimensional random vector X on a probability space (Ω, A, P) we say that a Markov kernel K is a regular conditional distribution of X given Y if

K(Y, E) = P(X ∈ E | Y)

holds P-almost surely for every E ∈ B(R d−1 ). It is well-known that for each random vector (X, Y ) a regular conditional distribution K of X given Y always exists and is unique for P Y -a.e. y ∈ R. If (X, Y ) has distribution function H (in which case we will also write (X, Y ) ∼ H and let μ H denote the corresponding probability measure on B(R d )) we will let K H denote (a version of) the regular conditional distribution of X given Y and simply refer to it as Markov kernel of H. If C ∈ C d is a copula then we will consider the Markov kernel of C automatically as mapping K C : I × B(I d−1 ) → I. Defining the v-section of a set G ∈ B(I d ) as G v := {u ∈ R d−1 : (u, v) ∈ G}, the so-called disintegration theorem yields

μ C (G) = ∫ I K C (v, G v ) dλ(v) for every G ∈ B(I d ).

For more background on conditional expectation and general disintegration we refer to [19,23]. We call a copula C ∈ C d completely dependent (w.r.t. the last coordinate) if there exist λ-preserving transformations h 1 , . . . , h d−1 : I → I (i.e., transformations fulfilling λ(h −1 i (F )) = λ(F ) for every F ∈ B(I)) such that

K C (v, E) := 1 E (h 1 (v), . . . , h d−1 (v))

is a Markov kernel of C. Since the collection of all completely dependent copulas contains all shuffles of Min, it is dense in (C d , d ∞ ) (also see [26]). For more properties of complete dependence we refer to [25] as well as to [11] and the references therein.
Markov kernels can be used to define metrics stronger than the standard uniform metric d ∞ , defined by

d ∞ (C 1 , C 2 ) := sup u∈I d |C 1 (u) − C 2 (u)|.

It is well known that the metric space (C d , d ∞ ) is compact and that pointwise and uniform convergence of a sequence of copulas (C n ) n∈N are equivalent (see [10]). Following [11] and defining

D 1 (C 1 , C 2 ) := ∫ I ∫ I d−1 |K C1 (v, [0, u]) − K C2 (v, [0, u])| dλ d−1 (u) dλ(v),
D 2 (C 1 , C 2 ) 2 := ∫ I ∫ I d−1 (K C1 (v, [0, u]) − K C2 (v, [0, u])) 2 dλ d−1 (u) dλ(v),
D ∞ (C 1 , C 2 ) := sup u∈I d−1 ∫ I |K C1 (v, [0, u]) − K C2 (v, [0, u])| dλ(v),

it can be shown that D 1 , D 2 and D ∞ are metrics generating the same topology on C d and that the family of completely dependent copulas is closed with respect to these three metrics. In the sequel we will mainly work with D 1 and refer to [11] for more information on D 2 and D ∞ . The metric space (C d , D 1 ) is complete and separable but not compact. Viewing copulas in terms of their conditional distributions and considering weak convergence gives rise to what we refer to as weak conditional convergence in the sequel: Consider a sequence of copulas (C n ) n∈N and a copula C and let (K Cn ) n∈N and K C be (versions of) the corresponding Markov kernels. We will say that (C n ) n∈N converges weakly conditional (w.r.t. the last coordinate) to C if and only if for λ-almost every v ∈ I we have that the sequence (K Cn (v, ·)) n∈N of probability measures on B(I d−1 ) converges weakly to the probability measure K C (v, ·). In the latter case we will write C n wcc − − → C (where 'wcc' stands for 'weak conditional convergence').
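As a small numerical sketch of the two notions of distance just introduced (restricted to the bivariate case d = 2 for simplicity, and using grid approximations), one can compare the independence copula Π with the comonotonicity copula M; here the Markov kernels are known in closed form, namely K_Π(v, [0, u]) = u and K_M(v, [0, u]) = 1_{v ≤ u}. The exact values d∞(M, Π) = 1/4 and D1(M, Π) = 1/3 follow by elementary calculus.

```python
import numpy as np

# Grid-based sketch (assumption: d = 2, closed-form Markov kernels).
# d_inf is the uniform distance between copulas; D1 integrates the
# pointwise distance between the Markov kernels K(v, [0,u]).

def d_inf(C1, C2, m=400):
    g = np.linspace(0, 1, m)
    U, V = np.meshgrid(g, g)
    return np.max(np.abs(C1(U, V) - C2(U, V)))

def D1(K1, K2, m=400):
    g = (np.arange(m) + 0.5) / m           # midpoint rule on I x I
    U, V = np.meshgrid(g, g)
    return np.mean(np.abs(K1(V, U) - K2(V, U)))

Pi = lambda u, v: u * v                     # independence copula
M = lambda u, v: np.minimum(u, v)           # comonotonicity copula
K_Pi = lambda v, u: u                       # K_Pi(v, [0,u]) = u
K_M = lambda v, u: (v <= u).astype(float)   # mass of M sits on the diagonal

print(d_inf(M, Pi))   # close to 1/4
print(D1(K_M, K_Pi))  # close to 1/3
```

The example also illustrates that D1 separates copulas more finely than d∞: the kernels of M and Π differ on a set of full measure, while the copulas themselves differ by at most 1/4.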
According to Lemma 5 in [11] weak conditional convergence of (C n ) n∈N to C implies convergence w.r.t. D 1 but not vice versa (see Example 2.1 below), and convergence w.r.t. D 1 implies convergence in d ∞ but not vice versa.
Example 2.1. For every m ∈ N and k ∈ {1, . . . , 2 m } with corresponding dyadic interval [(k − 1)2 −m , k2 −m ], set n = 2 m + k − 2 and consider the sequence of generalized EFGM copulas (C n ) n∈N . Then, for every n ∈ N, the corresponding identity holds for all u ∈ I d−1 and almost all v ∈ I. Thus, the sequence (K Cn (v, ·)) n∈N fails to converge weakly to K Π (v, ·) for λ-almost all v ∈ I, and it follows that (C n ) n∈N does not converge weakly conditional to Π. On the other hand, a direct computation shows that lim n→∞ D 1 (C n , Π) = 0. For a counterexample in the case d = 2 we refer to [20].
For any subset J = {j 1 , ..., j |J| } ⊆ {1, . . . , d} with 2 ≤ |J| ≤ d such that j k < j l for all k, l ∈ {1, ..., |J|} with k < l we let C J denote the marginal copula of C with respect to the coordinates in J. If J only contains two indices i, j then we will sometimes also write C ij instead of C {i,j} (no confusion will arise). Weak conditional convergence of a sequence of copulas transfers to marginal copulas:

Theorem 2.2. Suppose that (C n ) n∈N converges weakly conditional to C ∈ C d . Then ((C n ) J∪{d} ) n∈N converges weakly conditional to C J∪{d} for every non-empty J ⊆ {1, . . . , d − 1}.

Proof. Disintegration implies that for every copula C ∈ C d there exists some Markov kernel K C such that C can be expressed as

C(u, v) = ∫ [0,v] K C (t, [0, u]) dλ(t)

for all (u, v) ∈ I d−1 × I and some Markov kernel K C J∪{d} such that we have

C J∪{d} (s, v) = ∫ [0,v] K C J∪{d} (t, [0, s]) dλ(t)

for all (s, v) ∈ I |J| × I. Thus

K C J∪{d} (t, [0, s]) = K C (t, π J −1 ([0, s]))

holds for all s ∈ I |J| and λ-almost all t ∈ I, where π J denotes the projection onto the coordinates in J. Suppose now that C, C 1 , C 2 , . . . are as in the theorem. Since projections are continuous, the Continuous Mapping Theorem and the previous identity imply that for λ-almost every v ∈ I weak convergence of the sequence (K Cn (v, ·)) n∈N to K C (v, ·) implies weak convergence of the sequence (K (Cn) J∪{d} (v, ·)) n∈N to K C J∪{d} (v, ·), which proves the assertion.
We complete this section with two additional notions of convergence considered, e.g., in Spanhel and Kurz [34], the Kullback-Leibler divergence (distance) KL and the total variation metric TV, and describe their relationship with D 1 and d ∞ . Defining T V on C d by

T V (C 1 , C 2 ) := sup G∈B(I d ) |μ C1 (G) − μ C2 (G)|,

convergence with respect to T V implies convergence with respect to D 1 :

Theorem 2.3. The inequalities
hold for all copulas C 1 , C 2 ∈ C d . In particular, convergence w.r.t. T V implies convergence w.r.t. D 1 and D ∞ .
from which the desired inequalities follow immediately. The first inequality has already been proved in [11,Lemma 3].
It is well-known that KL divergence (which is not a metric and is only well-defined for absolutely continuous copulas whose density is positive λ d -almost everywhere) is stronger than TV (see the generalized Pinsker inequality in, e.g., [31]). Altogether we have the following interrelation, where a =⇒ b indicates that convergence with respect to a implies convergence with respect to b (and the first implication is restricted to those copulas for which KL divergence is well-defined):

KL =⇒ TV =⇒ D 1 =⇒ d ∞ .

Simplified copulas
In this section we introduce three-dimensional so-called simplified copulas, i.e., copulas for which the conditional copulas do not depend on the conditioning variable. The enormous importance of this type of copulas is underlined by the fact that every copula can be approximated arbitrarily well with respect to d ∞ by simplified copulas (see Corollary 3.7). On the other hand, we will show that simplified pair-copula constructions may fail to approximate a given dependence structure w.r.t. d ∞ reasonably well (see Example 3.8). Additionally, we will see that the afore-mentioned denseness gets lost entirely when finer topologies or stronger metrics are considered, and prove that for D 1 (Theorem 3.9), for the total variation metric TV (Theorem 3.10), and for the Kullback-Leibler (KL) divergence (Theorem 3.11) the family is even nowhere dense. With very few exceptions, pair-copula constructions are introduced in the literature by working with copula densities, i.e., all copulas are assumed to be absolutely continuous. Ensuring that no key idea of the underlying concept is left out and aiming at a setting as general as possible we deviate from this approach and work with Markov kernels instead.
In this and the subsequent three sections all conditioning will be done with respect to the last coordinate, notice that this does not impose any restriction (as can be seen from Theorem 3.10, Theorem 3.11, Remark 5.4 and Section 7).
According to disintegration, for every copula C ∈ C 3 there exists some Markov kernel K C such that C can be expressed as

C(u, v) = ∫ [0,v] K C (t, [0, u]) dλ(t)

for all (u, v) ∈ I 2 × I. Since K C is a Markov kernel, for every u ∈ I 2 the mapping t → K C (t, [0, u]) is measurable and for almost every t ∈ I the mapping u → K C (t, [0, u]) is a bivariate distribution function with (conditional) univariate marginal distribution functions F 1|3 (·|t) and F 2|3 (·|t) (conditional on t). Sklar's Theorem implies that for almost every t ∈ I there exists some (conditional) bivariate copula C t 12;3 (conditional on t) satisfying

K C (t, [0, u]) = C t 12;3 (F 1|3 (u 1 |t), F 2|3 (u 2 |t))

for all u ∈ I 2 such that the identity

C(u, v) = ∫ [0,v] C t 12;3 (F 1|3 (u 1 |t), F 2|3 (u 2 |t)) dλ(t)    (3.1)

holds for all (u, v) ∈ I 2 × I.

Remark 3.1.
(1) Since the (conditional) univariate marginal distribution functions satisfy F 1|3 (1|t) = 1 = F 2|3 (1|t) for every t ∈ I, the bivariate marginal copulas C 13 and C 23 of C satisfy

C 13 (u 1 , v) = ∫ [0,v] F 1|3 (u 1 |t) dλ(t) and C 23 (u 2 , v) = ∫ [0,v] F 2|3 (u 2 |t) dλ(t)

for all (u, v) ∈ I 2 × I, and their corresponding Markov kernels fulfill

K C13 (t, [0, u 1 ]) = F 1|3 (u 1 |t) and K C23 (t, [0, u 2 ]) = F 2|3 (u 2 |t)

for all u ∈ I 2 and λ-almost all t ∈ I (compare with Equation (2.2)). (2) Notice that we chose this notation for the (conditional) univariate distribution functions on purpose since it facilitates comprehending what follows. (3) For the copulas corresponding to the conditional bivariate distribution functions K C (t, ·) we write C t 12;3 instead of C t 12|3 and hence adopt the notation used in the literature (see, e.g., [34]).
The following two observations concerning Equation (3.1) are key: (O1) the (conditional) bivariate copulas C t 12;3 may depend on t; (O2) since the (conditional) univariate marginal distribution functions F 1|3 (·|t) and F 2|3 (·|t) may fail to be continuous, the (conditional) bivariate copulas C t 12;3 are not unique in general. To the best of the authors' knowledge, the second observation has not yet been addressed in the literature, which is not entirely surprising considering the fact that pair-copula constructions are usually formulated for absolutely continuous copulas.
In the sequel we will study copulas C for which (O1) does not occur, i.e., copulas for which the (conditional) copulas C t 12;3 do not depend on t. We will refer to a copula C ∈ C 3 as generalized simplified (with respect to the third coordinate) if there exists some bivariate copula A ∈ C 2 such that the identity

C(u, v) = ∫ [0,v] A(F 1|3 (u 1 |t), F 2|3 (u 2 |t)) dλ(t)    (3.4)

holds for all (u, v) ∈ I 2 × I. In the sequel C 3 GS will denote the family of all three-dimensional generalized simplified copulas.
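The defining identity of a generalized simplified copula can be sketched numerically: one fixed bivariate copula A is combined with conditional marginals that may still depend on t. The concrete choices below (A = M, and conditionals taken from a bivariate EFGM copula with parameter 0.5) are illustrative assumptions, not taken from the text; the resulting integral still has uniform univariate margins, as any copula must.

```python
import numpy as np

# Minimal sketch of the generalized-simplified identity: one fixed
# bivariate copula A plus t-dependent conditional marginals.
# Assumptions: A = M (comonotonicity); F_{i|3} taken from a bivariate
# EFGM copula with parameter theta = 0.5 (illustrative choices only).

def F_cond(u, t, theta=0.5):
    # kernel of the bivariate EFGM copula: d/dt [ u t (1 + theta (1-u)(1-t)) ]
    return u + theta * u * (1 - u) * (1 - 2 * t)

def A(x, y):
    return np.minimum(x, y)

def C_simplified(u1, u2, v, m=2000):
    t = (np.arange(m) + 0.5) / m * v          # midpoints of [0, v]
    return np.mean(A(F_cond(u1, t), F_cond(u2, t))) * v

# uniform univariate margins, as required for a copula:
print(C_simplified(0.3, 1.0, 1.0))   # close to 0.3
print(C_simplified(1.0, 1.0, 0.7))   # close to 0.7
```

Note that the t-dependence enters only through the arguments F 1|3 (u 1 |t) and F 2|3 (u 2 |t), never through A itself; this is exactly the simplifying assumption.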
The following first results (Theorem 3.2 and Corollary 3.3) imply that the family of generalized simplified copulas is very flexible.

Theorem 3.2. Every completely dependent copula C ∈ C 3 is generalized simplified.

Proof. Let C ∈ C 3 be a completely dependent copula, i.e., assume that there exist λ-preserving functions h 1 , h 2 : I → I such that K C (t, E) := 1 E (h 1 (t), h 2 (t)) is a Markov kernel of C. Then the (conditional) univariate marginal distribution functions F 1|3 (u 1 |t) = 1 [0,u1] (h 1 (t)) and F 2|3 (u 2 |t) = 1 [0,u2] (h 2 (t)) only attain the values 0 and 1, so for every A ∈ C 2 the identity

C(u, v) = ∫ [0,v] A(F 1|3 (u 1 |t), F 2|3 (u 2 |t)) dλ(t)

holds for all (u, v) ∈ I 2 × I. This yields C ∈ C 3 GS . Note that completely dependent copulas are generalized simplified in the broadest sense since Equation (3.4) does not only hold for one or some copulas; it holds for every A ∈ C 2 .
Since the collection of all completely dependent copulas is dense in (C 3 , d ∞ ) Theorem 3.2 has the following consequence:

Corollary 3.3. The collection of all generalized simplified copulas is dense in (C 3 , d ∞ ).
Returning to observation (O2) in what follows we will mainly restrict ourselves to the family of copulas C ∈ C 3 for which almost all (conditional) univariate marginal distribution functions F 1|3 (.|t) and F 2|3 (.|t) are continuous and let C 3 c denote the family of all these copulas. According to Sklar's theorem, for every copula C ∈ C 3 c the (conditional) bivariate copulas C t 12;3 are unique for almost all t ∈ I. Obviously the family of all absolutely continuous copulas C 3 ac is a subset of C 3 c , so for absolutely continuous copulas the conditional copulas are unique.
We will let C 3 S := C 3 GS ∩ C 3 c denote the collection of all simplified copulas, i.e., the class of all three-dimensional copulas C which are generalized simplified and have continuous (conditional) univariate marginal distribution functions F 1|3 (·|t) and F 2|3 (·|t). In this case the copula A ∈ C 2 in Equation (3.4) is unique and equals C t 12;3 for almost all t ∈ I. Before proceeding we illustrate the above simplifying assumption in terms of the (Fréchet) class F 3 Π of all three-dimensional copulas C fulfilling that coordinates 1&3 as well as 2&3 are independent, i.e., C 13 = C 23 = Π.

Example 3.4. (2) The EFGM copula C EFGM ∈ F 3 Π , given by C EFGM (u 1 , u 2 , u 3 ) = u 1 u 2 u 3 (1 + (1 − u 1 )(1 − u 2 )(1 − u 3 )), satisfies

(C EFGM ) t 12;3 (u) = u 1 u 2 (1 + (1 − 2t)(1 − u 1 )(1 − u 2 ))

for all u ∈ I 2 and almost all t ∈ I. Thus, C EFGM is non-simplified. (3) The copula C Cube ∈ F 3 Π which distributes mass uniformly within four cubes of side length 1/2 and has no mass outside these cubes satisfies (C Cube ) t 12;3 = A 1 for almost all t ∈ (0, 1/2) and (C Cube ) t 12;3 = A 2 for almost all t ∈ (1/2, 1), where A 1 and A 2 are checkerboard copulas (see [10] for a general definition) whose density is depicted in Figure 1. As a direct consequence C Cube is non-simplified.

In contrast to the afore-mentioned class, some copula families only contain simplified copulas:

Example 3.5. [17,35] (1) All three-dimensional Gaussian and Student t-copulas are simplified.
(2) The only three-dimensional Archimedean copulas that are simplified are those of Clayton type.
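The non-simplifiedness of the trivariate EFGM copula mentioned above can be cross-checked numerically: differentiating C EFGM with respect to the third coordinate yields the conditional distribution, and since the conditional marginals are uniform here, that derivative is itself the conditional copula. The parameter choice θ = 1 below is an assumption made for illustration.

```python
import numpy as np

# Numerical check that the trivariate EFGM copula is non-simplified:
# its conditional copula given the third coordinate depends on t.
# (Sketch; parameter theta = 1 is an illustrative assumption.)

def C_EFGM(u1, u2, u3, theta=1.0):
    return u1*u2*u3 * (1 + theta*(1-u1)*(1-u2)*(1-u3))

def cond_cdf(u1, u2, t, h=1e-6):
    # F_{12|3}(u1, u2 | t) = d/du3 C(u1, u2, u3) at u3 = t (central difference,
    # exact here since C is quadratic in u3)
    return (C_EFGM(u1, u2, t+h) - C_EFGM(u1, u2, t-h)) / (2*h)

# conditional marginals are uniform, so cond_cdf IS the conditional copula
u1, u2 = 0.3, 0.6
c_at_0 = cond_cdf(u1, u2, 0.0 + 1e-6)   # near t = 0
c_at_1 = cond_cdf(u1, u2, 1.0 - 1e-6)   # near t = 1
print(c_at_0, c_at_1)                   # clearly different -> non-simplified
```

The two printed values differ by roughly 2·u 1 u 2 (1 − u 1 )(1 − u 2 ), confirming the t-dependence of the conditional copula.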
We now focus on empirical copulas, show that they are simplified, and then conclude that C 3 S is dense in (C 3 , d ∞ ) (Corollary 3.7). Consider a random vector (X, Y ) with continuous univariate marginals and suppose that (X 1 , Y 1 ), . . . , (X n , Y n ) is a sample from (X, Y ). Since the univariate marginals are continuous we can w.l.o.g. assume that there are no ties. Let Ĉ n denote the empirical copula (by which we mean the unique copula determined by trilinear interpolation of the empirical subcopula). Then there exist two permutations σ 1 , σ 2 of {1, . . . , n} such that the density ĉ n of Ĉ n is given by

ĉ n (u) = n 2 if u ∈ ((σ 1 (i) − 1)/n, σ 1 (i)/n] × ((σ 2 (i) − 1)/n, σ 2 (i)/n] × ((i − 1)/n, i/n] for some i ∈ {1, . . . , n}, and ĉ n (u) = 0 otherwise

(uniform distribution on n cubes of volume 1/n 3 ).

Theorem 3.6. Every three-dimensional empirical copula is simplified.
Proof. The (conditional) univariate marginal distribution functions (F n ) 1|3 (·|v) and (F n ) 2|3 (·|v) of Ĉ n are continuous, and for v ∈ ((i − 1)/n, i/n] the probability mass of K Ĉn (v, ·) is distributed uniformly on the single square ((σ 1 (i) − 1)/n, σ 1 (i)/n] × ((σ 2 (i) − 1)/n, σ 2 (i)/n]; consequently (Ĉ n ) v 12;3 = Π for λ-almost every v ∈ I, so Ĉ n is simplified.

Since the collection of all empirical copulas is dense in (C 3 , d ∞ ) (see [9, Proposition 3.2]), Theorem 3.6 has the following consequence (for a stronger and more general result see Corollary 7.2):

Corollary 3.7. The collection of all simplified copulas is dense in (C 3 , d ∞ ).
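The mechanism behind Theorem 3.6 can be made concrete: conditional on the third coordinate, the empirical copula's mass lies in a single square, on which the first two coordinates are independent uniforms, so the Markov kernel factorizes into the product of its own marginals (conditional copula Π). The value n = 4 and the permutations below are arbitrary illustrative choices.

```python
import numpy as np

# Sketch of Theorem 3.6: the Markov kernel of the (trilinearly
# interpolated) empirical copula factorizes, so the conditional
# copula is Pi. n = 4 and the permutations are illustrative choices.

n = 4
sigma1 = [2, 0, 3, 1]   # sigma1[i], sigma2[i]: square occupied when the
sigma2 = [1, 3, 0, 2]   # third coordinate lies in (i/n, (i+1)/n]

def K_hat(v, u1, u2):
    # Markov kernel K(v, [0,u1] x [0,u2]) of the empirical copula
    i = min(int(v * n), n - 1)
    F1 = np.clip(n * u1 - sigma1[i], 0.0, 1.0)   # conditional marginal 1
    F2 = np.clip(n * u2 - sigma2[i], 0.0, 1.0)   # conditional marginal 2
    return F1 * F2                               # product -> conditional copula Pi

# factorization check: the kernel equals the product of its own marginals
v, u1, u2 = 0.37, 0.55, 0.8
assert np.isclose(K_hat(v, u1, u2), K_hat(v, u1, 1.0) * K_hat(v, 1.0, u2))
```

The factorization holds for every v, which is precisely the statement that the empirical copula is simplified with A = Π.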
Although every copula can be approximated arbitrarily well by simplified copulas, a reasonable approximation from the same Fréchet class might not be possible, as the following example illustrates.

Example 3.8. There exists some ε > 0 such that every simplified copula D from the class F 3 Π fulfills d ∞ (C Cube , D) > ε. Thus C Cube cannot be approximated arbitrarily well by a simplified copula D from the class F 3 Π .

We now focus on the afore-mentioned stronger metrics and finer topologies on C 3 (or important subclasses). To simplify notation we will write C 3 ac,>0 for the collection of all absolutely continuous copulas with positive density.

Theorem 3.9.

The collection of all simplified copulas is nowhere dense in (C 3 c , D 1 ), in (C 3 c , D 2 ) and in (C 3 c , D ∞ ).
Proof. Suppose, on the contrary, that the D 1 -closure of the family of all simplified copulas contains some open ball O D1 (C, r) with C ∈ C 3 c and r > 0. Since according to Lemma A.6 non-simplified checkerboard copulas are dense in (C 3 , D 1 ) we can find a non-simplified checkerboard copula C * ∈ O D1 (C, r). Since, by assumption, the D 1 -closure of the family of all simplified copulas contains O D1 (C, r), there exists a sequence (C n ) n∈N of simplified copulas with lim n→∞ D 1 (C n , C * ) = 0, a contradiction to Lemma A.5.
Proceeding analogously yields the second and the third assertion.
Theorem 3.9 and Theorem 2.3 imply the following two striking results:

Theorem 3.10.

The collection of all simplified copulas is nowhere dense in (C 3 c , T V ).
The collection of all simplified copulas with positive density is nowhere dense in (C 3 ac,>0 , T V ).

Theorem 3.11. The collection of all simplified copulas with positive density is nowhere dense in (C 3 ac,>0 , KL).

Theorems 3.9, 3.10 and 3.11 answer the question "How dense does the set of simplified densities lie in the set of all densities?" posed by Nagler and Czado [27] in a complete and definitive manner.
In the same article the authors also pose the question "how far off can we be by assuming a simplified model?" – answering this very question is one of the main objectives of the subsequent sections. Notice that, for this purpose, we can restrict ourselves to the metric d ∞ since (according to the afore-mentioned results) simplified copulas are nowhere dense w.r.t. D 1 , D ∞ , TV and KL.

Simplified pair-copula constructions
Equation (3.1) suggests the construction of a three-dimensional copula in terms of two families of (conditional) univariate marginal distribution functions characterizing the dependence structure between coordinates 1&3 and coordinates 2&3, respectively, and (conditional) bivariate copulas representing the dependence structure between coordinates 1&2 conditional on the third variable. This construction principle is called vine decomposition or pair-copula construction (see [1,3]). In case the conditioning variable only enters indirectly through the conditional marginals (as is the case in Equation (3.4); see, e.g., [18] for an early reference), the pair-copula construction is said to be simplified (see [17]).

Construction principle
Simplified pair-copula constructions are used to approximate the data generating copula (from C 3 c ) by a simplified copula (from C 3 S ) using the following hierarchical bottom-up algorithm based on Equation (3.4): (1) Estimation of the (conditional) univariate marginal distribution functions F 1|3 (·|t) and F 2|3 (·|t) conditional on t; (2) Estimation of the (conditional) copula A of coordinates 1&2 conditional on variable 3, assuming that the conditioning variable enters only through the arguments of the conditional copula A (simplifying assumption). The estimation is either done step-by-step or jointly, parametrically or non-parametrically; for more information we refer to [1,2,16,17,21,27,34] and the references therein. For additional discussion of estimating conditional copulas satisfying the simplifying assumption (step (2)) we refer to [8,13,14,30].
The 3-dimensional copula resulting from this algorithm is simplified and is said to be a simplified vine copula (SVC). Obviously, the above algorithm and thus its output, the SVC, depend on the estimation method used and on the family of copulas from which the estimators are selected. The above algorithm may certainly provide a reasonable estimator if the (data generating) copula is simplified. The natural question arising at this point, however, is how well an SVC approximates the data generating copula if the latter fails to be simplified. We start with the following example, also discussed in [35, Section 5]: Consider the (non-simplified) EFGM copula C EFGM from Example 3.4 as data generating copula. Minimizing the Kullback-Leibler divergence between the conditional copula and its estimator selected from the family of all bivariate EFGM copulas in step (2) yields the bivariate independence copula as the optimal approximation. The SVC selected by a step-by-step algorithm hence equals the three-dimensional independence copula.
Comparing the data generating copula with its selected SVC yields a d ∞ -distance of 1/64; this equals 6.25% of the maximal d ∞ -distance of two copulas within the (Fréchet) class of all copulas having pairwise independent marginals (using the results in [28, Section 3.3] it is straightforward to verify that the diameter of this class is 1/4).
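The 1/64 distance can be verified with a few lines of grid search: assuming the EFGM parameter θ = 1, the difference C EFGM − Π is u 1 u 2 u 3 (1 − u 1 )(1 − u 2 )(1 − u 3 ), whose maximum is (1/4) 3 = 1/64, attained at (1/2, 1/2, 1/2).

```python
import numpy as np

# Grid check of the 1/64 distance between the EFGM copula (theta = 1
# assumed) and its selected SVC, the independence copula Pi.

g = np.linspace(0, 1, 101)                    # grid contains 0.5 exactly
U1, U2, U3 = np.meshgrid(g, g, g, indexing="ij")
diff = U1*U2*U3 * (1-U1)*(1-U2)*(1-U3)        # C_EFGM - Pi for theta = 1
print(diff.max())                             # 1/64 = 0.015625
```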
We refer to [2,17,35] for more examples and comparisons of the data generating copula with its selected simplified vine copula whereby the quality of the approximations is judged quite differently.
Aiming to obtain more general analytic results concerning the optimality of simplified pair-copula constructions, in what follows we discuss the concept of partial vine copulas.

Partial vine copulas (PVCs)
The basic idea behind a partial vine copula is that the conditional bivariate copulas of the original three-dimensional copula are averaged (see [33,34]): Considering that for every C ∈ C 3 c the copula C t 12;3 is unique for almost every t ∈ I, it follows that the function C p : I 2 → I, given by

C p (u) := ∫ I C t 12;3 (u) dλ(t),

is well-defined. In the sequel we will refer to C p as the partial copula of C (also see [4]). Coinciding with the expected conditional copula, the partial copula is often used as an approximation of the conditional copula (see [33,34] for more information). Given C p in the above setting the mapping ψ : C 3 c → C 3 c , given by

ψ(C)(u, v) := ∫ [0,v] C p (F 1|3 (u 1 |t), F 2|3 (u 2 |t)) dλ(t),

is well-defined and assigns to every copula C ∈ C 3 c a simplified copula ψ(C). The copula ψ(C) is referred to as the partial vine copula of C (with respect to the third coordinate) in the sequel. It is obvious that every partial vine copula is simplified.
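The averaging step can be sketched numerically. For the trivariate EFGM copula (with θ = 1, an illustrative assumption) the conditional copula at level t is u 1 u 2 (1 + (1 − 2t)(1 − u 1 )(1 − u 2 )), so averaging over t kills the t-dependent term and the partial copula is the independence copula; the PVC of C EFGM is therefore Π in three dimensions.

```python
import numpy as np

# Sketch of the partial copula C^p as the average over t of the
# conditional copulas. EFGM case with theta = 1 (assumption).

def cond_copula(u1, u2, t):
    return u1*u2 * (1 + (1 - 2*t) * (1-u1) * (1-u2))

def partial_copula(u1, u2, m=1000):
    t = (np.arange(m) + 0.5) / m          # midpoint rule on I
    return np.mean(cond_copula(u1, u2, t))

print(partial_copula(0.3, 0.6))   # 0.18 = 0.3 * 0.6, i.e. Pi
```

This also illustrates why averaging loses information: very different conditional dependence at different levels t can cancel out entirely in C p.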
The transformation ψ preserves the dependence structure between coordinates 1&3 as well as between coordinates 2&3. The following lemma gathers some additional properties of ψ:

Lemma 4.2. Suppose that C ∈ C 3 c . Then the following assertions hold: (1) The partial vine copula ψ(C) of C satisfies (ψ(C)) 13 = C 13 as well as (ψ(C)) 23 = C 23 .
The identity (ψ(C)) 23 = C 23 follows in the same manner.
In fact, considering that F 1|3 (s 1 |t) = s 1 and F 2|3 (s 2 |t) = s 2 hold for all s ∈ I 2 and almost all t ∈ I, we get

ψ(C)(s 1 , s 2 , v) = v · C p (s 1 , s 2 )

for all s ∈ I 2 and all v ∈ I.

Example 4.4. The copula C RCube ∈ F 3 Π whose mass is distributed uniformly within the cubes

T. Mroz et al.
and has no mass outside these cubes is non-simplified.

PVCs have been used in [24] to test the simplifying assumption in vine copula models and in [27] to construct a non-parametric estimator for multivariate distributions. In [34] the authors showed that "under regularity conditions, stepwise estimators of pair-copula constructions converge to the PVC irrespective of whether the simplifying assumption holds or not" (see [34, Corollary 6.1]). Nevertheless, this need not be true if the estimation is done jointly in a non-simplified setting (see [34, Corollary 6.1]). The authors further proved that "if one sequentially minimizes the Kullback-Leibler divergence related to each tree then the optimal SVC is the PVC" (see [34, Theorem 5.1]). Since, again, this is not necessarily true if the estimation is done jointly in a non-simplified setting (see [34, Theorem 5.2]), the authors conclude that PVCs "may not be the best approximation in the space of SVCs" but are "often the best feasible SVC approximation in practice."

Motivated by these results, in what follows we discuss analytic properties and optimality of simplified pair-copula constructions and focus mainly on partial vine copulas. In Section 5 we calculate the d ∞ -distance between non-simplified copulas and their unique partial vine copulas for different dependence structures; in Section 6 we discuss continuity of ψ with respect to different notions of convergence.

Optimality of partial vine copulas
The main objective of this section is to provide an answer to the question "how far off can we be by assuming a simplified model?" posed by Nagler and Czado [27]. We proceed as follows: We first show that partial vine copulas are never the best simplified copula approximation (with respect to d ∞ ) if the true copula is non-simplified (Theorem 5.1). We then compare non-simplified copulas C with their unique partial vine copulas ψ(C) in different settings and calculate their d ∞ -distance. It turns out that the maximal distance within the family of all copulas with pairwise independent marginals is 1/8, which corresponds to 50% of the diameter of this class w.r.t. d ∞ . Going even further, we provide an example of a copula C ∈ C 3 fulfilling d ∞ (C, ψ(C)) = 3/16 which, in turn, corresponds to 28.125% of the diameter of (C 3 , d ∞ ). In other words, ψ(C) can be far away from C, so working with PVCs must be done with care. Corollary 3.7 implies that if C does not fulfill the simplifying assumption then the partial vine copula fails to be optimal with respect to d ∞ :

Theorem 5.1. Suppose that C ∈ C 3 c is non-simplified. Then there exists some simplified copula D ∈ C 3 S satisfying d ∞ (C, D) < d ∞ (C, ψ(C)).

Proof. Considering C ∈ C 3 c \ C 3 S we have C ≠ ψ(C), so setting 0 < d ∞ (C, ψ(C)) =: ε and using Corollary 3.7 yields the desired result.
As next step we calculate sup C∈F 3 Π d ∞ (C, ψ(C)), show that the supremum is attained, and then characterize all elements in F 3 Π attaining the maximum. Afterwards we provide a lower bound for sup C∈C 3 c d ∞ (C, ψ(C)). The (dis)continuity results in Section 6 will make it clear why we cannot simply use compactness of (C 3 , d ∞ ) to conclude that the supremum in the last expression is attained.

Worst case scenario for the class F 3 Π
The following theorem holds – notice that the set of maximizers includes the two copulas C Cube and C RCube introduced in Examples 3.4 and 4.4:

Theorem 5.2. For every copula C ∈ F 3 Π the inequality d ∞ (C, ψ(C)) ≤ 1/8 holds. Moreover, for every C ∈ F 3 Π the following two conditions are equivalent:

Proof. Consider C ∈ F 3 Π and fix (u, v) ∈ I 2 × (0, 1). Combining the resulting estimates with Example 4.3 yields d ∞ (C, ψ(C)) ≤ 1/8, which proves the first assertion.
For proving the stated equivalence we proceed as follows: First suppose that (b) holds. Considering the point u 1 = u 2 = 1/2 and v = 1/2 it follows that d ∞ (C, ψ(C)) = 1/8, so (a) holds, and it remains to show that (a) implies (b). It is straightforward to show that |k − l| is at most 1/2 and that 1/2 can only be attained by choosing u 1 = 1/2 = u 2 (irrespective of the value of v).
Notice that Theorem 5.2 implies the following striking property: The maximal distance of a copula C with pairwise independent marginals and its partial vine copula ψ(C) corresponds to
– 50% of the diameter of the metric space of all copulas with pairwise independent marginals w.r.t. d ∞ ; the diameter of this class equals 1/4, which can be calculated via [28, Section 3.3];
– 18.75% of the diameter of (C 3 , d ∞ ), which is given by 2/3.
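A numerical sketch of the worst case: a copula distributing mass 1/4 uniformly on each of four cubes of side 1/2 has pairwise independent margins, and its partial vine copula is Π 3 (the two conditional checkerboard copulas average out to Π). The cube layout below is an assumption chosen to be consistent with pairwise independence; the layout in Figure 1 may differ, but the maximal distance 1/8 is the same.

```python
import numpy as np

# Sketch of the worst case in F^3_Pi: mass 1/4, uniformly, on each of
# four cubes of side 1/2 (assumed layout, consistent with pairwise
# independent margins); the PVC is Pi^3 and d_inf(C, psi(C)) = 1/8.

cubes = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0),
         (0.0, 0.5, 0.5), (0.5, 0.0, 0.5)]   # lower corners, side 1/2

def C_cube(u1, u2, u3):
    total = 0.0
    for (a, b, c) in cubes:
        # overlap of [0,u] with the cube, rescaled: uniform mass 1/4 inside
        f = np.clip(u1 - a, 0, 0.5) * np.clip(u2 - b, 0, 0.5) * np.clip(u3 - c, 0, 0.5)
        total += f / 0.125 * 0.25
    return total

g = np.linspace(0, 1, 21)                      # grid contains 0.5 exactly
worst = max(abs(C_cube(x, y, z) - x*y*z) for x in g for y in g for z in g)
print(worst)   # 1/8 = 0.125, attained at (1/2, 1/2, 1/2)
```

At the point (1/2, 1/2, 1/2) the copula takes the value 1/4 while Π 3 takes the value 1/8, which exhibits the bound of Theorem 5.2 as sharp.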

Remark 5.3.
An equally striking result can be shown for the metric D 1 : Again working with C Cube it follows that D 1 (C Cube , ψ(C Cube )) ≥ 15/64 holds. Using the results in [11] we therefore get that the maximal D 1 -distance of a copula C ∈ F 3 Π and its partial vine copula ψ(C) is greater than or equal to 42.1875% of the diameter of the metric space (C 3 , D 1 ); the diameter of this class is at most 5/9, which can be calculated via [11, Lemma 2].
Remark 5.4. At this point it is worth mentioning that C Cube is exchangeable, hence approximating C Cube by ψ(C Cube ) leads to equally poor results no matter which coordinate is chosen for the conditioning.

Worst case scenario for the full class C 3 c
We are now going to show that the maximal d ∞ -distance of a copula C ∈ C 3 c and its assigned partial vine copula ψ(C) is at least 3/16 which corresponds to 28.125% of the diameter of the metric space (C 3 c , d ∞ ).

Example 5.5. Consider the intervals I k := [(k − 1)/4, k/4], k ∈ {1, . . . , 4}. We use Equation (3.1) in order to construct a three-dimensional non-simplified copula C satisfying that its conditional copulas C t 12;3 , t ∈ I, are identical for all t within each of the four subintervals. To this end, set A t := D k for every t ∈ I k , where the bivariate copulas D 1 , . . . , D 4 are the shuffles of W depicted in Figure 2 (for the definition of shuffles we refer to [9, Definition 2.1] and [12, Section 5]). As next step we construct the (conditional) univariate marginal distribution functions F 1|3 (·|t) and F 2|3 (·|t) (conditional on t ∈ I) and proceed as follows: Let B * , B * * denote the bivariate checkerboard copulas (see [11] for a definition) depicted in Figure 3. Completing the construction of C we use the copulas A t , t ∈ I, as conditional copulas and the Markov kernels K B * and K B * * as (conditional) univariate marginal distribution functions, and set

C(u, v) := ∫ [0,v] A t (K B * (t, [0, u 1 ]), K B * * (t, [0, u 2 ])) dλ(t).    (5.1)

Then C ∈ C 3 c is non-simplified, satisfies C t 12;3 = A t for all t ∈ I 1 ∪ I 2 ∪ I 3 ∪ I 4 , C 13 = B * and C 23 = B * * . Comparing C with its partial vine copula ψ(C) at the point (0.5, 0.5, 1) we get d ∞ (C, ψ(C)) ≥ 3/16. We have therefore proved the following theorem:

Theorem 5.6. There exists a copula C ∈ C 3 c fulfilling d ∞ (C, ψ(C)) ≥ 3/16; in particular, sup C∈C 3 c d ∞ (C, ψ(C)) ≥ 3/16.

Continuity of ψ
In this section we discuss continuity properties of the mapping ψ : C 3 c → C 3 c assigning every C ∈ C 3 c its partial vine copula. Having Lemma 4.2 in mind one might intuitively interpret ψ as a projection and therefore expect ψ to be continuous with respect to d ∞ . It turns out, however, that this interpretation is wrong: we will show that ψ is not continuous with respect to d ∞ . Considering stronger topologies than the one induced by d ∞ changes the picture: we will prove that ψ is continuous with respect to weak conditional convergence and (under some mild regularity conditions) with respect to the metric D 1 .

Uniform convergence
The mapping ψ is not continuous with respect to d ∞ ; the following result holds:

Theorem 6.1. Suppose that C ∈ C 3 c is non-simplified. Then C is a discontinuity point of the mapping ψ : C 3 c → C 3 c . In other words: Every non-simplified C ∈ C 3 c is a discontinuity point of ψ.

Proof. Let C be as in the theorem and set ε := d ∞ (C, ψ(C)) > 0. Suppose that X 1 , X 2 , . . . is an i.i.d. sample from X ∼ C and let C n denote the corresponding empirical copula. With probability one we have that X 1 , X 2 , . . . has no ties and that ( C n ) n∈N converges to C with respect to d ∞ . Considering that empirical copulas are simplified according to Theorem 3.6 (whence ψ( C n ) = C n for every n ∈ N) and using the triangle inequality it follows immediately that lim inf n→∞ d ∞ (ψ( C n ), ψ(C)) ≥ d ∞ (C, ψ(C)) = ε > 0 follows, implying that ψ is not continuous at C.
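The empirical copulas used in the proof are built from pseudo-observations; a minimal sketch (with a hypothetical Gaussian sample standing in for X 1 , X 2 , . . .) looks as follows:

```python
import numpy as np

# Minimal sketch of the pseudo-observations underlying the empirical copula
# C_n in the proof (the sample below is a hypothetical stand-in for an i.i.d.
# sample from C; continuous margins imply no ties almost surely).
rng = np.random.default_rng(42)
n = 200
X = rng.normal(size=(n, 3))                    # hypothetical sample X_1, ..., X_n
ranks = X.argsort(axis=0).argsort(axis=0) + 1  # componentwise ranks in {1, ..., n}
U = ranks / n                                  # pseudo-observations in (0, 1]^3

# Each coordinate of U runs through {1/n, ..., 1} exactly once, so the
# empirical copula built from U has (discretely) uniform univariate margins;
# by Theorem 3.6 its d-linear interpolation is simplified.
assert all(np.array_equal(np.sort(ranks[:, j]), np.arange(1, n + 1)) for j in range(3))
```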
Using convex combinations (of empirical copulas with a non-simplified copula) it is straightforward to verify that the set of all non-simplified C ∈ C 3 c is dense in (C 3 c , d ∞ ). Theorem 6.1 therefore has the following corollary:

Corollary 6.2. The mapping ψ : C 3 c → C 3 c is discontinuous on a dense subset of (C 3 c , d ∞ ).
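The convexity step behind this denseness argument can be written out explicitly; the display below is a sketch assuming only that d ∞ , being a supremum of convex functionals, is convex in its first argument, and that D ∈ C 3 c is a fixed non-simplified copula:

```latex
% Sketch: convex combinations of the empirical copulas \widehat{C}_n of C
% with a fixed non-simplified copula D converge to C with respect to d_\infty.
d_\infty\Big(\big(1-\tfrac{1}{n}\big)\widehat{C}_n + \tfrac{1}{n}\,D,\; C\Big)
  \;\le\; \big(1-\tfrac{1}{n}\big)\, d_\infty\big(\widehat{C}_n, C\big)
          \;+\; \tfrac{1}{n}\, d_\infty(D, C)
  \;\xrightarrow{\,n\to\infty\,}\; 0.
```

Provided each convex combination is again non-simplified (which, as stated in the text, is straightforward to verify), this yields the denseness of the non-simplified copulas in (C 3 c , d ∞ ).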

Weak conditional convergence
Focusing on weak conditional convergence the mapping ψ behaves more nicely; the corresponding result is Theorem 6.3, whose assertions we now prove.

Proof. The first assertion follows from Theorem 2.2. To prove the second one we proceed as follows: Since for almost all v ∈ I the marginal distribution functions of K Cn (v, ·), n ∈ N, and of K C (v, ·) are continuous, Lemma A.2 implies uniform convergence of the sequence ((C n ) v 12;3 ) n∈N to C v 12;3 . For s ∈ I 2 we therefore get pointwise convergence of the integrands, and dominated convergence yields the assertion. To prove the last assertion notice that for almost all t ∈ I we have (ψ(C)) t 12;3 = C p as well as (ψ(C n )) t 12;3 = (C n ) p for every n ∈ N. Hence, using the second assertion, it follows that lim n→∞ d ∞ ((ψ(C n )) t 12;3 , (ψ(C)) t 12;3 ) = 0 holds for almost all t ∈ I. According to Lemma A.2 it now suffices to show that the marginal distribution functions of the Markov kernels converge weakly, which is, however, an immediate consequence of the fact that (ψ(C)) i3 = C i3 and (ψ(C n )) i3 = (C n ) i3 , i ∈ {1, 2}, holds for every n ∈ N (see Lemma 4.2).

Convergence with respect to D 1
We finally discuss D 1 -continuity. Similarly to the proof of Theorem 6.3 we first relate D 1 -convergence of copulas to uniform convergence of the corresponding partial copulas. The slightly technical (but straightforward) proof of the following useful lemma is deferred to the appendix:

Lemma 6.4. Suppose that C, C 1 , C 2 , . . . are copulas in C 3 c . Then D 1 -convergence of (C n ) n∈N to C implies uniform convergence of the corresponding partial copulas.

We now show D 1 -continuity of the mapping ψ on the subclass of absolutely continuous copulas satisfying some integrability condition. The following lemma, whose proof is deferred to the appendix, will be key for proving this result:

Lemma 6.5. Suppose that C, C 1 , C 2 , . . . are copulas in C 3 c , that C is absolutely continuous, and let c 13 , c 23 denote the densities of the marginal copulas C 13 , C 23 of C. If there exist some constants p 13 , p 23 , p 123 ∈ (1, ∞) such that the corresponding integrability conditions hold, then the conclusion of the lemma applies.

Combining the previous two lemmata yields continuity of ψ with respect to D 1 under some mild regularity conditions:

Theorem 6.6. Consider a sequence of copulas (C n ) n∈N in C 3 c and an absolutely continuous copula C ∈ C 3 c , and let c 13 , c 23 denote the densities of the marginal copulas C 13 , C 23 of C, respectively. If there exist some constants p 13 , p 23 , p 123 ∈ (1, ∞) such that the corresponding integrability conditions hold, then lim n→∞ D 1 (C n , C) = 0 implies lim n→∞ D 1 (ψ(C n ), ψ(C)) = 0.

Results for arbitrary dimension
To confirm that the higher-dimensional case behaves analogously to dimension three, in this section we extend (slightly modified versions of) our main results (Theorem 3.6, Corollary 3.7, Theorem 5.1, Theorem 5.6, Theorem 6.1 and Corollary 6.2) to arbitrary dimensions.

Simplified copulas.
Using disintegration, for every copula C ∈ C d , every J ⊆ {1, . . . , d} with 2 ≤ |J| ≤ d and every L ⊆ J with 1 ≤ |L| ≤ |J| − 2, there exists some Markov kernel K C J such that the lower-dimensional marginal copula C J of C corresponding to the coordinates belonging to J can be expressed as C J (u) = ∫ [0,u L ] K C J (t, [0, u J\L ]) dμ C L (t) for all u ∈ I |J| . Thereby u L ∈ I |L| denotes the vector of coordinates of u belonging to L, and u J\L ∈ I |J\L| the vector of coordinates of u belonging to J\L. Since K C J is a Markov kernel, for every u J\L ∈ I |J\L| the mapping t → K C J (t, [0, u J\L ]) is measurable and, for μ C L -almost every t ∈ I |L| , the mapping u J\L → K C J (t, [0, u J\L ]) is a multivariate distribution function with (conditional) univariate marginal distribution functions F j|L (·|t), j ∈ J\L, (conditional on t). By Sklar's theorem we get that for almost every t ∈ I |L| there exists some (conditional) copula C t J\L;L (conditional on t) satisfying K C J (t, [0, u J\L ]) = C t J\L;L (F j 1 |L (u j 1 |t), . . . , F j |J\L| |L (u j |J\L| |t)) for all u J\L = (u j 1 , . . . , u j |J\L| ) ∈ I |J\L| , such that the identity C J (u) = ∫ [0,u L ] C t J\L;L (F j 1 |L (u j 1 |t), . . . , F j |J\L| |L (u j |J\L| |t)) dμ C L (t) holds for all u ∈ I |J| .
We will refer to a copula C ∈ C d as universally simplified if for every J ⊆ {1, . . . , d} with 2 ≤ |J| ≤ d and every L ⊆ J with 1 ≤ |L| ≤ |J| − 2 the following properties hold: (U1) There exists some copula A ∈ C |J\L| such that the identity C J (u) = ∫ [0,u L ] A(F j 1 |L (u j 1 |t), . . . , F j |J\L| |L (u j |J\L| |t)) dμ C L (t) holds for all u ∈ I |J| .
(U2) The (conditional) univariate marginal distribution functions F j|L (·|t), j ∈ J\L, are continuous for μ C L -almost all t ∈ I |L| . Notice that every universally simplified three-dimensional copula is simplified in the sense studied in the previous sections but not necessarily vice versa. If C ∈ C d is universally simplified then Sklar's theorem implies that the (conditional) copulas C t J\L;L are unique for μ C L -almost all t ∈ I |L| . In what follows we will let C d c denote the family of all d-dimensional copulas having continuous (conditional) univariate marginal distribution functions, and C d US the family of all d-dimensional universally simplified copulas. Notice that Π ∈ C d US and that the collection C d ac of all absolutely continuous copulas is contained in C d c . As a first step we now prove a sharper version of Theorem 3.6 and show that all d-variate empirical copulas (d-linear interpolations) are universally simplified.

Theorem 7.1. Every d-dimensional empirical copula is universally simplified.
Proof. Suppose that X is a d-dimensional random vector with continuous univariate marginals and that X 1 , . . . , X n is a sample from X; w.l.o.g. assume that there are no ties. Letting Ĉ n denote the (d-linear interpolation of the) empirical copula, there exist unique permutations σ 1 , . . . , σ d−1 of {1, . . . , n} such that the density ĉ n of Ĉ n is the uniform distribution on n d-dimensional cubes of volume 1/n d (so ĉ n equals n d−1 on each of these cubes and 0 elsewhere), and the conditional univariate distribution functions F j|L (u j |u L ), j ∈ J \ L, can be expressed explicitly in terms of σ 1 , . . . , σ d−1 . Having this, properties (U1) and (U2) follow, which completes the proof.
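The density structure described in the proof can be sketched numerically for d = 3: hypothetical permutations σ 1 , σ 2 (stand-ins for the rank-induced ones) determine n occupied cubes, and the induced cell masses form a three-dimensional checkerboard with uniform margins:

```python
import numpy as np

# Sketch for d = 3 with hypothetical permutations sigma_1, sigma_2 of
# {1, ..., n}: the n occupied cubes are indexed by (i, sigma_1(i), sigma_2(i)),
# each of volume 1/n^3 carrying mass 1/n, so the density equals
# n^(d-1) = n^2 there and 0 elsewhere.
n, d = 6, 3
rng = np.random.default_rng(1)
sigma1, sigma2 = rng.permutation(n), rng.permutation(n)

mass = np.zeros((n, n, n))
mass[np.arange(n), sigma1, sigma2] = 1 / n
assert np.isclose(mass.sum(), 1.0)

# All three univariate margins are uniform (mass 1/n per slab of width 1/n),
# and every bivariate margin is a permutation checkerboard.
for axes in [(1, 2), (0, 2), (0, 1)]:
    assert np.allclose(mass.sum(axis=axes), 1 / n)
print(n ** (d - 1))  # 36
```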
Since the collection of all empirical copulas is dense in (C d , d ∞ ) ([9, Proposition 3.2]), Theorem 7.1 has the following immediate consequence: every copula can be approximated arbitrarily well by universally simplified ones. Given a d-dimensional, non universally simplified copula C, a good uniform approximation by a universally simplified one from the same (Fréchet) class might, however, not exist. The next example illustrates this fact: for every universally simplified copula D in the Fréchet class under consideration the corresponding factorization holds for all (u, v) ∈ I 2 × I d−2 . Notice that the second equality holds since in case of X ∼ D we have that (X 1 , X 3 ) and (X 4 , . . . , X d ) are independent; hence the joint distribution function factorizes for all u ∈ (0, 1) d , so X 1 and (X 3 , . . . , X d ) are independent and we get F 1|L (u 1 |t) = u 1 (the same reasoning applies to F 2|L ).
Setting C(u, v) = C Cube (u) Π(v) for all (u, v) ∈ I 3 × I d−3 and considering Example 3.8 it therefore follows that d ∞ (C, D) > ε (with ε as in Example 3.8), i.e., it is not possible to approximate C by universally simplified copulas in F d Ind with an error smaller than ε.

Partial vine copulas (PVC-D)
We finally introduce partial vine copulas (PVC) belonging to a D-vine structure; we follow [34] and work with absolutely continuous copulas. The hierarchical construction of a partial vine copula of D-vine structure works as follows: Consider some absolutely continuous copula C ∈ C d ac and set I := {(i, j) : j ∈ {1, . . . , d − 1}, i ∈ {1, . . . , d − j}}. In the first step, i.e. for (i, j) ∈ I with j = 1, we define the partial copulas as the bivariate marginal copulas C i,i+1 of C. In the second step, i.e. for (i, j) ∈ I with j = 2, we set S i,j := (i + 1) and define the partial copulas as in [34, p. 1262]. In the subsequent steps, i.e. for (i, j) ∈ I with j ≥ 3, we set S i,j := (i + 1, . . . , i + j − 1) and define the higher-order partial copulas as in [34, p. 1262]. The resulting copula C PVC is referred to as the partial vine copula of D-vine structure corresponding to C (see [34, p. 1262]). The mapping induced by the afore-mentioned procedure will be denoted by ψ : C d ac → C d ac , i.e., ψ(C) := C PVC . Notice that, by definition, C PVC is simplified with respect to the underlying D-vine structure but may fail to be universally simplified.
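For concreteness, the index set I and the conditioning vectors S i,j of the D-vine construction can be enumerated programmatically (a small sketch assuming the standard D-vine indexing, with j the tree level and i the edge within tree j):

```python
# Sketch of the D-vine bookkeeping (standard indexing assumed): tree level j
# runs from 1 to d-1 and tree j contains d-j edges, giving d*(d-1)/2
# pair-copulas in total.
def dvine_index_set(d):
    return [(i, j) for j in range(1, d) for i in range(1, d - j + 1)]

I5 = dvine_index_set(5)
print(len(I5))  # 10 = 5 * 4 / 2

# Conditioning vectors S_{i,j} = (i+1, ..., i+j-1); empty in the first tree.
S = {(i, j): tuple(range(i + 1, i + j)) for (i, j) in I5}
print(S[(1, 1)], S[(1, 3)])  # () (2, 3)
```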
Lemma A.3. Suppose that C ∈ C 3 ac is an absolutely continuous copula, and let c 13 , c 23 denote the densities of the marginal copulas C 13 , C 23 of C, respectively. Then the stated inequality holds for every C̃ ∈ C 3 c .

Proof. Let C̃ ∈ C 3 c and C ∈ C 3 ac . For every t ∈ I define T t (s 1 , s 2 ) := (F ← 1|3 (s 1 |t), F ← 2|3 (s 2 |t)). Then T t is measurable, obviously satisfies (T t ) −1 (I 2 ) = I 2 , and the pushforward (λ 2 ) T t is absolutely continuous with density (a 1 , a 2 ) → c 13 (a 1 , t) c 23 (a 2 , t). Focusing on I 2 , using Sklar's theorem, Lipschitz continuity, and a similar argument as before then yields a bound in terms of K C̃ (t, [0, F ← 1|3 (s 1 |t)] × [0, F ← 2|3 (s 2 |t)]), integrated with respect to dλ 2 (s) dλ(t). This proves the assertion.
Suppose that C ∈ C 3 is a checkerboard copula. Then C ∈ C 3 ac . We will say that C has resolution N ≥ 2 if N is the smallest integer such that (there is a version of) the density c of C which is constant on each cube of the form [(i − 1)/N, i/N) × [(j − 1)/N, j/N) × [(k − 1)/N, k/N), i, j, k ∈ {1, . . . , N}. Notice that if C is a checkerboard copula with resolution N then its density c fulfills c(u, t) ≤ N 2 for λ 3 -almost all (u, t) ∈ I 3 . Given a checkerboard copula C with resolution N we may w.l.o.g. assume that the mapping t → C t 12;3 is constant on each of the intervals [(k − 1)/N, k/N). The second assertion now follows from Lemma A.4 and the fact that Δ(C) only depends on C and not on n; the assertion concerning weak conditional convergence follows from the fact that weak conditional convergence implies convergence w.r.t. D 1 .
1. The family of all non-simplified checkerboard copulas is dense in (C 3 , D ∞ ), in (C 3 , D 1 ), and in C 3 endowed with the topology induced by weak conditional convergence.

Proof. To prove the first assertion let C ∈ C 3 be arbitrary but fixed. Since according to [11] checkerboard copulas are dense in (C 3 , D 1 ) we can find a sequence (B n ) n∈N of checkerboard copulas with lim n→∞ D 1 (B n , C) = 0. For every n ∈ N let E n be a non-simplified checkerboard copula with the same resolution and the same (1, 3)- and (2, 3)-marginals as B n . Setting C n := (1 − 1/n)B n + (1/n)E n for every n ∈ N yields a sequence (C n ) n∈N of non-simplified checkerboard copulas. Considering D 1 (C n , C) ≤ (1 − 1/n)D 1 (C, B n ) + (1/n)D 1 (C, E n ) it follows that lim n→∞ D 1 (C n , C) = 0, which completes the proof of the first assertion concerning D 1 and D ∞ . The assertion concerning weak conditional convergence can be shown analogously: in fact, it is straightforward to extend the bivariate proof in [20, Theorem 3.2] to the three-dimensional setting, hence reusing the convex combination idea and considering C n := (1 − 1/n)B n + (1/n)E n yields the desired result.
To prove the second assertion suppose that C ∈ C 3 ac has positive density. According to the first assertion we can find a sequence (B n ) n∈N of non-simplified checkerboard copulas with lim n→∞ D 1 (B n , C) = 0. Setting C n := (1 − 1/n)B n + (1/n)Π for every n ∈ N yields a sequence (C n ) n∈N of non-simplified checkerboard copulas with positive density. Considering D 1 (C n , C) ≤ (1 − 1/n)D 1 (B n , C) + (1/n)D 1 (Π, C) we get lim n→∞ D 1 (C n , C) = 0. Since the assertion for weak conditional convergence can be shown analogously, the proof is complete.