Further Results on Size and Power of Heteroskedasticity and Autocorrelation Robust Tests, with an Application to Trend Testing

We complement the theory developed in Preinerstorfer and P\"otscher (2016) with further finite sample results on size and power of heteroskedasticity and autocorrelation robust tests. These allow us, in particular, to show that the sufficient conditions for the existence of size-controlling critical values recently obtained in P\"otscher and Preinerstorfer (2018) are often also necessary. We furthermore apply the results obtained to tests for hypotheses on deterministic trends in stationary time series regressions, and find that many tests currently used are strongly size-distorted.


Introduction
Heteroskedasticity and autocorrelation robust tests in regression models suggested in the literature (e.g., tests based on the covariance estimators in Newey and West (1987, 1994), Andrews (1991), and Andrews and Monahan (1992), or tests in Kiefer et al. (2000), Vogelsang (2002a,b, 2005)) often suffer from substantial size distortions or power deficiencies. This has been repeatedly documented in simulation studies, and has to a large extent been explained analytically by the theory developed in Preinerstorfer and Pötscher (2016). Given a test for an affine restriction on the regression coefficient vector, the results in Preinerstorfer and Pötscher (2016) provide several sufficient conditions that imply size equal to one, or severe biasedness of the test (resulting in low power in certain regions of the alternative). The central object in that theory is the set of concentration spaces of the underlying covariance model.

Summarizing, we arrive at the following situation: Preinerstorfer and Pötscher (2016) provide, inter alia, sufficient conditions for non-existence of size-controlling critical values in terms of the set of concentration spaces of a covariance model, whereas Pötscher and Preinerstorfer (2018) provide sufficient conditions for the existence of size-controlling critical values formulated in terms of a different set of linear spaces derived from the covariance model. Combining the results in Preinerstorfer and Pötscher (2016) and Pötscher and Preinerstorfer (2018) does not in general result in necessary and sufficient conditions for the existence of size-controlling critical values. [This is partly due to the fact that different sets of linear spaces associated with the covariance model are used in these two papers.] Rather, there remains a range of problems for which the existence of size-controlling critical values can neither be disproved by the results in Preinerstorfer and Pötscher (2016) nor proved by the results in Pötscher and Preinerstorfer (2018).
In the present paper we close the "gap" between the negative results in Preinerstorfer and Pötscher (2016) on the one hand, and the positive results in Pötscher and Preinerstorfer (2018) on the other hand. We achieve this by obtaining new negative results that are typically more general than the ones in Preinerstorfer and Pötscher (2016). Instead of directly working with concentration spaces of a given covariance model (as in Preinerstorfer and Pötscher (2016)), our main strategy is essentially as follows: We first show that size properties of (invariant) tests are preserved when passing from the given covariance model to a suitably constructed auxiliary covariance model which has the property that the concentration spaces of this auxiliary covariance model coincide with the set J of linear spaces derived from the initial covariance model (as used in the results of Pötscher and Preinerstorfer (2018)). Then we apply results in Preinerstorfer and Pötscher (2016) to the concentration spaces of the auxiliary covariance model to obtain a necessary condition for the existence of size-controlling critical values. [This result is first formulated for arbitrary covariance models, and is then further specialized to the case of stationary autocorrelated errors.] The so-obtained new result now allows us to prove that the conditions developed in Pötscher and Preinerstorfer (2018) for the possibility of size control are not only sufficient, but are, under certain (weak) conditions on the test statistic, also necessary. Additionally, we study power properties and provide conditions under which a critical value leading to size control will lead to low power in certain regions of the alternative; we also discuss conditions under which this is not so.
Obtaining results for the class of problems inaccessible by the results of Preinerstorfer and Pötscher (2016) and Pötscher and Preinerstorfer (2018) is not only theoretically satisfying. It is also practically important, as this class contains empirically relevant testing problems: As a further contribution we thus apply our results to the important problem of testing hypotheses on polynomial or cyclical trends in stationary time series, the former being our main focus. Testing for trends certainly is an important problem (not only) in economics, and has received a great amount of attention in the literature. Using our new results we can prove that many tests currently in use (e.g., conventional tests based on long-run-variance estimators, or more specialized tests as suggested in Vogelsang (1998) and Bunzel and Vogelsang (2005)) suffer from severe size problems whenever the covariance model is not extremely small (that is, whenever it is large enough to contain all covariance matrices of stationary autoregressive processes of order two, or a slight enlargement of that set, a weak condition that is satisfied by the covariance models used in Vogelsang (1998) or Bunzel and Vogelsang (2005); cf. also the last paragraph preceding Section 5.1.1). Furthermore, our results show that this problem cannot be resolved by increasing the critical values used (as it is established that no size-controlling critical value exists).
The structure of the article is as follows: Section 2 introduces the framework and some notation. In Section 3 we present results concerning size properties of nonsphericity-corrected F-type tests. This is done on two levels of generality: In Subsection 3.1 we present results for general covariance models, whereas in Subsection 3.2 we present results for covariance models obtained from stationary autocorrelated errors. In these two sections it is also shown that the conditions for size control obtained in Theorems 3.2, 3.8, 6.5, 6.6 and in Corollary 5.6 of Pötscher and Preinerstorfer (2018) are not only sufficient but are also necessary in important scenarios. In Section 4 we present results concerning the power of tests based on size-controlling critical values. Finally, in Section 5 we discuss consequences of our results for testing restrictions on coefficients of polynomial and cyclical regressors. All proofs as well as some auxiliary results are given in the appendices.

The model and basic notation
Consider the linear regression model

Y = Xβ + U, (2.1)

where X is a (real) nonstochastic regressor (design) matrix of dimension n × k and where β ∈ R^k denotes the unknown regression parameter vector. We always assume rank(X) = k and 1 ≤ k < n. We furthermore assume that the n × 1 disturbance vector U = (u_1, ..., u_n)′ is normally distributed with mean zero and unknown covariance matrix σ²Σ, where Σ varies in a prescribed (nonempty) set C of symmetric and positive definite n × n matrices and where 0 < σ² < ∞ holds (σ always denoting the positive square root). The set C will be referred to as the covariance model. We shall always assume that C allows σ² and Σ to be uniquely determined from σ²Σ. [This entails virtually no loss of generality and can always be achieved, e.g., by imposing some normalization assumption on the elements of C such as normalizing the first diagonal element of Σ or the norm of Σ to one, etc.] The leading case will concern the situation where C results from the assumption that the elements u_1, ..., u_n of the n × 1 disturbance vector U are distributed like consecutive elements of a zero mean weakly stationary Gaussian process with an unknown spectral density, but allowing for more general covariance models is useful. The linear model described in (2.1) together with the Gaussianity assumption on U induces a collection of distributions on the Borel-sets of R^n, the sample space of Y. Denoting a Gaussian probability measure with mean μ ∈ R^n and (possibly singular) covariance matrix A by P_{μ,A}, the induced collection of distributions is then given by

{P_{μ,σ²Σ} : μ ∈ span(X), 0 < σ² < ∞, Σ ∈ C}. (2.2)

Since every Σ ∈ C is positive definite by assumption, each element of the set in the previous display is absolutely continuous with respect to (w.r.t.) Lebesgue measure on R^n.
We shall consider the problem of testing a linear (better: affine) hypothesis on the parameter vector β ∈ R^k, i.e., the problem of testing the null Rβ = r against the alternative Rβ ≠ r, where R is a q × k matrix always of rank q ≥ 1 and r ∈ R^q. Set M = span(X). Define the affine space

M_0 = {μ ∈ M : μ = Xβ for some β satisfying Rβ = r}.

Adopting these definitions, the above testing problem can then be written more precisely as

H_0 : μ ∈ M_0, 0 < σ² < ∞, Σ ∈ C versus H_1 : μ ∈ M\M_0, 0 < σ² < ∞, Σ ∈ C.

We also define M_0^lin as the linear space parallel to M_0, i.e., M_0^lin = M_0 − μ_0 for some μ_0 ∈ M_0. Obviously, M_0^lin does not depend on the choice of μ_0 ∈ M_0. The previously introduced concepts and notation will be used throughout the paper.
The assumption of Gaussianity is made mainly in order not to obscure the structure of the problem by technicalities. Substantial generalizations away from Gaussianity are possible exactly in the same way as the extensions discussed in Section 5.5 of Preinerstorfer and Pötscher (2016); see also Appendix E of Pötscher and Preinerstorfer (2018). The assumption of nonstochastic regressors can be relaxed somewhat: If X is random and, e.g., independent of U, the results of the paper apply after one conditions on X. For arguments supporting conditional inference see, e.g., Robinson (1979).
We next collect some further terminology and notation used throughout the paper. A (nonrandomized) test is the indicator function of a Borel-set W in R^n, with W called the corresponding rejection region. The size of such a test (rejection region) is the supremum over all rejection probabilities under the null hypothesis H_0, i.e.,

sup_{μ_0 ∈ M_0} sup_{0 < σ² < ∞} sup_{Σ ∈ C} P_{μ_0, σ²Σ}(W).

Throughout the paper we let β̂_X(y) = (X′X)^{−1}X′y, where X is the design matrix appearing in (2.1) and y ∈ R^n. The corresponding ordinary least squares (OLS) residual vector is denoted by û_X(y) = y − Xβ̂_X(y). If it is clear from the context which design matrix is being used, we shall drop the subscript X from β̂_X(y) and û_X(y) and shall simply write β̂(y) and û(y). We use Pr as a generic symbol for a probability measure. Lebesgue measure on the Borel-sets of R^n will be denoted by λ_{R^n}, whereas Lebesgue measure on an affine subspace A of R^n (but viewed as a measure on the Borel-sets of R^n) will be denoted by λ_A, with zero-dimensional Lebesgue measure being interpreted as point mass. The set of real matrices of dimension l × m is denoted by R^{l×m} (all matrices in the paper will be real matrices). Let B′ denote the transpose of a matrix B ∈ R^{l×m} and let span(B) denote the subspace in R^l spanned by its columns. For a symmetric and nonnegative definite matrix B we denote the unique symmetric and nonnegative definite square root by B^{1/2}. For a linear subspace L of R^n we let L^⊥ denote its orthogonal complement and we let Π_L denote the orthogonal projection onto L. For an affine subspace A of R^n we denote by G(A) the group of all affine transformations on R^n of the form y → δ(y − a) + a* where δ ≠ 0 and a as well as a* belong to A. [If A is a linear space, G(A) consists precisely of all transformations of the form y → δy + ā with δ ≠ 0 and ā ∈ A.] The j-th standard basis vector in R^n is written as e_j(n). Furthermore, we let N denote the set of all positive integers.
A sum (product, respectively) over an empty index set is to be interpreted as 0 (1, respectively). Finally, for a subset A of a topological space we denote by cl(A) the closure of A (w.r.t. the ambient space).
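The size defined above can be approximated numerically once the supremum over C is replaced by a maximum over a finite family of covariance matrices. The following Monte Carlo sketch estimates the null rejection probability of the classical F-test (with q = 1, so the statistic is the squared t-statistic) over a small grid of AR(1) matrices; the design matrix, the grid, the critical value, and the replication count are illustrative assumptions.

```python
import numpy as np

def classical_F(y, X, R, r):
    # Classical F-type statistic (R betahat - r)' [s^2 R (X'X)^{-1} R']^{-1} (R betahat - r).
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    betahat = XtX_inv @ X.T @ y
    uhat = y - X @ betahat
    s2 = float(uhat @ uhat) / (n - k)
    diff = R @ betahat - r
    return float(diff @ np.linalg.solve(s2 * R @ XtX_inv @ R.T, diff))

def estimated_size(X, R, r, C, rhos, reps, rng):
    # Monte Carlo approximation of the maximal null rejection probability
    # of {T >= C} over a finite AR(1) grid; by invariance of the statistic
    # it suffices to simulate with mu_0 = 0 and sigma^2 = 1 (here r = 0).
    n = X.shape[0]
    idx = np.arange(n)
    worst = 0.0
    for rho in rhos:
        L = np.linalg.cholesky(rho ** np.abs(idx[:, None] - idx[None, :]))
        hits = sum(classical_F(L @ rng.standard_normal(n), X, R, r) >= C
                   for _ in range(reps))
        worst = max(worst, hits / reps)
    return worst

rng = np.random.default_rng(0)
n = 30
X = np.column_stack([np.ones(n), np.arange(1, n + 1)])  # intercept + linear trend
R, r = np.array([[0.0, 1.0]]), np.array([0.0])          # test the trend coefficient
size_iid = estimated_size(X, R, r, C=4.2, rhos=[0.0], reps=500, rng=rng)
size_ar = estimated_size(X, R, r, C=4.2, rhos=[0.9], reps=500, rng=rng)
```

In this sketch the estimated rejection probability at ρ = 0.9 is far above the nominal 5% level implied by the critical value 4.2 under iid errors, in line with the size distortions discussed in the introduction.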

Classes of test statistics
The rejection regions we consider will be of the form W = {y ∈ R^n : T(y) ≥ C}, where the critical value C satisfies −∞ < C < ∞ and the test statistic T is a Borel-measurable function from R^n to R. With the exception of Section 4, the results in the present paper will concern the class of nonsphericity-corrected F-type test statistics as defined in (28) of Section 5.4 in Preinerstorfer and Pötscher (2016) that satisfy Assumption 5 in that reference. For the convenience of the reader we recall the definition of this class of test statistics. We start with the following assumption, which is Assumption 5 in Preinerstorfer and Pötscher (2016):

Assumption 1. (i) Suppose we have estimators β̌ : R^n\N → R^k and Ω̌ : R^n\N → R^{q×q} that are well-defined and continuous on R^n\N, where N is a closed λ_{R^n}-null set. Furthermore, Ω̌(y) is symmetric for every y ∈ R^n\N. (ii) The set R^n\N is assumed to be invariant under the group G(M), i.e., y ∈ R^n\N implies δy + Xη ∈ R^n\N for every δ ≠ 0 and every η ∈ R^k. (iii) The estimators satisfy the equivariance properties β̌(δy + Xη) = δβ̌(y) + η and Ω̌(δy + Xη) = δ²Ω̌(y) for every y ∈ R^n\N, for every δ ≠ 0, and for every η ∈ R^k. (iv) Ω̌ is λ_{R^n}-almost everywhere nonsingular on R^n\N.
Nonsphericity-corrected F-type test statistics are now of the form

T(y) = (Rβ̌(y) − r)′ Ω̌(y)^{−1} (Rβ̌(y) − r) for y ∈ R^n\N*, and T(y) = 0 for y ∈ N*, (2.4)

where β̌, Ω̌, and N satisfy Assumption 1 and N* = N ∪ {y ∈ R^n\N : det Ω̌(y) = 0}. We recall from Lemmata 5.15 and F.1 in Preinerstorfer and Pötscher (2016) that N* is then a closed λ_{R^n}-null set that is invariant under G(M), and that T is continuous on R^n\N* (and is obviously Borel-measurable on R^n). Furthermore, T is G(M_0)-invariant, i.e., T(δ(y − μ_0) + μ*_0) = T(y) holds for every y ∈ R^n, every δ ≠ 0, every μ_0 ∈ M_0, and every μ*_0 ∈ M_0.
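For concreteness, the following sketch computes one member of this class: a Bartlett-kernel (Newey-West type) statistic with β̌ equal to the OLS estimator and a fixed bandwidth. The scaling of Ω̌ and the bandwidth choice are illustrative assumptions, so the sketch is not the paper's definition verbatim; it does, however, satisfy the equivariance properties of Assumption 1, namely β̌(δy + Xη) = δβ̌(y) + η and Ω̌(δy + Xη) = δ²Ω̌(y).

```python
import numpy as np

def bartlett_hac_F_stat(y, X, R, r, bandwidth):
    # Sketch of a nonsphericity-corrected F-type statistic
    # T(y) = (R betahat - r)' Omegahat^{-1} (R betahat - r)
    # with Omegahat built from a Bartlett-kernel long-run variance estimator.
    # (On the null set where Omegahat is singular the definition sets T = 0;
    # that case is omitted here for brevity.)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    betahat = XtX_inv @ X.T @ y          # OLS estimator (betacheck = betahat here)
    uhat = y - X @ betahat               # OLS residuals
    V = X * uhat[:, None]                # rows v_t = x_t * uhat_t
    Psi = V.T @ V / n                    # lag-0 term
    for j in range(1, bandwidth + 1):
        w = 1.0 - j / (bandwidth + 1)    # Bartlett weights (give a psd Psi)
        Gamma_j = V[j:].T @ V[:-j] / n
        Psi += w * (Gamma_j + Gamma_j.T)
    Omegahat = n * R @ XtX_inv @ Psi @ XtX_inv @ R.T
    diff = R @ betahat - r
    return float(diff @ np.linalg.solve(Omegahat, diff))
```

Since the residuals scale as û(δy + Xη) = δû(y), the matrix `Omegahat` scales as δ², and the statistic is invariant under the group G(M_0), which can be checked numerically.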
Remark 2.1. (Important subclasses) (i) Classical autocorrelation robust test statistics (e.g., those considered in Newey and West (1987), Andrews (1991) Sections 3-5, or in Kiefer et al. (2000), Vogelsang (2002a,b, 2005)) fall into this class: More precisely, denoting such a test statistic by T_w as in Pötscher and Preinerstorfer (2018), it follows that T_w is a nonsphericity-corrected F-type test statistic with Assumption 1 above being satisfied, provided only Assumptions 1 and 2 of Pötscher and Preinerstorfer (2018) hold. Here β̌ is given by the ordinary least squares estimator β̂, Ω̌ is given by Ω̌_w defined in Section 3 of Pötscher and Preinerstorfer (2018), and N = ∅ holds (see Remark 5.17 in Pötscher and Preinerstorfer (2018)). Furthermore, Ω̌ = Ω̌_w is then nonnegative definite on all of R^n (see Section 3.2 of Preinerstorfer and Pötscher (2016) or Section 3 of Pötscher and Preinerstorfer (2018)). We also recall from Section 5.3 of Pötscher and Preinerstorfer (2018) that in this case the set N* can be shown to be a finite union of proper linear subspaces of R^n.
(ii) Classical autocorrelation robust test statistics like T_w, but where the weights are now allowed to depend on the data (e.g., through data-driven bandwidth choice or through prewhitening, etc.) as considered, e.g., in Andrews (1991), Andrews and Monahan (1992), and Newey and West (1994), also fall into the class of nonsphericity-corrected F-type tests under appropriate conditions (with the set N now typically being nonempty), see Preinerstorfer (2017) for details. The same is typically true for test statistics based on parametric long-run variance estimators or test statistics based on feasible generalized least squares (cf. Section 3.3 of Preinerstorfer and Pötscher (2016)).
(iii) A statement completely analogous to (i) above applies to the more general class of test statistics T_GQ discussed in Section 3.4B of Pötscher and Preinerstorfer (2018), provided Assumption 1 of Pötscher and Preinerstorfer (2018) is traded for the assumption that the weighting matrix W*_n appearing in the definition of T_GQ is positive definite (and Ω̌ is of course now as discussed in Section 3.4B of Pötscher and Preinerstorfer (2018)); see Remark 5.17 in Pötscher and Preinerstorfer (2018). Again, Ω̌ is then nonnegative definite on all of R^n (see Section 3.2.1 of Preinerstorfer and Pötscher (2016)), N = ∅ holds, and N* is a finite union of proper linear subspaces of R^n (see Section 5.3 of Pötscher and Preinerstorfer (2018)).
(iv) The (weighted) Eicker-test statistic T_{E,W} (cf. Eicker (1967)) as defined on pp. 410-411 of Pötscher and Preinerstorfer (2018) is also a nonsphericity-corrected F-type test statistic with Assumption 1 above being satisfied, where β̌ = β̂, Ω̌ = Ω̌_{E,W} as defined on p. 411 of Pötscher and Preinerstorfer (2018), and N = ∅ holds. Again, Ω̌ is nonnegative definite on all of R^n, and N* = span(X) holds (see Sections 3 and 5.3 of Pötscher and Preinerstorfer (2018)). We note that the classical (i.e., uncorrected) F-test statistic also falls into this class as it coincides (up to a known constant) with T_{E,W} in case W is the identity matrix.
(v) Under the assumptions of Section 4 of Preinerstorfer and Pötscher (2016) (including Assumption 3 in that reference), usual heteroskedasticity-robust test statistics considered in the literature (see Long and Ervin (2000) for an overview) also fall into the class of nonsphericity-corrected F-type test statistics with Assumption 1 being satisfied. Again, the matrix Ω̌ is then nonnegative definite everywhere, N = ∅ holds, and N* is a finite union of proper linear subspaces of R^n (the latter following from Lemma 4.1 in Preinerstorfer and Pötscher (2016) combined with Lemma 5.18 of Pötscher and Preinerstorfer (2018)).
We shall also encounter cases where Ω̌(y) may not be nonnegative definite for some values of y ∈ R^n\N. For these cases the following assumption, which is Assumption 7 in Preinerstorfer and Pötscher (2016), will turn out to be useful. For a discussion of this assumption see p. 314 of that reference.

A result for general covariance models
In this subsection we start with a negative result concerning size in a class of nonsphericity-corrected F-type test statistics that is central to many of the results in the present paper. In particular, it allows us to show that the sufficient conditions for size control obtained in Pötscher and Preinerstorfer (2018) are often also necessary. The result complements negative results in Preinerstorfer and Pötscher (2016) and is obtained by combining Lemmata A.1 and A.3 in Appendix A with Corollary 5.17 of Preinerstorfer and Pötscher (2016). Its relationship to negative results in Preinerstorfer and Pötscher (2016) is further discussed in Appendix A.1. We recall the following definition from Pötscher and Preinerstorfer (2018).
Definition 3.1. Given a linear subspace L of R^n with dim(L) < n and a covariance model C, we let J(L, C) be defined as in Pötscher and Preinerstorfer (2018).

The space L figuring in this definition will always be an appropriately chosen subspace related to invariance properties of the tests under consideration. A leading case is when L = M_0^lin. Loosely speaking, the linear spaces belonging to J(L, C) are either (nontrivial) projections of concentration spaces of the covariance model C (in the sense of Preinerstorfer and Pötscher (2016)) on L^⊥, or are what one could call "higher-order" concentration spaces. For a more detailed discussion see Appendix B.1 of Pötscher and Preinerstorfer (2018).

Theorem 3.1. Let C be a covariance model. Let T be a nonsphericity-corrected F-type test statistic of the form (2.4) based on β̌ and Ω̌ satisfying Assumption 1 with N = ∅. Furthermore, assume that Ω̌(y) is nonnegative definite for every y ∈ R^n. If an S ∈ J(M_0^lin, C) satisfying S ⊆ span(X) exists, then

sup_{Σ ∈ C} P_{μ_0, σ²Σ}(T ≥ C) = 1

holds for every critical value C, −∞ < C < ∞, for every μ_0 ∈ M_0, and for every σ² ∈ (0, ∞).

Remark 3.2. (Extensions) (i) As noted in Section 2.2, any T as in the theorem is G(M_0)-invariant. In some cases T and its associated set N* are additionally invariant w.r.t. addition of elements from a linear space V ⊆ R^n. In such a case L = span(M_0^lin ∪ V) necessarily has dimension less than n − 1, and the variant of Theorem 3.1 where J(M_0^lin, C) is replaced by J(L, C) also holds (see footnote 5).

(ii) A result similar to Theorem 3.1, operating under a weaker condition than S ⊆ span(X) for some S ∈ J(M_0^lin, C), is given in Theorem A.4 in Appendix A. This result also allows for N ≠ ∅, but is restricted to the case where q, the number of restrictions tested, is equal to 1 and where β̌ is the least squares estimator in (2.1).
Footnote 5: That dim(L) < n − 1 must hold is seen as follows: Suppose dim(L) ≥ n − 1. Then T is λ_{R^n}-almost everywhere constant (this is trivial if dim(L) = n and follows from Remark 5.14(i) in Pötscher and Preinerstorfer (2018) in case dim(L) = n − 1). However, this contradicts Part 2 of Lemma 5.16 of Pötscher and Preinerstorfer (2018).

The preceding theorem can now be used to show that the conditions for size control obtained in Corollary 5.6 (and Remark 5.8) of Pötscher and Preinerstorfer (2018) are not only sufficient, but are actually necessary, in some important scenarios. This is formulated in the subsequent corollary; see also Remark 3.4 below. [We note that T in this corollary satisfies the assumptions of Corollary 5.6 of Pötscher and Preinerstorfer (2018) (with N† = N* and V = {0}) in view of Lemma 5.16 in the same reference.]

Corollary 3.3. Let C be a covariance model. Let T be a nonsphericity-corrected F-type test statistic of the form (2.4) based on β̌ and Ω̌ satisfying Assumption 1 with N = ∅. Furthermore, assume that Ω̌(y) is nonnegative definite for every y ∈ R^n, and that N* = span(X). Then S ⊄ span(X) for every S ∈ J(M_0^lin, C)
is necessary and sufficient for size-controllability (at any significance level α ∈ (0, 1)), i.e., is necessary and sufficient for the fact that for every α ∈ (0, 1) there exists a real number C(α) such that

sup_{μ_0 ∈ M_0} sup_{0 < σ² < ∞} sup_{Σ ∈ C} P_{μ_0, σ²Σ}(T ≥ C(α)) ≤ α

holds.

Remark 3.4. (Special cases) (i) Corollary 3.3 applies, in particular, to the (weighted) Eicker-test statistic T_{E,W} in view of Remark 2.1(iv) above. Note that N* = span(X) is here always satisfied. By Remark 2.1(iv), Corollary 3.3 also applies to the classical F-test statistic.
(ii) Next consider the classical autocorrelation robust test statistic T w with Assumptions 1 and 2 of Pötscher and Preinerstorfer (2018) being satisfied. Then Corollary 3.3 also applies to T w in view of Remark 2.1(i) above, provided N * = span(X) holds. While the relation N * = span(X) need not always hold for T w (see the discussion in Section 5.3 of Pötscher and Preinerstorfer (2018)), it holds for many combinations of restriction matrix R and design matrix X (in fact, it holds generically in many universes of design matrices as a consequence of Lemma A.3 in Appendix A of Pötscher and Preinerstorfer (2018)). Hence, for such combinations of R and X, Corollary 3.3 applies to T w .
(iii) For test statistics T_GQ with positive definite weighting matrix W*_n a statement completely analogous to (ii) above holds in view of Remark 2.1(iii). The same is true for heteroskedasticity-robust test statistics as discussed in Remark 2.1(v).

Remark 3.5. While Theorem 3.1 applies to any combination of test statistic T and covariance model C as long as they satisfy the assumptions of the theorem, in a typical application the choice of the test statistic used will certainly be dictated by properties of the covariance model C one maintains. For example, in case C models stationary autocorrelated errors different test statistics will be employed than in the case where C models heteroskedasticity.
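Size-controllability in the sense of Corollary 3.3 asks for a finite C(α) making the worst-case null rejection probability at most α. Over a finite grid of covariance matrices such a C(α) always exists and equals the sup over the grid of the 1 − α null quantiles, which the sketch below approximates by Monte Carlo for the classical F-statistic (the statistic, grid, sample size, and replication count are illustrative assumptions). The content of the non-existence results above is that over a sufficiently rich covariance model this sup can diverge.

```python
import numpy as np

def make_classical_F(X, R, r):
    # Classical F-type statistic packaged as a function of y alone.
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    def stat(y):
        betahat = XtX_inv @ X.T @ y
        uhat = y - X @ betahat
        s2 = float(uhat @ uhat) / (n - k)
        diff = R @ betahat - r
        return float(diff @ np.linalg.solve(s2 * R @ XtX_inv @ R.T, diff))
    return stat

def size_controlling_C(stat, X, alpha, rhos, reps, rng):
    # Smallest critical value controlling size over a finite AR(1) grid:
    # sup over the grid of the empirical (1 - alpha)-quantile of the null
    # distribution of the statistic (invariance lets us fix mu_0 = 0 and
    # sigma^2 = 1 in the simulation).
    n = X.shape[0]
    idx = np.arange(n)
    C = -np.inf
    for rho in rhos:
        L = np.linalg.cholesky(rho ** np.abs(idx[:, None] - idx[None, :]))
        draws = np.array([stat(L @ rng.standard_normal(n)) for _ in range(reps)])
        C = max(C, float(np.quantile(draws, 1.0 - alpha)))
    return C

rng = np.random.default_rng(0)
n = 30
X = np.column_stack([np.ones(n), np.arange(1, n + 1)])  # intercept + linear trend
stat = make_classical_F(X, np.array([[0.0, 1.0]]), np.array([0.0]))
C05 = size_controlling_C(stat, X, alpha=0.05, rhos=[0.0, 0.5, 0.9, 0.99], reps=400, rng=rng)
```

In this sketch the required critical value grows rapidly as ρ approaches one, illustrating how the sup of the quantiles can blow up as the covariance model is enriched.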

Results for covariance models obtained from stationary autocorrelated errors
We next specialize the results of the preceding section to the case of stationary autocorrelated errors, i.e., to the case where the elements u_1, ..., u_n of the n × 1 disturbance vector U in model (2.1) are distributed like consecutive elements of a zero mean weakly stationary Gaussian process with an unknown spectral density, which is not almost everywhere equal to zero. Consequently, the covariance matrix of the disturbance vector is positive definite and can be written as σ²Σ(f), where the matrix Σ(f) has elements

Σ(f)_{jk} = ∫_{−π}^{π} e^{−ι(j−k)ω} f(ω) dω,

with f varying in F, a prescribed (nonempty) family of normalized (that is, ∫_{−π}^{π} f(ω) dω = 1) spectral densities, and where 0 < σ² < ∞ holds. Here ι denotes the imaginary unit. We define the associated covariance model via

C(F) = {Σ(f) : f ∈ F}.

Examples for the set F are (i) F_all, the set of all normalized spectral densities, or (ii) F_ARMA(p,q), the set of all normalized spectral densities corresponding to stationary autoregressive moving average models of order at most (p, q), or (iii) the set of normalized spectral densities corresponding to (stationary) fractional autoregressive moving average models, etc. We shall write F_AR(p) for F_ARMA(p,0).
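The matrix Σ(f) can be computed numerically for any given spectral density by discretizing the defining integral; since f is symmetric about zero, only the cosine part contributes. The sketch below does this for an AR(2) spectral density (the AR coefficients and the grid size are illustrative assumptions):

```python
import numpy as np

def ar2_spectral_density(omega, a1, a2):
    # Unnormalized spectral density of the stationary AR(2) process
    # u_t = a1 u_{t-1} + a2 u_{t-2} + e_t.
    z = np.exp(-1j * omega)
    return 1.0 / np.abs(1.0 - a1 * z - a2 * z ** 2) ** 2

def covariance_from_spectrum(n, f, grid_size=4096):
    # Sigma(f)[j, k] = int_{-pi}^{pi} exp(-i (j - k) omega) f(omega) domega,
    # with f first normalized to integrate to one. A uniform Riemann sum
    # over a full period is very accurate for smooth periodic integrands.
    omega = np.linspace(-np.pi, np.pi, grid_size, endpoint=False)
    fvals = np.asarray(f(omega), dtype=float)
    step = 2.0 * np.pi / grid_size
    fvals = fvals / (fvals.sum() * step)            # normalization
    acov = np.array([(np.cos(h * omega) * fvals).sum() * step for h in range(n)])
    idx = np.arange(n)
    return acov[np.abs(idx[:, None] - idx[None, :])]  # Toeplitz structure

Sigma = covariance_from_spectrum(30, lambda w: ar2_spectral_density(w, 0.5, 0.2))
```

A quick sanity check: after normalization the diagonal of Σ(f) equals one, and the first off-diagonal entry reproduces the first autocorrelation a1/(1 − a2) of the AR(2) process.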
Remark 3.7. Suppose F in Theorem 3.6 has the property that γ ∈ S(F, M_0^lin) implies {γ} ∈ S(F, M_0^lin) (as is, e.g., the case if F ⊇ F_AR(2), cf. Lemma 3.8 below). Then it is easy to see that the set Γ in the theorem can be chosen to be a singleton.
This theorem is applicable to any nonempty set F of normalized spectral densities. In case more is known about the richness of F, the sufficient condition in the preceding result can sometimes be simplified substantially. Below we present such a result making use of the subsequent lemma.

Remark 3.9. (i) A sufficient condition for κ(ω(L), d(L)) + κ(γ, 1) < n (κ(ω(L), d(L)) + 2 < n, respectively) is given by dim(L) + κ(γ, 1) < n (dim(L) + 2 < n, respectively). This follows from κ(ω(L), d(L)) ≤ dim(L) established in Lemma D.1 in Appendix D of Pötscher and Preinerstorfer (2018).
Armed with the preceding lemma we can now establish the following consequence of Theorem 3.6, provided F is rich enough to encompass F_AR(2), which clearly is a very weak condition in the context of autocorrelation robust testing.

Theorem 3.10. Let F ⊆ F_all satisfy F ⊇ F_AR(2). Let T be a nonsphericity-corrected F-type test statistic of the form (2.4) based on β̌ and Ω̌ satisfying Assumption 1 with N = ∅, and assume that Ω̌(y) is nonnegative definite for every y ∈ R^n. If there exists a γ ∈ [0, π] such that span(E_{n,ρ(γ)}(γ)) ⊆ span(X), then

sup_{Σ ∈ C(F)} P_{μ_0, σ²Σ}(T ≥ C) = 1

for every critical value C, −∞ < C < ∞, for every μ_0 ∈ M_0, and for every σ² ∈ (0, ∞).

Remark 3.11. (Further comments on the necessity of the sufficient conditions for size control in Pötscher and Preinerstorfer (2018)) (i) Suppose T is as in Theorem 3.6, additionally satisfying N* = span(X). Theorem 3.6 then shows that the sufficient condition for size control given in Part 1 of Theorem 6.5 in Pötscher and Preinerstorfer (2018) (or the equivalent formulation given in Part 2 of that theorem) is also necessary.

(ii) Suppose T is as in (i) and assume furthermore that F is as in Remark 3.7. Then also the sufficient condition for size control "span(E_{n,ρ(γ)}(γ)) ⊄ span(X) for every γ ∈ S(F, M_0^lin)" mentioned in Part 2 of Theorem 6.5 of Pötscher and Preinerstorfer (2018) is necessary. [This is seen as follows: Suppose not, i.e., span(E_{n,ρ(γ)}(γ)) ⊆ span(X) holds for some γ ∈ S(F, M_0^lin). Now apply Theorem 3.6 with Γ = {γ}, which is possible because of Remark 3.7, resulting in size being equal to one, a contradiction.]

(iii) Suppose T is as in (i) and assume that F ⊆ F_all satisfies F ⊇ F_AR(2). Then F satisfies the property in Remark 3.7 in view of Lemma 3.8, and thus (ii) above applies. In this situation even more is true in view of Theorem 3.10: The further sufficient condition for size control "span(E_{n,ρ(γ)}(γ)) ⊄ span(X) for every γ ∈ [0, π]" given in Part 2 of Theorem 6.5 of Pötscher and Preinerstorfer (2018) is in fact also necessary.
(iv) The discussion in (i)-(iii) covers (weighted) Eicker-test statistics T_{E,W} (including the classical F-test statistic) as well as classical autocorrelation robust test statistics T_w (the latter under Assumptions 1 and 2 of Pötscher and Preinerstorfer (2018) and if N* = span(X) holds); it also covers the test statistics T_GQ (provided the weighting matrix W*_n is positive definite and N* = span(X) holds). In particular, the discussion in (i)-(iii) thus applies to the sufficient conditions given in Theorem 6.6 in Pötscher and Preinerstorfer (2018) and its variants outlined in Remark 6.8 of that reference. Furthermore, it transpires from this discussion that the sufficient conditions for size control provided in Theorem 3.8 of Pötscher and Preinerstorfer (2018) are actually necessary; and the same is true for Theorem 3.2 in that reference (provided the set B given there coincides with span(X)).

The results so far have only concerned the size of nonsphericity-corrected F-type test statistics for which the exceptional set N is empty and Ω̌ is nonnegative definite everywhere. We now provide a result also for the case where this condition is not met. While the preceding result maintained that F contains F_AR(2), the next result maintains the slightly stronger condition that F ⊇ F^ext_AR(2).
Theorem 3.12. Let F ⊆ F_all satisfy F ⊇ F^ext_AR(2). Let T be a nonsphericity-corrected F-type test statistic of the form (2.4) based on β̌ and Ω̌ satisfying Assumption 1. Furthermore, assume that Ω̌ also satisfies Assumption 2. Suppose there exists a γ ∈ [0, π] such that span(E_{n,ρ(γ)}(γ)) ⊆ span(X). Then for every critical value C, −∞ < C < ∞, for every μ_0 ∈ M_0, and for every σ² ∈ (0, ∞), the lower bound (3.6), which involves the random variable ξ̄_γ, holds.

The significance of the preceding theorem is that it provides a lower bound for the size of a large class of nonsphericity-corrected F-type tests, including those with N ≠ ∅ or with Ω̌ not necessarily nonnegative definite. In particular, it shows that size cannot be controlled at a given desired significance level α, if α is below the threshold given by the lower bound in (3.6). Observe that this threshold will typically be close to 1, at least if n is sufficiently large, since (possibly after rescaling) Ω̌ will often approach a positive definite matrix as n → ∞.

Remark 3.13. (i) There are at most finitely many γ satisfying the assumption span(E_{n,ρ(γ)}(γ)) ⊆ span(X) in the preceding theorem. To see this note that any such γ must coincide with a coordinate of ω(span(X)) (since trivially span(E_{n,0}(γ)) ⊆ span(X) in case ρ(γ) = 0 by this assumption, and since span(E_{n,0}(γ)) ⊆ M_0^lin ⊆ span(X) in case ρ(γ) > 0), and that the dimension of the vector ω(span(X)) is finite since ρ(ω, span(X)) > 0 can hold at most for finitely many ω's as discussed subsequent to (3.3).
(ii) If denotes the (finite) set of γ's that satisfy the assumption span(E_{n,ρ(γ)}(γ)) ⊆ span(X) in the theorem, relation (3.6) in fact implies

(iii) Similar to Theorem 3.10, Theorem 3.12 also delivers (3.5) in case Ω̌ is nonnegative definite λ_{R^n}-almost everywhere. However, note that the latter theorem imposes a stronger condition on the set F.

and ρ(γ, L) replaces ρ(γ), can be seen to hold.

Remark 3.15. Some results in this section are formulated for sets of spectral densities F satisfying F ⊇ F_AR(2) or F ⊇ F^ext_AR(2), and thus for covariance models C(F_AR(2)) and C(F^ext_AR(2)), respectively. Trivially, these results also hold for any covariance model C (not necessarily of the form C(F)) that satisfies C ⊇ C(F_AR(2)) or C ⊇ C(F^ext_AR(2)), respectively. This observation also applies to other results in this paper further below and will not be repeated.

Results concerning power
We now show for a large class of test statistics, even larger than the class of nonsphericity-corrected F-type test statistics, that, under certain conditions, a choice of critical value leading to size less than one necessarily implies that the test is severely biased and thus has bad power properties in certain regions of the alternative hypothesis (cf. Part 3 of Theorem 5.7 and Remark 5.5(iii) in Preinerstorfer and Pötscher (2016)). The relevant conditions essentially say that a collection K as in the subsequent lemma can be found that is nonempty. It should be noted, however, that there are important instances where (i) the relevant conditions are not satisfied (that is, a nonempty K satisfying the properties required in the lemma does not exist) and (ii) small size and good power properties coexist. For results in that direction see Theorems 3.7, 5.10, 5.12, and 5.21 in Preinerstorfer and Pötscher (2016) as well as Proposition 5.2 and Theorem 5.4 in Preinerstorfer (2017).
The subsequent lemma is a variant of Lemma 5.11 in Pötscher and Preinerstorfer (2018). Recall that H, defined in that lemma, certainly contains all one-dimensional S ∈ J(L, C) (provided such elements exist).
Lemma 4.1. Let C be a covariance model. Assume that the test statistic T : R^n → R is Borel-measurable and is continuous on the complement of a closed set N†. Assume that T and N† are G(M_0)-invariant, and are also invariant w.r.t. addition of elements of a linear subspace V of R^n. Define L = span(M_0^lin ∪ V) and assume that dim L < n. Let H and C(S) be defined as in Lemma 5.11 of Pötscher and Preinerstorfer (2018). Let K be a subset of H and define C_*(K) = inf_{S∈K} C(S) and C^*(K) = sup_{S∈K} C(S), with the convention that C_*(K) = ∞ and C^*(K) = −∞ if K is empty. Suppose that K has the property that for every S ∈ K the set N† is a λ_{μ_0+S}-null set for some μ_0 ∈ M_0 (and hence for all μ_0 ∈ M_0). Then the following holds:
1. For every C < C_*(K) we have sup_{Σ∈C} P_{μ_0,σ^2 Σ}(T ≥ C) = 1 for every μ_0 ∈ M_0 and every σ^2 ∈ (0, ∞).

2. For every C > C^*(K) we have inf_{Σ∈C} P_{μ_0,σ^2 Σ}(T ≥ C) = 0 for every μ_0 ∈ M_0 and every σ^2 ∈ (0, ∞).
Part 1 of the lemma implies that the size of the test equals 1 if C < C_*(K). Part 2 shows that the test is severely biased for C > C^*(K), which, in view of the invariance properties of T (cf. Part 3 of Theorem 5.7 and Remark 5.5(iii) in Preinerstorfer and Pötscher (2016)), implies bad power properties such as (4.3) and (4.4) below. In particular, Part 2 implies that infimal power is zero for such choices of C. [Needless to say, the lemma neither implies that sup_{Σ∈C} P_{μ_0,σ^2 Σ}(T ≥ C) is less than 1 for C > C^*(K) nor that inf_{Σ∈C} P_{μ_0,σ^2 Σ}(T ≥ C) is positive for C < C_*(K). For conditions implying that size is less than 1 for appropriate choices of C see Pötscher and Preinerstorfer (2018).] The computation of the constants C_*(K) and C^*(K) can sometimes be simplified; see Lemma C.1 in Appendix C. Before proceeding, we want to note that the preceding lemma also provides a negative size result (namely, that the test based on T has size equal to 1 for every C) if C_*(K) = ∞ holds for a collection K satisfying the assumptions of that lemma.
The announced theorem is now as follows and builds on the preceding lemma.
Theorem 4.2. Let C be a covariance model. Assume that the test statistic T : R^n → R is Borel-measurable and is continuous on the complement of a closed set N†. Assume that T and N† are G(M_0)-invariant, and are also invariant w.r.t. addition of elements of a linear subspace V of R^n. Define L = span(M_0^lin ∪ V) and assume that dim L < n. Then the following hold:

1. Suppose there exist two elements S_1 and S_2 of H such that C(S_1) ≠ C(S_2), and such that N† is a λ_{μ_0+S_i}-null set for some μ_0 ∈ M_0 and for i = 1, 2. Then, for every critical value C, −∞ < C < ∞, the test based on T either has size equal to 1 or satisfies inf_{Σ∈C} P_{μ_0,σ^2 Σ}(T ≥ C) = 0 for every μ_0 ∈ M_0 and every σ^2 ∈ (0, ∞); in particular, any choice of C for which the size is less than 1 leads to a severely biased test.
In the important special case where V = {0}, the assumptions on T and the associated set N† in the second and third sentence of the preceding theorem are satisfied, e.g., for nonsphericity-corrected F-type test statistics (under Assumption 1), including the test statistics T_w, T_GQ, and T_E,W given in Section 2.2 above; see also Section 5.3 in Pötscher and Preinerstorfer (2018). Furthermore, for the class of test statistics T such that Theorem 3.1 applies (and for which N† = N* = span(X) holds), it can be shown that N† is a λ_{μ_0+S}-null set for any S ∈ H (in fact, for any S ∈ J(L, C)), provided (4.1) holds. These observations lead to the following corollary.

Corollary 4.3. Let C be a covariance model and let T be a nonsphericity-corrected F-type test statistic of the form (2.4) based on β̌ and Ω̌ satisfying Assumption 1 with N = ∅. Furthermore, assume that Ω̌(y) is nonnegative definite for every y ∈ R^n and that N* = span(X). Then the conclusions of Theorem 4.2 apply to the test based on T.

Theorem 4.2 as well as the preceding corollary maintain conditions that, in particular, require H to be nonempty. In view of Lemma 5.11 in Pötscher and Preinerstorfer (2018), H is certainly nonempty if a one-dimensional S ∈ J(L, C) exists. The following lemma shows that for C = C(F) with F ⊇ F_{AR(2)} this is indeed the case; in fact, for such C typically at least two such spaces exist. 11
Lemma 4.4. Let C = C(F) for some F ⊆ F_all satisfying F ⊇ F_{AR(2)}, and let L be a linear subspace of R^n with dim(L) + 1 < n. Then there exist two one-dimensional elements S_1 and S_2 of J(L, C).
The preceding lemma continues to hold for any covariance model C ⊇ C(F_{AR(2)}) in a trivial way, since J(L, C) ⊇ J(L, C(F_{AR(2)})) then certainly holds. Also note that the condition dim(L) + 1 < n is always satisfied in the important special case where L = M_0^lin, since dim(M_0^lin) = k − q < n − 1.

Consequences for testing hypotheses on deterministic trends
In this section we discuss important consequences of the results obtained so far for testing restrictions on coefficients of polynomial and cyclical regressors when the errors are stationary, more precisely, when the errors have a covariance model of the form C(F). Such testing problems have, for obvious reasons, received a great deal of attention in econometrics, and are relevant in many other fields such as, e.g., climate or ecological research. 12 In particular, we show that a large class of nonsphericity-corrected F-type test statistics leads to unsatisfactory test procedures in this context. In Subsection 5.1 we present results concerning hypotheses on the coefficients of polynomial regressors. Results concerning tests for hypotheses on the coefficients of cyclical regressors are briefly discussed in Subsection 5.2.

Polynomial regressors
We consider here the case where one tests hypotheses that involve the coefficient of a polynomial regressor, as expressed in the subsequent assumption:

Assumption 3. Suppose that X = (F, X̃), where the (t, j)-th element of the n × k_F-dimensional matrix F is given by t^{j−1} (so that the first k_F columns of X constitute a polynomial trend) and X̃ is an n × (k − k_F)-dimensional matrix such that X has rank k (here X̃ is the empty matrix if k_F = k). Furthermore, suppose that the restriction matrix R has a nonzero column R_{·i} for some i = 1, . . . , k_F, i.e., the hypothesis involves coefficients of the polynomial trend.
Under this assumption one obtains the subsequent theorem as a consequence of Theorem 3.10.

Theorem 5.1. Let F ⊆ F_all satisfy F ⊇ F_{AR(2)} and suppose that Assumption 3 holds. Let T be a nonsphericity-corrected F-type test statistic of the form (2.4) based on β̌ and Ω̌ satisfying Assumption 1 with N = ∅, and assume that Ω̌(y) is nonnegative definite for every y ∈ R^n. Then sup_{Σ∈C(F)} P_{μ_0,σ^2 Σ}(T ≥ C) = 1 holds for every critical value C, −∞ < C < ∞, for every μ_0 ∈ M_0, and for every σ^2 ∈ (0, ∞).

The previous theorem relies in particular on the assumption that N = ∅ and that Ω̌ is nonnegative definite everywhere. While these two assumptions may appear fairly natural and are widely satisfied, e.g., for the test statistics T_w, T_GQ, and T_E,W as discussed in Remark 2.1, we shall see in Subsections 5.1.1 and 5.1.2 below that they are not satisfied by some tests suggested in the literature. To obtain results also for tests that are not covered by the previous theorem we can apply Theorem 3.12. The following result is then obtained.

Theorem 5.2. Let F ⊆ F_all satisfy F ⊇ F^{ext}_{AR(2)} and suppose that Assumption 3 holds. Let T be a nonsphericity-corrected F-type test statistic of the form (2.4) based on β̌ and Ω̌ satisfying Assumption 1. Furthermore, assume that Ω̌ also satisfies Assumption 2. Then for every critical value C, −∞ < C < ∞, for every μ_0 ∈ M_0, and for every σ^2 ∈ (0, ∞) the lower bound (5.1) holds, where R_{·i_0} denotes the first nonzero column of R. [Note that Ω̌ is P_{0,I_n}-almost everywhere nonsingular in view of Assumption 1.]

Theorem 5.2 shows that under Assumption 3 a large class of nonsphericity-corrected F-type tests, including cases with N ≠ ∅ or with N = ∅ but where Ω̌ is not necessarily nonnegative definite everywhere, typically has large size. In particular, size cannot be controlled at a given desired significance level α if α is below the lower bound in (5.1). Observe that this lower bound will typically be close to 1, at least if n is sufficiently large.
Remark 5.3. (i) In the special case where Assumption 3 is satisfied with R_{·1} ≠ 0, Theorem 5.1 continues to hold even under the weaker assumption that only F ⊇ F_{AR(1)} holds. 13 This follows from Part 3 of Corollary 5.17 in Preinerstorfer and Pötscher (2016) upon noting that Z = span(e_+) is a concentration space of C(F) by Lemma G.1 in the same reference, that Ω̌ vanishes on span(X) ⊇ Z as a consequence of the assumption N = ∅ (see the discussion following (27) in Preinerstorfer and Pötscher (2016)), and that Rβ̌(λe_+) = λR_{·1} ≠ 0 for all λ ≠ 0. 14 Here e_+ denotes the n × 1 vector of ones.
(ii) In the special case where Assumption 3 is satisfied with R_{·1} ≠ 0, also Theorem 5.2 continues to hold under the weaker assumption that F ⊇ F_{AR(1)} holds, provided the identity matrix I_n appearing in (5.1) is replaced by the nonsingular matrix Φ(0) = e_+e_+′ + D(0), where D(0) is the matrix D given in Part 3 of Lemma G.1 in Preinerstorfer and Pötscher (2016). This follows from Remark 5.14(iii) further below, upon noting that the situation considered here can be viewed as a special case of the situation described in Remark 5.14(iii) with ω = 0.
To illustrate the scope and applicability of Theorems 5.1 and 5.2 above (beyond test statistics such as T_w, T_GQ, and T_E,W mentioned before), we shall now apply them to some commonly used test statistics that have been designed for testing polynomial trends. First, in Subsection 5.1.1, we derive properties of conventional tests for polynomial trends; such tests are based on long-run-variance estimators and classical results due to Grenander (1954). In Subsection 5.1.2 we discuss properties of tests that have been introduced more recently by Vogelsang (1998) and Bunzel and Vogelsang (2005). While our discussion of methods is certainly not exhaustive (for example, we do not discuss the tests in Harvey et al. (2007) or Perron and Yabu (2009), which have been suggested only for the special case of testing a restriction on the slope in a "linear trend plus noise" model), it should also serve the purpose of presenting a general pattern for how one can check the reliability of polynomial trend tests. It might also help to avoid pitfalls in the construction of novel tests for polynomial trends.
Before we proceed to a discussion of properties of specific tests, we would like to emphasize the following: in the present section we provide, for some commonly used tests, results on their maximal null rejection probability sup_{Σ∈C(F)} P_{μ_0,σ^2 Σ}(T ≥ C) for every μ_0 ∈ M_0 and every σ^2 ∈ (0, ∞). We establish these results under the weak assumption that F contains at least F_{AR(2)} or the slight enlargement F^{ext}_{AR(2)} ⊆ F_{ARMA(2,2)}. The recent trend testing literature, cf. in particular Section 3.1 in Vogelsang (1998) and Assumption 1 in Bunzel and Vogelsang (2005), studies tests for models induced by all regression errors u_t satisfying

u_t = δu_{t−1} + w_t.

Here δ ∈ (−1, 1] is an additional unknown parameter and w_t is a weakly stationary linear process with martingale difference innovations that have uniformly bounded fourth moments and conditional variance 1, and with coefficients d_i for i ∈ N ∪ {0} satisfying ∑_{i=0}^∞ d_i ≠ 0 and the summability condition ∑_{i=0}^∞ i|d_i| < ∞. Also the coefficients d_i are unknown parameters. Obviously, the assumptions on the innovations are satisfied for an i.i.d. sequence of standard normal random variables. Hence, setting δ = 0 in the previous displayed equation, we see that the model considered in Vogelsang (1998) or Bunzel and Vogelsang (2005) contains, in particular, all stationary Gaussian AR(2) (respectively ARMA(2,2)) error processes. As a consequence, any lower bound for size obtained in our context for sets F required only to satisfy F ⊇ F_{AR(2)} (or F ⊇ F^{ext}_{AR(2)}) a fortiori provides a lower bound for the size in the setting considered in Vogelsang (1998) and Bunzel and Vogelsang (2005).

Properties of conventional tests for hypotheses on polynomial trends
The structure of tests that have traditionally been used for testing restrictions on coefficients of polynomial trends (i.e., when the design matrix X satisfies Assumption 3, and in particular if k_F = k) is motivated by results in Grenander (1954) concerning the asymptotic covariance matrix of the OLS estimator (and its efficiency) in regression models with stationary error processes and deterministic polynomial time trends (cf. also the discussion in Bunzel and Vogelsang (2005) on p. 383). The corresponding test statistics are nonsphericity-corrected F-type test statistics as in (2.4). They are based on the OLS estimator β̂ (= β̂_X) and a covariance matrix estimator Ω̂_W as in (5.2). Here the "long-run-variance estimator" ω̂_W is of the form

ω̂_W(y) = n^{−1} û′(y)W(y)û(y),     (5.3)

where û(y) denotes the vector of OLS residuals and W(y) is a symmetric, possibly data-dependent, n × n-dimensional matrix that may not be well-defined on all of R^n. 15 In many cases, however, W is constant, i.e., does not depend on y, and is also positive definite. For example, this is so in the leading case where the (i, j)-th element of W is of the form κ(|i − j|/M) for some (deterministic) M > 0 (typically depending on n) and a kernel function κ such as the Bartlett, Parzen, Quadratic-Spectral, or Daniell kernel (positive definiteness does not hold, e.g., for the rectangular kernel with M > 1). Note that in case W is given by a kernel κ, the estimator ω̂_W in the previous display can be written in the more familiar form

ω̂_W(y) = ∑_{|i|<n} κ(|i|/M) γ̂_i(y),

where γ̂_i(y) = γ̂_{−i}(y) = n^{−1} ∑_{j=i+1}^n û_j(y)û_{j−i}(y) for i ≥ 0. For trend tests based on the OLS estimator β̂ and a covariance estimator Ω̂_W as in (5.2) we shall first obtain two corollaries from Theorems 5.1 and 5.2 that cover the case where W is constant. 16 Further below we shall then address the case where W is allowed to depend on y. Note that the assumptions on W in the subsequent corollary are certainly met if W is constant, symmetric, and positive definite, and hence are satisfied in the leading case mentioned before (provided M is deterministic).
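The equivalence between the quadratic-form representation of the long-run-variance estimator and the weighted-autocovariance form can be checked numerically. The following minimal Python sketch (the Bartlett kernel, the linear-trend design, and the simulated data are illustrative assumptions, not taken from the paper) computes the estimator both ways from the same OLS residuals:

```python
import numpy as np

def bartlett(x):
    """Bartlett kernel kappa(x) = max(1 - |x|, 0)."""
    return np.maximum(1.0 - np.abs(x), 0.0)

def lrv_quadratic_form(y, X, M, kernel=bartlett):
    """Estimator as the quadratic form n^{-1} u' W u with
    W_ij = kappa(|i - j| / M) and u the OLS residual vector."""
    n = len(y)
    u = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    i = np.arange(n)
    W = kernel(np.abs(i[:, None] - i[None, :]) / M)
    return (u @ W @ u) / n

def lrv_autocovariance_form(y, X, M, kernel=bartlett):
    """The same estimator in the familiar weighted-autocovariance
    form: sum over |i| < n of kappa(|i|/M) * gamma_i."""
    n = len(y)
    u = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
    gamma = np.array([(u[i:] * u[:n - i]).sum() / n for i in range(n)])
    w = kernel(np.arange(n) / M)
    return w[0] * gamma[0] + 2.0 * (w[1:] * gamma[1:]).sum()

rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), np.arange(1, n + 1)])  # intercept + linear trend
y = X @ np.array([1.0, 0.1]) + rng.standard_normal(n)
a = lrv_quadratic_form(y, X, M=8.0)
b = lrv_autocovariance_form(y, X, M=8.0)
assert np.isclose(a, b)  # the two representations agree
```

The agreement of the two forms follows from collecting the terms u_i u_j of the quadratic form by the lag d = i − j.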
Corollary 5.4. Let F ⊆ F_all satisfy F ⊇ F_{AR(2)} and suppose that Assumption 3 holds. Suppose further that W is constant and symmetric, and that Π_{span(X)^⊥}WΠ_{span(X)^⊥} is nonzero and nonnegative definite. Then β̌ = β̂ and Ω̌ = Ω̂_W satisfy Assumption 1 with N = ∅. Let T be of the form (2.4) with β̌ = β̂, Ω̌ = Ω̂_W, and N = ∅. Then sup_{Σ∈C(F)} P_{μ_0,σ^2 Σ}(T ≥ C) = 1 holds for every critical value C, −∞ < C < ∞, for every μ_0 ∈ M_0, and for every σ^2 ∈ (0, ∞).

15 The matrix W may depend on n, a dependence not shown in the notation. Furthermore, assuming symmetry of W entails no loss of generality, since given a long-run-variance estimator as in (5.3) based on a non-symmetric weights matrix W*, one can always pass to an equivalent long-run-variance estimator by replacing W* with the symmetric matrix W = (W* + W*′)/2.

16 The slightly more general case, where W is not constant in y (and is defined on all of R^n) but W* := Π_{span(X)^⊥}WΠ_{span(X)^⊥} is so, can immediately be subsumed under the present discussion if one observes that ω̂_W coincides with ω̂_{W*} and W* is constant.
We next consider the case where the matrix Π_{span(X)^⊥}WΠ_{span(X)^⊥} is nonzero, but not (necessarily) nonnegative definite, and thus the previous corollary is not applicable. The subsequent corollary covers this case and is obtained under the slightly stronger assumption that F ⊇ F^{ext}_{AR(2)}. [Note also that the case where W is constant but Π_{span(X)^⊥}WΠ_{span(X)^⊥} is equal to zero is of no interest, as it leads to a long-run-variance estimator that vanishes identically.]

Corollary 5.5. Let F ⊆ F_all satisfy F ⊇ F^{ext}_{AR(2)} and suppose that Assumption 3 holds. Suppose further that W is constant and symmetric, and that Π_{span(X)^⊥}WΠ_{span(X)^⊥} is nonzero. Then β̌ = β̂ and Ω̌ = Ω̂_W satisfy Assumption 1 with N = ∅. Let T be of the form (2.4) with β̌ = β̂, Ω̌ = Ω̂_W, and N = ∅. Then the lower bound (5.4), i.e.,

sup_{Σ∈C(F)} P_{μ_0,σ^2 Σ}(T ≥ C) ≥ P_{0,I_n}(ω̂_W ≥ 0),

holds for every critical value C, −∞ < C < ∞, for every μ_0 ∈ M_0, and for every σ^2 ∈ (0, ∞).

The previous corollary shows that the size of the test is bounded from below by the probability that the long-run-variance estimator ω̂_W used in the construction of the test statistic is nonnegative, where the probability is taken under N(0, I_n)-distributed errors. For consistent long-run-variance estimators this probability approaches 1 as sample size increases, and hence the size of tests based on such estimators ω̂_W will eventually exceed any prescribed nominal significance level α ∈ (0, 1). Additionally, it is shown in that corollary that for nonnegative critical values (the standard in applications) the probability P_{0,I_n}(ω̂_W ≥ 0) also provides an upper bound, via (5.5), on the maximal power of the test under i.i.d. errors. Thus, if the lower bound in (5.4) is small, and hence (5.4) does not tell us much about size, the inequality in (5.5) shows that power must then be small over a substantial subset of the parameter space (unless perhaps one chooses a negative critical value).
To get an idea of the magnitude of the lower (upper) bound in (5.4) ((5.5)) in a special case, we computed P_{0,I_n}(ω̂_W ≥ 0) numerically for the rectangular kernel, i.e., for W_ij = 1_{(−1,1)}((i − j)/M), for the cases where Assumption 3 is satisfied with k_F = k ∈ {1, 2, 3, 4, . . . , 10}, respectively, sample size n = 150, and bandwidth parameter M = bn for a grid of values of b in (0, 1]; the results are shown in Figure 1. 18 For all values of b and k the probability P_{0,I_n}(ω̂_W ≥ 0) is quite large, in particular larger than 1/4, and thus exceeds commonly used significance levels. Thus, as a consequence of (5.4), one has strong size distortions regardless of the values of b and C chosen if one decides to use a test based on the rectangular kernel. Together with (5.5), Figure 1 also shows that for a large range of b's the power under i.i.d. errors of the corresponding tests (with nonnegative critical value C) can nowhere exceed 0.8, no matter how strong the deviation from the null hypothesis might be (and this bound even falls to 0.6 if the case k_F = k = 1 is disregarded). Note also that the probability P_{0,I_n}(ω̂_W ≥ 0) can easily be obtained numerically in any other case, as it is the probability that a quadratic form in a standard Gaussian random vector is nonnegative (for the actual computation we used the algorithm of Davies (1980)).
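The probability P_{0,I_n}(ω̂_W ≥ 0) can also be approximated by plain simulation instead of Davies' exact algorithm. The sketch below (sample size, trend order k, bandwidth fraction b, and replication count are illustrative choices) draws standard Gaussian vectors, projects them off a polynomial-trend design, and checks the sign of the rectangular-kernel quadratic form:

```python
import numpy as np

def prob_lrv_nonnegative(n, k, b, reps=2000, seed=0):
    """Monte Carlo estimate of P_{0,I_n}(omega_W >= 0) for the
    rectangular kernel W_ij = 1{|i - j| < M}, M = b*n, and a
    polynomial-trend design X = (1, t, ..., t^{k-1}). Since omega_W
    is a quadratic form in the OLS residuals of a standard Gaussian
    vector, the exact probability could instead be obtained with
    Davies' (1980) algorithm."""
    rng = np.random.default_rng(seed)
    t = np.arange(1, n + 1, dtype=float)
    X = np.column_stack([t ** j for j in range(k)])
    Q, _ = np.linalg.qr(X)                       # orthonormal basis of span(X)
    i = np.arange(n)
    W = (np.abs(i[:, None] - i[None, :]) < b * n).astype(float)
    hits = 0
    for _ in range(reps):
        y = rng.standard_normal(n)
        u = y - Q @ (Q.T @ y)                    # OLS residuals
        hits += (u @ W @ u >= 0.0)
    return hits / reps

p = prob_lrv_nonnegative(n=150, k=2, b=0.2)
assert 0.0 <= p <= 1.0
```

For the grid of (k, b) values used for Figure 1 one would simply loop this function over the grid (or replace the Monte Carlo step by the exact quadratic-form computation).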
The assumption of W being data-independent, i.e., constant as a function of y ∈ R n , in the previous two corollaries is not satisfied for the important class of 17 For b ∈ {0.994, . . . , 1} the matrix W has all entries equal to one, implying thatω W and thusΩ W are identically zero. This is an uninteresting case and falls outside the scope of Corollary 5.5. [If one insists on using the corresponding test statistic T as defined in (2.4), T is then identically zero, leading to a useless testing procedure.] Of course, for such values of b the probability P 0,In (ω W ≥ 0) equals one, explaining the sharp increase of the graph in Figure 1 for b close to 1.
18 The corresponding figure in early versions of this paper was incorrect due to a coding error. Furthermore, to emphasize that the functions shown in the figure are step functions, we now use a finer grid for b in the computation than in the early versions, and the vertical connecting lines were added to facilitate readability. Additionally, note that the case k_F = k = 1 had not been considered in any previous version.
long-run-variance estimators that incorporate prewhitening or data-dependent bandwidth parameters (e.g., Andrews (1991), Andrews and Monahan (1992), and Newey and West (1994)). An additional complication for such estimators is that the corresponding weights matrix W(y), and thus also Ω̂_W, are in general not well-defined for every y ∈ R^n. Nevertheless, after a careful structural analysis of such estimators (similar to the results obtained in Section 3.3 of Preinerstorfer (2017)), one can typically show that the resulting test statistic satisfies the assumptions of Theorem 5.2 above, and thus one can obtain suitable versions of the above corollaries tailored towards test statistics based on specific classes of prewhitened long-run-variance estimators with data-dependent bandwidth parameters. To make this more concrete, we provide in the following such a result for a widely used procedure in that class. We consider a version of the AR(1)-prewhitened long-run-variance estimator based on auxiliary AR(1) models for bandwidth selection and the Quadratic-Spectral kernel, as discussed in Andrews and Monahan (1992). This is a long-run-variance estimator as in (5.3), where the weights matrix is obtained as follows (the set where all involved quantities are well-defined is given in (5.7) further below): Let ρ̂ be the AR(1) coefficient estimate defined in (5.6), and define v̂_i(y) = û_{i+1}(y) − ρ̂(y)û_i(y) for i = 1, . . . , n − 1, which one can write in an obvious way as v̂(y) = A(ρ̂(y))û(y) with ρ → A(ρ) ∈ R^{(n−1)×n} a continuous function on R. Define the data-dependent bandwidth parameter M_AM via

M_AM(y) = 1.3221 [4nρ̂^2(y)/(1 − ρ̂(y))^4]^{1/5}.

The long-run-variance estimator ω̂_{W_AM} is now obtained (granted the involved expressions are well-defined) by choosing W in (5.3) equal to the weights matrix W_AM(y) built from A(ρ̂(y)) and [κ_QS(|i − j|/M_AM(y))]^{n−1}_{i,j=1}, where [κ_QS(|i − j|/M_AM(y))]^{n−1}_{i,j=1} is defined as I_{n−1} in case M_AM(y) = 0 holds (cf., e.g., p. 821 in Andrews (1991) for a definition of the Quadratic-Spectral kernel κ_QS).
The corresponding covariance matrix estimator Ω̂_{W_AM} is then given by plugging ω̂_{W_AM} into (5.2). The set where W_AM (and hence Ω̂_{W_AM}) is well-defined is easily seen to coincide with the set of all y ∈ R^n such that the AR(1) coefficient estimates involved are well-defined and are not equal to 1, i.e., with the set given in (5.7). Define N_AM as the complement of the set (5.7) in R^n. A result concerning size properties of polynomial trend tests based on the long-run-variance estimator ω̂_{W_AM} is now obtained by combining Theorem 5.2 above with results obtained in Lemma D.3 in Appendix D, showing, in particular, that β̂ and Ω̂_{W_AM} satisfy Assumption 1 with N = N_AM, provided N_AM ≠ R^n holds. Note that (i) the condition N_AM ≠ R^n only depends on properties of the design matrix X and hence can be checked, and that (ii) in case N_AM = R^n, the matrix Ω̂_{W_AM} is nowhere well-defined, and tests based on this estimator hence break down in a trivial way.
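A sketch of the AR(1)-prewhitened Quadratic-Spectral construction just described may help fix ideas. The recoloring convention (division by (1 − ρ̂)²) and the treatment of boundary cases below follow the usual Andrews–Monahan recipe and are simplifying assumptions, not the paper's exact definitions:

```python
import numpy as np

def kappa_qs(x):
    """Quadratic-Spectral kernel (Andrews 1991, p. 821), kappa_QS(0) = 1."""
    x = np.asarray(x, dtype=float)
    z = 6.0 * np.pi * x / 5.0
    with np.errstate(divide="ignore", invalid="ignore"):
        val = 25.0 / (12.0 * np.pi ** 2 * x ** 2) * (np.sin(z) / z - np.cos(z))
    return np.where(x == 0.0, 1.0, val)

def lrv_ar1_prewhitened_qs(u):
    """Sketch of an AR(1)-prewhitened QS long-run-variance estimator in
    the spirit of Andrews and Monahan (1992): estimate the AR(1)
    coefficient rho from the residuals u, prewhiten, choose the
    bandwidth M_AM = 1.3221 * (4*rho^2*n / (1-rho)^4)^{1/5}, smooth the
    prewhitened series with the QS kernel, and recolor by (1-rho)^{-2}."""
    u = np.asarray(u, dtype=float)
    n = len(u)
    rho = (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])   # AR(1) coefficient estimate
    if rho == 1.0:
        raise ValueError("estimator not well-defined when rho_hat equals 1")
    v = u[1:] - rho * u[:-1]                     # prewhitened residuals
    m = len(v)
    M = 1.3221 * (4.0 * rho ** 2 * n / (1.0 - rho) ** 4) ** 0.2
    i = np.arange(m)
    # kernel weights matrix, defined as the identity when M_AM = 0
    K = kappa_qs(np.abs(i[:, None] - i[None, :]) / M) if M > 0 else np.eye(m)
    omega_v = (v @ K @ v) / n
    return omega_v / (1.0 - rho) ** 2            # recoloring step

rng = np.random.default_rng(1)
u = rng.standard_normal(200)
omega = lrv_ar1_prewhitened_qs(u)
assert np.isfinite(omega)
```

Since the QS kernel is a positive definite function, the smoothed quadratic form (and hence the recolored estimate) is nonnegative, in line with the discussion following Corollary 5.5.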
Remark 5.7. In the special case where Assumption 3 is satisfied with R_{·1} ≠ 0, appropriate versions of Corollaries 5.4, 5.5, and 5.6 maintaining only F ⊇ F_{AR(1)} can be obtained by perusing Remark 5.3. We abstain from spelling out details. A similar remark applies to Corollaries 5.8, 5.9, and 5.10 given in the next subsection.

Properties of some recently suggested tests for hypotheses on polynomial trends
In this subsection we discuss finite sample properties of classes of tests for polynomial trends that have been suggested in Vogelsang (1998) and Bunzel and Vogelsang (2005). We start with a discussion of the tests introduced in the former article. Vogelsang (1998) introduces two classes of tests for testing hypotheses on trends, in particular polynomial trends. From Section 3.2 of Vogelsang (1998) it is not difficult to see that these classes of test statistics (i.e., the classes referred to as PS^i_T and PSW^i_T in that reference) are (possibly up to a constant positive multiplicative factor that can be absorbed into the critical value) of the form (2.4). More specifically, the test statistics in Vogelsang (1998) are based on a combination of one of the two estimators β̂_V, V ∈ {A, I_n}, given in (5.8), with a corresponding covariance estimator Ω̂^Vo_{c,U,i,V} of the form given in (5.9). Here A is the n × n-dimensional matrix that has 0 above the main diagonal and 1 on and below the main diagonal, c is a real number, 19 and U is an n × m-dimensional matrix (with m ≥ 1) such that (X, U) is of full column-rank k + m < n. [In Vogelsang (1998) the column vectors of U correspond to polynomial trends of an order exceeding the polynomial trends already contained in span(X).] Furthermore, J^1_{n,U}(y) is defined in (5.10) in terms of Gβ̂_{(X,U)}(y), and J^2_{n,U}(y) is defined analogously, with G = (0, I_m) ∈ R^{m×(k+m)}, where we use the notation s^2_{D_1,D_2}(y) = n^{−1} y′D_1′Π_{span(D_1D_2)^⊥}D_1y for nonsingular D_1 ∈ R^{n×n} and for D_2 ∈ R^{n×l} of rank l ≤ n. It is obvious from the above expressions that the covariance estimator Ω̂^Vo_{c,U,i,V} is not well-defined on all of R^n. However, it is also not difficult to see that the set where such an estimator is well-defined coincides with R^n \ span(X, U); see the proof of Lemma D.4 in Appendix D. We stress once more that the matrix U used in the construction above is chosen in a particular way in Vogelsang (1998).
We do not impose such a restriction here, because it would unnecessarily complicate the presentation of the result below, and because this restriction is actually not necessary for establishing the result. The following result now shows, in particular, that the tests suggested in Vogelsang (1998) suffer from substantial size distortions in case F ⊇ F^{ext}_{AR(2)}.

Corollary 5.8. Let F ⊆ F_all satisfy F ⊇ F^{ext}_{AR(2)} and suppose Assumption 3 holds. Let V ∈ {A, I_n}, c ∈ R, i ∈ {1, 2}, and let U be an n × m-dimensional matrix with m ≥ 1, k + m < n, such that (X, U) is of full column-rank. Then β̌ = β̂_V and Ω̌ = Ω̂^Vo_{c,U,i,V} satisfy Assumption 1 with N = span(X, U). Let T be of the form (2.4) with β̌ = β̂_V, Ω̌ = Ω̂^Vo_{c,U,i,V}, and N = span(X, U). Then sup_{Σ∈C(F)} P_{μ_0,σ^2 Σ}(T ≥ C) = 1 holds for every critical value C, −∞ < C < ∞, for every μ_0 ∈ M_0, and for every σ^2 ∈ (0, ∞).
Next we turn to the tests introduced in Bunzel and Vogelsang (2005). We first discuss the tests introduced in that article with data-independent tuning parameters and data-independent critical values: These tests are based on the OLS estimator β̂ and two classes of covariance matrix estimators, both of which incorporate a tuning parameter c ∈ R. The first class is defined as

Ω̂^{BV,J}_{W,U,c}(y) = ω̂_W(y) exp(cJ^1_{n,U}(y))R(X′X)^{−1}R′,     (5.11)

where U is an n × m-dimensional matrix with m ≥ 1 such that (X, U) is of full column-rank k + m < n (note that ω̂_W and J^1_{n,U} have been defined in (5.3) and (5.10) above); the second class, Ω̂^{BV}_{W,c}, is defined in (5.12) and involves the matrix A defined below (5.9). The subsequent result applies, in particular, if W_ij = κ(|i − j|/M), where M > 0 is a (fixed) real number and κ is a kernel function such that W is positive definite, including the recommendation in Bunzel and Vogelsang (2005) to use the Daniell kernel. In that case, and more generally whenever Π_{span(X)^⊥}WΠ_{span(X)^⊥} is nonzero and nonnegative definite (with W constant 20 and symmetric), the subsequent corollary shows that the above-mentioned tests in Bunzel and Vogelsang (2005) have size equal to one if F ⊇ F^{ext}_{AR(2)}; in case Π_{span(X)^⊥}WΠ_{span(X)^⊥} is nonzero but not nonnegative definite, a lower bound on the size is obtained, which also provides an upper bound for the power in the case of i.i.d. errors. A discussion similar to the one following Corollary 5.5 also applies here (cf. also Figure 1).

Corollary 5.9. Let F ⊆ F_all satisfy F ⊇ F^{ext}_{AR(2)} and suppose Assumption 3 holds. Suppose that W is constant and symmetric, that Π_{span(X)^⊥}WΠ_{span(X)^⊥} is nonzero, and that c ∈ R. Furthermore, for the statements that involve U, suppose U is an n × m-dimensional matrix with m ≥ 1 such that (X, U) is of full column-rank k + m < n. Then β̌ = β̂ and Ω̌ = Ω̂^{BV}_{W,c} (β̌ = β̂ and Ω̌ = Ω̂^{BV,J}_{W,U,c}, respectively) satisfy Assumption 1 with N = span(X) (N = span(X, U), respectively).
Let T be of the form (2.4) with β̌ = β̂, Ω̌ = Ω̂^{BV}_{W,c}, and N = span(X), or with β̌ = β̂, Ω̌ = Ω̂^{BV,J}_{W,U,c}, and N = span(X, U). Then the lower bound

sup_{Σ∈C(F)} P_{μ_0,σ^2 Σ}(T ≥ C) ≥ P_{0,I_n}(ω̂_W ≥ 0)

holds for every critical value C, −∞ < C < ∞, for every μ_0 ∈ M_0, and for every σ^2 ∈ (0, ∞).
We shall now turn to the approach Bunzel and Vogelsang (2005) suggest for practical applications. This approach is based on a data-driven selection of the weights matrix W and of the tuning parameter c, and on a data-driven selection of the critical value C. Their approach is as follows: Bunzel and Vogelsang (2005) focus on ω̂_W based on the Daniell kernel. More specifically, they set W_ij = κ_D(|i − j|/max(bn, 2)) (cf. Bunzel and Vogelsang (2005), Appendix B, for a definition of the Daniell kernel). Recall that, regardless of the value of b, the matrix with elements W_ij = κ_D(|i − j|/max(bn, 2)) based on the Daniell kernel is positive definite. The authors recommend choosing b as a positive piecewise constant function of ρ̂ (which has been defined in (5.6) above); more precisely, for constants a_i ∈ (0, ∞), i = 0, . . . , m (m ∈ N), and ā_i ∈ R, i = 1, . . . , m, they suggest to use the piecewise constant function b_BV(y, a, ā) determined by these constants. For a recommendation concerning the choice of these constants see Bunzel and Vogelsang (2005), p. 388. Furthermore, Bunzel and Vogelsang (2005) suggest to choose their data-driven critical value C and a data-driven tuning parameter c as polynomial functions of b_BV(y, a, ā), respectively. More precisely, for constants h_0, . . . , h_{m′} ∈ R (m′ ∈ N, h_{m′} ≠ 0) and p_0, . . . , p_{m′′} ∈ R (m′′ ∈ N, p_{m′′} ≠ 0), they suggest to use

C_BV(y, h) = ∑_{i=0}^{m′} h_i b^i_BV(y, a, ā)  and  c_BV(y, p) = ∑_{i=0}^{m′′} p_i b^i_BV(y, a, ā).

Then they set b = b_BV(y, a, ā) and c = c_BV(y, p), and define, in correspondence with (5.11) and (5.12), the covariance estimators Ω̂^{BV}_{a,ā,h,p} and Ω̂^{BV,J}_{U,a,ā,h,p}. The vectors of (constant) tuning parameters a = (a_0, . . . , a_m)′, ā = (ā_1, . . . , ā_m)′, h = (h_0, . . . , h_{m′})′, and p = (p_0, . . . , p_{m′′})′ on which this approach is based are tabulated in Bunzel and Vogelsang (2005) for certain cases, and need to be obtained numerically, following the rationale in Bunzel and Vogelsang (2005), for the cases not tabulated in that paper. Furthermore, the data-driven tuning parameters b_BV and c_BV as well as the data-driven critical value C_BV are well-defined for a given y ∈ R^n if and only if ρ̂(y) is well-defined, i.e., these quantities are well-defined on the complement of the closed set Ñ given in (5.14). Clearly, span(X) is contained in Ñ. Hence, it is not difficult to see that the estimator Ω̂^{BV}_{a,ā,h,p} is well-defined on R^n \ Ñ and that the estimator Ω̂^{BV,J}_{U,a,ā,h,p} is well-defined on R^n \ (span(X, U) ∪ Ñ). In fact, under Assumption 3 we have that Ñ = span(X) (see the proof of the subsequent corollary). Consequently, under Assumption 3, the estimator Ω̂^{BV}_{a,ā,h,p} is well-defined on R^n \ span(X) and Ω̂^{BV,J}_{U,a,ā,h,p} is well-defined on R^n \ span(X, U). [In order that the data-driven critical value is also defined for every y, we set C_BV(y, h) equal to an arbitrary value (0, say) on the null set Ñ. Of course, the choice of assignment on this null set is inconsequential for the result below.] The following corollary shows that the tests for hypotheses concerning polynomial trends based on data-driven tuning parameters and a data-driven critical value as suggested in Bunzel and Vogelsang (2005) have size one in case F ⊇ F^{ext}_{AR(2)}. The proof of this is based on a similar approach as used in the proof of Corollary 5.9 above, but has to deal with the fact that the choice of the tuning parameters and the critical value is data-driven, and hence is more involved. In particular, it turns out that, in order for Assumption 1 to be satisfied for the covariance estimators used here, one has to work with null sets N_{BV,U} and N_BV that are larger than span(X, U) and span(X), respectively.

Corollary 5.10. Let F ⊆ F_all satisfy F ⊇ F^{ext}_{AR(2)} and suppose Assumption 3 holds. Then, for the tests just described, sup_{Σ∈C(F)} P_{μ_0,σ^2 Σ}(T ≥ C_BV) = 1 holds for every μ_0 ∈ M_0 and for every σ^2 ∈ (0, ∞).
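The positive definiteness of the Daniell-kernel weights matrix invoked above (and holding regardless of the value of b) can be checked numerically for any given n; the following small sketch, with illustrative choices of n and b, inspects the eigenvalues of W:

```python
import numpy as np

def kappa_daniell(x):
    """Daniell kernel kappa_D(x) = sin(pi x)/(pi x), kappa_D(0) = 1."""
    # numpy's sinc is the normalized sinc sin(pi x)/(pi x)
    return np.sinc(np.asarray(x, dtype=float))

n, b = 40, 0.1
M = max(b * n, 2.0)                      # bandwidth rule max(bn, 2)
i = np.arange(n)
W = kappa_daniell(np.abs(i[:, None] - i[None, :]) / M)
eig = np.linalg.eigvalsh(W)
# kappa_D is a positive definite function (its Fourier transform is a
# nonnegative rectangular density), so W should be positive definite
# up to numerical round-off:
assert eig.min() > -1e-8
```

The smallest eigenvalues are extremely close to zero for n much larger than M (the matrix is nearly band-limited), which is why the check allows a small numerical tolerance.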
Remark 5.11. Alternatively one can consider T*, where T*(y) is given by the expression in (2.4) (with Ω̌ = Ω̂^{BV}_{a,ā,h,p}) for all y ∈ R^n \ span(X) such that Ω̂^{BV}_{a,ā,h,p}(y) is nonsingular, and where T*(y) = 0 else (and we can similarly define a test statistic T** with Ω̂^{BV,J}_{U,a,ā,h,p} and span(X, U) in place of Ω̂^{BV}_{a,ā,h,p} and span(X), respectively). While T* and T** are well-defined test statistics, we are not guaranteed that β̂ and Ω̂^{BV}_{a,ā,h,p} (β̂ and Ω̂^{BV,J}_{U,a,ā,h,p}, respectively) satisfy Assumption 1 with N = span(X) (N = span(X, U), respectively). However, T* as well as T** differ from the corresponding test statistics considered in the preceding corollary at most on a null set, hence the conclusions of the corollary carry over to T* and T**.

Cyclical trends
We here consider briefly the case when one tests hypotheses concerning a cyclical trend, i.e., when the following assumption is satisfied:

Assumption 4. Suppose that X = (E_{n,0}(ω), X̃) for some ω ∈ (0, π), where X̃ is an n × (k − 2)-dimensional matrix such that X has rank k (here X̃ is the empty matrix if k = 2). Furthermore, suppose that the restriction matrix R has a nonzero column R_{·i} for some i = 1, 2, i.e., the hypothesis involves coefficients of the cyclical component.
Under this assumption we obtain the subsequent theorem from Theorem 3.10.
Under a slightly stronger condition on F, the following theorem is applicable in case the assumption that N = ∅ or the nonnegative definiteness assumption onΩ in the previous theorem are violated.
Using these results, one can now obtain similar results as in Subsection 5.1.2 concerning the tests developed in Vogelsang (1998) and Bunzel and Vogelsang (2005) under Assumption 4. Due to space constraints, however, we do not spell out the details.
(ii) In case ω = 0 (or ω = π), Theorem 5.12 (with the before-mentioned interpretation of Assumption 4) in fact continues to hold under the weaker assumption that F ⊇ F_{AR(1)}. 21 This follows from Part 3 of Corollary 5.17 in Preinerstorfer and Pötscher (2016) upon noting that Z = span(Ē_{n,0}(ω)) is a concentration space of the covariance model C(F), that Ω̌ vanishes on span(X) ⊇ Z as a consequence of the assumption N = ∅ (see the discussion following (27) in Preinerstorfer and Pötscher (2016)), and that Rβ̌(z) ≠ 0 for every z ∈ Z with z ≠ 0. 22

(iii) In case ω = 0 (or ω = π), Theorem 5.13 (with the before-mentioned interpretation of Assumption 4) also continues to hold under the weaker assumption that F ⊇ F_{AR(1)}, provided the random variable appearing in the lower bound is defined via ((Ē_{n,0}(ω)Ē_{n,0}(ω)′)^{1/2} + D(ω)^{1/2})G on the event {((Ē_{n,0}(ω)Ē_{n,0}(ω)′)^{1/2} + D(ω)^{1/2})G ∈ R^n \ N*} and by ξ_ω(x) = 0 otherwise, and provided the distribution P_{0,I_n} appearing in the lower bound is replaced accordingly; here D(0) is the matrix D given in Part 3 and D(π) is the matrix D given in Part 4 of Lemma G.1 in Preinerstorfer and Pötscher (2016). This can be proved by making use of Theorem 5.19 and Lemma G.1 in Preinerstorfer and Pötscher (2016).
Remark A.2. (i) By construction $J(L, C) = J(L, C') = J(L, C'')$. Furthermore, all three collections coincide with the collection of all concentration spaces of $C'$ (the union over which is $J(C')$ in the notation of Preinerstorfer and Pötscher (2016)).
(ii) The sum S + L is an orthogonal sum and hence S is uniquely determined.
(iii) The map $\Sigma \mapsto \Sigma'$ is surjective from $C$ onto $C'$ by definition, and the analogous statement holds for the map $\Sigma \mapsto \Sigma''$. But these maps need not be injective.
Lemma A.3. Let $C$ be a covariance model and let $L$ be a linear subspace of $\mathbb{R}^n$ with $\dim(L) < n$. Furthermore, let $W \subseteq \mathbb{R}^n$ be a rejection region of a test which is $G(a+L)$-invariant for some $a \in \mathbb{R}^n$. Then for every $\sigma$, $0 < \sigma < \infty$, and every $\Sigma \in C$ we have $P_{a,\sigma^2\Sigma}(W) = P_{a,\sigma^2 L(\Sigma)}(W) = P_{a,\sigma^2\Sigma'}(W) = P_{a,\sigma^2\Sigma''}(W)$.
Furthermore, these probabilities do not depend on σ and they are unaffected if a is replaced by an arbitrary element of a + L.
Proof. The first claim is essentially proved by the argument establishing (B.1) in Appendix B of Pötscher and Preinerstorfer (2018). The second claim is an immediate consequence of the assumed invariance (cf. also Proposition 5.4 in Preinerstorfer and Pötscher (2016)).
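The invariance underlying Lemma A.3 can be checked numerically in a toy example: any statistic that is invariant under the group $y \mapsto \delta(y - a) + a + l$ ($\delta \neq 0$, $l \in L$) necessarily has rejection probabilities that do not depend on $\sigma$ or on shifts of $a$ within $a + L$. A minimal sketch with an artificial invariant statistic (not one from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
l0 = rng.standard_normal(n)              # L = span(l0)
a = rng.standard_normal(n)               # invariance group G(a + L)
c = rng.standard_normal(n)

P_L = np.outer(l0, l0) / (l0 @ l0)       # orthogonal projection onto L
P_perp = np.eye(n) - P_L                 # projection onto the orthogonal complement of L

def T(y):
    # an artificial statistic invariant under y -> delta*(y - a) + a + l, delta != 0, l in L
    z = P_perp @ (y - a)
    return (c @ z) ** 2 / (z @ z)

y = rng.standard_normal(n)
for delta in (0.5, -3.0, 10.0):
    for coef in (0.0, 1.0, -2.5):
        assert abs(T(delta * (y - a) + a + coef * l0) - T(y)) < 1e-10
```

Pointwise invariance of this kind is exactly what makes the rejection probabilities in the lemma independent of $\sigma$ and of the choice of the element of $a + L$.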
Proof of Theorem 3.1. By monotonicity w.r.t. $C$ we may assume $C > 0$. Note that $\dim(M_0^{lin}) = k - q < n$ by our general model assumptions. Since $T$ is $G(M_0)$-invariant by Lemma 5.16 in Preinerstorfer and Pötscher (2016), the preceding Lemma A.3, applied with $L = M_0^{lin}$ and $a = \mu_0$, hence shows that it suffices to prove the theorem with $C$ replaced by $C'$. By Lemma A.1, also applied with $L = M_0^{lin}$, the space $S$ appearing in the formulation of the theorem is a concentration space of $C'$. We now apply Part 3 of Corollary 5.17 of Preinerstorfer and Pötscher (2016) to the linear model (2.1) considered in the present paper, but with $C$ replaced by $C'$. All assumptions of that result, except for the assumption that $\hat\Omega(z) = 0$ and $R\hat\beta(z) \neq 0$ simultaneously hold $\lambda_S$-almost everywhere, are easily seen to be satisfied. We now verify the remaining assumption as follows: The discussion following (27) in Section 5.4 of Preinerstorfer and Pötscher (2016) shows that in case $N = \emptyset$ (which is assumed here) $\hat\Omega(z) = 0$ holds for every $z \in \operatorname{span}(X)$, and thus for every $z \in S$ (since $S \subseteq \operatorname{span}(X)$ has been assumed). Hence, $\hat\Omega(z) = 0$ $\lambda_S$-almost everywhere follows (note that $\lambda_S(\mathbb{R}^n \setminus S) = 0$ trivially holds). Furthermore, Assumption 1 together with $N = \emptyset$ implies that $\hat\beta(X\gamma) = \hat\beta(\varepsilon \cdot 0 + X\gamma) = \varepsilon\hat\beta(0) + \gamma$ for every $\gamma \in \mathbb{R}^k$ and every $\varepsilon \neq 0$, which of course implies $\hat\beta(X\gamma) = \gamma$ for every $\gamma \in \mathbb{R}^k$. Since we have assumed $S \subseteq \operatorname{span}(X)$, it follows on the one hand that for every $z \in S$ we have $R\hat\beta(z) = 0$ if and only if $z \in M_0^{lin}$. On the other hand, by construction $S \subseteq (M_0^{lin})^\perp$ holds, showing that $R\hat\beta(z) \neq 0$ must hold for all nonzero $z \in S$. Since $S$ cannot be zero-dimensional in view of its definition (cf. the discussion in Pötscher and Preinerstorfer (2018) following Definition 5.1), $\lambda_S(\{0\}) = 0$ follows, which completes the proof.
Proof of Corollary 3.3. Necessity follows immediately from Theorem 3.1. For sufficiency we apply Corollary 5.6 in Pötscher and Preinerstorfer (2018) with $V = \{0\}$, i.e., with $L = M_0^{lin}$: Observe that $\dim(L) = k - q < n$ holds, and that $T$ and $N^\dagger = N^*$ satisfy the assumptions of this corollary in view of Lemma 5.16 in the same reference. Since $N^* = \operatorname{span}(X)$ is assumed, the condition $S \not\subseteq \operatorname{span}(X)$ for every $S \in J(M_0^{lin}, C)$ implies $\mu_0 + S \not\subseteq N^* = N^\dagger$ for every $\mu_0 \in M_0$ (as $\operatorname{span}(X)$ is obviously invariant under addition of elements $\mu_0 \in M_0$) and for every $S \in J(M_0^{lin}, C)$. An application of Corollary 5.6 in Pötscher and Preinerstorfer (2018) now delivers (3.2).
Theorem A.4. Let $C$ be a covariance model. Let $T$ be a nonsphericity-corrected F-type test statistic of the form (2.4) based on $\hat\beta$ and $\hat\Omega$ satisfying Assumption 1. Assume further that $q = 1$, that $\hat\beta = \hat\beta_X$, and that $\hat\Omega(y)$ is nonnegative definite for every $y \in \mathbb{R}^n \setminus N$. Suppose there exists an $S \in J(M_0^{lin}, C)$ with the property that $s \in \mathbb{R}^n \setminus N$ and $s \in N^*$ hold for $\lambda_S$-almost all $s \in S$. Furthermore, assume that $S$ is not orthogonal to $\operatorname{span}(X)$. Then (3.1) holds for every critical value $C$, $-\infty < C < \infty$, for every $\mu_0 \in M_0$, and for every $\sigma^2 \in (0, \infty)$.
Proof. The proof proceeds as the proof of Theorem 3.1 up to the point where Part 3 of Corollary 5.17 of Preinerstorfer and Pötscher (2016) is applied to the linear model (2.1), but with $C$ replaced by $C'$. Here now all assumptions of this result in Preinerstorfer and Pötscher (2016) are easily seen to be satisfied, except for (i) $\hat\Omega(s) = 0$ $\lambda_S$-almost everywhere, and (ii) $R\hat\beta(s) \neq 0$ $\lambda_S$-almost everywhere. Since $s \in N^*$ holds for $\lambda_S$-almost all $s \in S$ by assumption, we have that $\hat\Omega(s)$ is singular for $\lambda_S$-almost all $s \in S$. But this implies $\hat\Omega(s) = 0$ for $\lambda_S$-almost all $s \in S$, since $q = 1$ has been assumed. Since trivially $\lambda_S(\mathbb{R}^n \setminus S) = 0$, this verifies (i). We turn to (ii): Let $s \in S$. Note that then $s \in (M_0^{lin})^\perp$ by construction of $S$, and hence also $\Pi_{\operatorname{span}(X)} s \in (M_0^{lin})^\perp$ (since $M_0^{lin} \subseteq \operatorname{span}(X)$). Furthermore, $R\hat\beta(s) = R\hat\beta_X(s) = R\hat\beta_X(\Pi_{\operatorname{span}(X)} s)$, because $\hat\beta_X$ depends on its argument only through its projection onto $\operatorname{span}(X)$. Hence, $R\hat\beta(s) = 0$ if and only if $R\hat\beta_X(\Pi_{\operatorname{span}(X)} s) = 0$, which in turn is equivalent to $\Pi_{\operatorname{span}(X)} s \in M_0^{lin}$ (since $\Pi_{\operatorname{span}(X)} s \in \operatorname{span}(X)$). But since $\Pi_{\operatorname{span}(X)} s$ also belongs to $(M_0^{lin})^\perp$ as shown before, we conclude that $R\hat\beta(s) = 0$ holds if and only if $\Pi_{\operatorname{span}(X)} s = 0$. As a consequence, $\{s \in S : R\hat\beta(s) = 0\} = \{s \in S : \Pi_{\operatorname{span}(X)} s = 0\} = S \cap \ker(\Pi_{\operatorname{span}(X)})$. This is a proper linear subspace of $S$ except in case $S \subseteq \ker(\Pi_{\operatorname{span}(X)})$, which, however, is impossible by the assumption that $S$ is not orthogonal to $\operatorname{span}(X)$. Hence, $R\hat\beta(s) = 0$ only occurs on a proper linear subspace of $S$, and hence on a subset of $S$ that has $\lambda_S$-measure zero. Since trivially $\lambda_S(\mathbb{R}^n \setminus S) = 0$, this proves (ii) and completes the proof.

A.1. Some comments on Lemmata A.1 and A.3
Lemmata A.1 and A.3 allow one to derive results regarding the rejection probabilities under a covariance model $C$ by working with a different, though related, covariance model $C'$. [By Lemma A.1 this related covariance model has the property that its concentration spaces in the sense of Preinerstorfer and Pötscher (2016) are precisely given by the elements $S$ of $J(L, C)$.] A case in point is Theorem 3.1 in Section 3.1, which provides a "size one" result for the covariance model $C$, and which has been derived by applying Part 3 of Corollary 5.17 in Preinerstorfer and Pötscher (2016) to the covariance model $C'$, after an appeal to the aforementioned lemmata. In a similar vein, one can combine other results of Preinerstorfer and Pötscher (2016) with these lemmata, but we do not spell this out here. Often this will lead to improvements over what one obtains from a direct application of the respective result of Preinerstorfer and Pötscher (2016) to the covariance model $C$. We illustrate this in the following by comparing the result in Theorem 3.1 with what one gets if instead one works with the originally given $C$ and directly applies Part 3 of Corollary 5.17 in Preinerstorfer and Pötscher (2016) to $C$.
Suppose $C$ and $T$ are as in Theorem 3.1 (again with $N = \emptyset$ and nonnegative definiteness of $\hat\Omega(y)$ for every $y \in \mathbb{R}^n$). Applying Part 3 of Corollary 5.17 in Preinerstorfer and Pötscher (2016) to the originally given covariance model $C$ allows one to obtain the following result: If a concentration space $Z$ of $C$ exists that satisfies $Z \subseteq \operatorname{span}(X)$ and $Z \not\subseteq M_0^{lin}$, then (3.1) holds (for every $C$, every $\mu_0 \in M_0$, and every $\sigma^2 \in (0, \infty)$).
[To see this note that by Corollary 5.17 in Preinerstorfer and Pötscher (2016) one only has to verify that $\hat\Omega(z) = 0$ and $R\hat\beta(z) \neq 0$ hold $\lambda_Z$-almost everywhere. The argument for $\hat\Omega(z) = 0$ $\lambda_Z$-a.e. is identical to the corresponding argument given in the proof of Theorem 3.1. For the second claim, a similar argument as in the proof of Theorem 3.1 shows that $R\hat\beta(z) \neq 0$ holds $\lambda_Z$-almost everywhere (using $Z \subseteq \operatorname{span}(X)$, $Z \not\subseteq M_0^{lin}$, and $\operatorname{rank}(X) < n$).] Furthermore, observe that $S \subseteq \operatorname{span}(X)$ must also hold, since $Z \subseteq \operatorname{span}(X)$ and $M_0^{lin} \subseteq \operatorname{span}(X)$.

Theorem 3.1 will sometimes actually give a strictly better result for the following reason (at least for covariance models $C$ that are bounded, an essentially cost-free assumption in view of Remark 5.1(ii) in Pötscher and Preinerstorfer (2018)): Concentration spaces $Z$ of $C$ that satisfy $Z \subseteq \operatorname{span}(X)$ but also $Z \subseteq M_0^{lin}$ cannot be used in a direct application of Part 3 of Corollary 5.17 in Preinerstorfer and Pötscher (2016), since such spaces do not satisfy the relevant assumptions (note that $R\hat\beta(z) = 0$ for all $z \in Z$ holds for such spaces $Z$); hence they do not help in establishing a result of the form (3.1) via a direct application of Part 3 of Corollary 5.17 in Preinerstorfer and Pötscher (2016). Nevertheless, such concentration spaces can have associated with them spaces $S \in J(M_0^{lin}, C)$ in the way described in Part 2 of Lemma B.3 in Appendix B.1 of Pötscher and Preinerstorfer (2018), which then may allow one to establish (3.1) via an application of Theorem 3.1 (provided the condition $S \subseteq \operatorname{span}(X)$ can be shown to hold).
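To make the notion of a concentration space concrete, here is a minimal numerical sketch (an illustration, not the paper's formal definition): for the AR(1) covariance model, the normalized covariance matrices converge, as the correlation parameter tends to one, to a singular limit whose span is $\operatorname{span}(e)$, $e$ the vector of ones; such limiting spans are the concentration spaces of Preinerstorfer and Pötscher (2016).

```python
import numpy as np

n = 8
e = np.ones(n)
idx = np.arange(n)

def ar1_corr(rho):
    # AR(1) correlation matrix with entries rho^{|i-j|}
    return rho ** np.abs(idx[:, None] - idx[None, :])

target = np.outer(e, e) / n                    # rank-one projection onto span(e)
gaps = []
for rho in (0.9, 0.99, 0.999, 0.9999):
    S = ar1_corr(rho)
    Sn = S / np.linalg.norm(S, 2)              # normalize as in the definition of a concentration space
    gaps.append(np.linalg.norm(Sn - target, 2))

# the gap shrinks as rho -> 1: span(e) is a concentration space of the AR(1) model
assert all(g1 > g2 for g1, g2 in zip(gaps, gaps[1:]))
assert gaps[-1] < 0.01
```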

Appendix B: Proofs and auxiliary results for Section 3.2
Proof of Theorem 3.6. First, that $S \subseteq \operatorname{span}(X)$ is equivalent to $A \subseteq \operatorname{span}(X)$, where $A := \operatorname{span}(E_{n,\rho(\gamma_1)}(\gamma_1), \ldots, E_{n,\rho(\gamma_p)}(\gamma_p))$, is obvious, since any element of $A$ is the sum of an element of $S$ and an element of $M_0^{lin} \subseteq \operatorname{span}(X)$. Second, $S \subseteq \operatorname{span}(X)$, $M_0^{lin} \subseteq \operatorname{span}(X)$, and the fact that $S$ is certainly orthogonal to $M_0^{lin}$ imply $\dim(S) + \dim(M_0^{lin}) \leq \dim(\operatorname{span}(X)) = k$. Since we always maintain $k < n$, we can conclude that $\dim(S) < n - \dim(M_0^{lin})$ must hold. This together with Proposition 6.1 of Pötscher and Preinerstorfer (2018) now shows that the linear subspace $S$ figuring in the theorem belongs to $J(M_0^{lin}, C(F))$, as clearly $\dim(M_0^{lin}) = k - q < n$ holds. An application of Theorem 3.1 with $C = C(F)$ then completes the proof.
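The objects appearing in Theorem 3.6 can be illustrated numerically. The sketch below uses a hypothetical design containing the trigonometric columns and a hypothetical restriction, and checks that $S = \operatorname{span}(\Pi_{(M_0^{lin})^\perp} E)$ is contained in $\operatorname{span}(X)$, orthogonal to $M_0^{lin}$, and nontrivial, as used in the proof:

```python
import numpy as np

n, gamma = 12, 0.7
t = np.arange(1, n + 1)
E = np.column_stack([np.cos(t * gamma), np.sin(t * gamma)])  # trig columns at frequency gamma
X = np.column_stack([np.ones(n), E])                         # hypothetical design containing them (k = 3)

def proj(A):
    # orthogonal projection onto the column span of A
    return A @ np.linalg.solve(A.T @ A, A.T)

# hypothetical restriction: the cosine coefficient is zero, so M_0^lin = span(ones, sine column)
M0 = X[:, [0, 2]]
S = (np.eye(n) - proj(M0)) @ E     # columns span S = span(Pi_{(M_0^lin)^perp} E)

assert np.linalg.norm((np.eye(n) - proj(X)) @ S) < 1e-8   # S is contained in span(X)
assert np.linalg.norm(M0.T @ S) < 1e-8                    # S is orthogonal to M_0^lin
assert np.linalg.norm(S) > 1e-6                           # and S is nontrivial
```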
Proof of Lemma 3.8. If $\{\gamma\} \in S(F, L)$ holds, the definition of $S(F, L)$ (Definition 6.4 in Pötscher and Preinerstorfer (2018)) immediately implies that $\kappa(\omega(L), d(L)) + \kappa(\gamma, 1) < n$ must hold. To prove the converse, we first claim that there exists a sequence of spectral densities $f_m$ in $F$ such that the associated sequence of spectral measures $m_{f_m}$ converges weakly to a spectral measure $m$ that satisfies $\operatorname{supp}(m) \cap [0, \pi] = \{\gamma\}$.
Proof of Theorem 3.10. Since $\operatorname{span}(E_{n,\rho(\gamma)}(\gamma)) \subseteq \operatorname{span}(X)$, the quantity $\kappa(\omega(M_0^{lin}), d(M_0^{lin})) + \kappa(\gamma, 1)$ is not larger than $k$ in view of Lemma D.1 in Appendix D of Pötscher and Preinerstorfer (2018). As we always maintain $k < n$, the first claim follows. Because of the claim just established and since $F \supseteq F_{AR(2)}$, we conclude from Lemma 3.8 that $\{\gamma\} \in S(F, M_0^{lin})$ (note that $\dim(M_0^{lin}) = k - q < n$ always holds). Set $S = \operatorname{span}(\Pi_{(M_0^{lin})^\perp} E_{n,\rho(\gamma)}(\gamma))$ and observe that $S$ satisfies all the conditions of Theorem 3.6 (recall that $S \subseteq \operatorname{span}(X)$ holds if and only if $\operatorname{span}(E_{n,\rho(\gamma)}(\gamma)) \subseteq \operatorname{span}(X)$ holds, as noted in that theorem). An application of Theorem 3.6 then establishes (3.5).

Proof of Lemma B.1. Let $\gamma \in [0, \pi]$ and $c > 0$ be given. For ease of notation we set $L = M_0^{lin}$ in the remainder of the proof. We can use the argument in the proof of Lemma 3.8 to obtain a sequence of spectral densities $f_m$ in $F_{AR(2)}$ and associated densities $g_m$ such that the sequence of spectral measures $m_{g_m}$ converges weakly to the spectral measure $(\delta_{-\gamma} + \delta_\gamma)/2$. Now, set $e_m := \int_{-\pi}^{\pi} |\Delta_{\omega(L),d(L)}(e^{\iota\nu})|^2 f_m(\nu)\, d\nu$, which is a sequence of positive real numbers (since $\Delta_{\omega(L),d(L)}$ is a polynomial and $f_m$ is nonzero a.e.). Lemma D.2 in Appendix D of Pötscher and Preinerstorfer (2018) then yields a convergence statement, the convergence being due to the weak convergence of $m_{g_m}$ to $(\delta_{-\gamma} + \delta_\gamma)/2$; see Appendix D and Definition C.3 in Appendix C of Pötscher and Preinerstorfer (2018) for the definitions of $H_n$ and $\Sigma(\cdot, \cdot)$. Lemma D.3 in Appendix D of the same reference now shows that the limit just obtained can be written in terms of a positive real number $a = a(\gamma)$. Now set $\sigma_m^2 = e_m^{-1}a^{-1} + ce_m$ and define $h_m$ accordingly; observe that $h_m \in F^{ext}_{AR(2)}$ holds. The assertion of the lemma then follows.

Proof of Theorem 3.12. It suffices to prove the result for $C > 0$, which we henceforth assume. For ease of notation we set $L = M_0^{lin}$ in the remainder of the proof. Let $\gamma \in [0, \pi]$ satisfy $\operatorname{span}(E_{n,\rho(\gamma)}(\gamma)) \subseteq \operatorname{span}(X)$.
Observe first that for $\mu_0 \in M_0$, $0 < \tau^2 < \infty$, and $h \in F^{ext}_{AR(2)}$ the invariance relation (B.2) holds; this follows from $G(M_0)$-invariance of $T$ and is proved in the same way as relation (B.1) in Appendix B of Pötscher and Preinerstorfer (2018). Let now $c > 0$ and fix $\mu_0 \in M_0$, $0 < \sigma^2 < \infty$. By Lemma B.1 there exist a sequence $h_m \in F^{ext}_{AR(2)}$ and a sequence $\sigma_m^2$ of positive real numbers such that $\sigma_m^2\Sigma(h_m)$ converges to a limit matrix that is obviously nonsingular. Consequently, the corresponding Gaussian distributions converge in total variation norm (by an application of Scheffé's Lemma). By $G(M_0)$-invariance of $T$ we also have $P_{\mu_0,\sigma^2\Sigma(h_m)}(T \geq C) = P_{\mu_0,\sigma_m^2\Sigma(h_m)}(T \geq C)$, cf. Remark 5.5(iii) in Preinerstorfer and Pötscher (2016). Using (B.2), the preceding displays now imply a limiting statement for these rejection probabilities. The limit so obtained coincides, using again $G(M_0)$-invariance of $T$ similarly as in (B.2), with $P_{\mu_0,\sigma^2(E_{n,\rho(\gamma)}(\gamma)E_{n,\rho(\gamma)}(\gamma)' + cI_n)}(T \geq C)$.
Hence the additional assumption on $\Sigma_m$ appearing in Theorem 5.19 of Preinerstorfer and Pötscher (2016) is satisfied with $s_m = c_m$ and $D = \Pi_{\operatorname{span}(E_{n,\rho(\gamma)}(\gamma))^\perp}$. Note also that $\operatorname{span}(\Sigma) \subseteq M = \operatorname{span}(X)$ holds by our assumption on $\gamma$. Furthermore, since $\operatorname{span}(\Sigma) = \operatorname{span}(E_{n,\rho(\gamma)}(\gamma))$ is not contained in $L = M_0^{lin}$ in view of the definition of $\rho(\gamma)$, it follows that there exists a $z \in \operatorname{span}(\Sigma)$ with $z \notin L$. As both spaces are linear, it even follows that $z \notin L$ is true for $\lambda_{\operatorname{span}(\Sigma)}$-almost all $z \in \operatorname{span}(\Sigma)$. In view of $\operatorname{span}(\Sigma) \subseteq \operatorname{span}(X)$, this implies that $R\hat\beta(z) \neq 0$ holds $\lambda_{\operatorname{span}(\Sigma)}$-almost everywhere. Thus Theorem 5.19 of Preinerstorfer and Pötscher (2016) is applicable and delivers the claim (B.3) (setting $Z = \bar{E}_{n,\rho(\gamma)}(\gamma)$ in that theorem), upon observing that in the definition of $\bar\xi(\gamma)$ in Theorem 5.19 of Preinerstorfer and Pötscher (2016), and in the event following that definition given there, one can replace $\bar\Sigma^{1/2}$ by $\Pi_{\operatorname{span}(\Sigma)}$ due to $\operatorname{span}(\Sigma) \subseteq M$, due to the equivariance property of $\hat\Omega$ expressed in Assumption 1, and due to $G(M)$-invariance of $N^*$ (noting that in the case considered here $\Pi_{\operatorname{span}(\Sigma)} + D^{1/2}$ translates into $I_n$). It remains to show the left-most inequality in (3.6). But this is obvious upon noting that the event where $\hat\Omega(G)$ is nonnegative definite is contained in the event where $\xi_\gamma \geq 0$ holds.
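The mechanism behind Lemma B.1 and the proof above, namely that suitably scaled AR(2) covariance matrices converge to a singular limit supported on the trigonometric columns at frequency $\gamma$, can be illustrated numerically. The sketch below uses illustrative values of $n$ and $\gamma$ and an assumed AR(2) parametrization with complex roots $r e^{\pm\iota\gamma}$; the covariance matrices are obtained by numerically integrating the spectral density:

```python
import numpy as np

n, gamma = 8, 1.0                        # illustrative dimension and frequency in (0, pi)
t = np.arange(1, n + 1)
E = np.column_stack([np.cos(t * gamma), np.sin(t * gamma)])
P = E @ np.linalg.solve(E.T @ E, E.T)    # projection onto the span of the trig columns

def ar2_cov(r, m=200001):
    # covariance matrix of x_t = 2 r cos(gamma) x_{t-1} - r^2 x_{t-2} + eps_t
    # (complex AR roots r e^{+/- i gamma}), via numerical integration of its spectral density
    nu = np.linspace(-np.pi, np.pi, m)
    ar_poly = 1 - 2 * r * np.cos(gamma) * np.exp(-1j * nu) + r**2 * np.exp(-2j * nu)
    f = 1.0 / np.abs(ar_poly) ** 2
    dnu = nu[1] - nu[0]
    acov = np.array([np.sum(f * np.cos(h * nu)) * dnu for h in range(n)])
    return acov[np.abs(t[:, None] - t[None, :])]

prev = np.inf
for r in (0.9, 0.99, 0.999):
    S = ar2_cov(r)
    Sn = S / np.linalg.norm(S, 2)        # normalization as in the definition of concentration spaces
    off = np.linalg.norm(Sn - P @ Sn @ P, 2)
    assert off < prev                    # mass outside span(E) shrinks as r -> 1
    prev = off
assert prev < 0.5
```

As the roots approach the unit circle, the spectral mass concentrates at $\pm\gamma$ and the normalized covariance matrix approaches a rank-two matrix whose column space is spanned by the trigonometric columns.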

Appendix C: Proofs for Section 4
Proof of Lemma 4.1. In view of $G(M_0)$-invariance of $T$ we may set $\sigma^2 = 1$. In case $K$ is empty there is nothing to prove. Hence assume $K \neq \emptyset$. To prove Part 1, observe that then $C^*(K) > -\infty$. Choose $C \in (-\infty, C^*(K))$. Since $C < C^*(K)$, there exists an $S \in K$ with $C < C(S) \leq C^*(K)$. Now repeat, with obvious modifications, the arguments in the proof of Part 2 of Lemma 5.11 of Pötscher and Preinerstorfer (2018) that establish (25) in that reference. To prove Part 2, observe that $C_*(K) < \infty$, and choose $C \in (C_*(K), \infty)$. Then there exists an $S \in K$ with $C_*(K) \leq C(S) < C$. Now repeat, with obvious modifications, the arguments in the proof of Part 3 of Lemma 5.11 of Pötscher and Preinerstorfer (2018).
Lemma C.1. Suppose the assumptions of Lemma 4.1 are satisfied and suppose that $G$ is a subset of $K$ with the property that for any $S \in K$ there is an element $S' \in G$ such that $S' \subseteq S$ or $S' \supseteq S$ holds. Then $C^*(K) = C^*(G)$ and $C_*(K) = C_*(G)$.
Remark C.2. An example of such a collection $G$ is provided by the set of all minimal (maximal) elements of $K$ w.r.t. inclusion. Note that this set is well-defined as $K$ is a collection of linear subspaces of $\mathbb{R}^n$.
Proof. Obviously $\hat\beta$ is well-defined and continuous on all of $\mathbb{R}^n$, and thus also when restricted to $\mathbb{R}^n \setminus N$. Furthermore, $\hat\Omega$ is clearly well-defined and symmetric on $\mathbb{R}^n \setminus N$, and is continuous on $\mathbb{R}^n \setminus N$ in view of (c). Since $N$ is a closed $\lambda_{\mathbb{R}^n}$-null set by (a), we have verified Part (i) of Assumption 1 (with the set $N$ as specified). Part (ii) of this assumption is contained in (b). That $\hat\beta$ satisfies the required equivariance property in Part (iii) of Assumption 1 is obvious. That $\hat\Omega$ satisfies the required equivariance property in that assumption follows immediately from (b), completing the verification of Part (iii) of Assumption 1. Part (iv) in that assumption follows from (d) together with $R(X'H'HX)^{-1}R'$ being positive definite. The same argument also shows that $\hat\Omega$ satisfies Assumption 2. The final statement is trivial.
Lemma D.2. Suppose $W$ is constant and symmetric, and assume that $\Pi_{\operatorname{span}(X)^\perp} W \Pi_{\operatorname{span}(X)^\perp}$ is nonzero. Then the estimators $\hat\beta$ and $\hat\Omega_W$ satisfy Assumption 1 with $N = \emptyset$, and $\hat\Omega_W$ satisfies Assumption 2. If, additionally, $\Pi_{\operatorname{span}(X)^\perp} W \Pi_{\operatorname{span}(X)^\perp}$ is nonnegative definite, then $\hat\Omega_W(y)$ is nonnegative definite for every $y \in \mathbb{R}^n$.

Proof. We verify (a)-(d) in Lemma D.1 for $H = I_n$, $\nu = \hat\omega_W$, and $N = \emptyset$. Obviously (a) is satisfied, and (c) follows immediately from the constancy assumption on $W$, since $\nu = \hat\omega_W$ can clearly be written as a quadratic form in $y$. Concerning (d), note that $\hat\omega_W(y) = 0$ is equivalent to $y'\Pi_{\operatorname{span}(X)^\perp} W \Pi_{\operatorname{span}(X)^\perp} y = 0$. In view of the constancy assumption on $W$, the subset of $\mathbb{R}^n$ on which $\hat\omega_W$ vanishes is the zero set of a multivariate polynomial, in fact of a quadratic form, on $\mathbb{R}^n$. Since the (constant) matrix $\Pi_{\operatorname{span}(X)^\perp} W \Pi_{\operatorname{span}(X)^\perp}$ is symmetric and nonzero, the polynomial under consideration does not vanish everywhere on $\mathbb{R}^n$, implying that its zero set is a $\lambda_{\mathbb{R}^n}$-null set. This completes the verification of (d). That (b) is satisfied follows immediately from $\nu(y) = \hat\omega_W(y) = n^{-1} y'\Pi_{\operatorname{span}(X)^\perp} W \Pi_{\operatorname{span}(X)^\perp} y$, the constancy of $W$, and from $\Pi_{\operatorname{span}(X)^\perp}(\delta y + X\eta) = \delta\Pi_{\operatorname{span}(X)^\perp}(y)$ for every $\delta \in \mathbb{R}$, every $y \in \mathbb{R}^n$, and every $\eta \in \mathbb{R}^k$. Now apply Lemma D.1. Note that the final statement concerning nonnegative definiteness follows from the last part of Lemma D.1, since nonnegative definiteness of $\Pi_{\operatorname{span}(X)^\perp} W \Pi_{\operatorname{span}(X)^\perp}$ obviously implies nonnegativity of $\hat\omega_W$ on $\mathbb{R}^n$.
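A quick numerical check of the equivariance property (b) used in the proof, based on the quadratic-form expression $\hat\omega_W(y) = n^{-1} y'\Pi_{\operatorname{span}(X)^\perp} W \Pi_{\operatorname{span}(X)^\perp} y$ stated there (design and weight matrix below are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 10, 3
X = rng.standard_normal((n, k))
W = rng.standard_normal((n, n)); W = W + W.T     # a constant symmetric weight matrix
P_perp = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
M = P_perp @ W @ P_perp                          # assumed nonzero

def omega_W(y):
    # the quadratic form omega_W(y) = n^{-1} y' P_perp W P_perp y from the proof
    return (y @ M @ y) / n

y = rng.standard_normal(n)
eta = rng.standard_normal(k)
for delta in (2.0, -0.5, 0.0):
    # degree-2 homogeneity underlying property (b): omega_W(delta*y + X*eta) = delta^2 * omega_W(y)
    assert np.isclose(omega_W(delta * y + X @ eta), delta**2 * omega_W(y), atol=1e-9)
```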
Proof of Corollary 5.4. The statement follows upon combining Lemma D.2 with Theorem 5.1.
Proof of Corollary 5.5. The first part of the corollary follows upon combining Lemma D.2 with Theorem 5.2, noting that $\hat\Omega_W(z)$ is nonnegative definite if and only if $\hat\omega_W(z) \geq 0$. For the second statement, note that $R(X'X)^{-1}R'$ is positive definite; hence, outside of the set $\{y : R\hat\beta(y) = r\}$, which is an affine subspace of $\mathbb{R}^n$ that does not coincide with $\mathbb{R}^n$ and is thus a $\lambda_{\mathbb{R}^n}$-null set, the events $\{T \geq 0\}$ and $\{\hat\omega_W \geq 0\}$ coincide; it follows that $P_{\mu,\sigma^2 I_n}(T \geq 0)$ coincides with $P_{\mu,\sigma^2 I_n}(\hat\omega_W \geq 0)$. For $C \geq 0$ we then have (using monotonicity w.r.t. $C$) $\sup_{\mu \in M_1}\sup_{0<\sigma^2<\infty} P_{\mu,\sigma^2 I_n}(T \geq C) \leq \sup_{\mu \in M_1}\sup_{0<\sigma^2<\infty} P_{\mu,\sigma^2 I_n}(\hat\omega_W \geq 0)$.
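The sign argument in this proof can be checked numerically under the scalar-multiple structure $\hat\Omega_W(y) = \hat\omega_W(y) \cdot R(X'X)^{-1}R'$ suggested by the equivalence stated in the first part of the corollary; the design, restriction, and weight matrix below are illustrative assumptions, not objects from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 10, 3
X = rng.standard_normal((n, k))
R = np.zeros((1, k)); R[0, 0] = 1.0              # q = 1: restriction on the first coefficient
r = 0.0
W = rng.standard_normal((n, n)); W = W + W.T     # an indefinite symmetric weight matrix
P_perp = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
M = P_perp @ W @ P_perp
XtX_inv = np.linalg.inv(X.T @ X)
A = (R @ XtX_inv @ R.T)[0, 0]                    # R (X'X)^{-1} R' > 0

def omega_W(y):
    return (y @ M @ y) / n

def T(y):
    # hypothetical scalar-multiple form of the test statistic for q = 1
    beta = XtX_inv @ X.T @ y
    return ((R @ beta)[0] - r) ** 2 / (omega_W(y) * A)

for _ in range(200):
    y = rng.standard_normal(n)
    # off a null set, the sign of T agrees with the sign of omega_W
    assert (T(y) >= 0) == (omega_W(y) >= 0)
```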
Proof. Observe that $\hat\rho$, $\tilde\rho$, $M_{AM}$, $W_{AM}$, and $\hat\omega_{W_{AM}}$ are well-defined on $\mathbb{R}^n \setminus N_{AM}$. We next verify (a)-(d) in Lemma D.1 for $H = I_n$, $\nu = \hat\omega_{W_{AM}}$, and $N = N_{AM}$. We start with (a): Using arguments as in the proof of Lemma 3.9 in Preinerstorfer (2017), or in the proof of Lemma B.1 in Preinerstorfer and Pötscher (2016), it is not difficult to verify that $N_{AM}$ is an algebraic set. We leave the details to the reader. This, and the assumption $N_{AM} \neq \mathbb{R}^n$, implies that $N_{AM}$ is a closed $\lambda_{\mathbb{R}^n}$-null set. To verify (c) in Lemma D.1 it suffices to establish continuity of $W_{AM}$ on $\mathbb{R}^n \setminus N_{AM}$, since $\hat u(y)$ is certainly continuous on $\mathbb{R}^n$. To achieve this note that, since $\tilde\rho$ is obviously continuous on $\mathbb{R}^n \setminus N_{AM}$, since $\tilde\rho(y) \neq 1$ for $y \in \mathbb{R}^n \setminus N_{AM}$, and since $A(\cdot)$ is continuous, it suffices to verify that $[\kappa_{QS}(|i-j|/M_{AM})]_{i,j=1}^{n-1}$ is continuous on $\mathbb{R}^n \setminus N_{AM}$. Now, $M_{AM}$ is certainly continuous on $\mathbb{R}^n \setminus N_{AM}$ and $\kappa_{QS}$ is continuous on $\mathbb{R}$. Hence, $[\kappa_{QS}(|i-j|/M_{AM})]_{i,j=1}^{n-1}$ is easily seen to be continuous at every $y \in \mathbb{R}^n \setminus N_{AM}$ that satisfies $M_{AM}(y) \neq 0$. For $y \in \mathbb{R}^n \setminus N_{AM}$ satisfying $M_{AM}(y) = 0$, continuity of $[\kappa_{QS}(|i-j|/M_{AM})]_{i,j=1}^{n-1}$ follows from continuity of $M_{AM}$ on $\mathbb{R}^n \setminus N_{AM}$ together with $\kappa_{QS}(x) \to 0$ as $|x| \to \infty$, $\kappa_{QS}(0) = 1$, and the convention $[\kappa_{QS}(|i-j|/M_{AM}(y))]_{i,j=1}^{n-1} = I_{n-1}$ for $y$ such that $M_{AM}(y) = 0$. That (b) in Lemma D.1 holds is easily seen to follow from $\hat u(\delta y + X\eta) = \delta\hat u(y)$ for every $\delta \in \mathbb{R}$, every $y \in \mathbb{R}^n$, and every $\eta \in \mathbb{R}^k$, which in particular implies $\hat\rho(\delta y + X\eta) = \hat\rho(y)$ and $\tilde\rho(\delta y + X\eta) = \tilde\rho(y)$ for every $\delta \neq 0$, every $y \in \mathbb{R}^n \setminus N_{AM}$, and every $\eta \in \mathbb{R}^k$. Finally, note that (d) in Lemma D.1 is satisfied, because $\hat\omega_{W_{AM}}(y) > 0$ holds if $y \in \mathbb{R}^n \setminus N_{AM}$. The latter follows from the well-known fact that $[\kappa_{QS}(|i-j|/M_{AM}(y))]_{i,j=1}^{n-1}$ is positive definite in case $M_{AM}(y)$ is well-defined (recall that this matrix is defined as $I_{n-1}$ in case $M_{AM}(y) = 0$), together with the observation that $y \in \mathbb{R}^n \setminus N_{AM}$ implies $A(\tilde\rho(y))\hat u(y) = \hat v(y) \neq 0$. Now apply Lemma D.1.
Note that the just established fact that $\hat\omega_{W_{AM}}(y) > 0$ holds if $y \in \mathbb{R}^n \setminus N_{AM}$ also shows that the last part of Lemma D.1 applies, and hence that $\hat\Omega_{W_{AM}}(y)$ is positive definite for every $y \in \mathbb{R}^n \setminus N_{AM}$.
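The "well-known fact" invoked above, positive definiteness of the Quadratic Spectral kernel weight matrix, can be verified numerically for illustrative dimensions (the bandwidth value below is an arbitrary fixed choice, not the data-dependent $M_{AM}(y)$):

```python
import numpy as np

def kappa_qs(x):
    # Quadratic Spectral kernel of Andrews (1991), with kappa_qs(0) = 1
    x = np.asarray(x, dtype=float)
    out = np.ones_like(x)
    nz = x != 0
    z = 6.0 * np.pi * x[nz] / 5.0
    out[nz] = 25.0 / (12.0 * np.pi**2 * x[nz]**2) * (np.sin(z) / z - np.cos(z))
    return out

m, M = 20, 1.0                                        # illustrative dimension and bandwidth
i = np.arange(1, m + 1)
K = kappa_qs(np.abs(i[:, None] - i[None, :]) / M)     # the matrix [kappa_QS(|i-j|/M)]
eigs = np.linalg.eigvalsh(K)
assert eigs.min() > 0.1                               # positive definite
```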
Proof of Corollary 5.6. This follows upon combining Lemma D.3 and Theorem 5.2, noting that the lower bound obtained via Theorem 5.2 equals 1 due to nonnegative definiteness of $\hat\Omega_{W_{AM}}(y)$ for every $y \in \mathbb{R}^n \setminus N_{AM}$, which is the complement of a $\lambda_{\mathbb{R}^n}$-null set.
Lemma D.4. Let $V \in \{A, I_n\}$, $c \in \mathbb{R}$, let $i \in \{1, 2\}$, and let $U$ be an $n \times m$-dimensional matrix with $m \geq 1$ such that $(X, U)$ has full column rank $k + m < n$. Then the estimators $\hat\beta_V$ and $\hat\Omega^{Vo}_{c,U,i,V}$ satisfy Assumption 1 with $N = \operatorname{span}(X, U)$, and $\hat\Omega^{Vo}_{c,U,i,V}$ also satisfies Assumption 2; furthermore, $\hat\Omega^{Vo}_{c,U,i,V}$ is positive definite on $\mathbb{R}^n \setminus \operatorname{span}(X, U)$.
Proof. We verify (a)-(d) in Lemma D.1 for $H = V$ (which is invertible), $\nu = n^{j(V)} s^2_{A,X}\exp(cJ^i_{n,U})$, and $N = \operatorname{span}(X, U)$. By assumption, $k + m < n$, hence $\operatorname{span}(X, U)$ is a closed $\lambda_{\mathbb{R}^n}$-null set, showing that (a) in Lemma D.1 is satisfied. Next, note that $s^2_{A,X}$, $s^2_{I_n,(X,U)}$, and $s^2_{A,(X,U)}$ are well-defined and continuous on $\mathbb{R}^n$, and that $J^1_{n,U}$ and $J^2_{n,U}$ are well-defined and continuous on the sets where $s^2_{I_n,(X,U)}$ and $s^2_{A,(X,U)}$, respectively, are nonzero. Obviously, $s^2_{I_n,(X,U)}(y) = 0$ if and only if $y \in \operatorname{span}(X, U)$. Similarly, $s^2_{A,(X,U)}(y) = 0$ if and only if $y \in \operatorname{span}(X, U)$.

2. The estimators $\hat\beta$ and $\hat\Omega^{BV}_{a,\bar a,h,p}$ satisfy Assumption 1 with $N = N_{BV}$, where $N_{BV} = \operatorname{span}(X) \cup \{y \in \mathbb{R}^n \setminus \operatorname{span}(X) : \hat\rho(y) \in \{\bar a_1, \ldots, \bar a_{m'}\}\}$ in case $\hat\rho$ attains at least two different values on $\mathbb{R}^n \setminus \operatorname{span}(X)$, and $N_{BV} = \operatorname{span}(X)$ else. Furthermore, $\hat\Omega^{BV}_{a,\bar a,h,p}$ satisfies Assumption 2, and $\hat\Omega^{BV}_{a,\bar a,h,p}(y)$ is positive definite for every $y \in \mathbb{R}^n \setminus N_{BV}$ (in fact, for $y \in \mathbb{R}^n \setminus \operatorname{span}(X)$).

Proof.
The assumption $e_n(n) \notin \operatorname{span}(X)^\perp$ implies non-existence of a $y \in \mathbb{R}^n \setminus \operatorname{span}(X)$ such that $\sum_{i=1}^{n-1}\hat u_i^2(y) = 0$, showing that $\hat\rho$ is well-defined everywhere on $\mathbb{R}^n \setminus \operatorname{span}(X)$, i.e., that $\tilde N = \operatorname{span}(X)$. We consider two cases: First, assume that the design matrix $X$ is such that $\hat\rho = \rho$ holds everywhere on $\mathbb{R}^n \setminus \operatorname{span}(X)$ for some fixed $\rho \in \mathbb{R}$. Then the statements in 1. and 2., except for the positive definiteness claims, follow from Lemma D.5, because $b_{BV}(\cdot, a, \bar a)$ and $c_{BV}(\cdot, p)$ are then constant, equal to $b$ and $c$, say, on $\mathbb{R}^n \setminus \operatorname{span}(X)$, and thus $\hat\Omega^{BV,J}_{U,a,\bar a,h,p}(y) = \hat\Omega^{BV,J}_{W,U,c}(y)$ holds for every $y \notin \operatorname{span}(X, U)$, and $\hat\Omega^{BV}_{a,\bar a,h,p}(y) = \hat\Omega^{BV}_{W,c}(y)$ holds for every $y \notin \operatorname{span}(X)$, where the matrix $W = (W_{ij}) = (\kappa_D(|i-j|/\max(bn, 2)))$. Observe here that $W$ is constant in $y$, is symmetric, and is positive definite. The positive definiteness claims in 1. and 2. finally follow since $\hat\omega_W(y) > 0$ holds for $y \in \mathbb{R}^n \setminus \operatorname{span}(X)$ in view of positive definiteness of $W$.
Next, we consider the case where $X$ is such that $\hat\rho$ attains at least two different values on $\mathbb{R}^n \setminus \operatorname{span}(X)$. We start with the statement in 1.: First of all, $N_{BV,U}$ is easily seen to be $G(M)$-invariant (because $\hat\rho : \mathbb{R}^n \setminus \operatorname{span}(X) \to \mathbb{R}$ is so). Second, we can rewrite $N_{BV,U}$ as a finite union of algebraic sets, and hence $N_{BV,U}$ is an algebraic set. Thus, $N_{BV,U}$ is closed. Since we also work under the hypothesis that $\hat\rho$ attains at least two different values on $\mathbb{R}^n \setminus \operatorname{span}(X)$, we can conclude that $\{y \in \mathbb{R}^n : \sum_{j=2}^{n}\hat u_j(y)\hat u_{j-1}(y) - \bar a_i\sum_{j=1}^{n-1}\hat u_j^2(y) = 0\} \neq \mathbb{R}^n$ holds for every $i = 1, \ldots, m'$. It follows that each of these algebraic sets is a $\lambda_{\mathbb{R}^n}$-null set. Hence $N_{BV,U}$ is a closed $\lambda_{\mathbb{R}^n}$-null set, as also $\operatorname{span}(X, U) \neq \mathbb{R}^n$. To prove the statements of 1., we now verify (a)-(d) in Lemma D.1 for $H = I_n$, $\nu(\cdot) = \hat\omega_{W_{BV}}(\cdot)\exp(c_{BV}(\cdot, p)J^1_{n,U}(\cdot))$, and $N = N_{BV,U}$. We have already verified (a). Furthermore, note that $b_{BV}(\cdot, a, \bar a)$ is continuous on $\mathbb{R}^n \setminus N_{BV,U}$. As a consequence, $c_{BV}(\cdot, p)$ and $W_{BV}(\cdot)$, and thus $\hat\omega_{W_{BV}}$, are continuous on $\mathbb{R}^n \setminus N_{BV,U}$. We already know from the proof of Lemma D.4 that $J^1_{n,U}$ is continuous on the complement of $\operatorname{span}(X, U) \subseteq N_{BV,U}$. It thus follows that $y \mapsto \hat\omega_{W_{BV}}(y)\exp(c_{BV}(y, p)J^1_{n,U}(y))$ is continuous on $\mathbb{R}^n \setminus N_{BV,U}$. Hence, we have verified (c) in Lemma D.1. To verify (b) we recall from above that $N_{BV,U}$ is $G(M)$-invariant. Furthermore, the required equivariance property in (b) holds as a consequence of $G(M)$-invariance of $\hat\rho$ and $J^1_{n,U}$ (cf. (D.5)), and hence of $c_{BV}(\cdot, p)$ and $W_{BV}(\cdot)$ on $\mathbb{R}^n \setminus N_{BV,U}$, together with $\hat u(\delta y + X\eta) = \delta\hat u(y)$ for every $\delta \neq 0$, every $y \in \mathbb{R}^n$, and every $\eta \in \mathbb{R}^k$. That $\nu(y) = \hat\omega_{W_{BV}}(y)\exp(c_{BV}(y, p)J^1_{n,U}(y))$ is even positive on $\mathbb{R}^n \setminus N_{BV,U}$ follows because $W_{BV}(y)$ is a positive definite matrix for every $y \in \mathbb{R}^n \setminus \tilde N$ and $\tilde N = \operatorname{span}(X)$ holds. This implies (d) in Lemma D.1, and also the sufficient condition for positive definiteness in the same lemma. The statements in 2. for the case where $\hat\rho$ attains at least two different values on $\mathbb{R}^n \setminus \operatorname{span}(X)$ are proved almost identically, and we skip the details.
Proof of Corollary 5.10. From Assumption 3 it follows that the last row of $X$ is not equal to zero, i.e., $e_n(n) \notin \operatorname{span}(X)^\perp$ must hold. Hence, all assumptions of Lemma D.6 are satisfied. Combining this lemma with Theorem 5.2 proves the claims with $C_{BV}(y, h)$ replaced by an arbitrary constant critical value $C$ (noting that the lower bound obtained via Theorem 5.2 equals 1 due to nonnegative definiteness of $\hat\Omega^{BV,J}_{U,a,\bar a,h,p}$ ($\hat\Omega^{BV}_{a,\bar a,h,p}$, respectively) on the complement of the $\lambda_{\mathbb{R}^n}$-null set $N_{BV,U}$ ($N_{BV}$, respectively)). But now we observe that $y \mapsto C_{BV}(y, h)$ is well-defined on $\mathbb{R}^n$ (recall the convention preceding Corollary 5.10) and by construction takes on only finitely many real values $C_1 < \ldots < C_l$, say. Hence, for every $f \in F$, every $\mu_0 \in M_0$, and every $\sigma^2 \in (0, \infty)$ we can conclude that $P_{\mu_0,\sigma^2\Sigma(f)}(\{y \in \mathbb{R}^n : T(y) \geq C_{BV}(y, h)\}) \geq P_{\mu_0,\sigma^2\Sigma(f)}(\{y \in \mathbb{R}^n : T(y) \geq C_l\})$.
Now apply what has been established before with C = C l . This completes the proof.
Proof of Theorem 5.13. We apply Theorem 3.12. It suffices to verify that $\gamma = \omega$ satisfies the assumption $\operatorname{span}(E_{n,\rho(\gamma)}(\gamma)) \subseteq \operatorname{span}(X)$ in that theorem. But this can be established exactly in the same way as in the proof of Theorem 5.12.