Electronic Journal of Statistics Articles (Project Euclid)
http://projecteuclid.org/euclid.ejs
The latest articles from Electronic Journal of Statistics on Project Euclid, a site for mathematics and statistics resources.
en-us
Copyright 2010 Cornell University Library
Euclid-L@cornell.edu (Project Euclid Team)
Thu, 05 Aug 2010 15:41 EDT
Fri, 03 Jun 2011 09:20 EDT
http://projecteuclid.org/collection/euclid/images/logo_linking_100.gif
Project Euclid
http://projecteuclid.org/
The bias and skewness of M-estimators in regression
http://projecteuclid.org/euclid.ejs/1262876992
<strong>Christopher Withers</strong>, <strong>Saralees Nadarajah</strong><p><strong>Source: </strong>Electron. J. Statist., Volume 4, 1--14.</p><p><strong>Abstract:</strong><br/>
We consider M-estimation of a regression model with a nuisance parameter and a vector of other parameters. The unknown distribution of the residuals is not assumed to be normal or symmetric. Simple and easily estimated formulas are given for the dominant terms of the bias and skewness of the parameter estimates. For the linear model these are proportional to the skewness of the ‘independent’ variables. For a nonlinear model, its linear component plays the role of these independent variables, and a second term must be added proportional to the covariance of its linear and quadratic components. For the least squares estimate with normal errors this term was derived by Box [1]. We also consider the effect of a large number of parameters, and the case of random independent variables.
</p>
Thu, 05 Aug 2010 15:41 EDT

The minimax learning rates of normal and Ising undirected graphical models
https://projecteuclid.org/euclid.ejs/1593136952
<strong>Luc Devroye</strong>, <strong>Abbas Mehrabian</strong>, <strong>Tommy Reddad</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2338--2361.</p><p><strong>Abstract:</strong><br/>
Let $G$ be an undirected graph with $m$ edges and $d$ vertices. We show that $d$-dimensional Ising models on $G$ can be learned from $n$ i.i.d. samples to within an expected total variation distance that is a constant factor of $\min \{1,\sqrt{(m+d)/n}\}$, and that this rate is optimal. We show that the same rate holds for the class of $d$-dimensional multivariate normal undirected graphical models with respect to $G$. We also identify the optimal rate of $\min \{1,\sqrt{m/n}\}$ for Ising models with no external magnetic field.
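The minimax rate in the abstract is a closed-form function of the graph size and sample size, so it can be computed directly. A minimal sketch (the function name and the example graph are illustrative, not from the paper):

```python
import math

def ising_tv_rate(m, d, n, external_field=True):
    """Minimax total-variation learning rate (up to constants) for
    d-dimensional Ising models on a graph with m edges, from n i.i.d.
    samples. Without an external magnetic field the d parameter terms
    drop out and the rate becomes min{1, sqrt(m/n)}."""
    num = (m + d) if external_field else m
    return min(1.0, math.sqrt(num / n))

# Example: a 10x10 grid graph has d = 100 vertices and m = 180 edges.
rate = ising_tv_rate(m=180, d=100, n=10_000)
```

Note the truncation at 1: total variation distance is bounded by 1, so for very small samples the rate saturates rather than diverging.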
</p>
Thu, 25 Jun 2020 22:03 EDT

On the predictive potential of kernel principal components
https://projecteuclid.org/euclid.ejs/1578020612
<strong>Ben Jones</strong>, <strong>Andreas Artemiou</strong>, <strong>Bing Li</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1--23.</p><p><strong>Abstract:</strong><br/>
We give a probabilistic analysis of a phenomenon in statistics which, until recently, has not received a convincing explanation. This phenomenon is that the leading principal components tend to possess more predictive power for a response variable than lower-ranking ones despite the procedure being unsupervised. Our result, in its most general form, shows that the phenomenon goes far beyond the context of linear regression and classical principal components: if an arbitrary distribution for the predictor $X$ and an arbitrary conditional distribution for $Y\vert X$ are chosen, then any measurable function $g(Y)$, subject to a mild condition, tends to be more correlated with the higher-ranking kernel principal components than with the lower-ranking ones. The “arbitrariness” is formulated in terms of unitary invariance, and the tendency is explicitly quantified by exploring how unitary invariance relates to the Cauchy distribution. The most general results, for technical reasons, are shown for the case where the kernel space is finite dimensional. The occurrence of this tendency in real-world databases is also investigated to show that our results are consistent with observation.
</p>
Tue, 30 Jun 2020 22:04 EDT

Monotone least squares and isotonic quantiles
https://projecteuclid.org/euclid.ejs/1578020615
<strong>Alexandre Mösching</strong>, <strong>Lutz Dümbgen</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 24--49.</p><p><strong>Abstract:</strong><br/>
We consider bivariate observations $(X_{1},Y_{1}),\ldots,(X_{n},Y_{n})$ such that, conditional on the $X_{i}$, the $Y_{i}$ are independent random variables. Precisely, the conditional distribution function of $Y_{i}$ equals $F_{X_{i}}$, where $(F_{x})_{x}$ is an unknown family of distribution functions. Under the sole assumption that $x\mapsto F_{x}$ is isotonic with respect to stochastic order, one can estimate $(F_{x})_{x}$ in two ways:
(i) For any fixed $y$ one estimates the antitonic function $x\mapsto F_{x}(y)$ via nonparametric monotone least squares, replacing the responses $Y_{i}$ with the indicators $1_{[Y_{i}\le y]}$.
(ii) For any fixed $\beta \in (0,1)$ one estimates the isotonic quantile function $x\mapsto F_{x}^{-1}(\beta)$ via a nonparametric version of regression quantiles.
We show that these two approaches are closely related, with (i) being more flexible than (ii). Then, under mild regularity conditions, we establish rates of convergence for the resulting estimators $\hat{F}_{x}(y)$ and $\hat{F}_{x}^{-1}(\beta)$, uniformly over $(x,y)$ and $(x,\beta)$ in certain rectangles as well as uniformly in $y$ or $\beta$ for a fixed $x$.
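Approach (i) can be sketched with the standard pool-adjacent-violators algorithm (PAVA) applied to the indicators $1_{[Y_i\le y]}$; the helper names below are illustrative, and this is a generic monotone least squares fit, not the authors' full procedure:

```python
import numpy as np

def pava(y, weights=None):
    """Pool-adjacent-violators: weighted least-squares fit that is
    nondecreasing in the index order."""
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if weights is None else np.asarray(weights, float)
    vals, wts, sizes = [], [], []          # current solution blocks
    for yi, wi in zip(y, w):
        vals.append(yi); wts.append(wi); sizes.append(1)
        # merge adjacent blocks while monotonicity is violated
        while len(vals) > 1 and vals[-2] > vals[-1]:
            wtot = wts[-2] + wts[-1]
            vnew = (vals[-2] * wts[-2] + vals[-1] * wts[-1]) / wtot
            vals[-2:] = [vnew]; wts[-2:] = [wtot]
            sizes[-2:] = [sizes[-2] + sizes[-1]]
    return np.repeat(vals, sizes)

def estimate_F(x, ysamples, y0):
    """Estimate x -> F_x(y0) by *antitonic* least squares on the
    indicators 1[Y_i <= y0]: if F_x increases stochastically in x,
    then F_x(y0) decreases in x.  Antitonic fit = isotonic fit of the
    reversed sequence, reversed back."""
    order = np.argsort(x)
    ind = (ysamples[order] <= y0).astype(float)
    return pava(ind[::-1])[::-1], x[order]
```

The fitted values automatically lie in $[0,1]$ because they are weighted averages of indicators, one of the conveniences of this formulation.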
</p>
Tue, 30 Jun 2020 22:04 EDT

Non-parametric adaptive estimation of order 1 Sobol indices in stochastic models, with an application to Epidemiology
https://projecteuclid.org/euclid.ejs/1578042013
<strong>Gwenaëlle Castellan</strong>, <strong>Anthony Cousien</strong>, <strong>Viet Chi Tran</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 50--81.</p><p><strong>Abstract:</strong><br/>
Global sensitivity analysis is a set of methods aimed at quantifying the contribution of an uncertain input parameter of the model (or a combination of parameters) to the variability of the response. We consider here the estimation of Sobol indices of order 1, which are commonly used indicators based on a decomposition of the output’s variance. In a deterministic framework, when the same inputs always give the same outputs, these indices are usually estimated by replicated simulations of the model. In a stochastic framework, when the response given a set of input parameters is not unique due to randomness in the model, metamodels are often used to approximate the mean and dispersion of the response by deterministic functions. We propose a new non-parametric estimator of the Sobol indices of order 1 that does not require defining a metamodel. The estimator is based on warped wavelets and is adaptive in the regularity of the model. The convergence of the mean square error to zero, as the number of simulations of the model tends to infinity, is established, and an elbow effect is shown, depending on the regularity of the model. Applications in epidemiology are carried out to illustrate the use of the non-parametric estimators.
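For orientation, the variance decomposition behind order-1 Sobol indices can be illustrated with the classical pick-freeze Monte-Carlo estimator for a deterministic model (this is the textbook baseline, not the warped-wavelet estimator proposed in the paper; the toy model is an assumption for illustration):

```python
import numpy as np

def sobol_first_order(model, d, n, j, rng):
    """Pick-freeze Monte-Carlo estimate of the order-1 Sobol index
    S_j = Var(E[Y | X_j]) / Var(Y) for a deterministic model with
    d independent Uniform(0,1) inputs."""
    A = rng.random((n, d))
    B = rng.random((n, d))
    C = B.copy()
    C[:, j] = A[:, j]                 # "freeze" input j at its A values
    yA, yC = model(A), model(C)
    f0 = yA.mean()
    return (np.mean(yA * yC) - f0 ** 2) / np.var(yA)

# Toy model Y = X_1 + 2 X_2: the exact indices are S_1 = 0.2, S_2 = 0.8.
rng = np.random.default_rng(42)
s1 = sobol_first_order(lambda X: X[:, 0] + 2 * X[:, 1], d=2, n=200_000, j=0, rng=rng)
```

In the stochastic framework of the paper the response is noisy even for fixed inputs, which is exactly why this simple deterministic estimator no longer suffices on its own.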
</p>
Tue, 30 Jun 2020 22:04 EDT

Model-based clustering with envelopes
https://projecteuclid.org/euclid.ejs/1578042014
<strong>Wenjing Wang</strong>, <strong>Xin Zhang</strong>, <strong>Qing Mai</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 82--109.</p><p><strong>Abstract:</strong><br/>
Cluster analysis is an important unsupervised learning technique in multivariate statistics and machine learning. In this paper, we propose a set of new mixture models, called CLEMM (short for Clustering with Envelope Mixture Models), based on the widely used Gaussian mixture model assumptions and the nascent research area of envelope methodology. Formulated mostly for regression models, envelope methodology aims for simultaneous dimension reduction and efficient parameter estimation, and includes a very recent formulation of the envelope discriminant subspace for classification and discriminant analysis. Motivated by the envelope discriminant subspace pursuit in classification, we consider parsimonious probabilistic mixture models where the cluster analysis can be improved by projecting the data onto a latent lower-dimensional subspace. The proposed CLEMM framework and the associated envelope-EM algorithms thus provide foundations for envelope methods in unsupervised and semi-supervised learning problems. Numerical studies on simulated data and two benchmark data sets show significant improvements of our proposed methods over classical methods such as Gaussian mixture models, K-means and hierarchical clustering. An R package is available at https://github.com/kusakehan/CLEMM.
</p>
Tue, 30 Jun 2020 22:04 EDT

Nonparametric false discovery rate control for identifying simultaneous signals
https://projecteuclid.org/euclid.ejs/1578366075
<strong>Sihai Dave Zhao</strong>, <strong>Yet Tien Nguyen</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 110--142.</p><p><strong>Abstract:</strong><br/>
It is frequently of interest to identify simultaneous signals, defined as features that exhibit statistical significance across each of several independent experiments. For example, genes that are consistently differentially expressed across experiments in different animal species can reveal evolutionarily conserved biological mechanisms. However, in some problems the test statistics corresponding to these features can have complicated or unknown null distributions. This paper proposes a novel nonparametric false discovery rate control procedure that can identify simultaneous signals even without knowing these null distributions. The method is shown, theoretically and in simulations, to asymptotically control the false discovery rate. It was also used to identify genes that were both differentially expressed and proximal to differentially accessible chromatin in the brains of mice exposed to a conspecific intruder. The proposed method is implemented in an R package available at github.com/sdzhao/ssa.
</p>
Tue, 30 Jun 2020 22:04 EDT

Efficient estimation in expectile regression using envelope models
https://projecteuclid.org/euclid.ejs/1578366076
<strong>Tuo Chen</strong>, <strong>Zhihua Su</strong>, <strong>Yi Yang</strong>, <strong>Shanshan Ding</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 143--173.</p><p><strong>Abstract:</strong><br/>
As a generalization of the classical linear regression, expectile regression (ER) explores the relationship between the conditional expectile of a response variable and a set of predictor variables. ER with respect to different expectile levels can provide a comprehensive picture of the conditional distribution of the response variable given the predictors. We adopt an efficient estimation method called the envelope model ([8]) in ER, and construct a novel envelope expectile regression (EER) model. Estimation of the EER parameters can be performed using the generalized method of moments (GMM). We establish the consistency and derive the asymptotic distribution of the EER estimators. In addition, we show that the EER estimators are asymptotically more efficient than the ER estimators. Numerical experiments and real data examples are provided to demonstrate the efficiency gains attained by EER compared to ER, and the efficiency gains can further lead to improvements in prediction.
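To fix ideas, plain linear expectile regression (ER itself, not the envelope-based EER estimator of the paper) can be fit by iteratively reweighted least squares, since the expectile loss is an asymmetrically weighted squared error. A minimal sketch with illustrative names:

```python
import numpy as np

def expectile_reg(X, y, tau, n_iter=50):
    """Linear expectile regression at level tau via iteratively
    reweighted least squares: residual r gets weight tau if r >= 0
    and 1 - tau otherwise.  At tau = 0.5 this reduces to OLS."""
    n = len(y)
    Xc = np.column_stack([np.ones(n), X])        # add an intercept
    beta = np.linalg.lstsq(Xc, y, rcond=None)[0]
    for _ in range(n_iter):
        w = np.where(y - Xc @ beta >= 0, tau, 1 - tau)
        WX = Xc * w[:, None]
        beta = np.linalg.solve(Xc.T @ WX, WX.T @ y)
    return beta
```

Sweeping `tau` over $(0,1)$ traces out the conditional-distribution picture the abstract mentions: low and high expectile levels describe the lower and upper parts of the response distribution.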
</p>
Tue, 30 Jun 2020 22:04 EDT

Estimation of linear projections of non-sparse coefficients in high-dimensional regression
https://projecteuclid.org/euclid.ejs/1578366077
<strong>David Azriel</strong>, <strong>Armin Schwartzman</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 174--206.</p><p><strong>Abstract:</strong><br/>
In this work we study estimation of signals when the number of parameters is much larger than the number of observations. A large body of literature assumes a sparse structure for this kind of problem, where most of the parameters are zero or close to zero. When this assumption does not hold, one can focus on low-dimensional functions of the parameter vector. In this work we study one-dimensional linear projections. Specifically, in the context of high-dimensional linear regression, the parameter of interest is ${\boldsymbol{\beta}}$ and we study estimation of $\mathbf{a}^{T}{\boldsymbol{\beta}}$. We show that $\mathbf{a}^{T}\hat{\boldsymbol{\beta}}$, where $\hat{\boldsymbol{\beta}}$ is the least squares estimator (using the pseudo-inverse when $p>n$), is minimax and admissible. Thus, for linear projections no regularization or shrinkage is needed. This estimator is easy to analyze, and confidence intervals can be constructed. We study a high-dimensional dataset from brain imaging where the signal is shown to be weak, non-sparse and significantly different from zero.
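The estimator in the abstract is a one-liner in practice: plug the (pseudo-inverse) least squares fit into the projection. A minimal sketch, with an illustrative function name:

```python
import numpy as np

def project_estimate(X, y, a):
    """Estimate the linear projection a^T beta by plugging in the least
    squares estimator; when p > n, np.linalg.pinv yields the minimum-norm
    solution.  As the abstract notes, no shrinkage is applied."""
    return a @ (np.linalg.pinv(X) @ y)
```

In the $p>n$ regime the pseudo-inverse solution interpolates the data ($X\hat{\boldsymbol{\beta}}=y$ when $X$ has full row rank), so all the statistical content is in how the projection direction $\mathbf{a}$ interacts with the row space of $X$.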
</p>
Tue, 30 Jun 2020 22:04 EDT

Perspective maximum likelihood-type estimation via proximal decomposition
https://projecteuclid.org/euclid.ejs/1578452535
<strong>Patrick L. Combettes</strong>, <strong>Christian L. Müller</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 207--238.</p><p><strong>Abstract:</strong><br/>
We introduce a flexible optimization model for maximum likelihood-type estimation (M-estimation) that encompasses and generalizes a large class of existing statistical models, including Huber’s concomitant M-estimator, Owen’s Huber/Berhu concomitant estimator, the scaled lasso, support vector machine regression, and penalized estimation with structured sparsity. The model, termed perspective M-estimation, leverages the observation that convex M-estimators with concomitant scale as well as various regularizers are instances of perspective functions, a construction that extends a convex function to a jointly convex one in terms of an additional scale variable. These nonsmooth functions are shown to be amenable to proximal analysis, which leads to principled and provably convergent optimization algorithms via proximal splitting. We derive novel proximity operators for several perspective functions of interest via a geometrical approach based on duality. We then devise a new proximal splitting algorithm to solve the proposed M-estimation problem and establish the convergence of both the scale and regression iterates it produces to a solution. Numerical experiments on synthetic and real-world data illustrate the broad applicability of the proposed framework.
</p>
Tue, 30 Jun 2020 22:04 EDT

Bayesian variance estimation in the Gaussian sequence model with partial information on the means
https://projecteuclid.org/euclid.ejs/1578452536
<strong>Gianluca Finocchio</strong>, <strong>Johannes Schmidt-Hieber</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 239--271.</p><p><strong>Abstract:</strong><br/>
Consider the Gaussian sequence model under the additional assumption that a fixed fraction of the means is known. We study the problem of variance estimation from a frequentist Bayesian perspective. The maximum likelihood estimator (MLE) for $\sigma^{2}$ is biased and inconsistent. This raises the question of whether the posterior is able to correct the MLE in this case. By developing a new proof strategy that uses refined properties of the posterior distribution, we find that the marginal posterior is inconsistent for any i.i.d. prior on the mean parameters. In particular, no assumption on the decay of the prior needs to be imposed. Surprisingly, we also find that consistency can be retained for a hierarchical prior based on Gaussian mixtures. In this case we also establish a limiting shape result and determine the limit distribution. In contrast to the classical Bernstein-von Mises theorem, the limit is non-Gaussian. We show that the Bayesian analysis leads to new statistical estimators that outperform the correctly calibrated MLE in a numerical simulation study.
</p>
Tue, 30 Jun 2020 22:04 EDT

Asymptotic seed bias in respondent-driven sampling
https://projecteuclid.org/euclid.ejs/1586397684
<strong>Yuling Yan</strong>, <strong>Bret Hanlon</strong>, <strong>Sebastien Roch</strong>, <strong>Karl Rohe</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1577--1610.</p><p><strong>Abstract:</strong><br/>
Respondent-driven sampling (RDS) collects a sample of individuals in a networked population by incentivizing the sampled individuals to refer their contacts into the sample. This iterative process is initialized from some seed node(s). Sometimes, this selection creates a large amount of seed bias. Other times, the seed bias is small. This paper gains a deeper understanding of this bias by characterizing its effect on the limiting distribution of various RDS estimators. Using classical tools and results from multi-type branching processes [12], we show that the seed bias is negligible for the Generalized Least Squares (GLS) estimator and non-negligible for both the inverse probability weighted and Volz-Heckathorn (VH) estimators. In particular, we show that (i) above a critical threshold, the VH estimator converges to a non-trivial mixture distribution, where the mixture components depend on the seed node and the mixture distribution is possibly multi-modal. Moreover, (ii) the GLS estimator converges to a Gaussian distribution independent of the seed node, under a certain condition on the Markov process. Numerical experiments with both simulated data and empirical social networks suggest that these results hold beyond the Markov conditions of the theorems.
</p>
Tue, 30 Jun 2020 22:04 EDT

Random distributions via Sequential Quantile Array
https://projecteuclid.org/euclid.ejs/1586397685
<strong>Annalisa Fabretti</strong>, <strong>Samantha Leorato</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1611--1647.</p><p><strong>Abstract:</strong><br/>
We propose a method to generate random distributions with known quantile distribution, or, more generally, with a known distribution for some form of generalized quantile. The method takes inspiration from the random Sequential Barycenter Array (SBA) distributions proposed by Hill and Monticino (1998), which generate a Random Probability Measure (RPM) with known expected value. We define the Sequential Quantile Array (SQA) and show how to generate a random SQA from which we can derive RPMs. The distribution of the generated SQA-RPM can have full support, and the RPMs can be discrete, continuous or differentiable. We also address the efficient implementation of the procedure, ensuring that the approximation of the SQA-RPM by a finite number of steps stays close to the SQA-RPM obtained theoretically by the procedure. Finally, we compare SQA-RPMs with similar approaches such as the Polya tree.
</p>
Tue, 30 Jun 2020 22:04 EDT

On change-point estimation under Sobolev sparsity
https://projecteuclid.org/euclid.ejs/1586397686
<strong>Aurélie Fischer</strong>, <strong>Dominique Picard</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1648--1689.</p><p><strong>Abstract:</strong><br/>
In this paper, we consider the estimation of a change-point for possibly high-dimensional data in a Gaussian model, using a maximum likelihood method. We are interested in how dimension reduction can affect the performance of the method. We provide an estimator of the change-point that has a minimax rate of convergence, up to a logarithmic factor. The minimax rate is in fact composed of a fast rate, which is dimension-invariant, and a slow rate, which increases with the dimension. Moreover, it is proved that, for sparse data with Sobolev regularity, there is a bound on the separation of the regimes above which there exists an optimal choice of dimension reduction, leading to the fast rate of estimation. We propose an adaptive dimension reduction procedure based on Lepski’s method and show that the resulting estimator attains the fast rate of convergence. Our results are then illustrated by a simulation study. In particular, practical strategies are suggested for performing dimension reduction.
</p>
Tue, 30 Jun 2020 22:04 EDT

A fast MCMC algorithm for the uniform sampling of binary matrices with fixed margins
https://projecteuclid.org/euclid.ejs/1586419218
<strong>Guanyang Wang</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1690--1706.</p><p><strong>Abstract:</strong><br/>
Uniform sampling of binary matrices with fixed margins is an important and difficult problem in statistics, computer science, ecology and other fields. The well-known swap algorithm is inefficient when the matrix is large or when it is too sparse/dense. Here we propose the Rectangle Loop algorithm, a Markov chain Monte Carlo algorithm for sampling binary matrices with fixed margins uniformly. Theoretically, the Rectangle Loop algorithm dominates the swap algorithm in Peskun’s order. Empirical studies also demonstrate that the Rectangle Loop algorithm is remarkably more efficient than the swap algorithm.
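For reference, the baseline that the paper improves on, the classical swap (checkerboard) algorithm, is easy to sketch: a 2x2 checkerboard submatrix can be flipped without changing any row or column sum. The function name below is illustrative:

```python
import numpy as np

def swap_step(M, rng):
    """One move of the classical swap (checkerboard) chain: choose two
    rows and two columns uniformly at random; if the induced 2x2
    submatrix is a checkerboard ([[1,0],[0,1]] or [[0,1],[1,0]]),
    flip it.  Every move preserves all row and column sums, which is
    exactly why the chain stays on the right state space."""
    rows = rng.choice(M.shape[0], size=2, replace=False)
    cols = rng.choice(M.shape[1], size=2, replace=False)
    sub = M[np.ix_(rows, cols)]
    if sub[0, 0] == sub[1, 1] and sub[0, 1] == sub[1, 0] and sub[0, 0] != sub[0, 1]:
        M[np.ix_(rows, cols)] = 1 - sub
    return M
```

The inefficiency the abstract points to is visible here: for sparse or dense matrices, most randomly chosen 2x2 submatrices are not checkerboards, so most proposed moves are wasted.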
</p>
Tue, 30 Jun 2020 22:04 EDT

Posterior contraction and credible sets for filaments of regression functions
https://projecteuclid.org/euclid.ejs/1586916096
<strong>Wei Li</strong>, <strong>Subhashis Ghosal</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1707--1743.</p><p><strong>Abstract:</strong><br/>
A filament consists of local maximizers of a smooth function $f$ when moving in a certain direction. A filamentary structure is an important feature of the shape of an object and is also considered an important lower-dimensional characterization of multivariate data. There have been some recent theoretical studies of filaments in the nonparametric kernel density estimation context. This paper supplements the current literature in two ways. First, we provide a Bayesian approach to filament estimation in the regression context and study the posterior contraction rates using a finite random series of B-spline basis functions. Compared with the kernel-estimation method, this has a theoretical advantage as the bias can be better controlled when the function is smoother, which allows obtaining better rates. Assuming that $f:\mathbb{R}^{2}\mapsto \mathbb{R}$ belongs to an isotropic Hölder class of order $\alpha \geq 4$, with the optimal choice of smoothing parameters, the posterior contraction rates for the filament points on some appropriately defined integral curves and for the Hausdorff distance of the filament are both $(n/\log n)^{(2-\alpha )/(2(1+\alpha ))}$. Second, we provide a way to construct a credible set with sufficient frequentist coverage for the filaments. We demonstrate the success of our proposed method in simulations and in an application to earthquake data.
</p>
Tue, 30 Jun 2020 22:04 EDT

Simultaneous transformation and rounding (STAR) models for integer-valued data
https://projecteuclid.org/euclid.ejs/1586937696
<strong>Daniel R. Kowal</strong>, <strong>Antonio Canale</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1744--1772.</p><p><strong>Abstract:</strong><br/>
We propose a simple yet powerful framework for modeling integer-valued data, such as counts, scores, and rounded data. The data-generating process is defined by Simultaneously Transforming and Rounding (STAR) a continuous-valued process, which produces a flexible family of integer-valued distributions capable of modeling zero-inflation, bounded or censored data, and over- or underdispersion. The transformation is modeled as unknown for greater distributional flexibility, while the rounding operation ensures a coherent integer-valued data-generating process. An efficient MCMC algorithm is developed for posterior inference and provides a mechanism for adaptation of successful Bayesian models and algorithms for continuous data to the integer-valued data setting. Using the STAR framework, we design a new Bayesian Additive Regression Tree model for integer-valued data, which demonstrates impressive predictive distribution accuracy for both synthetic data and a large healthcare utilization dataset. For interpretable regression-based inference, we develop a STAR additive model, which offers greater flexibility and scalability than existing integer-valued models. The STAR additive model is applied to study the recent decline in Amazon river dolphins.
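The generative idea (transform a continuous latent process, then round) can be sketched in a few lines. The choice of a Gaussian latent variable with an exponential inverse transformation below is one illustrative special case, not the paper's model, in which the transformation is treated as unknown:

```python
import numpy as np

def star_sample(mu, sigma, transform_inv=np.exp, rng=None):
    """Generate integer-valued STAR-type data: draw a continuous latent
    Gaussian variable, apply the inverse transformation, then round
    down.  With transform_inv = exp (i.e. a log transformation), latent
    values below zero land in (0, 1) and round to 0, producing
    zero-inflation without any extra machinery."""
    rng = rng or np.random.default_rng(0)
    z = rng.normal(mu, sigma)          # latent continuous process
    return np.floor(transform_inv(z)).astype(int)

y = star_sample(mu=np.zeros(1000), sigma=1.0)
```

Swapping the rounding operator (e.g. clipping at an upper bound before flooring) is how the framework accommodates bounded or censored counts.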
</p>
Tue, 30 Jun 2020 22:04 EDT

Bias correction in conditional multivariate extremes
https://projecteuclid.org/euclid.ejs/1587542553
<strong>Mikael Escobar-Bach</strong>, <strong>Yuri Goegebeur</strong>, <strong>Armelle Guillou</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1773--1795.</p><p><strong>Abstract:</strong><br/>
We consider bias-corrected estimation of the stable tail dependence function in the regression context. To this aim, we first estimate the bias of a smoothed estimator of the stable tail dependence function, and then we subtract it from the estimator. The weak convergence, as a stochastic process, of the resulting asymptotically unbiased estimator of the conditional stable tail dependence function, correctly normalized, is established under mild assumptions, the covariate argument being fixed. The finite sample behaviour of our asymptotically unbiased estimator is then illustrated in a simulation study and compared to two alternatives, which are not bias-corrected. Finally, our methodology is applied to a dataset of air pollution measurements.
</p>
Tue, 30 Jun 2020 22:04 EDT

Exact recovery in block spin Ising models at the critical line
https://projecteuclid.org/euclid.ejs/1587693632
<strong>Matthias Löwe</strong>, <strong>Kristina Schubert</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 1796--1815.</p><p><strong>Abstract:</strong><br/>
We show how to exactly reconstruct the block structure at the critical line in the so-called Ising block model. This model was recently re-introduced by Berthet, Rigollet and Srivastava in [2]. There the authors show how to exactly reconstruct blocks away from the critical line and they give an upper and a lower bound on the number of observations one needs; thereby they establish a minimax optimal rate (up to constants). Our technique relies on a combination of their methods with fluctuation results obtained in [20]. The latter are extended to the full critical regime. We find that the number of necessary observations depends on whether the interaction parameter between two blocks is positive or negative: in the former case, about $N\log N$ observations are required to exactly recover the block structure, while in the latter case $\sqrt{N}\log N$ observations suffice.
</p>
Tue, 30 Jun 2020 22:04 EDT

Consistent nonparametric change point detection combining CUSUM and marked empirical processes
https://projecteuclid.org/euclid.ejs/1591149719
<strong>Maria Mohr</strong>, <strong>Natalie Neumeyer</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2238--2271.</p><p><strong>Abstract:</strong><br/>
A weakly dependent time series regression model with multivariate covariates and univariate observations is considered, for which we develop a procedure to detect whether the nonparametric conditional mean function is stable in time against change point alternatives. Our proposal is based on a modified CUSUM-type test procedure, which uses a sequential marked empirical process of residuals. We show weak convergence of the considered process to a centered Gaussian process under the null hypothesis of no change in the mean function and a stationarity assumption. This requires some sophisticated arguments for sequential empirical processes of weakly dependent variables. As a consequence we obtain convergence of Kolmogorov-Smirnov and Cramér-von Mises type test statistics. The proposed procedure has a very simple limiting distribution and nice consistency properties, features that related tests lack. We moreover suggest a bootstrap version of the procedure and discuss its applicability in the case of unstable variances.
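For intuition, the plain CUSUM-of-residuals statistic underlying such tests can be sketched as follows (this is the generic Kolmogorov-Smirnov-type CUSUM, not the marked sequential empirical process of the paper; the function name is illustrative):

```python
import numpy as np

def cusum_stat(residuals):
    """Kolmogorov-Smirnov-type CUSUM statistic: the largest absolute
    centered partial sum of the residuals, normalized by sqrt(n) times
    their standard deviation.  Under a stable mean the partial-sum
    process behaves like a Brownian bridge; a change in the mean makes
    the statistic large."""
    r = np.asarray(residuals, dtype=float)
    s = np.cumsum(r - r.mean())
    return np.abs(s).max() / (np.sqrt(len(r)) * r.std())
```

The "marks" in the paper's process additionally index the partial sums by the covariate values, which is what lets the test detect changes in a nonparametric conditional mean rather than only in an unconditional one.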
</p>
Tue, 30 Jun 2020 22:04 EDT

Central limit theorems for classical multidimensional scaling
https://projecteuclid.org/euclid.ejs/1593569022
<strong>Gongkai Li</strong>, <strong>Minh Tang</strong>, <strong>Nicolas Charon</strong>, <strong>Carey Priebe</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2362--2394.</p><p><strong>Abstract:</strong><br/>
Classical multidimensional scaling is a widely used method in dimensionality reduction and manifold learning. The method takes in a dissimilarity matrix and outputs a low-dimensional configuration matrix based on a spectral decomposition. In this paper, we present three noise models and analyze the resulting configuration matrices, or embeddings. In particular, we show that under each of the three noise models the resulting embedding gives rise to a central limit theorem. We also provide compelling simulations and real data illustrations of these central limit theorems. This perturbation analysis represents a significant advancement over previous results regarding classical multidimensional scaling behavior under randomness.
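The dissimilarity-to-configuration map described in the abstract is the standard classical MDS recipe: double-center the squared dissimilarities and take the top eigenpairs of the resulting Gram matrix. A minimal sketch:

```python
import numpy as np

def classical_mds(D, k):
    """Classical multidimensional scaling: double-center the squared
    dissimilarity matrix into a Gram matrix, then embed into k
    dimensions using its top-k eigenpairs."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # Gram matrix of configuration
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:k]           # largest eigenvalues first
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))
```

When $D$ is an exact Euclidean distance matrix of points in $\mathbb{R}^{k}$, this recovers the configuration up to rotation and translation; the paper's noise models perturb $D$ and track how that perturbation propagates through this spectral decomposition.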
</p>
Tue, 30 Jun 2020 22:04 EDT

Oracally efficient estimation and simultaneous inference in partially linear single-index models for longitudinal data
https://projecteuclid.org/euclid.ejs/1593569023
<strong>Li Cai</strong>, <strong>Lei Jin</strong>, <strong>Suojin Wang</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2395--2438.</p><p><strong>Abstract:</strong><br/>
Oracally efficient estimation and an asymptotically accurate simultaneous confidence band (SCB) are established for the nonparametric link function in partially linear single-index models for longitudinal data. The proposed procedure works for possibly unbalanced longitudinal data under general conditions. The link function estimator is shown to be oracally efficient in the sense that it is asymptotically equivalent in the order of $n^{-1/2}$ to the estimator with all true parameter values known oracally. Furthermore, the asymptotic distribution of the maximal deviation between the estimator and the true link function is provided, and hence a simultaneous confidence band for the link function is constructed. Finite sample simulation studies are carried out which support our asymptotic theory. The proposed SCB is applied to analyze a CD4 dataset.
</p>
Tue, 30 Jun 2020 22:04 EDT

High-dimensional joint estimation of multiple directed Gaussian graphical models
https://projecteuclid.org/euclid.ejs/1593569024
<strong>Yuhao Wang</strong>, <strong>Santiago Segarra</strong>, <strong>Caroline Uhler</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2439--2483.</p><p><strong>Abstract:</strong><br/>
We consider the problem of jointly estimating multiple related directed acyclic graph (DAG) models based on high-dimensional data from each graph. This problem is motivated by the task of learning gene regulatory networks based on gene expression data from different tissues, developmental stages or disease states. We prove that under certain regularity conditions, the proposed $\ell _{0}$-penalized maximum likelihood estimator converges in Frobenius norm to the adjacency matrices consistent with the data-generating distributions and has the correct sparsity. In particular, we show that this joint estimation procedure leads to a faster convergence rate than estimating each DAG model separately. As a corollary, we also obtain high-dimensional consistency results for causal inference from a mix of observational and interventional data. For practical purposes, we propose jointGES consisting of Greedy Equivalence Search (GES) to estimate the union of all DAG models followed by variable selection using lasso to obtain the different DAGs, and we analyze its consistency guarantees. The proposed method is illustrated through an analysis of simulated data as well as epithelial ovarian cancer gene expression data.
</p>projecteuclid.org/euclid.ejs/1593569024_20200630220406Tue, 30 Jun 2020 22:04 EDTEmpirical likelihood inference with public-use survey datahttps://projecteuclid.org/euclid.ejs/1593569025<strong>Puying Zhao</strong>, <strong>J. N. K. Rao</strong>, <strong>Changbao Wu</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 1, 2484--2509.</p><p><strong>Abstract:</strong><br/>
Public-use survey data are an important source of information for researchers in social sciences and health studies to build statistical models and make inferences on the target finite population. This paper presents two general inferential tools through the pseudo empirical likelihood and the sample empirical likelihood methods. Theoretical results on point estimation and linear or nonlinear hypothesis tests involving parameters defined through estimating equations are established, and practical issues with the implementation of the proposed methods are discussed. Results from simulation studies and an application to the 2016 General Social Survey dataset of Statistics Canada show that the proposed methods work well under different scenarios. The inferential procedures and theoretical results presented in the paper make the empirical likelihood a practically useful tool for users of complex survey data.
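For intuition, here is the simplest i.i.d. special case of the profiling step that empirical-likelihood methods build on: maximizing the empirical likelihood under a mean constraint, with the Lagrange multiplier found by bisection. The solver and the name `el_weights` are our own illustrative choices, not the paper's pseudo or sample empirical likelihood procedures for complex surveys.

```python
import numpy as np

def el_weights(x, mu, tol=1e-12):
    """Empirical-likelihood weights p_i maximizing prod(p_i) subject to
    sum(p_i) = 1 and sum(p_i * (x_i - mu)) = 0; the Lagrange multiplier
    is found by bisection (valid when min(x) < mu < max(x))."""
    z = np.asarray(x, dtype=float) - mu
    n = z.size
    # lambda must keep every 1 + lambda * z_i strictly positive
    lo = (-1 + 1e-10) / z.max()
    hi = (-1 + 1e-10) / z.min()
    def g(lam):
        return np.sum(z / (1 + lam * z))   # strictly decreasing in lambda
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 1.0 / (n * (1 + lam * z))

rng = np.random.default_rng(6)
x = rng.normal(loc=2.0, size=50)
w = el_weights(x, mu=2.1)   # weights profiling the mean at 2.1
```

The resulting weights sum to one, are strictly positive, and reweight the sample so that its weighted mean equals the hypothesized value.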
</p>projecteuclid.org/euclid.ejs/1593569025_20200630220406Tue, 30 Jun 2020 22:04 EDTStein hypothesis and screening effect for covariances with compact supporthttps://projecteuclid.org/euclid.ejs/1593741821<strong>Emilio Porcu</strong>, <strong>Viktor Zastavnyi</strong>, <strong>Moreno Bevilacqua</strong>, <strong>Xavier Emery</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2510--2528.</p><p><strong>Abstract:</strong><br/>
In spatial statistics, the screening effect historically refers to the situation when the observations located far from the predictand receive a small (ideally, zero) kriging weight. Several factors play a crucial role in this phenomenon: among them, the spatial design, the dimension of the spatial domain where the observations are defined, the mean-square properties of the underlying random field and its covariance function or, equivalently, its spectral density.
The tour de force by Michael L. Stein provides a formal definition of the screening effect and puts emphasis on the Matérn covariance function, advocated as a good covariance function to yield such an effect. Yet, it is often recommended not to use covariance functions with a compact support. This paper shows that some classes of covariance functions being compactly supported allow for a screening effect according to Stein’s definition, in both regular and irregular settings of the spatial design. Further, numerical experiments suggest that the screening effect under a class of compactly supported covariance functions is even stronger than the screening effect under a Matérn model.
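The screening effect is easy to observe numerically. The sketch below computes simple kriging weights under a spherical covariance, which is compactly supported on $[0,a]$, for a regular one-dimensional design; the design and parameter values are our own toy choices in the spirit of, but not taken from, the paper's experiments.

```python
import numpy as np

def spherical(h, a=0.3):
    """Spherical covariance: positive definite, supported on [0, a]."""
    h = np.abs(h)
    c = 1.0 - 1.5 * h / a + 0.5 * (h / a) ** 3
    return np.where(h < a, c, 0.0)

x = np.linspace(0.0, 1.0, 21)     # regular observation sites
x0 = 0.02                         # predictand near the left edge
C = spherical(x[:, None] - x[None, :])               # covariances among observations
c0 = spherical(x - x0)                               # covariances to the predictand
w = np.linalg.solve(C + 1e-10 * np.eye(x.size), c0)  # simple kriging weights
```

Observations far from the predictand lie outside the covariance range and receive essentially zero weight, while the nearest site dominates.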
</p>projecteuclid.org/euclid.ejs/1593741821_20200702220342Thu, 02 Jul 2020 22:03 EDTTesting for local covariate trend effects in volatility modelshttps://projecteuclid.org/euclid.ejs/1594433077<strong>Adriano Zanin Zambom</strong>, <strong>Yulia R. Gel</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2529--2550.</p><p><strong>Abstract:</strong><br/>
With the large amounts of modern financial and econometric data available from disparate informational sources, it becomes increasingly critical to develop inferential tools for the impact of exogenous factors on volatility of financial time series. We develop a new Local Covariate Trend test (LOCOT) for the significance of an exogenous covariate in the autoregressive conditional heteroscedastic volatility model, where the covariate effect can be nonlinear. The new LOCOT statistic is based on an artificial high-dimensional one-way ANOVA where the number of factor levels increases with the sample size. We derive asymptotic properties of the new LOCOT statistic and show its competitive finite sample performance in a broad range of simulation studies. We illustrate the utility of the new testing approach in an application to volatility analysis of three major cryptoassets and their relationship with the prices of gold and the S&P500 index.
</p>projecteuclid.org/euclid.ejs/1594433077_20200710220439Fri, 10 Jul 2020 22:04 EDTDetangling robustness in high dimensions: Composite versus model-averaged estimationhttps://projecteuclid.org/euclid.ejs/1594692172<strong>Jing Zhou</strong>, <strong>Gerda Claeskens</strong>, <strong>Jelena Bradic</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2551--2599.</p><p><strong>Abstract:</strong><br/>
Robust methods, though ubiquitous in practice, are yet to be fully understood in the context of regularized estimation and high dimensions. Even simple questions become challenging very quickly. For example, classical statistical theory identifies equivalence between model-averaged and composite quantile estimation. However, little to nothing is known about such equivalence between methods that encourage sparsity. This paper provides a toolbox to further study robustness in these settings and focuses on prediction. In particular, we study optimally weighted model-averaged as well as composite $l_{1}$-regularized estimation. Optimal weights are determined by minimizing the asymptotic mean squared error. This approach incorporates the effects of regularization, without the assumption of perfect selection, as is often used in practice. Such weights are then optimal for prediction quality. Through an extensive simulation study, we show that no single method systematically outperforms others. We find, however, that model-averaged and composite quantile estimators often outperform least-squares methods, even in the case of Gaussian model noise. A real-data application demonstrates the method’s practical use through the reconstruction of compressed audio signals.
</p>projecteuclid.org/euclid.ejs/1594692172_20200713220255Mon, 13 Jul 2020 22:02 EDTOptimal rates for estimation of two-dimensional totally positive distributionshttps://projecteuclid.org/euclid.ejs/1595037615<strong>Jan-Christian Hütter</strong>, <strong>Cheng Mao</strong>, <strong>Philippe Rigollet</strong>, <strong>Elina Robeva</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2600--2652.</p><p><strong>Abstract:</strong><br/>
We study minimax estimation of two-dimensional totally positive distributions. Such distributions pertain to pairs of strongly positively dependent random variables and appear frequently in statistics and probability. In particular, for distributions with $\beta$-Hölder smooth densities where $\beta\in(0,2)$, we observe polynomially faster minimax rates of estimation when, additionally, the total positivity condition is imposed. Moreover, we demonstrate fast algorithms to compute the proposed estimators and corroborate the theoretical rates of estimation by simulation studies.
</p>projecteuclid.org/euclid.ejs/1595037615_20200717220020Fri, 17 Jul 2020 22:00 EDTConfidence regions and minimax rates in outlier-robust estimation on the probability simplexhttps://projecteuclid.org/euclid.ejs/1595037616<strong>Amir-Hossein Bateni</strong>, <strong>Arnak S. Dalalyan</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2653--2677.</p><p><strong>Abstract:</strong><br/>
We consider the problem of estimating the mean of a distribution supported by the $k$-dimensional probability simplex in the setting where an $\varepsilon$ fraction of observations are subject to adversarial corruption. A simple particular example is the problem of estimating the distribution of a discrete random variable. Assuming that the discrete variable takes $k$ values, the unknown parameter $\boldsymbol{\theta}$ is a $k$-dimensional vector belonging to the probability simplex. We first describe various settings of contamination and discuss the relation between these settings. We then establish minimax rates when the quality of estimation is measured by the total-variation distance, the Hellinger distance, or the $\mathbb{L}^{2}$-distance between two probability measures. We also provide confidence regions for the unknown mean that shrink at the minimax rate. Our analysis reveals that the minimax rates associated to these three distances are all different, but they are all attained by the sample average. Furthermore, we show that the latter is adaptive to the possible sparsity of the unknown vector. Some numerical experiments illustrating our theoretical findings are reported.
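In the uncontaminated special case, the sample average that the abstract identifies as rate-optimal is just the vector of empirical frequencies, and the total-variation distance is half the $\ell_1$ distance. A minimal sketch (function names are ours):

```python
import numpy as np

def tv_distance(p, q):
    """Total-variation distance between distributions on {0, ..., k-1}."""
    return 0.5 * float(np.abs(np.asarray(p) - np.asarray(q)).sum())

def sample_average(obs, k):
    """Empirical frequencies: the mean of one-hot observations, on the simplex."""
    counts = np.bincount(obs, minlength=k)
    return counts / counts.sum()

rng = np.random.default_rng(1)
theta = np.array([0.5, 0.3, 0.2])            # unknown point on the simplex
obs = rng.choice(3, size=20000, p=theta)     # clean (uncontaminated) sample
theta_hat = sample_average(obs, 3)
```

The estimate lies on the simplex by construction and its total-variation error shrinks at the parametric rate in the clean setting; the adversarial contamination analyzed in the paper is beyond this sketch.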
</p>projecteuclid.org/euclid.ejs/1595037616_20200717220020Fri, 17 Jul 2020 22:00 EDTCorrecting for differential recruitment in respondent-driven sampling data using ego-network informationhttps://projecteuclid.org/euclid.ejs/1595037617<strong>Isabelle S. Beaudry</strong>, <strong>Krista J. Gile</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2678--2713.</p><p><strong>Abstract:</strong><br/>
Respondent-driven sampling (RDS) is a sampling method devised to overcome challenges with sampling hard-to-reach human populations. The sampling starts with a limited number of individuals who are asked to recruit a small number of their contacts. Every surveyed individual is subsequently given the same opportunity to recruit additional members of the target population until a pre-established sample size is achieved. The recruitment process consequently implies that the survey respondents are responsible for deciding who enters the study. Most RDS prevalence estimators assume that participants select among their contacts completely at random. The main objective of this work is to correct the inference for departure from this assumption, such as systematic recruitment based on the characteristics of the individuals or based on the nature of relationships. To accomplish this, we introduce three forms of non-random recruitment, provide estimators for these recruitment behaviors and extend three estimators and their associated variance procedures. The proposed methodology is assessed through a simulation study capturing various sampling and network features. Finally, the proposed methods are applied to a public health setting.
</p>projecteuclid.org/euclid.ejs/1595037617_20200717220020Fri, 17 Jul 2020 22:00 EDTBayesian shrinkage towards sharp minimaxityhttps://projecteuclid.org/euclid.ejs/1595404877<strong>Qifan Song</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2714--2741.</p><p><strong>Abstract:</strong><br/>
Shrinkage priors are becoming more and more popular in Bayesian modeling for high dimensional sparse problems due to their computational efficiency. Recent works show that a polynomially decaying prior leads to satisfactory posterior asymptotics under regression models. In the literature, statisticians have investigated how the global shrinkage parameter, i.e., the scale parameter, in a heavy tailed prior affects the posterior contraction. In this work, we explore how the shape of the prior, or more specifically, the polynomial order of the prior tail affects the posterior. We discover that, under the sparse normal means model, the polynomial order does affect the multiplicative constant of the posterior contraction rate. More importantly, if the polynomial order is sufficiently close to 1, it will induce the optimal Bayesian posterior convergence, in the sense that the Bayesian contraction rate is sharply minimax, i.e., not only the order, but also the multiplicative constant of the posterior contraction rate is optimal. The above Bayesian sharp minimaxity holds when the global shrinkage parameter follows a deterministic choice which depends on the unknown sparsity $s$. Therefore, a Beta-prior modeling is further proposed, such that our sharply minimax Bayesian procedure is adaptive to the unknown $s$. Our theoretical discoveries are justified by simulation studies.
</p>projecteuclid.org/euclid.ejs/1595404877_20200722040122Wed, 22 Jul 2020 04:01 EDTGaussian processes with multidimensional distribution inputs via optimal transport and Hilbertian embeddinghttps://projecteuclid.org/euclid.ejs/1595404878<strong>François Bachoc</strong>, <strong>Alexandra Suvorikova</strong>, <strong>David Ginsbourger</strong>, <strong>Jean-Michel Loubes</strong>, <strong>Vladimir Spokoiny</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2742--2772.</p><p><strong>Abstract:</strong><br/>
In this work, we propose a way to construct Gaussian processes indexed by multidimensional distributions. More precisely, we tackle the problem of defining positive definite kernels between multivariate distributions via notions of optimal transport and appealing to Hilbert space embeddings. Besides presenting a characterization of radial positive definite and strictly positive definite kernels on general Hilbert spaces, we investigate the statistical properties of our theoretical and empirical kernels, focusing in particular on consistency as well as the special case of Gaussian distributions. A wide set of applications is presented, both using simulations and implementation with real data.
</p>projecteuclid.org/euclid.ejs/1595404878_20200722040122Wed, 22 Jul 2020 04:01 EDTOn finite exchangeability and conditional independencehttps://projecteuclid.org/euclid.ejs/1595404879<strong>Kayvan Sadeghi</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2773--2797.</p><p><strong>Abstract:</strong><br/>
We study the independence structure of finitely exchangeable distributions over random vectors and random networks. In particular, we provide necessary and sufficient conditions for an exchangeable vector so that its elements are completely independent or completely dependent. We also provide a sufficient condition for an exchangeable vector so that its elements are marginally independent. We then generalize these results and conditions for exchangeable random networks. In this case, it is demonstrated that the situation is more complex. We show that the independence structure of exchangeable random networks lies in one of six regimes that are two-fold dual to one another, represented by undirected and bidirected independence graphs in the graphical model sense, with graphs that are complements of each other. In addition, under certain additional assumptions, we provide necessary and sufficient conditions for the exchangeable network distributions to be faithful to each of these graphs.
</p>projecteuclid.org/euclid.ejs/1595404879_20200722040122Wed, 22 Jul 2020 04:01 EDTConvergence analysis of Tikhonov regularization for non-linear statistical inverse problemshttps://projecteuclid.org/euclid.ejs/1596765613<strong>Abhishake Rastogi</strong>, <strong>Gilles Blanchard</strong>, <strong>Peter Mathé</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2798--2841.</p><p><strong>Abstract:</strong><br/>
We study a non-linear statistical inverse problem, where we observe the noisy image of a quantity through a non-linear operator at some random design points. We consider the widely used Tikhonov regularization (or method of regularization) approach to estimate the quantity for the non-linear ill-posed inverse problem. The estimator is defined as the minimizer of a Tikhonov functional, which is the sum of a data misfit term and a quadratic penalty term. We develop a theoretical analysis for the minimizer of the Tikhonov regularization scheme using the concept of reproducing kernel Hilbert spaces. We discuss optimal rates of convergence for the proposed scheme, uniformly over classes of admissible solutions, defined through appropriate source conditions.
</p>projecteuclid.org/euclid.ejs/1596765613_20200806220019Thu, 06 Aug 2020 22:00 EDTUnbiased Markov chain Monte Carlo for intractable target distributionshttps://projecteuclid.org/euclid.ejs/1596765614<strong>Lawrence Middleton</strong>, <strong>George Deligiannidis</strong>, <strong>Arnaud Doucet</strong>, <strong>Pierre E. Jacob</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2842--2891.</p><p><strong>Abstract:</strong><br/>
Performing numerical integration when the integrand itself cannot be evaluated point-wise is a challenging task that arises in statistical analysis, notably in Bayesian inference for models with intractable likelihood functions. Markov chain Monte Carlo (MCMC) algorithms have been proposed for this setting, such as the pseudo-marginal method for latent variable models and the exchange algorithm for a class of undirected graphical models. As with any MCMC algorithm, the resulting estimators are justified asymptotically in the limit of the number of iterations, but exhibit a bias for any fixed number of iterations due to the Markov chains starting outside of stationarity. This “burn-in” bias is known to complicate the use of parallel processors for MCMC computations. We show how to use coupling techniques to generate unbiased estimators in finite time, building on recent advances for generic MCMC algorithms. We establish the theoretical validity of some of these procedures, by extending existing results to cover the case of polynomially ergodic Markov chains. The efficiency of the proposed estimators is compared with that of standard MCMC estimators, with theoretical arguments and numerical experiments including state space models and Ising models.
</p>projecteuclid.org/euclid.ejs/1596765614_20200806220019Thu, 06 Aug 2020 22:00 EDTConsistent estimation of high-dimensional factor models when the factor number is over-estimatedhttps://projecteuclid.org/euclid.ejs/1597197614<strong>Matteo Barigozzi</strong>, <strong>Haeran Cho</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2892--2921.</p><p><strong>Abstract:</strong><br/>
A high-dimensional $r$-factor model for an $n$-dimensional vector time series is characterised by the presence of a large eigengap (increasing with $n$) between the $r$-th and the $(r+1)$-th largest eigenvalues of the covariance matrix. Consequently, Principal Component (PC) analysis is the most popular estimation method for factor models and its consistency, when $r$ is correctly estimated, is well-established in the literature. However, popular factor number estimators often suffer from the lack of an obvious eigengap in empirical eigenvalues and tend to over-estimate $r$ due, for example, to the existence of non-pervasive factors affecting only a subset of the series. We show that the errors in the PC estimators resulting from the over-estimation of $r$ are non-negligible, which in turn leads to the violation of the conditions required for factor-based large covariance estimation. To remedy this, we propose new estimators of the factor model based on scaling the entries of the sample eigenvectors. We show both theoretically and numerically that the proposed estimators successfully control for the over-estimation error, and investigate their performance when applied to risk minimisation of a portfolio of financial time series.
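The eigengap that drives PC estimation can be seen directly in simulation. The sketch below generates an $r$-factor panel with pervasive factors and inspects the sample covariance spectrum; it is a generic illustration of the eigengap phenomenon, not the authors' scaled-eigenvector estimator, and all parameter values are our own.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, r = 100, 500, 2                     # cross-section size, time span, factors
Lam = rng.normal(size=(n, r))             # factor loadings
F = rng.normal(size=(T, r))               # latent factors
X = F @ Lam.T + rng.normal(size=(T, n))   # observed panel: common + idiosyncratic

S = np.cov(X, rowvar=False)                   # n x n sample covariance
evals = np.sort(np.linalg.eigvalsh(S))[::-1]  # eigenvalues, descending
eigengap = evals[r - 1] / evals[r]            # large when r factors are pervasive
```

With pervasive factors the top $r$ eigenvalues grow with $n$ while the rest stay bounded, so the ratio at position $r$ is large; non-pervasive factors would blur exactly this gap.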
</p>projecteuclid.org/euclid.ejs/1597197614_20200811220021Tue, 11 Aug 2020 22:00 EDTJoint estimation for SDE driven by locally stable Lévy processeshttps://projecteuclid.org/euclid.ejs/1597197615<strong>Emmanuelle Clément</strong>, <strong>Arnaud Gloter</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2922--2956.</p><p><strong>Abstract:</strong><br/>
Considering a class of stochastic differential equations driven by a locally stable process, we address the joint parametric estimation, based on high frequency observations of the process on a fixed time interval, of the drift coefficient, the scale coefficient and the jump activity of the process. Extending the methodology proposed in [6], where the jump activity was assumed to be known, we obtain two different rates of convergence in estimating simultaneously the scale parameter and the jump activity, depending on the scale coefficient. If the scale coefficient is multiplicative: $a(x,\sigma )=\sigma \overline{a}(x)$, the joint estimation of the scale coefficient and the jump activity behaves as for the translated stable process studied in [5] and the rate of convergence of our estimators is non-diagonal. In the non multiplicative case, the results are different and we obtain a diagonal and faster rate of convergence which coincides with the one obtained in estimating marginally each parameter. In both cases, the estimation method is illustrated by numerical simulations showing that our estimators are rather easy to implement.
</p>projecteuclid.org/euclid.ejs/1597197615_20200811220021Tue, 11 Aug 2020 22:00 EDTNonparametric estimation of the ability density in the Mixed-Effect Rasch Modelhttps://projecteuclid.org/euclid.ejs/1597284417<strong>Johanna Kappus</strong>, <strong>Friedrich Liese</strong>, <strong>Alexander Meister</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2957--2987.</p><p><strong>Abstract:</strong><br/>
The Rasch model is widely used in the field of psychometrics when $n$ persons under test answer $m$ questions and the score, which describes the correctness of the answers, is given by a binary $n\times m$-matrix. We consider the Mixed-Effect Rasch Model, in which the persons are chosen randomly from a huge population. The goal is to estimate the ability density of this population under nonparametric constraints, which turns out to be a statistical linear inverse problem with an unknown but estimable operator. Based on our previous result on asymptotic equivalence to a two-layer Gaussian model, we construct an estimation procedure and study its asymptotic optimality properties as $n$ tends to infinity, as does $m$, but moderately with respect to $n$. Moreover numerical simulations are provided.
</p>projecteuclid.org/euclid.ejs/1597284417_20200812220703Wed, 12 Aug 2020 22:07 EDTFrom Gauss to Kolmogorov: Localized measures of complexity for ellipseshttps://projecteuclid.org/euclid.ejs/1597456814<strong>Yuting Wei</strong>, <strong>Billy Fang</strong>, <strong>Martin J. Wainwright</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 2988--3031.</p><p><strong>Abstract:</strong><br/>
The Gaussian width is a fundamental quantity in probability, statistics and geometry, known to underlie the intrinsic difficulty of estimation and hypothesis testing. In this work, we show how the Gaussian width, when localized to any given point of an ellipse, can be controlled by the Kolmogorov width of a set similarly localized. Among other consequences, this connection, when coupled with a previous result due to Chatterjee, leads to a tight characterization of the estimation error of least-squares regression as a function of the true regression vector within the ellipse. This characterization reveals that the rate of error decay varies substantially as a function of location: as a concrete example, in Sobolev ellipses of smoothness $\alpha $, we exhibit rates that vary from $(\sigma ^{2})^{\frac{2\alpha }{2\alpha +1}}$, corresponding to the classical global rate, to the faster rate $(\sigma ^{2})^{\frac{4\alpha }{4\alpha +1}}$. We also show how the local Kolmogorov width can be related to local metric entropy.
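As background for readers new to the quantity: the (global) Gaussian width of an ellipse $\{x:\sum_{i}x_{i}^{2}/a_{i}^{2}\le 1\}$ is $\mathbb{E}\sup_{x}\langle g,x\rangle$ with $g\sim N(0,I)$, and the supremum has the closed form $(\sum_{i}a_{i}^{2}g_{i}^{2})^{1/2}$, which makes Monte Carlo evaluation trivial. This is generic background, not the localized widths studied in the paper.

```python
import numpy as np

# Monte Carlo estimate of the (global) Gaussian width
#   w(E) = E[ sup_{x in E} <g, x> ],  g ~ N(0, I),
# of the ellipse E = { x : sum_i x_i^2 / a_i^2 <= 1 }, using the closed form
#   sup_{x in E} <g, x> = sqrt( sum_i a_i^2 g_i^2 ).
rng = np.random.default_rng(7)
a = 1.0 / np.arange(1, 51)                # polynomially decaying semi-axes
g = rng.normal(size=(20000, a.size))
width = float(np.mean(np.sqrt((g ** 2) @ (a ** 2))))
```

By Jensen's inequality the width is bounded above by $(\sum_i a_i^2)^{1/2}$, and below by the width of the longest semi-axis alone.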
</p>projecteuclid.org/euclid.ejs/1597456814_20200814220021Fri, 14 Aug 2020 22:00 EDTCorrecting an estimator of a multivariate monotone function with isotonic regressionhttps://projecteuclid.org/euclid.ejs/1597456815<strong>Ted Westling</strong>, <strong>Mark J. van der Laan</strong>, <strong>Marco Carone</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3032--3069.</p><p><strong>Abstract:</strong><br/>
In many problems, a sensible estimator of a possibly multivariate monotone function may fail to be monotone. We study the correction of such an estimator obtained via projection onto the space of functions monotone over a finite grid in the domain. We demonstrate that this corrected estimator has no worse supremal estimation error than the initial estimator, and that analogously corrected confidence bands contain the true function whenever the initial bands do, at no loss to band width. Additionally, we demonstrate that the corrected estimator is asymptotically equivalent to the initial estimator if the initial estimator satisfies a stochastic equicontinuity condition and the true function is Lipschitz and strictly monotone. We provide simple sufficient conditions in the special case that the initial estimator is asymptotically linear, and illustrate the use of these results for estimation of a G-computed distribution function. Our stochastic equicontinuity condition is weaker than standard uniform stochastic equicontinuity, which has been required for alternative correction procedures. This allows us to apply our results to the bivariate correction of the local linear estimator of a conditional distribution function known to be monotone in its conditioning argument. Our experiments suggest that the projection step can yield significant practical improvements.
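In one dimension, the projection in question is computed by the pool-adjacent-violators algorithm (PAVA); the multivariate grid projection studied in the paper generalizes this step. A self-contained sketch (implementation is ours):

```python
import numpy as np

def pava(y):
    """Pool-adjacent-violators: the least-squares projection of y onto
    the cone of nondecreasing vectors."""
    vals, wts = [], []                        # block means and block sizes
    for v in np.asarray(y, dtype=float):
        vals.append(float(v)); wts.append(1)
        # merge adjacent blocks while monotonicity is violated
        while len(vals) > 1 and vals[-2] > vals[-1]:
            total = wts[-2] + wts[-1]
            vals[-2] = (wts[-2] * vals[-2] + wts[-1] * vals[-1]) / total
            wts[-2] = total
            vals.pop(); wts.pop()
    return np.repeat(vals, wts)

# A monotone truth, a noisy (non-monotone) initial estimate, its projection.
x = np.linspace(0.0, 1.0, 50)
truth = x ** 2
rng = np.random.default_rng(3)
initial = truth + rng.normal(scale=0.05, size=x.size)
corrected = pava(initial)
```

Consistent with the abstract's claim, the projected fit is monotone and its maximal error against the monotone truth never exceeds that of the initial estimate.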
</p>projecteuclid.org/euclid.ejs/1597456815_20200814220021Fri, 14 Aug 2020 22:00 EDTNonparametric distributed learning under general designshttps://projecteuclid.org/euclid.ejs/1597975224<strong>Meimei Liu</strong>, <strong>Zuofeng Shang</strong>, <strong>Guang Cheng</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3070--3102.</p><p><strong>Abstract:</strong><br/>
This paper focuses on distributed learning in the nonparametric regression framework. With sufficient computational resources, the efficiency of distributed algorithms improves as the number of machines increases. We aim to analyze how the number of machines affects statistical optimality. We establish an upper bound for the number of machines to achieve statistical minimax in two settings: nonparametric estimation and hypothesis testing. Our framework is general compared with existing work. We build a unified framework for distributed inference in various regression problems, including thin-plate splines and additive regression under random design: univariate, multivariate, and diverging-dimensional designs. The main tool to achieve this goal is a tight bound of an empirical process by introducing the Green function for equivalent kernels. Thorough numerical studies back the theoretical findings.
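The divide-and-conquer idea can be sketched with kernel ridge regression: each machine fits its own shard and the predictions are averaged. This is a hedged toy version (Gaussian kernel, hand-picked bandwidth and ridge, helper names ours), not the paper's general framework.

```python
import numpy as np

def krr_fit_predict(xtr, ytr, xte, lam=1e-4, bw=0.05):
    """Kernel ridge regression with a Gaussian kernel (toy, 1-d inputs)."""
    K = np.exp(-(xtr[:, None] - xtr[None, :]) ** 2 / (2 * bw ** 2))
    alpha = np.linalg.solve(K + lam * xtr.size * np.eye(xtr.size), ytr)
    Kte = np.exp(-(xte[:, None] - xtr[None, :]) ** 2 / (2 * bw ** 2))
    return Kte @ alpha

rng = np.random.default_rng(8)
n, m = 2000, 10                               # total sample size, machines
x = rng.uniform(size=n)
y = np.sin(2 * np.pi * x) + 0.3 * rng.normal(size=n)
xgrid = np.linspace(0.05, 0.95, 50)

# Divide and conquer: each machine fits its shard; predictions are averaged.
shards = np.array_split(rng.permutation(n), m)
fhat = np.mean([krr_fit_predict(x[s], y[s], xgrid) for s in shards], axis=0)
```

Averaging over shards reduces the variance of each local fit while each machine only ever inverts an $(n/m)\times(n/m)$ kernel matrix; the paper's question is how large $m$ can grow before this averaging loses statistical optimality.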
</p>projecteuclid.org/euclid.ejs/1597975224_20200820220032Thu, 20 Aug 2020 22:00 EDTMethod of moments estimators for the extremal index of a stationary time serieshttps://projecteuclid.org/euclid.ejs/1597975225<strong>Axel Bücher</strong>, <strong>Tobias Jennessen</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3103--3156.</p><p><strong>Abstract:</strong><br/>
The extremal index $\theta $, a number in the interval $[0,1]$, is known to be a measure of primary importance for analyzing the extremes of a stationary time series. New rank-based estimators for $\theta $ are proposed which rely on the construction of approximate samples from the exponential distribution with parameter $\theta $, which are then fitted via the method of moments. The new estimators are analyzed both theoretically as well as empirically through a large-scale simulation study. In specific scenarios, in particular for time series models with $\theta \approx 1$, they are found to be superior to recent competitors from the literature.
</p>projecteuclid.org/euclid.ejs/1597975225_20200820220032Thu, 20 Aug 2020 22:00 EDTSpectral estimation for non-linear long range dependent discrete time trawl processeshttps://projecteuclid.org/euclid.ejs/1597996811<strong>Paul Doukhan</strong>, <strong>François Roueff</strong>, <strong>Joseph Rynkiewicz</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3157--3191.</p><p><strong>Abstract:</strong><br/>
Discrete time trawl processes constitute a large class of time series parameterized by a trawl sequence $(a_{j})_{j\in \mathbb{N}}$ and defined through a sequence of independent and identically distributed (i.i.d.) copies of a continuous time process $(\gamma (t))_{t\in \mathbb{R}}$ called the seed process. They provide a general framework for modeling linear or non-linear long range dependent time series. We investigate the spectral estimation, either pointwise or broadband, of long range dependent discrete-time trawl processes. The difficulty arising from the variety of seed processes and of trawl sequences is twofold. First, the spectral density may take different forms, often including smooth additive correction terms. Second, trawl processes with similar spectral densities may exhibit very different statistical behaviors. We prove the consistency of our estimators under very general conditions and we show that a wide class of trawl processes satisfy them. This is done in particular by introducing a weighted weak dependence index that can be of independent interest. The broadband spectral estimator includes an estimator of the long memory parameter. We complete this work with numerical experiments to evaluate the finite sample size performance of this estimator for various integer valued discrete time trawl processes.
</p>projecteuclid.org/euclid.ejs/1597996811_20200821040028Fri, 21 Aug 2020 04:00 EDTAsymptotic properties for the parameter estimation in Ornstein-Uhlenbeck process with discrete observationshttps://projecteuclid.org/euclid.ejs/1599271584<strong>Hui Jiang</strong>, <strong>Hui Liu</strong>, <strong>Youzhou Zhou</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3192--3229.</p><p><strong>Abstract:</strong><br/>
In this paper, under discrete observations, we study Cramér-type moderate deviations (extended central limit theorem) for parameter estimation in the Ornstein-Uhlenbeck process. Our results contain both stationary and explosive cases. For applications, we propose test statistics which can be used to construct rejection regions in the hypothesis testing for the drift coefficient, and the corresponding probability of type II error tends to zero exponentially. A simulation study shows that our test statistics have good finite-sample performances both in size and power. The main methods include the deviation inequalities for multiple Wiener-Itô integrals, as well as the asymptotic analysis techniques.
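A minimal simulation of drift estimation from discrete observations, using the exact AR(1) transition of the stationary OU process $dX_t=-\theta X_t\,dt+\sigma\,dW_t$; the estimator below is the standard least-squares one, and the moderate-deviation analysis of the paper is beyond this sketch.

```python
import numpy as np

rng = np.random.default_rng(4)
theta, sigma, delta, n = 1.0, 0.5, 0.02, 50000
phi = np.exp(-theta * delta)                       # exact AR(1) coefficient
s = sigma * np.sqrt((1 - phi ** 2) / (2 * theta))  # conditional std deviation
eps = rng.normal(size=n)
x = np.empty(n)
x[0] = rng.normal(scale=sigma / np.sqrt(2 * theta))  # stationary start
for i in range(1, n):
    x[i] = phi * x[i - 1] + s * eps[i]

# Least-squares AR(1) slope, mapped back to the drift parameter theta:
phi_hat = (x[:-1] @ x[1:]) / (x[:-1] @ x[:-1])
theta_hat = -np.log(phi_hat) / delta
```

The drift estimate concentrates around the truth at rate governed by the observation horizon $n\delta$, which is the regime where large- and moderate-deviation asymptotics become informative.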
</p>projecteuclid.org/euclid.ejs/1599271584_20200904220631Fri, 04 Sep 2020 22:06 EDTVertex nomination, consistent estimation, and adversarial modificationhttps://projecteuclid.org/euclid.ejs/1599552374<strong>Joshua Agterberg</strong>, <strong>Youngser Park</strong>, <strong>Jonathan Larson</strong>, <strong>Christopher White</strong>, <strong>Carey E. Priebe</strong>, <strong>Vince Lyzinski</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3230--3267.</p><p><strong>Abstract:</strong><br/>
Given a pair of graphs $G_{1}$ and $G_{2}$ and a vertex set of interest in $G_{1}$, the vertex nomination (VN) problem seeks to find the corresponding vertices of interest in $G_{2}$ (if they exist) and produce a rank list of the vertices of $G_{2}$, ideally with the vertices of interest concentrated at the top of the list. In this paper, we define and derive the analogue of Bayes optimality for VN with multiple vertices of interest, and we define the notion of maximal consistency classes in vertex nomination. This theory forms the foundation for a novel VN adversarial contamination model, and we demonstrate with real and simulated data that there are VN schemes that perform effectively in the uncontaminated setting but whose performance is adversely impacted by adversarial network contamination. We further define a network regularization method for mitigating the impact of the adversarial contamination, and we demonstrate the effectiveness of regularization on both real and synthetic data.
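A toy spectral nomination scheme, purely to illustrate the ranking idea (this is a hypothetical single-graph simplification, not the authors' method): embed a stochastic block model graph by adjacency spectral embedding and rank non-seed vertices by distance to the centroid of a few known vertices of interest.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 200
z = np.repeat([0, 1], n // 2)                     # two balanced blocks
P = np.where(z[:, None] == z[None, :], 0.5, 0.1)  # SBM edge probabilities
A = rng.random((n, n)) < P
A = np.triu(A, 1)
A = (A + A.T).astype(float)                       # symmetric, hollow adjacency

# adjacency spectral embedding into the top-2 eigenspace
vals, vecs = np.linalg.eigh(A)
top = np.argsort(vals)[-2:]
X = vecs[:, top] * np.sqrt(vals[top])

# toy nomination: seeds are known block-0 vertices of interest; rank all other
# vertices by embedding distance to the seed centroid (closest first)
seeds = np.arange(10)                             # hypothetical vertices of interest
centroid = X[seeds].mean(axis=0)
others = np.setdiff1d(np.arange(n), seeds)
ranked = others[np.argsort(np.linalg.norm(X[others] - centroid, axis=1))]
top50_frac = np.mean(z[ranked[:50]] == 0)
```

With well-separated blocks the top of the rank list should be dominated by block-0 vertices; an adversary perturbing edges near the seeds would degrade exactly this concentration, which is the phenomenon the paper's contamination model formalizes.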
</p>projecteuclid.org/euclid.ejs/1599552374_20200908040625Tue, 08 Sep 2020 04:06 EDTThe local partial autocorrelation function and some applicationshttps://projecteuclid.org/euclid.ejs/1599703300<strong>Rebecca Killick</strong>, <strong>Marina I. Knight</strong>, <strong>Guy P. Nason</strong>, <strong>Idris A. Eckley</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3268--3314.</p><p><strong>Abstract:</strong><br/>
The classical regular and partial autocorrelation functions are powerful tools for stationary time series modelling and analysis. However, it is increasingly recognized that many time series are not stationary and the use of classical global autocorrelations can give misleading answers. This article introduces two estimators of the local partial autocorrelation function and establishes their asymptotic properties. The article then illustrates the use of these new estimators on both simulated and real time series. The examples clearly demonstrate the strong practical benefits of local estimators for time series that exhibit nonstationarities.
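A minimal sketch of the underlying idea (the windowing, function names, and parameters are hypothetical, not the paper's estimators): a naive "local" PACF computes the classical partial autocorrelations, via the Durbin-Levinson recursion, on a window centred at the time point of interest.

```python
import numpy as np

def pacf_durbin_levinson(x, max_lag):
    """Partial autocorrelations phi_{kk}, k = 1..max_lag, via Durbin-Levinson."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    acf = np.array([np.dot(x[:n - k], x[k:]) / np.dot(x, x)
                    for k in range(max_lag + 1)])
    phi = np.zeros((max_lag + 1, max_lag + 1))
    pacf = np.zeros(max_lag)
    phi[1, 1] = acf[1]
    pacf[0] = acf[1]
    for k in range(2, max_lag + 1):
        num = acf[k] - np.dot(phi[k - 1, 1:k], acf[1:k][::-1])
        den = 1.0 - np.dot(phi[k - 1, 1:k], acf[1:k])
        phi[k, k] = num / den
        pacf[k - 1] = phi[k, k]
        for j in range(1, k):
            phi[k, j] = phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
    return pacf

def local_pacf(x, t, window, max_lag):
    """Naive local PACF: the classical PACF on a window centred at time t."""
    half = window // 2
    return pacf_durbin_levinson(x[max(0, t - half): t + half], max_lag)

# demo on a stationary AR(1): the PACF should cut off after lag 1
rng = np.random.default_rng(1)
n, ar = 5000, 0.6
x = np.empty(n)
x[0] = 0.0
for i in range(n - 1):
    x[i + 1] = ar * x[i] + rng.normal()
pacf_vals = local_pacf(x, n // 2, 2000, 3)
```

For genuinely nonstationary series, sliding this window across time traces how the partial autocorrelation structure evolves, which is the quantity the article's estimators target with proper asymptotic theory.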
</p>projecteuclid.org/euclid.ejs/1599703300_20200909220148Wed, 09 Sep 2020 22:01 EDTStatistical analysis of sparse approximate factor modelshttps://projecteuclid.org/euclid.ejs/1599789756<strong>Benjamin Poignard</strong>, <strong>Yoshikazu Terada</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3315--3365.</p><p><strong>Abstract:</strong><br/>
We consider the problem of estimating sparse approximate factor models. In a first step, we jointly estimate the factor loading parameters and the error (or idiosyncratic) covariance matrix by the Gaussian quasi-maximum likelihood (QML) method. Conditional on these first-step estimators, and using the SCAD, MCP and Lasso regularisers, we obtain a sparse error covariance matrix based on a Gaussian QML loss and, as an alternative criterion, a least squares loss function. Under suitable regularity conditions, we derive error bounds for the regularised idiosyncratic covariance matrix under both the Gaussian QML and least squares losses. Moreover, we establish the support recovery property, including the case where the regulariser is non-convex. These theoretical results are supported by empirical studies.
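A rough two-step sketch in the spirit of the abstract (PCA stands in for the Gaussian QML first step, only the Lasso-style soft-thresholding regulariser is shown, and all dimensions, the penalty level, and the sparsity pattern are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
n, p, k = 2000, 30, 2
Lam = rng.normal(size=(p, k))                  # true factor loadings
F = rng.normal(size=(n, k))                    # factors
Psi = np.eye(p)                                # sparse idiosyncratic covariance
Psi[0, 1] = Psi[1, 0] = 0.6
Psi[2, 3] = Psi[3, 2] = 0.6
E = rng.multivariate_normal(np.zeros(p), Psi, size=n)
X = F @ Lam.T + E

# step 1: estimate the common component by PCA (a stand-in for the QML step)
S = np.cov(X, rowvar=False)
vals, vecs = np.linalg.eigh(S)
idx = np.argsort(vals)[-k:]
common = vecs[:, idx] * np.sqrt(vals[idx])     # estimated loadings, up to rotation
resid_cov = S - common @ common.T              # raw idiosyncratic estimate

# step 2: sparsify the off-diagonal by soft thresholding (Lasso-type penalty)
lam = 0.15
off = resid_cov - np.diag(np.diag(resid_cov))
off = np.sign(off) * np.maximum(np.abs(off) - lam, 0.0)
Psi_hat = np.diag(np.diag(resid_cov)) + off
nonzero_frac = np.mean(off[~np.eye(p, dtype=bool)] != 0)
```

Soft thresholding corresponds to the Lasso; the SCAD and MCP penalties from the paper would replace the shrinkage map with their non-convex counterparts, reducing bias on large entries while keeping small entries at zero.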
</p>projecteuclid.org/euclid.ejs/1599789756_20200910220248Thu, 10 Sep 2020 22:02 EDTApproximation of Bayesian models for time-to-event datahttps://projecteuclid.org/euclid.ejs/1599811211<strong>Marta Catalano</strong>, <strong>Antonio Lijoi</strong>, <strong>Igor Prünster</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3366--3395.</p><p><strong>Abstract:</strong><br/>
Random measures are the key ingredient for effective nonparametric Bayesian modeling of time-to-event data. This paper focuses on priors for the hazard rate function, a popular choice being the kernel mixture with respect to a gamma random measure. Sampling schemes are usually based on approximations of the underlying random measure, both a priori and conditionally on the data. Our main goal is the quantification of such approximation errors through the Wasserstein distance. Though the approximations are easy to simulate, the Wasserstein distance is generally difficult to evaluate, making tractable and informative bounds essential. Here we accomplish this task on the wider class of completely random measures, yielding a measure of discrepancy between many noteworthy random measures, including the gamma, generalized gamma and beta families. By specializing these results to gamma kernel mixtures, we obtain upper and lower bounds for the Wasserstein distance between hazard rates, cumulative hazard rates and survival functions.
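As a numerical aside, not the paper's bounds (the gamma parameters and sample sizes are hypothetical): in one dimension the Wasserstein-1 distance between two empirical measures with equal sample sizes reduces to the mean absolute difference of the sorted samples, because the optimal coupling is the monotone (quantile) one.

```python
import numpy as np

def wasserstein1_empirical(x, y):
    """W1 between two equal-size empirical measures on the line:
    the mean absolute difference of order statistics (quantile coupling)."""
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

rng = np.random.default_rng(3)
n = 50000
a = rng.gamma(shape=2.0, scale=1.0, size=n)
b = rng.gamma(shape=2.0, scale=1.2, size=n)
w = wasserstein1_empirical(a, b)
```

For these scale-family gammas the quantile coupling is $Y=1.2X$, so the true distance is $W_1=0.2\,\mathbb{E}[X]=0.4$, and the empirical value should be close to it; for the random measures in the paper, such closed forms are unavailable and bounds take their place.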
</p>projecteuclid.org/euclid.ejs/1599811211_20200911040019Fri, 11 Sep 2020 04:00 EDTSmoothed residual stopping for statistical inverse problems via truncated SVD estimationhttps://projecteuclid.org/euclid.ejs/1600675506<strong>Bernhard Stankewitz</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3396--3428.</p><p><strong>Abstract:</strong><br/>
This work examines under what circumstances adaptivity for truncated SVD estimation can be achieved by an early stopping rule based on the smoothed residuals $\|(AA^{\top })^{\alpha /2}(Y-A\widehat{\mu }^{(m)})\|^{2}$. Lower and upper bounds for the risk are derived, which show that moderate smoothing of the residuals can be used to adapt over classes of signals with varying smoothness, while oversmoothing yields suboptimal convergence rates. The range of smoothness classes for which adaptation is possible can be controlled via $\alpha $. The theoretical results are illustrated by Monte Carlo simulations.
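A schematic sequence-space version of the rule (dimensions, decay rates, the value of $\alpha$, and the threshold are all hypothetical): in the SVD basis the smoothed residual of the $m$-term truncated estimator is the $(AA^{\top})^{\alpha}$-weighted energy of the discarded coefficients, and stopping occurs once it falls below the expected smoothed noise level.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 500
sv = np.arange(1, d + 1) ** -0.5        # singular values of A (mildly ill-posed)
mu = np.arange(1, d + 1) ** -1.0        # smooth signal in the SVD basis
noise = 0.05
y = sv * mu + noise * rng.normal(size=d)  # sequence-space data Y = A mu + eps

alpha = 0.5
w = sv ** (2 * alpha)                   # weights of (A A^T)^{alpha/2}, squared

# the m-term truncated SVD keeps components i < m, so its residual is y on the
# discarded components; smoothed residual at m is sum_{i >= m} w_i y_i^2
sq = w * y**2
smoothed_resid = np.concatenate(([sq.sum()], sq.sum() - np.cumsum(sq)))

kappa = noise**2 * w.sum()              # expected smoothed pure-noise level
m_stop = int(np.argmax(smoothed_resid <= kappa))  # first m below threshold
mu_hat = np.where(np.arange(d) < m_stop, y / sv, 0.0)
err = np.sum((mu_hat - mu) ** 2)
```

Smaller $\alpha$ down-weights the well-conditioned low frequencies less aggressively, which is the lever the paper uses to control the range of smoothness classes over which this stopping rule adapts.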
</p>projecteuclid.org/euclid.ejs/1600675506_20200921040530Mon, 21 Sep 2020 04:05 EDTHigh dimensional classification for spatially dependent data with application to neuroimaginghttps://projecteuclid.org/euclid.ejs/1601085758<strong>Yingjie Li</strong>, <strong>Liangliang Zhang</strong>, <strong>Tapabrata Maiti</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3429--3486.</p><p><strong>Abstract:</strong><br/>
Discriminating patients with Alzheimer’s disease (AD) from healthy subjects is a crucial task in Alzheimer’s disease research. The task can potentially be achieved by linear discriminant analysis (LDA), one of the most classical and popular classification techniques. However, the classification problem becomes challenging for LDA because of the high dimensionality and the spatial dependency of brain imaging data. To address these challenges, researchers have proposed various ways to generalize LDA to the high-dimensional context in recent years, but these methods have not reached a consensus on how to incorporate spatially dependent structure. In light of the current needs and limitations, we propose a new classification method, named Penalized Maximum Likelihood Estimation LDA (PMLE-LDA). The proposed method uses a Matérn covariance function to describe the spatial correlation of brain regions, while the PMLE is designed to model the sparsity of high-dimensional features. The spatial location information is used to address the singularity of the covariance, and a tapering technique is introduced to reduce the computational burden. We show in theory that the proposed method not only provides consistent parameter estimation and feature selection, but also yields an asymptotically optimal classifier for high-dimensional data with the specified spatially dependent structure. Finally, the method is validated through simulations and an application to ADNI data for classifying Alzheimer’s patients.
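A toy plug-in version of the ingredients, not PMLE-LDA itself (the voxel locations, the Matérn smoothness choice $\nu=1/2$, and the sparse signal are all hypothetical): spatially correlated Gaussian features with a Matérn covariance, classified by the Fisher direction $\Sigma^{-1}\delta$ with the covariance treated as known.

```python
import numpy as np

def matern_half_cov(coords, sigma2=1.0, rho=1.0):
    """Matérn covariance with smoothness nu = 1/2, i.e. the exponential kernel."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / rho)

rng = np.random.default_rng(4)
p, n = 64, 400
coords = rng.uniform(0, 10, size=(p, 2))        # hypothetical voxel locations
Sigma = matern_half_cov(coords, rho=2.0)
L = np.linalg.cholesky(Sigma + 1e-8 * np.eye(p))
delta = np.zeros(p)
delta[:5] = 4.0                                 # sparse mean difference (AD vs healthy)
X0 = (L @ rng.normal(size=(p, n))).T            # healthy class
X1 = (L @ rng.normal(size=(p, n))).T + delta    # disease class

# plug-in LDA with the known spatial covariance: direction w = Sigma^{-1} delta
w = np.linalg.solve(Sigma, delta)
thresh = delta @ w / 2.0
acc = 0.5 * (np.mean(X0 @ w < thresh) + np.mean(X1 @ w > thresh))
```

PMLE-LDA replaces the known $\Sigma$ and $\delta$ with penalized maximum likelihood estimates (with tapering for computation), which is where the high-dimensional difficulty and the paper's theory enter.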
</p>projecteuclid.org/euclid.ejs/1601085758_20200925220248Fri, 25 Sep 2020 22:02 EDTIs distribution-free inference possible for binary regression?https://projecteuclid.org/euclid.ejs/1601085759<strong>Rina Foygel Barber</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 14, Number 2, 3487--3524.</p><p><strong>Abstract:</strong><br/>
For a regression problem with a binary label response, we examine the problem of constructing confidence intervals for the label probability conditional on the features. In a setting where we do not have any information about the underlying distribution, we would ideally like to provide confidence intervals that are distribution-free—that is, valid with no assumptions on the distribution of the data. Our results establish an explicit lower bound on the length of any distribution-free confidence interval, and construct a procedure that can approximately achieve this length. In particular, this lower bound is independent of the sample size and holds for all distributions with no point masses, meaning that it is not possible for any distribution-free procedure to be adaptive with respect to any type of special structure in the distribution.
</p>projecteuclid.org/euclid.ejs/1601085759_20200925220248Fri, 25 Sep 2020 22:02 EDT