Electronic Journal of Statistics Articles (Project Euclid)
http://projecteuclid.org/euclid.ejs
The latest articles from Electronic Journal of Statistics on Project Euclid, a site for mathematics and statistics resources.
Language: en-us
Copyright 2010 Cornell University Library
Contact: Euclid-L@cornell.edu (Project Euclid Team)
Published: Thu, 05 Aug 2010 15:41 EDT
Last updated: Fri, 03 Jun 2011 09:20 EDT
Logo: http://projecteuclid.org/collection/euclid/images/logo_linking_100.gif (Project Euclid)
http://projecteuclid.org/
The bias and skewness of M-estimators in regression
http://projecteuclid.org/euclid.ejs/1262876992
<strong>Christopher Withers</strong>, <strong>Saralees Nadarajah</strong><p><strong>Source: </strong>Electron. J. Statist., Volume 4, 1--14.</p><p><strong>Abstract:</strong><br/>
We consider M estimation of a regression model with a nuisance parameter and a vector of other parameters. The unknown distribution of the residuals is not assumed to be normal or symmetric. Simple and easily estimated formulas are given for the dominant terms of the bias and skewness of the parameter estimates. For the linear model these are proportional to the skewness of the ‘independent’ variables. For a nonlinear model, its linear component plays the role of these independent variables, and a second term must be added proportional to the covariance of its linear and quadratic components. For the least squares estimate with normal errors this term was derived by Box [1]. We also consider the effect of a large number of parameters, and the case of random independent variables.
</p>
Published: Thu, 05 Aug 2010 15:41 EDT

Empirical Bayes analysis of spike and slab posterior distributions
https://projecteuclid.org/euclid.ejs/1544238109
<strong>Ismaël Castillo</strong>, <strong>Romain Mismer</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 12, Number 2, 3953--4001.</p><p><strong>Abstract:</strong><br/>
In the sparse normal means model, convergence of the Bayesian posterior distribution associated to spike and slab prior distributions is considered. The key sparsity hyperparameter is calibrated via marginal maximum likelihood empirical Bayes. The plug-in posterior squared $L^{2}$-norm is shown to converge at the minimax rate for the Euclidean norm for appropriate choices of spike and slab distributions. Possible choices include the standard spike and slab with heavy-tailed slab, and the spike and slab LASSO of Ročková and George with heavy-tailed slab. Surprisingly, the popular Laplace slab is shown to lead to a suboptimal rate for the empirical Bayes posterior itself. This provides a striking example where convergence of aspects of the empirical Bayes posterior, such as the posterior mean or median, does not entail convergence of the complete empirical Bayes posterior itself.
</p>
Published: Fri, 21 Dec 2018 22:11 EST

Analysis of a mode clustering diagram
https://projecteuclid.org/euclid.ejs/1545123625
<strong>Isabella Verdinelli</strong>, <strong>Larry Wasserman</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 12, Number 2, 4288--4312.</p><p><strong>Abstract:</strong><br/>
Mode-based clustering methods define clusters in terms of the modes of a density estimate. The most common mode-based method is mean shift clustering which defines clusters to be the basins of attraction of the modes. Specifically, the gradient of the density defines a flow which is estimated using a gradient ascent algorithm. Rodriguez and Laio (2014) introduced a new method that is faster and simpler than mean shift clustering. Furthermore, they define a clustering diagram that provides a simple, two-dimensional summary of the clustering information. We study the statistical properties of this diagram and we propose some improvements and extensions. In particular, we show a connection between the diagram and robust linear regression.
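For readers unfamiliar with the mean shift baseline the abstract contrasts with, here is a minimal one-dimensional sketch (illustrative only; not the paper's procedure, nor the Rodriguez-Laio method). Each point repeatedly applies the mean-shift update, which performs gradient ascent on a Gaussian kernel density estimate, and points whose iterates land on the same mode are assigned to the same cluster. The function name, bandwidth, and merging rule are assumptions of this sketch.

```python
import math

def mean_shift_1d(data, bandwidth, iters=50):
    """Toy 1-D mean shift: each point ascends the Gaussian-KDE gradient via
    the mean-shift update; points sharing a mode share a cluster (basin of
    attraction). Purely illustrative, O(n^2) per iteration."""
    def shift(x):
        # mean-shift update: kernel-weighted average of the data around x
        w = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data]
        return sum(wi * xi for wi, xi in zip(w, data)) / sum(w)

    labels, centers = [], []
    for x0 in data:
        x = x0
        for _ in range(iters):
            x = shift(x)                  # iterate until (numerically) at a mode
        for k, c in enumerate(centers):   # merge modes closer than one bandwidth
            if abs(x - c) < bandwidth:
                labels.append(k)
                break
        else:
            centers.append(x)
            labels.append(len(centers) - 1)
    return labels, centers

# two well-separated blobs -> two basins of attraction
labels, modes = mean_shift_1d([0.0, 0.1, -0.2, 0.3, 9.8, 10.0, 10.2, 9.9],
                              bandwidth=1.0)
```

The basin-of-attraction picture above is exactly what makes mean shift slower than the Rodriguez-Laio approach studied in the paper, which avoids iterating every point to convergence.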
</p>
Published: Fri, 21 Dec 2018 22:11 EST

Bandwidth selection for kernel density estimators of multivariate level sets and highest density regions
https://projecteuclid.org/euclid.ejs/1545123626
<strong>Charles R. Doss</strong>, <strong>Guangwei Weng</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 12, Number 2, 4313--4376.</p><p><strong>Abstract:</strong><br/>
We consider bandwidth matrix selection for kernel density estimators of density level sets in $\mathbb{R} ^{d}$, $d\ge 2$. We also consider estimation of highest density regions, which differs from estimating level sets in that one specifies the probability content of the set rather than specifying the level directly. This complicates the problem. Bandwidth selection for KDEs is well studied, but the goal of most methods is to minimize a global loss function for the density or its derivatives. The loss we consider here is instead the measure of the symmetric difference of the true set and estimated set. We derive an asymptotic approximation to the corresponding risk. The approximation depends on unknown quantities which can be estimated, and the approximation can then be minimized to yield a choice of bandwidth, which we show in simulations performs well. We provide an R package lsbs for implementing our procedure.
</p>
Published: Fri, 21 Dec 2018 22:11 EST

Periodic dynamic factor models: estimation approaches and applications
https://projecteuclid.org/euclid.ejs/1545123627
<strong>Changryong Baek</strong>, <strong>Richard A. Davis</strong>, <strong>Vladas Pipiras</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 12, Number 2, 4377--4411.</p><p><strong>Abstract:</strong><br/>
A periodic dynamic factor model (PDFM) is introduced as a dynamic factor modeling approach to multivariate time series data exhibiting cyclical behavior and, in particular, periodic dependence structure. In the PDFM, the loading matrices are allowed to depend on the “season” and the factors are assumed to follow a periodic vector autoregressive (PVAR) model. Estimation of the loading matrices and the underlying PVAR model is studied. A simulation study is presented to assess the performance of the introduced estimation procedures, and applications to several real data sets are provided.
</p>
Published: Fri, 21 Dec 2018 22:11 EST

Convergence analysis of the block Gibbs sampler for Bayesian probit linear mixed models with improper priors
https://projecteuclid.org/euclid.ejs/1545123629
<strong>Xin Wang</strong>, <strong>Vivekananda Roy</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 12, Number 2, 4412--4439.</p><p><strong>Abstract:</strong><br/>
In this article, we consider Markov chain Monte Carlo (MCMC) algorithms for exploring the intractable posterior density associated with Bayesian probit linear mixed models under improper priors on the regression coefficients and variance components. In particular, we construct a two-block Gibbs sampler using data augmentation (DA) techniques. Furthermore, we prove geometric ergodicity of the Gibbs sampler, which is the foundation for establishing central limit theorems for MCMC-based estimators and subsequent inferences. The conditions for geometric convergence are similar to those guaranteeing posterior propriety. We also provide conditions for the propriety of posterior distributions with a general link function when the design matrices take commonly observed forms. In general, the Haar parameter-expanded DA (PX-DA) algorithm improves on the DA algorithm and has been shown to be theoretically at least as good. Here we construct a Haar PX-DA algorithm that has essentially the same computational cost as the two-block Gibbs sampler.
</p>
Published: Fri, 21 Dec 2018 22:11 EST

Consistent change-point detection with kernels
https://projecteuclid.org/euclid.ejs/1545123630
<strong>Damien Garreau</strong>, <strong>Sylvain Arlot</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 12, Number 2, 4440--4486.</p><p><strong>Abstract:</strong><br/>
In this paper we study the kernel change-point algorithm (KCP) proposed by Arlot, Celisse and Harchaoui [5], which aims at locating an unknown number of change-points in the distribution of a sequence of independent data taking values in an arbitrary set. The change-points are selected by model selection with a penalized kernel empirical criterion. We provide a non-asymptotic result showing that, with high probability, the KCP procedure retrieves the correct number of change-points, provided that the constant in the penalty is well-chosen; in addition, KCP estimates the change-points location at the optimal rate. As a consequence, when using a characteristic kernel, KCP detects all kinds of change in the distribution (not only changes in the mean or the variance), and it is able to do so for complex structured data (not necessarily in $\mathbb{R}^{d}$). Most of the analysis is conducted assuming that the kernel is bounded; part of the results can be extended when we only assume a finite second-order moment. We also demonstrate KCP on both synthetic and real data.
</p>
Published: Fri, 21 Dec 2018 22:11 EST

Bayesian classification of multiclass functional data
https://projecteuclid.org/euclid.ejs/1545448229
<strong>Xiuqi Li</strong>, <strong>Subhashis Ghosal</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 12, Number 2, 4669--4696.</p><p><strong>Abstract:</strong><br/>
We propose a Bayesian approach to estimating parameters in multiclass functional models. Unordered multinomial probit, ordered multinomial probit and multinomial logistic models are considered. We use finite random series priors based on a suitable basis such as B-splines in these three multinomial models, and classify the functional data using the Bayes rule. We average over models based on the marginal likelihood estimated from Markov Chain Monte Carlo (MCMC) output. Posterior contraction rates for the three multinomial models are computed. We also consider Bayesian linear and quadratic discriminant analyses on the multivariate data obtained by applying a functional principal component technique on the original functional data. A simulation study is conducted to compare these methods on different types of data. We also apply these methods to a phoneme dataset.
</p>
Published: Fri, 21 Dec 2018 22:11 EST

Estimating a network from multiple noisy realizations
https://projecteuclid.org/euclid.ejs/1545448230
<strong>Can M. Le</strong>, <strong>Keith Levin</strong>, <strong>Elizaveta Levina</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 12, Number 2, 4697--4740.</p><p><strong>Abstract:</strong><br/>
Complex interactions between entities are often represented as edges in a network. In practice, the network is often constructed from noisy measurements and inevitably contains some errors. In this paper we consider the problem of estimating a network from multiple noisy observations where edges of the original network are recorded with both false positives and false negatives. This problem is motivated by neuroimaging applications where brain networks of a group of patients with a particular brain condition could be viewed as noisy versions of an unobserved true network corresponding to the disease. The key to optimally leveraging these multiple observations is to take advantage of network structure, and here we focus on the case where the true network contains communities. Communities are common in real networks in general and in particular are believed to be present in brain networks. Under a community structure assumption on the truth, we derive an efficient method to estimate the noise levels and the original network, with theoretical guarantees on the convergence of our estimates. We show on synthetic networks that the performance of our method is close to an oracle method using the true parameter values, and apply our method to fMRI brain data, demonstrating that it constructs stable and plausible estimates of the population network.
</p>
Published: Fri, 21 Dec 2018 22:11 EST

Linear regression with sparsely permuted data
https://projecteuclid.org/euclid.ejs/1546570940
<strong>Martin Slawski</strong>, <strong>Emanuel Ben-David</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1--36.</p><p><strong>Abstract:</strong><br/>
In regression analysis of multivariate data, it is tacitly assumed that response and predictor variables in each observed response-predictor pair correspond to the same entity or unit. In this paper, we consider the situation of “permuted data” in which this basic correspondence has been lost. Several recent papers have considered this situation without further assumptions on the underlying permutation. In applications, the latter is often known to have additional structure that can be leveraged. Specifically, we herein consider the common scenario of “sparsely permuted data” in which only a small fraction of the data is affected by a mismatch between response and predictors. However, an adverse effect already observed for sparsely permuted data is that the least squares estimator, as well as other estimators not accounting for such partial mismatch, is inconsistent. One approach studied in detail herein is to treat permuted data as outliers, which motivates the use of robust regression formulations to estimate the regression parameter. The resulting estimate can subsequently be used to recover the permutation. A notable benefit of the proposed approach is its computational simplicity given the general lack of procedures for the above problem that are both statistically sound and computationally appealing.
</p>
Published: Thu, 03 Jan 2019 22:02 EST

Convergence rates of latent topic models under relaxed identifiability conditions
https://projecteuclid.org/euclid.ejs/1546570941
<strong>Yining Wang</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 37--66.</p><p><strong>Abstract:</strong><br/>
In this paper we study the frequentist convergence rate for the Latent Dirichlet Allocation (Blei, Ng and Jordan, 2003) topic models. We show that the maximum likelihood estimator converges to one of the finitely many equivalent parameters in Wasserstein’s distance metric at a rate of $n^{-1/4}$ without assuming separability or non-degeneracy of the underlying topics and/or the existence of more than three words per document, thus generalizing the previous works of Anandkumar et al. (2012, 2014) from an information-theoretical perspective. We also show that the $n^{-1/4}$ convergence rate is optimal in the worst case.
</p>
Published: Thu, 03 Jan 2019 22:02 EST

Generalised additive dependency inflated models including aggregated covariates
https://projecteuclid.org/euclid.ejs/1546570942
<strong>Young K. Lee</strong>, <strong>Enno Mammen</strong>, <strong>Jens P. Nielsen</strong>, <strong>Byeong U. Park</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 67--93.</p><p><strong>Abstract:</strong><br/>
Let us assume that $X$, $Y$ and $U$ are observed and that the conditional mean of $U$ given $X$ and $Y$ can be expressed via an additive dependency of $X$, $\lambda(X)Y$ and $X+Y$ for some unspecified function $\lambda$. This structured regression model can be transferred to a hazard model or a density model when applied on some appropriate grid, and has important forecasting applications via structured marker dependent hazards models or structured density models including age-period-cohort relationships. The structured regression model is also important when the severity of the dependent variable has a complicated dependency on waiting times $X$, $Y$ and the total waiting time $X+Y$. In case the conditional mean of $U$ approximates a density, the regression model can be used to analyse the age-period-cohort model, also when exposure data are not available. In case the conditional mean of $U$ approximates a marker dependent hazard, the regression model introduces new relevant age-period-cohort time scale interdependencies in understanding longevity. A direct use of the regression relationship introduced in this paper is the estimation of the severity of outstanding liabilities in non-life insurance companies. The technical approach taken is to use B-splines to capture the underlying one-dimensional unspecified functions. It is shown via finite sample simulation studies and an application for forecasting future asbestos related deaths in the UK that the B-spline approach works well in practice. Special consideration has been given to ensure identifiability of all models considered.
</p>
Published: Thu, 03 Jan 2019 22:02 EST

Exact adaptive confidence intervals for linear regression coefficients
https://projecteuclid.org/euclid.ejs/1546570943
<strong>Peter Hoff</strong>, <strong>Chaoyu Yu</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 94--119.</p><p><strong>Abstract:</strong><br/>
We propose an adaptive confidence interval procedure (CIP) for the coefficients in the normal linear regression model. This procedure has a frequentist coverage rate that is constant as a function of the model parameters, yet provides smaller intervals than the usual interval procedure, on average across regression coefficients. The proposed procedure is obtained by defining a class of CIPs that all have exact $1-\alpha $ frequentist coverage, and then selecting from this class the procedure that minimizes a prior expected interval width. We describe an adaptive approach for estimating the prior distribution from the data, so that the potential risk of a poorly specified prior is reduced. The resulting adaptive confidence intervals maintain exact non-asymptotic $1-\alpha $ coverage if two conditions are met: the design matrix is full rank (which will be known), and the errors are normally distributed (which can be checked empirically). No assumptions on the unknown parameters are necessary to maintain exact coverage. Additionally, in a “$p$ growing with $n$” asymptotic scenario, this adaptive FAB procedure is asymptotically Bayes-optimal among $1-\alpha $ frequentist CIPs.
</p>
Published: Thu, 03 Jan 2019 22:02 EST

Auxiliary information: the raking-ratio empirical process
https://projecteuclid.org/euclid.ejs/1546570944
<strong>Mickael Albertus</strong>, <strong>Philippe Berthet</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 120--165.</p><p><strong>Abstract:</strong><br/>
We study the empirical measure associated to a sample of size $n$ and modified by $N$ iterations of the raking-ratio method. This empirical measure is adjusted to match the true probability of sets in a finite partition which changes at each step. We establish asymptotic properties of the raking-ratio empirical process indexed by functions as $n\rightarrow +\infty $, for $N$ fixed. We study nonasymptotic properties by using a Gaussian approximation which yields uniform Berry-Esseen type bounds depending on $n$ and $N$, and provides estimates of the uniform quadratic risk reduction. A closed-form expression of the limiting covariance matrices is derived as $N\rightarrow +\infty $. In the two-way contingency table case the limiting process has a simple explicit formula.
</p>
Published: Thu, 03 Jan 2019 22:02 EST

Trace class Markov chains for the Normal-Gamma Bayesian shrinkage model
https://projecteuclid.org/euclid.ejs/1547607848
<strong>Liyuan Zhang</strong>, <strong>Kshitij Khare</strong>, <strong>Zeren Xing</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 166--207.</p><p><strong>Abstract:</strong><br/>
High-dimensional data, where the number of variables exceeds or is comparable to the sample size, is now pervasive in many scientific applications. In recent years, Bayesian shrinkage models have been developed as effective and computationally feasible tools to analyze such data, especially in the context of linear regression. In this paper, we focus on the Normal-Gamma shrinkage model developed by Griffin and Brown [7]. This model subsumes the popular Bayesian lasso model, and a three-block Gibbs sampling algorithm to sample from the resulting intractable posterior distribution has been developed in [7]. We consider an alternative two-block Gibbs sampling algorithm, and rigorously demonstrate its advantage over the three-block sampler by comparing specific spectral properties. In particular, we show that the Markov operator corresponding to the two-block sampler is trace class (and hence Hilbert-Schmidt), whereas the operator corresponding to the three-block sampler is not even Hilbert-Schmidt. The trace class property for the two-block sampler implies geometric convergence for the associated Markov chain, which justifies the use of Markov chain CLTs to obtain practical error bounds for MCMC based estimates. Additionally, it facilitates theoretical comparisons of the two-block sampler with sandwich algorithms which aim to improve performance by inserting inexpensive extra steps in between the two conditional draws of the two-block sampler.
</p>
Published: Tue, 15 Jan 2019 22:04 EST

Detection of sparse mixtures: higher criticism and scan statistic
https://projecteuclid.org/euclid.ejs/1547607852
<strong>Ery Arias-Castro</strong>, <strong>Andrew Ying</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 208--230.</p><p><strong>Abstract:</strong><br/>
We consider the problem of detecting a sparse mixture as studied by Ingster (1997) and Donoho and Jin (2004). We consider a wide array of base distributions. In particular, we study the situation when the base distribution has polynomial tails, a situation that has not received much attention in the literature. Perhaps surprisingly, we find that in the context of such a power-law distribution, the higher criticism does not achieve the detection boundary. However, the scan statistic does.
</p>
Published: Tue, 15 Jan 2019 22:04 EST

Importance sampling the union of rare events with an application to power systems analysis
https://projecteuclid.org/euclid.ejs/1548817590
<strong>Art B. Owen</strong>, <strong>Yury Maximov</strong>, <strong>Michael Chertkov</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 231--254.</p><p><strong>Abstract:</strong><br/>
We consider importance sampling to estimate the probability $\mu$ of a union of $J$ rare events $H_{j}$ defined by a random variable $\boldsymbol{x}$. The sampler we study has been used in spatial statistics, genomics and combinatorics going back at least to Karp and Luby (1983). It works by sampling one event at random, then sampling $\boldsymbol{x}$ conditionally on that event happening, and it constructs an unbiased estimate of $\mu$ by multiplying an inverse moment of the number of occurring events by the union bound. We prove some variance bounds for this sampler. For a sample size of $n$, it has a variance no larger than $\mu(\bar{\mu}-\mu)/n$ where $\bar{\mu}$ is the union bound. It also has a coefficient of variation no larger than $\sqrt{(J+J^{-1}-2)/(4n)}$ regardless of the overlap pattern among the $J$ events. Our motivating problem comes from power system reliability, where the phase differences between connected nodes have a joint Gaussian distribution and the $J$ rare events arise from unacceptably large phase differences. In the grid reliability problems even some events defined by $5772$ constraints in $326$ dimensions, with probability below $10^{-22}$, are estimated with a coefficient of variation of about $0.0024$ with only $n=10{,}000$ sample values.
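The sampler the abstract describes is simple enough to sketch in a few lines. The following toy implementation (not the authors' code; the Exp(1) setting, thresholds, and function name are assumptions for illustration) uses events $H_j = \{X > t_j\}$ with known probabilities $e^{-t_j}$, so conditional sampling is exact by memorylessness: sample event $j$ with probability proportional to $P(H_j)$, draw $X \mid H_j$, count the number $S$ of events that occur, and average $\bar{\mu}/S$.

```python
import math
import random

def union_prob_is(thresholds, n=100_000, rng=None):
    """Karp-Luby style importance sampler for mu = P(union_j {X > t_j}),
    X ~ Exp(1). Individual probabilities p_j = exp(-t_j) are known in
    closed form; the estimator averages mu_bar / S over n draws, where
    mu_bar = sum_j p_j is the union bound and S counts occurring events.
    Unbiased regardless of how the events overlap."""
    rng = rng or random.Random(0)
    p = [math.exp(-t) for t in thresholds]
    mu_bar = sum(p)                                  # union bound
    weights = [pi / mu_bar for pi in p]
    total = 0.0
    for _ in range(n):
        j = rng.choices(range(len(thresholds)), weights=weights)[0]
        x = thresholds[j] + rng.expovariate(1.0)     # X | X > t_j (memoryless)
        s = sum(x > t for t in thresholds)           # S >= 1 by construction
        total += mu_bar / s
    return total / n

# heavily overlapping events: the true union probability is exp(-min t_j)
est = union_prob_is([2.0, 2.5, 3.0])
```

In this toy case the truth is $e^{-2} \approx 0.1353$, and the variance bound $\mu(\bar{\mu}-\mu)/n$ from the abstract guarantees the estimate is accurate to a few times $10^{-4}$ at $n = 10^5$.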
</p>
Published: Tue, 29 Jan 2019 22:06 EST

Estimation of spectral functionals for Lévy-driven continuous-time linear models with tapered data
https://projecteuclid.org/euclid.ejs/1548817591
<strong>Mamikon S. Ginovyan</strong>, <strong>Artur A. Sahakyan</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 255--283.</p><p><strong>Abstract:</strong><br/>
The paper is concerned with the nonparametric statistical estimation of linear spectral functionals for Lévy-driven continuous-time stationary linear models with tapered data. As an estimator for the unknown functional, we consider the averaged tapered periodogram. We analyze the bias of the estimator and obtain sufficient conditions ensuring the proper rate of convergence of the bias to zero, necessary for asymptotic normality of the estimator. We prove a central limit theorem for a suitably normalized stochastic process generated by a tapered Toeplitz type quadratic functional of the model. As a consequence of these results we obtain the asymptotic normality of our estimator.
</p>
Published: Tue, 29 Jan 2019 22:06 EST

Fast Bayesian variable selection for high dimensional linear models: Marginal solo spike and slab priors
https://projecteuclid.org/euclid.ejs/1549335678
<strong>Su Chen</strong>, <strong>Stephen G. Walker</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 284--309.</p><p><strong>Abstract:</strong><br/>
This paper presents a method for fast Bayesian variable selection in the normal linear regression model with high dimensional data. A novel approach is adopted in which an explicit posterior probability for including a covariate is obtained. The method is sequential but not order-dependent: each covariate is dealt with one at a time, and a spike and slab prior is assigned only to the coefficient under investigation. We adopt the well-known spike and slab Gaussian priors with a sample size dependent variance, which achieves strong selection consistency for marginal posterior probabilities even when the number of covariates grows almost exponentially with sample size. Numerical illustrations are presented where it is shown that the new approach provides essentially equivalent results to the standard spike and slab priors, i.e. the same marginal posterior probabilities of the coefficients being nonzero, which are estimated via Gibbs sampling. Hence, we obtain the same results via the direct calculation of $p$ probabilities, compared to a stochastic search over a space of $2^{p}$ elements. Our procedure only requires $p$ probabilities to be calculated, which can be done exactly; hence parallel computation is feasible when $p$ is large.
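To illustrate the kind of per-covariate closed-form calculation the abstract describes, here is a hedged one-covariate sketch (not the authors' exact prior calibration; the function name, the fixed $\tau^2$, and the prior inclusion probability $\pi$ are assumptions). With a point-mass spike at zero and a $N(0,\tau^{2})$ slab on a single coefficient, the two marginal likelihoods differ by a rank-one Gaussian computation, so the posterior inclusion probability is available exactly via the matrix determinant lemma and Sherman-Morrison, in $O(n)$ time per covariate.

```python
import math

def marginal_inclusion_prob(x, y, sigma2=1.0, tau2=10.0, pi=0.5):
    """Exact posterior probability that a single coefficient is nonzero under
    a spike (point mass at 0) versus slab N(0, tau2) prior, for y ~ N(x*b, sigma2*I).
    log Bayes factor (slab vs spike), using det(sigma2*I + tau2*x x') and
    Sherman-Morrison for the quadratic form:
        -0.5*log(1 + tau2*s/sigma2) + 0.5*tau2*(x'y)^2 / (sigma2*(sigma2 + tau2*s))
    with s = ||x||^2. Illustrative sketch only."""
    s = sum(xi * xi for xi in x)                     # ||x||^2
    xty = sum(xi * yi for xi, yi in zip(x, y))       # x'y
    log_bf = (-0.5 * math.log(1.0 + tau2 * s / sigma2)
              + 0.5 * tau2 * xty ** 2 / (sigma2 * (sigma2 + tau2 * s)))
    prior_odds = pi / (1.0 - pi)
    return 1.0 / (1.0 + math.exp(-log_bf) / prior_odds)

# no evidence in the data: the Occam penalty pushes the probability below 1/2
prob_null = marginal_inclusion_prob([1.0, 0.0], [0.0, 0.0])
# strong signal: probability of inclusion is essentially 1
prob_signal = marginal_inclusion_prob([1.0] * 100, [2.0] * 100)
```

Because each such probability depends only on its own covariate, computing all $p$ of them is embarrassingly parallel, which is the computational point the abstract makes.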
</p>
Published: Mon, 04 Feb 2019 22:01 EST

Weak dependence and GMM estimation of supOU and mixed moving average processes
https://projecteuclid.org/euclid.ejs/1549681240
<strong>Imma Valentina Curato</strong>, <strong>Robert Stelzer</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 310--360.</p><p><strong>Abstract:</strong><br/>
We consider a mixed moving average (MMA) process $X$ driven by a Lévy basis and prove that it is weakly dependent with rates computable in terms of the moving average kernel and the characteristic quadruple of the Lévy basis. Using this property, we show conditions ensuring that sample mean and autocovariances of $X$ have a limiting normal distribution. We extend these results to stochastic volatility models and then investigate a Generalized Method of Moments estimator for the supOU process and the supOU stochastic volatility model after choosing a suitable distribution for the mean reversion parameter. For these estimators, we analyze the asymptotic behavior in detail.
</p>
Published: Fri, 08 Feb 2019 22:00 EST

Optimal designs for regression with spherical data
https://projecteuclid.org/euclid.ejs/1549681241
<strong>Holger Dette</strong>, <strong>Maria Konstantinou</strong>, <strong>Kirsten Schorning</strong>, <strong>Josua Gösmann</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 361--390.</p><p><strong>Abstract:</strong><br/>
In this paper optimal designs for regression problems with spherical predictors of arbitrary dimension are considered. Our work is motivated by applications in material sciences, where crystallographic textures such as the misorientation distribution or the grain boundary distribution (depending on a four dimensional spherical predictor) are represented by series of hyperspherical harmonics, which are estimated from experimental or simulated data.
For this type of estimation problems we explicitly determine optimal designs with respect to the $\Phi _{p}$-criteria introduced by Kiefer (1974) and a class of orthogonally invariant information criteria recently introduced in the literature. In particular, we show that the uniform distribution on the $m$-dimensional sphere is optimal and construct discrete and implementable designs with the same information matrices as the continuous optimal designs. Finally, we illustrate the advantages of the new designs for series estimation by hyperspherical harmonics, which are symmetric with respect to the first and second crystallographic point group.
</p>
Published: Fri, 08 Feb 2019 22:00 EST

Additive partially linear models for massive heterogeneous data
https://projecteuclid.org/euclid.ejs/1549681242
<strong>Binhuan Wang</strong>, <strong>Yixin Fang</strong>, <strong>Heng Lian</strong>, <strong>Hua Liang</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 391--431.</p><p><strong>Abstract:</strong><br/>
We consider an additive partially linear framework for modelling massive heterogeneous data. The major goal is to extract multiple common features simultaneously across all sub-populations while exploring heterogeneity of each sub-population. We propose an aggregation type of estimators for the commonality parameters that possess the asymptotic optimal bounds and the asymptotic distributions as if there were no heterogeneity. This oracle result holds when the number of sub-populations does not grow too fast and the tuning parameters are selected carefully. A plug-in estimator for the heterogeneity parameter is further constructed, and shown to possess the asymptotic distribution as if the commonality information were available. Furthermore, we develop a heterogeneity test for the linear components and a homogeneity test for the non-linear components accordingly. The performance of the proposed methods is evaluated via simulation studies and an application to the Medicare Provider Utilization and Payment data.
</p>
Published: Fri, 08 Feb 2019 22:00 EST

Monte Carlo modified profile likelihood in models for clustered data
https://projecteuclid.org/euclid.ejs/1549962031
<strong>Claudia Di Caterina</strong>, <strong>Giuliana Cortese</strong>, <strong>Nicola Sartori</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 432--464.</p><p><strong>Abstract:</strong><br/>
The main focus of the analysts who deal with clustered data is usually not on the clustering variables, and hence the group-specific parameters are treated as nuisance. If a fixed effects formulation is preferred and the total number of clusters is large relative to the single-group sizes, classical frequentist techniques relying on the profile likelihood are often misleading. The use of alternative tools, such as modifications to the profile likelihood or integrated likelihoods, for making accurate inference on a parameter of interest can be complicated by the presence of nonstandard modelling and/or sampling assumptions. We show here how to employ Monte Carlo simulation in order to approximate the modified profile likelihood in some of these unconventional frameworks. The proposed solution is widely applicable and is shown to retain the usual properties of the modified profile likelihood. The approach is examined in two instances particularly relevant in applications, i.e. missing-data models and survival models with unspecified censoring distribution. The effectiveness of the proposed solution is validated via simulation studies and two clinical trial applications.
</p>projecteuclid.org/euclid.ejs/1549962031_20190212040038Tue, 12 Feb 2019 04:00 ESTQuery-dependent ranking and its asymptotic propertieshttps://projecteuclid.org/euclid.ejs/1549962032<strong>Ben Dai</strong>, <strong>Junhui Wang</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 465--488.</p><p><strong>Abstract:</strong><br/>
Ranking, also known as learning to rank in the machine learning community, aims to rank a number of items based on their relevance to a specific query. In the literature, most ranking methods use a uniform ranking function to evaluate relevance, which completely ignores the heterogeneity among queries. To admit different ranking functions for various queries, a general $U$-process formulation for query-dependent ranking is developed. It allows one to incorporate neighborhood structure among queries via various forms of smoothing weights to improve the ranking performance. One of its salient features is its capability of producing reasonable rankings for novel queries that are absent from the training set, a situation commonly encountered in practice but often neglected in the literature. The proposed method is implemented via an inexact alternating direction method of multipliers (ADMM) in parallel for each query. Its asymptotic risk bound is established, showing that it achieves desirable ranking accuracy at a fast rate for any query, including novel ones. Furthermore, simulated examples and a real application to the Yahoo! challenge dataset support the advantage of the query-dependent ranking method over existing competitors.
</p>projecteuclid.org/euclid.ejs/1549962032_20190212040038Tue, 12 Feb 2019 04:00 ESTNon-marginal decisions: A novel Bayesian multiple testing procedurehttps://projecteuclid.org/euclid.ejs/1550134833<strong>Noirrit Kiran Chandra</strong>, <strong>Sourabh Bhattacharya</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 489--535.</p><p><strong>Abstract:</strong><br/>
In this paper, we consider the problem of multiple testing where the hypotheses are dependent. In most of the existing literature, either Bayesian or non-Bayesian, the decision rules mainly focus on the validity of the test procedure rather than actually utilizing the dependency to increase efficiency. Moreover, the decisions regarding different hypotheses are marginal in the sense that they do not depend upon each other directly. However, in realistic situations, the hypotheses are usually dependent, and hence it is desirable that the decisions regarding the dependent hypotheses are taken jointly.
In this article, we develop a novel Bayesian multiple testing procedure that coherently takes this requirement into consideration. Our method, which is based on new notions of error and non-error terms, substantially enhances efficiency by judicious exploitation of the dependence structure among the hypotheses. We show that our method minimizes the posterior expected loss associated with an additive “0-1” loss function; we also prove theoretical results on the relevant error probabilities, establishing the coherence and usefulness of our method. The optimal decision configuration is not available in closed form and we propose an efficient simulated annealing algorithm for the purpose of optimization, which is also generically applicable to binary optimization problems.
Extensive simulation studies indicate that in dependent situations, our method performs significantly better than some existing popular conventional multiple testing methods, in terms of accuracy and power control. Moreover, application of our ideas to a real, spatial data set associated with radionuclide concentration in Rongelap islands yielded insightful results.
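The generic binary-optimization step mentioned in the abstract can be sketched with simulated annealing over decision vectors; the `toy_loss` function below is a hypothetical stand-in for the posterior expected loss of a decision configuration, and all tuning constants are illustrative rather than the authors' settings.

```python
import math
import random

def anneal_binary(loss, m, n_iter=5000, t0=1.0, cooling=0.999, seed=0):
    """Simulated annealing over binary decision vectors d in {0, 1}^m.

    `loss` stands in for the posterior expected loss of a decision
    configuration; any real-valued function of a binary list works.
    """
    rng = random.Random(seed)
    d = [0] * m                       # start by accepting all null hypotheses
    best, best_loss = list(d), loss(d)
    cur_loss, t = best_loss, t0
    for _ in range(n_iter):
        j = rng.randrange(m)          # propose flipping one decision
        d[j] ^= 1
        new_loss = loss(d)
        # always accept improvements; accept worsenings with Boltzmann probability
        if new_loss <= cur_loss or rng.random() < math.exp((cur_loss - new_loss) / t):
            cur_loss = new_loss
            if cur_loss < best_loss:
                best, best_loss = list(d), cur_loss
        else:
            d[j] ^= 1                 # reject the move: undo the flip
        t *= cooling                  # geometric cooling schedule
    return best, best_loss

# toy additive 0-1-type loss whose unique optimum rejects hypotheses 0 and 2
toy_loss = lambda d: sum(abs(di - ti) for di, ti in zip(d, [1, 0, 1, 0, 0]))
```

Because single-coordinate moves suffice to reach any configuration, the same routine applies to other binary optimization problems, as the abstract notes.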
</p>projecteuclid.org/euclid.ejs/1550134833_20190214040043Thu, 14 Feb 2019 04:00 ESTLipschitz-Killing curvatures of excursion sets for two-dimensional random fieldshttps://projecteuclid.org/euclid.ejs/1550134834<strong>Hermine Biermé</strong>, <strong>Elena Di Bernardino</strong>, <strong>Céline Duval</strong>, <strong>Anne Estrade</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 536--581.</p><p><strong>Abstract:</strong><br/>
In the present paper we study three geometrical characteristics of the excursion sets of a two-dimensional stationary isotropic random field. First, we show that these characteristics can be estimated without bias if the considered field satisfies a kinematic formula; this is, for instance, the case for fields given by a function of smooth Gaussian fields or for some shot noise fields. Using the proposed estimators of these geometric characteristics, we describe some inference procedures for the estimation of the parameters of the field. An extensive simulation study illustrates the performance of each estimator. Then, we use the Euler characteristic estimator to build a test to determine whether a given field is Gaussian or not, when compared to various alternatives. The test is based on sparse information, i.e., the excursion sets for two different levels of the field to be tested. Finally, the proposed test is adapted to an applied case, synthesized 2D digital mammograms.
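As a minimal illustration of one such characteristic, the sketch below computes the Euler characteristic of a pixelated excursion set from the cell counts of its cubical complex (vertices minus edges plus faces). This is a naive plug-in on a grid discretization, assumed here for illustration; it is not the unbiased estimators studied in the paper.

```python
def excursion_set(field, level):
    """Grid cells where a sampled 2D field exceeds the given level."""
    return {(i, j) for i, row in enumerate(field)
            for j, v in enumerate(row) if v > level}

def euler_characteristic(pixels):
    """Euler characteristic of a union of closed unit pixels, computed as
    V - E + F over the cubical complex they generate."""
    verts, edges = set(), set()
    for (i, j) in pixels:
        c00, c10, c01, c11 = (i, j), (i + 1, j), (i, j + 1), (i + 1, j + 1)
        verts.update((c00, c10, c01, c11))
        edges.update({frozenset((c00, c10)), frozenset((c01, c11)),
                      frozenset((c00, c01)), frozenset((c10, c11))})
    return len(verts) - len(edges) + len(pixels)
```

A single pixel gives 1, two disjoint pixels give 2, and a square ring of pixels around a hole gives 0, matching the topological counts of components minus holes.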
</p>projecteuclid.org/euclid.ejs/1550134834_20190214040043Thu, 14 Feb 2019 04:00 ESTGeneralized M-estimators for high-dimensional Tobit I modelshttps://projecteuclid.org/euclid.ejs/1550286094<strong>Jelena Bradic</strong>, <strong>Jiaqi Guo</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 582--645.</p><p><strong>Abstract:</strong><br/>
This paper develops robust confidence intervals in high-dimensional and left-censored regression. Type-I censored regression models, where a competing event makes the variable of interest unobservable, are extremely common in practice. In this paper, we develop smoothed estimating equations that are adaptive to the censoring level and are more robust to misspecification of the error distribution. We propose a unified class of robust estimators, including one-step Mallow’s, Schweppe’s, and Hill-Ryan’s estimators, that are adaptive to the left-censored observations. In the ultra-high-dimensional setting, where the dimensionality can grow exponentially with the sample size, we show that as long as the preliminary estimator converges faster than $n^{-1/4}$, the one-step estimators inherit the asymptotic distribution of the fully iterated version. Moreover, we show that the size of the residuals of the Bahadur representation matches that of pure linear models; that is, the effects of censoring disappear asymptotically. Simulation studies demonstrate that our method is adaptive to the censoring level and asymmetry in the error distribution, and does not lose efficiency when the errors are from symmetric distributions.
</p>projecteuclid.org/euclid.ejs/1550286094_20190215220146Fri, 15 Feb 2019 22:01 ESTContraction and uniform convergence of isotonic regressionhttps://projecteuclid.org/euclid.ejs/1550286095<strong>Fan Yang</strong>, <strong>Rina Foygel Barber</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 646--677.</p><p><strong>Abstract:</strong><br/>
We consider the problem of isotonic regression, where the underlying signal $x$ is assumed to satisfy a monotonicity constraint, that is, $x$ lies in the cone $\{x\in \mathbb{R}^{n}:x_{1}\leq \dots\leq x_{n}\}$. We study the isotonic projection operator (projection to this cone), and find a necessary and sufficient condition characterizing all norms with respect to which this projection is contractive. This enables a simple and non-asymptotic analysis of the convergence properties of isotonic regression, yielding uniform confidence bands that adapt to the local Lipschitz properties of the signal.
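The isotonic projection onto the monotone cone is computable in linear time by the classical pool-adjacent-violators algorithm (PAVA); a minimal self-contained sketch, independent of the paper's analysis:

```python
def isotonic_projection(y):
    """Euclidean projection of y onto the cone {x : x_1 <= ... <= x_n},
    via the pool-adjacent-violators algorithm (PAVA)."""
    blocks = []                       # each block stores (sum of values, count)
    for v in y:
        s, n = float(v), 1
        # merge backwards while the previous block's mean exceeds this one's
        while blocks and blocks[-1][0] / blocks[-1][1] > s / n:
            ps, pn = blocks.pop()
            s, n = s + ps, n + pn
        blocks.append((s, n))
    out = []
    for s, n in blocks:
        out.extend([s / n] * n)       # every element of a block gets its mean
    return out
```

For example, projecting [3, 1, 2] pools all three entries into their common mean, yielding [2, 2, 2], while an already monotone input is returned unchanged.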
</p>projecteuclid.org/euclid.ejs/1550286095_20190215220146Fri, 15 Feb 2019 22:01 ESTSpectral clustering in the dynamic stochastic block modelhttps://projecteuclid.org/euclid.ejs/1550286096<strong>Marianna Pensky</strong>, <strong>Teng Zhang</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 678--709.</p><p><strong>Abstract:</strong><br/>
In the present paper, we have studied a Dynamic Stochastic Block Model (DSBM) under the assumptions that the connection probabilities, as functions of time, are smooth and that at most $s$ nodes can switch their class memberships between two consecutive time points. We estimate the edge probability tensor by a kernel-type procedure and extract the group memberships of the nodes by spectral clustering. The procedure is computationally viable, adaptive to the unknown smoothness of the functional connection probabilities, to the rate $s$ of membership switching, and to the unknown number of clusters. In addition, it is accompanied by non-asymptotic guarantees for the precision of estimation and clustering.
</p>projecteuclid.org/euclid.ejs/1550286096_20190215220146Fri, 15 Feb 2019 22:01 ESTIsotonic regression meets LASSOhttps://projecteuclid.org/euclid.ejs/1550632213<strong>Matey Neykov</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 710--746.</p><p><strong>Abstract:</strong><br/>
This paper studies a two step procedure for monotone increasing additive single index models with Gaussian designs. The proposed procedure is simple, easy to implement with existing software, and consists of consecutively applying LASSO and isotonic regression. Aside from formalizing this procedure, we provide theoretical guarantees regarding its performance: 1) we show that our procedure controls the in-sample squared error; 2) we demonstrate that one can use the procedure for predicting new observations, by showing that the absolute prediction error can be controlled with high-probability. Our bounds show a tradeoff of two rates: the minimax rate for estimating high dimensional quadratic loss, and the minimax nonparametric rate for estimating a monotone increasing function.
</p>projecteuclid.org/euclid.ejs/1550632213_20190219221028Tue, 19 Feb 2019 22:10 ESTAsymptotic theory of penalized splineshttps://projecteuclid.org/euclid.ejs/1553133771<strong>Luo Xiao</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 747--794.</p><p><strong>Abstract:</strong><br/>
The paper gives a unified study of the large sample asymptotic theory of penalized splines, including the O-splines, which use B-splines and an integrated squared derivative penalty [22]; the P-splines, which use B-splines and a discrete difference penalty [13]; and the T-splines, which use truncated polynomials and a ridge penalty [24]. Extending existing results for O-splines [7], it is shown that, depending on the number of knots and appropriate smoothing parameters, the $L_{2}$ risk bounds of penalized spline estimators are rate-wise similar either to those of regression splines or to those of smoothing splines, and each can attain the optimal minimax rate of convergence [32]. In addition, the convergence rate of the $L_{\infty }$ risk bound, and the local asymptotic bias and variance, are derived for all three types of penalized splines.
</p>projecteuclid.org/euclid.ejs/1553133771_20190320220301Wed, 20 Mar 2019 22:03 EDTA statistical test of isomorphism between metric-measure spaces using the distance-to-a-measure signaturehttps://projecteuclid.org/euclid.ejs/1553565705<strong>Claire Brécheteau</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 795--849.</p><p><strong>Abstract:</strong><br/>
We introduce the notion of the DTM-signature, a measure on $\mathbb{R}$ that can be associated to any metric-measure space. This signature is based on the distance-to-a-measure (DTM) function introduced in 2009 by Chazal, Cohen-Steiner and Mérigot. It leads to a pseudo-metric between metric-measure spaces that is bounded above by the Gromov-Wasserstein distance. This pseudo-metric is used to build a statistical test of isomorphism between two metric-measure spaces, from the observation of two $N$-samples.
The test is based on subsampling methods and comes with theoretical guarantees. It is proven to be of the correct level asymptotically. Also, when the measures are supported on compact subsets of $\mathbb{R}^{d}$, rates of convergence are derived for the $L_{1}$-Wasserstein distance between the distribution of the test statistic and its subsampling approximation. These rates depend on some parameter $\rho >1$. In addition, we prove that the power is bounded above by $\exp (-CN^{1/\rho })$, with $C$ proportional to the square of the aforementioned pseudo-metric between the metric-measure spaces. Under some geometrical assumptions, we also derive lower bounds for this pseudo-metric.
An algorithm is proposed for the implementation of this statistical test, and its performance is compared to the performance of other methods through numerical experiments.
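A minimal sketch of the empirical DTM, assuming the usual nearest-neighbor form in which the DTM at a point is the root mean squared distance to its $\lceil mn\rceil$ nearest sample points ($m$ the mass parameter); the signature is then the distribution of DTM values over the sample. Function names are illustrative, not from the paper's implementation.

```python
import math

def dtm(point, sample, m=0.1):
    """Empirical distance to measure at `point`: root mean squared distance
    to its k = ceil(m * n) nearest sample points, m being the mass parameter."""
    k = max(1, math.ceil(m * len(sample)))
    sq_dists = sorted(sum((a - b) ** 2 for a, b in zip(point, x)) for x in sample)
    return math.sqrt(sum(sq_dists[:k]) / k)

def dtm_signature(sample, m=0.1):
    """Sketch of the DTM-signature: the empirical distribution (returned
    here as a sorted list) of DTM values over the sample points."""
    return sorted(dtm(x, sample, m) for x in sample)
```

With small mass parameter the DTM behaves like a nearest-neighbor distance; larger $m$ averages over more neighbors and is more robust to outliers.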
</p>projecteuclid.org/euclid.ejs/1553565705_20190325220202Mon, 25 Mar 2019 22:02 EDTInvariant test based on the modified correction to LRT for the equality of two high-dimensional covariance matriceshttps://projecteuclid.org/euclid.ejs/1553565706<strong>Qiuyan Zhang</strong>, <strong>Jiang Hu</strong>, <strong>Zhidong Bai</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 850--881.</p><p><strong>Abstract:</strong><br/>
In this paper, we propose an invariant test based on the modified correction to the likelihood ratio test (LRT) of the equality of two high-dimensional covariance matrices. It is well known that the classical log-LRT is not well defined when the dimension is larger than or equal to one of the sample sizes. Even when the log-LRT is well defined, it is usually perceived as a poor statistic in high-dimensional cases because of its low power under some alternatives. In this paper, we justify the usefulness of the modified log-LRT and propose an invariant test that works well in cases where the dimension is larger than the sample sizes. Moreover, the test is established under the weakest conditions on the dimensions and the moments of the samples. The asymptotic distribution of the proposed test statistic is also obtained under the null hypothesis. In addition, we propose a simplified version of the modified LRT. A simulation study and a real data analysis show that the performances of the two proposed statistics are invariant under affine transformations.
</p>projecteuclid.org/euclid.ejs/1553565706_20190325220202Mon, 25 Mar 2019 22:02 EDTVariability and stability of the false discovery proportionhttps://projecteuclid.org/euclid.ejs/1553565707<strong>Marc Ditzhaus</strong>, <strong>Arnold Janssen</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 882--910.</p><p><strong>Abstract:</strong><br/>
Much effort has been devoted to controlling the “false discovery rate” (FDR) when $m$ hypotheses are tested simultaneously. The FDR is the expectation of the “false discovery proportion” $\text{FDP}=V/R$, the ratio of the number of false rejections $V$ to the number of all rejections $R$. In this paper, we take a closer look at the FDP for adaptive linear step-up multiple tests. These tests extend the well-known Benjamini and Hochberg test by estimating the unknown number $m_{0}$ of true null hypotheses. We give exact finite sample formulas for higher moments of the FDP and, in particular, for its variance. These allow a precise discussion of the stability of the FDP, i.e., of when the FDP is asymptotically close to its mean. We present sufficient and necessary conditions for this stability. They include the existence of a stable estimator for the proportion $m_{0}/m$. We apply our results to convex combinations of generalized Storey-type estimators with various tuning parameters and (possibly) data-driven weights. The corresponding step-up tests allow flexible adaptation. Moreover, these tests control the FDR at finite sample size. We compare these tests to the classical Benjamini and Hochberg test and discuss their advantages.
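A sketch of an adaptive linear step-up test of this family, using a simple Storey-type plug-in estimate of $m_{0}$; the convex-combination estimators and data-driven weights studied in the paper are not reproduced here, and the defaults are illustrative.

```python
def storey_m0(pvals, lam=0.5):
    """Storey-type estimator of the number m0 of true null hypotheses."""
    return (1 + sum(p > lam for p in pvals)) / (1 - lam)

def adaptive_step_up(pvals, alpha=0.05, lam=0.5):
    """Adaptive linear step-up test: Benjamini-Hochberg critical values
    i * alpha / m0_hat, with the Storey plug-in estimate of m0."""
    m0_hat = storey_m0(pvals, lam)
    order = sorted(range(len(pvals)), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * alpha / m0_hat:
            k = rank                  # step-up: keep the largest passing rank
    return sorted(order[:k])          # indices of rejected hypotheses
```

With `lam=0.5`, five p-values [0.001, 0.8, 0.02, 0.9, 0.7] give $\hat{m}_{0}=8$, so only the smallest p-value is rejected at level 0.05; when most hypotheses are false, $\hat{m}_{0}$ shrinks below $m$ and the adaptive test rejects more than plain Benjamini-Hochberg.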
</p>projecteuclid.org/euclid.ejs/1553565707_20190325220202Mon, 25 Mar 2019 22:02 EDTInference for elliptical copula multivariate response regression modelshttps://projecteuclid.org/euclid.ejs/1553911236<strong>Yue Zhao</strong>, <strong>Christian Genest</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 911--984.</p><p><strong>Abstract:</strong><br/>
The estimation of the coefficient matrix in a multivariate response linear regression model is considered in situations where we can observe only strictly increasing transformations of the continuous responses and covariates. It is further assumed that the joint dependence between all the observed variables is characterized by an elliptical copula. Penalized estimators of the coefficient matrix are obtained in a high-dimensional setting by assuming that the coefficient matrix is either element-wise sparse or row-sparse, and by incorporating the precision matrix of the error, which is also assumed to be sparse. Estimation of the copula parameters is achieved by inversion of Kendall’s tau. It is shown that when the true coefficient matrix is row-sparse, the estimator obtained via a group penalty outperforms the one obtained via a simple element-wise penalty. Simulation studies are used to illustrate this fact and the advantage of incorporating the precision matrix of the error when the correlation among the components of the error vector is strong. Moreover, the use of the normal-score rank correlation estimator is revisited in the context of high-dimensional Gaussian copula models. It is shown that this estimator remains the optimal estimator of the copula correlation matrix in this setting.
</p>projecteuclid.org/euclid.ejs/1553911236_20190329220055Fri, 29 Mar 2019 22:00 EDTNonparametric confidence regions for level sets: Statistical properties and geometryhttps://projecteuclid.org/euclid.ejs/1553911237<strong>Wanli Qiao</strong>, <strong>Wolfgang Polonik</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 985--1030.</p><p><strong>Abstract:</strong><br/>
This paper studies and critically discusses the construction of nonparametric confidence regions for density level sets. Methodologies based on both vertical variation and horizontal variation are considered. The investigations provide theoretical insight into the behavior of these confidence regions via large sample theory. We also discuss the geometric relationships underlying the construction of horizontal and vertical methods, and how finite sample performance of these confidence regions is influenced by geometric or topological aspects. These discussions are supported by numerical studies.
</p>projecteuclid.org/euclid.ejs/1553911237_20190329220055Fri, 29 Mar 2019 22:00 EDTCentral limit theorems for the $L_{p}$-error of smooth isotonic estimatorshttps://projecteuclid.org/euclid.ejs/1554429624<strong>Hendrik P. Lopuhaä</strong>, <strong>Eni Musta</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1031--1098.</p><p><strong>Abstract:</strong><br/>
We investigate the asymptotic behavior of the $L_{p}$-distance between a monotone function on a compact interval and a smooth estimator of this function. Our main result is a central limit theorem for the $L_{p}$-error of smooth isotonic estimators obtained by smoothing a Grenander-type estimator or isotonizing the ordinary kernel estimator. As a preliminary result we establish a similar result for ordinary kernel estimators. Our results are obtained in a general setting, which includes estimation of a monotone density, regression function and hazard rate. We also perform a simulation study for testing monotonicity on the basis of the $L_{2}$-distance between the kernel estimator and the smoothed Grenander-type estimator.
</p>projecteuclid.org/euclid.ejs/1554429624_20190404220116Thu, 04 Apr 2019 22:01 EDTOptimal experimental design that minimizes the width of simultaneous confidence bandshttps://projecteuclid.org/euclid.ejs/1554429625<strong>Satoshi Kuriki</strong>, <strong>Henry P. Wynn</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1099--1134.</p><p><strong>Abstract:</strong><br/>
We propose an optimal experimental design for a curvilinear regression model that minimizes the width of simultaneous confidence bands. Simultaneous confidence bands for curvilinear regression are constructed by evaluating the volume of a tube about a curve that is defined as a trajectory of a regression basis vector (Naiman, 1986). The proposed criterion is based on this tube volume, and the corresponding optimal design that minimizes it is referred to as the tube-volume optimal (TV-optimal) design. For Fourier and weighted polynomial regressions, the problem is formalized as one of minimization over the cone of Hankel positive definite matrices, and the criterion to minimize is expressed as an elliptic integral. We show that the Möbius group keeps our problem invariant, and hence minimization can be conducted over cross-sections of orbits. We demonstrate that for the weighted polynomial regression and the Fourier regression with three bases, the tube-volume optimal design forms an orbit of the Möbius group containing D-optimal designs as representative elements.
</p>projecteuclid.org/euclid.ejs/1554429625_20190404220116Thu, 04 Apr 2019 22:01 EDTLeast squares estimation of spatial autoregressive models for large-scale social networkshttps://projecteuclid.org/euclid.ejs/1554429626<strong>Danyang Huang</strong>, <strong>Wei Lan</strong>, <strong>Hao Helen Zhang</strong>, <strong>Hansheng Wang</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1135--1165.</p><p><strong>Abstract:</strong><br/>
Due to the rapid development of various social networks, the spatial autoregressive (SAR) model is becoming an important tool in social network analysis. However, major bottlenecks remain in analyzing large-scale networks (e.g., Facebook has over 700 million active users), including computational scalability, estimation consistency, and proper network sampling. To address these challenges, we propose a novel least squares estimator (LSE) for analyzing large sparse networks based on the SAR model. Computationally, the LSE is linear in the network size, making it scalable to the analysis of huge networks. In theory, the LSE is $\sqrt{n}$-consistent and asymptotically normal under certain regularity conditions. A new LSE-based network sampling technique is further developed, which can automatically adjust autocorrelation between sampled and unsampled units and hence guarantee valid statistical inference. Moreover, we generalize the LSE approach for the classical SAR model to more complex networks associated with multiple sources of social interaction effects. Numerical results for simulated and real data are presented to illustrate the performance of the LSE.
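For intuition only, the following is a naive least-squares fit of the autocorrelation parameter in the pure SAR model $Y=\lambda WY+\varepsilon$, regressing $Y$ on the network lag $WY$; the paper's LSE involves corrections that make it consistent for large sparse networks, which this sketch omits, and the function names are illustrative.

```python
def row_normalize(adj):
    """Row-normalized adjacency matrix W, as commonly used in SAR models."""
    out = []
    for row in adj:
        s = sum(row)
        out.append([a / s if s else 0.0 for a in row])
    return out

def sar_ls(y, w):
    """Naive least-squares fit of lambda in Y = lambda * W Y + eps:
    a no-intercept regression of Y on the network lag W Y."""
    wy = [sum(wij * yj for wij, yj in zip(wrow, y)) for wrow in w]
    num = sum(a * b for a, b in zip(wy, y))
    den = sum(a * a for a in wy)
    return num / den if den else 0.0
```

For a sparse network stored as an adjacency list, the lag $WY$ costs one pass over the edges, which is the sense in which least-squares-type fitting scales linearly in the network size.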
</p>projecteuclid.org/euclid.ejs/1554429626_20190404220116Thu, 04 Apr 2019 22:01 EDTOrder-sensitivity and equivariance of scoring functionshttps://projecteuclid.org/euclid.ejs/1554429627<strong>Tobias Fissler</strong>, <strong>Johanna F. Ziegel</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1166--1211.</p><p><strong>Abstract:</strong><br/>
The relative performance of competing point forecasts is usually measured in terms of loss or scoring functions. It is widely accepted that these scoring functions should be strictly consistent in the sense that the expected score is minimized by the correctly specified forecast for a certain statistical functional, such as the mean, median, or a certain risk measure. Thus, strict consistency opens the way to meaningful forecast comparison, but is also important in regression and M-estimation. Usually, strictly consistent scoring functions for an elicitable functional are not unique. To give guidance on the choice of a scoring function, this paper introduces two additional quality criteria. Order-sensitivity opens the possibility to compare two deliberately misspecified forecasts given that the forecasts are ordered in a certain sense. On the other hand, equivariant scoring functions obey equivariance properties similar to those of the functional at hand, such as translation invariance or positive homogeneity. In our study, we consider scoring functions for popular functionals, putting special emphasis on vector-valued functionals, e.g. the pair (mean, variance) or (Value at Risk, Expected Shortfall).
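Strict consistency can be checked numerically on toy data: under the squared error score the mean attains the lowest average score, while the absolute error score favors the median. A small sketch (data and names are illustrative, not from the paper):

```python
def expected_score(score, forecast, sample):
    """Average realized score of a fixed point forecast over observations."""
    return sum(score(forecast, y) for y in sample) / len(sample)

squared = lambda x, y: (x - y) ** 2    # strictly consistent for the mean
absolute = lambda x, y: abs(x - y)     # strictly consistent for the median
```

On the sample [0, 0, 0, 10] (mean 2.5, median 0), forecasting the mean beats forecasting the median under `squared`, and the ordering reverses under `absolute`, which is exactly why score choice matters when comparing forecasters targeting different functionals.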
</p>projecteuclid.org/euclid.ejs/1554429627_20190404220116Thu, 04 Apr 2019 22:01 EDTFalse discovery rate control via debiased lassohttps://projecteuclid.org/euclid.ejs/1554429628<strong>Adel Javanmard</strong>, <strong>Hamid Javadi</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1212--1253.</p><p><strong>Abstract:</strong><br/>
We consider the problem of variable selection in high-dimensional statistical models, where the goal is to report a set of variables, out of many predictors $X_{1},\dotsc ,X_{p}$, that are relevant to a response of interest. For the linear high-dimensional model, where the number of parameters exceeds the number of samples $(p>n)$, we propose a procedure for variable selection and prove that it controls the directional false discovery rate (FDR) below a pre-assigned significance level $q\in [0,1]$. We further analyze the statistical power of our framework and show that for designs with subgaussian rows and a common precision matrix $\Omega \in{\mathbb{R}} ^{p\times p}$, if the minimum nonzero parameter $\theta_{\min }$ satisfies \[\sqrt{n}\theta_{\min }-\sigma \sqrt{2(\max_{i\in [p]}\Omega_{ii})\log \left(\frac{2p}{qs_{0}}\right)}\to \infty \,,\] then this procedure achieves asymptotic power one.
Our framework is built upon the debiasing approach and assumes the standard condition $s_{0}=o(\sqrt{n}/(\log p)^{2})$, where $s_{0}$ indicates the number of true positives among the $p$ features. Notably, this framework achieves exact directional FDR control without any assumption on the amplitude of unknown regression parameters, and does not require any knowledge of the distribution of covariates or the noise level. We test our method in synthetic and real data experiments to assess its performance and to corroborate our theoretical results.
</p>projecteuclid.org/euclid.ejs/1554429628_20190404220116Thu, 04 Apr 2019 22:01 EDTSimplified vine copula models: Approximations based on the simplifying assumptionhttps://projecteuclid.org/euclid.ejs/1554429629<strong>Fabian Spanhel</strong>, <strong>Malte S. Kurz</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1254--1291.</p><p><strong>Abstract:</strong><br/>
Vine copulas, or pair-copula constructions, have become an important tool in high-dimensional dependence modeling. Commonly, it is assumed that the data generating copula can be represented by a simplified vine copula (SVC). In this paper, we study the simplifying assumption and investigate the approximation of multivariate copulas by SVCs. We introduce the partial vine copula (PVC), a particular SVC in which a $j$-th order partial copula is assigned to each edge. The PVC generalizes the partial correlation matrix and plays a major role in the approximation of copulas by SVCs. We investigate to what extent the PVC describes the dependence structure of the underlying copula. We show that, in general, the PVC does not minimize the Kullback-Leibler divergence from the true copula if the simplifying assumption does not hold. However, under regularity conditions, stepwise estimators of pair-copula constructions converge to the PVC irrespective of whether the simplifying assumption holds or not. Moreover, we elucidate why the PVC is often the best feasible SVC approximation in practice.
</p>projecteuclid.org/euclid.ejs/1554429629_20190404220116Thu, 04 Apr 2019 22:01 EDTCoarse-to-fine multiple testing strategieshttps://projecteuclid.org/euclid.ejs/1554451243<strong>Kamel Lahouel</strong>, <strong>Donald Geman</strong>, <strong>Laurent Younes</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1292--1328.</p><p><strong>Abstract:</strong><br/>
We analyze control of the familywise error rate (FWER) in a multiple testing scenario with a great many null hypotheses about the distribution of a high-dimensional random variable among which only a very small fraction are false, or “active”. In order to improve power relative to conservative Bonferroni bounds, we explore a coarse-to-fine procedure adapted to a situation in which tests are partitioned into subsets, or “cells”, and active hypotheses tend to cluster within cells. We develop procedures for a non-parametric case based on generalized permutation testing and a linear Gaussian model, and demonstrate higher power than Bonferroni estimates at the same FWER when the active hypotheses do cluster. The main technical difficulty arises from the correlation between the test statistics at the individual and cell levels, which increases the likelihood of a hypothesis being falsely discovered when the cell that contains it is falsely discovered (survivorship bias). This requires sharp estimates of certain quadrant probabilities when a cell is inactive.
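A deliberately naive two-stage sketch of the coarse-to-fine idea: threshold cells first on their minimum p-value, then apply Bonferroni only within the surviving cells. The paper's calibrated procedure, which corrects for the survivorship bias described above, is not reproduced here; the data structure is an assumption for illustration.

```python
def coarse_to_fine(cells, alpha=0.05):
    """Two-stage selection: Bonferroni across cells on each cell's minimum
    p-value, then Bonferroni within the surviving cells only.
    `cells` maps a cell id to a dict of hypothesis id -> p-value."""
    # stage 1: keep cells whose smallest p-value survives a cell-level Bonferroni
    survivors = {c: ps for c, ps in cells.items()
                 if min(ps.values()) <= alpha / len(cells)}
    # stage 2: Bonferroni only over hypotheses inside surviving cells
    n_fine = sum(len(ps) for ps in survivors.values()) or 1
    return sorted(h for ps in survivors.values()
                  for h, p in ps.items() if p <= alpha / n_fine)
```

When active hypotheses cluster in few cells, the stage-2 correction divides by far fewer tests than a global Bonferroni over all hypotheses, which is the source of the power gain; the cost is the dependence between the two stages that the naive version ignores.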
</p>projecteuclid.org/euclid.ejs/1554451243_20190405040056Fri, 05 Apr 2019 04:00 EDTThe nonparametric LAN expansion for discretely observed diffusionshttps://projecteuclid.org/euclid.ejs/1554451244<strong>Sven Wang</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1329--1358.</p><p><strong>Abstract:</strong><br/>
Consider a scalar reflected diffusion $(X_{t}:t\geq 0)$, where the unknown drift function $b$ is modelled nonparametrically. We show that in the low frequency sampling case, when the sample consists of $(X_{0},X_{\Delta },...,X_{n\Delta })$ for some fixed sampling distance $\Delta >0$, the model satisfies the local asymptotic normality (LAN) property, assuming that $b$ satisfies some mild regularity assumptions. This is established by using the connections of diffusion processes to elliptic and parabolic PDEs. The key tools used are regularity estimates for certain parabolic PDEs as well as a detailed analysis of the spectral properties of the elliptic differential operator related to $(X_{t}:t\geq 0)$.
</p>projecteuclid.org/euclid.ejs/1554451244_20190405040056Fri, 05 Apr 2019 04:00 EDTEstimating the reach of a manifoldhttps://projecteuclid.org/euclid.ejs/1555056153<strong>Eddie Aamari</strong>, <strong>Jisu Kim</strong>, <strong>Frédéric Chazal</strong>, <strong>Bertrand Michel</strong>, <strong>Alessandro Rinaldo</strong>, <strong>Larry Wasserman</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1359--1399.</p><p><strong>Abstract:</strong><br/>
Various problems in manifold estimation make use of a quantity called the reach, denoted by $\tau_{M}$, which is a measure of the regularity of the manifold. This paper is the first investigation into the problem of how to estimate the reach. First, we study the geometry of the reach through an approximation perspective. We derive new geometric results on the reach for submanifolds without boundary. An estimator $\hat{\tau }$ of $\tau_{M}$ is proposed in an oracle framework where tangent spaces are known, and bounds assessing its efficiency are derived. In the case of an i.i.d. random point cloud $\mathbb{X}_{n}$, $\hat{\tau }(\mathbb{X}_{n})$ is shown to achieve uniform expected loss bounds over a $\mathcal{C}^{3}$-like model. Finally, we obtain upper and lower bounds on the minimax rate for estimating the reach.
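In the oracle framework with known tangent spaces, an estimator of this kind takes the infimum of $\|q-p\|^{2}/\bigl(2\,d(q-p,T_{p}M)\bigr)$ over sample pairs. A sketch for a curve in the plane, where the distance to the tangent line reduces to the component along the unit normal; the circle example below is illustrative, and for a circle of radius $r$ the ratio equals $r$ for every pair.

```python
import math

def reach_estimator(points, normals):
    """Oracle-type reach estimator for a plane curve: the infimum over
    ordered pairs (p, q) of |q - p|^2 / (2 * dist(q - p, T_p)), where the
    distance to the tangent line T_p is the component along the unit normal."""
    best = float("inf")
    for p, n in zip(points, normals):
        for q in points:
            if q == p:
                continue
            dx, dy = q[0] - p[0], q[1] - p[1]
            normal_part = abs(dx * n[0] + dy * n[1])
            if normal_part > 1e-12:   # skip pairs lying in the tangent line
                best = min(best, (dx * dx + dy * dy) / (2 * normal_part))
    return best

# points sampled from a circle of radius 2, whose reach is exactly 2
angles = [2 * math.pi * k / 12 for k in range(12)]
circle = [(2 * math.cos(t), 2 * math.sin(t)) for t in angles]
circle_normals = [(math.cos(t), math.sin(t)) for t in angles]
```

On the circle the estimator recovers the reach exactly from any finite sample, which is why the statistical difficulty lies in estimating the tangent spaces and in surfaces whose reach is attained by bottleneck, rather than curvature, structure.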
</p>projecteuclid.org/euclid.ejs/1555056153_20190412040257Fri, 12 Apr 2019 04:02 EDTImproved inference in generalized mean-reverting processes with multiple change-pointshttps://projecteuclid.org/euclid.ejs/1555380048<strong>Sévérien Nkurunziza</strong>, <strong>Kang Fu</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1400--1442.</p><p><strong>Abstract:</strong><br/>
In this paper, we consider an inference problem about the drift parameter vector in generalized mean-reverting processes with multiple unknown change-points. In particular, we study the case where the parameter may satisfy an uncertain restriction. Compared with the results in the literature, we generalize existing findings in five ways. First, we consider a model which incorporates the uncertain prior knowledge. Second, we derive the unrestricted estimator (UE) and the restricted estimator (RE) and study their asymptotic properties. Third, we derive a test for the hypothesized restriction and derive its asymptotic local power; we also prove that the proposed test is consistent. Fourth, we construct a class of shrinkage-type estimators (SEs) which encloses the UE, the RE and classical SEs. Fifth, we derive the relative risk dominance of the proposed estimators; more precisely, we prove that the SEs dominate the UE. Finally, we present simulation results which corroborate the established theoretical findings.
</p>projecteuclid.org/euclid.ejs/1555380048_20190415220137Mon, 15 Apr 2019 22:01 EDTMixed-normal limit theorems for multiple Skorohod integrals in high-dimensions, with application to realized covariancehttps://projecteuclid.org/euclid.ejs/1555380049<strong>Yuta Koike</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1443--1522.</p><p><strong>Abstract:</strong><br/>
This paper develops mixed-normal approximations for probabilities that vectors of multiple Skorohod integrals belong to random convex polytopes when the dimensions of the vectors possibly diverge to infinity. We apply the developed theory to establish the asymptotic mixed normality of the realized covariance matrix of a high-dimensional continuous semimartingale observed at high frequency, where the dimension can be much larger than the sample size. We also present an application of this result to testing the residual sparsity of a high-dimensional continuous-time factor model.
</p>projecteuclid.org/euclid.ejs/1555380049_20190415220137Mon, 15 Apr 2019 22:01 EDTAdaptive confidence sets for kink estimationhttps://projecteuclid.org/euclid.ejs/1555380050<strong>Viktor Bengs</strong>, <strong>Hajo Holzmann</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1523--1579.</p><p><strong>Abstract:</strong><br/>
We consider estimation of the location and the height of the jump in the $\gamma $-th derivative (a kink of order $\gamma $) of a regression curve, which is assumed to be Hölder smooth of order $s\geq \gamma +1$ away from the kink. Optimal convergence rates as well as the joint asymptotic normal distribution of estimators based on the zero-crossing-time technique are established. Further, we construct joint as well as marginal asymptotic confidence sets for these parameters which are honest and adaptive with respect to the smoothness parameter $s$ over subsets of the Hölder classes. The finite-sample performance is investigated in a simulation study, and a real-data illustration is given using a series of annual global surface temperatures.
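To fix ideas, a kink of order $\gamma =1$ is a jump in the slope of an otherwise continuous curve. The sketch below locates such a kink and estimates the height of the slope jump by a naive least-squares grid search over candidate break locations; this is an illustration only, not the zero-crossing-time estimator analyzed in the paper, and the simulated curve and noise level are assumptions.

```python
import numpy as np

def fit_kink(x, y, grid):
    """Grid-search least-squares fit of a continuous piecewise-linear curve
    (a kink of order gamma = 1).  Returns the estimated kink location and
    the height of the jump in the first derivative."""
    best = (np.inf, None, None)
    for t in grid:
        # design: intercept, global slope, and slope change after the kink at t
        X = np.column_stack([np.ones_like(x), x, np.maximum(x - t, 0.0)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        ssr = resid @ resid
        if ssr < best[0]:
            best = (ssr, t, beta[2])     # beta[2] is the slope jump
    return best[1], best[2]

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 1.0, 400))
# true curve: kink at 0.6 with slope jump of height 2
y = 1.0 + 0.5 * x + 2.0 * np.maximum(x - 0.6, 0.0) + 0.05 * rng.normal(size=400)
t_hat, h_hat = fit_kink(x, y, np.linspace(0.05, 0.95, 181))
```

With 400 noisy observations the grid search recovers the location and jump height closely; the paper's zero-crossing-time construction additionally yields rates and asymptotic normality, which a grid search does not provide.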
</p>projecteuclid.org/euclid.ejs/1555380050_20190415220137Mon, 15 Apr 2019 22:01 EDTA preferential attachment model for the stellar initial mass functionhttps://projecteuclid.org/euclid.ejs/1555380051<strong>Jessi Cisewski-Kehe</strong>, <strong>Grant Weller</strong>, <strong>Chad Schafer</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1580--1607.</p><p><strong>Abstract:</strong><br/>
Accurate specification of a likelihood function is becoming increasingly difficult in many inference problems in astronomy. As sample sizes resulting from astronomical surveys continue to grow, deficiencies in the likelihood function lead to larger biases in key parameter estimates. These deficiencies result from the oversimplification of the physical processes that generated the data, and from the failure to account for observational limitations. Unfortunately, realistic models often do not yield an analytical form for the likelihood. The estimation of a stellar initial mass function (IMF) is an important example. The stellar IMF is the mass distribution of stars initially formed in a given cluster of stars, a population which is not directly observable due to stellar evolution, other disruptions, and observational limitations of the cluster. There are several difficulties with specifying a likelihood in this setting, since the physical processes and observational challenges result in measurable masses that cannot legitimately be considered independent draws from an IMF. This work improves inference of the IMF by using an approximate Bayesian computation approach that accounts for both observational and astrophysical effects and incorporates a physically-motivated model for star cluster formation. The methodology is illustrated via a simulation study, demonstrating that the proposed approach can recover the true posterior in realistic situations, and is applied to observations from astrophysical simulation data.
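The core of approximate Bayesian computation is simple: simulate data from the forward model for parameter values drawn from the prior, and accept draws whose simulated summary statistics land close to the observed ones. The toy sketch below applies ABC rejection to a power-law mass function with a hard detection limit; the Pareto form, the detection cutoff, the summary statistic, and all numerical settings are illustrative assumptions, not the star-cluster formation model or summaries used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_masses(alpha, n=2000, m_min=0.5, m_detect=1.0):
    # toy "IMF": Pareto(alpha) masses above m_min, of which only those above
    # a detection limit are observed -- a crude stand-in for truncation
    m = m_min * rng.uniform(size=n) ** (-1.0 / alpha)
    return m[m > m_detect]

def summary(m):
    # mean log-mass of detected stars: a simple, informative summary statistic
    return np.mean(np.log(m))

alpha_true = 2.35                       # Salpeter-like slope for the toy truth
s_obs = summary(simulate_masses(alpha_true))

# ABC rejection: keep prior draws whose simulated summary is close to s_obs
prior_draws = rng.uniform(1.5, 3.5, size=5000)
accepted = [a for a in prior_draws
            if abs(summary(simulate_masses(a)) - s_obs) < 0.02]
alpha_hat = float(np.mean(accepted))
```

The accepted draws approximate the posterior without ever evaluating a likelihood, which is precisely what makes ABC attractive when truncation and selection effects defeat an analytical form.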
</p>projecteuclid.org/euclid.ejs/1555380051_20190415220137Mon, 15 Apr 2019 22:01 EDTFully Bayesian estimation under informative samplinghttps://projecteuclid.org/euclid.ejs/1555466479<strong>Luis G. León-Novelo</strong>, <strong>Terrance D. Savitsky</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1608--1645.</p><p><strong>Abstract:</strong><br/>
Survey data are often collected under informative sampling designs where subject inclusion probabilities are designed to be correlated with the response variable of interest. The data modeler seeks to estimate, from these data, the parameters of a population model that they specify. Sampling weights constructed from marginal inclusion probabilities are typically used to form an exponentiated pseudo likelihood as a plug-in estimator in a partially Bayesian pseudo posterior. We introduce the first fully Bayesian alternative, based on a Bayes rule construction, that simultaneously performs weight smoothing and estimates the population model parameters, treating the response variable(s) and inclusion probabilities as jointly generated from a population distribution. We formulate conditions on known marginal and pairwise inclusion probabilities that define a class of sampling designs where $L_{1}$ consistency of the joint posterior is guaranteed. We compare the performance of the two approaches on synthetic data. We demonstrate that the credibility intervals under our fully Bayesian method achieve nominal coverage. We apply our method to data from the National Health and Nutrition Examination Survey to explore the relationship between caffeine consumption and systolic blood pressure.
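The pseudo-likelihood idea being improved upon can be seen in a toy example: under an informative design, each sampled unit's log-likelihood contribution is exponentiated by (multiplied by) its sampling weight $w_{i}=1/\pi_{i}$, which for a normal mean model reduces to a weighted sample mean. The design, weights, and model below are illustrative assumptions, not the NHANES analysis or the fully Bayesian construction of the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

# population: response correlated with inclusion probability (informative design)
N = 200_000
y = rng.normal(0.0, 1.0, N)
pi = 1.0 / (1.0 + np.exp(-(y - 2.0)))    # larger y => more likely to be sampled
take = rng.uniform(size=N) < pi
ys, ws = y[take], 1.0 / pi[take]         # sampled responses and their weights

# pseudo-MLE of a normal mean: maximize sum_i w_i * log f(y_i; mu),
# which for the normal model is just the weighted sample mean
mu_weighted = np.sum(ws * ys) / np.sum(ws)
mu_naive = np.mean(ys)                   # ignores the design, hence biased
```

The unweighted mean is pulled well above the true population mean of zero because high-response units are over-represented, while the weighted pseudo-MLE corrects the bias; the paper's contribution is a fully Bayesian treatment that also propagates the uncertainty this plug-in approach ignores.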
</p>projecteuclid.org/euclid.ejs/1555466479_20190416220149Tue, 16 Apr 2019 22:01 EDTStrong consistency of the least squares estimator in regression models with adaptive learninghttps://projecteuclid.org/euclid.ejs/1555466480<strong>Norbert Christopeit</strong>, <strong>Michael Massmann</strong>. <p><strong>Source: </strong>Electronic Journal of Statistics, Volume 13, Number 1, 1646--1693.</p><p><strong>Abstract:</strong><br/>
This paper looks at the strong consistency of the ordinary least squares (OLS) estimator in linear regression models with adaptive learning. It is a companion to Christopeit & Massmann (2018), which considers the estimator’s convergence in distribution and its weak consistency in the same setting. Under constant gain learning, the model is closely related to stationary, (alternating) unit root or explosive autoregressive processes. Under decreasing gain learning, the regressors in the model are asymptotically collinear. The paper examines, first, the issue of strong convergence of the learning recursion: it is argued that, under constant gain learning, the recursion does not converge in any probabilistic sense, while for decreasing gain learning rates are derived at which the recursion converges almost surely to the rational expectations equilibrium. Second, the paper establishes the strong consistency of the OLS estimators, under both constant and decreasing gain learning, as well as rates at which the estimators converge almost surely. In the constant gain model, separate estimators for the intercept and slope parameters are juxtaposed to the joint estimator, drawing on the recent literature on explosive autoregressive models. Third, it is emphasised that strong consistency is obtained in all models although the near-optimal condition for the strong consistency of OLS in linear regression models with stochastic regressors, established by Lai & Wei (1982a), is not always met.
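A stylized version of the decreasing-gain setting can be simulated directly: agents' forecasts are updated by a Robbins-Monro-type recursion with gain $1/t$, the forecast serves as the regressor, and the regressor converges to the rational expectations equilibrium, producing the asymptotic collinearity the abstract refers to. The specific model below (one intercept, one learning regressor, gain $1/t$) is an assumed illustration, not the paper's exact specification.

```python
import numpy as np

rng = np.random.default_rng(3)

# stylized adaptive-learning model:
#   y_t = alpha + beta * a_{t-1} + eps_t
#   a_t = a_{t-1} + (1/t) * (y_t - a_{t-1})   (decreasing gain)
alpha, beta, T = 1.0, 0.5, 200_000
ree = alpha / (1.0 - beta)               # rational expectations equilibrium = 2
eps = 0.5 * rng.normal(size=T)

a = 0.0
ys = np.empty(T)
regs = np.empty(T)                       # lagged forecasts a_{t-1}, the regressor
for t in range(T):
    regs[t] = a
    ys[t] = alpha + beta * a + eps[t]
    a += (ys[t] - a) / (t + 1)           # decreasing-gain learning update

# OLS of y_t on (1, a_{t-1}); the regressor converges to the constant `ree`,
# so the design matrix becomes asymptotically collinear
X = np.column_stack([np.ones(T), regs])
(alpha_hat, beta_hat), *_ = np.linalg.lstsq(X, ys, rcond=None)
```

Despite the vanishing variation in the regressor, the fitted relationship still pins down the equilibrium; the paper's contribution is to prove almost-sure convergence of the recursion and of the OLS estimators, with rates, even where Lai & Wei's classical condition fails.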
</p>projecteuclid.org/euclid.ejs/1555466480_20190416220149Tue, 16 Apr 2019 22:01 EDT