<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>The Annals of Statistics Articles (Project Euclid)</title>
    <link>http://projecteuclid.org/euclid.aos</link>
    <description>The latest articles from The Annals of Statistics on Project Euclid, a site for mathematics and statistics resources.</description>
    <language>en-us</language>
    <copyright>Copyright 2010 Cornell University Library</copyright>
    <webMaster>Euclid-L@cornell.edu (Project Euclid Team)</webMaster>
    <pubDate>Thu, 05 Aug 2010 15:41 EDT</pubDate>
    <lastBuildDate>Tue, 07 Jun 2011 09:09 EDT</lastBuildDate>
    <image>
      <url>http://projecteuclid.org/collection/euclid/images/logo_linking_100.gif</url>
      <title>Project Euclid</title>
      <link>http://projecteuclid.org/</link>
    </image>
    <item>
      <title>Bayes and empirical-Bayes multiplicity adjustment in the variable-selection problem</title>
      <link>http://projecteuclid.org/euclid.aos/1278861454</link>
      <description>&lt;strong&gt;James G. Scott&lt;/strong&gt;, &lt;strong&gt;James O. Berger&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 38, Number 5, 2587--2619.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
This paper studies the multiplicity-correction effect of standard Bayesian variable-selection priors in linear regression. Our first goal is to clarify when, and how, multiplicity correction happens automatically in Bayesian analysis, and to distinguish this correction from the Bayesian Ockham’s-razor effect. Our second goal is to contrast empirical-Bayes and fully Bayesian approaches to variable selection through examples, theoretical results and simulations. Considerable differences between the two approaches are found. In particular, we prove a theorem that characterizes a surprising asymptotic discrepancy between fully Bayes and empirical Bayes. This discrepancy arises from a different source than the failure to account for hyperparameter uncertainty in the empirical-Bayes estimate. Indeed, even at the extreme, when the empirical-Bayes estimate converges asymptotically to the true variable-inclusion probability, the potential for a serious difference remains.
 
 &lt;/p&gt;</description>
      <guid isPermaLink="false">projecteuclid.org/euclid.aos/1278861454_Thu, 05 Aug 2010 15:41 EDT</guid>
      <pubDate>Thu, 05 Aug 2010 15:41 EDT</pubDate>
    </item>
    
    
    
    
    
    
    
    
    
  <item><title>Regularized rank-based estimation of high-dimensional nonparanormal graphical models</title><link>http://projecteuclid.org/euclid.aos/1359987530</link><description>&lt;strong&gt;Lingzhou Xue&lt;/strong&gt;, &lt;strong&gt;Hui Zou&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 5, 2541--2571.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
A sparse precision matrix can be directly translated into a sparse Gaussian graphical model under the assumption that the data follow a joint normal distribution. This neat property makes high-dimensional precision matrix estimation very appealing in many applications. However, in practice we often face nonnormal data, and variable transformation is often used to achieve normality. In this paper we consider the nonparanormal model that assumes that the variables follow a joint normal distribution after a set of unknown monotone transformations. The nonparanormal model is much more flexible than the normal model while retaining the good interpretability of the latter in that each zero entry in the sparse precision matrix of the nonparanormal model corresponds to a pair of conditionally independent variables. In this paper we show that the nonparanormal graphical model can be efficiently estimated by using a rank-based estimation scheme which does not require estimating these unknown transformation functions. In particular, we study the rank-based graphical lasso, the rank-based neighborhood Dantzig selector and the rank-based CLIME. We establish their theoretical properties in the setting where the dimension is nearly exponentially large relative to the sample size. It is shown that the proposed rank-based estimators work as well as their oracle counterparts defined with the oracle data. Furthermore, the theory motivates us to consider the adaptive version of the rank-based neighborhood Dantzig selector and the rank-based CLIME that are shown to enjoy graphical model selection consistency without assuming the irrepresentable condition for the oracle and rank-based graphical lasso. Simulated and real data are used to demonstrate the finite performance of the rank-based estimators.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1359987530_Mon, 04 Feb 2013 09:20 EST</guid><pubDate>Mon, 04 Feb 2013 09:20 EST</pubDate></item><item><title>On false discovery rate thresholding for classification under sparsity</title><link>http://projecteuclid.org/euclid.aos/1359987531</link><description>&lt;strong&gt;Pierre Neuvial&lt;/strong&gt;, &lt;strong&gt;Etienne Roquain&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 5, 2572--2600.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We study the properties of false discovery rate (FDR) thresholding, viewed as a classification procedure. The “$0$”-class (null) is assumed to have a known density while the “$1$”-class (alternative) is obtained from the “$0$”-class either by translation or by scaling. Furthermore, the “$1$”-class is assumed to have a small number of elements w.r.t. the “$0$”-class (sparsity). We focus on densities of the Subbotin family, including Gaussian and Laplace models. Nonasymptotic oracle inequalities are derived for the excess risk of FDR thresholding. These inequalities lead to explicit rates of convergence of the excess risk to zero, as the number $m$ of items to be classified tends to infinity and in a regime where the power of the Bayes rule is away from $0$ and $1$. Moreover, these theoretical investigations suggest an explicit choice for the target level $\alpha_{m}$ of FDR thresholding, as a function of $m$. Our oracle inequalities show theoretically that the resulting FDR thresholding adapts to the unknown sparsity regime contained in the data. This property is illustrated with numerical experiments.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1359987531_Mon, 04 Feb 2013 09:20 EST</guid><pubDate>Mon, 04 Feb 2013 09:20 EST</pubDate></item><item><title>Nonparametric regression for locally stationary time series</title><link>http://projecteuclid.org/euclid.aos/1359987532</link><description>&lt;strong&gt;Michael Vogt&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 5, 2601--2633.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
In this paper, we study nonparametric models allowing for locally stationary regressors and a regression function that changes smoothly over time. These models are a natural extension of time series models with time-varying coefficients. We introduce a kernel-based method to estimate the time-varying regression function and provide asymptotic theory for our estimates. Moreover, we show that the main conditions of the theory are satisfied for a large class of nonlinear autoregressive processes with a time-varying regression function. Finally, we examine structured models where the regression function splits up into time-varying additive components. As will be seen, estimation in these models does not suffer from the curse of dimensionality.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1359987532_Mon, 04 Feb 2013 09:20 EST</guid><pubDate>Mon, 04 Feb 2013 09:20 EST</pubDate></item><item><title>Multivariate varying coefficient model for functional responses</title><link>http://projecteuclid.org/euclid.aos/1359987533</link><description>&lt;strong&gt;Hongtu Zhu&lt;/strong&gt;, &lt;strong&gt;Runze Li&lt;/strong&gt;, &lt;strong&gt;Linglong Kong&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 5, 2634--2666.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Motivated by recent work studying massive imaging data in the neuroimaging literature, we propose multivariate varying coefficient models (MVCM) for modeling the relation between multiple functional responses and a set of covariates. We develop several statistical inference procedures for MVCM and systematically study their theoretical properties. We first establish the weak convergence of the local linear estimate of coefficient functions, as well as its asymptotic bias and variance, and then we derive asymptotic bias and mean integrated squared error of smoothed individual functions and their uniform convergence rate. We establish the uniform convergence rate of the estimated covariance function of the individual functions and its associated eigenvalue and eigenfunctions. We propose a global test for linear hypotheses of varying coefficient functions, and derive its asymptotic distribution under the null hypothesis. We also propose a simultaneous confidence band for each individual effect curve. We conduct Monte Carlo simulation to examine the finite-sample performance of the proposed procedures. We apply MVCM to investigate the development of white matter diffusivities along the genu tract of the corpus callosum in a clinical study of neurodevelopment.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1359987533_Mon, 04 Feb 2013 09:20 EST</guid><pubDate>Mon, 04 Feb 2013 09:20 EST</pubDate></item><item><title>Tight conditions for consistency of variable selection in the context of high dimensionality</title><link>http://projecteuclid.org/euclid.aos/1359987534</link><description>&lt;strong&gt;Laëtitia Comminges&lt;/strong&gt;, &lt;strong&gt;Arnak S. Dalalyan&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 5, 2667--2696.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We address the issue of variable selection in the regression model with very high ambient dimension, that is, when the number of variables is very large. The main focus is on the situation where the number of relevant variables, called intrinsic dimension, is much smaller than the ambient dimension $d$. Without assuming any parametric form of the underlying regression function, we get tight conditions making it possible to consistently estimate the set of relevant variables. These conditions relate the intrinsic dimension to the ambient dimension and to the sample size. The procedure that is provably consistent under these tight conditions is based on comparing quadratic functionals of the empirical Fourier coefficients with appropriately chosen threshold values.
 
 
The asymptotic analysis reveals the presence of two quite different regimes. The first regime is when the intrinsic dimension is fixed. In this case the situation in nonparametric regression is the same as in linear regression, that is, consistent variable selection is possible if and only if $\log d$ is small compared to the sample size $n$. The picture is different in the second regime, that is, when the number of relevant variables denoted by $s$ tends to infinity as $n\to\infty$. Then we prove that consistent variable selection in nonparametric set-up is possible only if $s+\log\log d$ is small compared to $\log n$. We apply these results to derive minimax separation rates for the problem of variable selection.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1359987534_Mon, 04 Feb 2013 09:20 EST</guid><pubDate>Mon, 04 Feb 2013 09:20 EST</pubDate></item><item><title>Asymptotic properties of the maximum likelihood estimation in misspecified hidden Markov models</title><link>http://projecteuclid.org/euclid.aos/1359987535</link><description>&lt;strong&gt;Randal Douc&lt;/strong&gt;, &lt;strong&gt;Eric Moulines&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 5, 2697--2732.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Let $(Y_{k})_{k\in\mathbb{Z}}$ be a stationary sequence on a probability space $(\Omega,\mathcal{A},\mathbb{P})$ taking values in a standard Borel space $\mathsf{Y}$. Consider the associated maximum likelihood estimator with respect to a parametrized family of hidden Markov models such that the law of the observations $(Y_{k})_{k\in\mathbb{Z}}$ is not assumed to be described by any of the hidden Markov models of this family. In this paper we investigate the consistency of this estimator in such misspecified models under mild assumptions.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1359987535_Mon, 04 Feb 2013 09:20 EST</guid><pubDate>Mon, 04 Feb 2013 09:20 EST</pubDate></item><item><title>Optimal weighted nearest neighbour classifiers</title><link>http://projecteuclid.org/euclid.aos/1359987536</link><description>&lt;strong&gt;Richard J. Samworth&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 5, 2733--2763.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We derive an asymptotic expansion for the excess risk (regret) of a weighted nearest-neighbour classifier. This allows us to find the asymptotically optimal vector of nonnegative weights, which has a rather simple form. We show that the ratio of the regret of this classifier to that of an unweighted $k$-nearest neighbour classifier depends asymptotically only on the dimension $d$ of the feature vectors, and not on the underlying populations. The improvement is greatest when $d=4$, but thereafter decreases as $d\rightarrow\infty$. The popular bagged nearest neighbour classifier can also be regarded as a weighted nearest neighbour classifier, and we show that its corresponding weights are somewhat suboptimal when $d$ is small (in particular, worse than those of the unweighted $k$-nearest neighbour classifier when $d=1$), but are close to optimal when $d$ is large. Finally, we argue that improvements in the rate of convergence are possible under stronger smoothness assumptions, provided we allow negative weights. Our findings are supported by an empirical performance comparison on both simulated and real data sets.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1359987536_Mon, 04 Feb 2013 09:20 EST</guid><pubDate>Mon, 04 Feb 2013 09:20 EST</pubDate></item><item><title>Adaptive functional linear regression</title><link>http://projecteuclid.org/euclid.aos/1360332183</link><description>&lt;strong&gt;Fabienne Comte&lt;/strong&gt;, &lt;strong&gt;Jan Johannes&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 2765--2797.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We consider the estimation of the slope function in functional linear regression, where scalar responses are modeled in dependence of random functions. Cardot and Johannes [ J. Multivariate Anal. 101 (2010) 395–408] have shown that a thresholded projection estimator can attain up to a constant minimax-rates of convergence in a general framework which allows us to cover the prediction problem with respect to the mean squared prediction error as well as the estimation of the slope function and its derivatives. This estimation procedure, however, requires an optimal choice of a tuning parameter with regard to certain characteristics of the slope function and the covariance operator associated with the functional regressor. As this information is usually inaccessible in practice, we investigate a fully data-driven choice of the tuning parameter which combines model selection and Lepski’s method. It is inspired by the recent work of Goldenshluger and Lepski [ Ann. Statist. 39 (2011) 1608–1632]. The tuning parameter is selected as minimizer of a stochastic penalized contrast function imitating Lepski’s method among a random collection of admissible values. This choice of the tuning parameter depends only on the data and we show that within the general framework the resulting data-driven thresholded projection estimator can attain minimax-rates up to a constant over a variety of classes of slope functions and covariance operators. The results are illustrated considering different configurations which cover in particular the prediction problem as well as the estimation of the slope and its derivatives. A simulation study shows the reasonable performance of the fully data-driven estimation procedure.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1360332183_Fri, 08 Feb 2013 09:03 EST</guid><pubDate>Fri, 08 Feb 2013 09:03 EST</pubDate></item><item><title>On the uniform asymptotic validity of subsampling and the bootstrap</title><link>http://projecteuclid.org/euclid.aos/1360332184</link><description>&lt;strong&gt;Joseph P. Romano&lt;/strong&gt;, &lt;strong&gt;Azeem M. Shaikh&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 2798--2822.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
This paper provides conditions under which subsampling and the bootstrap can be used to construct estimators of the quantiles of the distribution of a root that behave well uniformly over a large class of distributions $\mathbf{P}$. These results are then applied (i) to construct confidence regions that behave well uniformly over $\mathbf{P}$ in the sense that the coverage probability tends to at least the nominal level uniformly over $\mathbf{P}$ and (ii) to construct tests that behave well uniformly over $\mathbf{P}$ in the sense that the size tends to no greater than the nominal level uniformly over $\mathbf{P}$. Without these stronger notions of convergence, the asymptotic approximations to the coverage probability or size may be poor, even in very large samples. Specific applications include the multivariate mean, testing moment inequalities, multiple testing, the empirical process and $U$-statistics.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1360332184_Fri, 08 Feb 2013 09:03 EST</guid><pubDate>Fri, 08 Feb 2013 09:03 EST</pubDate></item><item><title>Convergence analysis of the Gibbs sampler for Bayesian general linear mixed models with improper priors</title><link>http://projecteuclid.org/euclid.aos/1360332185</link><description>&lt;strong&gt;Jorge Carlos Román&lt;/strong&gt;, &lt;strong&gt;James P. Hobert&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 2823--2849.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Bayesian analysis of data from the general linear mixed model is challenging because any nontrivial prior leads to an intractable posterior density. However, if a conditionally conjugate prior density is adopted, then there is a simple Gibbs sampler that can be employed to explore the posterior density. A popular default among the conditionally conjugate priors is an improper prior that takes a product form with a flat prior on the regression parameter, and so-called power priors on each of the variance components. In this paper, a convergence rate analysis of the corresponding Gibbs sampler is undertaken. The main result is a simple, easily-checked sufficient condition for geometric ergodicity of the Gibbs–Markov chain. This result is close to the best possible result in the sense that the sufficient condition is only slightly stronger than what is required to ensure posterior propriety. The theory developed in this paper is extremely important from a practical standpoint because it guarantees the existence of central limit theorems that allow for the computation of valid asymptotic standard errors for the estimates computed using the Gibbs sampler.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1360332185_Fri, 08 Feb 2013 09:03 EST</guid><pubDate>Fri, 08 Feb 2013 09:03 EST</pubDate></item><item><title>Optimal two-stage procedures for estimating location and size of the maximum of a multivariate regression function</title><link>http://projecteuclid.org/euclid.aos/1360332186</link><description>&lt;strong&gt;Eduard Belitser&lt;/strong&gt;, &lt;strong&gt;Subhashis Ghosal&lt;/strong&gt;, &lt;strong&gt;Harry van Zanten&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 2850--2876.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We propose a two-stage procedure for estimating the location $\mathbf{\mu}$ and size $M$ of the maximum of a smooth $d$-variate regression function $f$. In the first stage, a preliminary estimator of $\mathbf{\mu}$ obtained from a standard nonparametric smoothing method is used. At the second stage, we “zoom-in” near the vicinity of the preliminary estimator and make further observations at some design points in that vicinity. We fit an appropriate polynomial regression model to estimate the location and size of the maximum. We establish that, under suitable smoothness conditions and appropriate choice of the zooming, the second stage estimators have better convergence rates than the corresponding first stage estimators of $\mathbf{\mu}$ and $M$. More specifically, for $\alpha$-smooth regression functions, the optimal nonparametric rates $n^{-(\alpha-1)/(2\alpha+d)}$ and $n^{-\alpha/(2\alpha+d)}$ at the first stage can be improved to $n^{-(\alpha-1)/(2\alpha)}$ and $n^{-1/2}$, respectively, for $\alpha&amp;gt;1+\sqrt{1+d/2}$. These rates are optimal in the class of all possible sequential estimators. Interestingly, the two-stage procedure resolves “the curse of the dimensionality” problem to some extent, as the dimension $d$ does not control the second stage convergence rates, provided that the function class is sufficiently smooth. We consider a multi-stage generalization of our procedure that attains the optimal rate for any smoothness level $\alpha&amp;gt;2$ starting with a preliminary estimator with any power-law rate at the first stage.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1360332186_Fri, 08 Feb 2013 09:03 EST</guid><pubDate>Fri, 08 Feb 2013 09:03 EST</pubDate></item><item><title>Parametric estimation. Finite sample theory</title><link>http://projecteuclid.org/euclid.aos/1360332187</link><description>&lt;strong&gt;Vladimir Spokoiny&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 2877--2909.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
The paper aims at reconsidering the famous Le Cam LAN theory. The main features of the approach which make it different from the classical one are as follows: (1) the study is nonasymptotic, that is, the sample size is fixed and does not tend to infinity; (2) the parametric assumption is possibly misspecified and the underlying data distribution can lie beyond the given parametric family. These two features make it possible to bridge the gap between parametric and nonparametric theory and to build a unified framework for statistical estimation. The main results include large deviation bounds for the (quasi) maximum likelihood and the local quadratic bracketing of the log-likelihood process. The latter yields a number of important corollaries for statistical inference: concentration, confidence and risk bounds, expansion of the maximum likelihood estimate, etc. All these corollaries are stated in a nonclassical way admitting a model misspecification and finite samples. However, the classical asymptotic results including the efficiency bounds can be easily derived as corollaries of the obtained nonasymptotic statements. At the same time, the new bracketing device works well in the situations with large or growing parameter dimension in which the classical parametric theory fails. The general results are illustrated for the i.i.d. setup as well as for generalized linear and median estimation. The results apply for any dimension of the parameter space and provide a quantitative lower bound on the sample size yielding the root-n accuracy.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1360332187_Fri, 08 Feb 2013 09:03 EST</guid><pubDate>Fri, 08 Feb 2013 09:03 EST</pubDate></item><item><title>Rotation and scale space random fields and the Gaussian kinematic formula</title><link>http://projecteuclid.org/euclid.aos/1360332188</link><description>&lt;strong&gt;Robert J. Adler&lt;/strong&gt;, &lt;strong&gt;Eliran Subag&lt;/strong&gt;, &lt;strong&gt;Jonathan E. Taylor&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 2910--2942.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We provide a new approach, along with extensions, to results in two important papers of Worsley, Siegmund and coworkers closely tied to the statistical analysis of fMRI (functional magnetic resonance imaging) brain data. These papers studied approximations for the exceedence probabilities of scale and rotation space random fields, the latter playing an important role in the statistical analysis of fMRI data. The techniques used there came either from the Euler characteristic heuristic or via tube formulae, and to a large extent were carefully attuned to the specific examples of the paper.
 
 
This paper treats the same problem, but via calculations based on the so-called Gaussian kinematic formula. This allows for extensions of the Worsley–Siegmund results to a wide class of non-Gaussian cases. In addition, it allows one to obtain results for rotation space random fields in any dimension via reasonably straightforward Riemannian geometric calculations. Previously only the two-dimensional case could be covered, and then only via computer algebra.
 
 
By adopting this more structured approach to this particular problem, a solution path for other, related problems becomes clearer.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1360332188_Fri, 08 Feb 2013 09:03 EST</guid><pubDate>Fri, 08 Feb 2013 09:03 EST</pubDate></item><item><title>Two-step spline estimating equations for generalized additive partially linear models with large cluster sizes</title><link>http://projecteuclid.org/euclid.aos/1360332189</link><description>&lt;strong&gt;Shujie Ma&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 2943--2972.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We propose a two-step estimating procedure for generalized additive partially linear models with clustered data using estimating equations. Our proposed method applies to the case that the number of observations per cluster is allowed to increase with the number of independent subjects. We establish oracle properties for the two-step estimator of each function component such that it performs as well as the univariate function estimator by assuming that the parametric vector and all other function components are known. Asymptotic distributions and consistency properties of the estimators are obtained. Finite-sample experiments with both simulated continuous and binary response variables confirm the asymptotic results. We illustrate the methods with an application to a U.S. unemployment data set.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1360332189_Fri, 08 Feb 2013 09:03 EST</guid><pubDate>Fri, 08 Feb 2013 09:03 EST</pubDate></item><item><title>Independent component analysis via nonparametric maximum likelihood estimation</title><link>http://projecteuclid.org/euclid.aos/1360332190</link><description>&lt;strong&gt;Richard J. Samworth&lt;/strong&gt;, &lt;strong&gt;Ming Yuan&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 2973--3002.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Independent Component Analysis (ICA) models are very popular semiparametric models in which we observe independent copies of a random vector $X=AS$, where $A$ is a non-singular matrix and $S$ has independent components. We propose a new way of estimating the unmixing matrix $W=A^{-1}$ and the marginal distributions of the components of $S$ using nonparametric maximum likelihood. Specifically, we study the projection of the empirical distribution onto the subset of ICA distributions having log-concave marginals. We show that, from the point of view of estimating the unmixing matrix, it makes no difference whether or not the log-concavity is correctly specified. The approach is further justified by both theoretical results and a simulation study.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1360332190_Fri, 08 Feb 2013 09:03 EST</guid><pubDate>Fri, 08 Feb 2013 09:03 EST</pubDate></item><item><title>Asymptotic optimality and efficient computation of the leave-subject-out cross-validation</title><link>http://projecteuclid.org/euclid.aos/1360332191</link><description>&lt;strong&gt;Ganggang Xu&lt;/strong&gt;, &lt;strong&gt;Jianhua Z. Huang&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 3003--3030.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Although the leave-subject-out cross-validation (CV) has been widely used in practice for tuning parameter selection for various nonparametric and semiparametric models of longitudinal data, its theoretical property is unknown and solving the associated optimization problem is computationally expensive, especially when there are multiple tuning parameters. In this paper, by focusing on the penalized spline method, we show that the leave-subject-out CV is optimal in the sense that it is asymptotically equivalent to the empirical squared error loss function minimization. An efficient Newton-type algorithm is developed to compute the penalty parameters that optimize the CV criterion. Simulated and real data are used to demonstrate the effectiveness of the leave-subject-out CV in selecting both the penalty parameters and the working correlation matrix.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1360332191_Fri, 08 Feb 2013 09:03 EST</guid><pubDate>Fri, 08 Feb 2013 09:03 EST</pubDate></item><item><title>The transfer principle: A tool for complete case analysis</title><link>http://projecteuclid.org/euclid.aos/1360332192</link><description>&lt;strong&gt;Hira L. Koul&lt;/strong&gt;, &lt;strong&gt;Ursula U. Müller&lt;/strong&gt;, &lt;strong&gt;Anton Schick&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 3031--3049.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
This paper gives a general method for deriving limiting distributions of complete case statistics for missing data models from corresponding results for the model where all data are observed. This provides a convenient tool for obtaining the asymptotic behavior of complete case versions of established full data methods without lengthy proofs.
 
 
The methodology is illustrated by analyzing three inference procedures for partially linear regression models with responses missing at random. We first show that complete case versions of asymptotically efficient estimators of the slope parameter for the full model are efficient, thereby solving the problem of constructing efficient estimators of the slope parameter for this model. Second, we derive an asymptotically distribution free test for fitting a normal distribution to the errors. Finally, we obtain an asymptotically distribution free test for linearity, that is, for testing that the nonparametric component of these models is a constant. This test is new both when data are fully observed and when data are missing at random.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1360332192_Fri, 08 Feb 2013 09:03 EST</guid><pubDate>Fri, 08 Feb 2013 09:03 EST</pubDate></item><item><title>Variable transformation to obtain geometric ergodicity in the random-walk Metropolis algorithm</title><link>http://projecteuclid.org/euclid.aos/1361542074</link><description>&lt;strong&gt;Leif T. Johnson&lt;/strong&gt;, &lt;strong&gt;Charles J. Geyer&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 3050--3076.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
A random-walk Metropolis sampler is geometrically ergodic if its equilibrium density is super-exponentially light and satisfies a curvature condition [Stochastic Process. Appl. 85 (2000) 341–361]. Many applications, including Bayesian analysis with conjugate priors of logistic and Poisson regression and of log-linear models for categorical data, result in posterior distributions that are not super-exponentially light. We show how to apply the change-of-variable formula for diffeomorphisms to obtain new densities that do satisfy the conditions for geometric ergodicity. Sampling the new variable and mapping the results back to the old gives a geometrically ergodic sampler for the original variable. This method of obtaining geometric ergodicity has very wide applicability.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1361542074_Fri, 22 Feb 2013 09:08 EST</guid><pubDate>Fri, 22 Feb 2013 09:08 EST</pubDate></item><item><title>Accuracy guaranties for $\ell_{1}$ recovery of block-sparse signals</title><link>http://projecteuclid.org/euclid.aos/1361542075</link><description>&lt;strong&gt;Anatoli Juditsky&lt;/strong&gt;, &lt;strong&gt;Fatma Kılınç Karzan&lt;/strong&gt;, &lt;strong&gt;Arkadi Nemirovski&lt;/strong&gt;, &lt;strong&gt;Boris Polyak&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 3077--3107.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We introduce a general framework to handle structured models (sparse and block-sparse with possibly overlapping blocks). We discuss new methods for their recovery from incomplete observation, corrupted with deterministic and stochastic noise, using block-$\ell_{1}$ regularization. While the current theory provides promising bounds for the recovery errors under a number of different, yet mostly hard to verify conditions, our emphasis is on verifiable conditions on the problem parameters (sensing matrix and the block structure) which guarantee accurate recovery. Verifiability of our conditions not only leads to efficiently computable bounds for the recovery error but also allows us to optimize these error bounds with respect to the method parameters, and therefore construct estimators with improved statistical properties. To justify our approach, we also provide an oracle inequality, which links the properties of the proposed recovery algorithms and the best estimation performance. Furthermore, utilizing these verifiable conditions, we develop a computationally cheap alternative to block-$\ell_{1}$ minimization, the non-Euclidean Block Matching Pursuit algorithm. We close by presenting a numerical study to investigate the effect of different block regularizations and demonstrate the performance of the proposed recoveries.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1361542075_Fri, 22 Feb 2013 09:08 EST</guid><pubDate>Fri, 22 Feb 2013 09:08 EST</pubDate></item><item><title>Estimation in functional linear quantile regression</title><link>http://projecteuclid.org/euclid.aos/1361542076</link><description>&lt;strong&gt;Kengo Kato&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 3108--3136.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
This paper studies estimation in functional linear quantile regression in which the dependent variable is scalar while the covariate is a function, and the conditional quantile for each fixed quantile index is modeled as a linear functional of the covariate. Here we suppose that covariates are discretely observed and sampling points may differ across subjects, where the number of measurements per subject increases with the sample size. Also, we allow the quantile index to vary over a given subset of the open unit interval, so the slope function is a function of two variables: (typically) time and quantile index. Likewise, the conditional quantile function is a function of the quantile index and the covariate. We consider an estimator for the slope function based on the principal component basis. An estimator for the conditional quantile function is obtained by a plug-in method. Since the so-constructed plug-in estimator does not necessarily satisfy the monotonicity constraint with respect to the quantile index, we also consider a class of monotonized estimators for the conditional quantile function. We establish rates of convergence for these estimators under suitable norms, showing that these rates are optimal in a minimax sense under some smoothness assumptions on the covariance kernel of the covariate and the slope function. Empirical choice of the cutoff level is studied by using simulations.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1361542076_Fri, 22 Feb 2013 09:08 EST</guid><pubDate>Fri, 22 Feb 2013 09:08 EST</pubDate></item><item><title>Improved multivariate normal mean estimation with unknown covariance when $p$ is greater than $n$</title><link>http://projecteuclid.org/euclid.aos/1361542077</link><description>&lt;strong&gt;Didier Chételat&lt;/strong&gt;, &lt;strong&gt;Martin T. Wells&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 3137--3160.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We consider the problem of estimating the mean vector of a $p$-variate normal $(\theta,\Sigma)$ distribution under invariant quadratic loss, $(\delta-\theta)'\Sigma^{-1}(\delta-\theta)$, when the covariance is unknown. We propose a new class of estimators that dominate the usual estimator $\delta^{0}(X)=X$. The proposed estimators of $\theta$ depend upon $X$ and an independent Wishart matrix $S$ with $n$ degrees of freedom; however, $S$ is singular almost surely when $p&amp;gt;n$. The proof of domination involves the development of some new unbiased estimators of risk for the $p&amp;gt;n$ setting. We also find some relationships between the amount of domination and the magnitudes of $n$ and $p$.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1361542077_Fri, 22 Feb 2013 09:08 EST</guid><pubDate>Fri, 22 Feb 2013 09:08 EST</pubDate></item><item><title>A code arithmetic approach for quaternary code designs and its application to (1/64)th-fractions</title><link>http://projecteuclid.org/euclid.aos/1361542078</link><description>&lt;strong&gt;Frederick K. H. Phoa&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 40, Number 6, 3161--3175.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
The study of good nonregular fractional factorial designs has received significant attention over the last two decades. Recent research indicates that designs constructed from quaternary codes (QC) are very promising in this regard. The present paper aims at exploring the fundamental structure and developing a theory to characterize the wordlengths and aliasing indexes for a general $(1/4)^{p}$th-fraction QC design. Then the theory is applied to $(1/64)$th-fraction QC designs. Examples are given, indicating that there exist some QC designs that have better design properties, and are thus more cost-efficient, than the regular fractional factorial designs of the same size. In addition, a result about the periodic structure of $(1/64)$th-fraction QC designs regarding resolution is stated.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1361542078_Fri, 22 Feb 2013 09:08 EST</guid><pubDate>Fri, 22 Feb 2013 09:08 EST</pubDate></item><item><title>The linear stochastic order and directed inference for multivariate ordered distributions</title><link>http://projecteuclid.org/euclid.aos/1362493038</link><description>&lt;strong&gt;Ori Davidov&lt;/strong&gt;, &lt;strong&gt;Shyamal Peddada&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 1--40.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Researchers are often interested in drawing inferences regarding the order between two experimental groups on the basis of multivariate response data. Since standard multivariate methods are designed for two-sided alternatives, they may not be ideal for testing for order between two groups. In this article we introduce the notion of the linear stochastic order and investigate its properties. Statistical theory and methodology are developed both to estimate the direction which best separates two arbitrary ordered distributions and to test for order between the two groups. The new methodology generalizes Roy’s classical largest root test to the nonparametric setting and is applicable to random vectors with discrete and/or continuous components. The proposed methodology is illustrated using data obtained from a 90-day pre-chronic rodent cancer bioassay study conducted by the National Toxicology Program (NTP).
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1362493038_Tue, 05 Mar 2013 09:18 EST</guid><pubDate>Tue, 05 Mar 2013 09:18 EST</pubDate></item><item><title>Spatially-adaptive sensing in nonparametric regression</title><link>http://projecteuclid.org/euclid.aos/1362493039</link><description>&lt;strong&gt;Adam D. Bull&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 41--62.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
While adaptive sensing has provided improved rates of convergence in sparse regression and classification, results in nonparametric regression have so far been restricted to quite specific classes of functions. In this paper, we describe an adaptive-sensing algorithm which is applicable to general nonparametric-regression problems. The algorithm is spatially adaptive, and achieves improved rates of convergence over spatially inhomogeneous functions. Over standard function classes, it likewise retains the spatial adaptivity properties of a uniform design.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1362493039_Tue, 05 Mar 2013 09:18 EST</guid><pubDate>Tue, 05 Mar 2013 09:18 EST</pubDate></item><item><title>Universally optimal crossover designs under subject dropout</title><link>http://projecteuclid.org/euclid.aos/1362493040</link><description>&lt;strong&gt;Wei Zheng&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 63--90.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Subject dropout is very common in practical applications of crossover designs. However, there is very limited design literature taking this into account. Optimality results have not yet been well established due to the complexity of the problem. This paper establishes feasible, as well as necessary and sufficient, conditions for a crossover design to be universally optimal in approximate design theory in the presence of subject dropout. These conditions are essentially linear equations with respect to proportions of all possible treatment sequences being applied to subjects, and hence they can be easily solved. A general algorithm is proposed to derive exact designs which are shown to be efficient and robust.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1362493040_Tue, 05 Mar 2013 09:18 EST</guid><pubDate>Tue, 05 Mar 2013 09:18 EST</pubDate></item><item><title>Convergence rate of Markov chain methods for genomic motif discovery</title><link>http://projecteuclid.org/euclid.aos/1362493041</link><description>&lt;strong&gt;Dawn B. Woodard&lt;/strong&gt;, &lt;strong&gt;Jeffrey S. Rosenthal&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 91--124.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We analyze the convergence rate of a simplified version of a popular Gibbs sampling method used for statistical discovery of gene regulatory binding motifs in DNA sequences. This sampler satisfies a very strong form of ergodicity (uniform). However, we show that, due to multimodality of the posterior distribution, the rate of convergence often decreases exponentially as a function of the length of the DNA sequence. Specifically, we show that this occurs whenever there is more than one true repeating pattern in the data. In practice there are typically multiple such patterns in biological data, the goal being to detect the most well-conserved and frequently-occurring of these. Our findings match empirical results, in which the motif-discovery Gibbs sampler has exhibited such poor convergence that it is used only for finding modes of the posterior distribution (candidate motifs) rather than for obtaining samples from that distribution. Ours are some of the first meaningful bounds on the convergence rate of a Markov chain method for sampling from a multimodal posterior distribution, as a function of statistical quantities like the number of observations.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1362493041_Tue, 05 Mar 2013 09:18 EST</guid><pubDate>Tue, 05 Mar 2013 09:18 EST</pubDate></item><item><title>An algorithm to compute the power of Monte Carlo tests with guaranteed precision</title><link>http://projecteuclid.org/euclid.aos/1362493042</link><description>&lt;strong&gt;Axel Gandy&lt;/strong&gt;, &lt;strong&gt;Patrick Rubin-Delanchy&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 125--142.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
This article presents an algorithm that generates a conservative confidence interval of a specified length and coverage probability for the power of a Monte Carlo test (such as a bootstrap or permutation test). It is the first method that achieves this aim for almost any Monte Carlo test. Previous research has focused on obtaining as accurate a result as possible for a fixed computational effort, without providing a guaranteed precision in the above sense. The algorithm we propose does not have a fixed effort and runs until a confidence interval with a user-specified length and coverage probability can be constructed. We show that the expected effort required by the algorithm is finite in most cases of practical interest, including situations where the distribution of the $p$-value is absolutely continuous or discrete with finite support. The algorithm is implemented in the R-package simctest, available on CRAN.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1362493042_Tue, 05 Mar 2013 09:18 EST</guid><pubDate>Tue, 05 Mar 2013 09:18 EST</pubDate></item><item><title>Optimal design for linear models with correlated observations</title><link>http://projecteuclid.org/euclid.aos/1362493043</link><description>&lt;strong&gt;Holger Dette&lt;/strong&gt;, &lt;strong&gt;Andrey Pepelyshev&lt;/strong&gt;, &lt;strong&gt;Anatoly Zhigljavsky&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 143--176.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
In the common linear regression model the problem of determining optimal designs for least squares estimation is considered in the case where the observations are correlated. A necessary condition for the optimality of a given design is provided, which extends the classical equivalence theory for optimal designs in models with uncorrelated errors to the case of dependent data. If the regression functions are eigenfunctions of an integral operator defined by the covariance kernel, it is shown that the corresponding measure defines a universally optimal design. For several models universally optimal designs can be identified explicitly. In particular, it is proved that the uniform distribution is universally optimal for a class of trigonometric regression models with a broad class of covariance kernels and that the arcsine distribution is universally optimal for the polynomial regression model with correlation structure defined by the logarithmic potential. To the best of the authors’ knowledge, these findings provide the first explicit results on optimal designs for regression models with correlated observations that are not restricted to the location scale model.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1362493043_Tue, 05 Mar 2013 09:18 EST</guid><pubDate>Tue, 05 Mar 2013 09:18 EST</pubDate></item><item><title>The subset argument and consistency of MLE in GLMM: Answer to an open problem and beyond</title><link>http://projecteuclid.org/euclid.aos/1362493044</link><description>&lt;strong&gt;Jiming Jiang&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 177--195.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We give an answer to an open problem regarding consistency of the maximum likelihood estimators (MLEs) in generalized linear mixed models (GLMMs) involving crossed random effects. The solution to the open problem introduces an interesting, nonstandard approach to proving consistency of the MLEs in cases of dependent observations. Using the new technique, we extend the results to MLEs under a general GLMM. An example is used to further illustrate the technique.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1362493044_Tue, 05 Mar 2013 09:18 EST</guid><pubDate>Tue, 05 Mar 2013 09:18 EST</pubDate></item><item><title>On the definition of a confounder</title><link>http://projecteuclid.org/euclid.aos/1364302740</link><description>&lt;strong&gt;Tyler J. VanderWeele&lt;/strong&gt;, &lt;strong&gt;Ilya Shpitser&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 196--220.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
The causal inference literature has provided a clear formal definition of confounding expressed in terms of counterfactual independence. The literature has not, however, come to any consensus on a formal definition of a confounder, as it has given priority to the concept of confounding over that of a confounder. We consider a number of candidate definitions arising from various more informal statements made in the literature. We consider the properties satisfied by each candidate definition, principally focusing on (i) whether under the candidate definition control for all “confounders” suffices to control for “confounding” and (ii) whether each confounder in some context helps eliminate or reduce confounding bias. Several of the candidate definitions do not have these two properties. Only one candidate definition of those considered satisfies both properties. We propose that a “confounder” be defined as a pre-exposure covariate $C$ for which there exists a set of other covariates $X$ such that the effect of the exposure on the outcome is unconfounded conditional on $(X,C)$ but such that for no proper subset of $(X,C)$ is the effect of the exposure on the outcome unconfounded given the subset. We also provide a conditional analogue of the above definition, and we propose that a variable that helps reduce but not eliminate bias be referred to as a “surrogate confounder.” These definitions are closely related to those given by Robins and Morgenstern [Comput. Math. Appl. 14 (1987) 869–916]. The implications that hold among the various candidate definitions are discussed.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1364302740_Tue, 26 Mar 2013 09:00 EDT</guid><pubDate>Tue, 26 Mar 2013 09:00 EDT</pubDate></item><item><title>A general theory for nonlinear sufficient dimension reduction: Formulation and estimation</title><link>http://projecteuclid.org/euclid.aos/1364302741</link><description>&lt;strong&gt;Kuang-Yao Lee&lt;/strong&gt;, &lt;strong&gt;Bing Li&lt;/strong&gt;, &lt;strong&gt;Francesca Chiaromonte&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 221--249.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
In this paper we introduce a general theory for nonlinear sufficient dimension reduction, and explore its ramifications and scope. This theory subsumes recent work employing reproducing kernel Hilbert spaces, and reveals many parallels between linear and nonlinear sufficient dimension reduction. Using these parallels we analyze the properties of existing methods and develop new ones. We begin by characterizing dimension reduction at the general level of $\sigma$-fields and proceed to that of classes of functions, leading to the notions of sufficient, complete and central dimension reduction classes. We show that, when it exists, the complete and sufficient class coincides with the central class, and can be unbiasedly and exhaustively estimated by a generalized sliced inverse regression estimator (GSIR). When completeness does not hold, this estimator captures only part of the central class. However, in these cases we show that a generalized sliced average variance estimator (GSAVE) can capture a larger portion of the class. Both estimators require no numerical optimization because they can be computed by spectral decomposition of linear operators. Finally, we compare our estimators with existing methods by simulation and on actual data sets.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1364302741_Tue, 26 Mar 2013 09:00 EDT</guid><pubDate>Tue, 26 Mar 2013 09:00 EDT</pubDate></item><item><title>Efficient estimation in sufficient dimension reduction</title><link>http://projecteuclid.org/euclid.aos/1364302742</link><description>&lt;strong&gt;Yanyuan Ma&lt;/strong&gt;, &lt;strong&gt;Liping Zhu&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 250--268.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We develop an efficient estimation procedure for identifying and estimating the central subspace. Using a new way of parameterization, we convert the problem of identifying the central subspace to the problem of estimating a finite dimensional parameter in a semiparametric model. This conversion allows us to derive an efficient estimator which reaches the optimal semiparametric efficiency bound. The resulting efficient estimator can exhaustively estimate the central subspace without imposing any distributional assumptions. Our proposed efficient estimation also makes it possible to carry out inference about the parameters that uniquely identify the central subspace. We conduct simulation studies and a real data analysis to demonstrate the finite sample performance in comparison with several existing methods.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1364302742_Tue, 26 Mar 2013 09:00 EDT</guid><pubDate>Tue, 26 Mar 2013 09:00 EDT</pubDate></item><item><title>Weighted likelihood estimation under two-phase sampling</title><link>http://projecteuclid.org/euclid.aos/1364302743</link><description>&lt;strong&gt;Takumi Saegusa&lt;/strong&gt;, &lt;strong&gt;Jon A. Wellner&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 269--295.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We develop asymptotic theory for weighted likelihood estimators (WLE) under two-phase stratified sampling without replacement. We also consider several variants of WLEs involving estimated weights and calibration. A set of empirical process tools is developed, including a Glivenko–Cantelli theorem, a theorem for rates of convergence of $M$-estimators, and a Donsker theorem for the inverse probability weighted empirical processes under two-phase sampling and sampling without replacement at the second phase. Using these general results, we derive asymptotic distributions of the WLE of a finite-dimensional parameter in a general semiparametric model where an estimator of a nuisance parameter is estimable either at regular or nonregular rates. We illustrate these results and methods in the Cox model with right censoring and interval censoring. We compare the methods via their asymptotic variances under both sampling without replacement and the more usual (and easier to analyze) assumption of Bernoulli sampling at the second phase.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1364302743_Tue, 26 Mar 2013 09:00 EDT</guid><pubDate>Tue, 26 Mar 2013 09:00 EDT</pubDate></item><item><title>A Cramér moderate deviation theorem for Hotelling’s $T^{2}$-statistic with applications to global tests</title><link>http://projecteuclid.org/euclid.aos/1364302744</link><description>&lt;strong&gt;Weidong Liu&lt;/strong&gt;, &lt;strong&gt;Qi-Man Shao&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 296--322.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
A Cramér moderate deviation theorem for Hotelling’s $T^{2}$-statistic is proved under a finite $(3+\delta)$th moment. The result is applied to large scale tests on the equality of mean vectors, and it is shown that the number of tests can be as large as $e^{o(n^{1/3})}$ before the chi-squared distribution calibration becomes inaccurate. As an application of the moderate deviation results, a global test on the equality of $m$ mean vectors based on the maximum of Hotelling’s $T^{2}$-statistics is developed and its asymptotic null distribution is shown to be an extreme value type I distribution. A novel intermediate approximation to the null distribution is proposed to improve the slow convergence rate of the extreme distribution approximation. Numerical studies show that the new test procedure works well even for a small sample size and performs favorably in analyzing a breast cancer dataset.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1364302744_Tue, 26 Mar 2013 09:00 EDT</guid><pubDate>Tue, 26 Mar 2013 09:00 EDT</pubDate></item><item><title>Fiducial theory and optimal inference</title><link>http://projecteuclid.org/euclid.aos/1364302745</link><description>&lt;strong&gt;Gunnar Taraldsen&lt;/strong&gt;, &lt;strong&gt;Bo Henry Lindqvist&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 323--341.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
It is shown that the fiducial distribution in a group model, or more generally a quasigroup model, determines the optimal equivariant frequentist inference procedures. The proof does not rely on existence of invariant measures, and generalizes results corresponding to the choice of the right Haar measure as a Bayesian prior. Classical and more recent examples show that fiducial arguments can be used to give good candidates for exact or approximate confidence distributions. It is here suggested that the fiducial algorithm can be considered as an alternative to the Bayesian algorithm for the construction of good frequentist inference procedures more generally.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1364302745_Tue, 26 Mar 2013 09:00 EDT</guid><pubDate>Tue, 26 Mar 2013 09:00 EDT</pubDate></item><item><title>Quantile-adaptive model-free variable screening for high-dimensional heterogeneous data</title><link>http://projecteuclid.org/euclid.aos/1364302746</link><description>&lt;strong&gt;Xuming He&lt;/strong&gt;, &lt;strong&gt;Lan Wang&lt;/strong&gt;, &lt;strong&gt;Hyokyoung Grace Hong&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 342--369.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We introduce a quantile-adaptive framework for nonlinear variable screening with high-dimensional heterogeneous data. This framework has two distinctive features: (1) it allows the set of active variables to vary across quantiles, thus making it more flexible to accommodate heterogeneity; (2) it is model-free and avoids the difficult task of specifying the form of a statistical model in a high dimensional space. Our nonlinear independence screening procedure employs spline approximations to model the marginal effects at a quantile level of interest. Under appropriate conditions on the quantile functions without requiring the existence of any moments, the new procedure is shown to enjoy the sure screening property in ultra-high dimensions. Furthermore, the quantile-adaptive framework can naturally handle censored data arising in survival analysis. We prove that the sure screening property remains valid when the response variable is subject to random right censoring. Numerical studies confirm the fine performance of the proposed method for various semiparametric models and its effectiveness in extracting quantile-specific information from heteroscedastic data.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1364302746_Tue, 26 Mar 2013 09:00 EDT</guid><pubDate>Tue, 26 Mar 2013 09:00 EDT</pubDate></item><item><title>Convergence of latent mixing measures in finite and infinite mixture models</title><link>http://projecteuclid.org/euclid.aos/1364302747</link><description>&lt;strong&gt;XuanLong Nguyen&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 1, 370--400.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
This paper studies convergence behavior of latent mixing measures that arise in finite and infinite mixture models, using transportation distances (i.e., Wasserstein metrics). The relationship between Wasserstein distances on the space of mixing measures and $f$-divergence functionals such as Hellinger and Kullback–Leibler distances on the space of mixture distributions is investigated in detail using various identifiability conditions. Convergence in Wasserstein metrics for discrete measures implies convergence of individual atoms that provide support for the measures, thereby providing a natural interpretation of convergence of clusters in clustering applications where mixture models are typically employed. Convergence rates of posterior distributions for latent mixing measures are established, for both finite mixtures of multivariate distributions and infinite mixtures based on the Dirichlet process.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1364302747_Tue, 26 Mar 2013 09:00 EDT</guid><pubDate>Tue, 26 Mar 2013 09:00 EDT</pubDate></item><item><title>Learning loopy graphical models with latent variables: Efficient methods and guarantees</title><link>http://projecteuclid.org/euclid.aos/1366138196</link><description>&lt;strong&gt;Animashree Anandkumar&lt;/strong&gt;, &lt;strong&gt;Ragupathyraj Valluvan&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 401--435.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
The problem of structure estimation in graphical models with latent variables is considered. We characterize conditions for tractable graph estimation and develop efficient methods with provable guarantees. We consider models where the underlying Markov graph is locally tree-like, and the model is in the regime of correlation decay. For the special case of the Ising model, the number of samples $n$ required for structural consistency of our method scales as $n=\Omega(\theta_{\min}^{-\delta\eta(\eta+1)-2}\log p)$, where $p$ is the number of variables, $\theta_{\min}$ is the minimum edge potential, $\delta$ is the depth (i.e., distance from a hidden node to the nearest observed nodes), and $\eta$ is a parameter which depends on the bounds on node and edge potentials in the Ising model. Necessary conditions for structural consistency under any algorithm are derived and our method nearly matches the lower bound on sample requirements. Further, the proposed method is practical to implement and provides flexibility to control the number of latent variables and the cycle lengths in the output graph.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366138196_Tue, 16 Apr 2013 14:50 EDT</guid><pubDate>Tue, 16 Apr 2013 14:50 EDT</pubDate></item><item><title>Geometry of the faithfulness assumption in causal inference</title><link>http://projecteuclid.org/euclid.aos/1366138197</link><description>&lt;strong&gt;Caroline Uhler&lt;/strong&gt;, &lt;strong&gt;Garvesh Raskutti&lt;/strong&gt;, &lt;strong&gt;Peter Bühlmann&lt;/strong&gt;, &lt;strong&gt;Bin Yu&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 436--463.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Many algorithms for inferring causality rely heavily on the faithfulness assumption. The main justification for imposing this assumption is that the set of unfaithful distributions has Lebesgue measure zero, since it can be seen as a collection of hypersurfaces in a hypercube. However, due to sampling error the faithfulness condition alone is not sufficient for statistical estimation, and strong-faithfulness has been proposed and assumed to achieve uniform or high-dimensional consistency. In contrast to the plain faithfulness assumption, the set of distributions that is not strong-faithful has nonzero Lebesgue measure and in fact, can be surprisingly large as we show in this paper. We study the strong-faithfulness condition from a geometric and combinatorial point of view and give upper and lower bounds on the Lebesgue measure of strong-faithful distributions for various classes of directed acyclic graphs. Our results imply fundamental limitations for the PC-algorithm and potentially also for other algorithms based on partial correlation testing in the Gaussian case.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366138197_Tue, 16 Apr 2013 14:50 EDT</guid><pubDate>Tue, 16 Apr 2013 14:50 EDT</pubDate></item><item><title>On the conditional distributions of low-dimensional projections from high-dimensional data</title><link>http://projecteuclid.org/euclid.aos/1366138198</link><description>&lt;strong&gt;Hannes Leeb&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 464--483.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We study the conditional distribution of low-dimensional projections from high-dimensional data, where the conditioning is on other low-dimensional projections. To fix ideas, consider a random $d$-vector $Z$ that has a Lebesgue density and that is standardized so that $\mathbb{E} Z=0$ and $\mathbb{E} ZZ'=I_{d}$. Moreover, consider two projections defined by unit-vectors $\alpha$ and $\beta$, namely a response $y=\alpha'Z$ and an explanatory variable $x=\beta'Z$. It has long been known that the conditional mean of $y$ given $x$ is approximately linear in $x$, under some regularity conditions; cf. Hall and Li [Ann. Statist. 21 (1993) 867–889]. However, a corresponding result for the conditional variance has not been available so far. We here show that the conditional variance of $y$ given $x$ is approximately constant in $x$ (again, under some regularity conditions). These results hold uniformly in $\alpha$ and for most $\beta$’s, provided only that the dimension of $Z$ is large. In that sense, we see that most linear submodels of a high-dimensional overall model are approximately correct. Our findings provide new insights in a variety of modeling scenarios. We discuss several examples, including sliced inverse regression, sliced average variance estimation, generalized linear models under potential link violation, and sparse linear modeling.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366138198_Tue, 16 Apr 2013 14:50 EDT</guid><pubDate>Tue, 16 Apr 2013 14:50 EDT</pubDate></item><item><title>Exact and asymptotically robust permutation tests</title><link>http://projecteuclid.org/euclid.aos/1366138199</link><description>&lt;strong&gt;EunYi Chung&lt;/strong&gt;, &lt;strong&gt;Joseph P. Romano&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 484--507.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Given independent samples from $P$ and $Q$, two-sample permutation tests allow one to construct exact level tests when the null hypothesis is $P=Q$. On the other hand, when comparing or testing particular parameters $\theta$ of $P$ and $Q$, such as their means or medians, permutation tests need not be level $\alpha$, or even approximately level $\alpha$ in large samples. Under very weak assumptions for comparing estimators, we provide a general test procedure whereby the asymptotic validity of the permutation test holds while retaining the exact rejection probability $\alpha$ in finite samples when the underlying distributions are identical. The ideas are broadly applicable and special attention is given to the $k$-sample problem of comparing general parameters, whereby a permutation test is constructed which is exact level $\alpha$ under the hypothesis of identical distributions, but has asymptotic rejection probability $\alpha$ under the more general null hypothesis of equality of parameters. A Monte Carlo simulation study is performed as well. A quite general theory is possible based on a coupling construction, as well as a key contiguity argument for the multinomial and multivariate hypergeometric distributions.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366138199_Tue, 16 Apr 2013 14:50 EDT</guid><pubDate>Tue, 16 Apr 2013 14:50 EDT</pubDate></item><item><title>Consistency under sampling of exponential random graph models</title><link>http://projecteuclid.org/euclid.aos/1366980556</link><description>&lt;strong&gt;Cosma Rohilla Shalizi&lt;/strong&gt;, &lt;strong&gt;Alessandro Rinaldo&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 508--535.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
The growing availability of network data and of scientific interest in distributed systems has led to the rapid development of statistical models of network structure. Typically, however, these are models for the entire network, while the data consists only of a sampled sub-network. Parameters for the whole network, which is what is of interest, are estimated by applying the model to the sub-network. This assumes that the model is consistent under sampling, or, in terms of the theory of stochastic processes, that it defines a projective family. Focusing on the popular class of exponential random graph models (ERGMs), we show that this apparently trivial condition is in fact violated by many popular and scientifically appealing models, and that satisfying it drastically limits ERGM’s expressive power. These results are actually special cases of more general results about exponential families of dependent random variables, which we also prove. Using such results, we offer easily checked conditions for the consistency of maximum likelihood estimation in ERGMs, and discuss some possible constructive responses.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366980556_Fri, 26 Apr 2013 08:50 EDT</guid><pubDate>Fri, 26 Apr 2013 08:50 EDT</pubDate></item><item><title>$\ell_{0}$-penalized maximum likelihood for sparse directed acyclic graphs</title><link>http://projecteuclid.org/euclid.aos/1366980557</link><description>&lt;strong&gt;Sara van de Geer&lt;/strong&gt;, &lt;strong&gt;Peter Bühlmann&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 536--567.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We consider the problem of regularized maximum likelihood estimation for the structure and parameters of a high-dimensional, sparse directed acyclic graphical (DAG) model with Gaussian distribution, or equivalently, of a Gaussian structural equation model. We show that the $\ell_{0}$-penalized maximum likelihood estimator of a DAG has about the same number of edges as the minimal-edge I-MAP (a DAG with minimal number of edges representing the distribution), and that it converges in Frobenius norm. We allow the number of nodes $p$ to be much larger than sample size $n$ but assume a sparsity condition and that any representation of the true DAG has at least a fixed proportion of its nonzero edge weights above the noise level. Our results do not rely on the faithfulness assumption nor on the restrictive strong faithfulness condition which are required for methods based on conditional independence testing such as the PC-algorithm.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366980557_Fri, 26 Apr 2013 08:50 EDT</guid><pubDate>Fri, 26 Apr 2013 08:50 EDT</pubDate></item><item><title>Fourier analysis of stationary time series in function space</title><link>http://projecteuclid.org/euclid.aos/1366980558</link><description>&lt;strong&gt;Victor M. Panaretos&lt;/strong&gt;, &lt;strong&gt;Shahin Tavakoli&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 568--603.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We develop the basic building blocks of a frequency domain framework for drawing statistical inferences on the second-order structure of a stationary sequence of functional data. The key element in such a context is the spectral density operator, which generalises the notion of a spectral density matrix to the functional setting, and characterises the second-order dynamics of the process. Our main tool is the functional Discrete Fourier Transform (fDFT). We derive an asymptotic Gaussian representation of the fDFT, thus allowing the transformation of the original collection of dependent random functions into a collection of approximately independent complex-valued Gaussian random functions. Our results are then employed in order to construct estimators of the spectral density operator based on smoothed versions of the periodogram kernel, the functional generalisation of the periodogram matrix. The consistency and asymptotic law of these estimators are studied in detail. As immediate consequences, we obtain central limit theorems for the mean and the long-run covariance operator of a stationary functional time series. Our results do not depend on structural modelling assumptions, but only functional versions of classical cumulant mixing conditions, and are shown to be stable under discrete observation of the individual curves.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366980558_Fri, 26 Apr 2013 08:50 EDT</guid><pubDate>Fri, 26 Apr 2013 08:50 EDT</pubDate></item><item><title>Low rank estimation of smooth kernels on graphs</title><link>http://projecteuclid.org/euclid.aos/1366980559</link><description>&lt;strong&gt;Vladimir Koltchinskii&lt;/strong&gt;, &lt;strong&gt;Pedro Rangel&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 604--640.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Let $(V,A)$ be a weighted graph with a finite vertex set $V$, with a symmetric matrix of nonnegative weights $A$ and with Laplacian $\Delta$. Let $S_{\ast}: V\times V\mapsto{\mathbb{R}}$ be a symmetric kernel defined on the vertex set $V$. Consider $n$ i.i.d. observations $(X_{j},X_{j}',Y_{j})$, $j=1,\ldots,n$, where $X_{j}$, $X_{j}'$ are independent random vertices sampled from the uniform distribution in $V$ and $Y_{j}\in{\mathbb{R}}$ is a real valued response variable such that ${\mathbb{E}}(Y_{j}|X_{j},X_{j}')=S_{\ast}(X_{j},X_{j}')$, $j=1,\ldots,n$. The goal is to estimate the kernel $S_{\ast}$ based on the data $(X_{1},X_{1}',Y_{1}),\ldots,(X_{n},X_{n}',Y_{n})$ and under the assumption that $S_{\ast}$ is low rank and, at the same time, smooth on the graph (the smoothness being characterized by discrete Sobolev norms defined in terms of the graph Laplacian). We obtain several results for such problems including minimax lower bounds on the $L_{2}$-error and upper bounds for penalized least squares estimators both with nonconvex and with convex penalties.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366980559_Fri, 26 Apr 2013 08:50 EDT</guid><pubDate>Fri, 26 Apr 2013 08:50 EDT</pubDate></item><item><title>Moderate deviations for a nonparametric estimator of sample coverage</title><link>http://projecteuclid.org/euclid.aos/1366980560</link><description>&lt;strong&gt;Fuqing Gao&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 641--669.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
In this paper, we consider moderate deviations for Good’s coverage estimator. The moderate deviation principle and the self-normalized moderate deviation principle for Good’s coverage estimator are established. The results are also applied to the hypothesis testing problem and the confidence interval for the coverage.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366980560_Fri, 26 Apr 2013 08:50 EDT</guid><pubDate>Fri, 26 Apr 2013 08:50 EDT</pubDate></item><item><title>Sequential multi-sensor change-point detection</title><link>http://projecteuclid.org/euclid.aos/1366980561</link><description>&lt;strong&gt;Yao Xie&lt;/strong&gt;, &lt;strong&gt;David Siegmund&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 670--692.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We develop a mixture procedure to monitor parallel streams of data for a change-point that affects only a subset of them, without assuming a spatial structure relating the data streams to one another. Observations are assumed initially to be independent standard normal random variables. After a change-point the observations in a subset of the streams of data have nonzero mean values. The subset and the post-change means are unknown. The procedure we study uses stream specific generalized likelihood ratio statistics, which are combined to form an overall detection statistic in a mixture model that hypothesizes an assumed fraction $p_{0}$ of affected data streams. An analytic expression is obtained for the average run length (ARL) when there is no change and is shown by simulations to be very accurate. Similarly, an approximation for the expected detection delay (EDD) after a change-point is also obtained. Numerical examples are given to compare the suggested procedure to other procedures for unstructured problems and in one case where the problem is assumed to have a well-defined geometric structure. Finally we discuss sensitivity of the procedure to the assumed value of $p_{0}$ and suggest a generalization.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366980561_Fri, 26 Apr 2013 08:50 EDT</guid><pubDate>Fri, 26 Apr 2013 08:50 EDT</pubDate></item><item><title>The multi-armed bandit problem with covariates</title><link>http://projecteuclid.org/euclid.aos/1366980562</link><description>&lt;strong&gt;Vianney Perchet&lt;/strong&gt;, &lt;strong&gt;Philippe Rigollet&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 693--721.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
We consider a multi-armed bandit problem in a setting where each arm produces a noisy reward realization which depends on an observable random covariate. As opposed to the traditional static multi-armed bandit problem, this setting allows for dynamically changing rewards that better describe applications where side information is available. We adopt a nonparametric model where the expected rewards are smooth functions of the covariate and where the hardness of the problem is captured by a margin parameter. To maximize the expected cumulative reward, we introduce a policy called Adaptively Binned Successive Elimination (ABSE) that adaptively decomposes the global problem into suitably “localized” static bandit problems. This policy constructs an adaptive partition using a variant of the Successive Elimination (SE) policy. Our results include sharper regret bounds for the SE policy in a static bandit problem and minimax optimal regret bounds for the ABSE policy in the dynamic problem.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1366980562_Fri, 26 Apr 2013 08:50 EDT</guid><pubDate>Fri, 26 Apr 2013 08:50 EDT</pubDate></item><item><title>Adaptive confidence intervals for regression functions under shape constraints</title><link>http://projecteuclid.org/euclid.aos/1368018171</link><description>&lt;strong&gt;T. Tony Cai&lt;/strong&gt;, &lt;strong&gt;Mark G. Low&lt;/strong&gt;, &lt;strong&gt;Yin Xia&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 722--750.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Adaptive confidence intervals for regression functions are constructed under shape constraints of monotonicity and convexity. A natural benchmark is established for the minimum expected length of confidence intervals at a given function in terms of an analytic quantity, the local modulus of continuity. This bound depends not only on the function but also the assumed function class. These benchmarks show that the constructed confidence intervals have near minimum expected length for each individual function, while maintaining a given coverage probability for functions within the class. Such adaptivity is much stronger than adaptive minimaxity over a collection of large parameter spaces.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1368018171_Wed, 08 May 2013 09:03 EDT</guid><pubDate>Wed, 08 May 2013 09:03 EDT</pubDate></item><item><title>Density-sensitive semisupervised inference</title><link>http://projecteuclid.org/euclid.aos/1368018172</link><description>&lt;strong&gt;Martin Azizyan&lt;/strong&gt;, &lt;strong&gt;Aarti Singh&lt;/strong&gt;, &lt;strong&gt;Larry Wasserman&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 751--771.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Semisupervised methods are techniques for using labeled data $(X_{1},Y_{1}),\ldots,(X_{n},Y_{n})$ together with unlabeled data $X_{n+1},\ldots,X_{N}$ to make predictions. These methods invoke some assumptions that link the marginal distribution $P_{X}$ of $X$ to the regression function $f(x)$. For example, it is common to assume that $f$ is very smooth over high density regions of $P_{X}$. Many of the methods are ad-hoc and have been shown to work in specific examples but are lacking a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution $P_{X}$. Our model includes a parameter $\alpha$ that controls the strength of the semisupervised assumption. We then use the data to adapt to $\alpha$.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1368018172_Wed, 08 May 2013 09:03 EDT</guid><pubDate>Wed, 08 May 2013 09:03 EDT</pubDate></item><item><title>Sparse principal component analysis and iterative thresholding</title><link>http://projecteuclid.org/euclid.aos/1368018173</link><description>&lt;strong&gt;Zongming Ma&lt;/strong&gt;&lt;p&gt;&lt;strong&gt;Source: &lt;/strong&gt;Ann. Statist., Volume 41, Number 2, 772--801.&lt;/p&gt;&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;br/&gt; 
 
Principal component analysis (PCA) is a classical dimension reduction method which projects data onto the principal subspace spanned by the leading eigenvectors of the covariance matrix. However, it behaves poorly when the number of features $p$ is comparable to, or even much larger than, the sample size $n$. In this paper, we propose a new iterative thresholding approach for estimating principal subspaces in the setting where the leading eigenvectors are sparse. Under a spiked covariance model, we find that the new approach recovers the principal subspace and leading eigenvectors consistently, and even optimally, in a range of high-dimensional sparse settings. Simulated examples also demonstrate its competitive performance.
 
 &lt;/p&gt;</description><guid isPermaLink="false">projecteuclid.org/euclid.aos/1368018173_Wed, 08 May 2013 09:03 EDT</guid><pubDate>Wed, 08 May 2013 09:03 EDT</pubDate></item></channel>
</rss>
