Bernoulli Articles (Project Euclid)
http://projecteuclid.org/euclid.bj
The latest articles from Bernoulli on Project Euclid, a site for mathematics and statistics resources.
en-us
Copyright 2010 Cornell University Library
Euclid-L@cornell.edu (Project Euclid Team)
Thu, 05 Aug 2010 15:41 EDT
Tue, 05 Apr 2011 09:14 EDT
http://projecteuclid.org/collection/euclid/images/logo_linking_100.gif
Project Euclid
http://projecteuclid.org/
A new method for obtaining sharp compound Poisson approximation error estimates for sums of locally dependent random variables
http://projecteuclid.org/euclid.bj/1274821072
<strong>Michael V. Boutsikas</strong>, <strong>Eutichia Vaggelatou</strong><p><strong>Source: </strong>Bernoulli, Volume 16, Number 2, 301--330.</p><p><strong>Abstract:</strong><br/>
Let $X_{1},X_{2},\ldots,X_{n}$ be a sequence of independent or locally dependent random variables taking values in $\mathbb{Z}_{+}$. In this paper, we derive sharp bounds, via a new probabilistic method, for the total variation distance between the distribution of the sum $\sum_{i=1}^{n}X_{i}$ and an appropriate Poisson or compound Poisson distribution. These bounds include a factor which depends on the smoothness of the approximating Poisson or compound Poisson distribution. This “smoothness factor” is of order $O(\sigma^{-2})$, according to a heuristic argument, where $\sigma^{2}$ denotes the variance of the approximating distribution. In this way, we offer sharp error estimates for a large range of values of the parameters. Finally, specific examples concerning appearances of rare runs in sequences of Bernoulli trials are presented by way of illustration.
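The role of a variance-dependent factor can be illustrated in the simplest independent case, which is not the method of the paper above. The following minimal sketch assumes i.i.d. Bernoulli($p$) summands, so the sum is Binomial($n,p$) and the approximating law is Poisson($\lambda$) with $\lambda=np$ (whose variance is also $\lambda$). It computes the total variation distance exactly and compares it with two classical bounds: Le Cam's bound $np^{2}$ and the Barbour–Hall bound $\lambda^{-1}(1-e^{-\lambda})np^{2}$, whose leading factor is at most $\sigma^{-2}$. The function name `tv_binomial_poisson` and the parameter values are illustrative assumptions.

```python
import math


def tv_binomial_poisson(n: int, p: float) -> float:
    """Exact total variation distance between Binomial(n, p) and Poisson(n * p).

    Uses d_TV = (1/2) * sum_k |P(S = k) - P(N = k)|, with 0 < p < 1.
    """
    lam = n * p
    binom = (1.0 - p) ** n   # P(S = 0) for the binomial sum
    pois = math.exp(-lam)    # P(N = 0) for the Poisson variable
    abs_diff = abs(binom - pois)
    pois_mass = pois
    for k in range(1, n + 1):
        # Recursive updates of both probability mass functions.
        binom *= (n - k + 1) / k * p / (1.0 - p)
        pois *= lam / k
        abs_diff += abs(binom - pois)
        pois_mass += pois
    # Beyond k = n the binomial has no mass, so the Poisson tail counts fully.
    tail = max(0.0, 1.0 - pois_mass)
    return 0.5 * (abs_diff + tail)


n, p = 200, 0.02
lam = n * p
tv = tv_binomial_poisson(n, p)
le_cam = n * p * p                                        # no smoothness factor
barbour_hall = (1.0 - math.exp(-lam)) / lam * n * p * p   # factor <= 1 / lam
print(f"TV = {tv:.5f}, Barbour-Hall = {barbour_hall:.5f}, Le Cam = {le_cam:.5f}")
```

For fixed $p$, the bound without the factor grows with $n$, while the bound with the factor stays below $p$. This is the kind of gain a smoothness factor is meant to capture.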
</p>Thu, 05 Aug 2010 15:41 EDT
Entropy production in nonlinear recombination models
https://projecteuclid.org/euclid.bj/1524038754
<strong>Pietro Caputo</strong>, <strong>Alistair Sinclair</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3246--3282.</p><p><strong>Abstract:</strong><br/>
We study the convergence to equilibrium of a class of nonlinear recombination models. In analogy with Boltzmann’s H-theorem from kinetic theory, and in contrast with previous analysis of these models, convergence is measured in terms of relative entropy. The problem is formulated within a general framework that we refer to as Reversible Quadratic Systems. Our main result is a tight quantitative estimate for the entropy production functional. Along the way, we establish some new entropy inequalities generalizing Shearer’s and related inequalities.
</p>Wed, 18 Apr 2018 04:06 EDT
Bounded size biased couplings, log concave distributions and concentration of measure for occupancy models
https://projecteuclid.org/euclid.bj/1524038755
<strong>Jay Bartroff</strong>, <strong>Larry Goldstein</strong>, <strong>Ümit Işlak</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3283--3317.</p><p><strong>Abstract:</strong><br/>
Threshold-type counts based on multivariate occupancy models with log concave marginals admit bounded size biased couplings under weak conditions, leading to new concentration of measure results for random graphs, germ-grain models in stochastic geometry and multinomial allocation models. The results obtained compare favorably with classical methods, including the use of McDiarmid’s inequality, negative association, and self bounding functions.
</p>Wed, 18 Apr 2018 04:06 EDT
Parametric inference for nonsynchronously observed diffusion processes in the presence of market microstructure noise
https://projecteuclid.org/euclid.bj/1524038756
<strong>Teppei Ogihara</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3318--3383.</p><p><strong>Abstract:</strong><br/>
We study parametric inference for diffusion processes when observations occur nonsynchronously and are contaminated by market microstructure noise. We construct a quasi-likelihood function and study asymptotic mixed normality of maximum-likelihood- and Bayes-type estimators based on it. We also prove the local asymptotic normality of the model and asymptotic efficiency of our estimator when the diffusion coefficients are deterministic and noise follows a normal distribution. We conjecture that our estimator is asymptotically efficient even when the latent process is a general diffusion process. An estimator for the quadratic covariation of the latent process is also constructed. Some numerical examples show that this estimator outperforms existing estimators of the quadratic covariation.
</p>Wed, 18 Apr 2018 04:06 EDT
The Gamma Stein equation and noncentral de Jong theorems
https://projecteuclid.org/euclid.bj/1524038757
<strong>Christian Döbler</strong>, <strong>Giovanni Peccati</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3384--3421.</p><p><strong>Abstract:</strong><br/>
We study the Stein equation associated with the one-dimensional Gamma distribution, and provide novel bounds, allowing one to effectively deal with test functions supported by the whole real line. We apply our estimates to derive new quantitative results involving random variables that are non-linear functionals of random fields, namely: (i) a non-central quantitative de Jong theorem for sequences of degenerate $U$-statistics satisfying minimal uniform integrability conditions, significantly extending previous findings by de Jong (J. Multivariate Anal. 34 (1990) 275–289), Nourdin, Peccati and Reinert (Ann. Probab. 38 (2010) 1947–1985) and Döbler and Peccati (Electron. J. Probab. 22 (2017) no. 2), (ii) a new Gamma approximation bound on the Poisson space, refining previous estimates by Peccati and Thäle (ALEA Lat. Am. J. Probab. Math. Stat. 10 (2013) 525–560) and (iii) new Gamma bounds on a Gaussian space, strengthening estimates by Nourdin and Peccati (Probab. Theory Related Fields 145 (2009) 75–118). As a by-product of our analysis, we also deduce a new inequality for Gamma approximations via exchangeable pairs, that is of independent interest.
</p>Wed, 18 Apr 2018 04:06 EDT
Expected number and height distribution of critical points of smooth isotropic Gaussian random fields
https://projecteuclid.org/euclid.bj/1524038758
<strong>Dan Cheng</strong>, <strong>Armin Schwartzman</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3422--3446.</p><p><strong>Abstract:</strong><br/>
We obtain formulae for the expected number and height distribution of critical points of smooth isotropic Gaussian random fields parameterized on Euclidean space or spheres of arbitrary dimension. The results hold in general in the sense that there are no restrictions on the covariance function of the field except for smoothness and isotropy. The results are based on a characterization of the distribution of the Hessian of the Gaussian field by means of the family of Gaussian orthogonally invariant (GOI) matrices, of which the Gaussian orthogonal ensemble (GOE) is a special case. The obtained formulae depend on the covariance function only through a single parameter (Euclidean space) or two parameters (spheres), and include the special boundary case of random Laplacian eigenfunctions.
</p>Wed, 18 Apr 2018 04:06 EDT
A unified matrix model including both CCA and F matrices in multivariate analysis: The largest eigenvalue and its applications
https://projecteuclid.org/euclid.bj/1524038759
<strong>Xiao Han</strong>, <strong>Guangming Pan</strong>, <strong>Qing Yang</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3447--3468.</p><p><strong>Abstract:</strong><br/>
Let $\mathbf{Z}_{M_{1}\times N}=\mathbf{T}^{\frac{1}{2}}\mathbf{X}$ where $(\mathbf{T}^{\frac{1}{2}})^{2}=\mathbf{T}$ is a positive definite matrix and $\mathbf{X}$ consists of independent random variables with mean zero and variance one. This paper proposes a unified matrix model \[\mathbf{\Omega}=(\mathbf{Z}\mathbf{U}_{2}\mathbf{U}_{2}^{T}\mathbf{Z}^{T})^{-1}\mathbf{Z}\mathbf{U}_{1}\mathbf{U}_{1}^{T}\mathbf{Z}^{T},\] where $\mathbf{U}_{1}$ and $\mathbf{U}_{2}$ are isometric with dimensions $N\times N_{1}$ and $N\times(N-N_{2})$ respectively such that $\mathbf{U}_{1}^{T}\mathbf{U}_{1}=\mathbf{I}_{N_{1}}$, $\mathbf{U}_{2}^{T}\mathbf{U}_{2}=\mathbf{I}_{N-N_{2}}$ and $\mathbf{U}_{1}^{T}\mathbf{U}_{2}=0$. Moreover, $\mathbf{U}_{1}$ and $\mathbf{U}_{2}$ (random or non-random) are independent of $\mathbf{Z}_{M_{1}\times N}$ and with probability tending to one, $\operatorname{rank}(\mathbf{U}_{1})=N_{1}$ and $\operatorname{rank}(\mathbf{U}_{2})=N-N_{2}$. We establish the asymptotic Tracy–Widom distribution for its largest eigenvalue under moment assumptions on $\mathbf{X}$ when $N_{1},N_{2}$ and $M_{1}$ are comparable.
The asymptotic distributions of the maximum eigenvalues of the matrices used in Canonical Correlation Analysis (CCA) and of F matrices (including centered and non-centered versions) can both be obtained from that of $\mathbf{\Omega}$ by selecting appropriate matrices $\mathbf{U}_{1}$ and $\mathbf{U}_{2}$. Moreover, via appropriate matrices $\mathbf{U}_{1}$ and $\mathbf{U}_{2}$, this matrix $\mathbf{\Omega}$ can be applied to some multivariate testing problems that cannot be handled by either of these two types of matrices. To see this, we explore two more applications. One is in the MANOVA approach for testing the equivalence of several high-dimensional mean vectors, where $\mathbf{U}_{1}$ and $\mathbf{U}_{2}$ are chosen to be two nonrandom matrices. The other one is in the multivariate linear model for testing the unknown parameter matrix, where $\mathbf{U}_{1}$ and $\mathbf{U}_{2}$ are random. For each application, theoretical results are developed and various numerical studies are conducted to investigate the empirical performance.
</p>Wed, 18 Apr 2018 04:06 EDT
Statistical inference for the doubly stochastic self-exciting process
https://projecteuclid.org/euclid.bj/1524038760
<strong>Simon Clinet</strong>, <strong>Yoann Potiron</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3469--3493.</p><p><strong>Abstract:</strong><br/>
We introduce, and show the existence of, a Hawkes self-exciting point process with an exponentially decreasing kernel and time-varying parameters. The quantity of interest is defined as the integrated parameter $T^{-1}\int_{0}^{T}\theta_{t}^{*}\,dt$, where $\theta_{t}^{*}$ is the time-varying parameter, and we consider the high-frequency asymptotics. To estimate it naïvely, we chop the data into several blocks, compute the maximum likelihood estimator (MLE) on each block, and take the average of the local estimates. The bias of this naïve estimator explodes asymptotically, so we provide a non-naïve estimator, constructed like the naïve one but with a first-order bias reduction applied to each local MLE. We show the associated central limit theorem. Monte Carlo simulations show the importance of the bias correction and that the method performs well in finite samples, whereas the empirical study discusses the implementation in practice and documents the stochastic behavior of the parameters.
</p>Wed, 18 Apr 2018 04:06 EDT
Small deviations of a Galton–Watson process with immigration
https://projecteuclid.org/euclid.bj/1524038761
<strong>Nadia Sidorova</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3494--3521.</p><p><strong>Abstract:</strong><br/>
We consider a Galton–Watson process with immigration $(\mathcal{Z}_{n})$, with offspring probabilities $(p_{i})$ and immigration probabilities $(q_{i})$. In the case when $p_{0}=0$, $p_{1}\neq0$, $q_{0}=0$ (that is, when $\operatorname{essinf}(\mathcal{Z}_{n})$ grows linearly in $n$), we establish the asymptotics of the left tail $\mathbb{P}\{\mathcal{W}<\varepsilon\}$, as $\varepsilon\downarrow0$, of the martingale limit $\mathcal{W}$ of the process $(\mathcal{Z}_{n})$. Further, we consider the first generation $\mathcal{K}$ such that $\mathcal{Z}_{\mathcal{K}}>\operatorname{essinf}(\mathcal{Z}_{\mathcal{K}})$ and study the asymptotic behaviour of $\mathcal{K}$ conditionally on $\{\mathcal{W}<\varepsilon\}$, as $\varepsilon\downarrow 0$. We find the growth scale and the fluctuations of $\mathcal{K}$ and compare the results with those for standard Galton–Watson processes.
</p>Wed, 18 Apr 2018 04:06 EDT
Testing for simultaneous jumps in case of asynchronous observations
https://projecteuclid.org/euclid.bj/1524038762
<strong>Ole Martin</strong>, <strong>Mathias Vetter</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3522--3567.</p><p><strong>Abstract:</strong><br/>
This paper proposes a novel test for simultaneous jumps in a bivariate Itô semimartingale when observation times are asynchronous and irregular. Inference is built on a realized correlation coefficient for the squared jumps of the two processes which is estimated using bivariate power variations of Hayashi–Yoshida type without an additional synchronization step. An associated central limit theorem is shown whose asymptotic distribution is assessed using a bootstrap procedure. Simulations show that the test works remarkably well in comparison with the much simpler case of regular observations.
</p>Wed, 18 Apr 2018 04:06 EDT
Statistical estimation of the Oscillating Brownian Motion
https://projecteuclid.org/euclid.bj/1524038763
<strong>Antoine Lejay</strong>, <strong>Paolo Pigato</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3568--3602.</p><p><strong>Abstract:</strong><br/>
We study the asymptotic behavior of estimators of a two-valued, discontinuous diffusion coefficient in a Stochastic Differential Equation, called an Oscillating Brownian Motion. Using the relation of the latter process with the Skew Brownian Motion, we propose two natural consistent estimators, which are variants of the integrated volatility estimator and take the occupation times into account. We show the stable convergence of the renormalized estimation errors toward a Gaussian mixture, possibly corrected by a term that depends on the local time. These limits stem from the lack of ergodicity as well as the behavior of the local time at zero of the process. We test both estimators on simulated processes, finding complete agreement with the theoretical predictions.
</p>Wed, 18 Apr 2018 04:06 EDT
Correlated continuous time random walks and fractional Pearson diffusions
https://projecteuclid.org/euclid.bj/1524038764
<strong>N.N. Leonenko</strong>, <strong>I. Papić</strong>, <strong>A. Sikorskii</strong>, <strong>N. Šuvak</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3603--3627.</p><p><strong>Abstract:</strong><br/>
Continuous time random walks have random waiting times between particle jumps. We define the correlated continuous time random walks (CTRWs) that converge to fractional Pearson diffusions (fPDs). The jumps in these CTRWs are obtained from Markov chains through the Bernoulli urn-scheme model and Wright–Fisher model. The jumps are correlated so that the limiting processes are not Lévy but diffusion processes with non-independent increments. The waiting times are selected from the domain of attraction of a stable law.
</p>Wed, 18 Apr 2018 04:06 EDT
Detecting Markov random fields hidden in white noise
https://projecteuclid.org/euclid.bj/1524038765
<strong>Ery Arias-Castro</strong>, <strong>Sébastien Bubeck</strong>, <strong>Gábor Lugosi</strong>, <strong>Nicolas Verzelen</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3628--3656.</p><p><strong>Abstract:</strong><br/>
Motivated by change point problems in time series and the detection of textured objects in images, we consider the problem of detecting a piece of a Gaussian Markov random field hidden in white Gaussian noise. We derive minimax lower bounds and propose near-optimal tests.
</p>Wed, 18 Apr 2018 04:06 EDT
Large volatility matrix estimation with factor-based diffusion model for high-frequency financial data
https://projecteuclid.org/euclid.bj/1524038766
<strong>Donggyu Kim</strong>, <strong>Yi Liu</strong>, <strong>Yazhen Wang</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3657--3682.</p><p><strong>Abstract:</strong><br/>
Large volatility matrices are involved in many finance practices, and estimating large volatility matrices based on high-frequency financial data encounters the “curse of dimensionality”. It is a common approach to impose a sparsity assumption on the large volatility matrices to produce consistent volatility matrix estimators. However, due to the existence of common factors, assets are highly correlated with each other, and it is not reasonable to assume the volatility matrices are sparse in financial applications. This paper incorporates factor influence in the asset pricing model and investigates large volatility matrix estimation under the factor price model together with some sparsity assumption. We propose to model asset prices by assuming that asset prices are governed by common factors and that the assets with similar characteristics share the same association with the factors. We then impose some reasonable sparsity condition on the part of the volatility matrices after accounting for the factor contribution. Under the proposed factor-based model and sparsity assumption, we develop an estimation scheme called “blocking and regularizing”. Asymptotic properties of the proposed estimator are studied, and its finite sample performance is tested via extensive numerical studies to support theoretical results.
</p>Wed, 18 Apr 2018 04:06 EDT
Adaptive estimation of high-dimensional signal-to-noise ratios
https://projecteuclid.org/euclid.bj/1524038767
<strong>Nicolas Verzelen</strong>, <strong>Elisabeth Gassiat</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3683--3710.</p><p><strong>Abstract:</strong><br/>
We consider the equivalent problems of estimating the residual variance, the proportion of explained variance $\eta$ and the signal strength in a high-dimensional linear regression model with Gaussian random design. Our aim is to understand the impact of not knowing the sparsity of the vector of regression coefficients and not knowing the distribution of the design on minimax estimation rates of $\eta$. Depending on the sparsity $k$ of the vector of regression coefficients, optimal estimators of $\eta$ either rely on estimating the vector of regression coefficients or are based on $U$-type statistics. In the important situation where $k$ is unknown, we build an adaptive procedure whose convergence rate simultaneously achieves the minimax risk over all $k$ up to a logarithmic loss which we prove to be unavoidable. Finally, the knowledge of the design distribution is shown to play a critical role. When the distribution of the design is unknown, consistent estimation of explained variance is indeed possible in much narrower regimes than for known design distribution.
</p>Wed, 18 Apr 2018 04:06 EDT
Efficient strategy for the Markov chain Monte Carlo in high-dimension with heavy-tailed target probability distribution
https://projecteuclid.org/euclid.bj/1524038768
<strong>Kengo Kamatani</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3711--3750.</p><p><strong>Abstract:</strong><br/>
The purpose of this paper is to introduce a new Markov chain Monte Carlo method and to demonstrate its effectiveness by simulation and high-dimensional asymptotic theory. The key fact is that our algorithm has a reversible proposal kernel, which is designed to have a heavy-tailed invariant probability distribution. A high-dimensional asymptotic theory is studied for a class of heavy-tailed target probability distributions. As the dimension of the state space tends to infinity, we show that our algorithm has a much higher convergence rate than the pre-conditioned Crank–Nicolson (pCN) algorithm and the random-walk Metropolis algorithm.
</p>Wed, 18 Apr 2018 04:06 EDT
The class of multivariate max-id copulas with $\ell_{1}$-norm symmetric exponent measure
https://projecteuclid.org/euclid.bj/1524038769
<strong>Christian Genest</strong>, <strong>Johanna G. Nešlehová</strong>, <strong>Louis-Paul Rivest</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3751--3790.</p><p><strong>Abstract:</strong><br/>
Members of the well-known family of bivariate Galambos copulas can be expressed in a closed form in terms of the univariate Fréchet distribution. This formula extends to any dimension and can be used to define a whole new class of tractable multivariate copulas that are generated by suitable univariate distributions. This paper gives necessary and sufficient conditions on the underlying univariate distribution which ensure that the resulting copula exists. It is also shown that these new copulas are in fact dependence structures of certain max-id distributions with $\ell_{1}$-norm symmetric exponent measure. The basic dependence properties of this new class of multivariate exchangeable copulas are investigated, and an efficient algorithm is provided for generating observations from distributions in this class.
</p>Wed, 18 Apr 2018 04:06 EDT
Optimal estimation of a large-dimensional covariance matrix under Stein’s loss
https://projecteuclid.org/euclid.bj/1524038770
<strong>Olivier Ledoit</strong>, <strong>Michael Wolf</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3791--3832.</p><p><strong>Abstract:</strong><br/>
This paper introduces a new method for deriving covariance matrix estimators that are decision-theoretically optimal within a class of nonlinear shrinkage estimators. The key is to employ large-dimensional asymptotics: the matrix dimension and the sample size go to infinity together, with their ratio converging to a finite, nonzero limit. As the main focus, we apply this method to Stein’s loss. Compared to the estimator of Stein (Estimation of a covariance matrix (1975); J. Math. Sci. 34 (1986) 1373–1403), ours has five theoretical advantages: (1) it asymptotically minimizes the loss itself, instead of an estimator of the expected loss; (2) it does not necessitate post-processing via an ad hoc algorithm (called “isotonization”) to restore the positivity or the ordering of the covariance matrix eigenvalues; (3) it does not ignore any terms in the function to be minimized; (4) it does not require normality; and (5) it is not limited to applications where the sample size exceeds the dimension. In addition to these theoretical advantages, our estimator also improves upon Stein’s estimator in terms of finite-sample performance, as evidenced via extensive Monte Carlo simulations. To further demonstrate the effectiveness of our method, we show that some previously suggested estimators of the covariance matrix and its inverse are decision-theoretically optimal in the large-dimensional asymptotic limit with respect to the Frobenius loss function.
</p>Wed, 18 Apr 2018 04:06 EDT
Covariance estimation via sparse Kronecker structures
https://projecteuclid.org/euclid.bj/1524038771
<strong>Chenlei Leng</strong>, <strong>Guangming Pan</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3833--3863.</p><p><strong>Abstract:</strong><br/>
The problem of estimating covariance matrices is central to statistical analysis and is extensively addressed when data are vectors. This paper studies a novel Kronecker-structured approach for estimating such matrices when data are matrices and arrays. Focusing on matrix-variate data, we present simple approaches to estimate the row and the column correlation matrices, formulated separately via convex optimization. We also discuss simple thresholding estimators motivated by recent developments in the literature. Non-asymptotic results show that the proposed method greatly outperforms methods that ignore the matrix structure of the data. In particular, our framework allows the dimensionality of data to be of arbitrary order even for fixed sample size, and works for flexible distributions beyond normality. Simulations and data analysis further confirm the competitiveness of the method. An extension to general array data is also outlined.
</p>Wed, 18 Apr 2018 04:06 EDT
Robust dimension-free Gram operator estimates
https://projecteuclid.org/euclid.bj/1524038772
<strong>Ilaria Giulini</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3864--3923.</p><p><strong>Abstract:</strong><br/>
In this paper, we investigate the question of estimating the Gram operator by a robust estimator from an i.i.d. sample in a separable Hilbert space and we present uniform bounds that hold under weak moment assumptions. The approach consists of first obtaining non-asymptotic dimension-free bounds in finite-dimensional spaces using some PAC-Bayesian inequalities related to Gaussian perturbations of the parameter and then generalizing the results to a separable Hilbert space. We show both from a theoretical point of view and with the help of some simulations that such a robust estimator improves on the behavior of the classical empirical one in the case of heavy-tailed data distributions.
</p>Wed, 18 Apr 2018 04:06 EDT
Uniform dimension results for a family of Markov processes
https://projecteuclid.org/euclid.bj/1524038773
<strong>Xiaobin Sun</strong>, <strong>Yimin Xiao</strong>, <strong>Lihu Xu</strong>, <strong>Jianliang Zhai</strong>. <p><strong>Source: </strong>Bernoulli, Volume 24, Number 4B, 3924--3951.</p><p><strong>Abstract:</strong><br/>
In this paper, we prove uniform Hausdorff and packing dimension results for the images of a large family of Markov processes. The main tools are the two covering principles in Xiao (In Fractal Geometry and Applications: A Jubilee of Benoît Mandelbrot, Part 2 (2004) 261–338 Amer. Math. Soc.). As applications, uniform Hausdorff and packing dimension results for certain classes of Lévy processes, stable jump diffusions and non-symmetric stable-type processes are obtained.
</p>Wed, 18 Apr 2018 04:06 EDT
Adaptive risk bounds in unimodal regression
https://projecteuclid.org/euclid.bj/1544605234
<strong>Sabyasachi Chatterjee</strong>, <strong>John Lafferty</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 1--25.</p><p><strong>Abstract:</strong><br/>
We study the statistical properties of the least squares estimator in unimodal sequence estimation. Although closely related to isotonic regression, unimodal regression has not been as extensively studied. We show that the unimodal least squares estimator is adaptive in the sense that the risk scales as a function of the number of values in the true underlying sequence. Such adaptivity properties have been shown for isotonic regression by Chatterjee et al. (Ann. Statist. 43 (2015) 1774–1800) and Bellec (Sharp oracle inequalities for Least Squares estimators in shape restricted regression (2016)). A technical complication in unimodal regression is the non-convexity of the underlying parameter space. We develop a general variational representation of the risk that holds whenever the parameter space can be expressed as a finite union of convex sets, using techniques that may be of interest in other settings.
</p>Wed, 12 Dec 2018 04:01 EST
Leading the field: Fortune favors the bold in Thurstonian choice models
https://projecteuclid.org/euclid.bj/1544605237
<strong>Steven N. Evans</strong>, <strong>Ronald L. Rivest</strong>, <strong>Philip B. Stark</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 26--46.</p><p><strong>Abstract:</strong><br/>
Schools with the highest average student performance are often the smallest schools; localities with the highest rates of some cancers are frequently small; and the effects observed in clinical trials are likely to be largest for the smallest numbers of subjects. Informal explanations of this “small-schools phenomenon” point to the fact that the sample means of smaller samples have higher variances. But this cannot be a complete explanation: If we draw two samples from a diffuse distribution that is symmetric about some point, then the chance that the smaller sample has larger mean is 50%. A particular consequence of results proved below is that if one draws three or more samples of different sizes from the same normal distribution, then the sample mean of the smallest sample is most likely to be highest, the sample mean of the second smallest sample is second most likely to be highest, and so on; this is true even though for any pair of samples, each one of the pair is equally likely to have the larger sample mean. The same effect explains why heteroscedasticity can result in misleadingly small nominal $p$-values in nonparametric tests of association.
Our conclusions are relevant to certain stochastic choice models, including the following generalization of Thurstone’s Law of Comparative Judgment. There are $n$ items. Item $i$ is preferred to item $j$ if $Z_{i}<Z_{j}$, where $Z$ is a random $n$-vector of preference scores. Suppose $\mathbb{P}\{Z_{i}=Z_{j}\}=0$ for $i\ne j$, so there are no ties. Item $k$ is the favorite if $Z_{k}<\min_{i\ne k}Z_{i}$. Let $p_{i}$ denote the chance that item $i$ is the favorite. We characterize a large class of distributions for $Z$ for which $p_{1}>p_{2}>\cdots>p_{n}$. Our results are most surprising when $\mathbb{P}\{Z_{i}<Z_{j}\}=\mathbb{P}\{Z_{i}>Z_{j}\}=\frac{1}{2}$ for $i\ne j$, so neither of any two items is likely to be preferred over the other in a pairwise comparison. Then, under suitable assumptions, $p_{1}>p_{2}>\cdots>p_{n}$ when the variability of $Z_{i}$ decreases with $i$ in an appropriate sense. Our conclusions echo the proverb “Fortune favors the bold.”
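The three-sample claim above can be checked numerically. The following is a minimal Monte Carlo sketch, not the argument of the paper: it assumes three samples of sizes 5, 20 and 80 drawn from a standard normal distribution and uses the fact that the mean of $n$ i.i.d. $N(0,1)$ draws is exactly $N(0,1/n)$. The function names, sample sizes, trial count and seeds are illustrative assumptions.

```python
import random


def highest_mean_frequencies(sizes, trials=50_000, seed=0):
    """Estimate, for each sample size, how often its sample mean is the highest."""
    rng = random.Random(seed)
    wins = [0] * len(sizes)
    for _ in range(trials):
        # Draw each sample mean directly from its exact N(0, 1/n) distribution.
        means = [rng.gauss(0.0, n ** -0.5) for n in sizes]
        wins[means.index(max(means))] += 1
    return [w / trials for w in wins]


def pairwise_frequency(n_a, n_b, trials=50_000, seed=1):
    """Estimate how often the mean of a size-n_a sample exceeds that of a size-n_b sample."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        if rng.gauss(0.0, n_a ** -0.5) > rng.gauss(0.0, n_b ** -0.5):
            hits += 1
    return hits / trials


sizes = [5, 20, 80]
freqs = highest_mean_frequencies(sizes)
pair = pairwise_frequency(5, 20)
print("P(highest mean) by sample size:", dict(zip(sizes, freqs)))
print("P(mean of size-5 sample > mean of size-20 sample):", pair)
```

In this sketch the estimated chance of having the highest mean decreases with sample size, while the pairwise comparison stays close to one half, matching the qualitative statement in the abstract.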
</p>Wed, 12 Dec 2018 04:01 EST
Self-consistent confidence sets and tests of composite hypotheses applicable to restricted parameters
https://projecteuclid.org/euclid.bj/1544605238
<strong>David R. Bickel</strong>, <strong>Alexandre G. Patriota</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 47--74.</p><p><strong>Abstract:</strong><br/>
Frequentist methods, without the coherence guarantees of fully Bayesian methods, are known to yield self-contradictory inferences in certain settings. The framework introduced in this paper provides a simple adjustment to $p$ values and confidence sets to ensure the mutual consistency of all inferences without sacrificing frequentist validity. Based on a definition of the compatibility of a composite hypothesis with the observed data given any parameter restriction and on the requirement of self-consistency, the adjustment leads to the possibility and necessity measures of possibility theory rather than to the posterior probability distributions of Bayesian and fiducial inference.
</p>Wed, 12 Dec 2018 04:01 EST
Rigid stationary determinantal processes in non-Archimedean fields
https://projecteuclid.org/euclid.bj/1544605239
<strong>Yanqi Qiu</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 75--88.</p><p><strong>Abstract:</strong><br/>
Let $F$ be a non-discrete non-Archimedean local field. For any subset $S\subset F$ with finite Haar measure, there is a stationary determinantal point process on $F$ with correlation kernel $\widehat{\mathbb{1}}_{S}(x-y)$, where $\widehat{\mathbb{1}}_{S}$ is the Fourier transform of the indicator function $\mathbb{1}_{S}$. In this note, we give a geometrical condition on the subset $S$, such that the associated determinantal point process is rigid in the sense of Ghosh and Peres. Our geometrical condition is very different from the Euclidean case.
</p>projecteuclid.org/euclid.bj/1544605239_20181212040117Wed, 12 Dec 2018 04:01 ESTStein’s method and approximating the quantum harmonic oscillatorhttps://projecteuclid.org/euclid.bj/1544605240<strong>Ian W. McKeague</strong>, <strong>Erol A. Peköz</strong>, <strong>Yvik Swan</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 89--111.</p><p><strong>Abstract:</strong><br/>
Hall et al. [Phys. Rev. X 4 (2014) 041013] recently proposed that quantum theory can be understood as the continuum limit of a deterministic theory in which there is a large, but finite, number of classical “worlds.” A resulting Gaussian limit theorem for particle positions in the ground state, agreeing with quantum theory, was conjectured in Hall et al. [Phys. Rev. X 4 (2014) 041013] and proven by McKeague and Levin [Ann. Appl. Probab. 26 (2016) 2540–2555] using Stein’s method. In this article we show how quantum position probability densities for higher energy levels beyond the ground state may arise as distributional fixed points in a new generalization of Stein’s method. These are then used to obtain a rate of distributional convergence for conjectured particle positions in the first energy level above the ground state to the (two-sided) Maxwell distribution; new techniques must be developed for this setting where the usual “density approach” Stein solution (see Chatterjee and Shao [Ann. Appl. Probab. 21 (2011) 464–483]) has a singularity.
</p>projecteuclid.org/euclid.bj/1544605240_20181212040117Wed, 12 Dec 2018 04:01 ESTVerifiable conditions for the irreducibility and aperiodicity of Markov chains by analyzing underlying deterministic modelshttps://projecteuclid.org/euclid.bj/1544605241<strong>Alexandre Chotard</strong>, <strong>Anne Auger</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 112--147.</p><p><strong>Abstract:</strong><br/>
We consider Markov chains that obey the following general non-linear state space model: $\Phi_{k+1}=F(\Phi_{k},\alpha(\Phi_{k},U_{k+1}))$ where the function $F$ is $C^{1}$ while $\alpha$ is typically discontinuous and $\{U_{k}:k\in\mathbb{Z}_{>0}\}$ is an independent and identically distributed process. We assume that for all $x$, the random variable $\alpha(x,U_{1})$ admits a density $p_{x}$ such that $(x,w)\mapsto p_{x}(w)$ is lower semi-continuous.
We generalize and extend previous results that connect properties of the underlying deterministic control model to provide conditions for the chain to be $\varphi$-irreducible and aperiodic. By building on those results, we show that if a rank condition on the controllability matrix is satisfied for all $x$, there is equivalence between the existence of a globally attracting state for the control model and $\varphi$-irreducibility of the Markov chain. Additionally, under the same rank condition on the controllability matrix, we prove that there is equivalence between the existence of a steadily attracting state and the $\varphi$-irreducibility and aperiodicity of the chain. The notion of steadily attracting state is new. We additionally derive practical conditions by showing that the rank condition on the controllability matrix needs to be verified only at a globally attracting state (resp. steadily attracting state) for the chain to be a $\varphi$-irreducible $T$-chain (resp. $\varphi$-irreducible aperiodic $T$-chain).
Those results hold under considerably weaker assumptions on the model than previous ones that would require $(x,u)\mapsto F(x,\alpha(x,u))$ to be $C^{\infty}$ (while it can be discontinuous here). Additionally, the establishment of a necessary and sufficient condition on the control model for the $\varphi$-irreducibility and aperiodicity without a structural assumption on the control set is novel – even for Markov chains where $(x,u)\mapsto F(x,\alpha(x,u))$ is $C^{\infty}$.
We illustrate that the conditions are easy to verify on a non-trivial and non-artificial example of a Markov chain arising in the context of adaptive stochastic search algorithms to optimize continuous functions in a black-box scenario.
</p>projecteuclid.org/euclid.bj/1544605241_20181212040117Wed, 12 Dec 2018 04:01 ESTRecovering the Brownian coalescent point process from the Kingman coalescent by conditional samplinghttps://projecteuclid.org/euclid.bj/1544605242<strong>Amaury Lambert</strong>, <strong>Emmanuel Schertzer</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 148--173.</p><p><strong>Abstract:</strong><br/>
We consider a continuous population whose dynamics is described by the standard stationary Fleming–Viot process, so that the genealogy of $n$ uniformly sampled individuals is distributed as the Kingman $n$-coalescent. In this note, we study some genealogical properties of this population when the sample is conditioned to fall entirely into a subpopulation with most recent common ancestor (MRCA) shorter than $\varepsilon$. First, using the comb representation of the total genealogy (Lambert and Uribe Bravo (P-Adic Numbers Ultrametric Anal. Appl. 9 (2017) 22–38)), we show that the genealogy of the descendance of the MRCA of the sample on the timescale $\varepsilon$ converges as $\varepsilon\to0$. The limit is the so-called Brownian coalescent point process (CPP) stopped at an independent Gamma random variable with parameter $n$, which can be seen as the genealogy at a large time of the total population of a rescaled critical birth–death process, biased by the $n$th power of its size. Second, we show that in this limit the coalescence times of the $n$ sampled individuals are i.i.d. uniform random variables in $(0,1)$. These results provide a coupling between two standard models for the genealogy of a random exchangeable population: the Kingman coalescent and the Brownian CPP.
</p>projecteuclid.org/euclid.bj/1544605242_20181212040117Wed, 12 Dec 2018 04:01 ESTSubexponential decay in kinetic Fokker–Planck equation: Weak hypocoercivityhttps://projecteuclid.org/euclid.bj/1544605243<strong>Shulan Hu</strong>, <strong>Xinyu Wang</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 174--188.</p><p><strong>Abstract:</strong><br/>
We consider here quantitative convergence to equilibrium for the kinetic Fokker–Planck equation. We present a weak hypocoercivity approach à la Villani, using weak Poincaré inequality, ensuring subexponential convergence to equilibrium in $\mathcal{H}^{1}$ sense or in $L^{2}$ sense.
</p>projecteuclid.org/euclid.bj/1544605243_20181212040117Wed, 12 Dec 2018 04:01 ESTPólya urns with immigration at random timeshttps://projecteuclid.org/euclid.bj/1544605244<strong>Erol Peköz</strong>, <strong>Adrian Röllin</strong>, <strong>Nathan Ross</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 189--220.</p><p><strong>Abstract:</strong><br/>
We study the number of white balls in a classical Pólya urn model with the additional feature that, at random times, a black ball is added to the urn. The number of draws between these random times are i.i.d. and, under certain moment conditions on the inter-arrival distribution, we characterize the limiting distribution of the (properly scaled) number of white balls as the number of draws goes to infinity. The possible limiting distributions obtained in this way vary considerably depending on the inter-arrival distribution and are difficult to describe explicitly. However, we show that the limits are fixed points of certain probabilistic distributional transformations, and this fact provides a proof of convergence and leads to properties of the limits. The model can alternatively be viewed as a preferential attachment random graph model where added vertices initially have a random number of edges, and from this perspective, our results describe the limit of the degree of a fixed vertex.
</p>projecteuclid.org/euclid.bj/1544605244_20181212040117Wed, 12 Dec 2018 04:01 ESTFeller property of the multiplicative coalescent with linear deletionhttps://projecteuclid.org/euclid.bj/1544605245<strong>Balázs Ráth</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 221--240.</p><p><strong>Abstract:</strong><br/>
We modify the definition of Aldous’ multiplicative coalescent process (Ann. Probab. 25 (1997) 812–854) and introduce the multiplicative coalescent with linear deletion (MCLD). A state of this process is a square-summable decreasing sequence of cluster sizes. Pairs of clusters merge with a rate equal to the product of their sizes and clusters are deleted with a rate linearly proportional to their size. We prove that the MCLD is a Feller process. This result is a key ingredient in the description of scaling limits of the evolution of component sizes of the mean field frozen percolation model (J. Stat. Phys. 137 (2009) 459–499) and the so-called rigid representation of such scaling limits (Electron. J. Probab. To appear).
</p>projecteuclid.org/euclid.bj/1544605245_20181212040117Wed, 12 Dec 2018 04:01 ESTAsymptotic power of Rao’s score test for independence in high dimensionshttps://projecteuclid.org/euclid.bj/1544605246<strong>Dennis Leung</strong>, <strong>Qiman Shao</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 241--263.</p><p><strong>Abstract:</strong><br/>
Let $\mathbf{R}$ be the Pearson correlation matrix of $m$ normal random variables. Rao’s score test for the independence hypothesis $H_{0}:\mathbf{R}=\mathbf{I}_{m}$, where $\mathbf{I}_{m}$ is the identity matrix of dimension $m$, was first considered by Schott (Biometrika 92 (2005) 951–956) in the high dimensional setting. In this paper, we study the exact power function of this test, under an asymptotic regime in which both $m$ and the sample size $n$ tend to infinity with the ratio $m/n$ upper bounded by a constant. In particular, our result implies that Rao’s score test is minimax rate-optimal for detecting the dependency signal $\Vert\mathbf{R}-\mathbf{I}_{m}\Vert_{F}$ of order $\sqrt{m/n}$, where $\Vert\cdot\Vert_{F}$ is the matrix Frobenius norm.
</p>projecteuclid.org/euclid.bj/1544605246_20181212040117Wed, 12 Dec 2018 04:01 ESTExtreme M-quantiles as risk measures: From $L^{1}$ to $L^{p}$ optimizationhttps://projecteuclid.org/euclid.bj/1544605247<strong>Abdelaati Daouia</strong>, <strong>Stéphane Girard</strong>, <strong>Gilles Stupfler</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 264--309.</p><p><strong>Abstract:</strong><br/>
The class of quantiles lies at the heart of extreme-value theory and is one of the basic tools in risk management. The alternative family of expectiles is based on squared rather than absolute error loss minimization. It has recently been receiving a lot of attention in actuarial science, econometrics and statistical finance. Both quantiles and expectiles can be embedded in a more general class of M-quantiles by means of $L^{p}$ optimization. These generalized $L^{p}$-quantiles steer an advantageous middle course between ordinary quantiles and expectiles without sacrificing their virtues too much for $1<p<2$. In this paper, we investigate their estimation from the perspective of extreme values in the class of heavy-tailed distributions. We construct estimators of the intermediate $L^{p}$-quantiles and establish their asymptotic normality in a dependence framework motivated by financial and actuarial applications, before extrapolating these estimates to the very far tails. We also investigate the potential of extreme $L^{p}$-quantiles as a tool for estimating the usual quantiles and expectiles themselves. We show the usefulness of extreme $L^{p}$-quantiles and elaborate the choice of $p$ through applications to some simulated and financial real data.
</p>projecteuclid.org/euclid.bj/1544605247_20181212040117Wed, 12 Dec 2018 04:01 ESTError bounds for sequential Monte Carlo samplers for multimodal distributionshttps://projecteuclid.org/euclid.bj/1544605248<strong>Daniel Paulin</strong>, <strong>Ajay Jasra</strong>, <strong>Alexandre Thiery</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 310--340.</p><p><strong>Abstract:</strong><br/>
In this paper, we provide bounds on the asymptotic variance for a class of sequential Monte Carlo (SMC) samplers designed for approximating multimodal distributions. Such methods combine standard SMC methods and Markov chain Monte Carlo (MCMC) kernels. Our bounds improve upon previous results, and unlike some earlier work, they also apply in the case when the MCMC kernels can move between the modes. We apply our results to the Potts model from statistical physics. In this case, the problem of sharp peaks is encountered. Earlier methods, such as parallel tempering, are only able to sample from it at an exponential (in an important parameter of the model) cost. We propose a sequence of interpolating distributions called interpolation to independence, and show that the SMC sampler based on it is able to sample from this target distribution at a polynomial cost. We believe that our method is generally applicable to many other distributions as well.
</p>projecteuclid.org/euclid.bj/1544605248_20181212040117Wed, 12 Dec 2018 04:01 ESTOn the convex Poincaré inequality and weak transportation inequalitieshttps://projecteuclid.org/euclid.bj/1544605249<strong>Radosław Adamczak</strong>, <strong>Michał Strzelecki</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 341--374.</p><p><strong>Abstract:</strong><br/>
We prove that for a probability measure on $\mathbb{R}^{n}$, the Poincaré inequality for convex functions is equivalent to the weak transportation inequality with a quadratic-linear cost. This generalizes recent results by Gozlan, Roberto, Samson, Shu, Tetali and Feldheim, Marsiglietti, Nayar, Wang, concerning probability measures on the real line.
The proof relies on modified logarithmic Sobolev inequalities of Bobkov–Ledoux type for convex and concave functions, which are of independent interest.
We also present refined concentration inequalities for general (not necessarily Lipschitz) convex functions, complementing recent results by Bobkov, Nayar, and Tetali.
</p>projecteuclid.org/euclid.bj/1544605249_20181212040117Wed, 12 Dec 2018 04:01 ESTOn the longest gap between power-rate arrivalshttps://projecteuclid.org/euclid.bj/1544605250<strong>Søren Asmussen</strong>, <strong>Jevgenijs Ivanovs</strong>, <strong>Johan Segers</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 375--394.</p><p><strong>Abstract:</strong><br/>
Let $L_{t}$ be the longest gap before time $t$ in an inhomogeneous Poisson process with rate function $\lambda_{t}$ proportional to $t^{\alpha-1}$ for some $\alpha\in(0,1)$. It is shown that $\lambda_{t}L_{t}-b_{t}$ has a limiting Gumbel distribution for suitable constants $b_{t}$ and that the distance of this longest gap from $t$ is asymptotically of the form $(t/\log t)E$ for an exponential random variable $E$. The analysis is performed via weak convergence of related point processes. Subject to a weak technical condition, the results are extended to include a slowly varying term in $\lambda_{t}$.
</p>projecteuclid.org/euclid.bj/1544605250_20181212040117Wed, 12 Dec 2018 04:01 ESTNonparametric depth and quantile regression for functional datahttps://projecteuclid.org/euclid.bj/1544605251<strong>Joydeep Chowdhury</strong>, <strong>Probal Chaudhuri</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 395--423.</p><p><strong>Abstract:</strong><br/>
We investigate nonparametric regression methods based on spatial depth and quantiles when the response and the covariate are both functions. As in classical quantile regression for finite dimensional data, regression techniques developed here provide insight into the influence of the functional covariate on different parts, like the center as well as the tails, of the conditional distribution of the functional response. Depth and quantile based nonparametric regression methods are useful to detect heteroscedasticity in functional regression. We derive the asymptotic behavior of the nonparametric depth and quantile regression estimates, which depend on the small ball probabilities in the covariate space. Our nonparametric regression procedures are used to analyze a dataset about the influence of per capita GDP on saving rates for 125 countries, and another dataset on the effects of per capita net disposable income on the sale of cigarettes in some states in the US.
</p>projecteuclid.org/euclid.bj/1544605251_20181212040117Wed, 12 Dec 2018 04:01 ESTEstimation and hypotheses testing in boundary regression modelshttps://projecteuclid.org/euclid.bj/1544605252<strong>Holger Drees</strong>, <strong>Natalie Neumeyer</strong>, <strong>Leonie Selk</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 424--463.</p><p><strong>Abstract:</strong><br/>
Consider a nonparametric regression model with one-sided errors and regression function in a general Hölder class. We estimate the regression function via minimization of the local integral of a polynomial approximation. We show uniform rates of convergence for the simple regression estimator as well as for a smooth version. These rates carry over to mean regression models with a symmetric and bounded error distribution. In such a setting, one obtains faster rates for irregular error distributions concentrating sufficient mass near the endpoints than for the usual regular distributions. The results are applied to prove asymptotic $\sqrt{n}$-equivalence of a residual-based (sequential) empirical distribution function to the (sequential) empirical distribution function of unobserved errors in the case of irregular error distributions. This result is remarkably different from corresponding results in mean regression with regular errors. It can readily be applied to develop goodness-of-fit tests for the error distribution. We present some examples and investigate the small sample performance in a simulation study. We further discuss asymptotically distribution-free hypotheses tests for independence of the error distribution from the points of measurement and for monotonicity of the boundary function as well.
</p>projecteuclid.org/euclid.bj/1544605252_20181212040117Wed, 12 Dec 2018 04:01 ESTConsistent order estimation for nonparametric hidden Markov modelshttps://projecteuclid.org/euclid.bj/1544605253<strong>Luc Lehéricy</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 464--498.</p><p><strong>Abstract:</strong><br/>
We consider the problem of estimating the number of hidden states (the order) of a nonparametric hidden Markov model (HMM). We propose two different methods and prove their almost sure consistency without any prior assumption, be it on the order or on the emission distributions. This is the first time a consistency result is proved in such a general setting without using restrictive assumptions such as a priori upper bounds on the order or parametric restrictions on the emission distributions. Our main method relies on the minimization of a penalized least squares criterion. In addition to the consistency of the order estimation, we also prove that this method yields rate minimax adaptive estimators of the parameters of the HMM – up to a logarithmic factor. Our second method relies on estimating the rank of a matrix obtained from the distribution of two consecutive observations. Finally, numerical experiments are used to compare both methods and study their ability to select the right order in several situations.
</p>projecteuclid.org/euclid.bj/1544605253_20181212040117Wed, 12 Dec 2018 04:01 ESTCentral limit theorem for Fourier transform and periodogram of random fieldshttps://projecteuclid.org/euclid.bj/1544605254<strong>Magda Peligrad</strong>, <strong>Na Zhang</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 499--520.</p><p><strong>Abstract:</strong><br/>
In this paper, we show that the limiting distribution of the real and the imaginary part of the Fourier transform of a stationary random field is almost surely an independent vector with Gaussian marginal distributions, whose variance is, up to a constant, the field’s spectral density. The dependence structure of the random field is general and we do not impose any restrictions on the speed of convergence to zero of the covariances, or smoothness of the spectral density. The only condition required is that the variables are adapted to a commuting filtration and are regular in some sense. The results go beyond the Bernoulli fields and apply to both short range and long range dependence. They can be easily applied to derive the asymptotic behavior of the periodogram associated to the random field. The method of proof is based on new probabilistic methods involving martingale approximations and also on borrowed and new tools from harmonic analysis. Several examples involving linear, Volterra and Gaussian random fields are presented.
</p>projecteuclid.org/euclid.bj/1544605254_20181212040117Wed, 12 Dec 2018 04:01 ESTA multidimensional analogue of the arcsine law for the number of positive terms in a random walkhttps://projecteuclid.org/euclid.bj/1544605255<strong>Zakhar Kabluchko</strong>, <strong>Vladislav Vysotsky</strong>, <strong>Dmitry Zaporozhets</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 521--548.</p><p><strong>Abstract:</strong><br/>
Consider a random walk $S_{i}=\xi_{1}+\cdots+\xi_{i}$, $i\in\mathbb{N}$, whose increments $\xi_{1},\xi_{2},\ldots$ are independent identically distributed random vectors in $\mathbb{R}^{d}$ such that $\xi_{1}$ has the same law as $-\xi_{1}$ and $\mathbb{P}[\xi_{1}\in H]=0$ for every affine hyperplane $H\subset\mathbb{R}^{d}$. Our main result is the distribution-free formula
\[\mathbb{E}\bigg[\sum_{1\leq i_{1}<\cdots<i_{k}\leq n}\mathbb{1}_{\{0\notin\operatorname{Conv}(S_{i_{1}},\ldots,S_{i_{k}})\}}\bigg]=2\binom{n}{k}\frac{B(k,d-1)+B(k,d-3)+\cdots}{2^{k}k!},\] where the $B(k,j)$’s are defined by their generating function $(t+1)(t+3)\ldots(t+2k-1)=\sum_{j=0}^{k}B(k,j)t^{j}$. The expected number of $k$-tuples above admits the following geometric interpretation: it is the expected number of $k$-dimensional faces of a randomly and uniformly sampled open Weyl chamber of type $B_{n}$ that are not intersected by a generic linear subspace $L\subset\mathbb{R}^{n}$ of codimension $d$. The case $d=1$ turns out to be equivalent to the classical discrete arcsine law for the number of positive terms in a one-dimensional random walk with continuous symmetric distribution of increments. We also prove similar results for random bridges with no central symmetry assumption required.
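The right-hand side of the formula can be evaluated exactly by expanding the generating polynomial. The sketch below does this with rational arithmetic. It assumes the sum $B(k,d-1)+B(k,d-3)+\cdots$ runs over non-negative indices only and that $B(k,j)=0$ for $j>k$; the function names `b_coefficients` and `expected_count` are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def b_coefficients(k):
    """Coefficients B(k, 0..k) of (t+1)(t+3)...(t+2k-1), lowest degree first."""
    coeffs = [1]
    for m in range(1, 2 * k, 2):  # factors (t + 1), (t + 3), ..., (t + 2k - 1)
        new = [0] * (len(coeffs) + 1)
        for j, c in enumerate(coeffs):
            new[j] += m * c      # constant term of the factor
            new[j + 1] += c      # t term of the factor
        coeffs = new
    return coeffs

def expected_count(n, k, d):
    """Right-hand side 2*C(n,k)*(B(k,d-1)+B(k,d-3)+...)/(2^k k!).

    Assumption: indices below 0 or above k contribute nothing.
    """
    b = b_coefficients(k)
    total = sum(b[j] for j in range(d - 1, -1, -2) if j <= k)
    return Fraction(2 * comb(n, k) * total, 2 ** k * factorial(k))

print(b_coefficients(2))        # coefficients of (t+1)(t+3) = 3 + 4t + t^2
print(expected_count(2, 2, 1))  # one-dimensional walk, n = k = 2
```

Two sanity checks are consistent with this reading of the formula. For $k\le d$, the even-index and odd-index coefficients each sum to $2^{k-1}k!$, so the value reduces to $\binom{n}{k}$, matching the fact that $k$ generic points in $\mathbb{R}^{d}$ do not contain the origin in their convex hull. For $d=1$ and $n=k=2$ the value is $3/4$, the probability that $S_{1}$ and $S_{2}$ have the same sign.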
</p>projecteuclid.org/euclid.bj/1544605255_20181212040117Wed, 12 Dec 2018 04:01 ESTLimit properties of the monotone rearrangement for density and regression function estimationhttps://projecteuclid.org/euclid.bj/1544605256<strong>Dragi Anevski</strong>, <strong>Anne-Laure Fougères</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 549--583.</p><p><strong>Abstract:</strong><br/>
The monotone rearrangement algorithm was introduced by Hardy, Littlewood and Pólya as a sorting device for functions. Assuming that $x$ is a monotone function and that an estimate $x_{n}$ of $x$ is given, consider the monotone rearrangement $\hat{x}_{n}$ of $x_{n}$. This new estimator is shown to be uniformly consistent as soon as $x_{n}$ is. Under suitable assumptions, pointwise limit distribution results for $\hat{x}_{n}$ are obtained. The framework is general and allows for weakly dependent and long range dependent stationary data. Applications in monotone density and regression function estimation are detailed. Asymptotics for rearrangement estimators with vanishing derivatives are also obtained in these two contexts.
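On an equally spaced grid, the increasing rearrangement of a function reduces to sorting its sampled values, which gives a minimal discrete sketch of the estimator $\hat{x}_{n}$. The grid, the target $x(t)=t^{2}$, the Gaussian noise level, and the name `monotone_rearrangement` are assumptions made for illustration only; the paper's setting is more general.

```python
import random

def monotone_rearrangement(values):
    """Increasing rearrangement of a function sampled on a uniform grid."""
    return sorted(values)

rng = random.Random(1)
grid = [i / 200 for i in range(201)]
truth = [t ** 2 for t in grid]                       # increasing target x
noisy = [x + rng.gauss(0.0, 0.05) for x in truth]    # preliminary estimate x_n
rearranged = monotone_rearrangement(noisy)           # rearranged estimate

# Sup-norm errors before and after rearrangement.
sup_before = max(abs(a - b) for a, b in zip(noisy, truth))
sup_after = max(abs(a - b) for a, b in zip(rearranged, truth))
print(sup_before, sup_after)
```

Sorting is a contraction in the sup norm, so when the target is itself increasing the rearranged estimate is never further from it than the original estimate; this is the discrete counterpart of the statement that $\hat{x}_{n}$ is uniformly consistent as soon as $x_{n}$ is.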
</p>projecteuclid.org/euclid.bj/1544605256_20181212040117Wed, 12 Dec 2018 04:01 ESTSequential Monte Carlo as approximate sampling: bounds, adaptive resampling via $\infty$-ESS, and an application to particle Gibbshttps://projecteuclid.org/euclid.bj/1544605257<strong>Jonathan H. Huggins</strong>, <strong>Daniel M. Roy</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 584--622.</p><p><strong>Abstract:</strong><br/>
Sequential Monte Carlo (SMC) algorithms were originally designed for estimating intractable conditional expectations within state-space models, but are now routinely used to generate approximate samples in the context of general-purpose Bayesian inference. In particular, SMC algorithms are often used as subroutines within larger Monte Carlo schemes, and in this context, the demands placed on SMC are different: control of mean-squared error is insufficient—one needs to control the divergence from the target distribution directly. Towards this goal, we introduce the conditional adaptive resampling particle filter, building on the work of Gordon, Salmond, and Smith (1993), Andrieu, Doucet, and Holenstein (2010), and Whiteley, Lee, and Heine (2016). By controlling a novel notion of effective sample size, the $\infty$-ESS, we establish the efficiency of the resulting SMC sampling algorithm, providing an adaptive resampling extension of the work of Andrieu, Lee, and Vihola (2018). We apply our results to arrive at new divergence bounds for SMC samplers with adaptive resampling as well as an adaptive resampling version of the Particle Gibbs algorithm with the same geometric-ergodicity guarantees as its nonadaptive counterpart.
</p>projecteuclid.org/euclid.bj/1544605257_20181212040117Wed, 12 Dec 2018 04:01 ESTOptimal rates of statistical seriationhttps://projecteuclid.org/euclid.bj/1544605258<strong>Nicolas Flammarion</strong>, <strong>Cheng Mao</strong>, <strong>Philippe Rigollet</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 623--653.</p><p><strong>Abstract:</strong><br/>
Given a matrix, the seriation problem consists in permuting its rows in such a way that all its columns have the same shape, for example, they are monotone increasing. We propose a statistical approach to this problem where the matrix of interest is observed with noise and study the corresponding minimax rate of estimation of the matrices. Specifically, when the columns are either unimodal or monotone, we show that the least squares estimator is optimal up to logarithmic factors and adapts to matrices with a certain natural structure. Finally, we propose a computationally efficient estimator in the monotonic case and study its performance both theoretically and experimentally. Our work is at the intersection of shape constrained estimation and recent work that involves permutation learning, such as graph denoising and ranking.
</p>projecteuclid.org/euclid.bj/1544605258_20181212040117Wed, 12 Dec 2018 04:01 ESTSecond order correctness of perturbation bootstrap M-estimator of multiple linear regression parameterhttps://projecteuclid.org/euclid.bj/1544605259<strong>Debraj Das</strong>, <strong>S.N. Lahiri</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 654--682.</p><p><strong>Abstract:</strong><br/>
Consider the multiple linear regression model $y_{i}=\mathbf{x}'_{i}\boldsymbol{\beta}+\varepsilon_{i}$, where $\varepsilon_{i}$’s are independent and identically distributed random variables, $\mathbf{x}_{i}$’s are known design vectors and $\boldsymbol{\beta}$ is the $p\times1$ vector of parameters. An effective way of approximating the distribution of the M-estimator $\bar{\boldsymbol{\beta}}_{n}$, after proper centering and scaling, is the Perturbation Bootstrap Method. In the present work, second order results of this non-naive bootstrap method are investigated. Second order correctness is important for reducing the approximation error uniformly to $o(n^{-1/2})$ to get better inferences. We show that the classical studentized version of the bootstrapped estimator fails to be second order correct. We introduce an innovative modification in the studentized version of the bootstrapped statistic and show that the modified bootstrapped pivot is second order correct (S.O.C.) for approximating the distribution of the studentized M-estimator. Additionally, we show that the Perturbation Bootstrap continues to be S.O.C. when the errors $\varepsilon_{i}$’s are independent, but may not be identically distributed. These findings establish the Perturbation Bootstrap approximation as a significant improvement over asymptotic normality in regression M-estimation.
</p>projecteuclid.org/euclid.bj/1544605259_20181212040117Wed, 12 Dec 2018 04:01 ESTRandom polymers on the complete graphhttps://projecteuclid.org/euclid.bj/1544605260<strong>Francis Comets</strong>, <strong>Gregorio Moreno</strong>, <strong>Alejandro F. Ramírez</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 683--711.</p><p><strong>Abstract:</strong><br/>
Consider directed polymers in a random environment on the complete graph of size $N$. This model can be formulated as a product of i.i.d. $N\times N$ random matrices and its large time asymptotics is captured by Lyapunov exponents and the Furstenberg measure. We detail this correspondence, derive the long-time limit of the model and obtain a co-variant distribution for the polymer path.
Next, we observe that the model becomes exactly solvable when the disorder variables are located on edges of the complete graph and follow a totally asymmetric stable law of index $\alpha\in(0,1)$. Then, a certain notion of mean height of the polymer behaves like a random walk and we show that the height function is distributed around this mean according to an explicit law. Large $N$ asymptotics can be taken in this setting, for instance, for the free energy of the system and for the invariant law of the polymer height with a shift. Moreover, we give some perturbative results for environments which are close to the totally asymmetric stable laws.
</p>projecteuclid.org/euclid.bj/1544605260_20181212040117Wed, 12 Dec 2018 04:01 ESTSum rules and large deviations for spectral matrix measureshttps://projecteuclid.org/euclid.bj/1544605261<strong>Fabrice Gamboa</strong>, <strong>Jan Nagel</strong>, <strong>Alain Rouault</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 712--741.</p><p><strong>Abstract:</strong><br/>
In the paradigm of random matrices, one of the most classical objects under study is the empirical spectral distribution. This random measure is the uniform distribution supported by the eigenvalues of the random matrix. In this paper, we give large deviation theorems for another popular object built on Hermitian random matrices: the spectral measure. This last probability measure is a random weighted version of the empirical spectral distribution. The weights involve the eigenvectors of the random matrix. We have previously studied the large deviations of the spectral measure in the case of scalar weights. Here, we will focus on matrix valued weights. Our probabilistic results lead to deterministic ones called “sum rules” in spectral theory. A sum rule relative to a reference measure on $\mathbb{R}$ is a relationship between the reversed Kullback–Leibler divergence of a positive measure on $\mathbb{R}$ and some non-linear functional built on spectral elements related to this measure. By using only probabilistic tools of large deviations, we extend the sum rules to the case of Hermitian matrix-valued measures.
</p>projecteuclid.org/euclid.bj/1544605261_20181212040117Wed, 12 Dec 2018 04:01 ESTWeak subordination of multivariate Lévy processes and variance generalised gamma convolutionshttps://projecteuclid.org/euclid.bj/1544605262<strong>Boris Buchmann</strong>, <strong>Kevin W. Lu</strong>, <strong>Dilip B. Madan</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 742--770.</p><p><strong>Abstract:</strong><br/>
Subordinating a multivariate Lévy process, the subordinate, with a univariate subordinator gives rise to a pathwise construction of a new Lévy process, provided the subordinator and the subordinate are independent processes. The variance-gamma model in finance was generated accordingly from a Brownian motion and a gamma process. Alternatively, multivariate subordination can be used to create Lévy processes, but this requires the subordinate to have independent components. In this paper, we show that there exists another operation acting on pairs $(T,X)$ of Lévy processes which creates a Lévy process $X\odot T$. Here, $T$ is a subordinator, but $X$ is an arbitrary Lévy process with possibly dependent components. We show that this method is an extension of both univariate and multivariate subordination and provide two applications. We illustrate our methods giving a weak formulation of the variance-$\boldsymbol{\alpha}$-gamma process that exhibits a wider range of dependence than using traditional subordination. Also, the variance generalised gamma convolution class of Lévy processes formed by subordinating Brownian motion with Thorin subordinators is further extended using weak subordination.
</p>projecteuclid.org/euclid.bj/1544605262_20181212040117Wed, 12 Dec 2018 04:01 ESTEstimating the interaction graph of stochastic neural dynamicshttps://projecteuclid.org/euclid.bj/1544605263<strong>Aline Duarte</strong>, <strong>Antonio Galves</strong>, <strong>Eva Löcherbach</strong>, <strong>Guilherme Ost</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 1, 771--792.</p><p><strong>Abstract:</strong><br/>
In this paper, we address the question of statistical model selection for a class of stochastic models of biological neural nets. Models in this class are systems of interacting chains with memory of variable length. Each chain describes the activity of a single neuron, indicating whether it spikes or not at a given time. The spiking probability of a given neuron depends on the time evolution of its presynaptic neurons since its last spike time. When a neuron spikes, its potential is reset to a resting level and postsynaptic current pulses are generated, modifying the membrane potential of all its postsynaptic neurons. The relationship between a neuron and its pre- and postsynaptic neurons defines an oriented graph, the interaction graph of the model. The goal of this paper is to estimate this graph based on the observation of the spike activity of a finite set of neurons over a finite time. We provide explicit exponential upper bounds for the probabilities of under- and overestimating the interaction graph restricted to the observed set and obtain the strong consistency of the estimator. Our result requires neither stationarity nor uniqueness of the invariant measure of the process.
</p>projecteuclid.org/euclid.bj/1544605263_20181212040117Wed, 12 Dec 2018 04:01 ESTExpansion for moments of regression quantiles with applications to nonparametric testinghttps://projecteuclid.org/euclid.bj/1551862835<strong>Enno Mammen</strong>, <strong>Ingrid Van Keilegom</strong>, <strong>Kyusang Yu</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 793--827.</p><p><strong>Abstract:</strong><br/>
We discuss nonparametric tests for parametric specifications of regression quantiles. The test is based on the comparison of parametric and nonparametric fits of these quantiles. The nonparametric fit is a Nadaraya–Watson quantile smoothing estimator.
An asymptotic treatment of the test statistic requires the development of new mathematical arguments. An approach that relies only on plugging in a Bahadur expansion of the nonparametric estimator is not satisfactory: it requires overly strong conditions on the dimension and the choice of the bandwidth.
Our alternative mathematical approach requires the calculation of moments of Nadaraya–Watson quantile regression estimators. This calculation is done by application of higher order Edgeworth expansions.
</p>projecteuclid.org/euclid.bj/1551862835_20190306040120Wed, 06 Mar 2019 04:01 ESTOn squared Bessel particle systemshttps://projecteuclid.org/euclid.bj/1551862836<strong>Piotr Graczyk</strong>, <strong>Jacek Małecki</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 828--847.</p><p><strong>Abstract:</strong><br/>
We study the existence and uniqueness of solutions of SDEs describing squared Bessel particle systems in full generality. We define nonnegative and non-colliding squared Bessel particle systems and we study their properties. Particle systems that fail to satisfy the non-colliding and uniqueness properties are pointed out. The structure of squared Bessel particle systems is described.
</p>projecteuclid.org/euclid.bj/1551862836_20190306040120Wed, 06 Mar 2019 04:01 ESTSmooth, identifiable supermodels of discrete DAG models with latent variableshttps://projecteuclid.org/euclid.bj/1551862837<strong>Robin J. Evans</strong>, <strong>Thomas S. Richardson</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 848--876.</p><p><strong>Abstract:</strong><br/>
We provide a parameterization of the discrete nested Markov model, which is a supermodel that approximates DAG models (Bayesian network models) with latent variables. Such models are widely used in causal inference and machine learning. We explicitly evaluate their dimension, show that they are curved exponential families of distributions, and fit them to data. The parameterization avoids the irregularities and unidentifiability of latent variable models. The parameters used are all fully identifiable and causally-interpretable quantities.
</p>projecteuclid.org/euclid.bj/1551862837_20190306040120Wed, 06 Mar 2019 04:01 ESTBayesian consistency for a nonparametric stationary Markov modelhttps://projecteuclid.org/euclid.bj/1551862838<strong>Minwoo Chae</strong>, <strong>Stephen G. Walker</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 877--901.</p><p><strong>Abstract:</strong><br/>
We consider posterior consistency for a Markov model with a novel class of nonparametric prior. In this model, the transition density is parameterized via a mixing distribution function. Therefore, the Wasserstein distance between mixing measures can be used to construct neighborhoods of a transition density. The Wasserstein distance is sufficiently strong; for example, if the mixing distributions are compactly supported, it dominates the sup-$L_{1}$ metric. We provide sufficient conditions for posterior consistency with respect to the Wasserstein metric provided that the true transition density is also parameterized via a mixing distribution. In general, when it is not parameterized by a mixing distribution, we show the posterior distribution is consistent with respect to the average $L_{1}$ metric. Also, we provide a prior whose support is sufficiently large to contain most smooth transition densities.
</p>projecteuclid.org/euclid.bj/1551862838_20190306040120Wed, 06 Mar 2019 04:01 ESTLow-frequency estimation of continuous-time moving average Lévy processeshttps://projecteuclid.org/euclid.bj/1551862839<strong>Denis Belomestny</strong>, <strong>Vladimir Panov</strong>, <strong>Jeannette H.C. Woerner</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 902--931.</p><p><strong>Abstract:</strong><br/>
In this paper, we study the problem of statistical inference for a continuous-time moving average Lévy process of the form
\[Z_{t}=\int_{\mathbb{R}}\mathcal{K}(t-s)\,dL_{s},\qquad t\in\mathbb{R},\] with a deterministic kernel $\mathcal{K}$ and a Lévy process $L$. In particular, we consider the estimation of the Lévy measure $\nu$ of $L$ from low-frequency observations of the process $Z$. We construct a consistent estimator, derive its convergence rates and illustrate its performance by a numerical example. On the mathematical level, we establish some new results on exponential mixing for continuous-time moving average Lévy processes.
</p>projecteuclid.org/euclid.bj/1551862839_20190306040120Wed, 06 Mar 2019 04:01 ESTFréchet means and Procrustes analysis in Wasserstein spacehttps://projecteuclid.org/euclid.bj/1551862840<strong>Yoav Zemel</strong>, <strong>Victor M. Panaretos</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 932--976.</p><p><strong>Abstract:</strong><br/>
We consider two statistical problems at the intersection of functional and non-Euclidean data analysis: the determination of a Fréchet mean in the Wasserstein space of multivariate distributions; and the optimal registration of deformed random measures and point processes. We elucidate how the two problems are linked, each being in a sense dual to the other. We first study the finite sample version of the problem in the continuum. Exploiting the tangent bundle structure of Wasserstein space, we deduce the Fréchet mean via gradient descent. We show that this is equivalent to a Procrustes analysis for the registration maps, thus only requiring successive solutions to pairwise optimal coupling problems. We then study the population version of the problem, focussing on inference and stability: in practice, the data are i.i.d. realisations from a law on Wasserstein space, and indeed their observation is discrete, where one observes a proxy finite sample or point process. We construct regularised nonparametric estimators, and prove their consistency for the population mean, and uniform consistency for the population Procrustes registration maps.
</p>projecteuclid.org/euclid.bj/1551862840_20190306040120Wed, 06 Mar 2019 04:01 ESTAre there needles in a moving haystack? Adaptive sensing for detection of dynamically evolving signalshttps://projecteuclid.org/euclid.bj/1551862841<strong>Rui M. Castro</strong>, <strong>Ervin Tánczos</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 977--1012.</p><p><strong>Abstract:</strong><br/>
In this paper, we investigate the problem of detecting dynamically evolving signals. We model the signal as an $n$-dimensional vector that is either zero or has $s$ non-zero components. At each time step $t\in\mathbb{N}$ the non-zero components change their location independently with probability $p$. The statistical problem is to decide whether the signal is a zero vector or in fact has non-zero components. This decision is based on $m$ noisy observations of individual signal components collected at times $t=1,\ldots,m$. We consider two different sensing paradigms, namely adaptive and non-adaptive sensing. For non-adaptive sensing, the choice of components to measure has to be decided before the data collection process starts, while for adaptive sensing one can adjust the sensing process based on observations collected earlier. We characterize the difficulty of this detection problem in both sensing paradigms in terms of the aforementioned parameters, with special interest in the speed of change of the active components. In addition, we provide an adaptive sensing algorithm for this problem and contrast its performance with that of non-adaptive detection algorithms.
</p>projecteuclid.org/euclid.bj/1551862841_20190306040120Wed, 06 Mar 2019 04:01 ESTTowards a general theory for nonlinear locally stationary processeshttps://projecteuclid.org/euclid.bj/1551862842<strong>Rainer Dahlhaus</strong>, <strong>Stefan Richter</strong>, <strong>Wei Biao Wu</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1013--1044.</p><p><strong>Abstract:</strong><br/>
In this paper, some general theory is presented for locally stationary processes based on the stationary approximation and the stationary derivative. Laws of large numbers, central limit theorems as well as deterministic and stochastic bias expansions are proved for processes obeying an expansion in terms of the stationary approximation and derivative. In addition, it is shown that this applies to some general nonlinear non-stationary Markov models. Finally, the results are applied to derive the asymptotic properties of maximum likelihood estimates of parameter curves in such models.
</p>projecteuclid.org/euclid.bj/1551862842_20190306040120Wed, 06 Mar 2019 04:01 ESTProperties of switching jump diffusions: Maximum principles and Harnack inequalitieshttps://projecteuclid.org/euclid.bj/1551862843<strong>Xiaoshan Chen</strong>, <strong>Zhen-Qing Chen</strong>, <strong>Ky Tran</strong>, <strong>George Yin</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1045--1075.</p><p><strong>Abstract:</strong><br/>
This work examines a class of switching jump diffusion processes. The main effort is devoted to proving the maximum principle and obtaining the Harnack inequalities. Compared with diffusions and switching diffusions, the associated operators for switching jump diffusions are non-local, which makes such systems more difficult to treat. Our study is carried out by taking into consideration the interplay of stochastic processes and the associated systems of integro-differential equations.
</p>projecteuclid.org/euclid.bj/1551862843_20190306040120Wed, 06 Mar 2019 04:01 ESTError bounds in local limit theorems using Stein’s methodhttps://projecteuclid.org/euclid.bj/1551862844<strong>A.D. Barbour</strong>, <strong>Adrian Röllin</strong>, <strong>Nathan Ross</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1076--1104.</p><p><strong>Abstract:</strong><br/>
We provide a general result for bounding the difference between point probabilities of integer supported distributions and the translated Poisson distribution, a convenient alternative to the discretized normal. We illustrate our theorem in the context of the Hoeffding combinatorial central limit theorem with integer valued summands, of the number of isolated vertices in an Erdős–Rényi random graph, and of the Curie–Weiss model of magnetism, where we provide optimal or near optimal rates of convergence in the local limit metric. In the Hoeffding example, even the discrete normal approximation bounds seem to be new. The general result follows from Stein’s method, and requires a new bound on the Stein solution for the Poisson distribution, which is of general interest.
</p>projecteuclid.org/euclid.bj/1551862844_20190306040120Wed, 06 Mar 2019 04:01 ESTStability for gains from large investors’ strategies in $M_{1}$/$J_{1}$ topologieshttps://projecteuclid.org/euclid.bj/1551862845<strong>Dirk Becherer</strong>, <strong>Todor Bilarev</strong>, <strong>Peter Frentrup</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1105--1140.</p><p><strong>Abstract:</strong><br/>
We prove continuity of a controlled SDE solution in Skorokhod’s $M_{1}$ and $J_{1}$ topologies and also uniformly, in probability, as a nonlinear functional of the control strategy. The functional comes from a finance problem to model price impact of a large investor in an illiquid market. We show that $M_{1}$-continuity is the key to ensure that proceeds and wealth processes from (self-financing) càdlàg trading strategies are determined as the continuous extensions for those from continuous strategies. We demonstrate by examples how continuity properties are useful to solve different stochastic control problems on optimal liquidation and to identify asymptotically realizable proceeds.
</p>projecteuclid.org/euclid.bj/1551862845_20190306040120Wed, 06 Mar 2019 04:01 ESTConvergence rates for a class of estimators based on Stein’s methodhttps://projecteuclid.org/euclid.bj/1551862846<strong>Chris J. Oates</strong>, <strong>Jon Cockayne</strong>, <strong>François-Xavier Briol</strong>, <strong>Mark Girolami</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1141--1159.</p><p><strong>Abstract:</strong><br/>
Gradient information on the sampling distribution can be used to reduce the variance of Monte Carlo estimators via Stein’s method. An important application is that of estimating an expectation of a test function along the sample path of a Markov chain, where gradient information enables convergence rate improvement at the cost of a linear system which must be solved. The contribution of this paper is to establish theoretical bounds on convergence rates for a class of estimators based on Stein’s method. Our analysis accounts for (i) the degree of smoothness of the sampling distribution and test function, (ii) the dimension of the state space, and (iii) the case of non-independent samples arising from a Markov chain. These results provide insight into the rapid convergence of gradient-based estimators observed for low-dimensional problems, as well as clarifying a curse-of-dimension that appears inherent to such methods.
</p>projecteuclid.org/euclid.bj/1551862846_20190306040120Wed, 06 Mar 2019 04:01 ESTMallows and generalized Mallows model for matchingshttps://projecteuclid.org/euclid.bj/1551862847<strong>Ekhine Irurozki</strong>, <strong>Borja Calvo</strong>, <strong>Jose A. Lozano</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1160--1188.</p><p><strong>Abstract:</strong><br/>
The Mallows and Generalized Mallows Models are two of the most popular probability models for distributions on permutations. In this paper, we consider both models under the Hamming distance. These models can be seen as models for matchings instead of models for rankings. These models cannot be factorized, which contrasts with the popular MM and GMM under Kendall’s-$\tau$ and Cayley distances. In order to overcome the computational issues that the models involve, we introduce a novel method for computing the partition function. By adapting this method we can compute the expectation, joint and conditional probabilities. All these methods are the basis for three sampling algorithms, which we propose and analyze. Moreover, we also propose a learning algorithm. All the algorithms are analyzed both theoretically and empirically, using synthetic and real data from the context of e-learning and Massive Open Online Courses (MOOC).
</p>projecteuclid.org/euclid.bj/1551862847_20190306040120Wed, 06 Mar 2019 04:01 ESTStable limit theorems for empirical processes under conditional neighborhood dependencehttps://projecteuclid.org/euclid.bj/1551862848<strong>Ji Hyung Lee</strong>, <strong>Kyungchul Song</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1189--1224.</p><p><strong>Abstract:</strong><br/>
This paper introduces a new concept of stochastic dependence among many random variables which we call conditional neighborhood dependence (CND). Suppose that there is a set of random variables and a set of sigma algebras where both sets are indexed by the same set endowed with a neighborhood system. When the set of random variables satisfies CND, any two non-adjacent sets of random variables are conditionally independent given sigma algebras having indices in one of the two sets’ neighborhood. Random variables with CND include those with conditional dependency graphs and a class of Markov random fields with a global Markov property. The CND property is useful for modeling cross-sectional dependence governed by a complex, large network. This paper provides two main results. The first result is a stable central limit theorem for a sum of random variables with CND. The second result is a Donsker-type result of stable convergence of empirical processes indexed by a class of functions satisfying a certain bracketing entropy condition when the random variables satisfy CND.
</p>projecteuclid.org/euclid.bj/1551862848_20190306040120Wed, 06 Mar 2019 04:01 ESTOracle inequalities for high-dimensional predictionhttps://projecteuclid.org/euclid.bj/1551862849<strong>Johannes Lederer</strong>, <strong>Lu Yu</strong>, <strong>Irina Gaynanova</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1225--1255.</p><p><strong>Abstract:</strong><br/>
The abundance of high-dimensional data in the modern sciences has generated tremendous interest in penalized estimators such as the lasso, scaled lasso, square-root lasso, elastic net, and many others. In this paper, we establish a general oracle inequality for prediction in high-dimensional linear regression with such methods. Since the proof relies only on convexity and continuity arguments, the result holds irrespective of the design matrix and applies to a wide range of penalized estimators. Overall, the bound demonstrates that generic estimators can provide consistent prediction with any design matrix. From a practical point of view, the bound can help to identify the potential of specific estimators, and it can help to get a sense of the prediction accuracy in a given application.
</p>projecteuclid.org/euclid.bj/1551862849_20190306040120Wed, 06 Mar 2019 04:01 ESTTruncated random measureshttps://projecteuclid.org/euclid.bj/1551862850<strong>Trevor Campbell</strong>, <strong>Jonathan H. Huggins</strong>, <strong>Jonathan P. How</strong>, <strong>Tamara Broderick</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1256--1288.</p><p><strong>Abstract:</strong><br/>
Completely random measures (CRMs) and their normalizations are a rich source of Bayesian nonparametric priors. Examples include the beta, gamma, and Dirichlet processes. In this paper, we detail two major classes of sequential CRM representations, namely series representations and superposition representations, within which we organize both novel and existing sequential representations that can be used for simulation and posterior inference. These two classes and their constituent representations subsume existing ones that have previously been developed in an ad hoc manner for specific processes. Since a complete infinite-dimensional CRM cannot be used explicitly for computation, sequential representations are often truncated for tractability. We provide truncation error analyses for each type of sequential representation, as well as their normalized versions, thereby generalizing and improving upon existing truncation error bounds in the literature. We analyze the computational complexity of the sequential representations, which in conjunction with our error bounds allows us to directly compare representations and discuss their relative efficiency. We include numerous applications of our theoretical results to commonly-used (normalized) CRMs, demonstrating that our results enable a straightforward representation and analysis of CRMs that has not previously been available in a Bayesian nonparametric context.
</p>projecteuclid.org/euclid.bj/1551862850_20190306040120Wed, 06 Mar 2019 04:01 ESTMinimax optimal estimation in partially linear additive models under high dimensionhttps://projecteuclid.org/euclid.bj/1551862851<strong>Zhuqing Yu</strong>, <strong>Michael Levine</strong>, <strong>Guang Cheng</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1289--1325.</p><p><strong>Abstract:</strong><br/>
In this paper, we derive minimax rates for estimating both parametric and nonparametric components in partially linear additive models with high dimensional sparse vectors and smooth functional components. The minimax lower bound for Euclidean components is the typical sparse estimation rate that is independent of nonparametric smoothness indices. However, the minimax lower bound for each component function exhibits an interplay between the dimensionality and sparsity of the parametric component and the smoothness of the relevant nonparametric component. Indeed, the minimax risk for smooth nonparametric estimation can be slowed down to the sparse estimation rate whenever the smoothness of the nonparametric component or dimensionality of the parametric component is sufficiently large. In the above setting, we demonstrate that penalized least squares estimators can nearly achieve minimax lower bounds.
</p>projecteuclid.org/euclid.bj/1551862851_20190306040120Wed, 06 Mar 2019 04:01 ESTStrong Gaussian approximation of the mixture Rasch modelhttps://projecteuclid.org/euclid.bj/1551862852<strong>Friedrich Liese</strong>, <strong>Alexander Meister</strong>, <strong>Johanna Kappus</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1326--1354.</p><p><strong>Abstract:</strong><br/>
We consider the famous Rasch model, which is applied to psychometric surveys when $n$ persons under test answer $m$ questions. The score is given by a realization of a random binary $n\times m$-matrix. Its $(j,k)$th component indicates whether or not the answer of the $j$th person to the $k$th question is correct. In the mixture Rasch model, one assumes that the persons are chosen randomly from a population. We prove that the mixture Rasch model is asymptotically equivalent to a Gaussian observation scheme in Le Cam’s sense as $n$ tends to infinity and $m$ is allowed to increase slowly in $n$. For that purpose, we show a general result on strong Gaussian approximation of the sum of independent high-dimensional binary random vectors. As a first application, we construct an asymptotic confidence region for the difficulty parameters of the questions.
</p>projecteuclid.org/euclid.bj/1551862852_20190306040120Wed, 06 Mar 2019 04:01 ESTTime-frequency analysis of locally stationary Hawkes processeshttps://projecteuclid.org/euclid.bj/1551862853<strong>François Roueff</strong>, <strong>Rainer von Sachs</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1355--1385.</p><p><strong>Abstract:</strong><br/>
Locally stationary Hawkes processes have been introduced in order to generalise classical Hawkes processes away from stationarity by allowing for a time-varying second-order structure. This class of self-exciting point processes has recently attracted a lot of interest in applications in the life sciences (seismology, genomics, neuro-science, …), but also in the modeling of high-frequency financial data. In this contribution, we provide a fully developed nonparametric estimation theory of both local mean density and local Bartlett spectra of a locally stationary Hawkes process. In particular, we apply our kernel estimation of the spectrum localised both in time and frequency to two data sets of transaction times revealing pertinent features in the data that had not been made visible by classical non-localised approaches based on models with constant fertility functions over time.
</p>projecteuclid.org/euclid.bj/1551862853_20190306040120Wed, 06 Mar 2019 04:01 ESTQuenched central limit theorem rates of convergence for one-dimensional random walks in random environmentshttps://projecteuclid.org/euclid.bj/1551862854<strong>Sung Won Ahn</strong>, <strong>Jonathon Peterson</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1386--1411.</p><p><strong>Abstract:</strong><br/>
Unlike classical simple random walks, one-dimensional random walks in random environments (RWRE) are known to have a wide array of potential limiting distributions. Under certain assumptions, however, it is known that CLT-like limiting distributions hold for the walk under both the quenched and averaged measures. We give upper bounds on the rates of convergence for the quenched central limit theorems for both the hitting time and position of the RWRE with polynomial rates of convergence that depend on the distribution on environments.
</p>projecteuclid.org/euclid.bj/1551862854_20190306040120Wed, 06 Mar 2019 04:01 ESTFrom random partitions to fractional Brownian sheetshttps://projecteuclid.org/euclid.bj/1551862855<strong>Olivier Durieu</strong>, <strong>Yizao Wang</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1412--1450.</p><p><strong>Abstract:</strong><br/>
We propose discrete random-field models that are based on random partitions of $\mathbb{N}^{2}$. The covariance structure of each random field is determined by the underlying random partition. Functional central limit theorems are established for the proposed models, and fractional Brownian sheets, with full range of Hurst indices, arise in the limit. Our models could be viewed as discrete analogues of fractional Brownian sheets, in the same spirit that the simple random walk is the discrete analogue of the Brownian motion.
</p>projecteuclid.org/euclid.bj/1551862855_20190306040120Wed, 06 Mar 2019 04:01 ESTA Bernstein-type inequality for functions of bounded interactionhttps://projecteuclid.org/euclid.bj/1551862856<strong>Andreas Maurer</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1451--1471.</p><p><strong>Abstract:</strong><br/>
We give a distribution-dependent concentration inequality for functions of independent variables. The result extends Bernstein’s inequality from sums to more general functions, whose variation in any argument does not depend too much on the other arguments. Applications sharpen existing bounds for U-statistics and the generalization error of regularized least squares.
</p>projecteuclid.org/euclid.bj/1551862856_20190306040120Wed, 06 Mar 2019 04:01 ESTAn extreme-value approach for testing the equality of large U-statistic based correlation matriceshttps://projecteuclid.org/euclid.bj/1551862857<strong>Cheng Zhou</strong>, <strong>Fang Han</strong>, <strong>Xin-Sheng Zhang</strong>, <strong>Han Liu</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1472--1503.</p><p><strong>Abstract:</strong><br/>
There has been an increasing interest in testing the equality of large Pearson’s correlation matrices. However, in many applications it is more important to test the equality of large rank-based correlation matrices since they are more robust to outliers and nonlinearity. Unlike the Pearson case, testing the equality of large rank-based statistics has not been well explored and requires us to develop new methods and theory. In this paper, we provide a framework for testing the equality of two large U-statistic based correlation matrices, which include the rank-based correlation matrices as special cases. Our approach exploits extreme value statistics and the Jackknife estimator for uncertainty assessment and is valid under a fully nonparametric model. Theoretically, we develop a theory for testing the equality of U-statistic based correlation matrices. We then apply this theory to study the problem of testing large Kendall’s tau correlation matrices and demonstrate its optimality. For proving this optimality, a novel construction of least favorable distributions is developed for the correlation matrix comparison.
</p>projecteuclid.org/euclid.bj/1551862857_20190306040120Wed, 06 Mar 2019 04:01 ESTNumerically stable online estimation of variance in particle filtershttps://projecteuclid.org/euclid.bj/1551862858<strong>Jimmy Olsson</strong>, <strong>Randal Douc</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1504--1535.</p><p><strong>Abstract:</strong><br/>
This paper discusses variance estimation in sequential Monte Carlo methods, alternatively termed particle filters. The variance estimator that we propose is a natural modification of that suggested by H.P. Chan and T.L. Lai [Ann. Statist. 41 (2013) 2877–2904], which allows the variance to be estimated in a single run of the particle filter by tracing the genealogical history of the particles. However, due to particle lineage degeneracy, the estimator of the mentioned work becomes numerically unstable as the number of sequential particle updates increases. Thus, by tracing only a part of the particles’ genealogy rather than the full one, our estimator gains long-term numerical stability at the cost of a bias. The scope of the genealogical tracing is regulated by a lag, and under mild, easily checked model assumptions, we prove that the bias tends to zero geometrically fast as the lag increases. As confirmed by our numerical results, this allows the bias to be tightly controlled also for moderate particle sample sizes.
</p>projecteuclid.org/euclid.bj/1551862858_20190306040120Wed, 06 Mar 2019 04:01 ESTNew tests of uniformity on the compact classical groups as diagnostics for weak-$^{*}$ mixing of Markov chainshttps://projecteuclid.org/euclid.bj/1551862859<strong>Amir Sepehri</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1536--1567.</p><p><strong>Abstract:</strong><br/>
This paper introduces two new families of non-parametric tests of goodness-of-fit on the compact classical groups. One of them is a family of tests for the eigenvalue distribution induced by the uniform distribution, which is consistent against all fixed alternatives. The other is a family of tests for the uniform distribution on the entire group, which is again consistent against all fixed alternatives. The construction of these tests heavily employs facts and techniques from the representation theory of compact groups. In particular, new Cauchy identities are derived and proved for the characters of compact classical groups, in order to accommodate the computation of the test statistic. We find the asymptotic distribution under the null and general alternatives. The tests are proved to be asymptotically admissible. Local power is derived and the global properties of the power function against local alternatives are explored.
The new tests are validated on two random walks for which the mixing time has been studied in the literature. The new tests, and several others, are applied to the Markov chain sampler proposed by Jones, Osipov and Rokhlin [Proc. Natl. Acad. Sci. 108 (2011) 15679–15686], providing strong evidence supporting the claim that the sampler mixes quickly.
</p>projecteuclid.org/euclid.bj/1551862859_20190306040120Wed, 06 Mar 2019 04:01 ESTMacroscopic analysis of determinantal random ballshttps://projecteuclid.org/euclid.bj/1551862860<strong>Jean-Christophe Breton</strong>, <strong>Adrien Clarenne</strong>, <strong>Renan Gobard</strong>. <p><strong>Source: </strong>Bernoulli, Volume 25, Number 2, 1568--1601.</p><p><strong>Abstract:</strong><br/>
We consider a collection of Euclidean random balls in $\mathbb{R}^{d}$ generated by a determinantal point process, which induces inhibitory interaction among the balls. We study this model at a macroscopic level obtained by zooming out, and three different regimes – Gaussian, Poissonian and stable – are exhibited, as in the Poissonian model without interaction. This shows that the macroscopic behaviour erases the interactions induced by the determinantal point process.
</p>projecteuclid.org/euclid.bj/1551862860_20190306040120Wed, 06 Mar 2019 04:01 EST