Volume 28 Issue 2 | Bernoulli

Bernoulli

VOL. 28 · NO. 2 | May 2022

Frontmatter

Editorial Board

Bernoulli 28 (2), (May 2022)

No abstract available

Articles

De-biasing the lasso with degrees-of-freedom adjustment

Pierre C. Bellec, Cun-Hui Zhang

Bernoulli 28 (2), 713-743, (May 2022) DOI: 10.3150/21-BEJ1348

KEYWORDS: statistical inference, Lasso, Semiparametric model, Fisher information, efficiency, Confidence interval, p-value, regression, High-dimensional data

This paper studies schemes to de-bias the Lasso in sparse linear regression with Gaussian design where the goal is to estimate and construct confidence intervals for a low-dimensional projection of the unknown coefficient vector in a preconceived direction ${\boldsymbol{a}_{0}}$ . Our analysis reveals that previously analyzed propositions to de-bias the Lasso require a modification in order to enjoy nominal coverage and asymptotic efficiency in a full range of the level of sparsity. This modification takes the form of a degrees-of-freedom adjustment that accounts for the dimension of the model selected by the Lasso. The degrees-of-freedom adjustment (a) preserves the success of de-biasing methodologies in regimes where previous proposals were successful, and (b) repairs the nominal coverage and provides efficiency in regimes where previous proposals produce spurious inferences and provably fail to achieve the nominal coverage. Hence our theoretical and simulation results call for the implementation of this degrees-of-freedom adjustment in de-biasing methodologies.

Let ${s_{0}}$ denote the number of nonzero coefficients of the true coefficient vector and Σ the population Gram matrix. The unadjusted de-biasing scheme may fail to achieve the nominal coverage as soon as ${s_{0}}\ggg {n^{2/3}}$ if Σ is known. If Σ is unknown, the degrees-of-freedom adjustment grants efficiency for the contrast in a general direction ${\boldsymbol{a}_{0}}$ when

$\frac{{s_{0}}\log p}{n}+\min \Big\{\frac{{s_{\Omega }}\log p}{n},\frac{\| {\Sigma ^{-1}}{\boldsymbol{a}_{0}}\| _{1}\sqrt{\log p}}{\| {\Sigma ^{-1/2}}{\boldsymbol{a}_{0}}\| _{2}\sqrt{n}}\Big\}+\frac{\min ({s_{\Omega }},{s_{0}})\log p}{\sqrt{n}}\to 0$

where ${s_{\Omega }}=\| {\Sigma ^{-1}}{\boldsymbol{a}_{0}}\| _{0}$. The dependence in ${s_{0}},{s_{\Omega }}$ and $\| {\Sigma ^{-1}}{\boldsymbol{a}_{0}}\| _{1}$ is optimal and closes a gap in previous upper and lower bounds. Our construction of the estimated score vector provides a novel methodology to handle dense directions ${\boldsymbol{a}_{0}}$.

Beyond the degrees-of-freedom adjustment, our proof techniques yield a sharp ${\ell _{\infty }}$ error bound for the Lasso which is of independent interest.

Model-free bootstrap for a general class of stationary time series

Yiren Wang, Dimitris N. Politis

Bernoulli 28 (2), 744-770, (May 2022) DOI: 10.3150/21-BEJ1352

KEYWORDS: m-approximation, bootstrap validity, Transformation function, confidence intervals

Statistical deconvolution of the free Fokker-Planck equation at fixed time

Mylène Maïda, Tien Dat Nguyen, Thanh Mai Pham Ngoc, Vincent Rivoirard, Viet Chi Tran

Bernoulli 28 (2), 771-802, (May 2022) DOI: 10.3150/21-BEJ1366

KEYWORDS: 35Q62, 65M32, 62G05, 46L53, 35R30, 60B20, 46L54

We are interested in reconstructing the initial condition of a non-linear partial differential equation (PDE), namely the Fokker-Planck equation, from the observation of a Dyson Brownian motion at a given time $t>0$. The Fokker-Planck equation describes the evolution of electrostatic repulsive particle systems, and can be seen as the large particle limit of correctly renormalized Dyson Brownian motions. The solution of the Fokker-Planck equation can be written as the free convolution of the initial condition and the semi-circular distribution. We propose a nonparametric estimator for the initial condition obtained by performing the free deconvolution via the subordination functions method. This statistical estimator is original as it involves the resolution of a fixed point equation, and a classical deconvolution by a Cauchy distribution. This is due to the fact that, in free probability, the analogue of the Fourier transform is the R-transform, related to the Cauchy transform. In past literature, there has been a focus on the estimation of the initial conditions of linear PDEs such as the heat equation, but to the best of our knowledge, this is the first time that the problem is tackled for a non-linear PDE. The convergence of the estimator is proved and the integrated mean square error is computed, providing rates of convergence similar to the ones known for non-parametric deconvolution methods. Finally, a simulation study illustrates the good performances of our estimator.

Empirical process of concomitants for partly categorial data and applications in statistics

Daniel Gaigall, Julian Gerstenberg, Thi Thu Ha Trinh

Bernoulli 28 (2), 803-829, (May 2022) DOI: 10.3150/21-BEJ1367

KEYWORDS: Categorial variable, concomitant, induced order statistic, empirical process, Independence test, two-sample test, bootstrap, triangular array, local alternatives

Martingale Wasserstein inequality for probability measures in the convex order

Benjamin Jourdain, William Margheriti

Bernoulli 28 (2), 830-858, (May 2022) DOI: 10.3150/21-BEJ1368

KEYWORDS: Convex order, Martingale optimal transport, Wasserstein distance, martingale couplings

It was shown by the authors that two one-dimensional probability measures in the convex order admit a martingale coupling with respect to which the integral of $|x-y|$ is smaller than twice their ${\mathcal{W}_{1}}$ -distance (Wasserstein distance with index 1). We showed that replacing $|x-y|$ and ${\mathcal{W}_{1}}$ respectively with $|x-y{|^{\rho }}$ and ${\mathcal{W}_{\rho }^{\rho }}$ does not lead to a finite multiplicative constant. We show here that a finite constant is recovered when replacing ${\mathcal{W}_{\rho }^{\rho }}$ with the product of ${\mathcal{W}_{\rho }}$ times the centred ρ-th moment of the second marginal to the power $\rho -1$ . Then we study the generalisation of this new martingale Wasserstein inequality to higher dimension.

Convergence rates of two-component MCMC samplers

Qian Qin, Galin L. Jones

Bernoulli 28 (2), 859-885, (May 2022) DOI: 10.3150/21-BEJ1369

KEYWORDS: Deterministic-scan, geometric ergodicity, Gibbs, Metropolis-within-Gibbs, random-scan

Component-wise MCMC algorithms, including Gibbs and conditional Metropolis-Hastings samplers, are commonly used for sampling from multivariate probability distributions. A long-standing question regarding Gibbs algorithms is whether a deterministic-scan (systematic-scan) sampler converges faster than its random-scan counterpart. We answer this question when the samplers involve two components by establishing an exact quantitative relationship between the ${L^{2}}$ convergence rates of the two samplers. The relationship shows that the deterministic-scan sampler converges faster. We also establish qualitative relations among the convergence rates of two-component Gibbs samplers and some conditional Metropolis-Hastings variants. For instance, it is shown that if some two-component conditional Metropolis-Hastings samplers are geometrically ergodic, then so are the associated Gibbs samplers.

Local elliptic law

Johannes Alt, Torben Krüger

Bernoulli 28 (2), 886-909, (May 2022) DOI: 10.3150/21-BEJ1370

KEYWORDS: Local law, eigenvector delocalisation, elliptic ensemble, matrix Dyson equation

Learning with tree tensor networks: Complexity estimates and model selection

Bertrand Michel, Anthony Nouy

Bernoulli 28 (2), 910-936, (May 2022) DOI: 10.3150/21-BEJ1371

KEYWORDS: Tensor networks, Statistical learning, Metric entropy, Model selection, minimax adaptive

In this paper, we propose and analyze a model selection method for tree tensor networks in an empirical risk minimization framework and analyze its performance over a wide range of smoothness classes. Tree tensor networks, or tree-based tensor formats, are prominent model classes for the approximation of high-dimensional functions in numerical analysis and data science. They correspond to sum-product neural networks with a sparse connectivity associated with a dimension partition tree T, widths given by a tuple r of tensor ranks, and multilinear activation functions (or units). The approximation power of these model classes has been proved to be optimal (or near to optimal) for classical smoothness classes. However, in an empirical risk minimization framework with a limited number of observations, the dimension tree T and ranks r should be selected carefully to balance estimation and approximation errors. In this paper, we propose a complexity-based model selection strategy à la Barron, Birgé, Massart. Given a family of model classes associated with different trees, ranks, tensor product feature spaces and sparsity patterns for sparse tensor networks, a model is selected by minimizing a penalized empirical risk, with a penalty depending on the complexity of the model class. After deriving bounds of the metric entropy of tree tensor networks with bounded parameters, we deduce a form of the penalty from bounds on suprema of empirical processes. This choice of penalty yields a risk bound for the predictor associated with the selected model. In a least-squares setting, after deriving fast rates of convergence of the risk, we show that the proposed strategy is (near to) minimax adaptive to a wide range of smoothness classes including Sobolev or Besov spaces (with isotropic, anisotropic or mixed dominating smoothness) and analytic functions. We discuss the role of sparsity of the tensor network for obtaining optimal performance in several regimes. 
In practice, the amplitude of the penalty is calibrated with a slope heuristics method. Numerical experiments in a least-squares regression setting illustrate the performance of the strategy for the approximation of multivariate functions and univariate functions identified with tensors by tensorization (quantization).

Central limit theorem and self-normalized Cramér-type moderate deviation for Euler-Maruyama scheme

Jianya Lu, Yuzhen Tan, Lihu Xu

Bernoulli 28 (2), 937-964, (May 2022) DOI: 10.3150/21-BEJ1372

KEYWORDS: Stochastic differential equation, Euler-Maruyama scheme, central limit theorem, self-normalized Cramér-type moderate deviation, Stein’s method

We consider a stochastic differential equation and its Euler-Maruyama (EM) scheme. Under some appropriate conditions, they both admit a unique invariant measure, denoted by π and ${\pi _{\eta }}$ respectively (η is the step size of the EM scheme). We construct an empirical measure ${\Pi _{\eta }}$ of the EM scheme as a statistic of ${\pi _{\eta }}$, and use Stein’s method developed in Fang, Shao and Xu (Probab. Theory Related Fields 174 (2019) 945–979) to prove a central limit theorem for ${\Pi _{\eta }}$. The proof of the self-normalized Cramér-type moderate deviation (SNCMD) is based on a standard Markov chain decomposition, splitting ${\eta ^{-1/2}}({\Pi _{\eta }}(\cdot )-\pi (\cdot ))$ into a sum of martingale differences ${\mathcal{H}_{\eta }}$ and a negligible remainder ${\mathcal{R}_{\eta }}$. We handle ${\mathcal{H}_{\eta }}$ by the time-change technique for martingales, and prove that ${\mathcal{R}_{\eta }}$ is exponentially negligible by concentration inequalities, which are of independent interest. Moreover, we show that SNCMD holds for $x=o({\eta ^{-1/6}})$, which has the same order as that of the classical results in Shao (J. Theoret. Probab. 12 (1999) 385–398) and Jing, Shao and Wang (Ann. Probab. 31 (2003) 2167–2215).
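
As a rough illustration of the objects in this abstract, the sketch below runs an Euler-Maruyama chain with step size η and averages a test function along the path, which is one plausible reading of an empirical measure of the EM scheme. The Ornstein-Uhlenbeck drift, the test function, the run length and the function name `em_empirical_mean` are all assumptions made for this example; the paper's exact construction and normalisation of ${\Pi _{\eta }}$ may differ.

```python
import math
import random


def em_empirical_mean(h, drift, sigma, eta, n_steps, x0=0.0, seed=0):
    """Average h along an Euler-Maruyama path of dX = drift(X) dt + sigma dW.

    Hypothetical helper for illustration only; eta is the EM step size.
    """
    rng = random.Random(seed)
    x = x0
    total = 0.0
    for _ in range(n_steps):
        total += h(x)
        # One EM step: x + eta * b(x) + sigma * sqrt(eta) * N(0, 1)
        x = x + eta * drift(x) + sigma * math.sqrt(eta) * rng.gauss(0.0, 1.0)
    return total / n_steps


# Assumed toy model: Ornstein-Uhlenbeck, dX = -X dt + sqrt(2) dW, with pi = N(0, 1).
eta = 0.1
estimate = em_empirical_mean(
    h=lambda x: x * x,
    drift=lambda x: -x,
    sigma=math.sqrt(2.0),
    eta=eta,
    n_steps=200_000,
)
# For this linear model the EM chain is an AR(1) process whose stationary
# variance is 1 / (1 - eta / 2), slightly larger than the variance 1 under pi.
stationary_variance_em = 1.0 / (1.0 - eta / 2.0)
```

In this toy case the gap between `stationary_variance_em` and 1 is the discretisation bias between ${\pi _{\eta }}$ and π, while the fluctuation of `estimate` around `stationary_variance_em` is the kind of statistical error that a central limit theorem for the empirical measure describes.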

On the measure of anchored Gaussian simplices, with applications to multivariate medians

Davy Paindaveine

Bernoulli 28 (2), 965-996, (May 2022) DOI: 10.3150/21-BEJ1373 Open Access

KEYWORDS: distributional identities, Mellin transforms, Multivariate medians, Oja median, Random simplices, spatial median, Stochastic geometry

We consider anchored Gaussian ℓ-simplices in the d-dimensional Euclidean space, that is, simplices with one fixed vertex $y\in {\mathbb{R}^{d}}$ and the remaining vertices ${X_{1}},\dots ,{X_{\ell }}$ randomly sampled from the d-variate standard normal distribution. We determine the distribution of the measure of such simplices for any d, any ℓ, and any anchor point y, which is of interest, e.g., when studying the asymptotic behaviour of U-statistics based on such simplex measures. We provide two proofs of the results. The first one is short but is not self-contained as it crucially relies on a technical result for non-central Wishart distributions. The second one is a simple and self-contained proof, that also provides some geometric insight on the results. Quite nicely, variations on this second argument reveal intriguing distributional identities on products of central and non-central chi-square distributions with Beta-distributed non-centrality parameters. We independently establish these distributional identities by making use of Mellin transforms. Beyond the aforementioned use to study the asymptotic behaviour of some U-statistics, our results do find natural applications in the context of robust location estimation, as we illustrate by considering a class of simplex-based multivariate medians that contains the celebrated spatial median and Oja median as special cases. Throughout, our results are confirmed by numerical experiments.

Inference in latent factor regression with clusterable features

Xin Bing, Florentina Bunea, Marten Wegkamp

Bernoulli 28 (2), 997-1020, (May 2022) DOI: 10.3150/21-BEJ1374

KEYWORDS: high dimensional regression, latent factor model, Identification, uniform inference, minimax estimation, pure variables, post clustering inference/regression, adaptive estimation

Regression models, in which the observed features $X\in {\mathbb{R}^{p}}$ and the response $Y\in \mathbb{R}$ depend, jointly, on a lower dimensional, unobserved, latent vector $Z\in {\mathbb{R}^{K}}$ , with $K\ll p$ , are popular in a large array of applications, and mainly used for predicting a response from correlated features. In contrast, methodology and theory for inference on the regression coefficient $\beta \in {\mathbb{R}^{K}}$ relating Y to Z are scarce, since typically the un-observable factor Z is hard to interpret. Furthermore, the determination of the asymptotic variance of an estimator of β is a long-standing problem, with solutions known only in a few particular cases.

To address some of these outstanding questions, we develop inferential tools for β in a class of factor regression models in which the observed features are signed mixtures of the latent factors. The model specifications are both practically desirable, in a large array of applications, render interpretability to the components of Z, and are sufficient for parameter identifiability.

Without assuming that the number of latent factors K or the structure of the mixture is known in advance, we construct computationally efficient estimators of β, along with estimators of other important model parameters. We benchmark the rate of convergence of β by first establishing its ${\ell _{2}}$-norm minimax lower bound, and show that our proposed estimator $\widehat{\beta }$ is minimax-rate adaptive. Our main contribution is the provision of a unified analysis of the component-wise Gaussian asymptotic distribution of $\widehat{\beta }$ and, especially, the derivation of a closed form expression of its asymptotic variance, together with consistent variance estimators. The resulting inferential tools can be used when both K and p are independent of the sample size n, and also when both, or either, p and K vary with n, while allowing for $p>n$. This complements the only asymptotic normality results obtained for a particular case of the model under consideration, in the regime $K=O(1)$ and $p\to \infty$, but without a variance estimate.

As an application, we provide, within our model specifications, a statistical platform for inference in regression on latent cluster centers, thereby increasing the scope of our theoretical results.

We benchmark the newly developed methodology on a recently collected data set for the study of the effectiveness of a new SIV vaccine. Our analysis enables the determination of the top latent antibody-centric mechanisms associated with the vaccine response.

Joint inference on extreme expectiles for multivariate heavy-tailed distributions

Simone A. Padoan, Gilles Stupfler

Bernoulli 28 (2), 1021-1048, (May 2022) DOI: 10.3150/21-BEJ1375

KEYWORDS: expectiles, Extremal dependence, heavy tails, joint convergence, joint inference, tail copula, testing

Asymptotically efficient estimators for stochastic blockmodels: The naive MLE, the rank-constrained MLE, and the spectral estimator

Minh Tang, Joshua Cape, Carey E. Priebe

Bernoulli 28 (2), 1049-1073, (May 2022) DOI: 10.3150/21-BEJ1376

KEYWORDS: Asymptotic efficiency, random dot product graph, stochastic blockmodels, asymptotic normality, spectral embedding

We establish asymptotic normality results for estimation of the block probability matrix B in stochastic blockmodel graphs using spectral embedding when the average degrees grows at the rate of $\omega (\sqrt{n})$ in n, the number of vertices. As a corollary, we show that when B is of full-rank, estimates of B obtained from spectral embedding are asymptotically efficient. When B is singular the estimates obtained from spectral embedding can have smaller mean square error than those obtained from maximizing the log-likelihood under no rank assumption, and furthermore, can be almost as efficient as the true MLE that assumes the rank of B is known. Our results indicate, in the context of stochastic blockmodel graphs, that spectral embedding is not just computationally tractable, but that the resulting estimates are also admissible, even when compared to the purportedly optimal but computationally intractable maximum likelihood estimation under no rank assumption.

Oracle lower bounds for stochastic gradient sampling algorithms

Niladri S. Chatterji, Peter L. Bartlett, Philip M. Long

Bernoulli 28 (2), 1074-1092, (May 2022) DOI: 10.3150/21-BEJ1377

KEYWORDS: Sampling lower bounds, information theoretic lower bounds, Markov chain Monte Carlo, stochastic gradient Monte Carlo

We consider the problem of sampling from a strongly log-concave density in ${\mathbb{R}^{d}}$ , and prove an information theoretic lower bound on the number of stochastic gradient queries of the log density needed. Several popular sampling algorithms (including many Markov chain Monte Carlo methods) operate by using stochastic gradients of the log density to generate a sample; our results establish an information theoretic limit for all these algorithms.

We show that for every algorithm, there exists a well-conditioned strongly log-concave target density for which the distribution of points generated by the algorithm would be at least ε away from the target in total variation distance if the number of gradient queries is less than $\Omega ({\sigma ^{2}}d/{\varepsilon ^{2}})$ , where ${\sigma ^{2}}d$ is the variance of the stochastic gradient. Our lower bound follows by combining the ideas of Le Cam deficiency routinely used in the comparison of statistical experiments along with standard information theoretic tools used in lower bounding Bayes risk functions. To the best of our knowledge our results provide the first nontrivial dimension-dependent lower bound for this problem.

Rates and coverage for monotone densities using projection-posterior

Moumita Chakraborty, Subhashis Ghosal

Bernoulli 28 (2), 1093-1119, (May 2022) DOI: 10.3150/21-BEJ1379

KEYWORDS: monotone density, contraction rate, Bayesian test for monotonicity, credible interval, coverage

We consider Bayesian inference for a monotone density on the unit interval and study the resulting asymptotic properties. We consider a “projection-posterior” approach, where we construct a prior on density functions through random histograms without imposing the monotonicity constraint, but induce a random distribution by projecting a sample from the posterior on the space of monotone functions. The approach allows us to retain posterior conjugacy, allowing explicit expressions extremely useful for studying asymptotic properties. We show that the projection-posterior contracts at the optimal ${n^{-1/3}}$ -rate. We then construct a consistent test based on the posterior distribution for testing the hypothesis of monotonicity. Finally, we obtain the limiting coverage of a projection-posterior credible interval for the value of the function at an interior point. Interestingly, the limiting coverage turns out to be higher than the nominal credibility level, the opposite of the undercoverage phenomenon observed in a smoothness regime. Moreover, we show that a recalibration method using a lower credibility level gives an intended limiting coverage. We also discuss extensions of the obtained results for densities on the half-line. We conduct a simulation study to demonstrate the accuracy of the asymptotic results in finite samples.

Minimax estimation of norms of a probability density: I. Lower bounds

Alexander Goldenshluger, Oleg V. Lepski

Bernoulli 28 (2), 1120-1154, (May 2022) DOI: 10.3150/21-BEJ1380

KEYWORDS: estimation of nonlinear functionals, minimax estimation, minimax risk, anisotropic Nikolskii’s class, Best approximation

The paper deals with the problem of nonparametric estimation of the ${\mathbb{L}_{p}}$–norm, $p\in (1,\infty )$, of a probability density on ${\mathbb{R}^{d}}$, $d\ge 1$, from independent observations. The unknown density is assumed to belong to a ball in the anisotropic Nikolskii space. We adopt the minimax approach, and derive lower bounds on the minimax risk. In particular, we demonstrate that the accuracy of estimation procedures essentially depends on whether p is an integer or not. Moreover, we develop a general technique for the derivation of lower bounds on the minimax risk in problems of estimating nonlinear functionals. The proposed technique is applicable to a broad class of nonlinear functionals, and it is used to derive the lower bounds in the ${\mathbb{L}_{p}}$–norm estimation.

Minimax estimation of norms of a probability density: II. Rate-optimal estimation procedures

Alexander Goldenshluger, Oleg V. Lepski

Bernoulli 28 (2), 1155-1178, (May 2022) DOI: 10.3150/21-BEJ1381

KEYWORDS: Density estimation, minimax risk, Lp-norm, U-statistics, anisotropic Nikol’skii class

In this paper we develop rate–optimal estimation procedures in the problem of estimating the ${\mathbb{L}_{p}}$ –norm, $p\in (1,\infty )$ of a probability density from independent observations. The density is assumed to be defined on ${\mathbb{R}^{d}}$ , $d\ge 1$ and to belong to a ball in the anisotropic Nikolskii space. We adopt the minimax approach and construct rate–optimal estimators in the case of integer $p\ge 2$ . We demonstrate that, depending on the parameters of the Nikolskii class and the norm index p, the minimax rates of convergence may vary from inconsistency to the parametric $\sqrt{n}$ –estimation. The results in this paper complement the minimax lower bounds derived in the companion paper (Goldenshluger and Lepski (2020)).

Local minimax rates for closeness testing of discrete distributions

Joseph Lam-Weil, Alexandra Carpentier, Bharath K. Sriperumbudur

Bernoulli 28 (2), 1179-1197, (May 2022) DOI: 10.3150/21-BEJ1382

KEYWORDS: Local minimax optimality, closeness testing, two-sample, instance optimal, discrete distributions, Hypothesis testing, composite-composite testing

We consider the closeness testing problem for discrete distributions. The goal is to distinguish whether two samples are drawn from the same unspecified distribution, or whether their respective distributions are separated in ${L_{1}}$ -norm. In this paper, we focus on adapting the rate to the shape of the underlying distributions, i.e. we consider a local minimax setting. We provide, to the best of our knowledge, the first local minimax rate for the separation distance up to logarithmic factors, together with a test that achieves it. In view of the rate, closeness testing turns out to be substantially harder than the related one-sample testing problem over a wide range of cases.

Paving property for real stable polynomials and strongly Rayleigh processes

Kasra Alishahi, Milad Barzegar

Bernoulli 28 (2), 1198-1223, (May 2022) DOI: 10.3150/21-BEJ1383

KEYWORDS: Paving property, real stable polynomials, strongly Rayleigh processes

Non-asymptotic properties of spectral decomposition of large Gram-type matrices and applications

Lyuou Zhang, Wen Zhou, Haonan Wang

Bernoulli 28 (2), 1224-1249, (May 2022) DOI: 10.3150/21-BEJ1384

KEYWORDS: approximate factor model, Gram-type matrices, high-dimensional time series, non-asymptotic analysis, Principal Component Analysis, spectral decomposition

Nonparametric regression for locally stationary random fields under stochastic sampling design

Daisuke Kurisu

Bernoulli 28 (2), 1250-1275, (May 2022) DOI: 10.3150/21-BEJ1385

KEYWORDS: Nonparametric regression, locally stationary random field, irregularly spaced data, Additive model, Lévy-driven moving average random field

In this study, we develop an asymptotic theory of nonparametric regression for locally stationary random fields (LSRFs) $\{{\boldsymbol{X}_{\boldsymbol{s},{A_{n}}}}:\boldsymbol{s}\in {R_{n}}\}$ in ${\mathbb{R}^{p}}$ observed at irregularly spaced locations in ${R_{n}}={[0,{A_{n}}]^{d}}\subset {\mathbb{R}^{d}}$ . We first derive the uniform convergence rate of general kernel estimators, followed by the asymptotic normality of an estimator for the mean function of the model. Moreover, we consider additive models to avoid the curse of dimensionality arising from the dependence of the convergence rate of estimators on the number of covariates. Subsequently, we derive the uniform convergence rate and joint asymptotic normality of the estimators for additive functions. We also introduce approximately ${m_{n}}$ -dependent RFs to provide examples of LSRFs. We find that these RFs include a wide class of Lévy-driven moving average RFs.

A Cramér–Wold device for infinite divisibility of ${\mathbb{Z}^{d}}$-valued distributions

David Berger, Alexander Lindner

Bernoulli 28 (2), 1276-1283, (May 2022) DOI: 10.3150/21-BEJ1386

KEYWORDS: Cramér–Wold device, infinitely divisible distribution, quasi-infinitely divisible distribution, signed Lévy measure

We show that a Cramér–Wold device holds for infinite divisibility of ${\mathbb{Z}^{d}}$ -valued distributions, i.e. that the distribution of a ${\mathbb{Z}^{d}}$ -valued random vector X is infinitely divisible if and only if the distribution of ${a^{T}}X$ is infinitely divisible for all $a\in {\mathbb{R}^{d}}$ , and that this in turn is equivalent to infinite divisibility of the distribution of ${a^{T}}X$ for all $a\in {\mathbb{N}_{0}^{d}}$ . A key tool for proving this is a Lévy–Khintchine type representation with a signed Lévy measure for the characteristic function of a ${\mathbb{Z}^{d}}$ -valued distribution, provided the characteristic function is zero-free.

Adaptive Bayesian density estimation in sup-norm

Zacharie Naulet

Bernoulli 28 (2), 1284-1308, (May 2022) DOI: 10.3150/21-BEJ1387

KEYWORDS: Bayesian density estimation, supremum norm, Adaptation

We investigate the problem of deriving adaptive posterior rates of contraction on ${\mathbb{L}_{\infty }}$ balls in density estimation. Although it is known that log-density priors can achieve optimal rates when the true density is sufficiently smooth, adaptive rates were still to be proven. Here we establish that the so-called spike-and-slab prior can achieve adaptive and optimal posterior contraction rates. Along the way, we prove a generic ${\mathbb{L}_{\infty }}$ contraction result for log-density priors with independent wavelet coefficients. Interestingly, our approach is different from previous works on ${\mathbb{L}_{\infty }}$ contraction and is reminiscent of the classical test-based approach used in Bayesian nonparametrics. Moreover, we require no lower bound on the smoothness of the true density, albeit the rates are deteriorated by an extra $\log (n)$ factor in the case of low smoothness.

Markov-modulated generalized Ornstein-Uhlenbeck processes and an application in risk theory

Anita Behme, Apostolos Sideris

Bernoulli 28 (2), 1309-1339, (May 2022) DOI: 10.3150/21-BEJ1389

KEYWORDS: exponential functional, generalized Ornstein-Uhlenbeck process, Lévy process, Markov additive process, Markov-modulated random recurrence equation, Markov-switching model, Risk theory, ruin probability, stationary process

Symmetric inclusion process with slow boundary: Hydrodynamics and hydrostatics

Chiara Franceschini, Patrícia Gonçalves, Federico Sau

Bernoulli 28 (2), 1340-1381, (May 2022) DOI: 10.3150/21-BEJ1390

KEYWORDS: symmetric inclusion process, slow boundary, Hydrodynamic limit, hydrostatic limit, Non-equilibrium steady state, duality for Markov processes, second class particles

Empirical variance minimization with applications in variance reduction and optimal control

Denis Belomestny, Leonid Iosipoi, Quentin Paris, Nikita Zhivotovskiy

Bernoulli 28 (2), 1382-1407, (May 2022) DOI: 10.3150/21-BEJ1392

KEYWORDS: Empirical variance minimization, variance reduction, control variates, optimal control

Defective Galton-Watson processes in a varying environment

Götz Kersting, Carmen Minuesa

Bernoulli 28 (2), 1408-1431, (May 2022) DOI: 10.3150/21-BEJ1393

KEYWORDS: branching process, varying environment, defective distribution, absorption, family tree

We study an extension of the so-called defective Galton-Watson processes obtained by allowing the offspring distribution to change over the generations. Thus, in these processes, the individuals reproduce independently of the others and in accordance to some possibly defective offspring distribution depending on the generation. Moreover, the defect $1-{f_{n}}(1)$ of the offspring distribution at generation n represents the probability that the process hits an absorbing state Δ at that generation. We focus on the asymptotic behaviour of these processes. We establish the almost sure convergence of the process to a random variable with values in ${\mathbb{N}_{0}}\cup \{\Delta \}$ and we provide two characterisations of the duality extinction-absorption at Δ. We also state some results on the absorption time and the properties of the process conditioned upon its non-absorption, some of which require us to introduce the notion of defective branching trees in varying environment.

A note on the phase transition for independent alignment percolation

Marcelo Hilário, Daniel Ungaretti

Bernoulli 28 (2), 1432-1447, (May 2022) DOI: 10.3150/21-BEJ1395

KEYWORDS: percolation, renormalization, phase transition

We study the independent alignment percolation model on ${\mathbb{Z}^{d}}$ introduced by Beaton, Grimmett and Holmes. It is a model for random intersecting line segments defined as follows. First the sites of ${\mathbb{Z}^{d}}$ are independently declared occupied with probability p and vacant otherwise. Conditional on the configuration of occupied vertices, consider the set of all line segments that are parallel to the coordinate axis, whose extremes are occupied vertices and that do not traverse any other occupied vertex. Declare independently the segments on this set open with probability λ and closed otherwise. All the edges that lie on open segments are also declared open giving rise to a bond percolation model in ${\mathbb{Z}^{d}}$. We show that for any $d\ge 2$ and $p\in (0,1]$ the critical value for λ satisfies ${\lambda _{c}}(p)<1$ completing the proof that the phase transition is non-trivial over the whole interval $(0,1]$. We also show that the critical curve $p\mapsto {\lambda _{c}}(p)$ is continuous at $p=1$.
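
The construction described in this abstract can be sketched in a few lines. The code below is a minimal illustration restricted to a finite n × n box in dimension d = 2; the function name `alignment_percolation`, the finite box and the treatment of the boundary (segments simply stop at the last occupied site on each line) are assumptions of this sketch, not part of the paper.

```python
import random


def alignment_percolation(n, p, lam, seed=0):
    """Sample the model on the box {0..n-1}^2 (hypothetical helper, d = 2 only).

    Returns the set of occupied sites and the set of open nearest-neighbour edges.
    """
    rng = random.Random(seed)
    # Step 1: each site is occupied independently with probability p.
    occupied = {(x, y) for x in range(n) for y in range(n) if rng.random() < p}
    open_edges = set()
    for axis in (0, 1):          # 0: horizontal lines, 1: vertical lines
        for c in range(n):       # c is the fixed coordinate of the line
            if axis == 0:
                line = [x for x in range(n) if (x, c) in occupied]
            else:
                line = [y for y in range(n) if (c, y) in occupied]
            # Step 2: consecutive occupied sites on a line span a segment that
            # traverses no other occupied site; it is open with probability lam.
            for a, b in zip(line, line[1:]):
                if rng.random() < lam:
                    # Step 3: every unit edge on an open segment is open.
                    for t in range(a, b):
                        if axis == 0:
                            open_edges.add(((t, c), (t + 1, c)))
                        else:
                            open_edges.add(((c, t), (c, t + 1)))
    return occupied, open_edges


occupied, open_edges = alignment_percolation(n=20, p=0.3, lam=0.6, seed=1)
```

When p = 1 every segment is a single edge, so the sketch reduces to ordinary independent bond percolation with parameter λ on the box.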

Spectral equivalence of Gaussian random functions: Operator approach

Alexander Nazarov, Yakov Nikitin

Bernoulli 28 (2), 1448-1460, (May 2022) DOI: 10.3150/21-BEJ1396

KEYWORDS: Gaussian random functions, identity in law, spectral equivalence, tensor product, Brownian sheet

Posterior probabilities: Nonmonotonicity, asymptotic rates, log-concavity, and Turán’s inequality

Sergiu Hart, Yosef Rinott

Bernoulli 28 (2), 1461-1490, (May 2022) DOI: 10.3150/21-BEJ1398

KEYWORDS: Bayesian analysis, stochastic and likelihood ratio orders, sequential observations, expected posteriors, Unimodality, Legendre polynomials, exponential families

In the standard Bayesian framework data are assumed to be generated by a distribution parametrized by θ in a parameter space Θ, over which a prior distribution π is given. A Bayesian statistician quantifies the belief that the true parameter is ${\theta _{0}}$ in Θ by its posterior probability given the observed data. We investigate the behavior of the posterior belief in ${\theta _{0}}$ when the data are generated under some parameter ${\theta _{1}}$ , which may or may not be the same as ${\theta _{0}}$ . Starting from stochastic orders, specifically, likelihood ratio dominance, that obtain for resulting distributions of posteriors, we consider monotonicity properties of the posterior probabilities as a function of the sample size when data arrive sequentially. While the ${\theta _{0}}$ -posterior is monotonically increasing (i.e., it is a submartingale) when the data are generated under that same ${\theta _{0}}$ , it need not be monotonically decreasing in general, not even in terms of its overall expectation, when the data are generated under a different ${\theta _{1}}$ . In fact, it may keep going up and down many times, even in simple cases such as iid coin tosses. We obtain precise asymptotic rates when the data come from the wide class of exponential families of distributions; these rates imply in particular that the expectation of the ${\theta _{0}}$ -posterior under ${\theta _{1}}\ne {\theta _{0}}$ is eventually strictly decreasing. Finally, we show that in a number of interesting cases this expectation is a log-concave function of the sample size, and thus unimodal. In the Bernoulli case we obtain this result by developing an inequality that is related to Turán’s inequality for Legendre polynomials.
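
For readers who want to experiment with the coin-toss case mentioned in this abstract, the sketch below computes the expected posterior probability of a parameter exactly, as a function of the sample size. The finite-support prior, the example parameter values and the function name `expected_posterior` are assumptions chosen for illustration; they are not taken from the paper.

```python
from math import comb


def expected_posterior(n, thetas, weights, target, theta_true):
    """Expected posterior mass of thetas[target] after n iid coin tosses.

    thetas: candidate head probabilities in (0, 1); weights: prior masses.
    The expectation is taken under head probability theta_true.
    Hypothetical helper for illustration only.
    """
    total = 0.0
    for k in range(n + 1):  # k = number of heads observed
        likes = [t**k * (1 - t) ** (n - k) for t in thetas]
        marginal = sum(w * l for w, l in zip(weights, likes))
        posterior = weights[target] * likes[target] / marginal
        prob_k = comb(n, k) * theta_true**k * (1 - theta_true) ** (n - k)
        total += prob_k * posterior
    return total


# Assumed example: three-point prior, tracking the posterior of theta_0 = 0.5.
thetas = [0.3, 0.5, 0.7]
weights = [0.2, 0.5, 0.3]
under_theta0 = [expected_posterior(n, thetas, weights, 1, 0.5) for n in range(40)]
under_theta1 = [expected_posterior(n, thetas, weights, 1, 0.7) for n in range(40)]
```

The list `under_theta0` is nondecreasing, in line with the submartingale property stated in the abstract. With a two-point prior the expectation under the other parameter is always nonincreasing, because the two posterior probabilities sum to one, so any non-monotone behaviour of the kind described above needs a prior with at least three support points; `under_theta1` can be inspected for that.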

Convergence of jump processes with stochastic intensity to Brownian motion with inert drift

Clayton Barnes

Bernoulli 28 (2), 1491-1518, (May 2022) DOI: 10.3150/21-BEJ1399

KEYWORDS: Brownian motion, discrete approximation, Random walk, Local time

Backmatter

Table of Contents

Bernoulli 28 (2), (May 2022)

No abstract available
