Bayesian Analysis Articles (Project Euclid)
http://projecteuclid.org/euclid.ba
The latest articles from Bayesian Analysis on Project Euclid, a site for mathematics and statistics resources.
Language: en-us. Copyright 2012 Cornell University Library. Contact: Euclid-L@cornell.edu (Project Euclid Team).
Last build date: Wed, 13 Jun 2012 14:27 EDT
Logo: http://projecteuclid.org/collection/euclid/images/logo_linking_100.gif (Project Euclid)
http://projecteuclid.org/
Separable covariance arrays via the Tucker product, with applications to multivariate relational data
http://projecteuclid.org/euclid.ba/1339612040
Peter D. Hoff. Source: Bayesian Analysis, Volume 6, Number 2, 179--196.
Abstract:
Modern datasets are often in the form of matrices or arrays, potentially having
correlations along each set of data indices. For example, data involving repeated
measurements of several variables over time may exhibit temporal correlation as well as
correlation among the variables. A possible model for matrix-valued data is the class of
matrix normal distributions, which is parametrized by two covariance matrices, one for
each index set of the data. In this article we discuss an extension of the matrix normal
model to accommodate multidimensional data arrays, or tensors. We show how a particular
array-matrix product can be used to generate the class of array normal distributions
having separable covariance structure. We derive some properties of these covariance
structures and the corresponding array normal distributions, and show how the array-matrix
product can be used to define a semi-conjugate prior distribution and calculate the
corresponding posterior distribution. We illustrate the methodology in an analysis of
multivariate longitudinal network data which take the form of a four-way array.
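As an illustrative aside (not the authors' code), the separable covariance structure underlying the matrix normal model can be sketched in a few lines: a draw Y = M + A Z B^T, with Cholesky factors A A^T = Sigma_row and B B^T = Sigma_col and Z an array of i.i.d. standard normals, has cov(vec(Y)) = Sigma_col ⊗ Sigma_row. All matrices below are made-up examples.

```python
import numpy as np

rng = np.random.default_rng(0)

def rmatnorm(M, Sig_row, Sig_col, rng):
    """Draw Y ~ MatrixNormal(M, Sig_row, Sig_col), i.e. cov(vec(Y)) = Sig_col kron Sig_row."""
    A = np.linalg.cholesky(Sig_row)   # factor for the row covariance
    B = np.linalg.cholesky(Sig_col)   # factor for the column covariance
    Z = rng.standard_normal(M.shape)  # i.i.d. N(0, 1) entries
    return M + A @ Z @ B.T

# Example: a 3 x 2 matrix with correlated rows and columns
Sr = np.array([[1.0, 0.5, 0.2],
               [0.5, 1.0, 0.5],
               [0.2, 0.5, 1.0]])
Sc = np.array([[1.0, 0.8],
               [0.8, 1.0]])
Y = rmatnorm(np.zeros((3, 2)), Sr, Sc, rng)
```

The identity vec(AZB^T) = (B ⊗ A) vec(Z) is what makes the Kronecker (separable) covariance appear, and the same construction extends index by index to the array-normal case discussed in the abstract.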
Published: Wed, 13 Jun 2012 14:27 EDT

Bayesian Functional Data Modeling for Heterogeneous Volatility
http://projecteuclid.org/euclid.ba/1461603846
Bin Zhu, David B. Dunson. Source: Bayesian Analysis, Volume 12, Number 2, 335--350.
Abstract:
Although there are many methods for functional data analysis, less emphasis is put on characterizing variability among volatilities of individual functions. In particular, certain individuals exhibit erratic swings in their trajectory while other individuals have more stable trajectories. There is evidence of such volatility heterogeneity in blood pressure trajectories during pregnancy, for example, and reason to suspect that volatility is a biologically important feature. Most functional data analysis models implicitly assume similar or identical smoothness of the individual functions, and hence can lead to misleading inferences on volatility and an inadequate representation of the functions. We propose a novel class of functional data analysis models characterized using hierarchical stochastic differential equations. We model the derivatives of a mean function and deviation functions using Gaussian processes, while also allowing covariate dependence including on the volatilities of the deviation functions. Following a Bayesian approach to inference, a Markov chain Monte Carlo algorithm is used for posterior computation. The methods are tested on simulated data and applied to blood pressure trajectories during pregnancy.
Published: Wed, 08 Mar 2017 04:01 EST

Latent Space Approaches to Community Detection in Dynamic Networks
http://projecteuclid.org/euclid.ba/1461603847
Daniel K. Sewell, Yuguo Chen. Source: Bayesian Analysis, Volume 12, Number 2, 351--377.
Abstract:
Embedding dyadic data into a latent space has long been a popular approach to modeling networks of all kinds. While clustering has been done using this approach for static networks, this paper gives two methods of community detection within dynamic network data, building upon the distance and projection models previously proposed in the literature. Our proposed approaches capture the time-varying aspect of the data, can model directed or undirected edges, inherently incorporate transitivity and account for each actor’s individual propensity to form edges. We provide Bayesian estimation algorithms, and apply these methods to a ranked dynamic friendship network and world export/import data.
Published: Wed, 08 Mar 2017 04:01 EST

Dependent Species Sampling Models for Spatial Density Estimation
http://projecteuclid.org/euclid.ba/1462297334
Seongil Jo, Jaeyong Lee, Peter Müller, Fernando A. Quintana, Lorenzo Trippa. Source: Bayesian Analysis, Volume 12, Number 2, 379--406.
Abstract:
We consider a novel Bayesian nonparametric model for density estimation with an underlying spatial structure. The model is built on a class of species sampling models, which are discrete random probability measures that can be represented as a mixture of random support points and random weights. Specifically, we construct a collection of spatially dependent species sampling models and propose a mixture model based on this collection. The key idea is the introduction of spatial dependence by modeling the weights through a conditional autoregressive model. We present an extensive simulation study to compare the performance of the proposed model with competitors. The proposed model compares favorably to these alternatives. We apply the method to the estimation of summer precipitation density functions using Climate Prediction Center Merged Analysis of Precipitation data over East Asia.
Published: Wed, 08 Mar 2017 04:01 EST

A Hierarchical Bayesian Setting for an Inverse Problem in Linear Parabolic PDEs with Noisy Boundary Conditions
http://projecteuclid.org/euclid.ba/1463078272
Fabrizio Ruggeri, Zaid Sawlan, Marco Scavino, Raul Tempone. Source: Bayesian Analysis, Volume 12, Number 2, 407--433.
Abstract:
In this work we develop a Bayesian setting to infer unknown parameters in initial-boundary value problems related to linear parabolic partial differential equations. We realistically assume that the boundary data are noisy, for a given prescribed initial condition. We show how to derive the joint likelihood function for the forward problem, given some measurements of the solution field subject to Gaussian noise. Given Gaussian priors for the time-dependent Dirichlet boundary values, we analytically marginalize the joint likelihood using the linearity of the equation. Our hierarchical Bayesian approach is fully implemented in an example that involves the heat equation. In this example, the thermal diffusivity is the unknown parameter. We assume that the thermal diffusivity parameter can be modeled a priori through a lognormal random variable or by means of a space-dependent stationary lognormal random field. Synthetic data are used to test the inference. We exploit the behavior of the non-normalized log posterior distribution of the thermal diffusivity. Then, we use the Laplace method to obtain an approximated Gaussian posterior and therefore avoid costly Markov Chain Monte Carlo computations. Expected information gains and predictive posterior densities for observable quantities are numerically estimated using Laplace approximation for different experimental setups.
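As a generic illustration of the Laplace method mentioned above (the target density and all numbers here are hypothetical stand-ins, not the paper's posterior for the thermal diffusivity): locate the mode of the log posterior numerically, then use the negative log-posterior curvature at the mode as the inverse variance of a Gaussian approximation.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical 1-D unnormalized log posterior for a positive parameter:
# a Gamma(shape=5, rate=2) density up to an additive constant.
def neg_log_post(theta):
    return -((5 - 1) * np.log(theta) - 2 * theta)

# Step 1: find the posterior mode (here, analytically (5-1)/2 = 2)
res = minimize_scalar(neg_log_post, bounds=(1e-6, 20), method="bounded")
mode = res.x

# Step 2: curvature at the mode via a central finite difference;
# the Laplace approximation is N(mode, 1/curvature)
h = 1e-4
hess = (neg_log_post(mode + h) - 2 * neg_log_post(mode) + neg_log_post(mode - h)) / h**2
laplace_var = 1.0 / hess
```

The appeal, as the abstract notes, is that expectations under the Gaussian approximation replace costly MCMC runs.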
Published: Wed, 08 Mar 2017 04:01 EST

Bayesian Inference for Diffusion-Driven Mixed-Effects Models
http://projecteuclid.org/euclid.ba/1464035697
Gavin A. Whitaker, Andrew Golightly, Richard J. Boys, Chris Sherlock. Source: Bayesian Analysis, Volume 12, Number 2, 435--463.
Abstract:
Stochastic differential equations (SDEs) provide a natural framework for modelling intrinsic stochasticity inherent in many continuous-time physical processes. When such processes are observed in multiple individuals or experimental units, SDE driven mixed-effects models allow the quantification of both between and within individual variation. Performing Bayesian inference for such models using discrete-time data that may be incomplete and subject to measurement error is a challenging problem and is the focus of this paper. We extend a recently proposed MCMC scheme to include the SDE driven mixed-effects framework. Fundamental to our approach is the development of a novel construct that allows for efficient sampling of conditioned SDEs that may exhibit nonlinear dynamics between observation times. We apply the resulting scheme to synthetic data generated from a simple SDE model of orange tree growth, and real data on aphid numbers recorded under a variety of different treatment regimes. In addition, we provide a systematic comparison of our approach with an inference scheme based on a tractable approximation of the SDE, that is, the linear noise approximation.
Published: Wed, 08 Mar 2017 04:01 EST

Automated Parameter Blocking for Efficient Markov Chain Monte Carlo Sampling
http://projecteuclid.org/euclid.ba/1464266500
Daniel Turek, Perry de Valpine, Christopher J. Paciorek, Clifford Anderson-Bergman. Source: Bayesian Analysis, Volume 12, Number 2, 465--490.
Abstract:
Markov chain Monte Carlo (MCMC) sampling is an important and commonly used tool for the analysis of hierarchical models. Nevertheless, practitioners generally have two options for MCMC: utilize existing software that generates a black-box “one size fits all” algorithm, or undertake the challenging (and time-consuming) task of implementing a problem-specific MCMC algorithm. Either choice may result in inefficient sampling, and hence researchers have become accustomed to MCMC runtimes on the order of days (or longer) for large models. We propose an automated procedure to determine an efficient MCMC block-sampling algorithm for a given model and computing platform. Our procedure dynamically determines blocks of parameters for joint sampling that result in efficient MCMC sampling of the entire model. We test this procedure using a diverse suite of example models, and observe non-trivial improvements in MCMC efficiency for many models. Our procedure is the first attempt at such automation, and may be generalized to a broader space of MCMC algorithms. Our results suggest that substantive improvements in MCMC efficiency may be practically realized using our automated blocking procedure, or variants thereof, which warrants additional study and application.
Published: Wed, 08 Mar 2017 04:01 EST

Dynamic Chain Graph Models for Time Series Network Data
http://projecteuclid.org/euclid.ba/1466165926
Osvaldo Anacleto, Catriona Queen. Source: Bayesian Analysis, Volume 12, Number 2, 491--509.
Abstract:
This paper introduces a new class of Bayesian dynamic models for inference and forecasting in high-dimensional time series observed on networks. The new model, called the dynamic chain graph model, is suitable for multivariate time series which exhibit symmetries within subsets of series and a causal drive mechanism between these subsets. The model can accommodate high-dimensional, non-linear and non-normal time series and enables local and parallel computation by decomposing the multivariate problem into separate, simpler sub-problems of lower dimensions. The advantages of the new model are illustrated by forecasting traffic network flows and also modelling gene expression data from transcriptional networks.
Published: Wed, 08 Mar 2017 04:01 EST

Mixtures of $g$-priors for analysis of variance models with a diverging number of parameters
http://projecteuclid.org/euclid.ba/1467722664
Min Wang. Source: Bayesian Analysis, Volume 12, Number 2, 511--532.
Abstract:
We consider Bayesian approaches for the hypothesis testing problem in the analysis-of-variance (ANOVA) models. With the aid of the singular value decomposition of the centered design matrix, we reparameterize the ANOVA models with linear constraints for uniqueness into a standard linear regression model without any constraint. We derive the Bayes factors based on mixtures of $g$-priors and study their consistency properties with a growing number of parameters. It is shown that two commonly used hyper-priors on $g$ (the Zellner-Siow prior and the beta-prime prior) yield inconsistent Bayes factors due to the presence of an inconsistency region around the null model. We propose a new class of hyper-priors to avoid this inconsistency problem. Simulation studies on the two-way ANOVA models are conducted to compare the performance of the proposed procedures with that of some existing ones in the literature.
Published: Wed, 08 Mar 2017 04:01 EST

Data-Dependent Posterior Propriety of a Bayesian Beta-Binomial-Logit Model
http://projecteuclid.org/euclid.ba/1469021382
Hyungsuk Tak, Carl N. Morris. Source: Bayesian Analysis, Volume 12, Number 2, 533--555.
Abstract:
A Beta-Binomial-Logit model is a Beta-Binomial model with covariate information incorporated via a logistic regression. Posterior propriety of a Bayesian Beta-Binomial-Logit model can be data-dependent for improper hyper-prior distributions. Various researchers in the literature have unknowingly used improper posterior distributions or have given incorrect statements about posterior propriety because checking posterior propriety can be challenging due to the complicated functional form of a Beta-Binomial-Logit model. We derive data-dependent necessary and sufficient conditions for posterior propriety within a class of hyper-prior distributions that encompass those used in previous studies. When a posterior is improper due to improper hyper-prior distributions, we suggest using proper hyper-prior distributions that can mimic the behaviors of improper choices.
Published: Wed, 08 Mar 2017 04:01 EST

Variational Bayes for Functional Data Registration, Smoothing, and Prediction
http://projecteuclid.org/euclid.ba/1469553352
Cecilia Earls, Giles Hooker. Source: Bayesian Analysis, Volume 12, Number 2, 557--582.
Abstract:
We propose a model for functional data registration that extends current inferential capabilities for unregistered data by providing a flexible probabilistic framework that 1) allows for functional prediction in the context of registration and 2) can be adapted to include smoothing and registration in one model. The proposed inferential framework is a Bayesian hierarchical model where the registered functions are modeled as Gaussian processes. To address the computational demands of inference in high-dimensional Bayesian models, we propose an adapted form of the variational Bayes algorithm for approximate inference that performs similarly to Markov chain Monte Carlo (MCMC) sampling methods for well-defined problems. The efficiency of the adapted variational Bayes (AVB) algorithm allows variability in a predicted registered, warping, and unregistered function to be depicted separately via bootstrapping. Temperature data related to the El Niño phenomenon are used to demonstrate the unique inferential capabilities for prediction provided by this model.
Published: Wed, 08 Mar 2017 04:01 EST

High-Dimensional Bayesian Geostatistics
http://projecteuclid.org/euclid.ba/1494921642
Sudipto Banerjee. Source: Bayesian Analysis, Volume 12, Number 2, 583--614.
Abstract:
With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models infeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as “priors” for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity is $\sim n$ floating point operations (flops) per iteration, where $n$ is the number of spatial locations. We compare these methods and provide some insight into their methodological underpinnings.
Published: Tue, 23 May 2017 22:02 EDT

The Scaled Beta2 Distribution as a Robust Prior for Scales
http://projecteuclid.org/euclid.ba/1469553353
María-Eglée Pérez, Luis Raúl Pericchi, Isabel Cristina Ramírez. Source: Bayesian Analysis, Volume 12, Number 3, 615--637.
Abstract:
We put forward the Scaled Beta2 (SBeta2) as a flexible and tractable family for modeling scales in both hierarchical and non-hierarchical settings. Various sensible alternatives to the overuse of vague Inverted Gamma priors have been proposed, mainly for hierarchical models. Several of these alternatives are particular cases of the SBeta2 or can be well approximated by it. This family of distributions can be obtained in closed form as a Gamma scale mixture of Gamma distributions, as the Student distribution can be obtained as a Gamma scale mixture of Normal variables. Members of the SBeta2 family arise as intrinsic priors and as divergence based priors in diverse situations, hierarchical and non-hierarchical.
The SBeta2 family unifies and generalizes different proposals in the Bayesian literature, and has numerous theoretical and practical advantages: it is flexible, its members can be lighter-tailed than, as heavy-tailed as, or heavier-tailed than the half-Cauchy, and different behaviors at the origin can be modeled. It has the reciprocality property, i.e., if the variance parameter is in the family then the precision is as well. It is easy to simulate from, and can be embedded in a Gibbs sampling scheme. Though not conjugate, it is remarkably tractable: when coupled with a conditional Cauchy prior for locations, the marginal prior for locations can be found explicitly as proportional to known transcendental functions, and for integer values of the hyperparameters an analytical closed form exists. Furthermore, for specific choices of the hyperparameters, the marginal is found to be an explicit “horseshoe prior”, a class known to have excellent theoretical and practical properties. To our knowledge this is the first closed-form horseshoe prior obtained. We also show that for certain values of the hyperparameters the mixture of a Normal and a Scaled Beta2 distribution also gives a closed-form marginal.
Examples include robust normal and binomial hierarchical modeling and meta-analysis, with real and simulated data.
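A hedged sketch of how simple simulation is under one common parameterization of the SBeta2(p, q, b), namely the scaled beta-prime form. This parameterization is my assumption for illustration; the paper defines the exact form. A draw is the scale b times a ratio of independent Gamma variables, which is one way to realize the Gamma scale-mixture representation, and which also makes the reciprocality property transparent (1/X swaps p and q and inverts b).

```python
import numpy as np

rng = np.random.default_rng(0)

def rsbeta2(p, q, b, size, rng):
    """SBeta2(p, q, b) draws via the scaled beta-prime representation:
    X = b * G1 / G2 with independent G1 ~ Gamma(p, 1), G2 ~ Gamma(q, 1)."""
    g1 = rng.gamma(p, size=size)
    g2 = rng.gamma(q, size=size)
    return b * g1 / g2

# 100k draws from SBeta2(2, 3, 1); under this form E[X] = b * p / (q - 1) = 1
x = rsbeta2(p=2.0, q=3.0, b=1.0, size=100_000, rng=rng)

# Reciprocality: 1.0 / x is distributed as SBeta2(3, 2, 1) under the same form.
```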
Published: Thu, 25 May 2017 22:05 EDT

A Decision-Theoretic Comparison of Treatments to Resolve Air Leaks After Lung Surgery Based on Nonparametric Modeling
http://projecteuclid.org/euclid.ba/1469553354
Yanxun Xu, Peter F. Thall, Peter Müller, Mehran J. Reza. Source: Bayesian Analysis, Volume 12, Number 3, 639--652.
Abstract:
We propose a Bayesian nonparametric utility-based group sequential design for a randomized clinical trial to compare a gel sealant to standard care for resolving air leaks after pulmonary resection. Clinically, resolving air leaks in the days soon after surgery is highly important, since longer resolution time produces undesirable complications that require extended hospitalization. The problem of comparing treatments is complicated by the fact that the resolution time distributions are skewed and multi-modal, so using means is misleading. We address these challenges by assuming Bayesian nonparametric probability models for the resolution time distributions and basing the comparative test on weighted means. The weights are elicited as clinical utilities of the resolution times. The proposed design uses posterior expected utilities as group sequential test criteria. The procedure’s frequentist properties are studied by extensive simulations.
Published: Thu, 25 May 2017 22:05 EDT

Nonparametric Goodness of Fit via Cross-Validation Bayes Factors
http://projecteuclid.org/euclid.ba/1471454532
Jeffrey D. Hart, Taeryon Choi. Source: Bayesian Analysis, Volume 12, Number 3, 653--677.
Abstract:
A nonparametric Bayes procedure is proposed for testing the fit of a parametric model for a distribution. Alternatives to the parametric model are kernel density estimates. Data splitting makes it possible to use kernel estimates for this purpose in a Bayesian setting. A kernel estimate indexed by bandwidth is computed from one part of the data, a training set, and then used as a model for the rest of the data, a validation set. A Bayes factor is calculated from the validation set by comparing the marginal for the kernel model with the marginal for the parametric model of interest. A simulation study is used to investigate how large the training set should be, and examples involving astronomy and wind data are provided. A proof of Bayes consistency of the proposed test is also provided.
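The data-splitting scheme above can be sketched in a few lines. This is a simplified plug-in version for illustration only: the paper compares proper marginal likelihoods (integrating the parametric model over a prior), whereas here the parametric model simply uses training-set estimates, and the data are synthetic.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(0)

# Clearly non-normal data: a well-separated two-component mixture
x = np.concatenate([rng.normal(-3, 1, 300), rng.normal(3, 1, 300)])
rng.shuffle(x)
train, valid = x[:300], x[300:]          # training set and validation set

# Kernel "model" built from the training set (bandwidth by Scott's rule)
kde = gaussian_kde(train)

# Parametric null model: a normal with plug-in training-set estimates
mu, sigma = train.mean(), train.std(ddof=1)

# Compare log predictive scores on the validation set
log_score_kernel = np.sum(np.log(kde(valid)))
log_score_normal = np.sum(norm.logpdf(valid, mu, sigma))
log_bf = log_score_kernel - log_score_normal   # > 0 favors the kernel alternative
```

For this bimodal example the kernel model scores far better on the held-out half, mimicking the evidence against normality the Bayes factor would report.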
Published: Thu, 25 May 2017 22:05 EDT

Bayesian Mixture Models with Focused Clustering for Mixed Ordinal and Nominal Data
http://projecteuclid.org/euclid.ba/1471454533
Maria DeYoreo, Jerome P. Reiter, D. Sunshine Hillygus. Source: Bayesian Analysis, Volume 12, Number 3, 679--703.
Abstract:
In some contexts, mixture models can fit certain variables well at the expense of others in ways beyond the analyst’s control. For example, when the data include some variables with non-trivial amounts of missing values, the mixture model may fit the marginal distributions of the nearly and fully complete variables at the expense of the variables with high fractions of missing data. Motivated by this setting, we present a mixture model for mixed ordinal and nominal data that splits variables into two groups, focus variables and remainder variables. The model allows the analyst to specify a rich sub-model for the focus variables and a simpler sub-model for remainder variables, yet still capture associations among the variables. Using simulations, we illustrate advantages and limitations of focused clustering compared to mixture models that do not distinguish variables. We apply the model to handle missing values in an analysis of the 2012 American National Election Study, estimating relationships among voting behavior, ideology, and political party affiliation.
Published: Thu, 25 May 2017 22:05 EDT

Optimal Robustness Results for Relative Belief Inferences and the Relationship to Prior-Data Conflict
http://projecteuclid.org/euclid.ba/1473276256
Luai Al Labadi, Michael Evans. Source: Bayesian Analysis, Volume 12, Number 3, 705--728.
Abstract:
The robustness to the prior of Bayesian inference procedures based on a measure of statistical evidence is considered. These inferences are shown to have optimal properties with respect to robustness. Furthermore, a connection between robustness and prior-data conflict is established. In particular, the inferences are shown to be effectively robust when the choice of prior does not lead to prior-data conflict. When there is prior-data conflict, however, robustness may fail to hold.
Published: Thu, 25 May 2017 22:05 EDT

A Generalised Semiparametric Bayesian Fay–Herriot Model for Small Area Estimation Shrinking Both Means and Variances
http://projecteuclid.org/euclid.ba/1473276257
Silvia Polettini. Source: Bayesian Analysis, Volume 12, Number 3, 729--752.
Abstract:
In survey sampling, interest often lies in unplanned domains (or small areas), whose sample sizes may be too small to allow for accurate design-based inference. To improve the direct estimates by borrowing strength from similar domains, most small area methods rely on mixed effects regression models.
This contribution extends the well known Fay–Herriot model (Fay and Herriot, 1979) within a Bayesian approach in two directions. First, the default normality assumption for the random effects is replaced by a nonparametric specification using a Dirichlet process. Second, uncertainty on variances is explicitly introduced, recognizing the fact that they are actually estimated from survey data. The proposed approach shrinks variances as well as means, and accounts for all sources of uncertainty. Adopting a flexible model for the random effects makes it possible to accommodate outliers and to vary the borrowing of strength by identifying local neighbourhoods where the exchangeability assumption holds. Through application to real and simulated data, we investigate the performance of the proposed model in predicting the domain means under different distributional assumptions. We also focus on the construction of credible intervals for the area means, a topic that has received less attention in the literature. Frequentist properties such as mean squared prediction error (MSPE), coverage and interval length are investigated. The experiments performed seem to indicate that inferences under the proposed model are characterised by smaller mean squared error than competing approaches; frequentist coverage of the credible intervals is close to nominal.
Published: Thu, 25 May 2017 22:05 EDT

Selection of Tuning Parameters, Solution Paths and Standard Errors for Bayesian Lassos
http://projecteuclid.org/euclid.ba/1473276258
Vivekananda Roy, Sounak Chakraborty. Source: Bayesian Analysis, Volume 12, Number 3, 753--778.
Abstract:
Penalized regression methods such as the lasso and elastic net (EN) have become popular for simultaneous variable selection and coefficient estimation. Implementation of these methods requires selection of the penalty parameters. We propose an empirical Bayes (EB) methodology for selecting these tuning parameters as well as computation of the regularization path plots. The EB method does not suffer from the “double shrinkage problem” of frequentist EN. It also avoids the difficulty of constructing an appropriate prior on the penalty parameters. The EB methodology is implemented by an efficient importance sampling method based on multiple Gibbs sampler chains. Since the Markov chains underlying the Gibbs sampler are proved to be geometrically ergodic, the Markov chain central limit theorem can be used to provide asymptotically valid confidence bands for profiles of EN coefficients. The practical effectiveness of our method is illustrated by several simulation examples and two real life case studies. Although this article considers the lasso and EN for brevity, the proposed EB method is general and can be used to select shrinkage parameters in other regularization methods.
Published: Thu, 25 May 2017 22:05 EDT

Adaptive Shrinkage in Pólya Tree Type Models
http://projecteuclid.org/euclid.ba/1473276260
Li Ma. Source: Bayesian Analysis, Volume 12, Number 3, 779--805.
Abstract:
We introduce a hierarchical generalization to the Pólya tree that incorporates locally adaptive shrinkage to data features of different scales, while maintaining analytical simplicity and computational efficiency. Inference under the new model proceeds efficiently using general recipes for conjugate hierarchical models, and can be completed extremely efficiently for data sets with large numbers of observations. We illustrate in density estimation that the achieved adaptive shrinkage results in proper smoothing and substantially improves inference. We evaluate the performance of the model through simulation under several schematic scenarios carefully designed to be representative of a variety of applications. We compare its performance to that of the Pólya tree, the optional Pólya tree, and the Dirichlet process mixture. We then apply our method to a flow cytometry dataset with 455,472 observations to achieve fast estimation of a large number of univariate and multivariate densities, and investigate the computational properties of our method in that context. In addition, we establish theoretical guarantees for the model including absolute continuity, full nonparametricity, and posterior consistency. All proofs are given in the Supplementary Material (Ma, 2016).
Published: Thu, 25 May 2017 22:05 EDT

A Bayes Interpretation of Stacking for $\mathcal{M}$-Complete and $\mathcal{M}$-Open Settings
http://projecteuclid.org/euclid.ba/1473276261
Tri Le, Bertrand Clarke. Source: Bayesian Analysis, Volume 12, Number 3, 807--829.
Abstract:
In ${\mathcal{M}}$-open problems where no true model can be conceptualized, it is common to back off from modeling and merely seek good prediction. Even in ${\mathcal{M}}$-complete problems, taking a predictive approach can be very useful. Stacking is a model averaging procedure that gives a composite predictor by combining individual predictors from a list of models using weights that optimize a cross-validation criterion. We show that the stacking weights also asymptotically minimize a posterior expected loss. Hence we formally provide a Bayesian justification for cross-validation. Often the weights are constrained to be positive and sum to one. For greater generality, we omit the positivity constraint and relax the ‘sum to one’ constraint.
A key question is ‘What predictors should be in the average?’ We first verify that the stacking error depends only on the span of the models. Then we propose using bootstrap samples from the data to generate empirical basis elements that can be used to form models. We use this in two computed examples to give stacking predictors that are (i) data driven, (ii) optimal with respect to the number of component predictors, and (iii) optimal with respect to the weight each predictor gets.
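A minimal sketch of stacking with the relaxed constraints described above (the candidate models, data, and leave-one-out scheme here are illustrative choices, not the authors'): collect cross-validated predictions from each candidate model, then pick weights by unconstrained least squares against the held-out responses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: quadratic truth observed with noise
x = np.linspace(-2, 2, 80)
y = 1.0 + x**2 + rng.normal(0, 0.3, x.size)

def loo_predictions(design, y):
    """Leave-one-out predictions from an ordinary least-squares fit of y on `design`."""
    n = y.size
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        preds[i] = design[i] @ beta
    return preds

# Two candidate models: linear in x, and quadratic in x
designs = [np.column_stack([np.ones_like(x), x]),
           np.column_stack([np.ones_like(x), x, x**2])]
P = np.column_stack([loo_predictions(D, y) for D in designs])  # n x K CV predictions

# Stacking weights: unconstrained least squares of y on the CV predictions
# (no positivity or sum-to-one constraint, matching the relaxation above)
w, *_ = np.linalg.lstsq(P, y, rcond=None)
stacked = P @ w
```

By construction the stacked predictor's cross-validation error is no worse than that of any single candidate, since each candidate corresponds to a unit weight vector in the least-squares search space.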
</p>projecteuclid.org/euclid.ba/1473276261_20170525220502Thu, 25 May 2017 22:05 EDTLatent Class Mixture Models of Treatment Effect Heterogeneityhttp://projecteuclid.org/euclid.ba/1473362569<strong>Zach Shahn</strong>, <strong>David Madigan</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 831--854.</p><p><strong>Abstract:</strong><br/>
We provide a general Bayesian framework for modeling treatment effect heterogeneity in experiments with non-categorical outcomes. Our modeling approach incorporates latent class mixture components to capture discrete heterogeneity and regression interaction terms to capture continuous heterogeneity. Flexible error distributions allow robust posterior inference on parameters of interest. Hierarchical shrinkage priors on relevant parameters address multiple comparisons concerns. Leave-one-out cross validation estimates of expected posterior predictive density obtained through importance sampling, together with posterior predictive checks, provide a convenient method for model selection and evaluation. We apply our approach to a clinical trial comparing two HIV treatments and to an instrumental variable analysis of a natural experiment on the effect of Medicaid enrollment on emergency department utilization.
</p>projecteuclid.org/euclid.ba/1473362569_20170525220502Thu, 25 May 2017 22:05 EDTIntrinsic Bayesian Analysis for Occupancy Modelshttp://projecteuclid.org/euclid.ba/1473431536<strong>Daniel Taylor-Rodríguez</strong>, <strong>Andrew J. Womack</strong>, <strong>Claudio Fuentes</strong>, <strong>Nikolay Bliznyuk</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 855--877.</p><p><strong>Abstract:</strong><br/>
Occupancy models are typically used to determine the probability of a species being present at a given site while accounting for imperfect detection. The survey data underlying these models often include information on several predictors that could potentially characterize habitat suitability and species detectability. Because these variables might not all be relevant, model selection techniques are necessary in this context. In practice, model selection is performed using the Akaike Information Criterion (AIC), as few other alternatives are available. This paper builds an objective Bayesian variable selection framework for occupancy models through the intrinsic prior methodology. The procedure incorporates priors on the model space that account for test multiplicity and respect the polynomial hierarchy of the predictors when higher-order terms are considered. The methodology is implemented using a stochastic search algorithm that is able to thoroughly explore large spaces of occupancy models. The proposed strategy is entirely automatic and provides control of false positives without sacrificing the discovery of truly meaningful covariates. The performance of the method is evaluated and compared to AIC through a simulation study. The method is illustrated on two datasets previously studied in the literature.
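For context, the site-level likelihood that such selection frameworks build on is a zero-inflated Bernoulli. A minimal sketch, assuming a constant detection probability and ignoring the covariates, priors, and model-space structure discussed above:

```python
import numpy as np

def occupancy_loglik(y, psi, p):
    """Log-likelihood of one site's detection history y (0/1 array over
    repeat visits) under the basic occupancy model: the site is occupied
    with probability psi, and an occupied site is detected on each visit
    with probability p.  A site with no detections is either occupied but
    never detected, or simply unoccupied."""
    occupied = psi * np.prod(p ** y * (1 - p) ** (1 - y))
    empty = (1 - psi) * (0.0 if y.any() else 1.0)
    return np.log(occupied + empty)
```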
</p>projecteuclid.org/euclid.ba/1473431536_20170525220502Thu, 25 May 2017 22:05 EDTBayesian Analysis of Boundary and Near-Boundary Evidence in Econometric Models with Reduced Rankhttps://projecteuclid.org/euclid.ba/1501120970<strong>Nalan Baştürk</strong>, <strong>Lennart Hoogerheide</strong>, <strong>Herman K. van Dijk</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 879--917.</p><p><strong>Abstract:</strong><br/>
Weak empirical evidence near and at the boundary of the parameter region is a predominant feature in econometric models. Examples are macroeconometric models with weak information on the number of stable relations, microeconometric models measuring connectivity between variables with weak instruments, financial econometric models like the random walk with weak evidence on the efficient market hypothesis and factor models for investment policies with weak information on the number of unobserved factors. A Bayesian analysis is presented of the common issue in these models, which refers to the topic of a reduced rank. Reduced rank is a boundary issue and its effect on the shape of the posteriors of the equation system parameters with a reduced rank is explored systematically. These shapes refer to ridges due to weak identification, fat tails and multimodality. Discussing several alternative routes to construct regularization priors, we show that flat posterior surfaces are integrable even though the marginal posterior tends to infinity if the parameters tend to the values corresponding to local non-identification. We introduce a lasso type shrinkage prior combined with orthogonal normalization which restricts the range of the parameters in a plausible way. This can be combined with other shrinkage, smoothness and data based priors using training samples or dummy observations. Using such classes of priors, it is shown how conditional probabilities of evidence near and at the boundary can be evaluated effectively. These results allow for Bayesian inference using mixtures of posteriors under the boundary state and the near-boundary state. The approach is applied to the estimation of the education–income effect in all states of the US economy. The empirical results indicate that there exist substantial differences in this effect between almost all states. This may affect important national and state-wise policies on required length of education.
The use of the proposed approach may, in general, lead to more accurate forecasting and decision analysis in other problems in economics, finance and marketing.
</p>projecteuclid.org/euclid.ba/1501120970_20170829220126Tue, 29 Aug 2017 22:01 EDTA Bayesian Nonparametric Approach to Testing for Dependence Between Random Variableshttps://projecteuclid.org/euclid.ba/1474463236<strong>Sarah Filippi</strong>, <strong>Chris C. Holmes</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 919--938.</p><p><strong>Abstract:</strong><br/>
Nonparametric and nonlinear measures of statistical dependence between pairs of random variables are important tools in modern data analysis. In particular the emergence of large data sets can now support the relaxation of linearity assumptions implicit in traditional association scores such as correlation. Here we describe a Bayesian nonparametric procedure that leads to a tractable, explicit and analytic quantification of the relative evidence for dependence versus independence. Our approach uses Pólya tree priors on the space of probability measures which can then be embedded within a decision theoretic test for dependence. Pólya tree priors can accommodate known uncertainty in the form of the underlying sampling distribution and provide an explicit posterior probability measure of both dependence and independence. Well known advantages of having an explicit probability measure include: easy comparison of evidence across different studies; encoding prior information; quantifying changes in dependence across different experimental conditions; and the integration of results within formal decision analysis.
</p>projecteuclid.org/euclid.ba/1474463236_20171117220542Fri, 17 Nov 2017 22:05 ESTJoint Species Distribution Modeling: Dimension Reduction Using Dirichlet Processeshttps://projecteuclid.org/euclid.ba/1478073617<strong>Daniel Taylor-Rodríguez</strong>, <strong>Kimberly Kaufeld</strong>, <strong>Erin M. Schliep</strong>, <strong>James S. Clark</strong>, <strong>Alan E. Gelfand</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 939--967.</p><p><strong>Abstract:</strong><br/> Species distribution models are used to evaluate the variables that affect the distribution and abundance of species and to predict biodiversity. Historically, such models have been fitted to each species independently. While independent models can provide useful information regarding distribution and abundance, they ignore the fact that, after accounting for environmental covariates, residual interspecies dependence persists. With stacking of individual models, misleading behaviors may arise. In particular, individual models often imply too many species per location. Recently developed joint species distribution models have application to presence–absence, continuous or discrete abundance, abundance with large numbers of zeros, and discrete, ordinal, and compositional data. Here, we deal with the challenge of joint modeling for a large number of species. To appreciate the challenge in the simplest way, with just presence/absence (binary) response and say, $S$ species, we have an $S$ -way contingency table with $2^{S}$ cell probabilities. Even if $S$ is as small as $100$ , this is an enormous table, infeasible to work with without some structure to reduce dimension. We develop a computationally feasible approach to accommodate a large number of species (say order $10^{3}$ ) that allows us to: 1) assess the dependence structure across species; 2) identify clusters of species that have similar dependence patterns; and 3) jointly predict species distributions.
To do so, we build hierarchical models capturing dependence between species at the first or “data” stage rather than at a second or “mean” stage. We employ the Dirichlet process for clustering in a novel way to reduce dimension in the joint covariance structure. This last step makes computation tractable. We use Forest Inventory Analysis (FIA) data in the eastern region of the United States to demonstrate our method. It consists of presence–absence measurements for 112 tree species, observed east of the Mississippi. As a proof of concept for our dimension reduction approach, we also include simulations using continuous and binary data. </p>projecteuclid.org/euclid.ba/1478073617_20171117220542Fri, 17 Nov 2017 22:05 ESTVariable Selection in Seemingly Unrelated Regressions with Random Predictorshttps://projecteuclid.org/euclid.ba/1488855633<strong>David Puelz</strong>, <strong>P. Richard Hahn</strong>, <strong>Carlos M. Carvalho</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 969--989.</p><p><strong>Abstract:</strong><br/>
This paper considers linear model selection when the response is vector-valued and the predictors, either all or some, are randomly observed. We propose a new approach that decouples statistical inference from the selection step in a “post-inference model summarization” strategy. We study the impact of predictor uncertainty on the model selection procedure. The method is demonstrated through an application to asset pricing.
</p>projecteuclid.org/euclid.ba/1488855633_20171117220542Fri, 17 Nov 2017 22:05 ESTApproximate Bayesian Inference in Semiparametric Copula Modelshttps://projecteuclid.org/euclid.ba/1510110045<strong>Clara Grazian</strong>, <strong>Brunero Liseo</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 991--1016.</p><p><strong>Abstract:</strong><br/>
We describe a simple method for making inference on a functional of a multivariate distribution, based on its copula representation. We make use of an approximate Bayesian Monte Carlo algorithm, where the proposed values of the functional of interest are weighted in terms of their Bayesian exponentially tilted empirical likelihood. This method is particularly useful when the “true” likelihood function associated with the working model is too costly to evaluate or when the working model is only partially specified.
</p>projecteuclid.org/euclid.ba/1510110045_20171117220542Fri, 17 Nov 2017 22:05 ESTFast Simulation of Hyperplane-Truncated Multivariate Normal Distributionshttps://projecteuclid.org/euclid.ba/1488337478<strong>Yulai Cong</strong>, <strong>Bo Chen</strong>, <strong>Mingyuan Zhou</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1017--1037.</p><p><strong>Abstract:</strong><br/>
We introduce a fast and easy-to-implement simulation algorithm for a multivariate normal distribution truncated on the intersection of a set of hyperplanes, and further generalize it to efficiently simulate random variables from a multivariate normal distribution whose covariance (precision) matrix can be decomposed as a positive-definite matrix minus (plus) a low-rank symmetric matrix. Example results illustrate the correctness and efficiency of the proposed simulation algorithms.
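The core identity behind such samplers is that a multivariate normal conditioned on the hyperplanes G x = r can be obtained by correcting an unconstrained draw. A sketch with our own variable names (consult the paper for the exact algorithm and its low-rank generalizations):

```python
import numpy as np

def sample_hyperplane_truncated_mvn(mu, Sigma, G, r, rng):
    """Draw x ~ N(mu, Sigma) conditioned on the hyperplanes G x = r.

    Standard conditioning identity: draw an unconstrained sample and
    project it onto the constraint set,
        y ~ N(mu, Sigma),
        x = y + Sigma G' (G Sigma G')^{-1} (r - G y),
    which is distributed as the desired truncated (conditional) normal."""
    y = rng.multivariate_normal(mu, Sigma)
    SGt = Sigma @ G.T
    return y + SGt @ np.linalg.solve(G @ SGt, r - G @ y)
```

Only one unconstrained draw and one small linear solve are needed per sample, which is what makes the approach fast.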
</p>projecteuclid.org/euclid.ba/1488337478_20171117220542Fri, 17 Nov 2017 22:05 ESTBayesian Variable Selection Regression of Multivariate Responses for Group Datahttps://projecteuclid.org/euclid.ba/1508983455<strong>B. Liquet</strong>, <strong>K. Mengersen</strong>, <strong>A. N. Pettitt</strong>, <strong>M. Sutton</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1039--1067.</p><p><strong>Abstract:</strong><br/>
We propose two multivariate extensions of the Bayesian group lasso for variable selection and estimation for data with high dimensional predictors and multi-dimensional response variables. The methods utilize spike and slab priors to yield solutions which are sparse at either a group level or both a group and individual feature level. The incorporation of group structure in a predictor matrix is a key factor in obtaining better estimators and identifying associations between multiple responses and predictors. The approach is suited to many biological studies where the response is multivariate and each predictor is embedded in some biological grouping structure such as gene pathways. Our Bayesian models are connected with penalized regression, and we prove both oracle and asymptotic distribution properties under an orthogonal design. We derive efficient Gibbs sampling algorithms for our models and provide the implementation in a comprehensive R package called MBSGS available on the Comprehensive R Archive Network (CRAN). The performance of the proposed approaches is compared to state-of-the-art variable selection strategies on simulated data sets. The proposed methodology is illustrated on a genetic dataset in order to identify markers grouping across chromosomes that explain the joint variability of gene expression in multiple tissues.
</p>projecteuclid.org/euclid.ba/1508983455_20171117220542Fri, 17 Nov 2017 22:05 ESTInconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing Ithttps://projecteuclid.org/euclid.ba/1510974325<strong>Peter Grünwald</strong>, <strong>Thijs van Ommen</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1069--1103.</p><p><strong>Abstract:</strong><br/>
We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data are heteroskedastic (though, significantly, there are no outliers). As sample size increases, the posterior puts its mass on worse and worse models of ever higher dimension. This is caused by hypercompression, the phenomenon that the posterior puts its mass on distributions that have much larger KL divergence from the ground truth than their average, i.e., the Bayes predictive distribution. To remedy the problem, we equip the likelihood in Bayes’ theorem with an exponent called the learning rate, and we propose the SafeBayesian method to learn the learning rate from the data. SafeBayes tends to select small learning rates, and regularizes more, as soon as hypercompression takes place. Its results on our data are quite encouraging.
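To see what a learning rate does in the simplest conjugate case: tempering a normal-mean likelihood by an exponent eta yields another normal posterior in closed form. This toy sketch is our own illustration of the tempering mechanism, not the SafeBayes algorithm itself:

```python
import numpy as np

def tempered_posterior_normal_mean(x, sigma2, tau2, eta):
    """Generalized (eta-tempered) posterior for a normal mean.

    For data x_i ~ N(theta, sigma2) with prior theta ~ N(0, tau2),
    raising the likelihood to the learning rate eta gives
        precision = 1/tau2 + eta * n / sigma2,
        mean      = (eta * sum(x) / sigma2) / precision.
    eta = 1 recovers standard Bayes; eta < 1 shrinks harder toward the
    prior, which is the direction SafeBayes moves under hypercompression."""
    n = len(x)
    prec = 1.0 / tau2 + eta * n / sigma2
    mean = (eta * np.sum(x) / sigma2) / prec
    return mean, 1.0 / prec
```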
</p>projecteuclid.org/euclid.ba/1510974325_20171117220542Fri, 17 Nov 2017 22:05 ESTThe Horseshoe+ Estimator of Ultra-Sparse Signalshttps://projecteuclid.org/euclid.ba/1474572263<strong>Anindya Bhadra</strong>, <strong>Jyotishka Datta</strong>, <strong>Nicholas G. Polson</strong>, <strong>Brandon Willard</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1105--1131.</p><p><strong>Abstract:</strong><br/>
We propose a new prior for ultra-sparse signal detection that we term the “horseshoe+ prior.” The horseshoe+ prior is a natural extension of the horseshoe prior that has achieved success in the estimation and detection of sparse signals and has been shown to possess a number of desirable theoretical properties while enjoying computational feasibility in high dimensions. The horseshoe+ prior builds upon these advantages. Our work proves that the horseshoe+ posterior concentrates at a rate faster than that of the horseshoe in the Kullback–Leibler (K-L) sense. We also establish theoretically that the proposed estimator has lower posterior mean squared error in estimating signals compared to the horseshoe and achieves the optimal Bayes risk in testing up to a constant. For one-group global–local scale mixture priors, we develop a new technique for analyzing the marginal sparse prior densities using the class of Meijer-G functions. In simulations, the horseshoe+ estimator demonstrates superior performance in a standard design setting against competing methods, including the horseshoe and Dirichlet–Laplace estimators. We conclude with an illustration on a prostate cancer data set and by pointing out some directions for future research.
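One common way to present the horseshoe+ is through its scale-mixture hierarchy, which inserts one extra half-Cauchy layer into the horseshoe. A sketch of drawing signals from the prior (the hierarchy as usually stated; the placement of the global scale tau is an assumption of this sketch):

```python
import numpy as np

def sample_horseshoe_plus(n, tau, rng):
    """Draw n signals from the horseshoe+ prior:
        eta_j    ~ C+(0, 1)        half-Cauchy
        lambda_j ~ C+(0, eta_j)    the extra layer absent in the horseshoe
        theta_j  ~ N(0, (tau * lambda_j)^2)
    Dropping the eta layer (eta_j = 1) recovers the ordinary horseshoe."""
    eta = np.abs(rng.standard_cauchy(n))
    lam = np.abs(rng.standard_cauchy(n)) * eta  # scale mixture of half-Cauchys
    return rng.normal(0.0, tau * lam)
```

The extra layer thickens both the spike at zero and the tails, which is the mechanism behind the faster posterior concentration claimed above.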
</p>projecteuclid.org/euclid.ba/1474572263_20171117220542Fri, 17 Nov 2017 22:05 ESTAsymptotic Optimality of One-Group Shrinkage Priors in Sparse High-dimensional Problemshttps://projecteuclid.org/euclid.ba/1475266758<strong>Prasenjit Ghosh</strong>, <strong>Arijit Chakrabarti</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1133--1161.</p><p><strong>Abstract:</strong><br/>
We study asymptotic optimality of inference in a high-dimensional sparse normal means model using a broad class of one-group shrinkage priors. Assuming that the proportion of non-zero means is known, we show that the corresponding Bayes estimates asymptotically attain the minimax risk (up to a multiplicative constant) for estimation with squared error loss. The constant is shown to be 1 for the important sub-class of “horseshoe-type” priors proving exact asymptotic minimaxity property for these priors, a result hitherto unknown in the literature. An empirical Bayes version of the estimator is shown to achieve the minimax rate in case the level of sparsity is unknown. We prove that the resulting posterior distributions contract around the true mean vector at the minimax optimal rate and provide important insight about the possible rate of posterior contraction around the corresponding Bayes estimator. Our work shows that for rate optimality, a heavy tailed prior with sufficient mass around zero is enough, a pole at zero like the horseshoe prior is not necessary. This part of the work is inspired by Pas et al. (2014). We come up with novel unifying arguments to extend their results over the general class of priors. Next we focus on simultaneous hypothesis testing for the means under the additive $0-1$ loss where the means are modeled through a two-groups mixture distribution. We study asymptotic risk properties of certain multiple testing procedures induced by the class of one-group priors under study, when applied in this set-up. Our key results show that the tests based on the “horseshoe-type” priors asymptotically achieve the risk of the optimal solution in this two-groups framework up to the correct constant and are thus asymptotically Bayes optimal under sparsity (ABOS). This is the first result showing that in a sparse problem a class of one-group priors can exactly mimic the performance of an optimal two-groups solution asymptotically. 
Our work shows an intrinsic technical connection between the theories of minimax estimation and simultaneous hypothesis testing for such one-group priors.
</p>projecteuclid.org/euclid.ba/1475266758_20171117220542Fri, 17 Nov 2017 22:05 ESTBayesian Analysis of the Stationary $\mathit{MAP}_{2}$https://projecteuclid.org/euclid.ba/1477321094<strong>P. Ramírez-Cobo</strong>, <strong>R. E. Lillo</strong>, <strong>M. P. Wiper</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1163--1194.</p><p><strong>Abstract:</strong><br/>
In this article we describe a method for carrying out Bayesian estimation for the two-state stationary Markov arrival process ( $\mathit{MAP}_{2}$ ), which has been proposed as a versatile model in a number of contexts. The approach is illustrated on both simulated and real data sets, where the performance of the $\mathit{MAP}_{2}$ is compared against that of the well-known $\mathit{MMPP}_{2}$ . As an extension of the method, we estimate the queue length and virtual waiting time distributions of a stationary $\mathit{MAP}_{2}/G/1$ queueing system, a matrix generalization of the $M/G/1$ queue that allows for dependent inter-arrival times. Our procedure is illustrated with applications in Internet traffic analysis.
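For context, a $\mathit{MAP}_{2}$ is specified by two rate matrices: $D_{0}$ governs hidden transitions and $D_{1}$ transitions that trigger arrivals, with $D_{0}+D_{1}$ a generator (rows summing to zero). A minimal forward simulator, for illustration only (the paper concerns Bayesian estimation of these matrices, not simulation):

```python
import numpy as np

def simulate_map2_interarrivals(D0, D1, n_arrivals, rng):
    """Simulate inter-arrival times from a Markov arrival process given
    hidden-transition rates D0 and arrival-transition rates D1 (rows of
    D0 + D1 must sum to zero).  Works for any number of states."""
    m = D0.shape[0]
    state, t, t_last, times = 0, 0.0, 0.0, []
    while len(times) < n_arrivals:
        rate = -D0[state, state]            # total exit rate of this state
        t += rng.exponential(1.0 / rate)
        # competing transitions: hidden (D0 off-diagonal) vs arrival (D1)
        probs = np.concatenate([D0[state], D1[state]]) / rate
        probs[state] = 0.0                  # the diagonal is not a transition
        probs /= probs.sum()
        k = rng.choice(2 * m, p=probs)
        if k >= m:                          # transition through D1 => arrival
            times.append(t - t_last)
            t_last = t
            state = k - m
        else:                               # hidden transition through D0
            state = k
    return np.array(times)
```

Because the phase carries over between arrivals, consecutive inter-arrival times are dependent, unlike in a renewal process.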
</p>projecteuclid.org/euclid.ba/1477321094_20171117220542Fri, 17 Nov 2017 22:05 ESTMarginal Pseudo-Likelihood Learning of Discrete Markov Network Structureshttps://projecteuclid.org/euclid.ba/1477918728<strong>Johan Pensar</strong>, <strong>Henrik Nyman</strong>, <strong>Juha Niiranen</strong>, <strong>Jukka Corander</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1195--1215.</p><p><strong>Abstract:</strong><br/>
Markov networks are a popular tool for modeling multivariate distributions over a set of discrete variables. The core of the Markov network representation is an undirected graph which elegantly captures the dependence structure over the variables. Traditionally, the Bayesian approach of learning the graph structure from data has been done under the assumption of chordality since non-chordal graphs are difficult to evaluate for likelihood-based scores. Recently, there has been a surge of interest towards the use of regularized pseudo-likelihood methods as such approaches can avoid the assumption of chordality. Many of the currently available methods necessitate the use of a tuning parameter to adapt the level of regularization for a particular dataset. Here we introduce the marginal pseudo-likelihood which has a built-in regularization through marginalization over the graph-specific nuisance parameters. We prove consistency of the resulting graph estimator via comparison with the pseudo-Bayesian information criterion. To identify high-scoring graph structures in a high-dimensional setting we design a two-step algorithm that exploits the decomposable structure of the score. Using synthetic and existing benchmark networks, the marginal pseudo-likelihood method is shown to perform favorably against recent popular structure learning methods.
</p>projecteuclid.org/euclid.ba/1477918728_20171117220542Fri, 17 Nov 2017 22:05 ESTCorrection to: “Posterior Consistency of Bayesian Quantile Regression Based on the Misspecified Asymmetric Laplace Density”https://projecteuclid.org/euclid.ba/1505354708<strong>Karthik Sriram</strong>, <strong>R.V. Ramamoorthi</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1217--1219.</p><p><strong>Abstract:</strong><br/>
In this note, we highlight and provide corrections to two errors in the paper: Karthik Sriram, R.V. Ramamoorthi, Pulak Ghosh (2013) “Posterior Consistency of Bayesian Quantile Regression Based on the Misspecified Asymmetric Laplace Density”, Bayesian Analysis, Volume 8, Number 2, 479–504.
</p>projecteuclid.org/euclid.ba/1505354708_20171117220542Fri, 17 Nov 2017 22:05 ESTUncertainty Quantification for the Horseshoe (with Discussion)https://projecteuclid.org/euclid.ba/1504231319<strong>Stéphanie van der Pas</strong>, <strong>Botond Szabó</strong>, <strong>Aad van der Vaart</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1221--1274.</p><p><strong>Abstract:</strong><br/>
We investigate the credible sets and marginal credible intervals resulting from the horseshoe prior in the sparse multivariate normal means model. We do so in an adaptive setting without assuming knowledge of the sparsity level (number of signals). We consider both the hierarchical Bayes method of putting a prior on the unknown sparsity level and the empirical Bayes method with the sparsity level estimated by maximum marginal likelihood. We show that credible balls and marginal credible intervals have good frequentist coverage and optimal size if the sparsity level of the prior is set correctly. By general theory honest confidence sets cannot adapt in size to an unknown sparsity level. Accordingly the hierarchical and empirical Bayes credible sets based on the horseshoe prior are not honest over the full parameter space. We show that this is due to over-shrinkage for certain parameters and characterise the set of parameters for which credible balls and marginal credible intervals do give correct uncertainty quantification. In particular we show that the fraction of false discoveries by the marginal Bayesian procedure is controlled by a correct choice of cut-off.
</p>projecteuclid.org/euclid.ba/1504231319_20171117220542Fri, 17 Nov 2017 22:05 ESTDeep Learning: A Bayesian Perspectivehttps://projecteuclid.org/euclid.ba/1510801992<strong>Nicholas G. Polson</strong>, <strong>Vadim Sokolov</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1275--1304.</p><p><strong>Abstract:</strong><br/>
Deep learning is a form of machine learning for nonlinear high dimensional pattern matching and prediction. By taking a Bayesian probabilistic perspective, we provide a number of insights into more efficient algorithms for optimisation and hyper-parameter tuning. Traditional high-dimensional data reduction techniques, such as principal component analysis (PCA), partial least squares (PLS), reduced rank regression (RRR), and projection pursuit regression (PPR), are all shown to be shallow learners. Their deep learning counterparts exploit multiple deep layers of data reduction which provide predictive performance gains. Stochastic gradient descent (SGD) training optimisation and Dropout (DO) regularization provide estimation and variable selection. Bayesian regularization is central to finding weights and connections in networks to optimize the predictive bias-variance trade-off. To illustrate our methodology, we provide an analysis of international bookings on Airbnb. Finally, we conclude with directions for future research.
</p>projecteuclid.org/euclid.ba/1510801992_20171117220542Fri, 17 Nov 2017 22:05 ESTBayesian Spectral Modeling for Multivariate Spatial Distributions of Elemental Concentrations in Soilhttps://projecteuclid.org/euclid.ba/1478919835<strong>Maria A. Terres</strong>, <strong>Montserrat Fuentes</strong>, <strong>Dean Hesterberg</strong>, <strong>Matthew Polizzotto</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 1--28.</p><p><strong>Abstract:</strong><br/>
Recent technological advances have enabled researchers in a variety of fields to collect accurately geocoded data for several variables simultaneously. In many cases it may be most appropriate to jointly model these multivariate spatial processes without constraints on their conditional relationships. When data have been collected on a regular lattice, the multivariate conditionally autoregressive (MCAR) models are a common choice. However, inference from these MCAR models relies heavily on the pre-specified neighborhood structure and often assumes a separable covariance structure. Here, we present a multivariate spatial model using a spectral analysis approach that enables inference on the conditional relationships between the variables that does not rely on a pre-specified neighborhood structure, is non-separable, and is computationally efficient. Covariance and cross-covariance functions are defined in the spectral domain to obtain computational efficiency. The resulting pseudo posterior inference on the correlation matrix allows for quantification of the conditional dependencies. A comparison is made with an MCAR model that is shown to be highly sensitive to the choice of neighborhood. The approaches are illustrated for the toxic element arsenic and four other soil elements whose relative concentrations were measured on a microscale spatial lattice. Understanding conditional relationships between arsenic and other soil elements provides insights for mitigating pervasive arsenic poisoning in drinking water in southern Asia and elsewhere.
</p>projecteuclid.org/euclid.ba/1478919835_20171214220453Thu, 14 Dec 2017 22:04 ESTBayesian Inference and Testing of Group Differences in Brain Networkshttps://projecteuclid.org/euclid.ba/1479179031<strong>Daniele Durante</strong>, <strong>David B. Dunson</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 29--58.</p><p><strong>Abstract:</strong><br/>
Network data are increasingly collected along with other variables of interest. Our motivation is drawn from neurophysiology studies measuring brain connectivity networks for a sample of individuals along with their membership to a low or high creative reasoning group. It is of paramount importance to develop statistical methods for testing of global and local changes in the structural interconnections among brain regions across groups. We develop a general Bayesian procedure for inference and testing of group differences in the network structure, which relies on a nonparametric representation for the conditional probability mass function associated with a network-valued random variable. By leveraging a mixture of low-rank factorizations, we allow simple global and local hypothesis testing adjusting for multiplicity. An efficient Gibbs sampler is defined for posterior computation. We provide theoretical results on the flexibility of the model and assess testing performance in simulations. The approach is applied to provide novel insights on the relationships between human brain networks and creativity.
</p>projecteuclid.org/euclid.ba/1479179031_20171214220453Thu, 14 Dec 2017 22:04 ESTApproximation of Bayesian Predictive $p$ -Values with Regression ABChttps://projecteuclid.org/euclid.ba/1479286819<strong>David J. Nott</strong>, <strong>Christopher C. Drovandi</strong>, <strong>Kerrie Mengersen</strong>, <strong>Michael Evans</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 59--83.</p><p><strong>Abstract:</strong><br/>
In the Bayesian framework a standard approach to model criticism is to compare some function of the observed data to a reference predictive distribution. The result of the comparison can be summarized in the form of a $p$ -value, and computation of some kinds of Bayesian predictive $p$ -values can be challenging. The use of regression adjustment approximate Bayesian computation (ABC) methods is explored for this task. Two problems are considered. The first is approximation of distributions of prior predictive $p$ -values for the purpose of choosing weakly informative priors in the case where the model checking statistic is expensive to compute. Here the computation is difficult because of the need to repeatedly sample from a prior predictive distribution for different values of a prior hyperparameter. The second problem considered is the calibration of posterior predictive $p$ -values so that they are uniformly distributed under some reference distribution for the data. Computation is difficult because the calibration process requires repeated approximation of the posterior for different data sets under the reference distribution. In both these problems we argue that high accuracy in the computations is not required, which makes fast approximations such as regression adjustment ABC very useful. We illustrate our methods with several examples.
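As a baseline for what the regression-adjustment ABC methods approximate, the brute-force Monte Carlo posterior predictive $p$ -value can be written as follows (a generic sketch with our own function names; the paper's contribution is making this kind of computation cheap):

```python
import numpy as np

def posterior_predictive_pvalue(y, theta_samples, simulate, statistic, rng):
    """Monte Carlo posterior predictive p-value:
        p = Pr( T(y_rep) >= T(y_obs) )  under the posterior predictive.
    theta_samples: draws from the posterior given y;
    simulate(theta, rng): returns one replicate data set;
    statistic: maps a data set to a scalar checking statistic T."""
    t_obs = statistic(y)
    t_rep = np.array([statistic(simulate(th, rng)) for th in theta_samples])
    return np.mean(t_rep >= t_obs)
```

Calibration, as discussed above, asks that such $p$ -values be uniformly distributed over repeated data sets from a reference distribution, which requires re-running this computation many times and motivates fast approximations.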
</p>projecteuclid.org/euclid.ba/1479286819_20171214220453Thu, 14 Dec 2017 22:04 ESTLatent Marked Poisson Process with Applications to Object Segmentationhttps://projecteuclid.org/euclid.ba/1480129463<strong>Sindhu Ghanta</strong>, <strong>Jennifer G. Dy</strong>, <strong>Donglin Niu</strong>, <strong>Michael I. Jordan</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 85--113.</p><p><strong>Abstract:</strong><br/>
In difficult object segmentation tasks, utilizing image information alone is not sufficient; incorporation of object shape prior models is necessary to obtain competitive segmentation performance. Most formulations that incorporate both shape and image information are in the form of energy functional optimization problems. This paper introduces a Bayesian latent marked Poisson process for segmenting multiple objects in an image. The model takes both shape and image feature/appearance into account—it generates object locations from a spatial Poisson process, then generates shape parameters from a shape prior model as the latent marks. Inferentially, this partitions the image: pixels inside objects are assumed to be generated from an object observation/appearance model and pixels outside objects come from a background model. The Poisson process provides (non-homogeneous) spatial priors for object locations and the marks allow the incorporation of shape priors. We develop a hybrid Gibbs sampler that addresses the variation in model order and nonconjugacy that arise in this setting and we present experimental results on synthetic images and two diverse domains in real images: cell segmentation in biological images and pedestrian and car detection in traffic images.
</p>projecteuclid.org/euclid.ba/1480129463_20171214220453Thu, 14 Dec 2017 22:04 ESTReal-Time Bayesian Parameter Estimation for Item Response Modelshttps://projecteuclid.org/euclid.ba/1482138050<strong>Ruby Chiu-Hsing Weng</strong>, <strong>D. Stephen Coad</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 115--137.</p><p><strong>Abstract:</strong><br/>
Bayesian item response models have been used in modeling educational testing and Internet ratings data. Typically, the statistical analysis is carried out using Markov chain Monte Carlo methods. However, these may not be computationally feasible when real-time data continuously arrive and online parameter estimation is needed. We develop an efficient algorithm based on a deterministic moment-matching method to adjust the parameters in real time. The proposed online algorithm works well for two real datasets, achieving good accuracy but with considerably less computational time.
</p>projecteuclid.org/euclid.ba/1482138050_20171214220453Thu, 14 Dec 2017 22:04 ESTImproving the Efficiency of Fully Bayesian Optimal Design of Experiments Using Randomised Quasi-Monte Carlohttps://projecteuclid.org/euclid.ba/1483066880<strong>Christopher C. Drovandi</strong>, <strong>Minh-Ngoc Tran</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 139--162.</p><p><strong>Abstract:</strong><br/>
Optimal experimental design is an important methodology for most efficiently allocating resources in an experiment to best achieve some goal. Bayesian experimental design considers the potential impact that various choices of the controllable variables have on the posterior distribution of the unknowns. Optimal Bayesian design involves maximising an expected utility function, which is an analytically intractable integral over the prior predictive distribution. These integrals are typically estimated via standard Monte Carlo methods. In this paper, we demonstrate that the use of randomised quasi-Monte Carlo can bring significant reductions to the variance of the estimated expected utility. This variance reduction can then lead to a more efficient optimisation algorithm for maximising the expected utility.
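The variance reduction can be seen in a one-dimensional sketch that replaces the prior predictive integral with a simple uniform expectation; the utility function and design point below are illustrative, not from the paper, and scrambled Sobol' points from `scipy.stats.qmc` stand in for the randomised quasi-Monte Carlo draws:

```python
import numpy as np
from scipy.stats import qmc

# Toy "expected utility": U(d) = E_theta[-(theta - d)^2] with theta ~ U(0, 1),
# which has the closed form -(1/3 - d + d^2); we estimate it at d = 0.5.
def utility_estimate(u, d=0.5):
    theta = u.ravel()              # uniform draws play the role of prior samples
    return np.mean(-(theta - d) ** 2)

truth = -(1 / 3 - 0.5 + 0.25)      # exact value, -1/12

rng = np.random.default_rng(1)
n = 2 ** 10

# Plain Monte Carlo vs randomised quasi-Monte Carlo (scrambled Sobol').
mc_err, rqmc_err = [], []
for rep in range(50):
    mc_err.append(abs(utility_estimate(rng.random(n)) - truth))
    sob = qmc.Sobol(d=1, scramble=True, seed=rep)
    rqmc_err.append(abs(utility_estimate(sob.random_base2(10)) - truth))

print(np.mean(mc_err), np.mean(rqmc_err))  # the RQMC error is much smaller
```

For smooth low-dimensional integrands the RQMC error typically decays close to $n^{-3/2}$ rather than the Monte Carlo $n^{-1/2}$, which also makes the estimated expected-utility surface smoother to optimise.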
</p>projecteuclid.org/euclid.ba/1483066880_20171214220453Thu, 14 Dec 2017 22:04 ESTRegularization and Confounding in Linear Regression for Treatment Effect Estimationhttps://projecteuclid.org/euclid.ba/1484103680<strong>P. Richard Hahn</strong>, <strong>Carlos M. Carvalho</strong>, <strong>David Puelz</strong>, <strong>Jingyu He</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 163--182.</p><p><strong>Abstract:</strong><br/>
This paper investigates the use of regularization priors in the context of treatment effect estimation using observational data where the number of control variables is large relative to the number of observations. First, the phenomenon of “regularization-induced confounding” is introduced, which refers to the tendency of regularization priors to adversely bias treatment effect estimates by over-shrinking control variable regression coefficients. Then, a simultaneous regression model is presented which permits regularization priors to be specified in a way that avoids this unintentional “re-confounding”. The new model is illustrated on synthetic and empirical data.
</p>projecteuclid.org/euclid.ba/1484103680_20171214220453Thu, 14 Dec 2017 22:04 ESTDirichlet Process Mixture Models for Modeling and Generating Synthetic Versions of Nested Categorical Datahttps://projecteuclid.org/euclid.ba/1485227030<strong>Jingchen Hu</strong>, <strong>Jerome P. Reiter</strong>, <strong>Quanli Wang</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 183--200.</p><p><strong>Abstract:</strong><br/>
We present a Bayesian model for estimating the joint distribution of multivariate categorical data when units are nested within groups. Such data arise frequently in social science settings, for example, people living in households. The model assumes that (i) each group is a member of a group-level latent class, and (ii) each unit is a member of a unit-level latent class nested within its group-level latent class. This structure allows the model to capture dependence among units in the same group. It also facilitates simultaneous modeling of variables at both group and unit levels. We develop a version of the model that assigns zero probability to groups and units with physically impossible combinations of variables. We apply the model to estimate multivariate relationships in a subset of the American Community Survey. Using the estimated model, we generate synthetic household data that could be disseminated as redacted public use files. Supplementary materials (Hu et al., 2017) for this article are available online.
</p>projecteuclid.org/euclid.ba/1485227030_20171214220453Thu, 14 Dec 2017 22:04 ESTOptimal Gaussian Approximations to the Posterior for Log-Linear Models with Diaconis–Ylvisaker Priorshttps://projecteuclid.org/euclid.ba/1487646097<strong>James Johndrow</strong>, <strong>Anirban Bhattacharya</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 201--223.</p><p><strong>Abstract:</strong><br/>
In contingency table analysis, sparse data is frequently encountered for even modest numbers of variables, resulting in non-existence of maximum likelihood estimates. A common solution is to obtain regularized estimates of the parameters of a log-linear model. Bayesian methods provide a coherent approach to regularization, but are often computationally intensive. Conjugate priors ease computational demands, but the conjugate Diaconis–Ylvisaker priors for the parameters of log-linear models do not give rise to closed form credible regions, complicating posterior inference. Here we derive the optimal Gaussian approximation to the posterior for log-linear models with Diaconis–Ylvisaker priors, and provide convergence rate and finite-sample bounds for the Kullback–Leibler divergence between the exact posterior and the optimal Gaussian approximation. We demonstrate empirically in simulations and a real data application that the approximation is highly accurate, even for modest sample sizes. We also propose a method for model selection using the approximation. The proposed approximation provides a computationally scalable approach to regularized estimation and approximate Bayesian inference for log-linear models.
</p>projecteuclid.org/euclid.ba/1487646097_20171214220453Thu, 14 Dec 2017 22:04 ESTLocally Adaptive Smoothing with Markov Random Fields and Shrinkage Priorshttps://projecteuclid.org/euclid.ba/1487905413<strong>James R. Faulkner</strong>, <strong>Vladimir N. Minin</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 225--252.</p><p><strong>Abstract:</strong><br/>
We present a locally adaptive nonparametric curve fitting method that operates within a fully Bayesian framework. This method uses shrinkage priors to induce sparsity in order-$k$ differences in the latent trend function, providing a combination of local adaptation and global control. Using a scale mixture of normals representation of shrinkage priors, we make explicit connections between our method and $k$th-order Gaussian Markov random field smoothing. We call the resulting processes shrinkage prior Markov random fields (SPMRFs). We use Hamiltonian Monte Carlo to approximate the posterior distribution of model parameters because this method provides superior performance in the presence of the high dimensionality and strong parameter correlations exhibited by our models. We compare the performance of three prior formulations using simulated data and find the horseshoe prior provides the best compromise between bias and precision. We apply SPMRF models to two benchmark data examples frequently used to test nonparametric methods. We find that this method is flexible enough to accommodate a variety of data generating models and offers the adaptive properties and computational tractability to make it a useful addition to the Bayesian nonparametric toolbox.
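The scale mixture of normals representation can be illustrated by simulating from one of the priors the authors compare, the horseshoe. This is a standalone sketch: the local scales are half-Cauchy(0, 1), and no Markov random field structure over trend differences is included:

```python
import numpy as np

rng = np.random.default_rng(5)

# Horseshoe draws via the scale-mixture representation:
# beta | lambda ~ N(0, lambda^2), lambda ~ half-Cauchy(0, 1).
lam = np.abs(rng.standard_cauchy(100_000))
horseshoe = rng.normal(0.0, 1.0, 100_000) * lam
normal = rng.normal(0.0, 1.0, 100_000)

# The sparsity-inducing shape: far more mass very near zero than the
# standard normal, together with much heavier tails.
print(np.mean(np.abs(horseshoe) < 0.05), np.mean(np.abs(normal) < 0.05))
```

Applied to order-$k$ differences of a latent trend, this shape is what shrinks most differences toward zero (global control) while letting occasional large jumps through (local adaptation).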
</p>projecteuclid.org/euclid.ba/1487905413_20171214220453Thu, 14 Dec 2017 22:04 ESTComputationally Efficient Multivariate Spatio-Temporal Models for High-Dimensional Count-Valued Data (with Discussion)https://projecteuclid.org/euclid.ba/1507687687<strong>Jonathan R. Bradley</strong>, <strong>Scott H. Holan</strong>, <strong>Christopher K. Wikle</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 1, 253--310.</p><p><strong>Abstract:</strong><br/>
We introduce a computationally efficient Bayesian model for predicting high-dimensional dependent count-valued data. In this setting, the Poisson data model with a latent Gaussian process model has become the de facto model. However, this model can be difficult to use in high dimensional settings, where the data may be tabulated over different variables, geographic regions, and times. These computational difficulties are further exacerbated by acknowledging that count-valued data are naturally non-Gaussian. Thus, many of the current approaches in Bayesian inference require one to carefully calibrate a Markov chain Monte Carlo (MCMC) technique. We avoid MCMC methods that require tuning by developing a new conjugate multivariate distribution. Specifically, we introduce a multivariate log-gamma distribution and provide substantial methodological development of independent interest including: results regarding conditional distributions, marginal distributions, an asymptotic relationship with the multivariate normal distribution, and full-conditional distributions for a Gibbs sampler. To incorporate dependence between variables, regions, and time points, a multivariate spatio-temporal mixed effects model (MSTM) is used. To demonstrate our methodology we use data obtained from the US Census Bureau’s Longitudinal Employer-Household Dynamics (LEHD) program. In particular, our approach is motivated by the LEHD’s Quarterly Workforce Indicators (QWIs), which constitute current estimates of important US economic variables.
</p>projecteuclid.org/euclid.ba/1507687687_20180309040031Fri, 09 Mar 2018 04:00 ESTA New Monte Carlo Method for Estimating Marginal Likelihoodshttps://projecteuclid.org/euclid.ba/1488250818<strong>Yu-Bo Wang</strong>, <strong>Ming-Hui Chen</strong>, <strong>Lynn Kuo</strong>, <strong>Paul O. Lewis</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 2, 311--333.</p><p><strong>Abstract:</strong><br/>
Evaluating the marginal likelihood in Bayesian analysis is essential for model selection. Estimators based on a single Markov chain Monte Carlo sample from the posterior distribution include the harmonic mean estimator and the inflated density ratio estimator. We propose a new class of Monte Carlo estimators based on this single Markov chain Monte Carlo sample. This class can be thought of as a generalization of the harmonic mean and inflated density ratio estimators using a partition weighted kernel (likelihood times prior). We show that our estimator is consistent and has better theoretical properties than the harmonic mean and inflated density ratio estimators. In addition, we provide guidelines on choosing optimal weights. Simulation studies were conducted to examine the empirical performance of the proposed estimator. We further demonstrate the desirable features of the proposed estimator with two real data sets: one is from a prostate cancer study using an ordinal probit regression model with latent variables; the other is for the power prior construction from two Eastern Cooperative Oncology Group phase III clinical trials using the cure rate survival model with similar objectives.
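For orientation, the harmonic mean estimator that this class generalises can be written in a few lines for a conjugate Gaussian toy model where the marginal likelihood is known exactly. The model and sample sizes are illustrative assumptions, and exact posterior draws stand in for a single MCMC sample:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Conjugate toy model: y | theta ~ N(theta, 1), theta ~ N(0, 1), observed y = 1.
y = 1.0
# Exact marginal likelihood: y ~ N(0, 2).
true_ml = norm.pdf(y, loc=0.0, scale=np.sqrt(2.0))

# Posterior is N(y/2, 1/2); draw a "single MCMC sample" from it directly.
draws = rng.normal(y / 2, np.sqrt(0.5), 100_000)

# Harmonic mean estimator: inverse of the average inverse likelihood.
lik = norm.pdf(y, loc=draws, scale=1.0)
hm_est = 1.0 / np.mean(1.0 / lik)

print(true_ml, hm_est)
```

Even in this benign toy the inverse-likelihood weights are very heavy tailed, which is the instability that weighting schemes such as the partition weighted kernel are designed to tame.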
</p>projecteuclid.org/euclid.ba/1488250818_20180326220157Mon, 26 Mar 2018 22:01 EDTA Comparison of Truncated and Time-Weighted Plackett–Luce Models for Probabilistic Forecasting of Formula One Resultshttps://projecteuclid.org/euclid.ba/1488250819<strong>Daniel A. Henderson</strong>, <strong>Liam J. Kirrane</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 2, 335--358.</p><p><strong>Abstract:</strong><br/>
We compare several variants of the Plackett–Luce model, a commonly-used model for permutations, in terms of their ability to accurately forecast Formula One motor racing results. A Bayesian approach to forecasting is adopted and a Gibbs sampler for sampling from the posterior distributions of the model parameters is described. Prediction of the results from the 2010 to 2013 Formula One seasons highlights clear strengths and weaknesses of the various models. We demonstrate by example that down-weighting past results can improve forecasts, and that some of the models we consider are competitive with the forecasts implied by bookmakers' odds.
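The basic Plackett–Luce model is compact enough to sketch: a finishing order is built by repeatedly choosing the next finisher with probability proportional to the remaining competitors' skill parameters. The skill values below are illustrative, not estimates from the paper:

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_pl_order(lam, rng):
    """Sample a finishing order: pick each next finisher with
    probability proportional to the remaining skills lam."""
    lam = np.asarray(lam, dtype=float)
    remaining = list(range(len(lam)))
    order = []
    while remaining:
        w = lam[remaining]
        pick = rng.choice(len(remaining), p=w / w.sum())
        order.append(remaining.pop(pick))
    return order

def pl_log_likelihood(order, lam):
    """Plackett-Luce log-likelihood of a complete finishing order."""
    lam = np.asarray(lam, dtype=float)
    ll, rest = 0.0, list(order)
    for i in order:
        ll += np.log(lam[i]) - np.log(lam[rest].sum())
        rest = rest[1:]
    return ll

lam = np.array([4.0, 2.0, 1.0])            # illustrative skill parameters
print(sample_pl_order(lam, rng), pl_log_likelihood([0, 1, 2], lam))
```

Under skills $(4,2,1)$ the order $0\prec 1\prec 2$ has probability $\frac{4}{7}\cdot\frac{2}{3}\cdot 1$, which is what `pl_log_likelihood` returns on the log scale; truncated and time-weighted variants modify which positions enter this product and how races are weighted.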
</p>projecteuclid.org/euclid.ba/1488250819_20180326220157Mon, 26 Mar 2018 22:01 EDTOn the Use of Cauchy Prior Distributions for Bayesian Logistic Regressionhttps://projecteuclid.org/euclid.ba/1488855634<strong>Joyee Ghosh</strong>, <strong>Yingbo Li</strong>, <strong>Robin Mitra</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 2, 359--383.</p><p><strong>Abstract:</strong><br/>
In logistic regression, separation occurs when a linear combination of the predictors can perfectly classify part or all of the observations in the sample, and as a result, finite maximum likelihood estimates of the regression coefficients do not exist. Gelman et al. (2008) recommended independent Cauchy distributions as default priors for the regression coefficients in logistic regression, even in the case of separation, and reported posterior modes in their analyses. As the mean does not exist for the Cauchy prior, a natural question is whether the posterior means of the regression coefficients exist under separation. We prove theorems that provide necessary and sufficient conditions for the existence of posterior means under independent Cauchy priors for the logit link and a general family of link functions, including the probit link. We also study the existence of posterior means under multivariate Cauchy priors. For full Bayesian inference, we develop a Gibbs sampler based on Pólya-Gamma data augmentation to sample from the posterior distribution under independent Student-$t$ priors including Cauchy priors, and provide a companion R package $\mathtt{tglm}$, available at CRAN. We demonstrate empirically that even when the posterior means of the regression coefficients exist under separation, the magnitude of the posterior samples for Cauchy priors may be unusually large, and the corresponding Gibbs sampler shows extremely slow mixing. While alternative algorithms such as the No-U-Turn Sampler (NUTS) in Stan can greatly improve mixing, in order to resolve the issue of extremely heavy tailed posteriors for Cauchy priors under separation, one would need to consider lighter tailed priors such as normal priors or Student-$t$ priors with degrees of freedom larger than one.
</p>projecteuclid.org/euclid.ba/1488855634_20180326220157Mon, 26 Mar 2018 22:01 EDTSequential Bayesian Analysis of Multivariate Count Datahttps://projecteuclid.org/euclid.ba/1490234588<strong>Tevfik Aktekin</strong>, <strong>Nick Polson</strong>, <strong>Refik Soyer</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 2, 385--409.</p><p><strong>Abstract:</strong><br/>
We develop a new class of dynamic multivariate Poisson count models that allow for fast online updating. We refer to this class as multivariate Poisson-scaled beta (MPSB) models. The MPSB model allows for serial dependence in count data as well as dependence with a random common environment across time series. Notable features of our model are analytic forms for state propagation, predictive likelihood densities, and sequential updating via sufficient statistics for the static model parameters. Our approach leads to a fully adapted particle learning algorithm and a new class of predictive likelihoods and marginal distributions which we refer to as the (dynamic) multivariate confluent hypergeometric negative binomial distribution (MCHG-NB) and the dynamic multivariate negative binomial (DMNB) distribution, respectively. To illustrate our methodology, we use a simulation study and empirical data on weekly consumer non-durable goods demand.
</p>projecteuclid.org/euclid.ba/1490234588_20180326220157Mon, 26 Mar 2018 22:01 EDTBayesian Analysis of RNA-Seq Data Using a Family of Negative Binomial Modelshttps://projecteuclid.org/euclid.ba/1491616976<strong>Lili Zhao</strong>, <strong>Weisheng Wu</strong>, <strong>Dai Feng</strong>, <strong>Hui Jiang</strong>, <strong>XuanLong Nguyen</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 2, 411--436.</p><p><strong>Abstract:</strong><br/>
The analysis of RNA-Seq data has been focused on three main categories, including gene expression, relative exon usage and transcript expression. Methods have been proposed independently for each category using a negative binomial (NB) model. However, counts following an NB distribution on one feature (e.g., exon) do not guarantee an NB distribution for the other two features (e.g., gene/transcript). In this paper we propose a family of negative binomial models, which integrates the gene, exon and transcript analysis under a coherent NB model. The proposed model easily incorporates the uncertainty of assigning reads to transcripts and substantially simplifies the estimation of relative usage. We developed simple Gibbs sampling algorithms for the posterior inference by exploiting fully tractable closed forms of computation via suitable conjugate priors. The proposed models were investigated under extensive simulations. Finally, we applied our model to a real data set.
</p>projecteuclid.org/euclid.ba/1491616976_20180326220157Mon, 26 Mar 2018 22:01 EDTEfficient Model Comparison Techniques for Models Requiring Large Scale Data Augmentationhttps://projecteuclid.org/euclid.ba/1493431262<strong>Panayiota Touloupou</strong>, <strong>Naif Alzahrani</strong>, <strong>Peter Neal</strong>, <strong>Simon E. F. Spencer</strong>, <strong>Trevelyan J. McKinley</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 2, 437--459.</p><p><strong>Abstract:</strong><br/>
Selecting between competing statistical models is a challenging problem especially when the competing models are non-nested. In this paper we offer a simple solution by devising an algorithm which combines MCMC and importance sampling to obtain computationally efficient estimates of the marginal likelihood which can then be used to compare the models. The algorithm is successfully applied to a longitudinal epidemic data set, where calculating the marginal likelihood is made more challenging by the presence of large amounts of missing data. In this context, our importance sampling approach is shown to outperform existing methods for computing the marginal likelihood.
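The MCMC-plus-importance-sampling idea can be sketched on a conjugate Gaussian model where the marginal likelihood is available in closed form. All model choices below are illustrative assumptions, and exact posterior draws stand in for MCMC output:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)

# Conjugate toy model so the answer is known: y_i ~ N(theta, 1), theta ~ N(0, 1).
y = rng.normal(1.0, 1.0, 20)
n = len(y)
post_var = 1.0 / (n + 1.0)
post_mean = y.sum() * post_var

# Stand-in for an MCMC run: draws from the (here, known) posterior.
mcmc = rng.normal(post_mean, np.sqrt(post_var), 2000)

# Importance proposal fitted to the "MCMC" output, scale slightly inflated.
q_mean, q_sd = mcmc.mean(), 1.5 * mcmc.std()
theta = rng.normal(q_mean, q_sd, 5000)

# Marginal likelihood estimate: average of likelihood * prior / proposal.
log_w = (norm.logpdf(y[:, None], loc=theta[None, :], scale=1.0).sum(axis=0)
         + norm.logpdf(theta, 0.0, 1.0)
         - norm.logpdf(theta, q_mean, q_sd))
log_ml_is = np.log(np.mean(np.exp(log_w - log_w.max()))) + log_w.max()

# Exact log marginal likelihood for comparison.
log_ml_exact = (-0.5 * n * np.log(2 * np.pi) + 0.5 * np.log(post_var)
                - 0.5 * (y ** 2).sum() + 0.5 * post_mean ** 2 / post_var)
print(log_ml_is, log_ml_exact)
```

Fitting the proposal to the MCMC output and mildly inflating its scale keeps the importance weights stable; with large-scale data augmentation, the same identity applies with the latent data carried inside the weights.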
</p>projecteuclid.org/euclid.ba/1493431262_20180326220157Mon, 26 Mar 2018 22:01 EDTTesting Un-Separated Hypotheses by Estimating a Distancehttps://projecteuclid.org/euclid.ba/1498204951<strong>Jean-Bernard Salomond</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 2, 461--484.</p><p><strong>Abstract:</strong><br/>
In this paper we propose a Bayesian answer to testing problems when the hypotheses are not well separated. The idea of the method is to study the posterior distribution of a discrepancy measure between the parameter and the model we want to test for. This is shown to be equivalent to a modification of the testing loss. An advantage of this approach is that it can easily be adapted to complex hypotheses which are in general difficult to test. Asymptotic properties of the test can be derived from the asymptotic behaviour of the posterior distribution of the discrepancy measure, which gives insight into possible calibrations. In addition one can derive separation rates for testing, which ensure the asymptotic frequentist optimality of our procedures.
</p>projecteuclid.org/euclid.ba/1498204951_20180326220157Mon, 26 Mar 2018 22:01 EDTVariational Hamiltonian Monte Carlo via Score Matchinghttps://projecteuclid.org/euclid.ba/1500948232<strong>Cheng Zhang</strong>, <strong>Babak Shahbaba</strong>, <strong>Hongkai Zhao</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 2, 485--506.</p><p><strong>Abstract:</strong><br/>
Traditionally, the field of computational Bayesian statistics has been divided into two main subfields: variational methods and Markov chain Monte Carlo (MCMC). In recent years, however, several methods have been proposed based on combining variational Bayesian inference and MCMC simulation in order to improve their overall accuracy and computational efficiency. This marriage of fast evaluation and flexible approximation provides a promising means of designing scalable Bayesian inference methods. In this paper, we explore the possibility of incorporating variational approximation into a state-of-the-art MCMC method, Hamiltonian Monte Carlo (HMC), to reduce the required expensive computation involved in the sampling procedure, which is the bottleneck for many applications of HMC in big data problems. To this end, we exploit the regularity in parameter space to construct a free-form approximation of the target distribution by a fast and flexible surrogate function using an optimized additive model of proper random basis, which can also be viewed as a single-hidden layer feedforward neural network. The surrogate function provides sufficiently accurate approximation while allowing for fast computation in the sampling procedure, resulting in an efficient approximate Bayesian inference algorithm. We demonstrate the advantages of our proposed method using both synthetic and real data problems.
</p>projecteuclid.org/euclid.ba/1500948232_20180326220157Mon, 26 Mar 2018 22:01 EDTMerging MCMC Subposteriors through Gaussian-Process Approximationshttps://projecteuclid.org/euclid.ba/1502265628<strong>Christopher Nemeth</strong>, <strong>Chris Sherlock</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 2, 507--530.</p><p><strong>Abstract:</strong><br/>
Markov chain Monte Carlo (MCMC) algorithms have become powerful tools for Bayesian inference. However, they do not scale well to large-data problems. Divide-and-conquer strategies, which split the data into batches and, for each batch, run independent MCMC algorithms targeting the corresponding subposterior, can spread the computational burden across a number of separate computer cores. The challenge with such strategies is in recombining the subposteriors to approximate the full posterior. By creating a Gaussian-process approximation for each log-subposterior density we create a tractable approximation for the full posterior. This approximation is exploited through three methodologies: firstly a Hamiltonian Monte Carlo algorithm targeting the expectation of the posterior density provides a sample from an approximation to the posterior; secondly, evaluating the true posterior at the sampled points leads to an importance sampler that, asymptotically, targets the true posterior expectations; finally, an alternative importance sampler uses the full Gaussian-process distribution of the approximation to the log-posterior density to re-weight any initial sample and provide both an estimate of the posterior expectation and a measure of the uncertainty in it.
</p>projecteuclid.org/euclid.ba/1502265628_20180326220157Mon, 26 Mar 2018 22:01 EDTModeling Skewed Spatial Data Using a Convolution of Gaussian and Log-Gaussian Processeshttps://projecteuclid.org/euclid.ba/1502762659<strong>Hamid Zareifard</strong>, <strong>Majid Jafari Khaledi</strong>, <strong>Firoozeh Rivaz</strong>, <strong>Mohammad Q. Vahidi-Asl</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 13, Number 2, 531--557.</p><p><strong>Abstract:</strong><br/>
In spatial statistics, it is usual to consider a Gaussian process for spatial latent variables. As the data often exhibit non-normality, we introduce a novel skew process, named hereafter the Gaussian–log-Gaussian convolution (GLGC), to construct latent spatial models which provide great flexibility in capturing skewness. Some properties including closed-form expressions for the moments and the skewness of the GLGC process are derived. Particularly, we show that the mean square continuity and differentiability of the GLGC process are established by those of the Gaussian and log-Gaussian processes considered in its structure. Moreover, the usefulness of the proposed approach is demonstrated through the analysis of spatial data, including mixed ordinal and continuous outcomes that are jointly modeled through a common latent process. A fully Bayesian analysis is adopted to make inference. Our methodology is illustrated with simulation experiments as well as an environmental data set.
</p>projecteuclid.org/euclid.ba/1502762659_20180326220157Mon, 26 Mar 2018 22:01 EDT