Bayesian Analysis Articles (Project Euclid)
http://projecteuclid.org/euclid.ba
The latest articles from Bayesian Analysis on Project Euclid, a site for mathematics and statistics resources.
en-us
Copyright 2012 Cornell University Library
Euclid-L@cornell.edu (Project Euclid Team)
Wed, 13 Jun 2012 14:27 EDT
http://projecteuclid.org/collection/euclid/images/logo_linking_100.gif
Project Euclid
http://projecteuclid.org/
Separable covariance arrays via the Tucker product, with applications to multivariate
relational data
http://projecteuclid.org/euclid.ba/1339612040
<strong>Peter D. Hoff</strong><p><strong>Source: </strong>Bayesian Anal., Volume 6, Number 2, 179--196.</p><p><strong>Abstract:</strong><br/>
Modern datasets are often in the form of matrices or arrays, potentially having
correlations along each set of data indices. For example, data involving repeated
measurements of several variables over time may exhibit temporal correlation as well as
correlation among the variables. A possible model for matrix-valued data is the class of
matrix normal distributions, which is parametrized by two covariance matrices, one for
each index set of the data. In this article we discuss an extension of the matrix normal
model to accommodate multidimensional data arrays, or tensors. We show how a particular
array-matrix product can be used to generate the class of array normal distributions
having separable covariance structure. We derive some properties of these covariance
structures and the corresponding array normal distributions, and show how the array-matrix
product can be used to define a semi-conjugate prior distribution and calculate the
corresponding posterior distribution. We illustrate the methodology in an analysis of
multivariate longitudinal network data which take the form of a four-way array.
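The separable structure described above can be checked numerically. The following is a minimal sketch, assuming NumPy; the helper names `mode_product` and `tucker_product` and the dimensions are illustrative choices, not taken from the article. It multiplies an array of independent standard normal entries by one matrix per mode and confirms that the flattened result equals a Kronecker product of those matrices applied to the flattened input, which is what makes the covariance separable.

```python
import numpy as np

def mode_product(Z, A, mode):
    """Multiply array Z by matrix A along axis `mode`."""
    moved = np.moveaxis(Z, mode, 0)            # bring the target axis to the front
    out = np.tensordot(A, moved, axes=(1, 0))  # contract A's columns with that axis
    return np.moveaxis(out, 0, mode)           # restore the original axis order

def tucker_product(Z, mats):
    """Apply one matrix per mode of Z, in mode order."""
    for mode, A in enumerate(mats):
        Z = mode_product(Z, A, mode)
    return Z

rng = np.random.default_rng(0)
dims = (2, 3, 4)                              # hypothetical three-way array
A = [rng.normal(size=(d, d)) for d in dims]   # factors with Sigma_k = A_k A_k^T
Z = rng.normal(size=dims)                     # iid standard normal entries
Y = tucker_product(Z, A)

# With NumPy's row-major flattening, vec(Y) = (A1 kron A2 kron A3) vec(Z),
# hence Cov[vec(Y)] = Sigma_1 kron Sigma_2 kron Sigma_3.
K = np.kron(A[0], np.kron(A[1], A[2]))
print(np.allclose(Y.ravel(), K @ Z.ravel()))
```

The order of the Kronecker factors depends on the flattening convention; a column-major vectorization would reverse it.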
</p>
Wed, 13 Jun 2012 14:27 EDT
Spatial Panel Data Model with Error Dependence: A Bayesian Separable Covariance Approach
http://projecteuclid.org/euclid.ba/1446124569
<strong>Samantha Leorato</strong>, <strong>Maura Mezzetti</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1035--1069.</p><p><strong>Abstract:</strong><br/>
A hierarchical Bayesian model for spatial panel data is proposed. The idea behind the proposed method is to analyze spatially dependent panel data by means of a separable covariance matrix. Let us denote the observations by $y_{it}$, for $i=1,\ldots,N$ regions and $t=1,\ldots,T$ times, and suppose the covariance matrix of $\mathbf{y}$, given a set of regressors, is written as a Kronecker product of a purely spatial and a purely temporal covariance. On the one hand, the structure of separable covariances dramatically reduces the number of parameters, while on the other hand, the lack of a structured pattern for spatial and temporal covariances permits capturing possible unknown dependencies (both in time and space). The use of the Bayesian approach allows one to overcome some of the difficulties of the classical (MLE or GMM based) approach. We present two illustrative examples: the estimation of cigarette price elasticity and of the determinants of the house price in 120 municipalities in the Province of Rome.
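To make the parameter reduction concrete, here is a small sketch assuming NumPy. The function names and the stacking order (regions within time periods) are assumptions made for illustration, not the authors' implementation. It counts the free parameters of an unstructured $NT \times NT$ covariance against those of two unstructured factors, and builds the Kronecker form.

```python
import numpy as np

def covariance_param_counts(n_regions, n_times):
    """Free parameters: unstructured NT x NT covariance vs. separable form."""
    nt = n_regions * n_times
    unstructured = nt * (nt + 1) // 2
    separable = (n_regions * (n_regions + 1) // 2
                 + n_times * (n_times + 1) // 2)
    return unstructured, separable

def separable_covariance(sigma_space, sigma_time):
    """Covariance of y stacked region-within-time: Sigma_T kron Sigma_S."""
    return np.kron(sigma_time, sigma_space)

# Hypothetical sizes: 4 regions observed at 3 time points.
print(covariance_param_counts(4, 3))
print(separable_covariance(np.eye(4), np.eye(3)).shape)
```

The separable count is an upper bound: the two factors are identified only up to a scalar, since multiplying one by $c$ and dividing the other by $c$ leaves the product unchanged.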
</p>
Tue, 06 Sep 2016 14:49 EDT
Scale-Dependent Priors for Variance Parameters in Structured Additive Distributional Regression
http://projecteuclid.org/euclid.ba/1448323525
<strong>Nadja Klein</strong>, <strong>Thomas Kneib</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1071--1106.</p><p><strong>Abstract:</strong><br/>
The selection of appropriate hyperpriors for variance parameters is an important and sensible topic in all kinds of Bayesian regression models involving the specification of (conditionally) Gaussian prior structures where the variance parameters determine a data-driven, adaptive amount of prior variability or precision. We consider the special case of structured additive distributional regression where Gaussian priors are used to enforce specific properties such as smoothness or shrinkage on various effect types combined in predictors for multiple parameters related to the distribution of the response. Relying on a recently proposed class of penalised complexity priors motivated from a general set of construction principles, we derive a hyperprior structure where prior elicitation is facilitated by assumptions on the scaling of the different effect types. The posterior distribution is assessed with an adaptive Markov chain Monte Carlo scheme and conditions for its propriety are studied theoretically. We investigate the new type of scale-dependent priors in simulations and two challenging applications, in particular in comparison to the standard inverse gamma priors but also alternatives such as half-normal, half-Cauchy and proper uniform priors for standard deviations.
</p>
Tue, 06 Sep 2016 14:49 EDT
New Classes of Priors Based on Stochastic Orders and Distortion Functions
http://projecteuclid.org/euclid.ba/1448590531
<strong>J. Pablo Arias-Nicolás</strong>, <strong>Fabrizio Ruggeri</strong>, <strong>Alfonso Suárez-Llorens</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1107--1136.</p><p><strong>Abstract:</strong><br/>
In the context of robust Bayesian analysis, we introduce a new class of prior distributions based on stochastic orders and distortion functions. We provide the new definition, its interpretation and the main properties and we also study the relationship with other classical classes of prior beliefs. We also consider Kolmogorov and Kantorovich metrics to measure the uncertainty induced by such a class, as well as its effect on the set of corresponding Bayes actions. Finally, we conclude the paper with some numerical examples.
</p>
Tue, 06 Sep 2016 14:49 EDT
Bayesian Semiparametric Inference on Functional Relationships in Linear Mixed Models
http://projecteuclid.org/euclid.ba/1448852253
<strong>Seonghyun Jeong</strong>, <strong>Taeyoung Park</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1137--1163.</p><p><strong>Abstract:</strong><br/>
Regression models with varying coefficients changing over certain underlying covariates offer great flexibility in capturing a functional relationship between the response and other covariates. This article extends such regression models to include random effects and to account for correlation and heteroscedasticity in error terms, and proposes an efficient new data-driven method to estimate varying regression coefficients via reparameterization and partial collapse. The proposed methodology is illustrated with a simulated study and longitudinal data from a study of soybean growth.
</p>
Tue, 06 Sep 2016 14:49 EDT
A New Family of Non-Local Priors for Chain Event Graph Model Selection
http://projecteuclid.org/euclid.ba/1448852254
<strong>Rodrigo A. Collazo</strong>, <strong>Jim Q. Smith</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1165--1201.</p><p><strong>Abstract:</strong><br/>
Chain Event Graphs (CEGs) are a rich and demonstrably useful class of graphical models. The class contains discrete Bayesian Networks as a special case and is able to depict directly the asymmetric context-specific statements in the model. But bespoke efficient algorithms now need to be developed to search the enormous CEG model space. In other contexts, Bayes Factor scored search algorithms using non-local priors (NLPs) have recently proved very successful for searching other huge model spaces. Here we define and explore three different types of NLP that we customise to search CEG spaces. We demonstrate how one of these candidate NLPs provides a framework for search which is both robust and computationally efficient. It also avoids selecting an overfitting model, as the standard conjugate methods sometimes do. We illustrate the efficacy of our methods with two examples. First we analyse a previously well-studied 5-year longitudinal study of childhood hospitalisation. The second, much larger example selects between competing models of prisoners’ radicalisation in British prisons: because of its size, an application beyond the scope of earlier Bayes Factor search algorithms.
</p>
Tue, 29 Nov 2016 22:02 EST
Bayesian Analysis of Continuous Time Markov Chains with Application to Phylogenetic Modelling
http://projecteuclid.org/euclid.ba/1448899900
<strong>Tingting Zhao</strong>, <strong>Ziyu Wang</strong>, <strong>Alexander Cumberworth</strong>, <strong>Joerg Gsponer</strong>, <strong>Nando de Freitas</strong>, <strong>Alexandre Bouchard-Côté</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1203--1237.</p><p><strong>Abstract:</strong><br/>
Bayesian analysis of continuous time, discrete state space time series is an important and challenging problem, where incomplete observation and large parameter sets call for user-defined priors based on known properties of the process. Generalized linear models have a largely unexplored potential to construct such prior distributions. We show that an important challenge with Bayesian generalized linear modelling of continuous time Markov chains is that classical Markov chain Monte Carlo techniques are too ineffective to be practical in that setup. We address this issue using an auxiliary variable construction combined with an adaptive Hamiltonian Monte Carlo algorithm. This sampling algorithm and model make it efficient both in terms of computation and analyst’s time to construct stochastic processes informed by prior knowledge, such as known properties of the states of the process. We demonstrate the flexibility and scalability of our framework using synthetic and real phylogenetic protein data, where a prior based on amino acid physicochemical properties is constructed to obtain accurate rate matrix estimates.
</p>
Tue, 29 Nov 2016 22:02 EST
Bayesian Solution Uncertainty Quantification for Differential Equations
http://projecteuclid.org/euclid.ba/1473276259
<strong>Oksana A. Chkrebtii</strong>, <strong>David A. Campbell</strong>, <strong>Ben Calderhead</strong>, <strong>Mark A. Girolami</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1239--1267.</p><p><strong>Abstract:</strong><br/>
We explore probability modelling of discretization uncertainty for system states defined implicitly by ordinary or partial differential equations. Accounting for this uncertainty can avoid posterior under-coverage when likelihoods are constructed from a coarsely discretized approximation to system equations. A formalism is proposed for inferring a fixed but a priori unknown model trajectory through Bayesian updating of a prior process conditional on model information. A one-step-ahead sampling scheme for interrogating the model is described, its consistency and first order convergence properties are proved, and its computational complexity is shown to be proportional to that of numerical explicit one-step solvers. Examples illustrate the flexibility of this framework to deal with a wide variety of complex and large-scale systems. Within the calibration problem, discretization uncertainty defines a layer in the Bayesian hierarchy, and a Markov chain Monte Carlo algorithm that targets this posterior distribution is presented. This formalism is used for inference on the JAK-STAT delay differential equation model of protein dynamics from indirectly observed measurements. The discussion outlines implications for the new field of probabilistic numerics.
</p>
Tue, 29 Nov 2016 22:02 EST
Comment on Article by Chkrebtii, Campbell, Calderhead, and Girolami
http://projecteuclid.org/euclid.ba/1480474948
<strong>Martin Lysy</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1269--1273.</p><p><strong>Abstract:</strong><br/>
The authors present an ingenious probabilistic numerical solver for deterministic differential equations (DEs). The true solution is progressively identified via model interrogations, in a formal framework of Bayesian updating. I have attempted to extend the authors’ ideas to stochastic differential equations (SDEs), and discuss two challenges encountered in this endeavor: (i) the non-differentiability of SDE sample paths, and (ii) the sampling of diffusion bridges, typically required of solutions to the SDE inverse problem.
</p>
Tue, 29 Nov 2016 22:02 EST
Comment on Article by Chkrebtii, Campbell, Calderhead, and Girolami
http://projecteuclid.org/euclid.ba/1480474949
<strong>Sarat C. Dass</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1275--1277.</p>
Tue, 29 Nov 2016 22:02 EST
Comment on Article by Chkrebtii, Campbell, Calderhead, and Girolami
http://projecteuclid.org/euclid.ba/1479805385
<strong>Bani K. Mallick</strong>, <strong>Keren Yang</strong>, <strong>Nilabja Guha</strong>, <strong>Yalchin Efendiev</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1279--1284.</p><p><strong>Abstract:</strong><br/>
This note is a discussion of the article “Bayesian Solution Uncertainty Quantification for Differential Equations” by Chkrebtii, Campbell, Calderhead, and Girolami. The authors propose stochastic models for differential equation discretizations. While appreciating the main concepts, we point out some possible extensions and modifications.
</p>
Tue, 29 Nov 2016 22:02 EST
Contributed Discussion on Article by Chkrebtii, Campbell, Calderhead, and Girolami
http://projecteuclid.org/euclid.ba/1480474950
<strong>François-Xavier Briol</strong>, <strong>Jon Cockayne</strong>, <strong>Onur Teymur</strong>, <strong>William Weimin Yoo</strong>, <strong>Jon Cockayne</strong>, <strong>Michael Schober</strong>, <strong>Philipp Hennig</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1285--1293.</p>
Tue, 29 Nov 2016 22:02 EST
Rejoinder
http://projecteuclid.org/euclid.ba/1480129462
<strong>Oksana A. Chkrebtii</strong>, <strong>David A. Campbell</strong>, <strong>Ben Calderhead</strong>, <strong>Mark A. Girolami</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 11, Number 4, 1295--1299.</p>
Tue, 29 Nov 2016 22:02 EST
Bayesian Inference and Model Assessment for Spatial Point Patterns Using Posterior Predictive Samples
http://projecteuclid.org/euclid.ba/1448899901
<strong>Thomas J. Leininger</strong>, <strong>Alan E. Gelfand</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 1--30.</p><p><strong>Abstract:</strong><br/>
Spatial point pattern data describes locations of events observed over a given domain, with the number of and locations of these events being random. Historically, data analysis for spatial point patterns has focused on rejecting complete spatial randomness and then on fitting a richer model specification. From a Bayesian standpoint, the literature is growing but primarily considers versions of Poisson processes, focusing on specifications for the intensity. However, the Bayesian literature on, e.g., clustering or inhibition processes is limited, primarily attending to model fitting. There is little attention given to full inference and scant with regard to model adequacy or model comparison.
The contribution here is full Bayesian analysis, implemented through generation of posterior point patterns using composition. Model features, hence broad inference, can be explored through functions of these samples. The approach is general, applicable to any generative model for spatial point patterns.
The approach is also useful in considering model criticism and model selection both in-sample and, when possible, out-of-sample. Here, we adapt or extend familiar tools. In particular, for model criticism, we consider Bayesian residuals, realized and predictive, along with empirical coverage and prior predictive checks through Monte Carlo tests. For model choice, we propose strategies using predictive mean square error, empirical coverage, and ranked probability scores. For simplicity, we illustrate these methods with standard models such as Poisson processes, log-Gaussian Cox processes, and Gibbs processes. The utility of our approach is demonstrated using a simulation study and two real datasets.
</p>
Tue, 17 Jan 2017 22:00 EST
Bayesian Two-Stage Design for Phase II Clinical Trials with Switching Hypothesis Tests
http://projecteuclid.org/euclid.ba/1450456405
<strong>Haolun Shi</strong>, <strong>Guosheng Yin</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 31--51.</p><p><strong>Abstract:</strong><br/>
Conventional phase II clinical trials use either a single-arm or a double-arm scheme to examine the treatment effect of an investigational drug. The hypothesis tests under these two schemes are different, as a single-arm study usually tests the response rate of the new drug against a set of fixed reference rates and a double-arm randomized trial compares the new drug with the standard treatment or placebo. To bridge the single- and double-arm schemes in one phase II clinical trial, we propose a Bayesian two-stage design with changing hypothesis tests. Stage 1 enrolls patients solely to the experimental arm to make a comparison with the reference rates, and stage 2 imposes a double-arm comparison of the experimental arm with the control arm. The design is calibrated with respect to error rates from both the frequentist and Bayesian perspectives. Moreover, we control the “type III error rate”, defined as the probability of prematurely stopping the trial at stage 1 when the trial is supposed to move on to stage 2. We conduct extensive simulations on the calculations of these error rates to examine the operational characteristics of our proposed method, and illustrate it with a non-small cell lung cancer trial.
</p>
Tue, 17 Jan 2017 22:00 EST
Posterior Concentration Rates for Counting Processes with Aalen Multiplicative Intensities
http://projecteuclid.org/euclid.ba/1451333725
<strong>Sophie Donnet</strong>, <strong>Vincent Rivoirard</strong>, <strong>Judith Rousseau</strong>, <strong>Catia Scricciolo</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 53--87.</p><p><strong>Abstract:</strong><br/>
We provide sufficient conditions to derive posterior concentration rates for Aalen counting processes on a finite time horizon. The conditions are designed to resemble those proposed in the literature for the problem of density estimation, for instance, in Ghosal et al. (2000), so that existing results on density estimation can be adapted to the present setting. We apply the general theorem to some prior models including Dirichlet process mixtures of uniform densities to estimate monotone nondecreasing intensities and log-splines.
</p>
Tue, 17 Jan 2017 22:00 EST
Bayesian Nonparametric Tests via Sliced Inverse Modeling
http://projecteuclid.org/euclid.ba/1453211961
<strong>Bo Jiang</strong>, <strong>Chao Ye</strong>, <strong>Jun S. Liu</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 89--112.</p><p><strong>Abstract:</strong><br/>
We study the problem of independence and conditional independence tests between categorical covariates and a continuous response variable, which has an immediate application in genetics. Instead of estimating the conditional distribution of the response given values of covariates, we model the conditional distribution of covariates given the discretized response (aka “slices”). By assigning a prior probability to each possible discretization scheme, we can compute efficiently a Bayes factor (BF)-statistic for the independence (or conditional independence) test using a dynamic programming algorithm. Asymptotic and finite-sample properties such as power and null distribution of the BF statistic are studied, and a stepwise variable selection method based on the BF statistic is further developed. We compare the BF statistic with some existing classical methods and demonstrate its statistical power through extensive simulation studies. We apply the proposed method to a mouse genetics data set aiming to detect quantitative trait loci (QTLs) and obtain promising results.
</p>
Tue, 17 Jan 2017 22:00 EST
The General Projected Normal Distribution of Arbitrary Dimension: Modeling and Bayesian Inference
http://projecteuclid.org/euclid.ba/1453211962
<strong>Daniel Hernandez-Stumpfhauser</strong>, <strong>F. Jay Breidt</strong>, <strong>Mark J. van der Woerd</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 113--133.</p><p><strong>Abstract:</strong><br/>
The general projected normal distribution is a simple and intuitive model for directional data in any dimension: a multivariate normal random vector divided by its length is the projection of that vector onto the surface of the unit hypersphere. Observed data consist of the projections, but not the lengths. Inference for this model has been restricted to the two-dimensional (circular) case, using Bayesian methods with data augmentation to generate the latent lengths and a Metropolis-within-Gibbs algorithm to sample from the posterior. We describe a new parameterization of the general projected normal distribution that makes inference in any dimension tractable, including the important three-dimensional (spherical) case, which has not previously been considered. Under this new parameterization, the full conditionals of the unknown parameters have closed forms, and we propose a new slice sampler to draw the latent lengths without the need for rejection. Gibbs sampling with this new scheme is fast and easy, leading to improved Bayesian inference; for example, it is now feasible to conduct model selection among complex mixture and regression models for large data sets. Our parameterization also allows straightforward incorporation of covariates into the covariance matrix of the multivariate normal, increasing the ability of the model to explain directional data as a function of independent regressors. Circular and spherical cases are considered in detail and illustrated with scientific applications. For the circular case, seasonal variation in time-of-day departures of anglers from recreational fishing sites is modeled using covariates in both the mean vector and covariance matrix. For the spherical case, we consider paired angles that describe the relative positions of carbon atoms along the backbone chain of a protein. 
We fit mixtures of general projected normals to these data, with the best-fitting mixture accurately describing biologically meaningful structures including helices, $\beta$-sheets, and coils and turns. Finally, we show via simulation that our methodology has satisfactory performance in some 10-dimensional and 50-dimensional problems.
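The generative definition at the start of this abstract (a multivariate normal vector divided by its length) is short enough to sketch directly. The snippet below assumes NumPy; the function name and the example mean and covariance are hypothetical. It returns both the observed directions and the latent lengths that, per the abstract, are not observed and must be imputed during inference.

```python
import numpy as np

def sample_projected_normal(mean, cov, size, rng):
    """Draw x ~ N(mean, cov) and project each draw onto the unit hypersphere."""
    x = rng.multivariate_normal(mean, cov, size=size)
    lengths = np.linalg.norm(x, axis=1, keepdims=True)  # latent, unobserved in practice
    return x / lengths, lengths.ravel()

rng = np.random.default_rng(0)
mean = np.array([1.0, 0.0, 0.5])        # hypothetical spherical (3-D) example
cov = np.array([[1.0, 0.3, 0.0],
                [0.3, 2.0, 0.0],
                [0.0, 0.0, 0.5]])
directions, lengths = sample_projected_normal(mean, cov, size=1000, rng=rng)
print(directions.shape, np.allclose(np.linalg.norm(directions, axis=1), 1.0))
```

This only simulates from the model; it does not implement the reparameterization or the slice sampler proposed in the article.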
</p>
Tue, 17 Jan 2017 22:00 EST
Hierarchical Shrinkage Priors for Regression Models
http://projecteuclid.org/euclid.ba/1453211963
<strong>Jim Griffin</strong>, <strong>Phil Brown</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 135--159.</p><p><strong>Abstract:</strong><br/>
In some linear models, such as those with interactions, it is natural to include the relationship between the regression coefficients in the analysis. In this paper, we consider how robust hierarchical continuous prior distributions can be used to express dependence between the size but not the sign of the regression coefficients, for example to include ideas of heredity in the analysis of linear models with interactions. We develop a simple method for controlling the shrinkage of regression effects to zero at different levels of the hierarchy by considering the behaviour of the continuous prior at zero. Applications to linear models with interactions and generalized additive models are used as illustrations.
</p>
Tue, 17 Jan 2017 22:00 EST
Bayesian Endogenous Tobit Quantile Regression
http://projecteuclid.org/euclid.ba/1455559718
<strong>Genya Kobayashi</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 161--191.</p><p><strong>Abstract:</strong><br/>
This study proposes $p$-th Tobit quantile regression models with endogenous variables. In the first stage regression of the endogenous variable on the exogenous variables, the assumption that the $\alpha$-th quantile of the error term is zero is introduced. Then, the residual of this regression model is included in the $p$-th quantile regression model in such a way that the $p$-th conditional quantile of the new error term is zero. The error distribution of the first stage regression is modelled around the zero $\alpha$-th quantile assumption by using parametric and semiparametric approaches. Since the value of $\alpha$ is a priori unknown, it is treated as an additional parameter and is estimated from the data. The proposed models are then demonstrated by using simulated data and real data on the labour supply of married women.
</p>
Tue, 17 Jan 2017 22:00 EST
Bayesian Detection of Abnormal Segments in Multiple Time Series
http://projecteuclid.org/euclid.ba/1456235761
<strong>Lawrence Bardwell</strong>, <strong>Paul Fearnhead</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 193--218.</p><p><strong>Abstract:</strong><br/>
We present a novel Bayesian approach to analysing multiple time-series with the aim of detecting abnormal regions. These are regions where the properties of the data change from some normal or baseline behaviour. We allow for the possibility that such changes will only be present in a, potentially small, subset of the time-series. We develop a general model for this problem, and show how it is possible to accurately and efficiently perform Bayesian inference, based upon recursions that enable independent sampling from the posterior distribution. A motivating application for this problem comes from detecting copy number variation (CNVs), using data from multiple individuals. Pooling information across individuals can increase the power of detecting CNVs, but often a specific CNV will only be present in a small subset of the individuals. We evaluate the Bayesian method on both simulated and real CNV data, and give evidence that this approach is more accurate than a recently proposed method for analysing such data.
</p>
Tue, 17 Jan 2017 22:00 EST
Adaptive Empirical Bayesian Smoothing Splines
http://projecteuclid.org/euclid.ba/1457383100
<strong>Paulo Serra</strong>, <strong>Tatyana Krivobokova</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 219--238.</p><p><strong>Abstract:</strong><br/>
In this paper we develop and study adaptive empirical Bayesian smoothing splines. These are smoothing splines with both smoothing parameter and penalty order determined via the empirical Bayes method from the marginal likelihood of the model. The selected order and smoothing parameter are used to construct adaptive credible sets with good frequentist coverage for the underlying regression function. We use these credible sets as a proxy to show the superior performance of adaptive empirical Bayesian smoothing splines compared to frequentist smoothing splines.
</p>
Tue, 17 Jan 2017 22:00 EST
Towards a Multidimensional Approach to Bayesian Disease Mapping
http://projecteuclid.org/euclid.ba/1458324098
<strong>Miguel A. Martinez-Beneito</strong>, <strong>Paloma Botella-Rocamora</strong>, <strong>Sudipto Banerjee</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 239--259.</p><p><strong>Abstract:</strong><br/>
Multivariate disease mapping enriches traditional disease mapping studies by analysing several diseases jointly. This yields improved estimates of the geographical distribution of risk from the diseases by enabling borrowing of information across diseases. Beyond multivariate smoothing for several diseases, several other variables, such as sex, age group, race, time period, and so on, could also be jointly considered to derive multivariate estimates. The resulting multivariate structures should induce an appropriate covariance model for the data. In this paper, we introduce a formal framework for the analysis of multivariate data arising from the combination of more than two variables (geographical units and at least two more variables), what we have called Multidimensional Disease Mapping. We develop a theoretical framework containing both separable and non-separable dependence structures and illustrate its performance on the study of real mortality data in Comunitat Valenciana (Spain).
</p>
Tue, 17 Jan 2017 22:00 EST
Estimating the Marginal Likelihood Using the Arithmetic Mean Identity
http://projecteuclid.org/euclid.ba/1459772735
<strong>Anna Pajor</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 261--287.</p><p><strong>Abstract:</strong><br/>
In this paper we propose a conceptually straightforward method to estimate the marginal data density value (also called the marginal likelihood). We show that the marginal likelihood is equal to the prior mean of the conditional density of the data given the vector of parameters restricted to a certain subset of the parameter space, $A$, times the reciprocal of the posterior probability of the subset $A$. This identity motivates the use of an Arithmetic Mean estimator based on simulation from the prior distribution restricted to any (but reasonable) subset of the space of parameters. By trimming this space, regions of relatively low likelihood are removed, and thereby the efficiency of the Arithmetic Mean estimator is improved. We show that the adjusted Arithmetic Mean estimator is unbiased and consistent.
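In symbols, the identity stated above reads $p(y) = \frac{1}{P(A \mid y)} \int_A p(y \mid \theta)\, p(\theta)\, d\theta$. The following is a minimal sketch of it on a toy conjugate model, assuming NumPy and the standard-library `statistics.NormalDist`. The model ($y_i \sim N(\theta, 1)$, $\theta \sim N(0, \tau^2)$), the data, the choice of $A$ as a posterior interval, and the function names are all assumptions made for illustration. Because the toy model is conjugate, $P(A \mid y)$ is computed exactly here; in general it would have to be estimated, for instance from posterior draws.

```python
import numpy as np
from statistics import NormalDist

def log_marginal_exact(y, tau2):
    """Closed-form log p(y) for y_i ~ N(theta, 1), theta ~ N(0, tau2)."""
    n, s, ss = len(y), y.sum(), (y ** 2).sum()
    quad = ss - tau2 / (1.0 + n * tau2) * s ** 2
    return -0.5 * n * np.log(2 * np.pi) - 0.5 * np.log(1.0 + n * tau2) - 0.5 * quad

def log_marginal_arithmetic_mean(y, tau2, n_draws=200_000, width=2.0, seed=0):
    """Prior mean of likelihood restricted to A, divided by P(A | y)."""
    rng = np.random.default_rng(seed)
    n = len(y)
    post_var = 1.0 / (n + 1.0 / tau2)
    post_mean = post_var * y.sum()
    post_sd = float(np.sqrt(post_var))
    a, b = post_mean - width * post_sd, post_mean + width * post_sd  # the subset A
    post = NormalDist(float(post_mean), post_sd)
    post_prob_A = post.cdf(float(b)) - post.cdf(float(a))            # exact here

    theta = rng.normal(0.0, np.sqrt(tau2), size=n_draws)             # prior draws
    in_A = (theta >= a) & (theta <= b)
    loglik = (-0.5 * n * np.log(2 * np.pi)
              - 0.5 * ((y[None, :] - theta[:, None]) ** 2).sum(axis=1))
    restricted_prior_mean = np.mean(np.exp(loglik) * in_A)
    return np.log(restricted_prior_mean) - np.log(post_prob_A)

y = np.array([0.3, -0.1, 0.8, 0.5, 0.2])   # hypothetical data
print(log_marginal_exact(y, tau2=9.0))
print(log_marginal_arithmetic_mean(y, tau2=9.0))
```

This sketch draws from the full prior and zeroes out draws outside $A$, which targets the same integral as sampling from the prior restricted to $A$; it is not the adjusted estimator analysed in the article.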
</p>
Fri, 03 Mar 2017 22:07 EST
Adapting the ABC Distance Function
http://projecteuclid.org/euclid.ba/1460641065
<strong>Dennis Prangle</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 1, 289--309.</p><p><strong>Abstract:</strong><br/>
Approximate Bayesian computation performs approximate inference for models where likelihood computations are expensive or impossible. Instead simulations from the model are performed for various parameter values and accepted if they are close enough to the observations. There has been much progress on deciding which summary statistics of the data should be used to judge closeness, but less work on how to weight them. Typically weights are chosen at the start of the algorithm which normalise the summary statistics to vary on similar scales. However these may not be appropriate in iterative ABC algorithms, where the distribution from which the parameters are proposed is updated. This can substantially alter the resulting distribution of summary statistics, so that different weights are needed for normalisation. This paper presents two iterative ABC algorithms which adaptively update their weights and demonstrates improved results on test applications.
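The baseline that this abstract starts from, a single round of rejection ABC with summary statistics normalised to similar scales, can be sketched as follows, assuming NumPy. The toy model, the weights (reciprocal standard deviation of the simulated summaries) and all function names are assumptions for illustration; the adaptive, iterative reweighting proposed in the article is not implemented here.

```python
import numpy as np

def scale_weights(sim_summaries):
    """Weights 1/sd, so each summary varies on a similar scale."""
    return 1.0 / sim_summaries.std(axis=0, ddof=1)

def weighted_distance(sim_summaries, obs_summary, weights):
    """Weighted Euclidean distance between simulated and observed summaries."""
    diff = (sim_summaries - obs_summary) * weights
    return np.sqrt((diff ** 2).sum(axis=1))

def abc_rejection(simulate, prior_draw, obs_summary, n_sims, keep_frac, rng):
    """Keep the parameter draws whose simulations fall closest to the data."""
    thetas = np.array([prior_draw(rng) for _ in range(n_sims)])
    sims = np.array([simulate(t, rng) for t in thetas])
    w = scale_weights(sims)
    d = weighted_distance(sims, obs_summary, w)
    keep = d <= np.quantile(d, keep_frac)
    return thetas[keep], w

# Toy model: the first summary is informative about theta, the second is
# pure noise on a much larger scale and would dominate an unweighted distance.
rng = np.random.default_rng(1)
prior_draw = lambda rng: rng.uniform(0.0, 10.0)
simulate = lambda theta, rng: np.array([rng.normal(theta, 1.0, size=20).mean(),
                                        1000.0 * rng.normal()])
obs = np.array([5.0, 0.0])
accepted, w = abc_rejection(simulate, prior_draw, obs,
                            n_sims=2000, keep_frac=0.05, rng=rng)
print(len(accepted), accepted.mean(), w)
```

In an iterative scheme the proposal distribution changes between rounds, so weights computed once from the first round may no longer normalise the summaries, which is the issue the abstract raises.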
</p>projecteuclid.org/euclid.ba/1460641065_20170303220726Fri, 03 Mar 2017 22:07 ESTBayesian Estimation of Principal Components for Functional Datahttp://projecteuclid.org/euclid.ba/1461092217<strong>Adam J. Suarez</strong>, <strong>Subhashis Ghosal</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 311--333.</p><p><strong>Abstract:</strong><br/>
The area of principal components analysis (PCA) has seen relatively few contributions from the Bayesian school of inference. In this paper, we propose a Bayesian method for PCA in the case of functional data observed with error. We suggest modeling the covariance function by use of an approximate spectral decomposition, leading to easily interpretable parameters. We perform model selection, both over the number of principal components and the number of basis functions used in the approximation. We study in depth the choice of using the implied distributions arising from the inverse Wishart prior and prove a convergence theorem for the case of an exact finite dimensional representation. We also discuss computational issues as well as the care needed in choosing hyperparameters. A simulation study is used to demonstrate competitive performance against a recent frequentist procedure, particularly in terms of the principal component estimation. Finally, we apply the method to a real dataset, where we also incorporate model selection on the dimension of the finite basis used for modeling.
</p>projecteuclid.org/euclid.ba/1461092217_20170308040102Wed, 08 Mar 2017 04:01 ESTBayesian Functional Data Modeling for Heterogeneous Volatilityhttp://projecteuclid.org/euclid.ba/1461603846<strong>Bin Zhu</strong>, <strong>David B. Dunson</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 335--350.</p><p><strong>Abstract:</strong><br/>
Although there are many methods for functional data analysis, less emphasis is put on characterizing variability among volatilities of individual functions. In particular, certain individuals exhibit erratic swings in their trajectory while other individuals have more stable trajectories. There is evidence of such volatility heterogeneity in blood pressure trajectories during pregnancy, for example, and reason to suspect that volatility is a biologically important feature. Most functional data analysis models implicitly assume similar or identical smoothness of the individual functions, and hence can lead to misleading inferences on volatility and an inadequate representation of the functions. We propose a novel class of functional data analysis models characterized using hierarchical stochastic differential equations. We model the derivatives of a mean function and deviation functions using Gaussian processes, while also allowing covariate dependence including on the volatilities of the deviation functions. Following a Bayesian approach to inference, a Markov chain Monte Carlo algorithm is used for posterior computation. The methods are tested on simulated data and applied to blood pressure trajectories during pregnancy.
</p>projecteuclid.org/euclid.ba/1461603846_20170308040102Wed, 08 Mar 2017 04:01 ESTLatent Space Approaches to Community Detection in Dynamic Networkshttp://projecteuclid.org/euclid.ba/1461603847<strong>Daniel K. Sewell</strong>, <strong>Yuguo Chen</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 351--377.</p><p><strong>Abstract:</strong><br/>
Embedding dyadic data into a latent space has long been a popular approach to modeling networks of all kinds. While clustering has been done using this approach for static networks, this paper gives two methods of community detection within dynamic network data, building upon the distance and projection models previously proposed in the literature. Our proposed approaches capture the time-varying aspect of the data, can model directed or undirected edges, inherently incorporate transitivity and account for each actor’s individual propensity to form edges. We provide Bayesian estimation algorithms, and apply these methods to a ranked dynamic friendship network and world export/import data.
</p>projecteuclid.org/euclid.ba/1461603847_20170308040102Wed, 08 Mar 2017 04:01 ESTDependent Species Sampling Models for Spatial Density Estimationhttp://projecteuclid.org/euclid.ba/1462297334<strong>Seongil Jo</strong>, <strong>Jaeyong Lee</strong>, <strong>Peter Müller</strong>, <strong>Fernando A. Quintana</strong>, <strong>Lorenzo Trippa</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 379--406.</p><p><strong>Abstract:</strong><br/>
We consider a novel Bayesian nonparametric model for density estimation with an underlying spatial structure. The model is built on a class of species sampling models, which are discrete random probability measures that can be represented as a mixture of random support points and random weights. Specifically, we construct a collection of spatially dependent species sampling models and propose a mixture model based on this collection. The key idea is the introduction of spatial dependence by modeling the weights through a conditional autoregressive model. We present an extensive simulation study to compare the performance of the proposed model with competitors. The proposed model compares favorably to these alternatives. We apply the method to the estimation of summer precipitation density functions using Climate Prediction Center Merged Analysis of Precipitation data over East Asia.
</p>projecteuclid.org/euclid.ba/1462297334_20170308040102Wed, 08 Mar 2017 04:01 ESTA Hierarchical Bayesian Setting for an Inverse Problem in Linear Parabolic PDEs with Noisy Boundary Conditionshttp://projecteuclid.org/euclid.ba/1463078272<strong>Fabrizio Ruggeri</strong>, <strong>Zaid Sawlan</strong>, <strong>Marco Scavino</strong>, <strong>Raul Tempone</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 407--433.</p><p><strong>Abstract:</strong><br/>
In this work we develop a Bayesian setting to infer unknown parameters in initial-boundary value problems related to linear parabolic partial differential equations. We realistically assume that the boundary data are noisy, for a given prescribed initial condition. We show how to derive the joint likelihood function for the forward problem, given some measurements of the solution field subject to Gaussian noise. Given Gaussian priors for the time-dependent Dirichlet boundary values, we analytically marginalize the joint likelihood using the linearity of the equation. Our hierarchical Bayesian approach is fully implemented in an example that involves the heat equation. In this example, the thermal diffusivity is the unknown parameter. We assume that the thermal diffusivity parameter can be modeled a priori through a lognormal random variable or by means of a space-dependent stationary lognormal random field. Synthetic data are used to test the inference. We exploit the behavior of the non-normalized log posterior distribution of the thermal diffusivity. Then, we use the Laplace method to obtain an approximated Gaussian posterior and therefore avoid costly Markov Chain Monte Carlo computations. Expected information gains and predictive posterior densities for observable quantities are numerically estimated using Laplace approximation for different experimental setups.
</p>projecteuclid.org/euclid.ba/1463078272_20170308040102Wed, 08 Mar 2017 04:01 ESTBayesian Inference for Diffusion-Driven Mixed-Effects Modelshttp://projecteuclid.org/euclid.ba/1464035697<strong>Gavin A. Whitaker</strong>, <strong>Andrew Golightly</strong>, <strong>Richard J. Boys</strong>, <strong>Chris Sherlock</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 435--463.</p><p><strong>Abstract:</strong><br/>
Stochastic differential equations (SDEs) provide a natural framework for modelling intrinsic stochasticity inherent in many continuous-time physical processes. When such processes are observed in multiple individuals or experimental units, SDE driven mixed-effects models allow the quantification of both between and within individual variation. Performing Bayesian inference for such models using discrete-time data that may be incomplete and subject to measurement error is a challenging problem and is the focus of this paper. We extend a recently proposed MCMC scheme to include the SDE driven mixed-effects framework. Fundamental to our approach is the development of a novel construct that allows for efficient sampling of conditioned SDEs that may exhibit nonlinear dynamics between observation times. We apply the resulting scheme to synthetic data generated from a simple SDE model of orange tree growth, and real data on aphid numbers recorded under a variety of different treatment regimes. In addition, we provide a systematic comparison of our approach with an inference scheme based on a tractable approximation of the SDE, that is, the linear noise approximation.
</p>projecteuclid.org/euclid.ba/1464035697_20170308040102Wed, 08 Mar 2017 04:01 ESTAutomated Parameter Blocking for Efficient Markov Chain Monte Carlo Samplinghttp://projecteuclid.org/euclid.ba/1464266500<strong>Daniel Turek</strong>, <strong>Perry de Valpine</strong>, <strong>Christopher J. Paciorek</strong>, <strong>Clifford Anderson-Bergman</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 465--490.</p><p><strong>Abstract:</strong><br/>
Markov chain Monte Carlo (MCMC) sampling is an important and commonly used tool for the analysis of hierarchical models. Nevertheless, practitioners generally have two options for MCMC: utilize existing software that generates a black-box “one size fits all” algorithm, or take on the challenging (and time-consuming) task of implementing a problem-specific MCMC algorithm. Either choice may result in inefficient sampling, and hence researchers have become accustomed to MCMC runtimes on the order of days (or longer) for large models. We propose an automated procedure to determine an efficient MCMC block-sampling algorithm for a given model and computing platform. Our procedure dynamically determines blocks of parameters for joint sampling that result in efficient MCMC sampling of the entire model. We test this procedure using a diverse suite of example models, and observe non-trivial improvements in MCMC efficiency for many models. Our procedure is the first attempt at such, and may be generalized to a broader space of MCMC algorithms. Our results suggest that substantive improvements in MCMC efficiency may be practically realized using our automated blocking procedure, or variants thereof, which warrants additional study and application.
</p>projecteuclid.org/euclid.ba/1464266500_20170308040102Wed, 08 Mar 2017 04:01 ESTDynamic Chain Graph Models for Time Series Network Datahttp://projecteuclid.org/euclid.ba/1466165926<strong>Osvaldo Anacleto</strong>, <strong>Catriona Queen</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 491--509.</p><p><strong>Abstract:</strong><br/>
This paper introduces a new class of Bayesian dynamic models for inference and forecasting in high-dimensional time series observed on networks. The new model, called the dynamic chain graph model, is suitable for multivariate time series which exhibit symmetries within subsets of series and a causal drive mechanism between these subsets. The model can accommodate high-dimensional, non-linear and non-normal time series and enables local and parallel computation by decomposing the multivariate problem into separate, simpler sub-problems of lower dimensions. The advantages of the new model are illustrated by forecasting traffic network flows and also modelling gene expression data from transcriptional networks.
</p>projecteuclid.org/euclid.ba/1466165926_20170308040102Wed, 08 Mar 2017 04:01 ESTMixtures of $g$-priors for analysis of variance models with a diverging number of parametershttp://projecteuclid.org/euclid.ba/1467722664<strong>Min Wang</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 511--532.</p><p><strong>Abstract:</strong><br/>
We consider Bayesian approaches for the hypothesis testing problem in analysis-of-variance (ANOVA) models. With the aid of the singular value decomposition of the centered design matrix, we reparameterize the ANOVA models with linear constraints for uniqueness into a standard linear regression model without any constraint. We derive the Bayes factors based on mixtures of $g$-priors and study their consistency properties with a growing number of parameters. It is shown that two commonly used hyper-priors on $g$ (the Zellner-Siow prior and the beta-prime prior) yield inconsistent Bayes factors due to the presence of an inconsistency region around the null model. We propose a new class of hyper-priors to avoid this inconsistency problem. Simulation studies on the two-way ANOVA models are conducted to compare the performance of the proposed procedures with that of some existing ones in the literature.
</p>projecteuclid.org/euclid.ba/1467722664_20170308040102Wed, 08 Mar 2017 04:01 ESTData-Dependent Posterior Propriety of a Bayesian Beta-Binomial-Logit Modelhttp://projecteuclid.org/euclid.ba/1469021382<strong>Hyungsuk Tak</strong>, <strong>Carl N. Morris</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 533--555.</p><p><strong>Abstract:</strong><br/>
A Beta-Binomial-Logit model is a Beta-Binomial model with covariate information incorporated via a logistic regression. Posterior propriety of a Bayesian Beta-Binomial-Logit model can be data-dependent for improper hyper-prior distributions. Various researchers in the literature have unknowingly used improper posterior distributions or have given incorrect statements about posterior propriety because checking posterior propriety can be challenging due to the complicated functional form of a Beta-Binomial-Logit model. We derive data-dependent necessary and sufficient conditions for posterior propriety within a class of hyper-prior distributions that encompass those used in previous studies. When a posterior is improper due to improper hyper-prior distributions, we suggest using proper hyper-prior distributions that can mimic the behaviors of improper choices.
</p>projecteuclid.org/euclid.ba/1469021382_20170308040102Wed, 08 Mar 2017 04:01 ESTVariational Bayes for Functional Data Registration, Smoothing, and Predictionhttp://projecteuclid.org/euclid.ba/1469553352<strong>Cecilia Earls</strong>, <strong>Giles Hooker</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 557--582.</p><p><strong>Abstract:</strong><br/>
We propose a model for functional data registration that extends current inferential capabilities for unregistered data by providing a flexible probabilistic framework that 1) allows for functional prediction in the context of registration and 2) can be adapted to include smoothing and registration in one model. The proposed inferential framework is a Bayesian hierarchical model where the registered functions are modeled as Gaussian processes. To address the computational demands of inference in high-dimensional Bayesian models, we propose an adapted form of the variational Bayes algorithm for approximate inference that performs similarly to Markov Chain Monte Carlo (MCMC) sampling methods for well-defined problems. The efficiency of the adapted variational Bayes (AVB) algorithm allows variability in a predicted registered, warping, and unregistered function to be depicted separately via bootstrapping. Temperature data related to the El Niño phenomenon is used to demonstrate the unique inferential capabilities for prediction provided by this model.
</p>projecteuclid.org/euclid.ba/1469553352_20170308040102Wed, 08 Mar 2017 04:01 ESTHigh-Dimensional Bayesian Geostatisticshttp://projecteuclid.org/euclid.ba/1494921642<strong>Sudipto Banerjee</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 2, 583--614.</p><p><strong>Abstract:</strong><br/>
With the growing capabilities of Geographic Information Systems (GIS) and user-friendly software, statisticians today routinely encounter geographically referenced data containing observations from a large number of spatial locations and time points. Over the last decade, hierarchical spatiotemporal process models have become widely deployed statistical tools for researchers to better understand the complex nature of spatial and temporal variability. However, fitting hierarchical spatiotemporal models often involves expensive matrix computations with complexity increasing in cubic order for the number of spatial locations and temporal points. This renders such models infeasible for large data sets. This article offers a focused review of two methods for constructing well-defined highly scalable spatiotemporal stochastic processes. Both these processes can be used as “priors” for spatiotemporal random fields. The first approach constructs a low-rank process operating on a lower-dimensional subspace. The second approach constructs a Nearest-Neighbor Gaussian Process (NNGP) that ensures sparse precision matrices for its finite realizations. Both processes can be exploited as a scalable prior embedded within a rich hierarchical modeling framework to deliver full Bayesian inference. These approaches can be described as model-based solutions for big spatiotemporal datasets. The models ensure that the algorithmic complexity is $\sim n$ floating point operations (flops) per iteration, where $n$ is the number of spatial locations. We compare these methods and provide some insight into their methodological underpinnings.
</p>projecteuclid.org/euclid.ba/1494921642_20170523220230Tue, 23 May 2017 22:02 EDTThe Scaled Beta2 Distribution as a Robust Prior for Scaleshttp://projecteuclid.org/euclid.ba/1469553353<strong>María-Eglée Pérez</strong>, <strong>Luis Raúl Pericchi</strong>, <strong>Isabel Cristina Ramírez</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 615--637.</p><p><strong>Abstract:</strong><br/>
We put forward the Scaled Beta2 (SBeta2) as a flexible and tractable family for modeling scales in both hierarchical and non-hierarchical settings. Various sensible alternatives to the overuse of vague Inverted Gamma priors have been proposed, mainly for hierarchical models. Several of these alternatives are particular cases of the SBeta2 or can be well approximated by it. This family of distributions can be obtained in closed form as a Gamma scale mixture of Gamma distributions, as the Student distribution can be obtained as a Gamma scale mixture of Normal variables. Members of the SBeta2 family arise as intrinsic priors and as divergence based priors in diverse situations, hierarchical and non-hierarchical.
The SBeta2 family unifies and generalizes different proposals in the Bayesian literature, and has numerous theoretical and practical advantages: it is flexible, its members can be lighter tailed than, as heavy tailed as, or heavier tailed than the half-Cauchy, and different behaviors at the origin can be modeled. It has the reciprocality property, i.e., if the variance parameter is in the family, the precision is too. It is easy to simulate from, and can be embedded in a Gibbs sampling scheme. Although not conjugate, it is also remarkably tractable: when coupled with a conditional Cauchy prior for locations, the marginal prior for locations can be found explicitly as proportional to known transcendental functions, and for integer values of the hyperparameters an analytical closed form exists. Furthermore, for specific choices of the hyperparameters, the marginal is found to be an explicit “horseshoe prior”, a class known to have excellent theoretical and practical properties. To our knowledge this is the first closed form horseshoe prior obtained. We also show that for certain values of the hyperparameters the mixture of a Normal and a Scaled Beta2 distribution also gives a closed form marginal.
Examples include robust normal and binomial hierarchical modeling and meta-analysis, with real and simulated data.
</p>projecteuclid.org/euclid.ba/1469553353_20170525220502Thu, 25 May 2017 22:05 EDTA Decision-Theoretic Comparison of Treatments to Resolve Air Leaks After Lung Surgery Based on Nonparametric Modelinghttp://projecteuclid.org/euclid.ba/1469553354<strong>Yanxun Xu</strong>, <strong>Peter F. Thall</strong>, <strong>Peter Müller</strong>, <strong>Mehran J. Reza</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 639--652.</p><p><strong>Abstract:</strong><br/>
We propose a Bayesian nonparametric utility-based group sequential design for a randomized clinical trial to compare a gel sealant to standard care for resolving air leaks after pulmonary resection. Clinically, resolving air leaks in the days soon after surgery is highly important, since longer resolution time produces undesirable complications that require extended hospitalization. The problem of comparing treatments is complicated by the fact that the resolution time distributions are skewed and multi-modal, so using means is misleading. We address these challenges by assuming Bayesian nonparametric probability models for the resolution time distributions and basing the comparative test on weighted means. The weights are elicited as clinical utilities of the resolution times. The proposed design uses posterior expected utilities as group sequential test criteria. The procedure’s frequentist properties are studied by extensive simulations.
</p>projecteuclid.org/euclid.ba/1469553354_20170525220502Thu, 25 May 2017 22:05 EDTNonparametric Goodness of Fit via Cross-Validation Bayes Factorshttp://projecteuclid.org/euclid.ba/1471454532<strong>Jeffrey D. Hart</strong>, <strong>Taeryon Choi</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 653--677.</p><p><strong>Abstract:</strong><br/>
A nonparametric Bayes procedure is proposed for testing the fit of a parametric model for a distribution. Alternatives to the parametric model are kernel density estimates. Data splitting makes it possible to use kernel estimates for this purpose in a Bayesian setting. A kernel estimate indexed by bandwidth is computed from one part of the data, a training set, and then used as a model for the rest of the data, a validation set. A Bayes factor is calculated from the validation set by comparing the marginal for the kernel model with the marginal for the parametric model of interest. A simulation study is used to investigate how large the training set should be, and examples involving astronomy and wind data are provided. A proof of Bayes consistency of the proposed test is also provided.
</p>projecteuclid.org/euclid.ba/1471454532_20170525220502Thu, 25 May 2017 22:05 EDTBayesian Mixture Models with Focused Clustering for Mixed Ordinal and Nominal Datahttp://projecteuclid.org/euclid.ba/1471454533<strong>Maria DeYoreo</strong>, <strong>Jerome P. Reiter</strong>, <strong>D. Sunshine Hillygus</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 679--703.</p><p><strong>Abstract:</strong><br/>
In some contexts, mixture models can fit certain variables well at the expense of others in ways beyond the analyst’s control. For example, when the data include some variables with non-trivial amounts of missing values, the mixture model may fit the marginal distributions of the nearly and fully complete variables at the expense of the variables with high fractions of missing data. Motivated by this setting, we present a mixture model for mixed ordinal and nominal data that splits variables into two groups, focus variables and remainder variables. The model allows the analyst to specify a rich sub-model for the focus variables and a simpler sub-model for remainder variables, yet still capture associations among the variables. Using simulations, we illustrate advantages and limitations of focused clustering compared to mixture models that do not distinguish variables. We apply the model to handle missing values in an analysis of the 2012 American National Election Study, estimating relationships among voting behavior, ideology, and political party affiliation.
</p>projecteuclid.org/euclid.ba/1471454533_20170525220502Thu, 25 May 2017 22:05 EDTOptimal Robustness Results for Relative Belief Inferences and the Relationship to Prior-Data Conflicthttp://projecteuclid.org/euclid.ba/1473276256<strong>Luai Al Labadi</strong>, <strong>Michael Evans</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 705--728.</p><p><strong>Abstract:</strong><br/>
The robustness to the prior of Bayesian inference procedures based on a measure of statistical evidence is considered. These inferences are shown to have optimal properties with respect to robustness. Furthermore, a connection between robustness and prior-data conflict is established. In particular, the inferences are shown to be effectively robust when the choice of prior does not lead to prior-data conflict. When there is prior-data conflict, however, robustness may fail to hold.
</p>projecteuclid.org/euclid.ba/1473276256_20170525220502Thu, 25 May 2017 22:05 EDTA Generalised Semiparametric Bayesian Fay–Herriot Model for Small Area Estimation Shrinking Both Means and Varianceshttp://projecteuclid.org/euclid.ba/1473276257<strong>Silvia Polettini</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 729--752.</p><p><strong>Abstract:</strong><br/>
In survey sampling, interest often lies in unplanned domains (or small areas), whose sample sizes may be too small to allow for accurate design-based inference. To improve the direct estimates by borrowing strength from similar domains, most small area methods rely on mixed effects regression models.
This contribution extends the well-known Fay–Herriot model (Fay and Herriot, 1979) within a Bayesian approach in two directions. First, the default normality assumption for the random effects is replaced by a nonparametric specification using a Dirichlet process. Second, uncertainty on variances is explicitly introduced, recognizing the fact that they are actually estimated from survey data. The proposed approach shrinks variances as well as means, and accounts for all sources of uncertainty. Adopting a flexible model for the random effects makes it possible to accommodate outliers and to vary the borrowing of strength by identifying local neighbourhoods where the exchangeability assumption holds. Through application to real and simulated data, we investigate the performance of the proposed model in predicting the domain means under different distributional assumptions. We also focus on the construction of credible intervals for the area means, a topic that has received less attention in the literature. Frequentist properties such as mean squared prediction error (MSPE), coverage and interval length are investigated. The experiments performed seem to indicate that inferences under the proposed model are characterised by smaller mean squared error than competing approaches; frequentist coverage of the credible intervals is close to nominal.
</p>projecteuclid.org/euclid.ba/1473276257_20170525220502Thu, 25 May 2017 22:05 EDTSelection of Tuning Parameters, Solution Paths and Standard Errors for Bayesian Lassoshttp://projecteuclid.org/euclid.ba/1473276258<strong>Vivekananda Roy</strong>, <strong>Sounak Chakraborty</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 753--778.</p><p><strong>Abstract:</strong><br/>
Penalized regression methods such as the lasso and elastic net (EN) have become popular for simultaneous variable selection and coefficient estimation. Implementation of these methods requires selection of the penalty parameters. We propose an empirical Bayes (EB) methodology for selecting these tuning parameters as well as for computing the regularization path plots. The EB method does not suffer from the “double shrinkage problem” of frequentist EN. It also avoids the difficulty of constructing an appropriate prior on the penalty parameters. The EB methodology is implemented by an efficient importance sampling method based on multiple Gibbs sampler chains. Since the Markov chains underlying the Gibbs sampler are proved to be geometrically ergodic, the Markov chain central limit theorem can be used to provide an asymptotically valid confidence band for profiles of EN coefficients. The practical effectiveness of our method is illustrated by several simulation examples and two real life case studies. Although this article considers lasso and EN for brevity, the proposed EB method is general and can be used to select shrinkage parameters in other regularization methods.
</p>projecteuclid.org/euclid.ba/1473276258_20170525220502Thu, 25 May 2017 22:05 EDTAdaptive Shrinkage in Pólya Tree Type Modelshttp://projecteuclid.org/euclid.ba/1473276260<strong>Li Ma</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 779--805.</p><p><strong>Abstract:</strong><br/>
We introduce a hierarchical generalization of the Pólya tree that incorporates locally adaptive shrinkage to data features of different scales, while maintaining analytical simplicity and computational efficiency. Inference under the new model proceeds efficiently using general recipes for conjugate hierarchical models, and can be completed extremely efficiently for data sets with large numbers of observations. We illustrate in density estimation that the achieved adaptive shrinkage results in proper smoothing and substantially improves inference. We evaluate the performance of the model through simulation under several schematic scenarios carefully designed to be representative of a variety of applications. We compare its performance to that of the Pólya tree, the optional Pólya tree, and the Dirichlet process mixture. We then apply our method to a flow cytometry data set with 455,472 observations to achieve fast estimation of a large number of univariate and multivariate densities, and investigate the computational properties of our method in that context. In addition, we establish theoretical guarantees for the model including absolute continuity, full nonparametricity, and posterior consistency. All proofs are given in the Supplementary Material (Ma, 2016).
</p>projecteuclid.org/euclid.ba/1473276260_20170525220502Thu, 25 May 2017 22:05 EDTA Bayes Interpretation of Stacking for $\mathcal{M}$-Complete and $\mathcal{M}$-Open Settingshttp://projecteuclid.org/euclid.ba/1473276261<strong>Tri Le</strong>, <strong>Bertrand Clarke</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 807--829.</p><p><strong>Abstract:</strong><br/>
In ${\mathcal{M}}$-open problems where no true model can be conceptualized, it is common to back off from modeling and merely seek good prediction. Even in ${\mathcal{M}}$-complete problems, taking a predictive approach can be very useful. Stacking is a model averaging procedure that gives a composite predictor by combining individual predictors from a list of models using weights that optimize a cross-validation criterion. We show that the stacking weights also asymptotically minimize a posterior expected loss. Hence we formally provide a Bayesian justification for cross-validation. Often the weights are constrained to be positive and sum to one. For greater generality, we omit the positivity constraint and relax the ‘sum to one’ constraint.
A key question is ‘What predictors should be in the average?’ We first verify that the stacking error depends only on the span of the models. Then we propose using bootstrap samples from the data to generate empirical basis elements that can be used to form models. We use this in two computed examples to give stacking predictors that are (i) data driven, (ii) optimal with respect to the number of component predictors, and (iii) optimal with respect to the weight each predictor gets.
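As an illustration of the cross-validation criterion mentioned above, stacking weights are commonly written as the minimizer of a leave-one-out squared-error sum. The notation below ($y_i$, $x_i$, $\hat{f}_k^{(-i)}$, $K$, $n$) is assumed for this sketch and does not come from the abstract, and the weights are left fully unconstrained for simplicity:

```latex
\hat{w} \;=\; \arg\min_{w \in \mathbb{R}^{K}}
\sum_{i=1}^{n} \Bigl( y_i - \sum_{k=1}^{K} w_k \, \hat{f}_k^{(-i)}(x_i) \Bigr)^{2}
```

Here $\hat{f}_k^{(-i)}$ denotes the predictor from model $k$ fitted without observation $i$, and the stacked predictor is $\sum_{k} \hat{w}_k \hat{f}_k(x)$. Adding the constraints $w_k \ge 0$ and $\sum_k w_k = 1$ gives the constrained form that the abstract describes as the usual choice.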
</p>projecteuclid.org/euclid.ba/1473276261_20170525220502Thu, 25 May 2017 22:05 EDTLatent Class Mixture Models of Treatment Effect Heterogeneityhttp://projecteuclid.org/euclid.ba/1473362569<strong>Zach Shahn</strong>, <strong>David Madigan</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 831--854.</p><p><strong>Abstract:</strong><br/>
We provide a general Bayesian framework for modeling treatment effect heterogeneity in experiments with non-categorical outcomes. Our modeling approach incorporates latent class mixture components to capture discrete heterogeneity and regression interaction terms to capture continuous heterogeneity. Flexible error distributions allow robust posterior inference on parameters of interest. Hierarchical shrinkage priors on relevant parameters address multiple comparisons concerns. Leave-one-out cross validation estimates of expected posterior predictive density obtained through importance sampling, together with posterior predictive checks, provide a convenient method for model selection and evaluation. We apply our approach to a clinical trial comparing two HIV treatments and to an instrumental variable analysis of a natural experiment on the effect of Medicaid enrollment on emergency department utilization.
</p>projecteuclid.org/euclid.ba/1473362569_20170525220502Thu, 25 May 2017 22:05 EDTIntrinsic Bayesian Analysis for Occupancy Modelshttp://projecteuclid.org/euclid.ba/1473431536<strong>Daniel Taylor-Rodríguez</strong>, <strong>Andrew J. Womack</strong>, <strong>Claudio Fuentes</strong>, <strong>Nikolay Bliznyuk</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 855--877.</p><p><strong>Abstract:</strong><br/>
Occupancy models are typically used to determine the probability of a species being present at a given site while accounting for imperfect detection. The survey data underlying these models often include information on several predictors that could potentially characterize habitat suitability and species detectability. Because these variables might not all be relevant, model selection techniques are necessary in this context. In practice, model selection is performed using the Akaike Information Criterion (AIC), as few other alternatives are available. This paper builds an objective Bayesian variable selection framework for occupancy models through the intrinsic prior methodology. The procedure incorporates priors on the model space that account for test multiplicity and respect the polynomial hierarchy of the predictors when higher-order terms are considered. The methodology is implemented using a stochastic search algorithm that is able to thoroughly explore large spaces of occupancy models. The proposed strategy is entirely automatic and provides control of false positives without sacrificing the discovery of truly meaningful covariates. The performance of the method is evaluated and compared to AIC through a simulation study. The method is illustrated on two datasets previously studied in the literature.
</p>projecteuclid.org/euclid.ba/1473431536_20170525220502Thu, 25 May 2017 22:05 EDTBayesian Analysis of Boundary and Near-Boundary Evidence in Econometric Models with Reduced Rankhttps://projecteuclid.org/euclid.ba/1501120970<strong>Nalan Baştürk</strong>, <strong>Lennart Hoogerheide</strong>, <strong>Herman K. van Dijk</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 3, 879--917.</p><p><strong>Abstract:</strong><br/>
Weak empirical evidence near and at the boundary of the parameter region is a predominant feature in econometric models. Examples are macroeconometric models with weak information on the number of stable relations, microeconometric models measuring connectivity between variables with weak instruments, financial econometric models like the random walk with weak evidence on the efficient market hypothesis and factor models for investment policies with weak information on the number of unobserved factors. A Bayesian analysis is presented of the common issue in these models, which refers to the topic of a reduced rank. Reduced rank is a boundary issue and its effect on the shape of the posteriors of the equation system parameters with a reduced rank is explored systematically. These shapes refer to ridges due to weak identification, fat tails and multimodality. Discussing several alternative routes to construct regularization priors, we show that flat posterior surfaces are integrable even though the marginal posterior tends to infinity if the parameters tend to the values corresponding to local non-identification. We introduce a lasso type shrinkage prior combined with orthogonal normalization which restricts the range of the parameters in a plausible way. This can be combined with other shrinkage, smoothness and data based priors using training samples or dummy observations. Using such classes of priors, it is shown how conditional probabilities of evidence near and at the boundary can be evaluated effectively. These results allow for Bayesian inference using mixtures of posteriors under the boundary state and the near-boundary state. The approach is applied to the estimation of the education-income effect in all states of the US economy. The empirical results indicate that there exist substantial differences of this effect between almost all states. This may affect important national and state-wise policies on the required length of education.
The use of the proposed approach may, in general, lead to more accurate forecasting and decision analysis in other problems in economics, finance and marketing.
</p>projecteuclid.org/euclid.ba/1501120970_20170829220126Tue, 29 Aug 2017 22:01 EDTA Bayesian Nonparametric Approach to Testing for Dependence Between Random Variableshttps://projecteuclid.org/euclid.ba/1474463236<strong>Sarah Filippi</strong>, <strong>Chris C. Holmes</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 919--938.</p><p><strong>Abstract:</strong><br/>
Nonparametric and nonlinear measures of statistical dependence between pairs of random variables are important tools in modern data analysis. In particular the emergence of large data sets can now support the relaxation of linearity assumptions implicit in traditional association scores such as correlation. Here we describe a Bayesian nonparametric procedure that leads to a tractable, explicit and analytic quantification of the relative evidence for dependence vs independence. Our approach uses Pólya tree priors on the space of probability measures which can then be embedded within a decision theoretic test for dependence. Pólya tree priors can accommodate known uncertainty in the form of the underlying sampling distribution and provide an explicit posterior probability measure of both dependence and independence. Well known advantages of having an explicit probability measure include: easy comparison of evidence across different studies; encoding prior information; quantifying changes in dependence across different experimental conditions, and the integration of results within formal decision analysis.
</p>projecteuclid.org/euclid.ba/1474463236_20171117220542Fri, 17 Nov 2017 22:05 ESTJoint Species Distribution Modeling: Dimension Reduction Using Dirichlet Processeshttps://projecteuclid.org/euclid.ba/1478073617<strong>Daniel Taylor-Rodríguez</strong>, <strong>Kimberly Kaufeld</strong>, <strong>Erin M. Schliep</strong>, <strong>James S. Clark</strong>, <strong>Alan E. Gelfand</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 939--967.</p><p><strong>Abstract:</strong><br/> Species distribution models are used to evaluate the variables that affect the distribution and abundance of species and to predict biodiversity. Historically, such models have been fitted to each species independently. While independent models can provide useful information regarding distribution and abundance, they ignore the fact that, after accounting for environmental covariates, residual interspecies dependence persists. With stacking of individual models, misleading behaviors may arise. In particular, individual models often imply too many species per location. Recently developed joint species distribution models have application to presence–absence, continuous or discrete abundance, abundance with large numbers of zeros, and discrete, ordinal, and compositional data. Here, we deal with the challenge of joint modeling for a large number of species. To appreciate the challenge in the simplest way, with just presence/absence (binary) response and, say, $S$ species, we have an $S$-way contingency table with $2^{S}$ cell probabilities. Even if $S$ is as small as $100$ this is an enormous table, infeasible to work with without some structure to reduce dimension. We develop a computationally feasible approach to accommodate a large number of species (say order $10^{3}$) that allows us to: 1) assess the dependence structure across species; 2) identify clusters of species that have similar dependence patterns; and 3) jointly predict species distributions.
To do so, we build hierarchical models capturing dependence between species at the first or “data” stage rather than at a second or “mean” stage. We employ the Dirichlet process for clustering in a novel way to reduce dimension in the joint covariance structure. This last step makes computation tractable. We use Forest Inventory Analysis (FIA) data in the eastern region of the United States to demonstrate our method. It consists of presence–absence measurements for 112 tree species, observed east of the Mississippi. As a proof of concept for our dimension reduction approach, we also include simulations using continuous and binary data. </p>projecteuclid.org/euclid.ba/1478073617_20171117220542Fri, 17 Nov 2017 22:05 ESTVariable Selection in Seemingly Unrelated Regressions with Random Predictorshttps://projecteuclid.org/euclid.ba/1488855633<strong>David Puelz</strong>, <strong>P. Richard Hahn</strong>, <strong>Carlos M. Carvalho</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 969--989.</p><p><strong>Abstract:</strong><br/>
This paper considers linear model selection when the response is vector-valued and the predictors, either all or some, are randomly observed. We propose a new approach that decouples statistical inference from the selection step in a “post-inference model summarization” strategy. We study the impact of predictor uncertainty on the model selection procedure. The method is demonstrated through an application to asset pricing.
</p>projecteuclid.org/euclid.ba/1488855633_20171117220542Fri, 17 Nov 2017 22:05 ESTApproximate Bayesian Inference in Semiparametric Copula Modelshttps://projecteuclid.org/euclid.ba/1510110045<strong>Clara Grazian</strong>, <strong>Brunero Liseo</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 991--1016.</p><p><strong>Abstract:</strong><br/>
We describe a simple method for making inference on a functional of a multivariate distribution, based on its copula representation. We make use of an approximate Bayesian Monte Carlo algorithm, where the proposed values of the functional of interest are weighted in terms of their Bayesian exponentially tilted empirical likelihood. This method is particularly useful when the “true” likelihood function associated with the working model is too costly to evaluate or when the working model is only partially specified.
</p>projecteuclid.org/euclid.ba/1510110045_20171117220542Fri, 17 Nov 2017 22:05 ESTFast Simulation of Hyperplane-Truncated Multivariate Normal Distributionshttps://projecteuclid.org/euclid.ba/1488337478<strong>Yulai Cong</strong>, <strong>Bo Chen</strong>, <strong>Mingyuan Zhou</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1017--1037.</p><p><strong>Abstract:</strong><br/>
We introduce a fast and easy-to-implement simulation algorithm for a multivariate normal distribution truncated on the intersection of a set of hyperplanes, and further generalize it to efficiently simulate random variables from a multivariate normal distribution whose covariance (precision) matrix can be decomposed as a positive-definite matrix minus (plus) a low-rank symmetric matrix. Example results illustrate the correctness and efficiency of the proposed simulation algorithms.
</p>projecteuclid.org/euclid.ba/1488337478_20171117220542Fri, 17 Nov 2017 22:05 ESTBayesian Variable Selection Regression of Multivariate Responses for Group Datahttps://projecteuclid.org/euclid.ba/1508983455<strong>B. Liquet</strong>, <strong>K. Mengersen</strong>, <strong>A. N. Pettitt</strong>, <strong>M. Sutton</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1039--1067.</p><p><strong>Abstract:</strong><br/>
We propose two multivariate extensions of the Bayesian group lasso for variable selection and estimation for data with high dimensional predictors and multi-dimensional response variables. The methods utilize spike and slab priors to yield solutions which are sparse at either a group level or both a group and individual feature level. The incorporation of group structure in a predictor matrix is a key factor in obtaining better estimators and identifying associations between multiple responses and predictors. The approach is suited to many biological studies where the response is multivariate and each predictor is embedded in some biological grouping structure such as gene pathways. Our Bayesian models are connected with penalized regression, and we prove both oracle and asymptotic distribution properties under an orthogonal design. We derive efficient Gibbs sampling algorithms for our models and provide the implementation in a comprehensive R package called MBSGS available on the Comprehensive R Archive Network (CRAN). The performance of the proposed approaches is compared to state-of-the-art variable selection strategies on simulated data sets. The proposed methodology is illustrated on a genetic dataset in order to identify markers grouping across chromosomes that explain the joint variability of gene expression in multiple tissues.
</p>projecteuclid.org/euclid.ba/1508983455_20171117220542Fri, 17 Nov 2017 22:05 ESTInconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing Ithttps://projecteuclid.org/euclid.ba/1510974325<strong>Peter Grünwald</strong>, <strong>Thijs van Ommen</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1069--1103.</p><p><strong>Abstract:</strong><br/>
We empirically show that Bayesian inference can be inconsistent under misspecification in simple linear regression problems, both in a model averaging/selection and in a Bayesian ridge regression setting. We use the standard linear model, which assumes homoskedasticity, whereas the data are heteroskedastic (though, significantly, there are no outliers). As sample size increases, the posterior puts its mass on worse and worse models of ever higher dimension. This is caused by hypercompression, the phenomenon that the posterior puts its mass on distributions that have much larger KL divergence from the ground truth than their average, i.e. the Bayes predictive distribution. To remedy the problem, we equip the likelihood in Bayes’ theorem with an exponent called the learning rate, and we propose the SafeBayesian method to learn the learning rate from the data. SafeBayes tends to select small learning rates, and regularizes more, as soon as hypercompression takes place. Its results on our data are quite encouraging.
</p>projecteuclid.org/euclid.ba/1510974325_20171117220542Fri, 17 Nov 2017 22:05 ESTThe Horseshoe+ Estimator of Ultra-Sparse Signalshttps://projecteuclid.org/euclid.ba/1474572263<strong>Anindya Bhadra</strong>, <strong>Jyotishka Datta</strong>, <strong>Nicholas G. Polson</strong>, <strong>Brandon Willard</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1105--1131.</p><p><strong>Abstract:</strong><br/>
We propose a new prior for ultra-sparse signal detection that we term the “horseshoe+ prior.” The horseshoe+ prior is a natural extension of the horseshoe prior that has achieved success in the estimation and detection of sparse signals and has been shown to possess a number of desirable theoretical properties while enjoying computational feasibility in high dimensions. The horseshoe+ prior builds upon these advantages. Our work proves that the horseshoe+ posterior concentrates at a rate faster than that of the horseshoe in the Kullback–Leibler (K-L) sense. We also establish theoretically that the proposed estimator has lower posterior mean squared error in estimating signals compared to the horseshoe and achieves the optimal Bayes risk in testing up to a constant. For one-group global–local scale mixture priors, we develop a new technique for analyzing the marginal sparse prior densities using the class of Meijer-G functions. In simulations, the horseshoe+ estimator demonstrates superior performance in a standard design setting against competing methods, including the horseshoe and Dirichlet–Laplace estimators. We conclude with an illustration on a prostate cancer data set and by pointing out some directions for future research.
</p>projecteuclid.org/euclid.ba/1474572263_20171117220542Fri, 17 Nov 2017 22:05 ESTAsymptotic Optimality of One-Group Shrinkage Priors in Sparse High-dimensional Problemshttps://projecteuclid.org/euclid.ba/1475266758<strong>Prasenjit Ghosh</strong>, <strong>Arijit Chakrabarti</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1133--1161.</p><p><strong>Abstract:</strong><br/>
We study asymptotic optimality of inference in a high-dimensional sparse normal means model using a broad class of one-group shrinkage priors. Assuming that the proportion of non-zero means is known, we show that the corresponding Bayes estimates asymptotically attain the minimax risk (up to a multiplicative constant) for estimation with squared error loss. The constant is shown to be 1 for the important sub-class of “horseshoe-type” priors, proving an exact asymptotic minimaxity property for these priors, a result hitherto unknown in the literature. An empirical Bayes version of the estimator is shown to achieve the minimax rate when the level of sparsity is unknown. We prove that the resulting posterior distributions contract around the true mean vector at the minimax optimal rate and provide important insight about the possible rate of posterior contraction around the corresponding Bayes estimator. Our work shows that for rate optimality, a heavy tailed prior with sufficient mass around zero is enough; a pole at zero like that of the horseshoe prior is not necessary. This part of the work is inspired by Pas et al. (2014). We come up with novel unifying arguments to extend their results over the general class of priors. Next we focus on simultaneous hypothesis testing for the means under the additive $0-1$ loss where the means are modeled through a two-groups mixture distribution. We study asymptotic risk properties of certain multiple testing procedures induced by the class of one-group priors under study, when applied in this set-up. Our key results show that the tests based on the “horseshoe-type” priors asymptotically achieve the risk of the optimal solution in this two-groups framework up to the correct constant and are thus asymptotically Bayes optimal under sparsity (ABOS). This is the first result showing that in a sparse problem a class of one-group priors can exactly mimic the performance of an optimal two-groups solution asymptotically.
Our work shows an intrinsic technical connection between the theories of minimax estimation and simultaneous hypothesis testing for such one-group priors.
</p>projecteuclid.org/euclid.ba/1475266758_20171117220542Fri, 17 Nov 2017 22:05 ESTBayesian Analysis of the Stationary $\mathit{MAP}_{2}$https://projecteuclid.org/euclid.ba/1477321094<strong>P. Ramírez-Cobo</strong>, <strong>R. E. Lillo</strong>, <strong>M. P. Wiper</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1163--1194.</p><p><strong>Abstract:</strong><br/>
In this article we describe a method for carrying out Bayesian estimation for the two-state stationary Markov arrival process ($\mathit{MAP}_{2}$), which has been proposed as a versatile model in a number of contexts. The approach is illustrated on both simulated and real data sets, where the performance of the $\mathit{MAP}_{2}$ is compared against that of the well-known $\mathit{MMPP}_{2}$. As an extension of the method, we estimate the queue length and virtual waiting time distributions of a stationary $\mathit{MAP}_{2}/G/1$ queueing system, a matrix generalization of the $M/G/1$ queue that allows for dependent inter-arrival times. Our procedure is illustrated with applications in Internet traffic analysis.
</p>projecteuclid.org/euclid.ba/1477321094_20171117220542Fri, 17 Nov 2017 22:05 ESTMarginal Pseudo-Likelihood Learning of Discrete Markov Network Structureshttps://projecteuclid.org/euclid.ba/1477918728<strong>Johan Pensar</strong>, <strong>Henrik Nyman</strong>, <strong>Juha Niiranen</strong>, <strong>Jukka Corander</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1195--1215.</p><p><strong>Abstract:</strong><br/>
Markov networks are a popular tool for modeling multivariate distributions over a set of discrete variables. The core of the Markov network representation is an undirected graph which elegantly captures the dependence structure over the variables. Traditionally, the Bayesian approach of learning the graph structure from data has been done under the assumption of chordality since non-chordal graphs are difficult to evaluate for likelihood-based scores. Recently, there has been a surge of interest towards the use of regularized pseudo-likelihood methods as such approaches can avoid the assumption of chordality. Many of the currently available methods necessitate the use of a tuning parameter to adapt the level of regularization for a particular dataset. Here we introduce the marginal pseudo-likelihood which has a built-in regularization through marginalization over the graph-specific nuisance parameters. We prove consistency of the resulting graph estimator via comparison with the pseudo-Bayesian information criterion. To identify high-scoring graph structures in a high-dimensional setting we design a two-step algorithm that exploits the decomposable structure of the score. Using synthetic and existing benchmark networks, the marginal pseudo-likelihood method is shown to perform favorably against recent popular structure learning methods.
</p>projecteuclid.org/euclid.ba/1477918728_20171117220542Fri, 17 Nov 2017 22:05 ESTCorrection to: “Posterior Consistency of Bayesian Quantile Regression Based on the Misspecified Asymmetric Laplace Density”https://projecteuclid.org/euclid.ba/1505354708<strong>Karthik Sriram</strong>, <strong>R.V. Ramamoorthi</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1217--1219.</p><p><strong>Abstract:</strong><br/>
In this note, we highlight and provide corrections to two errors in the paper: Karthik Sriram, R.V. Ramamoorthi, Pulak Ghosh (2013) “Posterior Consistency of Bayesian Quantile Regression Based on the Misspecified Asymmetric Laplace Density”, Bayesian Analysis, Vol. 8, No. 2, pp. 479–504.
</p>projecteuclid.org/euclid.ba/1505354708_20171117220542Fri, 17 Nov 2017 22:05 ESTUncertainty Quantification for the Horseshoe (with Discussion)https://projecteuclid.org/euclid.ba/1504231319<strong>Stéphanie van der Pas</strong>, <strong>Botond Szabó</strong>, <strong>Aad van der Vaart</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1221--1274.</p><p><strong>Abstract:</strong><br/>
We investigate the credible sets and marginal credible intervals resulting from the horseshoe prior in the sparse multivariate normal means model. We do so in an adaptive setting without assuming knowledge of the sparsity level (number of signals). We consider both the hierarchical Bayes method of putting a prior on the unknown sparsity level and the empirical Bayes method with the sparsity level estimated by maximum marginal likelihood. We show that credible balls and marginal credible intervals have good frequentist coverage and optimal size if the sparsity level of the prior is set correctly. By general theory honest confidence sets cannot adapt in size to an unknown sparsity level. Accordingly the hierarchical and empirical Bayes credible sets based on the horseshoe prior are not honest over the full parameter space. We show that this is due to over-shrinkage for certain parameters and characterise the set of parameters for which credible balls and marginal credible intervals do give correct uncertainty quantification. In particular we show that the fraction of false discoveries by the marginal Bayesian procedure is controlled by a correct choice of cut-off.
</p>projecteuclid.org/euclid.ba/1504231319_20171117220542Fri, 17 Nov 2017 22:05 ESTDeep Learning: A Bayesian Perspectivehttps://projecteuclid.org/euclid.ba/1510801992<strong>Nicholas G. Polson</strong>, <strong>Vadim Sokolov</strong>. <p><strong>Source: </strong>Bayesian Analysis, Volume 12, Number 4, 1275--1304.</p><p><strong>Abstract:</strong><br/>
Deep learning is a form of machine learning for nonlinear high dimensional pattern matching and prediction. By taking a Bayesian probabilistic perspective, we provide a number of insights into more efficient algorithms for optimisation and hyper-parameter tuning. Traditional high-dimensional data reduction techniques, such as principal component analysis (PCA), partial least squares (PLS), reduced rank regression (RRR), and projection pursuit regression (PPR), are all shown to be shallow learners. Their deep learning counterparts exploit multiple deep layers of data reduction which provide predictive performance gains. Stochastic gradient descent (SGD) training optimisation and Dropout (DO) regularization provide estimation and variable selection. Bayesian regularization is central to finding weights and connections in networks to optimize the predictive bias-variance trade-off. To illustrate our methodology, we provide an analysis of international bookings on Airbnb. Finally, we conclude with directions for future research.
</p>projecteuclid.org/euclid.ba/1510801992_20171117220542Fri, 17 Nov 2017 22:05 EST