We discuss Bayesian model uncertainty analysis and forecasting in sequential dynamic modeling of multivariate time series. The perspective is that of a decision-maker with a specific forecasting objective that guides thinking about relevant models. Based on formal Bayesian decision-theoretic reasoning, we develop a time-adaptive approach to exploring, weighting, combining and selecting models that differ in the predictive variables they include. The adaptivity allows the sets of favored models to change over time and is guided by the specific forecasting goals. A synthetic example illustrates how decision-guided variable selection differs from traditional Bayesian model uncertainty analysis and standard model averaging. An applied study in a motivating application, long-term macroeconomic forecasting, highlights the utility of the new approach in improving predictions as well as its ability to identify and interpret different sets of relevant models over time with respect to specific, defined forecasting goals.
The median probability model (MPM) of Barbieri and Berger (2004) is defined as the model consisting of those variables whose marginal posterior probability of inclusion is at least 0.5. The MPM rule yields the best single model for prediction in orthogonal and nested correlated designs. This result was originally derived under a specific class of priors, such as point-mass mixtures of non-informative and g-type priors. The MPM rule, however, has become so popular that it is now deployed for a much wider variety of priors and under correlated designs, where the properties of the MPM are not yet fully understood. The main thrust of this work is to shed light on the properties of the MPM in these contexts by (a) characterizing situations in which the MPM is still safe under correlated designs, and (b) providing significant generalizations of the MPM to a broader class of priors (such as continuous spike-and-slab priors). We also provide new supporting evidence for the suitability of g-priors, as opposed to independent product priors, using new predictive matching arguments. Furthermore, we emphasize the importance of prior model probabilities and highlight the merits of non-uniform prior probability assignments using the notion of model aggregates.
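To make the MPM rule concrete, here is a minimal sketch, assuming posterior draws of binary variable-inclusion indicators are available from some model-space MCMC; the array name gamma_samples and the toy data are illustrative only.

```python
import numpy as np

def median_probability_model(gamma_samples):
    """Select the median probability model (MPM) from MCMC output.

    gamma_samples: (n_draws, p) array of 0/1 variable-inclusion indicators
    sampled from the posterior over models. Returns the indices of variables
    whose marginal posterior inclusion probability is at least 0.5.
    """
    inclusion_probs = gamma_samples.mean(axis=0)   # marginal P(gamma_j = 1 | data)
    return np.where(inclusion_probs >= 0.5)[0]

# Hypothetical usage: 3 candidate variables, 4 posterior draws of inclusion indicators
draws = np.array([[1, 0, 1],
                  [1, 1, 0],
                  [1, 0, 1],
                  [1, 0, 0]])
print(median_probability_model(draws))   # -> [0 2]: variables with inclusion probability >= 0.5
```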
We consider a binary response which is potentially affected by a set of continuous variables. Of special interest is the causal effect on the response due to an intervention on a specific variable. The latter can be meaningfully determined on the basis of observational data through suitable assumptions on the data-generating mechanism. In particular, we assume that the joint distribution obeys the conditional independencies (Markov properties) inherent in a Directed Acyclic Graph (DAG), and the DAG is given a causal interpretation through the notion of interventional distribution. We propose a DAG-probit model in which the response is generated by discretization, through a random threshold, of a continuous latent variable; the latent variable, jointly with the remaining continuous variables, follows a zero-mean Gaussian model whose covariance matrix is constrained to satisfy the Markov properties of the DAG, and the covariance matrix is assigned a DAG-Wishart prior through the corresponding Cholesky parameters. Our model leads to a natural definition of causal effect conditionally on a given DAG. Since the DAG which generates the observations is unknown, we present an efficient MCMC algorithm whose target is the posterior distribution on the space of DAGs, the Cholesky parameters of the concentration matrix, and the threshold linking the response to the latent variable. Our end result is a Bayesian Model Averaging estimate of the causal effect which incorporates parameter, as well as model, uncertainty. The methodology is assessed using simulation experiments and applied to a gene expression data set originating from breast cancer stem cells.
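As background on the interventional distribution invoked above, a standard truncated-factorization form for intervening on a single variable in a DAG is shown below; the paper's Gaussian-latent specialization and the resulting causal effect on the binary response are not reproduced here.

$$ p\bigl(x_1,\dots,x_q \mid \mathrm{do}(X_j = \tilde{x})\bigr) \;=\; \prod_{i \neq j} p\bigl(x_i \mid \mathrm{pa}(x_i)\bigr)\Big|_{x_j = \tilde{x}}, $$

so the causal effect on a response is obtained by marginalizing this interventional distribution over all variables other than the response.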
We consider the problem of estimating the predictive density in a heteroskedastic Gaussian model under general divergence loss. Based on a conjugate hierarchical set-up, we consider generic classes of shrinkage predictive densities that are governed by location and scale hyper-parameters. For any α-divergence loss, we propose a risk-estimation-based methodology for tuning these shrinkage hyper-parameters. Our proposed predictive density estimators enjoy optimal asymptotic risk properties that are in concordance with the optimal shrinkage calibration point estimation results established by Xie, Kou, and Brown (2012) for heteroskedastic hierarchical models. These α-divergence risk optimality properties of our proposed predictors are not shared by empirical Bayes predictive density estimators that are calibrated by traditional methods such as maximum likelihood and method of moments. We conduct several numerical studies to compare the non-asymptotic performance of our proposed predictive density estimators with that of other competing methods and obtain encouraging results.
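For reference, one common parameterization of the α-divergence between the true density p and a predictive density p̂ is given below; the normalization conventions used in the paper may differ.

$$ D_{\alpha}(p \,\|\, \hat{p}) \;=\; \frac{1}{\alpha(1-\alpha)}\left(1 - \int p(y)^{\alpha}\,\hat{p}(y)^{1-\alpha}\,dy\right), \qquad 0<\alpha<1, $$

with the two Kullback-Leibler losses recovered in the limits α → 0 and α → 1, and (up to a constant) the squared Hellinger distance at α = 1/2.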
Conditional heteroscedastic (CH) models are routinely used to analyze financial datasets. Classical models such as ARCH and GARCH with time-invariant coefficients are often inadequate to describe the frequent changes over time caused by market variability. However, we can achieve significantly better insight by considering time-varying analogs of these models. In this paper, we propose a Bayesian approach to the estimation of such models and develop a computationally efficient MCMC algorithm based on Hamiltonian Monte Carlo (HMC) sampling. We also establish posterior contraction rates with increasing sample size in terms of the average Hellinger metric. The performance of our method is compared with frequentist estimates and estimates from the time-constant analogs. To conclude the paper, we obtain time-varying parameter estimates for some popular Forex (currency conversion rate) and stock market datasets.
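A minimal sketch of the kind of time-varying GARCH(1,1) likelihood alluded to above, assuming Gaussian innovations; the functional form of the coefficient paths, the priors, and the HMC sampler of the paper are not shown, and all names are illustrative.

```python
import numpy as np

def tv_garch11_loglik(y, omega, alpha, beta, sigma2_init=1.0):
    """Gaussian log-likelihood of a GARCH(1,1) model with time-varying coefficients:
    sigma2_t = omega[t] + alpha[t]*y[t-1]**2 + beta[t]*sigma2_{t-1}.

    y                    : (T,) array of returns
    omega, alpha, beta   : (T,) arrays of time-varying coefficients
    """
    T = len(y)
    sigma2 = np.empty(T)
    sigma2[0] = sigma2_init
    for t in range(1, T):
        sigma2[t] = omega[t] + alpha[t] * y[t - 1] ** 2 + beta[t] * sigma2[t - 1]
    return -0.5 * np.sum(np.log(2 * np.pi * sigma2) + y ** 2 / sigma2)

# Illustrative call with constant coefficients (the time-invariant special case)
rng = np.random.default_rng(0)
y = rng.standard_normal(500)
T = len(y)
print(tv_garch11_loglik(y, 0.1 * np.ones(T), 0.05 * np.ones(T), 0.9 * np.ones(T)))
```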
Assessing the homogeneity of distributions is an old problem that has received considerable attention, especially in the nonparametric Bayesian literature. To this end, we propose the semi-hierarchical Dirichlet process, a novel hierarchical prior that extends the hierarchical Dirichlet process of Teh et al. (2006) and that avoids the degeneracy issues of nested processes recently described by Camerlenghi et al. (2019a). We go beyond the simple yes/no answer to the homogeneity question and embed the proposed prior in a random partition model; this allows us to give a more comprehensive response to the above question and, in fact, to find groups of populations that are internally homogeneous when several such populations are considered. We study theoretical properties of the semi-hierarchical Dirichlet process and of the Bayes factor for the homogeneity test when only two populations are compared. Extensive simulation studies and applications to educational data are also discussed.
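For context, the hierarchical Dirichlet process of Teh et al. (2006) that the proposal extends ties population-specific random measures together through a shared random base measure:

$$ G_0 \sim \mathrm{DP}(\gamma, H), \qquad G_j \mid G_0 \stackrel{\mathrm{iid}}{\sim} \mathrm{DP}(\alpha, G_0), \quad j = 1, \dots, I, $$

where G_j governs the j-th population; the semi-hierarchical construction modifies this hierarchy so that distinct populations may or may not share their distributions, while avoiding the degeneracy issues of fully nested processes noted in the abstract.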
This article proposes a novel Bayesian implementation of regression with a multi-dimensional array (tensor) response on scalar covariates. The recent emergence of complex datasets in various disciplines presents a pressing need for regression models with a tensor-valued response. This article considers one such application: detecting neuronal activation in fMRI experiments in the presence of tensor-valued brain images and scalar predictors. The overarching goal in this application is to identify spatial regions (voxels) of a brain activated by an external stimulus. In such and related applications, we propose to regress the responses from all cells (or voxels in brain activation studies) together, as a tensor response on scalar predictors, accounting for the structural information inherent in the tensor response. To estimate model parameters with proper cell-specific shrinkage, we propose a novel multiway stick-breaking shrinkage prior distribution on the tensor-structured regression coefficients, enabling identification of the cells which are related to the predictors. The major novelty of this article lies in the theoretical study of the contraction properties of the proposed shrinkage prior in tensor response regression when the number of cells grows faster than the sample size. Specifically, estimates of the tensor regression coefficients are shown to be asymptotically concentrated around the true sparse tensor under mild assumptions. Various simulation studies and the analysis of a brain activation data set empirically verify the desirable performance of the proposed model in terms of estimation and inference on cell-level parameters.
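As a point of reference for the stick-breaking terminology, the generic stick-breaking construction of a sequence of weights is recalled below; the paper's multiway prior applies such constructions across the margins of the tensor coefficient and couples them with cell-specific shrinkage, which is not reproduced here.

$$ \pi_1 = v_1, \qquad \pi_k = v_k \prod_{l<k}(1 - v_l), \qquad v_k \stackrel{\mathrm{iid}}{\sim} \mathrm{Beta}(1, \alpha), \quad k = 1, 2, \dots, $$

so that the weights π_k are nonnegative and sum to one almost surely.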
Bayesian whole-brain functional magnetic resonance imaging (fMRI) analysis with three-dimensional spatial smoothing priors has been shown to produce state-of-the-art activity maps without pre-smoothing the data. The inference algorithms proposed for such models are computationally demanding, however, and the spatial priors used have several less appealing properties, such as being improper and having infinite spatial range. We propose a statistical inference framework for whole-brain fMRI analysis based on the class of Matérn covariance functions. The framework uses the Gaussian Markov random field (GMRF) representation of possibly anisotropic spatial Matérn fields via the stochastic partial differential equation (SPDE) approach of Lindgren et al. (2011). This allows for more flexible and interpretable spatial priors, while maintaining the sparsity required for fast inference in the high-dimensional whole-brain setting. We develop an accelerated stochastic gradient descent (SGD) optimization algorithm for empirical Bayes (EB) inference of the spatial hyperparameters. Conditionally on the inferred hyperparameters, we carry out a fully Bayesian treatment of the brain activity. The Matérn prior is applied to both simulated and experimental task-fMRI data, and comparisons of activity maps, prior simulations and cross-validation clearly demonstrate that it is a more reasonable choice than the previously used priors.
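For reference, the stationary isotropic Matérn covariance function underlying this prior class is

$$ C(\mathbf{h}) \;=\; \frac{\sigma^2}{2^{\nu-1}\Gamma(\nu)}\,(\kappa \lVert \mathbf{h}\rVert)^{\nu} K_{\nu}(\kappa \lVert \mathbf{h}\rVert), $$

where ν controls smoothness, κ is an inverse range parameter, σ² is the marginal variance, and K_ν is a modified Bessel function of the second kind. The SPDE approach of Lindgren et al. (2011) obtains a sparse GMRF representation of such fields (and their anisotropic generalizations) as approximate solutions of $(\kappa^2 - \Delta)^{\alpha/2} x(\mathbf{s}) = \mathcal{W}(\mathbf{s})$ with α = ν + d/2.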
Within a Bayesian framework, we perform a comprehensive investigation of mixtures of finite mixtures (MFMs), i.e., finite mixtures with a prior on the number of components. This model class has applications in model-based clustering as well as in semi-parametric density estimation, and requires suitable prior specifications and inference methods to exploit its full potential. We contribute by considering a generalized class of MFMs in which the hyperparameter of a symmetric Dirichlet prior on the weight distribution depends on the number of components. We show that this model class may be regarded as a Bayesian non-parametric mixture outside the class of Gibbs-type priors. We emphasize the distinction between the number of components K of a mixture and the number of clusters, i.e., the number of filled components given the data. In the MFM model, the number of clusters is a random variable whose prior depends on the prior on K and on the Dirichlet hyperparameter. We employ a flexible prior distribution for the number of components K and derive the corresponding prior on the number of clusters for generalized MFMs. For posterior inference, we propose the novel telescoping sampler, which allows Bayesian inference for mixtures with arbitrary component distributions without resorting to reversible jump Markov chain Monte Carlo (MCMC) methods. The telescoping sampler explicitly samples the number of components, but otherwise requires only the usual MCMC steps of a finite mixture model. The ease of its application with different component distributions is demonstrated on several data sets.
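A minimal generative sketch of an MFM prior, assuming an illustrative Poisson-type prior on K and a symmetric Dirichlet whose hyperparameter may depend on K (the "generalized" ingredient); the telescoping sampler itself is not reproduced, and all names and default choices are hypothetical.

```python
import numpy as np

def sample_mfm_partition(n, rng, gamma_of_K=lambda K: 1.0 / K):
    """Draw component allocations for n observations from an MFM prior.

    K       ~ 1 + Poisson(3)                          (illustrative prior on the number of components)
    weights ~ Dirichlet(gamma_of_K(K), ..., gamma_of_K(K))
    z_i     ~ Categorical(weights)
    Letting the Dirichlet hyperparameter depend on K mimics the generalized-MFM idea.
    """
    K = 1 + rng.poisson(3)
    weights = rng.dirichlet(np.full(K, gamma_of_K(K)))
    z = rng.choice(K, size=n, p=weights)
    n_clusters = len(np.unique(z))                    # number of *filled* components (clusters)
    return K, z, n_clusters

rng = np.random.default_rng(1)
K, z, n_clusters = sample_mfm_partition(200, rng)
print(K, n_clusters)   # components in the model vs. clusters actually occupied in the data
```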
We study the convergence properties of the Gibbs sampler in the context of posterior distributions arising from Bayesian analysis of conditionally Gaussian hierarchical models. We develop a multigrid approach to derive analytic expressions for the convergence rates of the algorithm for various widely used model structures, including nested and crossed random effects. Our results apply to multilevel models with an arbitrary number of layers in the hierarchy, whereas most previous work was limited to the two-level nested case. The theoretical results provide explicit and easy-to-implement guidelines for optimizing practical implementations of the Gibbs sampler, such as indications on which parametrization to choose (e.g. centred versus non-centred), which constraint to impose to guarantee statistical identifiability, and which parameters to monitor in the diagnostic process. Simulations suggest that the results are also informative in the context of non-Gaussian distributions and more general MCMC schemes, such as gradient-based ones.
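To illustrate the parametrization choice in the simplest two-level nested case (the paper treats general multilevel structures), the centred and non-centred forms of a Gaussian hierarchical model can be written as

$$ \text{centred:}\quad y_{ij} \mid \theta_j \sim N(\theta_j, \sigma^2), \qquad \theta_j \mid \mu \sim N(\mu, \tau^2); $$
$$ \text{non-centred:}\quad \theta_j = \mu + \tilde{\theta}_j, \qquad \tilde{\theta}_j \sim N(0, \tau^2), $$

so the Gibbs sampler updates $(\mu, \tilde{\theta}_1, \dots, \tilde{\theta}_J)$ instead of $(\mu, \theta_1, \dots, \theta_J)$; which version mixes faster depends on the relative sizes of σ² and τ², which is the type of question the convergence-rate results address.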
Bayesian methods have proven successful across a wide range of scientific problems and have many well-documented advantages over competing methods. However, these methods run into difficulties for two major and prevalent classes of problems: handling data sets with outliers and dealing with model misspecification. We outline the drawbacks of previous solutions to both of these problems and propose a new method as an alternative. In the new method, the data are summarized through a set of insufficient statistics, targeting inferential quantities of interest, and the prior distribution is updated with the summary statistics rather than the complete data. By a careful choice of conditioning statistics, we retain the main benefits of Bayesian methods while reducing the sensitivity of the analysis to features of the data not captured by the conditioning statistics. For reducing sensitivity to outliers, classical robust estimators (e.g., M-estimators) are natural choices for conditioning statistics. A major contribution of this work is the development of a data-augmented Markov chain Monte Carlo (MCMC) algorithm for the linear model and a large class of summary statistics. We demonstrate the method on simulated and real data sets containing outliers and subject to model misspecification. Success is manifested in better predictive performance for data points of interest than that of competing methods.
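As an example of the kind of robust conditioning statistic mentioned above, a Huber M-estimate of location can serve as an outlier-resistant summary of a sample; this sketch only computes the statistic (the data-augmented MCMC that conditions the posterior on it is not shown), and all names and constants are illustrative.

```python
import numpy as np

def huber_m_estimate(x, k=1.345, tol=1e-8, max_iter=100):
    """Huber M-estimate of location via iteratively reweighted means.
    Observations further than k scaled-MAD units from the current estimate
    are downweighted, limiting the influence of outliers."""
    x = np.asarray(x, dtype=float)
    mu = np.median(x)
    scale = 1.4826 * np.median(np.abs(x - mu))   # MAD-based scale estimate
    for _ in range(max_iter):
        r = (x - mu) / scale
        w = np.ones_like(r)
        big = np.abs(r) > k
        w[big] = k / np.abs(r[big])              # Huber weights
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0.0, 1.0, 95), [50.0, 60.0, 70.0, 80.0, 90.0]])
print(np.mean(x), huber_m_estimate(x))  # the mean is dragged upward by the outliers; the M-estimate is not
```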