The Search for Certainty (Burdzy, 2009), by Krzysztof Burdzy, examines the "philosophical duopoly" of von Mises and de Finetti at the foundations of probability and statistics and finds this duopoly wanting. This review exposes the weaknesses of the arguments presented in the book, questions the relevance of introducing a new set of probability axioms from a methodological perspective, and concludes on the book's lack of impact on statistical foundations and practice.
This article is a response to Christian Robert's review of The Search for Certainty by Krzysztof Burdzy. I provide my own review, some comments on Robert's review and a few general comments about the foundations of probability.
Portfolio balancing requires estimates of the covariance between asset returns. Returns data have histories that vary greatly in length, since assets begin public trading at different times. This can lead to a huge amount of missing data: too much for the conventional imputation-based approach. Fortunately, a well-known factorization of the multivariate normal (MVN) likelihood under the prevailing historical missingness pattern leads to a simple algorithm of ordinary least squares (OLS) regressions that is much more reliable. When there are more assets than returns, however, OLS becomes unstable. Gramacy et al. (2008) showed how classical shrinkage regression may be used instead, thus extending the state of the art to much bigger asset collections, with further accuracy and interpretation advantages. In this paper, we detail a fully Bayesian hierarchical formulation that extends the framework further by allowing for heavy-tailed errors, relaxing the historical missingness assumption, and accounting for estimation risk. We illustrate how this approach compares favorably to the classical one using synthetic data and an investment exercise with real returns. An accompanying R package is available on CRAN.
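For concreteness, here is a minimal sketch of the sequential-OLS recursion that the factorization above refers to, under a monotone historical missingness pattern. The function name, the ordering convention, and the degrees-of-freedom choices are assumptions of this sketch, not a reproduction of the paper's R package.

```python
import numpy as np

def mvn_monotone_estimate(R):
    """Mean/covariance estimate for returns with monotone missingness.

    R: (T, p) array of returns, np.nan where an asset has not yet begun
    trading. Columns must be ordered from longest to shortest history and
    the missingness must be monotone (each asset's history is contained in
    that of the previous asset). Each history is assumed long enough for
    the OLS regressions below. Returns (mu, Sigma).
    """
    T, p = R.shape
    mu = np.empty(p)
    Sigma = np.empty((p, p))

    # Asset with the longest history: plain sample moments.
    obs0 = ~np.isnan(R[:, 0])
    mu[0] = R[obs0, 0].mean()
    Sigma[0, 0] = R[obs0, 0].var(ddof=1)

    for j in range(1, p):
        obs = ~np.isnan(R[:, j])                     # asset j's (shorter) history
        y = R[obs, j]
        X = np.column_stack([np.ones(obs.sum()), R[obs, :j]])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None) # OLS on longer-history assets
        alpha, beta = coef[0], coef[1:]
        resid = y - X @ coef
        s2 = resid @ resid / (len(y) - X.shape[1])

        # Recursively extend the mean vector and covariance matrix.
        mu[j] = alpha + beta @ mu[:j]
        cross = Sigma[:j, :j] @ beta
        Sigma[:j, j] = Sigma[j, :j] = cross
        Sigma[j, j] = s2 + beta @ cross
    return mu, Sigma
```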
We develop a new general-purpose MCMC sampler for arbitrary continuous distributions that requires no tuning. We call this MCMC the t-walk. The t-walk maintains two independent points in the sample space, and all moves are based on proposals that are then accepted with a standard Metropolis-Hastings acceptance probability on the product space. Hence the t-walk is provably convergent under the usual mild requirements. We restrict proposal distributions, or "moves", to those that produce an algorithm that is invariant to scale and approximately invariant to affine transformations of the state space. Hence scalings of proposals, and effectively also coordinate transformations, that might otherwise be used to increase the efficiency of the sampler are not needed, since the t-walk's operation is identical on any scaled version of the target distribution. Four moves are given that result in an effective sampling algorithm.
We use the simple device of updating only a random subset of coordinates at each step to allow application of the t-walk to high-dimensional problems. In a series of test problems across dimensions we find that the t-walk is only a small factor less efficient than optimally tuned algorithms, but significantly outperforms general random-walk Metropolis-Hastings samplers that are not tuned for specific problems. Further, the t-walk remains effective for target distributions for which no optimal affine transformation exists, such as those where the correlation structure is very different in different regions of the state space.
Several examples are presented showing good mixing and convergence characteristics, varying in dimension from 1 to 200 and with radically different scale and correlation structures, using exactly the same sampler. The t-walk is available for R, Python, MATLAB and C++ at http://www.cimat.mx/~jac/twalk/
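To make the product-space mechanism concrete, here is a minimal illustrative sketch in Python. It is not the t-walk itself: a single pair-scaled Gaussian move with an explicit Hastings correction stands in for the four moves described above, but it shows how two coupled points targeting π(x)π(x') remove the need for a hand-tuned step size. All names and defaults are choices of this sketch.

```python
import numpy as np

def two_point_sampler(log_post, x, xp, n_iter=10000, step=0.5, rng=None):
    """Toy two-point, product-space Metropolis-Hastings sampler.

    One of the two points is moved per iteration with a Gaussian proposal
    whose componentwise scale is taken from |x - xp|, so the sampler adapts
    to the target's scale automatically. Initialize x and xp at distinct
    points so the pairwise scale is nonzero.
    """
    rng = np.random.default_rng(rng)
    x, xp = np.asarray(x, float).copy(), np.asarray(xp, float).copy()
    lx, lxp = log_post(x), log_post(xp)
    chain = np.empty((n_iter, x.size))

    for it in range(n_iter):
        # Choose which of the two points to move; the other stays fixed.
        if rng.random() < 0.5:
            cur, other, lcur = x, xp, lx
        else:
            cur, other, lcur = xp, x, lxp

        scale = step * np.abs(cur - other)
        prop = cur + scale * rng.standard_normal(cur.size)
        scale_back = step * np.abs(prop - other)

        # Hastings correction: the proposal scale depends on the point
        # being moved, so the proposal kernel is not symmetric.
        log_q_fwd = -0.5 * np.sum(((prop - cur) / scale) ** 2) - np.sum(np.log(scale))
        log_q_bwd = -0.5 * np.sum(((cur - prop) / scale_back) ** 2) - np.sum(np.log(scale_back))

        lprop = log_post(prop)
        if np.log(rng.random()) < lprop - lcur + log_q_bwd - log_q_fwd:
            cur[:] = prop
            if cur is x:
                lx = lprop
            else:
                lxp = lprop
        chain[it] = x          # record one of the two coupled points
    return chain
```

Replacing this single move with the paper's four moves, plus the random coordinate-subset device, would give something much closer to the actual t-walk.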
In many contexts the predictive validation of models, or of their associated prediction strategies, is of greater importance than model identification, which may be practically impossible. This is particularly so in fields involving complex or high-dimensional data, where model selection, or more generally predictor selection, is the main focus of effort. This paper suggests a unified treatment for predictive analyses based on six "desiderata". These desiderata are an effort to clarify what criteria a good predictive theory of statistics should satisfy.
We develop a novel Bayesian density regression model based on logistic Gaussian processes and subspace projection. Logistic Gaussian processes provide an attractive alternative to the popular stick-breaking processes for modeling a family of conditional densities that vary smoothly in the conditioning variable. Subspace projection offers dimension reduction of predictors through multiple linear combinations, providing an alternative to the zeroing-out theme of variable selection. We illustrate that logistic Gaussian processes and subspace projection combine well to produce a computationally tractable and theoretically sound density regression procedure that offers good out-of-sample prediction, accurate estimation of the subspace projection, and satisfactory estimation of the subspace dimensionality. We also demonstrate that subspace projection may lead to better prediction than variable selection when predictors are well chosen and possibly dependent on each other, each having a moderate influence on the response.
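As a rough illustration of the logistic Gaussian process ingredient alone (the regression and subspace projection parts are omitted), here is a small sketch that draws a random density by exponentiating and normalizing a Gaussian process path on a grid. The kernel, grid, and function names are assumptions of the sketch, not of the paper.

```python
import numpy as np

def rbf_kernel(a, b, ls=0.3, var=1.0):
    """Squared-exponential covariance between two sets of 1-D points."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return var * np.exp(-0.5 * d2 / ls**2)

def sample_logistic_gp_density(y_grid, rng=None):
    """Draw one random density on an equally spaced y_grid from a logistic
    Gaussian process prior: exponentiate a GP draw, then normalize so the
    density integrates to one."""
    rng = np.random.default_rng(rng)
    K = rbf_kernel(y_grid, y_grid) + 1e-8 * np.eye(len(y_grid))  # jitter for stability
    g = rng.multivariate_normal(np.zeros(len(y_grid)), K)
    f = np.exp(g)
    dy = y_grid[1] - y_grid[0]
    return f / (f.sum() * dy)
```

In the density regression setting, the Gaussian process would additionally depend on a low-dimensional linear projection of the predictors, so the resulting conditional densities vary smoothly with that projection.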
Two approaches for model-based clustering of categorical time series based on time-homogeneous first-order Markov chains are discussed. For Markov chain clustering, the individual transition probabilities are fixed to a group-specific transition matrix. In a new approach called Dirichlet multinomial clustering, the rows of the individual transition matrices deviate from the group mean and follow a Dirichlet distribution with unknown group-specific hyperparameters. Estimation is carried out through Markov chain Monte Carlo. Various well-known clustering criteria are applied to select the number of groups. An application to a panel of Austrian wage mobility data leads to an interesting segmentation of the Austrian labor market.
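For concreteness, a minimal sketch of the Markov chain clustering building blocks: per-series transition counts, a pooled group-specific transition matrix, and the group log-likelihood. The MCMC estimation and the Dirichlet multinomial extension are not reproduced, and the smoothing choice below is an assumption of the sketch.

```python
import numpy as np

def transition_counts(seq, n_states):
    """N[j, k] = number of observed j -> k transitions in one categorical series."""
    N = np.zeros((n_states, n_states))
    for a, b in zip(seq[:-1], seq[1:]):
        N[a, b] += 1
    return N

def group_transition_matrix(seqs, n_states):
    """Pooled group-specific transition matrix; add-one smoothing (a flat
    Dirichlet prior on each row) keeps every row a proper distribution."""
    N = sum(transition_counts(s, n_states) for s in seqs) + 1.0
    return N / N.sum(axis=1, keepdims=True)

def group_log_likelihood(seqs, P):
    """Log-likelihood of the series under one time-homogeneous transition
    matrix P (assumed strictly positive), conditional on each first state."""
    return sum((transition_counts(s, P.shape[0]) * np.log(P)).sum() for s in seqs)
```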
Penalized regression methods for simultaneous variable selection and coefficient estimation, especially those based on the lasso of Tibshirani (1996), have received a great deal of attention in recent years, mostly through frequentist models. Properties such as consistency have been studied, and are achieved by different lasso variations. Here we look at a fully Bayesian formulation of the problem, which is flexible enough to encompass most versions of the lasso that have been previously considered. The advantages of the hierarchical Bayesian formulations are many. In addition to the usual ease of interpretation of hierarchical models, the Bayesian formulation produces valid standard errors (which can be problematic for the frequentist lasso), and is based on a geometrically ergodic Markov chain. We compare the performance of the Bayesian lassos to their frequentist counterparts using simulations, data sets that previous lasso papers have used, and a difficult modeling problem for predicting the collapse of governments around the world. In terms of prediction mean squared error, the Bayesian lasso's performance is similar to, and in some cases better than, that of the frequentist lasso.
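As a point of reference for the hierarchical formulations discussed above, here is a minimal sketch of the standard three-block Gibbs sampler for the basic Bayesian lasso in its normal scale-mixture representation. The penalty parameter is held fixed rather than given its own prior, y is assumed centered and X standardized, and names and defaults are choices of this sketch rather than the paper's.

```python
import numpy as np

def bayesian_lasso_gibbs(X, y, lam=1.0, n_iter=5000, rng=None):
    """Gibbs sampler for the basic Bayesian lasso with fixed penalty `lam`.

    Model: y = X beta + e, e ~ N(0, sigma2 I), beta_j | sigma2, tau2_j ~
    N(0, sigma2 tau2_j), tau2_j ~ Exp(lam^2 / 2), flat prior on log sigma2.
    Returns the matrix of sampled beta vectors.
    """
    rng = np.random.default_rng(rng)
    n, p = X.shape
    XtX, Xty = X.T @ X, X.T @ y
    sigma2, tau2 = 1.0, np.ones(p)
    draws = np.empty((n_iter, p))

    for it in range(n_iter):
        # beta | rest ~ N(A^{-1} X'y, sigma2 * A^{-1}),  A = X'X + D_tau^{-1}
        A_inv = np.linalg.inv(XtX + np.diag(1.0 / tau2))
        beta = rng.multivariate_normal(A_inv @ Xty, sigma2 * A_inv)

        # sigma2 | rest ~ Inverse-Gamma
        resid = y - X @ beta
        shape = (n - 1) / 2 + p / 2
        rate = resid @ resid / 2 + beta @ (beta / tau2) / 2
        sigma2 = 1.0 / rng.gamma(shape, 1.0 / rate)

        # 1 / tau2_j | rest ~ Inverse-Gaussian(sqrt(lam^2 sigma2 / beta_j^2), lam^2)
        mu_ig = np.sqrt(lam**2 * sigma2 / beta**2)
        tau2 = 1.0 / rng.wald(mu_ig, lam**2)

        draws[it] = beta
    return draws
```

The hierarchical formulations considered in the paper extend this basic sampler to other lasso variants and to a prior on the penalty parameter.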
Full Bayesian methods are useful tools for accounting for complex data structures in high-throughput data analyses. The Bayesian FDR, which is the posterior proportion of false positives relative to the total number of rejections, has been widely used to measure statistical significance for full Bayesian methods in microarray analyses. However, the Bayesian FDR is sensitive to the prior specification, and it is not comparable to the resampling-based FDR estimates employed by most frequentist and empirical Bayesian methods. In this paper, we propose a computationally efficient algorithm to evaluate statistical significance for full Bayesian methods in the resampling-based framework. The resulting predictive Bayesian FDR is robust to the prior specification and can produce a more accurate estimate of the error rate. In addition, the proposed approach provides a general framework for the objective comparison of performance between full Bayesian methods and other frequentist and empirical Bayes methods in microarray analyses, which has been an unaddressed issue. A simulation study and a real data example are presented.
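For concreteness, a minimal sketch of how a prior-based Bayesian FDR of the kind referred to above is typically computed from posterior null probabilities; the resampling-based predictive Bayesian FDR proposed in the paper is not reproduced here, and the function names are illustrative.

```python
import numpy as np

def bayesian_fdr(post_null_prob, threshold):
    """Bayesian FDR: average posterior null probability among the genes
    called significant at the given threshold on that probability."""
    rejected = post_null_prob <= threshold
    if not rejected.any():
        return 0.0
    return post_null_prob[rejected].mean()

def max_rejections(post_null_prob, target=0.05):
    """Largest k such that calling the k genes with the smallest posterior
    null probabilities significant keeps the Bayesian FDR at or below target."""
    p = np.sort(post_null_prob)
    running = np.cumsum(p) / np.arange(1, p.size + 1)  # Bayesian FDR of the k smallest
    return int(np.searchsorted(running, target, side="right"))
```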