Bayesian Analysis

Multiple-Shrinkage Multinomial Probit Models with Applications to Simulating Geographies in Public Use Data

Lane F. Burgette and Jerome P. Reiter

Full-text: Open access


Multinomial outcomes with many levels can be challenging to model. Information typically accrues slowly with increasing sample size, yet the parameter space expands rapidly with additional covariates. Shrinking all regression parameters towards zero, as is often done in models of continuous or binary response variables, is unsatisfactory, since setting parameters equal to zero in multinomial models does not necessarily imply “no effect.” We propose an approach to modeling multinomial outcomes with many levels based on a Bayesian multinomial probit (MNP) model and a multiple shrinkage prior distribution for the regression parameters. The prior distribution encourages the MNP regression parameters to shrink toward a number of learned locations, thereby substantially reducing the dimension of the parameter space. Using simulated data, we compare the predictive performance of this model against two other recently proposed methods for big multinomial models. The results suggest that the fully Bayesian, multiple shrinkage approach can outperform these other methods. We apply the multiple shrinkage MNP to simulating replacement values for areal identifiers, e.g., census tract indicators, in order to protect data confidentiality in public use datasets.
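The remark that a zero coefficient does not imply "no effect" can be illustrated with a small simulation. What follows is a minimal sketch under simplifying assumptions, not the model from the paper: it uses three outcome categories, a single covariate x, and independent standard normal errors, whereas the MNP model generally allows a full error covariance matrix. The function name choice_probs and the coefficient values are hypothetical and chosen only for illustration.

```python
import random


def choice_probs(x, betas, n_draws=20000, seed=0):
    """Monte Carlo estimate of P(y = j | x) in a simplified probit-type model.

    Assumption (illustrative only): the latent utility of category j is
    betas[j] * x + eps_j with independent eps_j ~ N(0, 1), and the observed
    outcome is the category with the largest latent utility.
    """
    rng = random.Random(seed)
    counts = [0] * len(betas)
    for _ in range(n_draws):
        utils = [b * x + rng.gauss(0.0, 1.0) for b in betas]
        counts[utils.index(max(utils))] += 1
    return [c / n_draws for c in counts]


# Category 0 is the base category (coefficient fixed at 0).
# Category 1 also has a zero coefficient; category 2 does not.
betas = [0.0, 0.0, 2.0]

p_at_0 = choice_probs(0.0, betas)  # all three utilities have the same distribution
p_at_2 = choice_probs(2.0, betas)  # category 2 now has a much higher mean utility

print("P(y = 1 | x = 0):", p_at_0[1])
print("P(y = 1 | x = 2):", p_at_2[1])
```

In this sketch the probability of category 1 is roughly one third at x = 0 and falls close to zero at x = 2, even though its own coefficient is exactly zero. A zero coefficient only makes category 1 behave like the base category; the covariate still changes its probability through the other categories.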

Article information

Bayesian Anal., Volume 8, Number 2 (2013), 453-478.

First available in Project Euclid: 24 May 2013

Keywords: Confidentiality; Dirichlet process; disclosure; spatial; synthetic


Burgette, Lane F.; Reiter, Jerome P. Multiple-Shrinkage Multinomial Probit Models with Applications to Simulating Geographies in Public Use Data. Bayesian Anal. 8 (2013), no. 2, 453-478. doi:10.1214/13-BA816.

