Bayesian Analysis

A New Bayesian Approach to Robustness Against Outliers in Linear Regression

Philippe Gagnon, Alain Desgagné, and Mylène Bédard

Advance publication

This article is in its final form and can be cited using the date of online publication and the DOI.


Abstract

Linear regression is ubiquitous in statistical analysis. It is well understood that conflicting sources of information may contaminate the inference when the classical normality of errors is assumed. The contamination caused by the light normal tails stems from an undesirable effect: the posterior concentrates in an area between the different sources, with a scaling large enough to incorporate them all. The theory of conflict resolution in Bayesian statistics (O’Hagan and Pericchi (2012)) recommends addressing this problem by limiting the impact of outliers to obtain conclusions consistent with the bulk of the data. In this paper, we propose a model with super heavy-tailed errors to achieve this. We prove that it is wholly robust, meaning that the impact of outliers gradually vanishes as they move further and further away from the general trend. The super heavy-tailed density is similar to the normal outside of the tails, which gives rise to an efficient estimation procedure. In addition, estimates are easily computed. This is highlighted via a detailed user guide, where all steps are explained through a simulated case study. The performance is demonstrated through simulation studies. All required code is given.
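To make the idea described in the abstract concrete, here is a minimal sketch in Python of Bayesian simple linear regression in which the standardized errors follow a density that agrees with the standard normal near the centre and switches to heavier, log-Pareto-type tails. This is an illustration inspired by the abstract, not the authors' exact specification: the threshold TAU, the tail index LAM, the flat and 1/sigma priors, the simulated data, and the plain random-walk Metropolis sampler are all assumptions made here. Fitting the same contaminated data set under normal errors and under the heavy-tailed errors should illustrate the kind of behaviour the abstract describes, with the heavy-tailed posterior staying closer to the bulk of the data.

```python
# A minimal illustrative sketch, NOT the authors' exact model: Bayesian simple
# linear regression where the standardized errors follow a density matching the
# standard normal in the centre, with heavier log-Pareto-type tails beyond a
# threshold.  TAU, LAM, the priors, the data, and the sampler are assumptions
# made for this illustration only.
import numpy as np

rng = np.random.default_rng(1)

# ----- simulated data with one gross outlier ---------------------------------
n = 30
x = rng.uniform(0.0, 10.0, n)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, n)
y[-1] += 25.0                                  # contaminate a single observation

TAU, LAM = 1.96, 2.0                           # assumed threshold and tail index

def log_f_heavy(z):
    """Standard-normal log-density for |z| <= TAU; tails decaying like
    1 / (|z| (log |z|)^LAM) beyond, matched continuously at |z| = TAU."""
    z = np.abs(np.asarray(z, dtype=float))
    out = -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)        # normal core
    heavy = z > TAU
    zt = z[heavy]
    out[heavy] = (-0.5 * TAU**2 - 0.5 * np.log(2.0 * np.pi)
                  + np.log(TAU / zt)
                  + LAM * np.log(np.log(TAU) / np.log(zt)))
    return out

def log_f_normal(z):
    return -0.5 * np.asarray(z)**2 - 0.5 * np.log(2.0 * np.pi)

def log_post(theta, log_f):
    """Log-posterior with a flat prior on the regression coefficients and a
    1/sigma prior on the scale (i.e., flat on log sigma)."""
    b0, b1, log_s = theta
    z = (y - b0 - b1 * x) / np.exp(log_s)
    return np.sum(log_f(z)) - n * log_s

def rw_metropolis(log_f, n_iter=20000, step=0.1):
    """Plain random-walk Metropolis over (beta0, beta1, log sigma)."""
    theta = np.zeros(3)
    lp = log_post(theta, log_f)
    draws = np.empty((n_iter, 3))
    for i in range(n_iter):
        prop = theta + step * rng.normal(size=3)
        lp_prop = log_post(prop, log_f)
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = prop, lp_prop
        draws[i] = theta
    return draws[n_iter // 2:]                 # crude burn-in removal

for name, dens in [("normal errors      ", log_f_normal),
                   ("heavy-tailed errors", log_f_heavy)]:
    d = rw_metropolis(dens)
    b0, b1, s = d[:, 0].mean(), d[:, 1].mean(), np.exp(d[:, 2]).mean()
    print(f"{name}: posterior means  beta0={b0:.2f}  beta1={b1:.2f}  sigma={s:.2f}")
```

The design point being illustrated is the one stated in the abstract: because the error density coincides with the normal away from the tails, ordinary observations are treated essentially as under the normal model, while the slowly decaying tails prevent a single extreme observation from inflating the scale estimate and dragging the regression line.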

Article information

Source
Bayesian Anal., Advance publication (2019), 26 pages.

Dates
First available in Project Euclid: 23 May 2019

Permanent link to this document
https://projecteuclid.org/euclid.ba/1558598428

Digital Object Identifier
doi:10.1214/19-BA1157

Subjects
Primary: 62F35: Robustness and adaptive procedures
Secondary: 62J05: Linear regression

Keywords
ANOVA; ANCOVA; built-in robustness; maximum likelihood estimation; super heavy-tailed distributions; variable selection; whole robustness

Rights
Creative Commons Attribution 4.0 International License.

Citation

Gagnon, Philippe; Desgagné, Alain; Bédard, Mylène. A New Bayesian Approach to Robustness Against Outliers in Linear Regression. Bayesian Anal., advance publication, 23 May 2019. doi:10.1214/19-BA1157. https://projecteuclid.org/euclid.ba/1558598428

References

  • Andrade, J. A. A. and O’Hagan, A. (2011). “Bayesian Robustness Modelling of Location and Scale Parameters.” Scandinavian Journal of Statistics, 38(4): 691–711.
  • Anscombe, F. J. and Guttman, I. (1960). “Rejection of Outliers.” Technometrics, 2(2): 123–147.
  • Box, G. E. P. and Tiao, G. C. (1968). “A Bayesian Approach to Some Outlier Problems.” Biometrika, 55(1): 119–129.
  • Bunke, O. and Milhaud, X. (1998). “Asymptotic behavior of Bayes estimates under possibly incorrect models.” Annals of Statistics, 26(2): 617–644.
  • Desgagné, A. (2013). “Full Robustness in Bayesian Modelling of a Scale Parameter.” Bayesian Analysis, 8(1): 187–220.
  • Desgagné, A. (2015). “Robustness to Outliers in Location–Scale Parameter Model using Log-Regularly Varying Distributions.” Annals of Statistics, 43(4): 1568–1595.
  • Desgagné, A. and Gagnon, P. (2019). “Bayesian robustness to outliers in linear regression and ratio estimation.” Brazilian Journal of Probability and Statistics, 33(2): 205–221. ArXiv:1612.05307.
  • Gagnon, P., Desgagné, A., and Bédard, M. (2019). “A New Bayesian Approach to Robustness Against Outliers in Linear Regression – Supplementary Material.” Bayesian Analysis.
  • Geman, S. and Geman, D. (1984). “Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images.” IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6): 721–741.
  • Gervini, D. and Yohai, V. J. (2002). “A class of robust and fully efficient regression estimators.” Annals of Statistics, 30(2): 583–616.
  • Green, P. J. (1995). “Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination.” Biometrika, 82(4): 711–732.
  • Hastings, W. K. (1970). “Monte Carlo sampling methods using Markov chains and their applications.” Biometrika, 57(1): 97–109.
  • Huber, P. J. (1973). “Robust Regression: Asymptotics, Conjectures and Monte Carlo.” Annals of Statistics, 1(5): 799–821.
  • Karamata, J. (1930). “Sur un mode de croissance régulière des fonctions.” Mathematica (Cluj), 4: 38–53.
  • Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H., and Teller, E. (1953). “Equation of State Calculations by Fast Computing Machines.” Journal of Chemical Physics, 21(6): 1087–1092.
  • Neal, R. M. (2011). “MCMC using Hamiltonian dynamics.” In Handbook of Markov Chain Monte Carlo, 113–162. CRC Press.
  • O’Hagan, A. and Pericchi, L. (2012). “Bayesian heavy-tailed models and conflict resolution: A review.” Brazilian Journal of Probability and Statistics, 26(4): 372–401.
  • Peña, D., Zamar, R., and Yan, G. (2009). “Bayesian likelihood robustness in linear models.” Journal of Statistical Planning and Inference, 139(7): 2196–2207.
  • Rousseeuw, P. J. (1985). “Multivariate estimation with high breakdown point.” In Mathematical Statistics and Applications, Vol. B, 283–297. Reidel.
  • Rousseeuw, P. J. and Yohai, V. J. (1984). “Robust regression by means of S-estimators.” In Robust and Nonlinear Time Series Analysis, 256–272. Springer.
  • Scheffé, H. (1947). “A Useful Convergence Theorem for Probability Distributions.” Annals of Mathematical Statistics, 18(3): 434–438.
  • West, M. (1984). “Outlier Models and Prior Distributions in Bayesian Linear Regression.” Journal of the Royal Statistical Society. Series B. Statistical Methodology, 46(3): 431–439.
  • Yohai, V. J. (1987). “High breakdown-point and high efficiency robust estimates for regression.” Annals of Statistics, 15(2): 642–656.
  • Yu, C. and Yao, W. (2017). “Robust linear regression: A review and comparison.” Communications in Statistics – Simulation and Computation, 46(8): 6261–6282.

Supplemental materials