Electronic Journal of Statistics

A note on parameter estimation for misspecified regression models with heteroskedastic errors

James P. Long

Full-text: Open access


Misspecified models often provide useful information about the true data generating distribution. For example, if $y$ is a non-linear function of $x$, the least squares estimator $\widehat{\beta}$ is an estimate of $\beta$, the slope of the best linear approximation to the non-linear function. Motivated by problems in astronomy, we study how to incorporate observation measurement error variances into fitting parameters of misspecified models. Our asymptotic theory focuses on the particular case of linear regression, where weighted least squares (WLS) procedures are often used to account for heteroskedasticity. We find that when the response is a non-linear function of the independent variable, the standard procedure of weighting by the inverse of the observation variances can be counter-productive. In particular, ordinary least squares (OLS) may have lower asymptotic variance. We construct an adaptive estimator which has lower asymptotic variance than either OLS or standard WLS. We demonstrate our theory in a small simulation and apply these ideas to the problem of estimating the period of a periodic function using a sinusoidal model.
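The claim that inverse-variance weighting can be counter-productive under misspecification can be illustrated with a small Monte Carlo sketch. This is not the simulation from the paper: the true function $f(x) = x^2$, the uniform design on $[0, 1]$, the two-point distribution for the known error standard deviations, and the names `fit_slope` and `simulate` are all assumptions chosen for illustration. Under these assumptions the best linear approximation to $f$ has slope 1, and the error variances are drawn independently of $x$, so both estimators target the same slope and only their spread differs.

```python
import numpy as np


def fit_slope(x, y, w):
    """Slope of a weighted least squares fit of y on (1, x) with weights w."""
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w  # scale each observation (column of X.T) by its weight
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    return beta[1]


def simulate(n=200, reps=2000, seed=0):
    """Compare OLS and inverse-variance WLS when the linear model is misspecified.

    Illustrative assumptions (not taken from the paper):
      - x ~ Uniform(0, 1) and the true mean is f(x) = x**2, so the best
        linear approximation has slope Cov(x, x**2) / Var(x) = 1;
      - each observation has a known noise s.d. of 0.005 or 0.05,
        drawn independently of x.
    """
    rng = np.random.default_rng(seed)
    ols = np.empty(reps)
    wls = np.empty(reps)
    for r in range(reps):
        x = rng.uniform(0.0, 1.0, n)
        sigma = rng.choice([0.005, 0.05], size=n)
        y = x**2 + sigma * rng.standard_normal(n)
        ols[r] = fit_slope(x, y, np.ones(n))        # equal weights
        wls[r] = fit_slope(x, y, 1.0 / sigma**2)    # inverse-variance weights
    return ols, wls


ols, wls = simulate()
print("mean slope  OLS:", ols.mean(), " WLS:", wls.mean())
print("variance    OLS:", ols.var(), " WLS:", wls.var())
```

In this setup the misspecification error is large relative to the measurement noise, so inverse-variance weighting concentrates nearly all weight on the low-noise half of the sample without reducing the dominant error source. One would therefore expect the WLS slope to vary more across replications than the OLS slope, while both remain centered near 1.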

Article information

Electron. J. Statist., Volume 11, Number 1 (2017), 1464-1490.

Received: August 2016
First available in Project Euclid: 19 April 2017

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1492567402

Digital Object Identifier
doi:10.1214/17-EJS1255


Primary: 62J05: Linear regression
Secondary: 62F10: Point estimation

Heteroskedasticity, model misspecification, approximate models, weighted least squares, sandwich estimators, astrostatistics

Creative Commons Attribution 4.0 International License.


Long, James P. A note on parameter estimation for misspecified regression models with heteroskedastic errors. Electron. J. Statist. 11 (2017), no. 1, 1464-1490. doi:10.1214/17-EJS1255. https://projecteuclid.org/euclid.ejs/1492567402


