Electronic Journal of Statistics

Oracle estimation of parametric transformation models

Yair Goldberg, Wenbin Lu, and Jason Fine

Full-text: Open access


Transformation models, like the Box-Cox transformation, are widely used in regression to reduce non-additivity, non-normality, and heteroscedasticity. The question of whether one may or may not treat the estimated transformation parameter as fixed in inference about other model parameters has a long and controversial history (Bickel and Doksum, 1981; Hinkley and Runger, 1984). While the frequentist wisdom is that uncertainty regarding the true value of the transformation parameter cannot be ignored, in practice, difficulties in interpretation arise if the transformation is regarded as random rather than fixed. In this paper, we suggest a golden-mean methodology that attempts to reconcile these philosophies. Penalized estimation yields oracle estimates of transformations that enable treating the transformation parameter as known when the data indicate one of a prespecified set of transformations of scientific interest. When the true transformation is outside this set, rigorous frequentist inference is still achieved. The methodology permits multiple candidate values for the transformation, as is common in applications, while simultaneously accommodating variable selection in the regression model. Theoretical issues, such as selection consistency and the oracle property, are rigorously established. Numerical studies, including extensive simulations and real data examples, illustrate the practical utility of the proposed methods.
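To make the setting concrete, the following is a minimal sketch (not the authors' estimator) of maximum likelihood estimation of the Box-Cox parameter via the profile log-likelihood, followed by an illustrative "snap to a prespecified candidate" step in the spirit of the oracle idea described above. The candidate set and the tolerance threshold are assumptions chosen purely for illustration.

```python
import numpy as np

def boxcox(y, lam):
    """Box-Cox transform: (y^lam - 1)/lam, or log(y) when lam == 0."""
    return np.log(y) if lam == 0 else (y ** lam - 1.0) / lam

def profile_loglik(y, lam):
    """Profile log-likelihood of lambda under a normal model
    (intercept-only for simplicity; a regression fit would replace z.var())."""
    n = len(y)
    z = boxcox(y, lam)
    sigma2 = z.var()  # MLE of the variance after transforming
    # (lam - 1) * sum(log y) is the log-Jacobian of the transformation
    return -0.5 * n * np.log(sigma2) + (lam - 1.0) * np.log(y).sum()

rng = np.random.default_rng(0)
y = np.exp(rng.normal(size=500))  # lognormal data: true lambda = 0
grid = np.linspace(-1.0, 1.0, 201)
lam_mle = grid[np.argmax([profile_loglik(y, l) for l in grid])]

# Illustrative oracle-style step: if the MLE is close to a prespecified
# candidate transformation, treat that candidate as the estimate.
candidates = np.array([-1.0, 0.0, 0.5, 1.0])  # hypothetical candidate set
tol = 3.0 / np.sqrt(len(y))                   # hypothetical threshold
nearest = candidates[np.argmin(np.abs(candidates - lam_mle))]
lam_hat = nearest if abs(nearest - lam_mle) < tol else lam_mle
print(lam_mle, lam_hat)
```

The snap step here is a crude thresholding device; the paper instead achieves this behavior through penalized estimation, which yields selection consistency and valid frequentist inference even when the true transformation lies outside the candidate set.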

Article information

Electron. J. Statist., Volume 10, Number 1 (2016), 90-120.

Received: July 2015
First available in Project Euclid: 17 February 2016

Permanent link to this document: https://projecteuclid.org/euclid.ejs/1455715958

Digital Object Identifier: doi:10.1214/15-EJS1083

Keywords: Box-Cox transformation; maximum likelihood estimation; oracle transformation; shrinkage estimation


Goldberg, Yair; Lu, Wenbin; Fine, Jason. Oracle estimation of parametric transformation models. Electron. J. Statist. 10 (2016), no. 1, 90--120. doi:10.1214/15-EJS1083. https://projecteuclid.org/euclid.ejs/1455715958



  • P. J. Bickel and K. A. Doksum. An analysis of transformations revisited. Journal of the American Statistical Association, 76(374):296–311, 1981.
  • G. E. P. Box and D. R. Cox. An analysis of transformations. Journal of the Royal Statistical Society, Series B, 26(2):211–252, 1964.
  • D. R. Brillinger. A generalized linear model with "Gaussian" regressor variables. In A Festschrift for Erich L. Lehmann, pages 97–114. Chapman and Hall, 1982.
  • R. J. Carroll. Prediction and power transformations when the choice of power is restricted to a finite set. Journal of the American Statistical Association, 77(380):908–915, 1982.
  • R. J. Carroll and D. Ruppert. Transformation and Weighting in Regression. Chapman and Hall, 1988.
  • K. Cho, I. K. Yeo, R. A. Johnson, and W. Y. Loh. Prediction interval estimation in transformed linear models. Statistics & Probability Letters, 51:345–350, 2001.
  • K. A. Doksum and C.-W. Wong. Statistical tests based on transformed data. Journal of the American Statistical Association, 78(382):411–417, 1983.
  • J. Fan and R. Li. Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96(456):1348–1360, 2001.
  • J. H. Friedman. Fast sparse regression and classification. International Journal of Forecasting, 28(3):722–738, 2012.
  • Y. Goldberg, W. Lu, and J. Fine. Supplement to "Oracle estimation of parametric transformation models." DOI:10.1214/15-EJS1083SUPP, 2016.
  • F. Hernandez and R. A. Johnson. The large-sample behavior of transformations to normality. Journal of the American Statistical Association, 75(372):855–861, 1980.
  • D. V. Hinkley and G. Runger. The analysis of transformed data. Journal of the American Statistical Association, 79(386):302–309, 1984.
  • M. R. Kosorok. Introduction to Empirical Processes and Semiparametric Inference. Springer, 2008.
  • W. Lu, Y. Goldberg, and J. P. Fine. On the robustness of the adaptive lasso to model misspecification. Biometrika, 99:717–731, 2012.
  • J. Lv and Y. Fan. A unified approach to model selection and sparse recovery using regularized least squares. The Annals of Statistics, 37(6A):3498–3528, 2009.
  • B. M. Pötscher and H. Leeb. On the distribution of penalized maximum likelihood estimators: The Lasso, SCAD, and thresholding. Journal of Multivariate Analysis, 100(9):2065–2082, 2009.
  • B. M. Pötscher and U. Schneider. Confidence sets based on penalized maximum likelihood estimators in Gaussian regression. Electronic Journal of Statistics, 4:334–360, 2010.
  • T. M. Stoker. Consistent estimation of scaled coefficients. Econometrica, 54(6):1461–1481, 1986.
  • R. Tibshirani. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B (Methodological), 58(1):267–288, 1996.
  • J. W. Tukey. Exploratory Data Analysis. Addison-Wesley, 1977.
  • A. W. van der Vaart. Asymptotic Statistics. Cambridge University Press, 2000.
  • H. White. Maximum likelihood estimation of misspecified models. Econometrica, 50(1):1–25, 1982.
  • I.-K. Yeo. Variable selection and transformation in linear regression models. Statistics & Probability Letters, 72(3):219–226, 2005.
  • I.-K. Yeo and R. A. Johnson. A new family of power transformations to improve normality or symmetry. Biometrika, 87(4):954–959, 2000.
  • C.-H. Zhang. Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38(2):894–942, 2010.
  • H. Zou. The adaptive Lasso and its oracle properties. Journal of the American Statistical Association, 101(476):1418–1429, 2006.
  • H. Zou and H. H. Zhang. On the adaptive elastic-net with a diverging number of parameters. The Annals of Statistics, 37(4):1733–1751, 2009.

Supplemental materials