Electronic Journal of Statistics

Testing linearity and relevance of ordinal predictors

Jan Gertheiss and Franziska Oehrlein

Full-text: Open access

Abstract

In a linear model, the relevance of a categorical predictor with ordered levels is typically tested with the standard F-test known from statistical textbooks. The same test can also be used to check whether the regression function is linear in the ordinal predictor’s class labels. In this paper we propose an alternative (restricted) likelihood ratio test for these hypotheses that is especially suited to ordinal predictors and is based on the mixed model formulation of penalized dummy coefficients. We show in simulation studies that the new test is more powerful than the standard F-test in many situations. The advantage of the new test is especially striking when the number of ordered levels is moderate or large. By exploiting the relationship to mixed effects models and robust existing fitting software, the test and its null distribution can be obtained very quickly; a fast R implementation is provided.
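
As a point of reference, the two textbook F-tests named in the abstract can be sketched as follows. This is only an illustration of the baseline, not the proposed restricted likelihood ratio test and not the authors' R implementation. It assumes a single ordinal predictor coded by integer class labels, no further covariates, and at least three observed levels; all function names and the toy data are hypothetical.

```python
from statistics import mean


def rss_group_means(x, y):
    """RSS of the dummy-coded model (one free mean per level) and the level count."""
    groups = {}
    for level, value in zip(x, y):
        groups.setdefault(level, []).append(value)
    rss = sum((v - mean(vals)) ** 2 for vals in groups.values() for v in vals)
    return rss, len(groups)


def rss_intercept_only(y):
    """RSS of the null model for relevance: a common mean for all levels."""
    ybar = mean(y)
    return sum((v - ybar) ** 2 for v in y)


def rss_linear(x, y):
    """RSS of the null model for linearity: a straight line in the class labels."""
    xbar, ybar = mean(x), mean(y)
    sxx = sum((a - xbar) ** 2 for a in x)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    slope = sxy / sxx
    return sum((b - ybar - slope * (a - xbar)) ** 2 for a, b in zip(x, y))


def f_statistic(rss_null, rss_full, df_num, df_den):
    """Standard F statistic comparing nested linear models."""
    return ((rss_null - rss_full) / df_num) / (rss_full / df_den)


def ordinal_f_tests(x, y):
    """Return (F, numerator df, denominator df) for relevance and linearity."""
    n = len(y)
    rss_full, k = rss_group_means(x, y)
    relevance = f_statistic(rss_intercept_only(y), rss_full, k - 1, n - k)
    linearity = f_statistic(rss_linear(x, y), rss_full, k - 2, n - k)
    return {
        "relevance": (relevance, k - 1, n - k),
        "linearity": (linearity, k - 2, n - k),
    }


# Toy data (hypothetical): four ordered levels, two observations each.
x = [1, 1, 2, 2, 3, 3, 4, 4]
y = [1.0, 3.0, 3.0, 5.0, 5.0, 7.0, 7.0, 9.0]
result = ordinal_f_tests(x, y)
print(result)
```

Both statistics use the dummy-coded fit as the full model, so their numerator degrees of freedom (K − 1 and K − 2 for K levels) grow with the number of levels. P-values would come from the corresponding F distributions, which are left out here to keep the sketch free of dependencies.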

Article information

Source
Electron. J. Statist., Volume 5 (2011), 1935-1959.

Dates
First available in Project Euclid: 30 December 2011

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1325264853

Digital Object Identifier
doi:10.1214/11-EJS661

Mathematical Reviews number (MathSciNet)
MR2870153

Zentralblatt MATH identifier
1329.62312

Keywords
Classical linear model; ordinal covariates; likelihood ratio tests; linear mixed models; smoothing; zero variance component

Citation

Gertheiss, Jan; Oehrlein, Franziska. Testing linearity and relevance of ordinal predictors. Electron. J. Statist. 5 (2011), 1935–1959. doi:10.1214/11-EJS661. https://projecteuclid.org/euclid.ejs/1325264853

