Source: Ann. Statist. Volume 35, Number 6 (2007), 2565-2588.
Cook’s [J. Roy. Statist. Soc. Ser. B 48 (1986) 133–169] local influence approach, based on normal curvature, is an important diagnostic tool for assessing the local influence of minor perturbations to a statistical model. However, no rigorous approach has been developed to address two fundamental issues: the selection of an appropriate perturbation and the development of influence measures for objective functions at a point with a nonzero first derivative. The aim of this paper is to develop a differential-geometrical framework for a perturbation model (called the perturbation manifold) and to use the associated metric tensor and affine curvatures to resolve these issues. We show that the metric tensor of the perturbation manifold provides important information for selecting an appropriate perturbation of a model. Moreover, we introduce new influence measures that are applicable to objective functions at any point. Examples including linear regression models and linear mixed models are examined to demonstrate the effectiveness of the new influence measures in identifying influential observations.
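For orientation, the classical normal-curvature diagnostic that the abstract takes as its starting point can be sketched for the simplest case: case-weight perturbation in a linear regression model. This is an illustration of Cook's (1986) approach only, not of the new influence measures proposed in the paper. It assumes that the error variance is treated as known and replaced by its maximum likelihood estimate, in which case the normal curvature in a unit direction $l$ is $C_l = 2\, l^T D(e) H D(e)\, l / \hat\sigma^2$, with $H$ the hat matrix and $D(e)$ the diagonal matrix of residuals. The function name and the toy data are hypothetical.

```python
import numpy as np

def case_weight_local_influence(X, y):
    """Cook-style normal curvature for case-weight perturbation in OLS.

    Assumes X has full column rank and sigma^2 is treated as known,
    replaced here by its ML estimate RSS / n.
    Returns the maximum curvature C_max and its unit direction l_max.
    """
    n = X.shape[0]
    Q, _ = np.linalg.qr(X)              # orthonormal basis of the column space of X
    H = Q @ Q.T                         # hat matrix X (X^T X)^{-1} X^T
    e = y - H @ y                       # ordinary residuals
    sigma2 = e @ e / n                  # ML estimate of the error variance
    M = (e[:, None] * H) * e[None, :]   # D(e) H D(e), symmetric and PSD
    eigvals, eigvecs = np.linalg.eigh(M)  # eigenvalues in ascending order
    l_max = eigvecs[:, -1]              # direction of largest curvature
    c_max = 2.0 * eigvals[-1] / sigma2
    return c_max, l_max

# Toy data (made up): an exact line with one shifted response at index 10.
x = np.linspace(0.0, 1.0, 20)
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
y[10] += 10.0

c_max, l_max = case_weight_local_influence(X, y)
flagged = int(np.argmax(np.abs(l_max)))  # case with the largest component of l_max
```

Large components of `l_max` point to the cases whose weights most affect the fit locally. In this toy example the shifted observation dominates the direction of maximum curvature.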
References
Amari, S. (1985). Differential-Geometrical Methods in Statistics. Lecture Notes in Statist. 28. Springer, Berlin.
Bates, D. M. and Watts, D. G. (1980). Relative curvature measures of nonlinearity (with discussion). J. Roy. Statist. Soc. Ser. B 42 1--25.
Beckman, R. J., Nachtsheim, C. J. and Cook, R. D. (1987). Diagnostics for mixed-model analysis of variance. Technometrics 29 413--426.
Carroll, R. J., Ruppert, D. and Stefanski, L. A. (1995). Measurement Error in Nonlinear Models. Chapman and Hall, London.
Claeskens, G. and Hjort, N. L. (2004). Goodness of fit via non-parametric likelihood ratios. Scand. J. Statist. 31 487--513.
Coddington, E. A. (1961). An Introduction to Ordinary Differential Equations. Prentice Hall, Englewood Cliffs, NJ.
Cook, R. D. (1986). Assessment of local influence (with discussion). J. Roy. Statist. Soc. Ser. B 48 133--169.
Cook, R. D. and Weisberg, S. (1982). Residuals and Influence in Regression. Chapman and Hall, London.
Cox, D. R. and Reid, N. (1987). Parameter orthogonality and approximate conditional inference (with discussion). J. Roy. Statist. Soc. Ser. B 49 1--39.
Efron, B. (1975). Defining the curvature of a statistical problem (with applications to second order efficiency) (with discussion). Ann. Statist. 3 1189--1242.
Fung, W. and Kwan, C. (1997). A note on local influence based on normal curvature. J. Roy. Statist. Soc. Ser. B 59 839--843.
Fung, W., Zhu, Z., Wei, B. and He, X. (2002). Influence diagnostics and outlier tests for semiparametric mixed models. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 565--579.
Ibrahim, J. G., Chen, M.-H., Lipsitz, S. R. and Herring, A. H. (2005). Missing-data methods for generalized linear models: A comparative review. J. Amer. Statist. Assoc. 100 332--346.
Kass, R. E. and Vos, P. W. (1997). Geometrical Foundations of Asymptotic Inference. Wiley, New York.
Lauritzen, S. L. (1987). Statistical manifolds. In Differential Geometry in Statistical Inference (S. Amari, O. E. Barndorff-Nielsen, R. E. Kass, S. L. Lauritzen and C. R. Rao, eds.) 163--216. IMS, Hayward, CA.
Lawrance, A. J. (1988). Regression transformation diagnostics using local influence. J. Amer. Statist. Assoc. 83 1067--1072.
Lee, S. and Tang, N. (2004). Local influence analysis of nonlinear structural equation models. Psychometrika 69 573--592.
Li, B. and McCullagh, P. (1994). Potential functions and conservative estimating functions. Ann. Statist. 22 340--356.
McCullagh, P. and Cox, D. R. (1986). Invariants and likelihood ratio statistics. Ann. Statist. 14 1419--1430.
Murray, M. K. and Rice, J. W. (1993). Differential Geometry and Statistics. Chapman and Hall, London.
Ouwens, M. J. N., Tan, F. and Berger, M. (2001). Local influence to detect influential data structures for generalized linear mixed models. Biometrics 57 1166--1172.
Pan, J. and Fang, K. (2002). Growth Curve Models and Statistical Diagnostics. Springer, New York.
Poon, W. and Poon, Y. (1999). Conformal normal curvature and assessment of local influence. J. R. Stat. Soc. Ser. B Stat. Methodol. 61 51--61.
Stier, D. M., Leventhal, J. M., Berg, A. T., Johnson, L. and Mezger, J. (1993). Are children born to young mothers at increased risk of maltreatment? Pediatrics 91 642--648.
St. Laurent, R. T. and Cook, R. D. (1993). Leverage, local influence and curvature in nonlinear regression. Biometrika 80 99--106.
Tsai, C. and Wu, X. (1992). Transformation-model diagnostics. Technometrics 34 197--202.
Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. Springer, New York.
Verbeke, G., Molenberghs, G., Thijs, H., Lesaffre, E. and Kenward, M. G. (2001). Sensitivity analysis for nonrandom dropout: A local influence approach. Biometrics 57 7--14.
Wasserman, D. R. and Leventhal, J. M. (1993). Maltreatment of children born to cocaine-dependent mothers. American J. Diseases of Children 147 1324--1328.
Wei, B., Hu, Y. and Fung, W. (1998). Generalized leverage and its applications. Scand. J. Statist. 25 25--37.
Wu, X. and Luo, Z. (1993). Second-order approach to local influence. J. Roy. Statist. Soc. Ser. B 55 929--936.
Wu, X. and Luo, Z. (1993). Residual sum of squares and multiple potential, diagnostics by a second-order local approach. Statist. Probab. Lett. 16 289--296.
Yuan, K.-H. and Bentler, P. M. (2001). Effect of outliers on estimators and tests in covariance structure analysis. British J. Math. Statist. Psych. 54 161--175.
Zhang, H. (1997). Multivariate adaptive splines for the analysis of longitudinal data. J. Comput. Graph. Statist. 6 74--91.
Zhang, H. (1999). Analysis of infant growth curves using multivariate adaptive splines. Biometrics 55 452--459.
Zhong, X., Wei, B. and Fung, W. (2000). Influence analysis for linear measurement error models. Ann. Inst. Statist. Math. 52 367--379.
Zhu, H. and Lee, S. (2001). Local influence for incomplete data models. J. R. Stat. Soc. Ser. B Stat. Methodol. 63 111--126.
Zhu, H. and Lee, S. (2003). Local influence for generalized linear mixed models. Canad. J. Statist. 31 293--309.
Zhu, H. and Wei, B. (1997). Preferred point $\alpha$-manifold and Amari's $\alpha$-connections. Statist. Probab. Lett. 36 219--229.
Zhu, H. and Zhang, H. (2004). A diagnostic procedure based on local influence. Biometrika 91 579--589.
Zhu, Z., He, X. and Fung, W. (2003). Local influence analysis for penalized Gaussian likelihood estimators in partially linear models. Scand. J. Statist. 30 767--780.