The Annals of Statistics

Perturbation selection and influence measures in local influence analysis

Hongtu Zhu, Joseph G. Ibrahim, Sikyum Lee, and Heping Zhang
Source: Ann. Statist. Volume 35, Number 6 (2007), 2565-2588.

Abstract

Cook’s [J. Roy. Statist. Soc. Ser. B 48 (1986) 133–169] local influence approach based on normal curvature is an important diagnostic tool for assessing the local influence of minor perturbations to a statistical model. However, no rigorous approach has been developed to address two fundamental issues: the selection of an appropriate perturbation and the development of influence measures for objective functions at a point with a nonzero first derivative. The aim of this paper is to develop a differential-geometrical framework for a perturbation model (called the perturbation manifold) and to use its associated metric tensor and affine curvatures to resolve these issues. We show that the metric tensor of the perturbation manifold provides important information about selecting an appropriate perturbation of a model. Moreover, we introduce new influence measures that are applicable to objective functions at any point. Examples including linear regression models and linear mixed models are examined to demonstrate the effectiveness of the new influence measures for identifying influential observations.

Primary Subjects: 62J20
Secondary Subjects: 62-07
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1201012972
Digital Object Identifier: doi:10.1214/009053607000000343
Mathematical Reviews number (MathSciNet): MR2382658
Zentralblatt MATH identifier: 1129.62068

References

Amari, S. (1985). Differential-Geometrical Methods in Statistics. Lecture Notes in Statist. 28. Springer, Berlin.
Mathematical Reviews (MathSciNet): MR0788689
Zentralblatt MATH: 0559.62001
Bates, D. M. and Watts, D. G. (1980). Relative curvature measures of nonlinearity (with discussion). J. Roy. Statist. Soc. Ser. B 42 1--25.
Mathematical Reviews (MathSciNet): MR0567196
Beckman, R. J., Nachtsheim, C. J. and Cook, R. D. (1987). Diagnostics for mixed-model analysis of variance. Technometrics 29 413--426.
Mathematical Reviews (MathSciNet): MR0918527
Digital Object Identifier: doi:10.2307/1269452
Zentralblatt MATH: 0632.62068
Carroll, R. J., Ruppert, D. and Stefanski, L. A. (1995). Measurement Error in Nonlinear Models. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR1630517
Zentralblatt MATH: 0853.62048
Claeskens, G. and Hjort, N. L. (2004). Goodness of fit via non-parametric likelihood ratios. Scand. J. Statist. 31 487--513.
Mathematical Reviews (MathSciNet): MR2101536
Digital Object Identifier: doi:10.1111/j.1467-9469.2004.00403.x
Zentralblatt MATH: 1065.62056
Coddington, E. A. (1961). An Introduction to Ordinary Differential Equations. Prentice Hall, Englewood Cliffs, NJ.
Mathematical Reviews (MathSciNet): MR0126573
Zentralblatt MATH: 0123.27301
Cook, R. D. (1986). Assessment of local influence (with discussion). J. Roy. Statist. Soc. Ser. B 48 133--169.
Mathematical Reviews (MathSciNet): MR0867994
Cook, R. D. and Weisberg, S. (1982). Residuals and Influence in Regression. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR0675263
Zentralblatt MATH: 0564.62054
Cox, D. R. and Reid, N. (1987). Parameter orthogonality and approximate conditional inference (with discussion). J. Roy. Statist. Soc. Ser. B 49 1--39.
Mathematical Reviews (MathSciNet): MR0893334
Efron, B. (1975). Defining the curvature of a statistical problem (with applications to second order efficiency) (with discussion). Ann. Statist. 3 1189--1242.
Mathematical Reviews (MathSciNet): MR0428531
Digital Object Identifier: doi:10.1214/aos/1176343282
Project Euclid: euclid.aos/1176343282
Zentralblatt MATH: 0321.62013
Fung, W. and Kwan, C. (1997). A note on local influence based on normal curvature. J. Roy. Statist. Soc. Ser. B 59 839--843.
Mathematical Reviews (MathSciNet): MR1483218
Digital Object Identifier: doi:10.1111/1467-9868.00100
Fung, W., Zhu, Z., Wei, B. and He, X. (2002). Influence diagnostics and outlier tests for semiparametric mixed models. J. R. Stat. Soc. Ser. B Stat. Methodol. 64 565--579.
Mathematical Reviews (MathSciNet): MR1924307
Digital Object Identifier: doi:10.1111/1467-9868.00351
Zentralblatt MATH: 1090.62039
Ibrahim, J. G., Chen, M.-H., Lipsitz, S. R. and Herring, A. H. (2005). Missing-data methods for generalized linear models: A comparative review. J. Amer. Statist. Assoc. 100 332--346.
Mathematical Reviews (MathSciNet): MR2166072
Digital Object Identifier: doi:10.1198/016214504000001844
Zentralblatt MATH: 1117.62360
Kass, R. E. and Vos, P. W. (1997). Geometrical Foundations of Asymptotic Inference. Wiley, New York.
Mathematical Reviews (MathSciNet): MR1461540
Zentralblatt MATH: 0880.62005
Lauritzen, S. L. (1987). Statistical manifolds. In Differential Geometry in Statistical Inference (S. Amari, O. E. Barndorff-Nielsen, R. E. Kass, S. L. Lauritzen and C. R. Rao, eds.) 163--216. IMS, Hayward, CA.
Zentralblatt MATH: 0694.62001
Lawrance, A. J. (1988). Regression transformation diagnostics using local influence. J. Amer. Statist. Assoc. 83 1067--1072.
Mathematical Reviews (MathSciNet): MR0997583
Digital Object Identifier: doi:10.2307/2290137
Lee, S. and Tang, N. (2004). Local influence analysis of nonlinear structural equation models. Psychometrika 69 573--592.
Mathematical Reviews (MathSciNet): MR2272465
Digital Object Identifier: doi:10.1007/BF02289856
Li, B. and McCullagh, P. (1994). Potential functions and conservative estimating functions. Ann. Statist. 22 340--356.
Mathematical Reviews (MathSciNet): MR1272087
Digital Object Identifier: doi:10.1214/aos/1176325372
Project Euclid: euclid.aos/1176325372
Zentralblatt MATH: 0805.62003
McCullagh, P. and Cox, D. R. (1986). Invariants and likelihood ratio statistics. Ann. Statist. 14 1419--1430.
Mathematical Reviews (MathSciNet): MR0868309
Digital Object Identifier: doi:10.1214/aos/1176350167
Project Euclid: euclid.aos/1176350167
Zentralblatt MATH: 0615.62041
Murray, M. K. and Rice, J. W. (1993). Differential Geometry and Statistics. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR1293124
Zentralblatt MATH: 0804.53001
Ouwens, M. J. N., Tan, F. and Berger, M. (2001). Local influence to detect influential data structures for generalized linear mixed models. Biometrics 57 1166--1172.
Mathematical Reviews (MathSciNet): MR1973821
Digital Object Identifier: doi:10.1111/j.0006-341X.2001.01166.x
Pan, J. and Fang, K. (2002). Growth Curve Models and Statistical Diagnostics. Springer, New York.
Mathematical Reviews (MathSciNet): MR1937691
Zentralblatt MATH: 1024.62025
Poon, W. and Poon, Y. (1999). Conformal normal curvature and assessment of local influence. J. R. Stat. Soc. Ser. B Stat. Methodol. 61 51--61.
Mathematical Reviews (MathSciNet): MR1664096
Digital Object Identifier: doi:10.1111/1467-9868.00162
Zentralblatt MATH: 0913.62062
Stier, D. M., Leventhal, J. M., Berg, A. T., Johnson, L. and Mezger, J. (1993). Are children born to young mothers at increased risk of maltreatment? Pediatrics 91 642--648.
St. Laurent, R. T. and Cook, R. D. (1993). Leverage, local influence and curvature in nonlinear regression. Biometrika 80 99--106.
Mathematical Reviews (MathSciNet): MR1225217
Digital Object Identifier: doi:10.1093/biomet/80.1.99
Zentralblatt MATH: 0769.62044
Tsai, C. and Wu, X. (1992). Transformation-model diagnostics. Technometrics 34 197--202.
Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. Springer, New York.
Mathematical Reviews (MathSciNet): MR1880596
Zentralblatt MATH: 0956.62055
Verbeke, G., Molenberghs, G., Thijs, H., Lesaffre, E. and Kenward, M. G. (2001). Sensitivity analysis for nonrandom dropout: A local influence approach. Biometrics 57 7--14.
Mathematical Reviews (MathSciNet): MR1833286
Digital Object Identifier: doi:10.1111/j.0006-341X.2001.00007.x
Wasserman, D. R. and Leventhal, J. M. (1993). Maltreatment of children born to cocaine-dependent mothers. American J. Diseases of Children 147 1324--1328.
Wei, B., Hu, Y. and Fung, W. (1998). Generalized leverage and its applications. Scand. J. Statist. 25 25--37.
Mathematical Reviews (MathSciNet): MR1614235
Digital Object Identifier: doi:10.1111/1467-9469.00086
Zentralblatt MATH: 0905.62070
Wu, X. and Luo, Z. (1993). Second-order approach to local influence. J. Roy. Statist. Soc. Ser. B 55 929--936.
Wu, X. and Luo, Z. (1993). Residual sum of squares and multiple potential, diagnostics by a second-order local approach. Statist. Probab. Lett. 16 289--296.
Yuan, K.-H. and Bentler, P. M. (2001). Effect of outliers on estimators and tests in covariance structure analysis. British J. Math. Statist. Psych. 54 161--175.
Mathematical Reviews (MathSciNet): MR1836858
Digital Object Identifier: doi:10.1348/000711001159366
Zhang, H. (1997). Multivariate adaptive splines for the analysis of longitudinal data. J. Comput. Graph. Statist. 6 74--91.
Mathematical Reviews (MathSciNet): MR1451991
Digital Object Identifier: doi:10.2307/1390725
Zhang, H. (1999). Analysis of infant growth curves using multivariate adaptive splines. Biometrics 55 452--459.
Zhong, X., Wei, B. and Fung, W. (2000). Influence analysis for linear measurement error models. Ann. Inst. Statist. Math. 52 367--379.
Mathematical Reviews (MathSciNet): MR1763569
Digital Object Identifier: doi:10.1023/A:1004126108349
Zentralblatt MATH: 0971.62038
Zhu, H. and Lee, S. (2001). Local influence for incomplete data models. J. R. Stat. Soc. Ser. B Stat. Methodol. 63 111--126.
Mathematical Reviews (MathSciNet): MR1811994
Digital Object Identifier: doi:10.1111/1467-9868.00279
Zentralblatt MATH: 0976.62071
Zhu, H. and Lee, S. (2003). Local influence for generalized linear mixed models. Canad. J. Statist. 31 293--309.
Mathematical Reviews (MathSciNet): MR2030126
Digital Object Identifier: doi:10.2307/3316088
Zhu, H. and Wei, B. (1997). Preferred point $\alpha$-manifold and Amari's $\alpha$-connections. Statist. Probab. Lett. 36 219--229.
Mathematical Reviews (MathSciNet): MR1615358
Zhu, H. and Zhang, H. (2004). A diagnostic procedure based on local influence. Biometrika 91 579--589.
Mathematical Reviews (MathSciNet): MR2090623
Digital Object Identifier: doi:10.1093/biomet/91.3.579
Zentralblatt MATH: 1108.62031
Zhu, Z., He, X. and Fung, W. (2003). Local influence analysis for penalized Gaussian likelihood estimators in partially linear models. Scand. J. Statist. 30 767--780.
Mathematical Reviews (MathSciNet): MR2155482
Digital Object Identifier: doi:10.1111/1467-9469.00363

2012 © Institute of Mathematical Statistics
