Source: Ann. Appl. Stat. Volume 3, Number 4 (2009), 1758–1775.
Outlying curves often occur in functional or longitudinal
datasets, and can be very influential on parameter estimators
and very hard to detect visually. In this article we introduce
estimators of the mean and the principal components that are
resistant to outlying sample trajectories and can therefore be
used to detect them. The estimators are based on reduced-rank
t-models and are specifically aimed at sparse and
irregularly sampled functional data. The outlier-resistance
properties of the estimators and their relative efficiency for
noncontaminated data are studied theoretically and by
simulation. Applications to the analysis of Internet traffic
data and glycated hemoglobin levels in diabetic children are
presented.