The Annals of Applied Statistics

Detecting and handling outlying trajectories in irregularly sampled functional datasets

Daniel Gervini
Source: Ann. Appl. Stat. Volume 3, Number 4 (2009), 1758-1775.

Abstract

Outlying curves often occur in functional or longitudinal datasets, and can be very influential on parameter estimators and very hard to detect visually. In this article we introduce estimators of the mean and the principal components that are resistant to, and then can be used for detection of, outlying sample trajectories. The estimators are based on reduced-rank t-models and are specifically aimed at sparse and irregularly sampled functional data. The outlier-resistance properties of the estimators and their relative efficiency for noncontaminated data are studied theoretically and by simulation. Applications to the analysis of Internet traffic data and glycated hemoglobin levels in diabetic children are presented.

Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aoas/1267453963
Digital Object Identifier: doi:10.1214/09-AOAS257
Zentralblatt MATH identifier: 1184.62101
Mathematical Reviews number (MathSciNet): MR2752157


2013 © Institute of Mathematical Statistics
