## Statistical Science

### A Topologically Valid Definition of Depth for Functional Data

#### Abstract

The main focus of this work is on providing a formal definition of statistical depth for functional data on the basis of six properties, recognising topological features such as continuity, smoothness and contiguity. Amongst our depth-defining properties is one that addresses the delicate challenge of inherent partial observability of functional data, with fulfillment giving rise to a minimal guarantee on the performance of the empirical depth beyond the idealised and practically infeasible case of full observability. As an incidental product, functional depths satisfying our definition achieve a robustness that is commonly ascribed to depth, despite the absence of a formal guarantee in the multivariate definition of depth. We demonstrate the fulfillment or otherwise of our properties for six widely used functional depth proposals, thereby providing a systematic basis for selection of a depth function.

#### Article information

Source
Statist. Sci., Volume 31, Number 1 (2016), 61-79.

Dates
First available in Project Euclid: 10 February 2016

Permanent link to this document
https://projecteuclid.org/euclid.ss/1455115914

Digital Object Identifier
doi:10.1214/15-STS532

Mathematical Reviews number (MathSciNet)
MR3458593

Zentralblatt MATH identifier
06946212

#### Citation

Nieto-Reyes, Alicia; Battey, Heather. A Topologically Valid Definition of Depth for Functional Data. Statist. Sci. 31 (2016), no. 1, 61–79. doi:10.1214/15-STS532. https://projecteuclid.org/euclid.ss/1455115914
