A Topologically Valid Definition of Depth for Functional Data
February 2016
Alicia Nieto-Reyes, Heather Battey
Statist. Sci. 31(1): 61-79 (February 2016). DOI: 10.1214/15-STS532
Abstract

The main focus of this work is on providing a formal definition of statistical depth for functional data on the basis of six properties, recognising topological features such as continuity, smoothness and contiguity. Amongst our depth defining properties is one that addresses the delicate challenge of inherent partial observability of functional data, with fulfillment giving rise to a minimal guarantee on the performance of the empirical depth beyond the idealised and practically infeasible case of full observability. As an incidental product, functional depths satisfying our definition achieve a robustness that is commonly ascribed to depth, despite the absence of a formal guarantee in the multivariate definition of depth. We demonstrate the fulfillment or otherwise of our properties for six widely used functional depth proposals, thereby providing a systematic basis for selection of a depth function.

Copyright © 2016 Institute of Mathematical Statistics
Alicia Nieto-Reyes and Heather Battey "A Topologically Valid Definition of Depth for Functional Data," Statistical Science 31(1), 61-79, (February 2016). https://doi.org/10.1214/15-STS532