The Annals of Statistics

On depth and deep points: a calculus

Ivan Mizera

Abstract

For a general definition of depth in data analysis, a differential-like calculus is constructed in which the location case (the framework of Tukey's median) plays a fundamental role similar to that of linear functions in mathematical analysis. As an application, a lower bound for maximal regression depth is proved in the general multidimensional case, as conjectured by Rousseeuw and Hubert and others. This lower bound is demonstrated to have an impact on the breakdown point of the maximum depth estimator.

Article information

Source
Ann. Statist., Volume 30, Number 6 (2002), 1681-1736.

Dates
First available in Project Euclid: 23 January 2003

Permanent link to this document
https://projecteuclid.org/euclid.aos/1043351254

Digital Object Identifier
doi:10.1214/aos/1043351254

Mathematical Reviews number (MathSciNet)
MR1969447

Zentralblatt MATH identifier
1039.62046

Subjects
Primary: 62H05: Characterization and structure theory
Secondary: 52A40: Inequalities and extremum problems; 54C60: Set-valued maps [See also 26E25, 28B20, 47H04, 58C06]; 55M25: Degree, winding number; 90C29: Multi-objective and goal programming

Keywords
Centerpoint; compactification; degree of mapping; depth; halfspace; Kronecker index; median; multivariate location; regression; set-valued analysis; vector optimization; weak convergence

Citation

Mizera, Ivan. On depth and deep points: a calculus. Ann. Statist. 30 (2002), no. 6, 1681-1736. doi:10.1214/aos/1043351254. https://projecteuclid.org/euclid.aos/1043351254


References

  • ADROVER, J., MARONNA, R. and YOHAI, V. (2000). Relationships between maximum depth and projection regression estimates. J. Statist. Plann. Inference. 105 363-375.
  • AMENTA, N., BERN, M., EPPSTEIN, D. and TENG, S.-H. (2000). Regression depth and center points. Discrete Comput. Geom. 23 305-323.
  • BAI, Z.-D. and HE, X. (1999). Asymptotic distributions of the maximal depth estimators for regression and multivariate location. Ann. Statist. 27 1616-1637.
  • BALEK, V. and MIZERA, I. (1997). The closeness of the range of a probability on a certain system of random events - an elementary proof. Bull. Belg. Math. Soc. Simon Stevin 4 621-624.
  • BERN, M. and EPPSTEIN, D. (2000). Multivariate regression depth. In Proceedings of the 16th Annual Symposium on Computational Geometry 315-321. ACM Press, New York.
  • BICKEL, P. J. and MILLAR, P. W. (1992). Uniform convergence of probability measures on classes of functions. Statist. Sinica 2 1-15.
  • BILLINGSLEY, P. (1968). Convergence of Probability Measures. Wiley, New York.
  • BILLINGSLEY, P. (1971). Weak Convergence of Measures: Applications in Probability. SIAM, Philadelphia.
  • BIRCH, B. J. (1959). On 3N points in a plane. Proc. Cambridge Philos. Soc. 55 289-293.
  • BORISOVICH, Y. G., GEL'MAN, B. D., MYSHKIS, A. D. and OBUKHOVSKII, V. V. (1980). Topological methods in the fixed-point theory of multi-valued maps. Uspekhi Mat. Nauk 35 59-126. [Translation (1980) Russian Math. Surveys 35 65-143.]
  • BORISOVICH, Y. G., GEL'MAN, B. D., MYSHKIS, A. D. and OBUKHOVSKII, V. V. (1982). Multivalued mappings. In Mathematical Analysis 19 127-230. Akad. Nauk SSSR, Vsesoyuz. Inst. Nauchn. i Tekhn. Informatsii, Moscow (in Russian).
  • CAPLIN, A. and NALEBUFF, B. (1988). On 64%-majority rule. Econometrica 56 787-814.
  • CAPLIN, A. and NALEBUFF, B. (1991a). Aggregation and social choice: A mean voter theorem. Econometrica 59 1-23.
  • CAPLIN, A. and NALEBUFF, B. (1991b). Aggregation and imperfect competition: On the existence of equilibrium. Econometrica 59 25-59.
  • CARRIZOSA, E. (1996). A characterization of halfspace depth. J. Multivariate Anal. 58 21-26.
  • CELLINA, A. (1969). Approximation of set-valued functions and fixed point theorems. Ann. Mat. Pura Appl. 82 17-24.
  • CHAMBERLIN, E. (1933). The Theory of Monopolistic Competition. Harvard Univ. Press.
  • DANIELS, H. E. (1954). A distribution-free test for regression parameters. Ann. Math. Statist. 25 499-513.
  • DAVIES, P. L. (1993). Aspects of robust linear regression. Ann. Statist. 21 1843-1899.
  • DODSON, C. T. J. and PARKER, P. E. (1997). A User's Guide to Algebraic Topology. Kluwer, Dordrecht.
  • DONOHO, D. L. and GASKO, M. (1992). Breakdown properties of location estimates based on halfspace depth and projected outlyingness. Ann. Statist. 20 1803-1827.
  • DOOB, J. L. (1994). Measure Theory. Springer, New York.
  • EDELSBRUNNER, H. (1987). Algorithms in Combinatorial Geometry. Springer, Berlin.
  • EDGEWORTH, F. Y. (1888). On a new method of reducing observations relating to several quantities. Philosophical Magazine 25 184-191.
  • FRISCH, R. (1966). Maxima and Minima: Theory and Economic Applications. Reidel, Dordrecht.
  • HE, X. and PORTNOY, S. (1998). Asymptotics of the deepest line. In Applied Statistical Science III: Papers in Honor of A. K. Md. E. Saleh (S. E. Ahmed, M. Ahsanullah and B. K. Sinha, eds.) 71-81. Nova Science Publications, Commack, NY.
  • HE, X. and WANG, G. (1997). Convergence of depth contours for multivariate datasets. Ann. Statist. 25 495-504.
  • HILL, B. M. (1960). A relationship between Hodges' bivariate sign test and a non-parametric test of Daniels. Ann. Math. Statist. 31 1190-1192.
  • HODGES, J. L., JR. (1955). A bivariate sign test. Ann. Math. Statist. 26 523-527.
  • HOTELLING, H. (1929). Stability in competition. Econom. J. 39 41-57.
  • HUBERT, M., ROUSSEEUW, P. J. and VAN AELST, S. (1999). Similarities between location depth and regression depth. In Statistics in Genetics and in the Environmental Sciences (L. Fernholz, S. Morgenthaler and W. Stahel, eds.) 159-172. Birkhäuser, Basel.
  • KLEIN, E. and THOMPSON, A. C. (1984). Theory of Correspondences. Wiley, New York.
  • LIU, R. Y., PARELIUS, J. M. and SINGH, K. (1999). Multivariate analysis by data depth: Descriptive statistics, graphics and inference. Ann. Statist. 27 783-840.
  • LIU, R. Y. and SINGH, K. (1993). A quality index based on data depth and multivariate rank tests. J. Amer. Statist. Assoc. 88 252-260.
  • MIZERA, I. (1998). On depth and deep points: A calculus (abstract). IMS Bull. 27 207.
  • NEUMANN, B. H. (1945). On an invariant of plane regions and mass distributions. J. London Math. Soc. 20 226-237.
  • NIKAIDO, H. (1968). Convex Structures and Economic Theory. Academic Press, New York.
  • NOLAN, D. (1992). Asymptotics for multivariate trimming. Stochastic Process. Appl. 42 157-169.
  • NOLAN, D. (1999). On min-max majority and deepest points. Statist. Probab. Lett. 43 325-333.
  • ORTEGA, J. M. and RHEINBOLDT, W. C. (1970). Iterative Solution of Nonlinear Equations in Several Variables. Academic Press, New York.
  • PONSTEIN, J. (1967). Seven kinds of convexity. SIAM Rev. 9 115-119.
  • PORTNOY, S. and MIZERA, I. (1999). Comment on "Regression depth," by P. J. Rousseeuw and M. Hubert. J. Amer. Statist. Assoc. 94 417-419.
  • RADO, R. (1946). A theorem on general measure. J. London Math. Soc. 21 291-300.
  • ROCKAFELLAR, R. T. and WETS, R. J.-B. (1998). Variational Analysis. Springer, Berlin.
  • ROTMAN, J. J. (1988). An Introduction to Algebraic Topology. Springer, New York.
  • ROUSSEEUW, P. J. and HUBERT, M. (1999a). Regression depth. J. Amer. Statist. Assoc. 94 388-402.
  • ROUSSEEUW, P. J. and HUBERT, M. (1999b). Depth in an arrangement of hyperplanes. Discrete Comput. Geom. 22 167-176.
  • SMALE, S. (1973). Global analysis and economics I: Pareto optimum and a generalization of Morse theory. In Dynamical Systems. Proc. Symposium Univ. Bahia, Salvador (M. M. Peixoto, ed.) 531-544. Academic Press, New York.
  • SMALE, S. (1975a). Sufficient conditions for an optimum. Dynamical Systems. Lecture Notes in Math. 468 287-292. Springer, Berlin.
  • SMALE, S. (1975b). Optimizing several functions. In Manifolds-Tokyo 1973. Proc. International Conference on Manifolds and Related Topics in Topology (A. Hattori, ed.) 69-75. Univ. Tokyo Press.
  • TJUR, T. (1980). Probability Based on Radon Measures. Wiley, New York.
  • TUKEY, J. W. (1975). Mathematics and the picturing of data. In Proc. International Congress of Mathematicians 2 523-531. Canad. Math. Congress, Montreal.
  • VAN AELST, S. and ROUSSEEUW, P. J. (2000). Robustness of deepest regression. J. Multivariate Anal. 73 82-106.
  • VAN AELST, S., ROUSSEEUW, P. J., HUBERT, M. and STRUYF, A. (2001). The deepest regression method. J. Multivariate Anal. 81 138-166.
  • WAN, Y. H. (1975). On local Pareto optima. J. Math. Econom. 2 35-42.
  • WAN, Y. H. (1978). On the structure and stability of local Pareto optima in a pure exchange economy. J. Math. Econom. 5 255-274.

Edmonton, Alberta T6G 2G1, Canada. E-mail: mizera@stat.ualberta.ca