Statistical Science

Self-consistency: a fundamental concept in statistics

Thaddeus Tarpey and Bernard Flury

Full-text: Open access


The term "self-consistency" was introduced in 1989 by Hastie and Stuetzle to describe the property that each point on a smooth curve or surface is the mean of all points that project orthogonally onto it. We generalize this concept to self-consistent random vectors: a random vector Y is self-consistent for X if $\mathscr{E}[X|Y] = Y$ almost surely. This allows us to construct a unified theoretical basis for principal components, principal curves and surfaces, principal points, principal variables, principal modes of variation and other statistical methods. We provide some general results on self-consistent random variables, give examples, show relationships between the various methods, discuss a related notion of self-consistent estimators and suggest directions for future research.
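The defining condition $\mathscr{E}[X|Y] = Y$ can be checked numerically in a simple case. The sketch below is an illustration, not code from the paper: it assumes X is standard normal and takes Y to be the nearer of the two points $\pm\sqrt{2/\pi}$, then estimates the conditional mean of X given each value of Y by Monte Carlo. The function name `two_point_self_consistency` and its parameters are hypothetical.

```python
import math
import random


def two_point_self_consistency(n=200_000, seed=0):
    """Monte Carlo estimate of E[X | Y] for X ~ N(0, 1) and a two-point Y.

    Y maps each draw of X to the nearer of the two points -c and +c,
    where c = sqrt(2 / pi) = E[X | X > 0] for a standard normal X.
    Returns a dict {value of Y: estimated conditional mean of X}.
    """
    rng = random.Random(seed)
    c = math.sqrt(2.0 / math.pi)
    sums = {-c: 0.0, c: 0.0}
    counts = {-c: 0, c: 0}
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = c if x > 0 else -c  # nearest point, i.e. the Voronoi region of x
        sums[y] += x
        counts[y] += 1
    # Average of X within each region, keyed by the corresponding value of Y.
    return {y: sums[y] / counts[y] for y in sums}


if __name__ == "__main__":
    for y, m in two_point_self_consistency().items():
        print(f"Y = {y:+.4f}   estimated E[X | Y] = {m:+.4f}")
```

If Y is self-consistent for X, each estimated conditional mean should agree with the corresponding value of Y up to simulation error.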

Article information

Statist. Sci., Volume 11, Number 3 (1996), 229-243.

First available in Project Euclid: 17 September 2002


Keywords: Elliptical distribution; EM algorithm; $k$-means algorithm; mean squared error; principal components; principal curves; principal modes of variation; principal points; principal variables; regression; self-organizing maps; spherical distribution; Voronoi region


Tarpey, Thaddeus; Flury, Bernard. Self-consistency: a fundamental concept in statistics. Statist. Sci. 11 (1996), no. 3, 229--243. doi:10.1214/ss/1032280215.



  • Anderson, T. W. (1963). Asymptotic theory for principal component analysis. Ann. Math. Statist. 34 122-148.
  • Bandeen-Roche, K. (1994). Resolution of additive mixtures into source components and contributions: a compositional approach. J. Amer. Statist. Assoc. 89 1450-1458.
  • Banfield, J. and Raftery, A. (1992). Ice floe identification in satellite images using mathematical morphology and clustering about principal curves. J. Amer. Statist. Assoc. 87 7-16.
  • Bickel, P. J. and Doksum, K. A. (1977). Mathematical Statistics. Holden-Day, San Francisco.
  • Casella, G. and Berger, R. L. (1990). Statistical Inference. Duxbury Press, Belmont, CA.
  • Castro, P. E., Lawton, W. H. and Sylvestre, E. A. (1986). Principal modes of variation for processes with continuous sample curves. Technometrics 28 329-337.
  • Cox, D. R. (1957). Note on grouping. J. Amer. Statist. Assoc. 52 543-547.
  • Cox, D. R. and Oakes, D. (1984). Analysis of Survival Data. Chapman and Hall, New York.
  • Cuesta, J. A. and Matran, C. (1988). The strong law of large numbers for k-means and best possible nets of Banach valued random variables. Probab. Theory Related Fields 78 523-534.
  • Dalenius, T. (1950). The problem of optimum stratification. Skandinavisk Aktuarietidskrift 33 203-213.
  • Dempster, A. P., Laird, N. M. and Rubin, D. B. (1977). Maximum likelihood estimation from incomplete data via the EM algorithm (with discussion). J. Roy. Statist. Soc. Ser. B 39 1-38.
  • Efron, B. (1967). The two sample problem with censored data. Proc. Fifth Berkeley Symp. Math. Statist. Probab. 4 831-853. Univ. California Press, Berkeley.
  • Eubank, R. L. (1988). Optimal grouping, spacing, stratification, and piecewise constant approximation. SIAM Rev. 30 404-420.
  • Fang, K. and He, S. (1982). The problem of selecting a given number of representative points in a normal population and a generalized Mill's ratio. Technical report, Dept. Statistics, Stanford Univ.
  • Fang, K., Kotz, S. and Ng, K. (1990). Symmetric Multivariate and Related Distributions. Chapman and Hall, New York.
  • Flury, B. (1990). Principal points. Biometrika 77 33-41.
  • Flury, B. (1993). Estimation of principal points. J. Roy. Statist. Soc. Ser. C 42 139-151.
  • Flury, B. and Tarpey, T. (1993). Representing a large collection of curves: a case for principal points. Amer. Statist. 47 304-306.
  • Friedman, A. (1982). Foundations of Modern Analysis. Dover, New York.
  • Hartigan, J. A. (1975). Clustering Algorithms. Wiley, New York.
  • Hastie, T. and Stuetzle, W. (1989). Principal curves. J. Amer. Statist. Assoc. 84 502-516.
  • IEEE (1982). IEEE Trans. Inform. Theory 28. (Special issue on quantization.)
  • Iyengar, S. and Solomon, H. (1983). Selecting representative points in normal populations. In Recent Advances in Statistics: Papers in Honor of Herman Chernoff on his 60th Birthday (M. H. Rizvi, J. Rustagi and D. Siegmund, eds.) 579-591. Academic Press, New York.
  • Jolicoeur, P. (1968). Interval estimation of the slope of the major axis of a bivariate normal distribution in the case of a small sample. Biometrics 24 679-682.
  • Jolicoeur, P. and Mosimann, J. E. (1960). Size and shape variation in the painted turtle; a principal component analysis. Growth 24 339-354.
  • Kohonen, T. (1995). Self-Organizing Maps. Springer, Berlin.
  • Kshirsagar, A. M. (1961). The goodness of fit of a single (nonisotropic) hypothetical principal component. Biometrika 48 397-407.
  • Laird, N. (1988). Self-Consistency. In Encyclopedia of Statistical Sciences, 8 347-351. Wiley, New York.
  • Leblanc, M. and Tibshirani, R. (1994). Adaptive principal surfaces. J. Amer. Statist. Assoc. 89 53-64.
  • Little, J. and Rubin, D. B. (1987). Statistical Analysis with Missing Data. Wiley, New York.
  • Lloyd, S. (1982). Least squares quantization in PCM. IEEE Trans. Inform. Theory 28 129-149.
  • MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. Proc. Fifth Berkeley Symp. Math. Statist. Probab. 3 281-297. Univ. California Press, Berkeley.
  • Mallows, C. L. (1961). Latent vectors of random symmetric matrices. Biometrika 48 133-149.
  • McCabe, G. P. (1984). Principal variables. Technometrics 26 137-144.
  • Pärna, K. (1990). On the existence and weak convergence of k-centres in Banach spaces. Acta et Commentationes Universitatis Tartuensis 893 17-28.
  • Pearson, K. (1901). On lines and planes of closest fit to systems of points in space. Philosophical Magazine 2 559-572.
  • Pollard, D. (1981). Strong consistency of K-means clustering. Ann. Statist. 9 135-140.
  • Rowe, S. (1996). An algorithm for computing principal points with respect to a loss function in the unidimensional case. Statistics and Computing 6 187-190.
  • Schott, J. R. (1991). A test for a specific principal component of a correlation matrix. J. Amer. Statist. Assoc. 86 747-751.
  • Tarpey, T. (1995). Principal points and self-consistent points of symmetric multivariate distributions. J. Multivariate Anal. 53 39-51.
  • Tarpey, T. (1996). Self-consistent patterns for symmetric, multivariate distributions. Unpublished manuscript.
  • Tarpey, T., Li, L. and Flury, B. (1995). Principal points and self-consistent points of elliptical distributions. Ann. Statist. 23 103-112.
  • Tibshirani, R. (1992). Principal curves revisited. Statistics and Computing 2 183-190.
  • Tyler, D. (1983). A class of asymptotic tests for principal component vectors. Ann. Statist. 11 1243-1250.
  • Zoppè, A. (1995). Principal points of univariate continuous distributions. Statistics and Computing 5 127-132.