Institute of Mathematical Statistics Collections

On the history and use of some standard statistical models

E. L. Lehmann



This paper tries to tell the story of the general linear model, which saw the light of day 200 years ago, and the assumptions underlying it. We distinguish three principal stages (ignoring earlier more isolated instances). The model was first proposed in the context of astronomical and geodesic observations, where the main source of variation was observational error. This was the main use of the model during the 19th century.

In the 1920s it was developed in a new direction by R.A. Fisher, whose principal applications were in agriculture and biology. Finally, beginning in the 1930s and 1940s, it became an important tool for the social sciences. As new areas of application were added, the assumptions underlying the model tended to become more questionable, and the resulting statistical techniques more prone to misuse.

Chapter information

Deborah Nolan and Terry Speed, eds., Probability and Statistics: Essays in Honor of David A. Freedman (Beachwood, Ohio, USA: Institute of Mathematical Statistics, 2008), 114-126

First available in Project Euclid: 7 April 2008

Primary: 62A01: Foundations and philosophical topics; 62-03: Historical (must also be assigned at least one classification number from Section 01); 62J05: Linear regression

Keywords: assumptions; independence; least squares; linear model; normality; observational studies

Copyright © 2008, Institute of Mathematical Statistics


Lehmann, E. L. On the history and use of some standard statistical models. Probability and Statistics: Essays in Honor of David A. Freedman, 114-126, Institute of Mathematical Statistics, Beachwood, Ohio, USA, 2008. doi:10.1214/193940307000000419.


