The Annals of Statistics

What is a statistical model?

Peter McCullagh
Source: Ann. Statist. Volume 30, Number 5 (2002), 1225-1310.

Abstract

This paper addresses two closely related questions, "What is a statistical model?" and "What is a parameter?" The notions that a model must "make sense," and that a parameter must "have a well-defined meaning" are deeply ingrained in applied statistical work, reasonably well understood at an instinctive level, but absent from most formal theories of modelling and inference. In this paper, these concepts are defined in algebraic terms, using morphisms, functors and natural transformations. It is argued that inference on the basis of a model is not possible unless the model admits a natural extension that includes the domain for which inference is required. For example, prediction requires that the domain include all future units, subjects or time points. Although it is usually not made explicit, every sensible statistical model admits such an extension. Examples are given to show why such an extension is necessary and why a formal theory is required. In the definition of a subparameter, it is shown that certain parameter functions are natural and others are not. Inference is meaningful only for natural parameters. This distinction has important consequences for the construction of prior distributions and also helps to resolve a controversy concerning the Box-Cox model.

Primary Subjects: 62A05
Secondary Subjects: 62F99
Full-text: Open access
Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1035844977
Digital Object Identifier: doi:10.1214/aos/1035844977
Mathematical Reviews number (MathSciNet): MR1936320
Zentralblatt MATH identifier: 01916779

References

ALDOUS, D. (1981). Representations for partially exchangeable arrays of random variables. J. Multivariate Analysis 11 581-598.
Mathematical Reviews (MathSciNet): MR82m:60022
Zentralblatt MATH: 0474.60044
Digital Object Identifier: doi:10.1016/0047-259X(81)90099-3
ANDREWS, D. F. and HERZBERG, A. (1985). Data. Springer, New York.
BARNDORFF-NIELSEN, O. E. and COX, D. R. (1994). Inference and Asymptotics. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR96b:62002
Zentralblatt MATH: 0826.62004
BARTLETT, M. S. (1978). Nearest neighbour models in the analysis of field experiments (with discussion). J. Roy. Statist. Soc. Ser. B 40 147-174.
Mathematical Reviews (MathSciNet): MR80c:62093
BERGER, J. O. (1985). Statistical Decision Theory and Bayesian Analysis, 2nd ed. Springer, New York.
Mathematical Reviews (MathSciNet): MR804611
Zentralblatt MATH: 0572.62008
BERNARDO, J. M. and SMITH, A. F. M. (1994). Bayesian Theory. Wiley, New York.
Mathematical Reviews (MathSciNet): MR96a:62006
BESAG, J. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). J. Roy. Statist. Soc. Ser. B 36 192-236.
Mathematical Reviews (MathSciNet): MR51:9409
BESAG, J. and HIGDON, D. (1999). Bayesian analysis of agricultural field experiments (with discussion). J. Roy. Statist. Soc. Ser. B 61 691-746.
Mathematical Reviews (MathSciNet): MR1722238
Zentralblatt MATH: 0951.62091
Digital Object Identifier: doi:10.1111/1467-9868.00201
BESAG, J. and KOOPERBERG, C. (1995). On conditional and intrinsic autoregressions. Biometrika 82 733-746.
Mathematical Reviews (MathSciNet): MR97b:62164
Zentralblatt MATH: 0899.62123
BEST, N. G., ICKSTADT, K. and WOLPERT, R. L. (1999). Contribution to the discussion of Besag (1999). J. Roy. Statist. Soc. Ser. B 61 728-729.
BICKEL, P. and DOKSUM, K. A. (1981). An analysis of transformations revisited. J. Amer. Statist. Assoc. 76 296-311.
Mathematical Reviews (MathSciNet): MR83b:62048
Zentralblatt MATH: 0464.62058
Digital Object Identifier: doi:10.2307/2287831
BILLINGSLEY, P. (1986). Probability and Measure, 2nd ed. Wiley, New York.
Mathematical Reviews (MathSciNet): MR87f:60001
BOX, G. E. P. and COX, D. R. (1964). An analysis of transformations (with discussion). J. Roy. Statist. Soc. Ser. B 26 211-252.
Mathematical Reviews (MathSciNet): MR33:836
BOX, G. E. P. and COX, D. R. (1982). An analysis of transformations revisited, rebutted. J. Amer. Statist. Assoc. 77 209-210.
Zentralblatt MATH: 0504.62058
Mathematical Reviews (MathSciNet): MR648047
Digital Object Identifier: doi:10.2307/2287791
COX, D. R. (1958). Planning of Experiments. Wiley, New York.
Mathematical Reviews (MathSciNet): MR95561
COX, D. R. (1986). Comment on Holland (1986). J. Amer. Statist. Assoc. 81 963-964.
COX, D. R. and HINKLEY, D. V. (1974). Theoretical Statistics. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR370837
COX, D. R. and REID, N. (1987). Parameter orthogonality and approximate conditional inference (with discussion). J. Roy. Statist. Soc. Ser. B 49 1-39.
Zentralblatt MATH: 0616.62006
Mathematical Reviews (MathSciNet): MR893334
COX, D. R. and SNELL, E. J. (1981). Applied Statistics. Chapman and Hall, London.
COX, D. R. and WERMUTH, N. (1996). Multivariate Dependencies. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR1456990
Zentralblatt MATH: 0880.62124
DALE, J. R. (1984). Local versus global association for bivariate ordered responses. Biometrika 71 507-514.
Mathematical Reviews (MathSciNet): MR86b:62091
Digital Object Identifier: doi:10.1093/biomet/71.3.507
DE FINETTI, B. (1975). Theory of Probability 2. Wiley, New York.
GELMAN, A., CARLIN, J. B., STERN, H. and RUBIN, D. B. (1995). Bayesian Data Analysis. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR97c:62059
GOODMAN, L. A. (1979). Simple models for the analysis of association in cross-classifications having ordered categories. J. Amer. Statist. Assoc. 74 537-552.
Mathematical Reviews (MathSciNet): MR548257
Digital Object Identifier: doi:10.2307/2286971
GOODMAN, L. A. (1981). Association models and canonical correlation in the analysis of crossclassifications having ordered categories. J. Amer. Statist. Assoc. 76 320-334.
Mathematical Reviews (MathSciNet): MR624334
Digital Object Identifier: doi:10.2307/2287833
HAMADA, M. and WU, C. F. J. (1992). Analysis of designed experiments with complex aliasing. J. Qual. Technology 24 130-137.
HARVILLE, D. A. and ZIMMERMANN, D. L. (1999). Contribution to the discussion of Besag (1999). J. Roy. Statist. Soc. Ser. B 61 733-734.
HELLAND, I. S. (1999a). Quantum mechanics from symmetry and statistical modelling. Internat. J. Theoret. Phys. 38 1851-1881.
Mathematical Reviews (MathSciNet): MR1704291
Zentralblatt MATH: 0953.81003
Digital Object Identifier: doi:10.1023/A:1026676913271
HELLAND, I. S. (1999b). Quantum theory from symmetries in a general statistical parameter space. Technical report, Dept. Mathematics, Univ. Oslo.
HINKLEY, D. V. and RUNGER, G. (1984). The analysis of transformed data (with discussion). J. Amer. Statist. Assoc. 79 302-320.
Mathematical Reviews (MathSciNet): MR85m:62142
Zentralblatt MATH: 0553.62051
Digital Object Identifier: doi:10.2307/2288264
HOLLAND, P. (1986). Statistics and causal inference (with discussion). J. Amer. Statist. Assoc. 81 945-970.
Mathematical Reviews (MathSciNet): MR88k:62010
Zentralblatt MATH: 0607.62001
Digital Object Identifier: doi:10.2307/2289064
HORA, R. B. and BUEHLER, R. J. (1966). Fiducial theory and invariant estimation. Ann. Math. Statist. 37 643-656.
Mathematical Reviews (MathSciNet): MR33:8078
Zentralblatt MATH: 0148.13805
Digital Object Identifier: doi:10.1214/aoms/1177699458
Project Euclid: euclid.aoms/1177699458
KINGMAN, J. F. C. (1984). Present position and potential developments: Some personal views. Probability and random processes. J. Roy. Statist. Soc. Ser. A 147 233-244.
KINGMAN, J. F. C. (1993). Poisson Processes. Oxford Univ. Press.
Mathematical Reviews (MathSciNet): MR94a:60052
LAURITZEN, S. (1988). Extremal Families and Systems of Sufficient Statistics. Lecture Notes in Statist. 49. Springer, New York.
Mathematical Reviews (MathSciNet): MR90g:62010
Zentralblatt MATH: 0681.62009
LEHMANN, E. L. (1983). Theory of Point Estimation. Wiley, New York.
Mathematical Reviews (MathSciNet): MR85a:62001
LEHMANN, E. L. and CASELLA, G. (1998). Theory of Point Estimation, 2nd ed. Springer, New York.
Mathematical Reviews (MathSciNet): MR99g:62025
LITTELL, R., FREUND, R. J. and SPECTOR, P. C. (1991). SAS System for Linear Models, 3rd ed. SAS Institute, Cary, NC.
MAC LANE, S. (1998). Categories for the Working Mathematician, 2nd ed. Springer, New York.
Mathematical Reviews (MathSciNet): MR2001j:18001
MCCULLAGH, P. (1980). Regression models for ordinal data (with discussion). J. Roy. Statist. Soc. Ser. B 42 109-142.
Mathematical Reviews (MathSciNet): MR81j:62107
MCCULLAGH, P. (1992). Conditional inference and Cauchy models. Biometrika 79 247-259.
Mathematical Reviews (MathSciNet): MR93h:62048
Zentralblatt MATH: 0753.62002
Digital Object Identifier: doi:10.1093/biomet/79.2.247
MCCULLAGH, P. (1996). Möbius transformation and Cauchy parameter estimation. Ann. Statist. 24 787-808.
Zentralblatt MATH: 0859.62007
Mathematical Reviews (MathSciNet): MR1394988
Digital Object Identifier: doi:10.1214/aos/1032894465
Project Euclid: euclid.aos/1032894465
MCCULLAGH, P. (1999). Quotient spaces and statistical models. Canad. J. Statist. 27 447-456.
Mathematical Reviews (MathSciNet): MR1745814
Digital Object Identifier: doi:10.2307/3316103
MCCULLAGH, P. (2000). Invariance and factorial models (with discussion). J. Roy. Statist. Soc. Ser. B 62 209-256.
Mathematical Reviews (MathSciNet): MR2002a:62102
Digital Object Identifier: doi:10.1111/1467-9868.00229
MCCULLAGH, P. and NELDER, J. A. (1989). Generalized Linear Models, 2nd ed. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR727836
MCCULLAGH, P. and WIT, E. (2000). Natural transformation and the Bayes map. Technical report.
MERCER, W. B. and HALL, A. D. (1911). The experimental error of field trials. J. Agric. Research 50 331-357.
NELDER, J. A. (1977). A re-formulation of linear models (with discussion). J. Roy. Statist. Soc. Ser. A 140 48-77.
Mathematical Reviews (MathSciNet): MR56:16943
Digital Object Identifier: doi:10.2307/2344517
PEARSON, K. (1913). Note on the surface of constant association. Biometrika 9 534-537.
PLACKETT, R. L. (1965). A class of bivariate distributions. J. Amer. Statist. Assoc. 60 516-522.
Mathematical Reviews (MathSciNet): MR32:524
Digital Object Identifier: doi:10.2307/2282685
RUBIN, D. (1978). Bayesian inference for causal effects: The role of randomization. Ann. Statist. 6 34-58.
Mathematical Reviews (MathSciNet): MR472152
Zentralblatt MATH: 0383.62021
Digital Object Identifier: doi:10.1214/aos/1176344064
Project Euclid: euclid.aos/1176344064
RUBIN, D. (1986). Comment on Holland (1986). J. Amer. Statist. Assoc. 81 961-962.
SMITH, A. F. M. (1984). Present position and potential developments: some personal views. Bayesian statistics. J. Roy. Statist. Soc. Ser. A 147 245-259.
TJUR, T. (2000). Contribution to the discussion of McCullagh (2000). J. Roy. Statist. Soc. Ser. B 62 238-239.
WHITTLE, P. (1974). Contribution to the discussion of Besag (1974). J. Roy. Statist. Soc. Ser. B 36 228.
Mathematical Reviews (MathSciNet): MR373208
YANDELL, B. S. (1997). Practical Data Analysis for Designed Experiments. Chapman and Hall, London.
CHICAGO, ILLINOIS 60637-1514 E-MAIL: pmcc@galton.uchicago.edu
berg (1995), Besag and Higdon (1999) and Rue and Tjelmeland (2002). However, spatial effects are often of secondary importance, as in variety trials, and the main intention is to absorb an appropriate level of spatial variation in the formulation, rather than produce a spatial model with scientifically interpretable parameters. Nevertheless, McCullagh's basic point is well taken. For example, I view the use of MRFs in geographical epidemiology [e.g., Besag, York and Mollié (1991)] as mainly of exploratory value, in suggesting additional spatially related covariates whose inclusion would ideally dispense with the need for a spatial formulation;
uniformity trials in Fairfield Smith (1938) and Pearce (1976). Of course, in a genuine variety trial, one might want to predict what the aggregate yield over the entire field would have been for a few individual varieties but this does not require any extension of the formulation to McCullagh's conceptual plots. Indeed, such calculations are especially well suited to the Bayesian paradigm, both theoretically, because one is supposed to deal with potentially observable quantities rather than merely with parameters, and in practice, via MCMC, because the posterior predictive distributions are available rigorously. That is, for the aggregate yield of variety A, one uses the observed yields on plots that were sown with A and generates a set of observations from the likelihood for those that were not for each MCMC sample of parameter values, hence building a corresponding distribution of total yield. One may also construct credible intervals for the difference in total yields between varieties A and B and easily address all manner of questions in ranking and selection that simply cannot be considered in a frequentist framework; for example, the posterior probability that the total yield obtained by sowing any particular variety (perhaps chosen in the light of the experiment) would have been at least 10% greater than that of growing any other test variety in the field.
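The posterior predictive calculation described above can be sketched in a few lines. This is a minimal illustration under strong assumptions, not the formulation used in the discussion: plot yields are taken to be independent normal draws around a variety mean, with no spatial fertility term, and the names `total_yield_draws`, `prob_at_least` and the format of `mcmc_samples` are hypothetical.

```python
import random


def total_yield_draws(yields, varieties, target, mcmc_samples, seed=0):
    """Posterior predictive draws of the whole-field total for one variety.

    yields       -- observed yield on each plot
    varieties    -- variety label actually sown on each plot
    target       -- variety whose whole-field total is wanted
    mcmc_samples -- list of (mean_by_variety, sigma) pairs, one per MCMC draw
                    (hypothetical format; a real fit would also carry
                    plot-level fertility effects)
    """
    rng = random.Random(seed)
    # Plots that really grew the target variety contribute observed yields.
    observed_part = sum(y for y, v in zip(yields, varieties) if v == target)
    n_missing = sum(1 for v in varieties if v != target)

    totals = []
    for mean_by_variety, sigma in mcmc_samples:
        # For every other plot, simulate what the target variety would
        # have yielded, using this draw of the parameters.
        simulated = sum(
            rng.gauss(mean_by_variety[target], sigma) for _ in range(n_missing)
        )
        totals.append(observed_part + simulated)
    return totals


def prob_at_least(totals_a, totals_b, ratio=1.10):
    """Share of paired draws in which A's total is >= ratio times B's."""
    hits = sum(1 for a, b in zip(totals_a, totals_b) if a >= ratio * b)
    return hits / len(totals_a)
```

Calling `total_yield_draws` once per variety with the same list of MCMC draws keeps the totals paired, so `prob_at_least` estimates a posterior probability of the kind mentioned in the text (for example, that one variety would have out-yielded another by at least 10%).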
ton (1986). The findings typically suggest that the gains from spatial analysis in a badly designed experiment provide improvements commensurate with standard analysis and optimal design. This is not a reason to adopt poor designs but the simple fact is that, despite the efforts of statisticians, many experiments are carried out using nothing better than randomized complete blocks. It is highly desirable that the representation of fertility is flexible but is also parsimonious because there are many variety effects to be estimated, with very limited replication. McCullagh's use of discrete approximations to harmonic functions in Section 8 fails on both counts: first, local maxima or minima cannot exist except (artificially) at plots on the edge of the trial; second, the degrees of freedom lost in the fit equals the number of such plots and is therefore substantial (in fact, four less in a rectangular layout because the corner plots are ignored throughout the analysis!). Nevertheless, there is something appealing about the averaging property of harmonic functions, if only it were a little more flexible. What is required is a random effects (in frequentist terms) version and that is precisely the thinking behind the use of intrinsic autoregressions in BH and elsewhere. Indeed, such schemes fit McCullagh's discretized harmonic functions perfectly, except for edge effects (because BH embeds the array in a larger one to cater for such effects), and they also provide a good fit to more plausible fertility functions. For specific comments on the Mercer and Hall data, see below. Of course, spatial scale remains an important issue for variety trials and indeed is discussed empirically in Section 2.3 and in the rejoinder of BH. For one-dimensional adjustment, the simplest plausible continuum process is Brownian motion with an arbitrary level, for which the necessary integrations can be
ATKINSON, A. C. and BAILEY, R. A. (2001). One hundred years of the design of experiments on and off the pages of Biometrika. Biometrika 88 53-97.
Mathematical Reviews (MathSciNet): MR2002b:62001
Zentralblatt MATH: 1037.62069
Digital Object Identifier: doi:10.1093/biomet/88.1.53
BESAG, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). J. Roy. Statist. Soc. Ser. B 36 192-236.
Mathematical Reviews (MathSciNet): MR51:9409
BESAG, J. E. (1975). Statistical analysis of non-lattice data. The Statistician 24 179-195.
BESAG, J. E., GREEN, P. J., HIGDON, D. M. and MENGERSEN, K. L. (1995). Bayesian computation and stochastic systems (with discussion). Statist. Sci. 10 3-66.
Mathematical Reviews (MathSciNet): MR96m:62048
Digital Object Identifier: doi:10.1214/ss/1177010123
Project Euclid: euclid.ss/1177010123
BESAG, J. E. and HIGDON, D. M. (1993). Bayesian inference for agricultural field experiments. Bull. Internat. Statist. Inst. 55 121-136.
BESAG, J. E. and HIGDON, D. M. (1999). Bayesian analysis of agricultural field experiments (with discussion). J. Roy. Statist. Soc. Ser. B 61 691-746.
Mathematical Reviews (MathSciNet): MR1722238
Zentralblatt MATH: 0951.62091
Digital Object Identifier: doi:10.1111/1467-9868.00201
BESAG, J. E. and KEMPTON, R. A. (1986). Statistical analysis of field experiments using neighbouring plots. Biometrics 42 231-251.
Zentralblatt MATH: 0658.62129
BESAG, J. E. and KOOPERBERG, C. L. (1995). On conditional and intrinsic autoregressions. Biometrika 82 733-746.
Mathematical Reviews (MathSciNet): MR97b:62164
Zentralblatt MATH: 0899.62123
BESAG, J. E., YORK, J. C. and MOLLIÉ, A. (1991). Bayesian image restoration, with two applications in spatial statistics (with discussion). Ann. Inst. Statist. Math. 43 1-59.
Mathematical Reviews (MathSciNet): MR92d:62032
Zentralblatt MATH: 0760.62029
Digital Object Identifier: doi:10.1007/BF00116466
BREIMAN, L. (2001). Statistical modeling: the two cultures (with discussion). Statist. Sci. 16 199-231.
Mathematical Reviews (MathSciNet): MR1874152
Digital Object Identifier: doi:10.1214/ss/1009213726
Project Euclid: euclid.ss/1009213726
BYERS, S. D. and BESAG, J. E. (2000). Inference on a collapsed margin in disease mapping. Statistics in Medicine 19 2243-2249.
FAIRFIELD SMITH, H. (1938). An empirical law describing heterogeneity in the yields of agricultural crops. J. Agric. Sci. 28 1-23.
FISHER, R. A. (1922). On the mathematical foundations of theoretical statistics. Philos. Trans. Roy. Soc. London Ser. A 222 309-368.
FISHER, R. A. (1928). Statistical Methods for Research Workers, 2nd ed. Oliver and Boyd, Edinburgh.
GILMOUR, A. R., CULLIS, B. R., SMITH, A. B. and VERBYLA, A. P. (1999). Discussion of paper by J. E. Besag and D. M. Higdon. J. Roy. Statist. Soc. B 61 731-732.
HEINE, V. (1955). Models for two-dimensional stationary stochastic processes. Biometrika 42 170-178.
Mathematical Reviews (MathSciNet): MR17,167g
Zentralblatt MATH: 0067.36504
KÜNSCH, H. R. (1987). Intrinsic autoregressions and related models on the two-dimensional lattice. Biometrika 74 517-524.
Zentralblatt MATH: 0671.62082
MATÉRN, B. (1986). Spatial Variation. Springer, New York.
Mathematical Reviews (MathSciNet): MR867886
MCBRATNEY, A. B. and WEBSTER, R. (1981). Detection of ridge and furrow pattern by spectral analysis of crop yield. Internat. Statist. Rev. 49 45-52.
PEARCE, S. C. (1976). An examination of Fairfield Smith's law of environmental variation. J. Agric. Sci. 87 21-24.
RUE, H. and TJELMELAND, H. (2002). Fitting Gaussian Markov random fields to Gaussian fields. Scand. J. Statist. 29 31-49.
Mathematical Reviews (MathSciNet): MR1894379
Digital Object Identifier: doi:10.1111/1467-9469.00058
WHITTLE, P. (1962). Topographic correlation, power-law covariance functions, and diffusion. Biometrika 49 305-314.
Mathematical Reviews (MathSciNet): MR31:5305
Zentralblatt MATH: 0114.08003
SEATTLE, WASHINGTON 98195-4322 E-MAIL: julian@stat.washington.edu
recently by Chen, Lockhart and Stephens (2002). One reason for its attractiveness to me is that if one considers the more realistic semiparametric model, $a(Y) = X\beta + \epsilon$, (6) where $a$ is an arbitrary monotone transformation and $\epsilon$ has a $N(\mu, \sigma^2)$ distribution, then $\beta/\sigma$ is identifiable and estimable at the $n^{-1/2}$ rate while $\beta$ is not identifiable. Bickel and Ritov (1997) discuss ways of estimating $\beta/\sigma$ and $a$, which is also estimable at rate $n^{-1/2}$, optimally and suggest approaches to algorithms in their paper. The choice $(\lambda, \beta)$ is of interest to me because its consideration is the appropriate response to the Hinkley-Runger critique. One needs to specify a joint confidence region for $(\lambda, \beta)$ making statements such as "the effect magnitude $\beta$ on the $\lambda$ scale is consistent with the data." The effect of lack of knowledge of $\lambda$ on the variance of $\hat\beta$ remains interpretable. It would be more attractive if McCullagh could somehow divorce the calculus of this paper from the language of functors, morphisms and canonical diagrams for more analysis-oriented statisticians such as myself.
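The identifiability remark can be illustrated numerically. The sketch below assumes the transformation model has the usual form $a(Y) = X\beta + \epsilon$ with $\epsilon \sim N(\mu, \sigma^2)$, and, purely for illustration, takes $a$ to be a multiple of the logarithm; the function name `simulate_y` and all numerical values are hypothetical. Multiplying $a$, $\beta$, $\mu$ and $\sigma$ by the same constant $c$ leaves the data unchanged, so only the ratio $\beta/\sigma$ can be recovered.

```python
import math
import random


def simulate_y(x, beta, mu, sigma, scale, noise):
    """Generate Y from scale * log(Y) = x * beta + eps, eps = mu + sigma * z.

    `scale * log` plays the role of the monotone transformation a; the
    choice of log is an illustrative assumption.
    """
    return [
        math.exp((xi * beta + mu + sigma * z) / scale)
        for xi, z in zip(x, noise)
    ]


rng = random.Random(1)
x = [0.0, 0.5, 1.0, 1.5, 2.0]
noise = [rng.gauss(0.0, 1.0) for _ in x]

# Two parameter settings that differ by a common factor c in a, beta, mu, sigma.
c = 3.0
y_first = simulate_y(x, beta=2.0, mu=0.1, sigma=0.5, scale=1.0, noise=noise)
y_second = simulate_y(
    x, beta=2.0 * c, mu=0.1 * c, sigma=0.5 * c, scale=c, noise=noise
)

# The observable data coincide, so beta alone cannot be identified ...
same_data = all(
    math.isclose(u, v, rel_tol=1e-9) for u, v in zip(y_first, y_second)
)
# ... while the ratio beta / sigma is the same under both settings.
ratio_first = 2.0 / 0.5
ratio_second = (2.0 * c) / (0.5 * c)
```

Here `same_data` is true even though `beta` differs by a factor of three between the two settings, while `ratio_first` and `ratio_second` agree.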
BICKEL, P. and RITOV, Y. (1997). Local asymptotic normality of ranks and covariates in the transformation models. In Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics (D. Pollard, E. Torgersen and G. Yang, eds.) 43-54. Springer, New York.
Mathematical Reviews (MathSciNet): MR98j:62027
Zentralblatt MATH: 0897.62017
CHEN, G., LOCKHART, R. A. and STEPHENS, M. A. (2002). Box-Cox transformations in linear models: Large sample theory and tests of normality (with discussion). Canad. J. Statist. 30 177-234.
Mathematical Reviews (MathSciNet): MR1926062
Digital Object Identifier: doi:10.2307/3315946
BERKELEY, CALIFORNIA 94720-3860 E-MAIL: bickel@stat.berkeley.edu
MAC LANE, S. (1998). Categories for the Working Mathematician, 2nd ed. Springer, New York.
Mathematical Reviews (MathSciNet): MR2001j:18001
FRASER, D. A. S. (1968a). A black box or a comprehensive model. Technometrics 10 219-229.
Mathematical Reviews (MathSciNet): MR237024
Digital Object Identifier: doi:10.2307/1267040
FRASER, D. A. S. (1968b). The Structure of Inference. Wiley, New York.
Mathematical Reviews (MathSciNet): MR235643
Zentralblatt MATH: 0164.48703
MCCULLAGH, P. (1992). Conditional inference and Cauchy models. Biometrika 79 247-259.
Mathematical Reviews (MathSciNet): MR93h:62048
Zentralblatt MATH: 0753.62002
Digital Object Identifier: doi:10.1093/biomet/79.2.247
TORONTO, ONTARIO M5S 3G3 CANADA E-MAIL: reid@utstat.utoronto.ca
from Helland (2002). Let a group $G$ be defined on the parameter space $\Theta$ of a model. A measurable function $\xi$ from $\Theta$ to another space $\Xi$ is called a natural subparameter if $\xi(\theta_1) = \xi(\theta_2)$ implies $\xi(g\theta_1) = \xi(g\theta_2)$ for all $g \in G$. For example, in the location and scale case the location parameter $\mu$ and the scale parameter $\sigma$ are natural, while the coefficient of variation $\mu/\sigma$ is not natural (it is if the group is changed to the pure scale group). In general the parameter $\xi$ is natural iff the level sets of the function $\xi = \xi(\theta)$ are transformed onto other
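The defining implication for a natural subparameter can be checked on concrete points. The sketch below is an illustration, not part of the original discussion: it assumes the location-scale group acts on $\theta = (\mu, \sigma)$ by $(a, b)\colon (\mu, \sigma) \mapsto (a + b\mu, b\sigma)$ with $b > 0$, and the helper names (`act`, `respects_level_sets`, and so on) are hypothetical.

```python
def act(g, theta):
    """Group element g = (a, b), b > 0, acting on theta = (mu, sigma)."""
    a, b = g
    mu, sigma = theta
    return (a + b * mu, b * sigma)


def location(theta):
    return theta[0]


def scale(theta):
    return theta[1]


def coefficient(theta):
    # The ratio mu / sigma, as written in the text.
    return theta[0] / theta[1]


def respects_level_sets(param, g, theta1, theta2):
    """Check the defining implication for one g and one pair of points.

    Returns True when the premise fails (nothing to check) or when the
    two transformed points still share a parameter value.
    """
    if param(theta1) != param(theta2):
        return True
    return param(act(g, theta1)) == param(act(g, theta2))


# Two points with the same ratio mu / sigma = 1 but different mu and sigma.
t1, t2 = (1.0, 1.0), (2.0, 2.0)
shift = (1.0, 1.0)       # a = 1, b = 1: a pure location shift
rescale = (0.0, 3.0)     # a = 0, b = 3: a pure scale change

ratio_under_shift = respects_level_sets(coefficient, shift, t1, t2)
ratio_under_rescale = respects_level_sets(coefficient, rescale, t1, t2)
location_under_shift = respects_level_sets(location, shift, (1.0, 1.0), (1.0, 5.0))
scale_under_affine = respects_level_sets(scale, (1.0, 3.0), (1.0, 2.0), (4.0, 2.0))
```

A single failing pair, as for the ratio under a location shift, is enough to show that a function is not natural for the full group; the passing checks are consistent with naturalness but do not by themselves prove it.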
inconsistency discussed in detail by Dawid, Stone and Zidek (1973). Their main problem is a violation of the plausible reduction principle: assume that a general method of inference, applied to data $(y, z)$, leads to an answer that in fact depends on $z$ alone. Then the same answer should appear if the same method is applied to $z$ alone. A Bayesian implementation of this principle runs as follows: assume first that the probability density $p(y, z \mid \eta, \zeta)$ depends on the parameter $\theta = (\eta, \zeta)$ in such a way that the marginal density $p(z \mid \zeta)$ only depends upon $\zeta$. Then the following implication should hold: if (a) the marginal posterior density $\pi(\zeta \mid y, z)$ depends on the data $(y, z)$ only through $z$, then (b) this $\pi(\zeta \mid z)$ should be proportional to $a(\zeta)p(z \mid \zeta)$ for some function $a(\zeta)$, so that it is proportional to a posterior based solely on the $z$ data. For a proper prior $\pi(\eta, \zeta)$ this can be shown to hold with $a(\zeta)$ being the appropriate marginal prior $\pi(\zeta)$. Dawid, Stone and Zidek (1973) gave several examples where the implication above is violated by improper priors of the kind that we sometimes expect to have in objective Bayes inference. For our purpose, the interesting case is when there is a transformation group $G$ defined on the parameter space. Under the assumption that $\zeta$ is maximal invariant under $G$ and making some regularity conditions, it is then first shown by Dawid, Stone and Zidek (1973) that it necessarily follows that $p(z \mid \eta, \zeta)$ only depends upon $\zeta$, next (a) is shown to hold always, and finally (b) holds if and only if the prior is of the form $\nu_G(d\eta)\,d\zeta$, where $\nu_G$ is right Haar measure, and the measure
DAWID, A. P., STONE, M. and ZIDEK, J. V. (1973). Marginalization paradoxes in Bayesian and structural inference (with discussion). J. Roy. Statist. Soc. Ser. B 35 189-233.
Mathematical Reviews (MathSciNet): MR51:2057
HELLAND, I. S. (2001). Reduction of regression models under symmetry. In Algebraic Methods in Statistics and Probability (M. Viana and D. Richards, eds.) 139-153. Amer. Math. Soc., Providence, RI.
Mathematical Reviews (MathSciNet): MR2002k:62089
Zentralblatt MATH: 1012.62077
HELLAND, I. S. (2002). Statistical inference under a fixed symmetry group. Available at http:// www.math.uio.no/ ingeh/.
BROWN, L. D. (1984). The research of Jack Kiefer outside the area of experimental design. Ann. Statist. 12 406-415.
Zentralblatt MATH: 0549.01017
Mathematical Reviews (MathSciNet): MR740901
Digital Object Identifier: doi:10.1214/aos/1176346495
Project Euclid: euclid.aos/1176346495
CARTIER, P. (2001). A mad day's work: From Grothendieck to Connes and Kontsevich. The evolution of concepts of space and symmetry. Bull. Amer. Math. Soc. 38 389-408.
Mathematical Reviews (MathSciNet): MR1848254
Digital Object Identifier: doi:10.1090/S0273-0979-01-00913-2
GROTHENDIECK, A. (1955). Produits tensoriels topologiques et espaces nucléaires. Mem. Amer. Math. Soc. 16.
Mathematical Reviews (MathSciNet): MR17,763c
HUBER, P. J. (1961). Homotopy theory in general categories. Math. Ann. 144 361-385.
Mathematical Reviews (MathSciNet): MR27:187
Zentralblatt MATH: 0099.17905
Digital Object Identifier: doi:10.1007/BF01396534
LE CAM, L. (1964). Sufficiency and approximate sufficiency. Ann. Math. Statist. 35 1419-1455.
Mathematical Reviews (MathSciNet): MR34:6909
Zentralblatt MATH: 0129.11202
Digital Object Identifier: doi:10.1214/aoms/1177700372
Project Euclid: euclid.aoms/1177700372
ARBUTHNOTT, J. (1712). An argument for Divine Providence, taken from the constant regularity observed in the births of both sexes. Philos. Trans. Roy. Soc. London 27 186-190.
BAILEY, R. A. (1981). A unified approach to design of experiments. J. Roy. Statist. Soc. Ser. A 144 214-223.
Mathematical Reviews (MathSciNet): MR82h:62129
Digital Object Identifier: doi:10.2307/2981920
BAILEY, R. A. (1991). Strata for randomized experiments (with discussion). J. Roy. Statist. Soc. Ser. B 53 27-78.
Mathematical Reviews (MathSciNet): MR92k:62134
COX, D. R. (1990). Roles of models in statistical analysis. Statist. Sci. 5 169-174.
Mathematical Reviews (MathSciNet): MR1062575
Digital Object Identifier: doi:10.1214/ss/1177012165
Project Euclid: euclid.ss/1177012165
DIACONIS, P. (1988). Group Representations in Probability and Statistics. IMS, Hayward, CA.
Mathematical Reviews (MathSciNet): MR90a:60001
Zentralblatt MATH: 0695.60012
DIACONIS, P., GRAHAM, R. L. and KANTOR, W. M. (1983). The mathematics of perfect shuffles. Adv. in Appl. Math. 4 175-196.
Mathematical Reviews (MathSciNet): MR84j:20040
Zentralblatt MATH: 0521.05005
Digital Object Identifier: doi:10.1016/0196-8858(83)90009-X
FURSTENBERG, H. (1963). Noncommuting random products. Trans. Amer. Math. Soc. 108 377-428.
Mathematical Reviews (MathSciNet): MR163345
Zentralblatt MATH: 0203.19102
Digital Object Identifier: doi:10.2307/1993589
GRENANDER, U. (1963). Probabilities on Algebraic Structures. Wiley, New York.
Mathematical Reviews (MathSciNet): MR34:6810
MCCULLAGH, P. (1999). Quotient spaces and statistical models. Canad. J. Statist. 27 447-456.
Mathematical Reviews (MathSciNet): MR1745814
Digital Object Identifier: doi:10.2307/3316103
MCCULLAGH, P. (2000). Invariance and factorial models (with discussion). J. Roy. Statist. Soc. Ser. B 62 209-256.
Mathematical Reviews (MathSciNet): MR2002a:62102
Digital Object Identifier: doi:10.1111/1467-9868.00229
PINCUS, S. and KALMAN, R. E. (1997). Not all (possibly) "random" sequences are created equal. Proc. Nat. Acad. Sci. U.S.A. 94 3513-3518.
Mathematical Reviews (MathSciNet): MR99d:68179
Zentralblatt MATH: 0873.11047
Digital Object Identifier: doi:10.1073/pnas.94.8.3513
PINCUS, S. and SINGER, B. H. (1996). Randomness and degrees of irregularity. Proc. Nat. Acad. Sci. U.S.A. 93 2083-2088.
Mathematical Reviews (MathSciNet): MR97g:65025
Zentralblatt MATH: 0849.60002
Digital Object Identifier: doi:10.1073/pnas.93.5.2083
GUILFORD, CONNECTICUT 06437 E-MAIL: stevepincus@alum.mit.edu
in McCullagh (1980). Suppose we are dealing with a universe where the natural models for handling of binary responses are the logistic regression models. This could be some socioeconomic research area where peoples' attitudes to various features of brands or service levels are recorded on a binary scale, and the interest lies in the dependence of these attitudes on all sorts of background variables. How do we extend this universe to deal with ordered categorical responses, for example, on three-point positive/indifferent/negative scales? A natural requirement seems to be that if data are dichotomized by the (arbitrary) selection of a cutpoint (putting, for example, negative and indifferent together in a single category), then the marginal model coming out of this is a logistic regression model. This is, after all, just a way of recording a binary response, and even though it would hurt any statistician to throw away information in this way, it is done all the time on more invisible levels. Another natural requirement is that the parameters of interest-with the constant term as an obvious exception-should not depend on how the cutpoint is selected. It is easy to show that these two requirements are met by one and only one class of models for ordered responses, namely the models that can
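The two requirements can be checked numerically for a cumulative logit model, which is used here only as an illustration. The sketch assumes the parameterization $P(Y \le j \mid x) = \operatorname{expit}(\alpha_j - \beta x)$ with increasing cutpoints $\alpha_j$; the sign convention, the function names and the numerical values are all hypothetical choices.

```python
import math


def expit(t):
    return 1.0 / (1.0 + math.exp(-t))


def category_probs(x, cutpoints, beta):
    """Probabilities of the ordered categories under a cumulative logit model.

    Assumes P(Y <= j | x) = expit(cutpoints[j] - beta * x) with increasing
    cutpoints; the sign convention is an illustrative choice.
    """
    cumulative = [expit(c - beta * x) for c in cutpoints] + [1.0]
    probs = [cumulative[0]]
    probs += [cumulative[j] - cumulative[j - 1] for j in range(1, len(cumulative))]
    return probs


def dichotomized_logit(x, cutpoints, beta, split):
    """Log-odds of falling in the first `split` categories after merging."""
    p = sum(category_probs(x, cutpoints, beta)[:split])
    return math.log(p / (1.0 - p))


# Three categories (negative / indifferent / positive), hypothetical values.
cutpoints, beta = [-0.5, 1.0], 0.8


def slope(split):
    # Change in log-odds per unit change in x for the merged binary response.
    return (dichotomized_logit(1.0, cutpoints, beta, split)
            - dichotomized_logit(0.0, cutpoints, beta, split))


slope_low_cut = slope(1)    # negative vs. (indifferent + positive)
slope_high_cut = slope(2)   # (negative + indifferent) vs. positive
intercept_low_cut = dichotomized_logit(0.0, cutpoints, beta, 1)
intercept_high_cut = dichotomized_logit(0.0, cutpoints, beta, 2)
```

Under these assumptions each dichotomized response has log-odds that are linear in `x`, so it follows a logistic regression; the slope is the same for both cutpoints and only the constant term changes.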
and Nelder (1989). Thus, we have here the absurd situation that the potentially canonical-but unfortunately nonexisting-answer to a simple and canonical question results in a collection of very useful methods. The overdispersion models exist as perfectly respectable operational objects, but not as mathematical objects. My personal opinion [Tjur (1998)] is that the simplest way of giving these models a concrete interpretation goes via approximation by nonlinear models for normal data and a small adjustment of the usual estimation method for these models. But neither this, nor the concept of quasi-likelihood, answers the fundamental question whether there is a way of modifying the conditions (1) and (2) above in such a way that a meaningful theory of generalized linear models with overdispersion comes out as the unique answer. It is tempting to ask, in the present context, whether it is a necessity at all that these models "exist" in the usual sense. Is it so, perhaps, that after a century or two people will find this question irrelevant, just as we find old discussions about existence of the number + irrelevant? If this is the case, a new attitude to statistical models is certainly required.
MCCULLAGH, P. (1980). Regression models for ordinal data (with discussion). J. Roy. Statist. Soc. Ser. B 42 109-142.
Mathematical Reviews (MathSciNet): MR81j:62107
MCCULLAGH, P. and NELDER, J. A. (1989). Generalized Linear Models, 2nd ed. Chapman and Hall, London.
Mathematical Reviews (MathSciNet): MR727836
NELDER, J. A. and WEDDERBURN, R. W. M. (1972). Generalized linear models. J. Roy. Statist. Soc. Ser. A 135 370-384.
TJUR, T. (1998). Nonlinear regression, quasi likelihood, and overdispersion in generalized linear models. Amer. Statist. 52 222-227.
Mathematical Reviews (MathSciNet): MR99g:62089
Digital Object Identifier: doi:10.2307/2685928
has recently been obtained by Wichura (2001). Fraser and Reid ask whether category theory can do more than provide a framework. My experience here is similar to Huber's, namely that category theory is well suited for this purpose but, as a branch of logic, that is all we can expect from it. Regarding the coefficient of variation, I agree that there are applications in which this is a useful and natural parameter or statistic, just as there are (a few) applications in which the correlation coefficient is useful. The groups used in this paper are such that the origin is either fixed or completely arbitrary. In either case there is no room for hedging. In practice, things are rarely so clear cut. In order to justify the coefficient of variation, it seems to me that the applications must be such that the scale of measurement has a reasonably well-defined origin relevant to the problem. The Cauchy model with the real fractional linear group was originally used as an example to highlight certain inferential problems. I do not believe I have encountered an application in which it would be easy to make a convincing case for the relevance of this group. Nevertheless, I think it is helpful to study such examples for the light they may shed on foundational matters. The fact that the median is not a natural subparameter is an insight that casts serious doubt on the relevance of the group in "conventional" applications. To turn the argument around, the fact that the Cauchy model is closed under real fractional linear transformation is not, in itself, an adequate reason to choose that group as the base category. In that sense, I agree with a primary thesis of Fraser's Structure of Inference that the group supersedes the probability model. Tjur's remarks capture the spirit of what I am attempting to do. In the cumulative logit model, it is clear intuitively what is meant by the statement that the parameter of interest should not depend on how the cutpoints are selected. 
As is often the case, what is intuitively clear is not so easy to express in mathematical terms. It does not mean that the maximum-likelihood estimate is unaffected by this choice. For that reason, although Tjur's second condition on overdispersion models has a certain appeal, I do not think it carries the same force as the first. His description of natural subparameters in regression is a model of clarity.
given the values on the contour (Matheron, 1971). Both processes are also conformal, but the similarity ends there. The set of conformal processes is also closed under addition of independent processes. Thus, the sum of white noise and W is conformal but not Markov. Beyond convolutions of white noise and W, it appears most unlikely that there exists another conformal process with Gaussian increments. Whittle's (1954) family of stationary Gaussian processes has the Markov property [Chilès and Delfiner (1999)] but the family is not closed under conformal maps nor under convolution.
CHILÈS, J.-P. and DELFINER, P. (1999). Geostatistics. Wiley, New York.
FEYNMAN, R. P., LEIGHTON, R. B. and SANDS, M. (1964). The Feynman Lectures on Physics. Addison-Wesley, Reading, MA.
Mathematical Reviews (MathSciNet): MR213078
FRASER, D. A. S. (1968b). The Structure of Inference. Wiley, New York.
Mathematical Reviews (MathSciNet): MR235643
Zentralblatt MATH: 0164.48703
HELLAND, I. S. (1999a). Quantum mechanics from symmetry and statistical modelling. Internat. J. Theoret. Phys. 38 1851-1881.
Mathematical Reviews (MathSciNet): MR1704291
Zentralblatt MATH: 0953.81003
Digital Object Identifier: doi:10.1023/A:1026676913271
KINGMAN, J. F. C. (1972). On random sequences with spherical symmetry. Biometrika 59 492-494.
Mathematical Reviews (MathSciNet): MR49:8161
Zentralblatt MATH: 0238.60025
Digital Object Identifier: doi:10.1093/biomet/59.2.492
MACCULLAGH, J. (1839). An essay towards the dynamical theory of crystalline reflexion and refraction. Trans. Roy. Irish Academy 21 17-50.
MATHERON, G. (1971). The theory of regionalized variables and its applications. Cahiers du Centre de Morphologie Mathématique de Fontainebleau 5.
WHITTLE, P. (1954). On stationary processes in the plane. Biometrika 41 434-449.
Mathematical Reviews (MathSciNet): MR16,731c
Zentralblatt MATH: 0058.35601
WICHURA, M. (2001). Some de Finetti type theorems. Preprint.
CHICAGO, ILLINOIS 60637-1514 E-MAIL: pmcc@galton.uchicago.edu

2013 © Institute of Mathematical Statistics
