## The Annals of Statistics

### On the Choice of a Model to Fit Data from an Exponential Family

Dominique M. A. Haughton

#### Abstract

Let $X_1, \ldots, X_n$ be iid observations coming from an exponential family. The problem of interest is this: Given a finite number of models $m_j$ (smoothly curved manifolds in $\mathbb{R}^k$), choose the best model to fit the observations, with some penalty for choosing models with dimensions which are too large. A result of Schwarz is made more specific and is extended to the case where the models are curved manifolds. If $S(Y, n, j)$ is, up to a constant $C(n)$ independent of the model, the log of the posterior probability of the $j$th model, where the sample mean $Y_n = (1/n)\sum^n_{i=1}X_i$ has been replaced by $Y$, Schwarz suggested an asymptotic expansion of $S(Y, n, j)$ whose leading terms are $\gamma(Y, n, j) = n \sup_{\psi \in m_j \cap \Theta}(Y\psi - b(\psi)) - \frac{1}{2}k_j \log n$, in the case where the models are affine subspaces of $\mathbb{R}^k$. We establish a similar asymptotic expansion, including the next term, with uniform bounds for $Y$ in a compact neighborhood of $\nabla b(\theta)$, where $\theta$ is the true value of the parameter. We suggest a criterion for the choice of the best model that consists of maximizing the three leading terms in the expansion of $S(Y, n, j)$. We show that the criterion gives the correct model with probabilities $P^n_\theta \rightarrow 1$ as $n \rightarrow +\infty$.
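As a rough illustration of the two leading terms $\gamma(Y, n, j)$ quoted above, the sketch below evaluates them in one simple special case chosen here for convenience, not taken from the paper: a Gaussian location family with identity covariance, so that $b(\psi) = \frac{1}{2}|\psi|^2$ and $\Theta = \mathbb{R}^k$, with models that are coordinate subspaces. In that case the supremum is attained at the projection of $Y$ onto the subspace and equals half the squared norm of that projection. The function names `schwarz_gamma` and `choose_model` and the sample numbers are hypothetical, and the third term of the expansion established in the paper is not included because the abstract does not state it.

```python
import math

def schwarz_gamma(y, n, coords):
    """Two leading terms: n * sup(y.psi - b(psi)) - 0.5 * k_j * log(n).

    Assumption (illustrative only): b(psi) = |psi|^2 / 2 and the model is the
    coordinate subspace spanned by the indices in `coords`, so the supremum
    is 0.5 * sum of y[i]^2 over those indices.
    """
    k_j = len(coords)  # dimension of the model
    sup_term = 0.5 * sum(y[i] ** 2 for i in coords)
    return n * sup_term - 0.5 * k_j * math.log(n)

def choose_model(y, n, models):
    """Return the name of the model with the largest two-term criterion."""
    return max(models, key=lambda name: schwarz_gamma(y, n, models[name]))

# Hypothetical nested models in R^2: the origin, the first axis, the plane.
models = {"m0": (), "m1": (0,), "m2": (0, 1)}
y = [1.0, 0.01]  # stand-in for the sample mean Y_n
n = 100          # sample size

best = choose_model(y, n, models)
print(best)
```

With these numbers the second coordinate of the sample mean is too small to justify the extra $\frac{1}{2}\log n$ penalty, so the one-dimensional model is preferred over the full plane.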

#### Article information

Source
Ann. Statist. Volume 16, Number 1 (1988), 342-355.

Dates
First available in Project Euclid: 12 April 2007

Permanent link to this document
http://projecteuclid.org/euclid.aos/1176350709

Digital Object Identifier
doi:10.1214/aos/1176350709

Mathematical Reviews number (MathSciNet)
MR924875

Zentralblatt MATH identifier
0657.62037
