Abstract
Let $X_1, \ldots, X_n$ be i.i.d. observations from an exponential family. The problem of interest is this: given a finite number of models $m_j$ (smoothly curved manifolds in $\mathbb{R}^k$), choose the model that best fits the observations, with a penalty for choosing models whose dimension is too large. A result of Schwarz is made more specific and is extended to the case where the models are curved manifolds. Let $S(Y, n, j)$ be, up to a constant $C(n)$ independent of the model, the log of the posterior probability of the $j$th model, where the sample mean $Y_n = (1/n)\sum^n_{i=1}X_i$ has been replaced by $Y$. In the case where the models are affine subspaces of $\mathbb{R}^k$, Schwarz suggested an asymptotic expansion of $S(Y, n, j)$ whose leading terms are $\gamma(Y, n, j) = n \sup_{\psi \in m_j \cap \Theta}(Y\psi - b(\psi)) - \frac{1}{2}k_j \log n$, where $k_j$ denotes the dimension of $m_j$. We establish a similar asymptotic expansion, including the next term, with uniform bounds for $Y$ in a compact neighborhood of $\nabla b(\theta)$, where $\theta$ is the true value of the parameter. We suggest a criterion for the choice of the best model that consists of maximizing the three leading terms in the expansion of $S(Y, n, j)$. We show that the criterion gives the correct model with probabilities $P^n_\theta \rightarrow 1$ as $n \rightarrow + \infty$.
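As a rough illustration of the two leading terms $\gamma(Y, n, j)$ attributed to Schwarz above, the following Python sketch works through one special case chosen only for this example. It assumes a Gaussian family with identity covariance, so that $b(\psi) = \|\psi\|^2/2$, and models $m_j$ that are coordinate subspaces in which only the first $k_j$ coordinates are free. Under those assumptions the supremum has the closed form $\frac{1}{2}\sum_{i \le k_j} Y_i^2$. The function names `gamma` and `choose_model` are hypothetical, and the third term of the expansion established in the paper is not stated in the abstract, so it is not included here.

```python
import math


def gamma(y, n, k_j):
    """Two leading terms n * sup(Y.psi - b(psi)) - (k_j / 2) * log n.

    Assumptions (illustrative only, not taken from the paper):
    b(psi) = |psi|^2 / 2 (Gaussian family, identity covariance) and the
    model is the subspace where only the first k_j coordinates are free,
    so the supremum equals 0.5 * sum(y_i^2 for i < k_j).
    """
    sup_term = 0.5 * sum(v * v for v in y[:k_j])
    return n * sup_term - 0.5 * k_j * math.log(n)


def choose_model(y, n, dims):
    """Return the dimension k_j in dims that maximizes gamma(y, n, k_j)."""
    return max(dims, key=lambda k_j: gamma(y, n, k_j))


# Sample mean with signal in the first coordinate only, n = 100 observations.
y_bar = [1.0, 0.0, 0.0]
best = choose_model(y_bar, 100, [0, 1, 2, 3])
print(best)
```

In this toy setting, enlarging the model beyond the first coordinate does not increase the supremum term but does increase the penalty $\frac{1}{2}k_j \log n$, so the one-dimensional model is selected.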
Citation
Dominique M. A. Haughton. "On the Choice of a Model to Fit Data from an Exponential Family." Ann. Statist. 16 (1) 342 - 355, March, 1988. https://doi.org/10.1214/aos/1176350709
Information