Statistical Science

The Indirect Method: Inference Based on Intermediate Statistics—A Synthesis and Examples

Wenxin Jiang and Bruce Turnbull

Abstract

This article presents an exposition and synthesis of the theory and some applications of the so-called indirect method of inference. These ideas have been exploited in the field of econometrics, but less so in other fields such as biostatistics and epidemiology. In the indirect method, statistical inference is based on an intermediate statistic, which typically follows an asymptotic normal distribution, but is not necessarily a consistent estimator of the parameter of interest. This intermediate statistic can be a naive estimator based on a convenient but misspecified model, a sample moment or a solution to an estimating equation. We review a procedure of indirect inference based on the generalized method of moments, which involves adjusting the naive estimator to be consistent and asymptotically normal. The objective function of this procedure is shown to be interpretable as an “indirect likelihood” based on the intermediate statistic. Many properties of the ordinary likelihood function can be extended to this indirect likelihood. This method is often more convenient computationally than maximum likelihood estimation when handling model complexities such as random effects and measurement error, and it can also serve as a basis for robust inference and model selection, with less stringent assumptions on the data generating mechanism. Many familiar estimation techniques can be viewed as examples of this approach. We describe applications to measurement error, omitted covariates and recurrent events. A dataset concerning prevention of mammary tumors in rats is analyzed using a Poisson regression model with overdispersion. A second dataset from an epidemiological study is analyzed using a logistic regression model with mismeasured covariates. A third dataset of exam scores is used to illustrate robust covariance selection in graphical models.
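
The idea of adjusting a naive estimator can be illustrated with a minimal sketch. It is not taken from the article: it assumes a simple linear model with classical additive measurement error, a known error variance `sigma_u2`, and hypothetical helper names (`naive_slope`, `adjusted_slope`). The intermediate statistic is the ordinary least squares slope computed from the mismeasured covariate, which is attenuated toward zero; the adjustment inverts the relation between that slope and the parameter of interest.

```python
import numpy as np


def naive_slope(w, y):
    """Intermediate statistic: OLS slope of y on the mismeasured covariate w."""
    w_c = w - w.mean()
    return float(np.dot(w_c, y - y.mean()) / np.dot(w_c, w_c))


def adjusted_slope(w, y, sigma_u2):
    """Adjust the naive slope so that it is consistent for beta.

    Assumed relation (classical measurement error, illustration only):
        naive slope -> beta * (var(w) - sigma_u2) / var(w)
    Solving this relation for beta gives the adjusted estimator.
    """
    var_w = float(np.var(w, ddof=1))
    reliability = (var_w - sigma_u2) / var_w
    return naive_slope(w, y) / reliability


# Simulated data under the assumed model: y = beta * x + noise, w = x + error.
rng = np.random.default_rng(0)
n, beta, sigma_u2 = 200_000, 2.0, 0.5
x = rng.normal(0.0, 1.0, n)                      # true covariate (unobserved)
w = x + rng.normal(0.0, np.sqrt(sigma_u2), n)    # observed, mismeasured covariate
y = beta * x + rng.normal(0.0, 1.0, n)

s_hat = naive_slope(w, y)                  # attenuated, inconsistent for beta
beta_hat = adjusted_slope(w, y, sigma_u2)  # adjusted estimate

print(f"naive slope:    {s_hat:.3f}")
print(f"adjusted slope: {beta_hat:.3f}")
```

In this simulated setup the naive slope should settle near beta / 1.5 and the adjusted slope near beta. The sketch covers only the exactly identified case, where the adjustment is a direct inversion; the generalized method of moments procedure reviewed in the article is more general.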

Article information

Source
Statist. Sci., Volume 19, Number 2 (2004), 239-263.

Dates
First available in Project Euclid: 14 January 2005

Permanent link to this document
https://projecteuclid.org/euclid.ss/1105714160

Digital Object Identifier
doi:10.1214/088342304000000152

Mathematical Reviews number (MathSciNet)
MR2140540

Zentralblatt MATH identifier
1100.62025

Keywords
Asymptotic normality; bias correction; consistency; efficiency; estimating equations; generalized method of moments; graphical models; indirect inference; indirect likelihood; measurement error; missing data; model selection; naive estimators; omitted covariates; overdispersion; quasi-likelihood; random effects; robustness

Citation

Jiang, Wenxin; Turnbull, Bruce. The Indirect Method: Inference Based on Intermediate Statistics—A Synthesis and Examples. Statist. Sci. 19 (2004), no. 2, 239-263. doi:10.1214/088342304000000152. https://projecteuclid.org/euclid.ss/1105714160

