Local central limit theorems, the high-order correlations of rejective sampling and logistic likelihood asymptotics



The Annals of Statistics

Richard Arratia, Larry Goldstein, and Bryan Langholz

Source: Ann. Statist. Volume 33, Number 2 (2005), 871-914.

Abstract

Let $I_1,\ldots,I_n$ be independent but not necessarily identically distributed Bernoulli random variables, and let $X_n=\sum_{j=1}^{n}I_j$. For $\nu$ in a bounded region, a local central limit theorem expansion of $\mathbb{P}(X_n=\mathbb{E}X_n+\nu)$ is developed to any given degree. By conditioning, this expansion provides information on the high-order correlation structure of dependent, weighted sampling schemes of a population $E$ (a special case of which is simple random sampling), where a set $d\subset E$ is sampled with probability proportional to $\prod_{A\in d}x_A$, where $x_A$ are positive weights associated with individuals $A\in E$. These results are used to determine the asymptotic information, and demonstrate the consistency and asymptotic normality of the conditional and unconditional logistic likelihood estimator for unmatched case-control study designs in which sets of controls of the same size are sampled with equal probability.
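
The link between the two objects in the abstract can be illustrated numerically. The sketch below is not taken from the paper: the function names, the example weights, and the choice $p_A = x_A/(1+x_A)$ are assumptions made for illustration. It computes the exact distribution of a sum of independent Bernoulli variables, compares it with only the leading (Gaussian) term of a local central limit approximation, and checks that conditioning the independent indicators on their sum reproduces sampling a fixed-size set with probability proportional to the product of its weights.

```python
import math
from itertools import combinations


def poisson_binomial_pmf(probs):
    """Exact pmf of X_n = sum of independent Bernoulli(p_j), by convolution."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # indicator equals 0
            nxt[k + 1] += mass * p       # indicator equals 1
        pmf = nxt
    return pmf


def leading_normal_term(probs, k):
    """Leading Gaussian term only; higher-order corrections are omitted."""
    mean = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)
    return math.exp(-((k - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def rejective_set_probs(weights, size):
    """P(d) proportional to prod_{A in d} x_A over all subsets d of the given size.

    Brute-force enumeration, so only suitable for tiny populations.
    """
    sets = list(combinations(range(len(weights)), size))
    raw = [math.prod(weights[a] for a in d) for d in sets]
    total = sum(raw)
    return {d: r / total for d, r in zip(sets, raw)}


# Hypothetical weights for a population of five individuals.
weights = [0.5, 1.0, 2.0, 1.5, 0.8]
size = 2

# Assumed correspondence: independent indicators with p_A = x_A / (1 + x_A).
probs = [x / (1.0 + x) for x in weights]
pmf = poisson_binomial_pmf(probs)
rejective = rejective_set_probs(weights, size)

# Conditioning the independent indicators on X_n = size gives the same law.
for d, q in rejective.items():
    joint = math.prod(p if a in d else 1.0 - p for a, p in enumerate(probs))
    print(d, round(q, 6), round(joint / pmf[size], 6))

# Exact point probability versus the leading Gaussian term for a larger n.
n = 200
big_probs = [0.3 + 0.4 * j / (n - 1) for j in range(n)]
big_pmf = poisson_binomial_pmf(big_probs)
print(big_pmf[100], leading_normal_term(big_probs, 100))
```

The normalizing constant in the conditioning step is the point probability $\mathbb{P}(X_n = k)$, which is the quantity a local central limit expansion approximates. With equal weights the enumeration reduces to simple random sampling, where every subset of the given size is equally likely.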

Primary Subjects: 62N02, 62D05, 60F05, 62F12
Keywords: Case-control studies; epidemiology; frequency matching

Full-text: Open access

Links and Identifiers

Permanent link to this document: http://projecteuclid.org/euclid.aos/1117114339
Digital Object Identifier: doi:10.1214/009053604000000706
Mathematical Reviews number (MathSciNet): MR2163162
Zentralblatt MATH identifier: 1068.62106

2009 © Institute of Mathematical Statistics