The Annals of Statistics

Weighted polynomial models and weighted sampling schemes for finite population

Sean X. Chen



This paper outlines a theoretical framework for finite population models with unequal sampling probabilities, along with sampling schemes for drawing random samples from these models. We first present four exact weighted sampling schemes that can be used with any finite population model and that cover ordered or unordered samples, sampling with or without replacement, and fixed or nonfixed sample size. We then introduce a new class of finite population models called weighted polynomial models (WPMs). The probability density of a WPM is defined through a symmetric polynomial of the weights of the units in the sample. We show that WPMs have already been applied in many statistical analyses, including survey sampling, logistic regression, case-control studies, lotteries, DNA sequence alignment and MCMC simulations. We provide general strategies that can help improve the efficiency of the exact weighted sampling schemes for any given WPM. We show that, under a mild condition, sampling from any WPM can be implemented in polynomial time. A Metropolis-Hastings-type scheme is proposed for approximate weighted sampling when the exact sampling schemes become intractable for moderate population and sample sizes. We show that, under a mild condition, the average acceptance rate of the approximate sampling scheme for any WPM can be expressed in closed form using only the inclusion probabilities.
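To make the idea of an exact, polynomial-time weighted sampling scheme concrete, here is a minimal Python sketch. It is not the paper's own algorithm. It assumes the simplest symmetric-polynomial case, in which the probability of an unordered sample of fixed size n drawn without replacement is proportional to the product of the weights of its units, so the normalizing constant is an elementary symmetric polynomial of the weights. The function names `esp_table` and `sample_fixed_size` are hypothetical, chosen for illustration only.

```python
import random


def esp_table(weights, n):
    """table[j][k] = elementary symmetric polynomial of degree k in weights[j:]."""
    N = len(weights)
    table = [[0.0] * (n + 1) for _ in range(N + 1)]
    for j in range(N + 1):
        table[j][0] = 1.0  # the empty product
    # Fill from the last unit backwards: a size-k subset of weights[j:]
    # either skips unit j or contains it.
    for j in range(N - 1, -1, -1):
        for k in range(1, n + 1):
            table[j][k] = table[j + 1][k] + weights[j] * table[j + 1][k - 1]
    return table


def sample_fixed_size(weights, n, rng=random):
    """Draw n distinct unit indices with P(sample) proportional to the
    product of the sampled weights (assumes positive weights, n <= N)."""
    table = esp_table(weights, n)
    sample, remaining = [], n
    for j, w in enumerate(weights):
        if remaining == 0:
            break
        # Conditional probability that unit j is in the sample, given
        # that `remaining` units must still come from weights[j:].
        p_include = w * table[j + 1][remaining - 1] / table[j][remaining]
        if rng.random() < p_include:
            sample.append(j)
            remaining -= 1
    return sample


rng = random.Random(0)
print(sample_fixed_size([1.0, 2.0, 3.0, 4.0], 2, rng))
```

Building the table takes O(Nn) arithmetic operations and each draw is a single pass over the population, which is one way a sampling scheme of this kind can run in polynomial time instead of enumerating all subsets of size n.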

Article information

Ann. Statist. Volume 26, Number 5 (1998), 1894-1915.

First available in Project Euclid: 21 June 2002


Primary: 62D05, 62E15
Secondary: 62E25

Finite population; Metropolis-Hastings algorithm; polynomial theory; survey sampling; weighted sampling


Chen, Sean X. Weighted polynomial models and weighted sampling schemes for finite population. Ann. Statist. 26 (1998), no. 5, 1894-1915. doi:10.1214/aos/1024691362.


