Brazilian Journal of Probability and Statistics

Weighted sampling without replacement

Anna Ben-Hamou, Yuval Peres, and Justin Salez

Abstract

Comparing the concentration properties of uniform sampling with and without replacement has a long history, which can be traced back to the pioneering work of Hoeffding (1963). The goal of this note is to extend this comparison to the case of non-uniform weights, using a coupling between samples drawn with and without replacement. When the items’ weights are arranged in the same order as their values, we show that the induced coupling for the cumulative values is a submartingale coupling. As a consequence, the powerful Chernoff-type upper-tail estimates known for sampling with replacement automatically transfer to the case of sampling without replacement. For general weights, we use the same coupling to establish a sub-Gaussian concentration inequality. As the sample size approaches the total number of items, the variance factor in this inequality displays the same kind of sharpening as Serfling (1974) identified in the case of uniform weights. We also construct another martingale coupling, which allows us to answer a question raised by Luh and Pippenger (2014) on sampling in Polya urns with different replacement numbers.
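
To make the objects in the abstract concrete, the following Python sketch shows one natural way to couple the two sampling schemes: draw an i.i.d. weighted sequence (sampling with replacement) and keep only the first occurrence of each item, which yields a weighted sample without replacement. The abstract does not spell out the construction used in the paper, so this particular coupling, the function names `coupled_samples` and `cumulative_value`, and the example weights and values are illustrative assumptions rather than the authors' definitions.

```python
import random
from typing import List, Sequence, Tuple


def coupled_samples(
    weights: Sequence[float], n: int, rng: random.Random
) -> Tuple[List[int], List[int]]:
    """Draw n item indices with and without replacement on the same randomness.

    Assumed construction (not taken from the abstract): an i.i.d. sequence with
    probabilities proportional to `weights` is generated; its first n terms form
    the with-replacement sample, and its first n distinct terms form the
    without-replacement sample.
    """
    if not 0 <= n <= len(weights):
        raise ValueError("n must lie between 0 and the number of items")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be strictly positive")

    items = range(len(weights))
    with_repl: List[int] = []
    without_repl: List[int] = []
    seen = set()
    while len(without_repl) < n:
        i = rng.choices(items, weights=weights, k=1)[0]
        if len(with_repl) < n:
            with_repl.append(i)       # first n draws, repeats allowed
        if i not in seen:
            seen.add(i)
            without_repl.append(i)    # first occurrences only
    return with_repl, without_repl


def cumulative_value(sample: Sequence[int], values: Sequence[float]) -> float:
    """Sum of the values attached to the sampled items."""
    return sum(values[i] for i in sample)


if __name__ == "__main__":
    rng = random.Random(0)
    weights = [1.0, 2.0, 3.0, 4.0]   # hypothetical weights
    values = [0.5, 1.0, 1.5, 2.0]    # hypothetical values, ordered like the weights
    x, y = coupled_samples(weights, 3, rng)
    print("with replacement:   ", x, cumulative_value(x, values))
    print("without replacement:", y, cumulative_value(y, values))
```

Repeating the experiment many times and comparing the empirical upper tails of the two cumulative values gives an informal numerical check of the comparison described above; it is not a substitute for the paper's proofs.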

Article information

Source
Braz. J. Probab. Stat., Volume 32, Number 3 (2018), 657-669.

Dates
Received: April 2016
Accepted: March 2017
First available in Project Euclid: 8 June 2018

Permanent link to this document
https://projecteuclid.org/euclid.bjps/1528444876

Digital Object Identifier
doi:10.1214/17-BJPS359

Mathematical Reviews number (MathSciNet)
MR3812386

Zentralblatt MATH identifier
06930043

Keywords
Weighted sampling; sampling without replacement; concentration inequalities; convex order; martingale coupling

Citation

Ben-Hamou, Anna; Peres, Yuval; Salez, Justin. Weighted sampling without replacement. Braz. J. Probab. Stat. 32 (2018), no. 3, 657–669. doi:10.1214/17-BJPS359. https://projecteuclid.org/euclid.bjps/1528444876


References

  • Alexander, K. S. (1989). A counterexample to a correlation inequality in finite sampling. The Annals of Statistics 17, 436–439.
  • Bardenet, R. and Maillard, O.-A. (2015). Concentration inequalities for sampling without replacement. Bernoulli 21, 1361–1385.
  • Bollobás, B. (1980). A probabilistic proof of an asymptotic formula for the number of labelled regular graphs. European Journal of Combinatorics 1, 311–316.
  • Bollobás, B. (1998). Modern Graph Theory. New York: Springer-Verlag.
  • Boucheron, S., Lugosi, G. and Massart, P. (2013). Concentration Inequalities. Oxford: Oxford University Press.
  • Gordon, L. (1983). Successive sampling in large finite populations. The Annals of Statistics 11, 702–706.
  • Hájek, J. (1960). Limiting distributions in simple random sampling from a finite population. Publications of the Mathematics Institute of the Hungarian Academy of Science 5, 361–374.
  • Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 58, 13–30.
  • Holst, L. (1973). Some limit theorems with applications in sampling theory. The Annals of Statistics 1, 644–658.
  • Horvitz, D. G. and Thompson, D. J. (1952). A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association 47, 663–685.
  • Joag-Dev, K. and Proschan, F. (1983). Negative association of random variables with applications. The Annals of Statistics 11, 286–295.
  • Luh, K. and Pippenger, N. (2014). Large-deviation bounds for sampling without replacement. American Mathematical Monthly 121, 449–454.
  • Müller, A. and Stoyan, D. (2002). Comparison Methods for Stochastic Models and Risks. Wiley Series in Probability and Statistics. Chichester: John Wiley & Sons, Ltd.
  • Pitman, J. and Tran, N. M. (2015). Size-biased permutation of a finite sequence with independent and identically distributed terms. Bernoulli 21, 2484–2512.
  • Pitman, J. and Yor, M. (1997). The two-parameter Poisson–Dirichlet distribution derived from a stable subordinator. The Annals of Probability 25, 855–900.
  • Rosén, B. (1972). Asymptotic theory for successive sampling with varying probabilities without replacement. I, II. Annals of Mathematical Statistics 43, 373–397; ibid. 43 (1972), 748–776.
  • Serfling, R. J. (1974). Probability inequalities for the sum in sampling without replacement. The Annals of Statistics 2, 39–48.
  • Shaked, M. and Shanthikumar, J. G. (2007). Stochastic Orders. Springer Series in Statistics. New York: Springer.
  • Strassen, V. (1965). The existence of probability measures with given marginals. Annals of Mathematical Statistics 36, 423–439.
  • Szekli, R. (1995). Stochastic Ordering and Dependence in Applied Probability. Lecture Notes in Statistics 97. New York: Springer.
  • Yu, Y. (2012). On the inclusion probabilities in some unequal probability sampling plans without replacement. Bernoulli 18, 279–289.