The Annals of Statistics

Estimation of sums of random variables: Examples and information bounds

Cun-Hui Zhang


This paper concerns the estimation of sums of functions of observable and unobservable variables. Lower bounds for the asymptotic variance and a convolution theorem are derived in general finite- and infinite-dimensional models. An explicit relationship is established between efficient influence functions for the estimation of sums of variables and the estimation of their means. Certain “plug-in” estimators are proved to be asymptotically efficient in finite-dimensional models, while “u,v” estimators of Robbins are proved to be efficient in infinite-dimensional mixture models. Examples include certain species, network and data confidentiality problems.

Article information

Ann. Statist., Volume 33, Number 5 (2005), 2022-2041.

First available in Project Euclid: 25 November 2005

Primary: 62F10: Point estimation; 62F12: Asymptotic properties of estimators; 62G05: Estimation; 62G20: Asymptotic properties
Secondary: 62F15: Bayesian inference

Keywords: Empirical Bayes; sum of variables; utility; efficient estimation; information bound; influence function; species problem; networks; node degree; data confidentiality; disclosure risk


Zhang, Cun-Hui. Estimation of sums of random variables: Examples and information bounds. Ann. Statist. 33 (2005), no. 5, 2022--2041. doi:10.1214/009053605000000390.


  • Benedetti, R. and Franconi, L. (1998). Statistical and technological solutions for controlled data dissemination. In Pre-proceedings of New Techniques and Technologies for Statistics, Sorrento 1 225--232.
  • Bethlehem, J., Keller, W. and Pannekoek, J. (1990). Disclosure control of microdata. J. Amer. Statist. Assoc. 85 38--45.
  • Bickel, P. J., Klaassen, C. A. J., Ritov, Y. and Wellner, J. A. (1993). Efficient and Adaptive Estimation for Semiparametric Models. Johns Hopkins Univ. Press, Baltimore.
  • Bunge, J. and Fitzpatrick, M. (1993). Estimating the number of species: A review. J. Amer. Statist. Assoc. 88 364--373.
  • Chao, A. (1984). Nonparametric estimation of the number of classes in a population. Scand. J. Statist. 11 265--270.
  • Chao, A. and Bunge, J. (2002). Estimating the number of species in a stochastic abundance model. Biometrics 58 531--539.
  • Clauset, A. and Moore, C. (2003). Traceroute sampling makes random graphs appear to have power law degree. Preprint.
  • Coates, A., Hero, A., Nowak, R. and Yu, B. (2002). Internet tomography. IEEE Signal Processing Magazine 19(3) 47--65.
  • Darroch, J. N. and Ratcliff, D. (1980). A note on capture--recapture estimation. Biometrics 36 149--153.
  • Duncan, G. T. and Pearson, R. W. (1991). Enhancing access to microdata while protecting confidentiality: Prospects for the future (with discussion). Statist. Sci. 6 219--239.
  • Engen, S. (1974). On species frequency models. Biometrika 61 263--270.
  • Faloutsos, M., Faloutsos, P. and Faloutsos, C. (1999). On power-law relationships of the Internet topology. In Proc. ACM SIGCOMM 1999 251--262. ACM Press, New York.
  • Fisher, R. A., Corbet, A. S. and Williams, C. B. (1943). The relation between the number of species and the number of individuals in a random sample of an animal population. J. Animal Ecology 12 42--58.
  • Good, I. J. (1953). The population frequencies of species and the estimation of population parameters. Biometrika 40 237--264.
  • Govindan, R. and Tangmunarunkit, H. (2000). Heuristics for Internet map discovery. In Proc. IEEE INFOCOM 2000 3 1371--1380. IEEE Press, New York.
  • Lakhina, A., Byers, J., Crovella, M. and Xie, P. (2003). Sampling biases in IP topology measurements. In Proc. IEEE INFOCOM 2003 1 332--341. IEEE Press, New York.
  • Pfanzagl, J. (with the assistance of W. Wefelmeyer) (1982). Contributions to a General Asymptotic Statistical Theory. Lecture Notes in Statist. 13. Springer, New York.
  • Polettini, S. and Seri, G. (2003). Guidelines for the protection of social micro-data using individual risk methodology. Application within $\mu$-Argus version 3.2, CASC Project Deliverable No. 1.2-D3. Available at
  • Rao, C. R. (1971). Some comments on the logarithmic series distribution in the analysis of insect trap data. In Statistical Ecology (G. P. Patil, E. C. Pielou and W. E. Waters, eds.) 1 131--142. Pennsylvania State Univ. Press, University Park.
  • Rieder, H. (2000). One-sided confidence about functionals over tangent cones. Available at
  • Rinott, Y. (2003). On models for statistical disclosure risk estimation. Working paper no. 16, Joint ECE/Eurostat Work Session on Data Confidentiality, Luxemburg, 2003. Available at
  • Robbins, H. (1977). Prediction and estimation for the compound Poisson distribution. Proc. Natl. Acad. Sci. U.S.A. 74 2670--2671.
  • Robbins, H. (1980). An empirical Bayes estimation problem. Proc. Natl. Acad. Sci. U.S.A. 77 6988--6989.
  • Robbins, H. (1988). The $u,v$ method of estimation. In Statistical Decision Theory and Related Topics IV (S. S. Gupta and J. O. Berger, eds.) 1 265--270. Springer, New York.
  • Robbins, H. and Zhang, C.-H. (1988). Estimating a treatment effect under biased sampling. Proc. Natl. Acad. Sci. U.S.A. 85 3670--3672.
  • Robbins, H. and Zhang, C.-H. (1989). Estimating the superiority of a drug to a placebo when all and only those patients at risk are treated with the drug. Proc. Natl. Acad. Sci. U.S.A. 86 3003--3005.
  • Robbins, H. and Zhang, C.-H. (1991). Estimating a multiplicative treatment effect under biased allocation. Biometrika 78 349--354.
  • Robbins, H. and Zhang, C.-H. (2000). Efficiency of the $u,v$ method of estimation. Proc. Natl. Acad. Sci. U.S.A. 97 12,976--12,979.
  • Sampford, M. R. (1955). The truncated negative binomial distribution. Biometrika 42 58--69.
  • Spring, N., Mahajan, R. and Wetherall, D. (2002). Measuring ISP topologies with rocketfuel. In Proc. ACM SIGCOMM 2002 133--145. ACM Press, New York.
  • van der Vaart, A. W. (1998). Asymptotic Statistics. Cambridge Univ. Press.
  • Vardi, Y. (1996). Network tomography: Estimating source-destination traffic intensities from link data. J. Amer. Statist. Assoc. 91 365--377.