Electronic Journal of Statistics

Auxiliary information: the raking-ratio empirical process

Mickael Albertus and Philippe Berthet

Full-text: Open access


We study the empirical measure associated with a sample of size $n$ and modified by $N$ iterations of the raking-ratio method. At each step, this empirical measure is adjusted to match the true probabilities of the sets in a finite partition, and the partition changes from step to step. We establish asymptotic properties of the raking-ratio empirical process indexed by functions as $n\rightarrow +\infty$, for fixed $N$. We study nonasymptotic properties by means of a Gaussian approximation, which yields uniform Berry-Esseen type bounds depending on $n$ and $N$ and provides estimates of the uniform quadratic risk reduction. A closed-form expression for the limiting covariance matrices is derived as $N\rightarrow +\infty$. In the two-way contingency table case, the limiting process has a simple explicit formula.
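As an illustration of the adjustment described in the abstract, the following minimal sketch applies the classical raking-ratio (iterative proportional fitting) update to the weights of a sample: at each step, the mass of every point in a cell $A_j$ of the current partition is multiplied by $P(A_j)$ divided by the current raked mass of $A_j$. The function name `rake_weights`, its arguments, and the toy data are assumptions made for this example and are not taken from the article; the sketch also assumes that every cell with positive target probability contains at least one observation.

```python
import numpy as np


def rake_weights(labels_per_step, target_probs_per_step, n):
    """Return the point masses after applying each raking step in order.

    labels_per_step[k][i] is the index of the partition cell containing
    observation i at step k; target_probs_per_step[k][j] is the known
    probability P(A_j) of cell j at step k. (Hypothetical interface.)
    """
    w = np.full(n, 1.0 / n)  # plain empirical measure: mass 1/n per point
    for labels, target in zip(labels_per_step, target_probs_per_step):
        labels = np.asarray(labels)
        target = np.asarray(target, dtype=float)
        # Current raked mass of every cell of this step's partition.
        cell_mass = np.bincount(labels, weights=w, minlength=len(target))
        if np.any(cell_mass[target > 0] == 0):
            raise ValueError("a cell with positive target probability is empty")
        # Ratio P(A_j) / (current mass of A_j); left at 0 for empty cells.
        ratio = np.divide(target, cell_mass,
                          out=np.zeros_like(target), where=cell_mass > 0)
        w = w * ratio[labels]  # rescale each point by its cell's ratio
    return w


# Toy two-way table: alternate between the row and the column partition.
rows = np.array([0, 0, 0, 0, 0, 1, 1, 1])
cols = np.array([0, 0, 1, 1, 2, 0, 1, 2])
row_p = np.array([0.5, 0.5])       # assumed known row probabilities
col_p = np.array([0.3, 0.3, 0.4])  # assumed known column probabilities

N = 6  # number of raking iterations
labels = [rows if k % 2 == 0 else cols for k in range(N)]
targets = [row_p if k % 2 == 0 else col_p for k in range(N)]
w = rake_weights(labels, targets, n=len(rows))

# Margins of the raked measure over the last partition used (columns).
raked_col_margins = np.bincount(cols, weights=w, minlength=len(col_p))
```

In this sketch the margins over the partition used at the last step match their targets exactly, while the margins over the other partition match only approximately for finite $N$.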

Article information

Electron. J. Statist., Volume 13, Number 1 (2019), 120-165.

Received: March 2018
First available in Project Euclid: 4 January 2019


Primary: 62G30 (Order statistics; empirical distribution functions); 62G20 (Asymptotic properties)
Secondary: 60F05 (Central limit and other weak theorems); 60F17 (Functional limit theorems; invariance principles)

Keywords: raking-ratio method; empirical processes; strong approximation; nonparametric statistics; auxiliary information; Sinkhorn algorithm

Creative Commons Attribution 4.0 International License.


Albertus, Mickael; Berthet, Philippe. Auxiliary information: the raking-ratio empirical process. Electron. J. Statist. 13 (2019), no. 1, 120--165. doi:10.1214/18-EJS1526. https://projecteuclid.org/euclid.ejs/1546570944


