Electronic Journal of Statistics

Fast rates for empirical vector quantization

Clément Levrard



We consider the rate of convergence of the expected loss of empirically optimal vector quantizers. Earlier results show that, for any fixed probability distribution supported on a bounded set and satisfying some regularity conditions, the mean-squared expected distortion decreases at the rate $\mathcal{O}(\log n/n)$. We prove that this rate is actually $\mathcal{O}(1/n)$. Although these regularity conditions are hard to check directly, we show that well-clustered distributions with continuous densities supported on a bounded set fall within the scope of this result.
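For context, the quantities the abstract refers to can be written down explicitly. The following display uses standard vector-quantization notation (it is a sketch of the usual setup, not taken verbatim from the article): given i.i.d. observations $X_1, \dots, X_n \sim P$ on $\mathbb{R}^d$ and a codebook $c = (c_1, \dots, c_k)$, the distortion and its empirical counterpart are

```latex
% Distortion of a codebook c = (c_1, \dots, c_k) under the distribution P
R(c) = \mathbb{E}_{X \sim P}\, \min_{1 \le j \le k} \lVert X - c_j \rVert^2,
\qquad
\widehat{R}_n(c) = \frac{1}{n} \sum_{i=1}^{n} \min_{1 \le j \le k} \lVert X_i - c_j \rVert^2 .
% An empirically optimal quantizer \hat{c}_n minimizes \widehat{R}_n; the excess
% loss \mathbb{E}\, R(\hat{c}_n) - \min_c R(c) is the quantity whose rate is at issue.
```

The paper's claim, in this notation, is that the excess loss decreases at rate $\mathcal{O}(1/n)$ rather than $\mathcal{O}(\log n/n)$ under the stated regularity conditions.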

Article information

Electron. J. Statist., Volume 7 (2013), 1716-1746.

First available in Project Euclid: 3 July 2013


Keywords: quantization, clustering, localization, fast rates


Levrard, Clément. Fast rates for empirical vector quantization. Electron. J. Statist. 7 (2013), 1716--1746. doi:10.1214/13-EJS822. https://projecteuclid.org/euclid.ejs/1372861686



  • [1] András Antos. Improved minimax bounds on the test and training distortion of empirically designed vector quantizers. IEEE Trans. Inform. Theory, 51(11):4022–4032, 2005.
  • [2] András Antos, László Györfi, and András György. Individual convergence rates in empirical vector quantizer design. IEEE Trans. Inform. Theory, 51(11):4013–4022, 2005.
  • [3] Adrian Baddeley. Integrals on a moving manifold and geometrical probability. Advances in Appl. Probability, 9(3):588–603, 1977.
  • [4] Peter L. Bartlett, Tamás Linder, and Gábor Lugosi. The minimax distortion redundancy in empirical quantizer design. IEEE Trans. Inform. Theory, 44(5):1802–1813, 1998.
  • [5] Gérard Biau, Luc Devroye, and Gábor Lugosi. On the performance of clustering in Hilbert spaces. IEEE Trans. Inform. Theory, 54(2):781–790, 2008.
  • [6] Gilles Blanchard, Olivier Bousquet, and Pascal Massart. Statistical performance of support vector machines. Ann. Statist., 36(2):489–531, 2008.
  • [7] Olivier Bousquet. A Bennett concentration inequality and its application to suprema of empirical processes. C. R. Math. Acad. Sci. Paris, 334(6):495–500, 2002.
  • [8] Benoît Cadre and Quentin Paris. On Hölder fields clustering. TEST, 21(2):301–316, 2012.
  • [9] Philip A. Chou. The distortion of vector quantizers trained on $n$ vectors decreases to the optimum as $\mathcal{O}_p(1/n)$. In Proc. IEEE Int. Symp. Inf. Theory, page 457, Trondheim, Norway, 1994.
  • [10] Aurélie Fischer. Quantization and clustering with Bregman divergences. J. Multivariate Anal., 101(9):2207–2221, 2010.
  • [11] Allen Gersho and Robert M. Gray. Vector quantization and signal compression. Kluwer Academic Publishers, Norwell, MA, USA, 1991.
  • [12] Evarist Giné and Joel Zinn. Some limit theorems for empirical processes. Ann. Probab., 12(4):929–998, 1984. With discussion.
  • [13] Siegfried Graf and Harald Luschgy. Foundations of quantization for probability distributions, volume 1730 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2000.
  • [14] Stefan Junglen. Geometry of optimal codebooks and constructive quantization. PhD thesis, Universität Trier, Universitätsring 15, 54296 Trier, 2012.
  • [15] Vladimir Koltchinskii. Local Rademacher complexities and oracle inequalities in risk minimization. Ann. Statist., 34(6):2593–2656, 2006.
  • [16] Tamás Linder. Learning-theoretic methods in vector quantization. In Principles of nonparametric learning (Udine, 2001), volume 434 of CISM Courses and Lectures, pages 163–210. Springer, Vienna, 2002.
  • [17] Tamás Linder, Gábor Lugosi, and Kenneth Zeger. Rates of convergence in the source coding theorem, in empirical quantizer design, and in universal lossy source coding. IEEE Trans. Inform. Theory, 40(6):1728–1740, 1994.
  • [18] Pascal Massart. Concentration inequalities and model selection, volume 1896 of Lecture Notes in Mathematics. Springer, Berlin, 2007. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23, 2003, with a foreword by Jean Picard.
  • [19] Pascal Massart and Élodie Nédélec. Risk bounds for statistical learning. Ann. Statist., 34(5):2326–2366, 2006.
  • [20] Neri Merhav and Jacob Ziv. On the amount of statistical side information required for lossy data compression. IEEE Trans. Inform. Theory, 43(4):1112–1121, 1997.
  • [21] David Pollard. Strong consistency of $k$-means clustering. Ann. Statist., 9(1):135–140, 1981.
  • [22] David Pollard. A central limit theorem for empirical processes. J. Austral. Math. Soc. Ser. A, 33(2):235–248, 1982.
  • [23] David Pollard. A central limit theorem for $k$-means clustering. Ann. Probab., 10(4):919–926, 1982.
  • [24] David Pollard. Quantization and the method of $k$-means. IEEE Trans. Inform. Theory, 28(2):199–204, 1982.
  • [25] Thaddeus Tarpey. Principal points and self-consistent points of symmetric multivariate distributions. J. Multivariate Anal., 53(1):39–51, 1995.