Electronic Journal of Statistics

Fast rates for empirical vector quantization

Clément Levrard

Full-text: Open access

Abstract

We consider the rate of convergence of the expected loss of empirically optimal vector quantizers. Earlier results show that the mean-squared expected distortion for any fixed probability distribution supported on a bounded set and satisfying some regularity conditions decreases at the rate $\mathcal{O}(\log n/n)$. We prove that this rate is actually $\mathcal{O}(1/n)$. Although these conditions are hard to check, we show that well-clustered distributions with continuous densities supported on a bounded set are included in the scope of this result.
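The central object here, the empirical distortion of a $k$-point quantizer (codebook), can be made concrete with a small sketch. The snippet below is an illustration only, not the paper's method: it computes the empirical distortion of a codebook and approximates the empirically optimal quantizer with Lloyd's algorithm ($k$-means) under a deterministic farthest-point initialization; the names `k_means`, `distortion`, and `_sqdist` are hypothetical helpers introduced for this example.

```python
def _sqdist(x, c):
    # squared Euclidean distance in R^d
    return sum((a - b) ** 2 for a, b in zip(x, c))


def distortion(sample, codebook):
    # empirical distortion: mean squared distance from each sample point
    # to its nearest codepoint
    return sum(min(_sqdist(x, c) for c in codebook) for x in sample) / len(sample)


def k_means(sample, k, n_iter=50):
    """Approximate the empirically optimal k-point quantizer via Lloyd's
    algorithm (the exact minimizer is intractable to compute in general).

    Uses deterministic farthest-point initialization, then alternates
    between assigning points to their nearest codepoint and recentering
    each codepoint at the mean of its cell.
    """
    codebook = [sample[0]]
    while len(codebook) < k:
        # add the point farthest from the current codebook
        codebook.append(max(sample, key=lambda x: min(_sqdist(x, c) for c in codebook)))
    for _ in range(n_iter):
        cells = [[] for _ in range(k)]
        for x in sample:
            j = min(range(k), key=lambda j: _sqdist(x, codebook[j]))
            cells[j].append(x)
        # recenter; keep the old codepoint if a cell is empty
        codebook = [
            tuple(sum(coord) / len(cell) for coord in zip(*cell)) if cell else codebook[j]
            for j, cell in enumerate(cells)
        ]
    return codebook
```

On a "well-clustered" sample (two tight, well-separated groups, the favorable regime the abstract describes), the fitted codebook places one codepoint per group and the empirical distortion is driven by the within-group spread.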

Article information

Source
Electron. J. Statist., Volume 7 (2013), 1716-1746.

Dates
First available in Project Euclid: 3 July 2013

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1372861686

Digital Object Identifier
doi:10.1214/13-EJS822

Mathematical Reviews number (MathSciNet)
MR3080408

Zentralblatt MATH identifier
1349.62038

Keywords
Quantization, clustering, localization, fast rates

Citation

Levrard, Clément. Fast rates for empirical vector quantization. Electron. J. Statist. 7 (2013), 1716–1746. doi:10.1214/13-EJS822. https://projecteuclid.org/euclid.ejs/1372861686



References

  • [1] András Antos. Improved minimax bounds on the test and training distortion of empirically designed vector quantizers. IEEE Trans. Inform. Theory, 51(11):4022–4032, 2005.
  • [2] András Antos, László Györfi, and András György. Individual convergence rates in empirical vector quantizer design. IEEE Trans. Inform. Theory, 51(11):4013–4022, 2005.
  • [3] Adrian Baddeley. Integrals on a moving manifold and geometrical probability. Advances in Appl. Probability, 9(3):588–603, 1977.
  • [4] Peter L. Bartlett, Tamás Linder, and Gábor Lugosi. The minimax distortion redundancy in empirical quantizer design. IEEE Trans. Inform. Theory, 44(5):1802–1813, 1998.
  • [5] Gérard Biau, Luc Devroye, and Gábor Lugosi. On the performance of clustering in Hilbert spaces. IEEE Trans. Inform. Theory, 54(2):781–790, 2008.
  • [6] Gilles Blanchard, Olivier Bousquet, and Pascal Massart. Statistical performance of support vector machines. Ann. Statist., 36(2):489–531, 2008.
  • [7] Olivier Bousquet. A Bennett concentration inequality and its application to suprema of empirical processes. C. R. Math. Acad. Sci. Paris, 334(6):495–500, 2002.
  • [8] Benoît Cadre and Quentin Paris. On Hölder fields clustering. TEST, 21(2):301–316, 2012.
  • [9] Philip A. Chou. The distortion of vector quantizers trained on $n$ vectors decreases to the optimum as $\mathcal{O}_p(1/n)$. In Proc. IEEE Int. Symp. Inf. Theory, page 457, Trondheim, Norway, 1994.
  • [10] Aurélie Fischer. Quantization and clustering with Bregman divergences. J. Multivariate Anal., 101(9):2207–2221, 2010.
  • [11] Allen Gersho and Robert M. Gray. Vector quantization and signal compression. Kluwer Academic Publishers, Norwell, MA, USA, 1991.
  • [12] Evarist Giné and Joel Zinn. Some limit theorems for empirical processes. Ann. Probab., 12(4):929–998, 1984. With discussion.
  • [13] Siegfried Graf and Harald Luschgy. Foundations of quantization for probability distributions, volume 1730 of Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2000.
  • [14] Stefan Junglen. Geometry of optimal codebooks and constructive quantization. PhD thesis, Universität Trier, Universitätsring 15, 54296 Trier, 2012.
  • [15] Vladimir Koltchinskii. Local Rademacher complexities and oracle inequalities in risk minimization. Ann. Statist., 34(6):2593–2656, 2006.
  • [16] Tamás Linder. Learning-theoretic methods in vector quantization. In Principles of nonparametric learning (Udine, 2001), volume 434 of CISM Courses and Lectures, pages 163–210. Springer, Vienna, 2002.
  • [17] Tamás Linder, Gábor Lugosi, and Kenneth Zeger. Rates of convergence in the source coding theorem, in empirical quantizer design, and in universal lossy source coding. IEEE Trans. Inform. Theory, 40(6):1728–1740, 1994.
  • [18] Pascal Massart. Concentration inequalities and model selection, volume 1896 of Lecture Notes in Mathematics. Springer, Berlin, 2007. Lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6–23, 2003. With a foreword by Jean Picard.
  • [19] Pascal Massart and Élodie Nédélec. Risk bounds for statistical learning. Ann. Statist., 34(5):2326–2366, 2006.
  • [20] Neri Merhav and Jacob Ziv. On the amount of statistical side information required for lossy data compression. IEEE Trans. Inform. Theory, 43(4):1112–1121, 1997.
  • [21] David Pollard. Strong consistency of $k$-means clustering. Ann. Statist., 9(1):135–140, 1981.
  • [22] David Pollard. A central limit theorem for empirical processes. J. Austral. Math. Soc. Ser. A, 33(2):235–248, 1982.
  • [23] David Pollard. A central limit theorem for $k$-means clustering. Ann. Probab., 10(4):919–926, 1982.
  • [24] David Pollard. Quantization and the method of $k$-means. IEEE Trans. Inform. Theory, 28(2):199–204, 1982.
  • [25] Thaddeus Tarpey. Principal points and self-consistent points of symmetric multivariate distributions. J. Multivariate Anal., 53(1):39–51, 1995.