Electronic Journal of Statistics

Online learning for changing environments using coin betting

Kwang-Sung Jun, Francesco Orabona, Stephen Wright, and Rebecca Willett

Full-text: Open access


A key challenge in online learning is that classical algorithms can be slow to adapt to changing environments. Recent studies have proposed “meta” algorithms that convert any online learning algorithm into one that adapts to changing environments, where the adaptivity is measured by a quantity called the strongly-adaptive regret. This paper describes a new meta algorithm whose strongly-adaptive regret bound is a factor of $\sqrt{\log(T)}$ better than that of other algorithms with the same time complexity, where $T$ is the time horizon. We also extend our algorithm to achieve, for the first time to our knowledge, a first-order (i.e., dependent on the observed losses) strongly-adaptive regret bound. At the heart of our approach is a new parameter-free algorithm for the learning with expert advice (LEA) problem in which experts sometimes do not output advice for consecutive time steps (i.e., sleeping experts). This algorithm is derived via a reduction from optimal algorithms for the so-called coin betting problem. Empirical results show that our algorithm outperforms state-of-the-art methods in both learning with expert advice and metric learning scenarios.
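The coin-betting reduction mentioned in the abstract can be illustrated with a minimal sketch of the (non-sleeping) coin-betting scheme for learning with expert advice of Orabona and Pál [16], using the Krichevsky–Trofimov (KT) betting fraction [12]. This is an illustrative simplification under a uniform prior, not the sleeping-expert algorithm contributed by the paper:

```python
def kt_expert_aggregation(expert_losses):
    """Aggregate expert predictions with per-expert Krichevsky-Trofimov
    (KT) coin bettors, following the coin-betting reduction for learning
    with expert advice (illustrative sketch: uniform prior, no sleeping).

    expert_losses: list of rounds, each a list of expert losses in [0, 1].
    Returns the learner's loss at each round.
    """
    n = len(expert_losses[0])
    wealth = [1.0] * n        # each bettor starts with unit wealth
    grad_sum = [0.0] * n      # running sum of signed coin outcomes
    learner_losses = []
    for t, losses in enumerate(expert_losses, start=1):
        # KT betting fraction: (sum of past outcomes) / t, always in (-1, 1)
        beta = [grad_sum[i] / t for i in range(n)]
        # Bet a fraction of current wealth; negative bets are truncated to 0
        bets = [max(beta[i] * wealth[i], 0.0) for i in range(n)]
        total = sum(bets)
        # Play weights proportional to the bets (uniform if no one bets)
        p = [b / total for b in bets] if total > 0 else [1.0 / n] * n
        learner_loss = sum(p[i] * losses[i] for i in range(n))
        learner_losses.append(learner_loss)
        for i in range(n):
            # Coin outcome = instantaneous regret against expert i
            g = max(-1.0, min(1.0, learner_loss - losses[i]))
            wealth[i] *= 1.0 + beta[i] * g
            grad_sum[i] += g
    return learner_losses
```

Bettors for experts that accumulate positive regret against the learner see their wealth grow, which increases those experts' weights; the paper builds its sleeping-expert algorithm and strongly-adaptive meta algorithm on top of this primitive.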

Article information

Electron. J. Statist., Volume 11, Number 2 (2017), 5282-5310.

Received: June 2017
First available in Project Euclid: 15 December 2017


Subjects — Primary: 68T05: Learning and adaptive systems [See also 68Q32, 91E40]

Keywords: Online learning; changing environments; learning with expert advice; online convex optimization

Creative Commons Attribution 4.0 International License.


Jun, Kwang-Sung; Orabona, Francesco; Wright, Stephen; Willett, Rebecca. Online learning for changing environments using coin betting. Electron. J. Statist. 11 (2017), no. 2, 5282--5310. doi:10.1214/17-EJS1379SI. https://projecteuclid.org/euclid.ejs/1513306874



  • [1] Adamskiy, Dmitry, Koolen, Wouter M., Chernov, Alexey, and Vovk, Vladimir. A Closer Look at Adaptive Regret. In Proceedings of the International Conference on Algorithmic Learning Theory (ALT), pp. 290–304, 2012.
  • [2] Blum, Avrim. Empirical Support for Winnow and Weighted-Majority Algorithms: Results on a Calendar Scheduling Domain. Machine Learning, 26(1):5–23, 1997.
  • [3] Cesa-Bianchi, Nicolò and Lugosi, Gábor. Prediction, Learning, and Games. Cambridge University Press, 2006.
  • [4] Cesa-Bianchi, Nicolò, Gaillard, Pierre, Lugosi, Gábor, and Stoltz, Gilles. Mirror descent meets fixed share (and feels no regret). In Advances in Neural Information Processing Systems (NIPS), pp. 980–988, 2012.
  • [5] Daniely, Amit, Gonen, Alon, and Shalev-Shwartz, Shai. Strongly Adaptive Online Learning. In Proceedings of the International Conference on Machine Learning (ICML), pp. 1–18, 2015.
  • [6] Freund, Yoav, Schapire, Robert E., Singer, Yoram, and Warmuth, Manfred K. Using and combining predictors that specialize. Proceedings of the ACM Symposium on Theory of Computing (STOC), 37(3):334–343, 1997.
  • [7] Greenewald, Kristjan, Kelley, Stephen, and Hero, Alfred O. Dynamic metric learning from pairwise comparisons. In 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton), 2016.
  • [8] György, András, Linder, Tamás, and Lugosi, Gábor. Efficient tracking of large classes of experts. IEEE Transactions on Information Theory, 58(11):6709–6725, 2012.
  • [9] Hazan, Elad and Seshadhri, Comandur. Adaptive Algorithms for Online Decision Problems. IBM Research Report, 10418:1–19, 2007.
  • [10] Herbster, Mark and Warmuth, Manfred K. Tracking the Best Expert. Machine Learning, 32(2):151–178, 1998.
  • [11] Jun, Kwang-Sung, Orabona, Francesco, Wright, Stephen, and Willett, Rebecca. Improved Strongly Adaptive Online Learning using Coin Betting. In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), volume 54, pp. 943–951, 2017.
  • [12] Krichevsky, Raphail E. and Trofimov, Victor K. The performance of universal encoding. IEEE Transactions on Information Theory, 27(2):199–206, 1981.
  • [13] Kunapuli, Gautam and Shavlik, Jude. Mirror descent for metric learning: A unified approach. In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD), pp. 859–874, 2012.
  • [14] Luo, Haipeng and Schapire, Robert E. Achieving All with No Parameters: AdaNormalHedge. In Proceedings of the Conference on Learning Theory (COLT), pp. 1286–1304, 2015.
  • [15] Orabona, Francesco, Crammer, Koby, and Cesa-Bianchi, Nicolò. A generalized online mirror descent with applications to classification and regression. Machine Learning, 99(3):411–435, 2015.
  • [16] Orabona, Francesco and Pál, David. Coin betting and parameter-free online learning. In Advances in Neural Information Processing Systems (NIPS), pp. 577–585, 2016.
  • [17] Orabona, Francesco and Tommasi, Tatiana. Backprop without learning rates through coin betting. In Advances in Neural Information Processing Systems (NIPS), 2017.
  • [18] Shalev-Shwartz, Shai. Online Learning: Theory, Algorithms, and Applications. PhD thesis, Hebrew University, 2007.
  • [19] Shalev-Shwartz, Shai. Online Learning and Online Convex Optimization. Foundations and Trends in Machine Learning, 4(2):107–194, 2012.
  • [20] Veness, Joel, White, Martha, Bowling, Michael, and György, András. Partition tree weighting. In Proceedings of the 2013 Data Compression Conference, pp. 321–330. IEEE Computer Society, 2013.
  • [21] Zhang, Lijun, Yang, Tianbao, Jin, Rong, and Zhou, Zhi-Hua. Strongly adaptive regret implies optimally dynamic regret. CoRR, abs/1701.07570, 2017.