Electronic Journal of Statistics

Online learning for changing environments using coin betting

Abstract

A key challenge in online learning is that classical algorithms can be slow to adapt to changing environments. Recent studies have proposed “meta” algorithms that convert any online learning algorithm into one that is adaptive to changing environments, where the adaptivity is measured by a quantity called the strongly-adaptive regret. This paper describes a new meta algorithm whose strongly-adaptive regret bound is a factor of $\sqrt{\log (T)}$ better than those of other algorithms with the same time complexity, where $T$ is the time horizon. We also extend our algorithm to achieve, to our knowledge for the first time, a first-order (i.e., dependent on the observed losses) strongly-adaptive regret bound. At the heart of our approach is a new parameter-free algorithm for the learning with expert advice (LEA) problem in which experts sometimes do not output advice for consecutive time steps (i.e., sleeping experts). This algorithm is derived by a reduction from optimal algorithms for the so-called coin betting problem. Empirical results show that our algorithm outperforms state-of-the-art methods in both learning with expert advice and metric learning scenarios.
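To illustrate the coin-betting machinery the abstract refers to, the following is a minimal, illustrative sketch (our own code, not the paper's implementation) of the Krichevsky–Trofimov (KT) bettor that underlies such parameter-free algorithms: at round $t$ the bettor wagers a signed fraction of its current wealth equal to the average of the past coin outcomes, and the accumulated wealth can then be converted into predictions and regret guarantees.

```python
def kt_bets(outcomes, initial_wealth=1.0):
    """Run the Krichevsky-Trofimov (KT) bettor on outcomes in [-1, 1].

    At round t the bettor wagers the fraction
        beta_t = (c_1 + ... + c_{t-1}) / t
    of its current wealth on the next outcome c_t, so the wealth
    update is  W_t = W_{t-1} * (1 + beta_t * c_t).

    Returns the wealth after each round. Since |beta_t| < 1 and
    |c_t| <= 1, the wealth always stays strictly positive.
    """
    wealth = initial_wealth
    past_sum = 0.0  # running sum of observed outcomes
    history = []
    for t, c in enumerate(outcomes, start=1):
        beta = past_sum / t          # KT betting fraction
        wealth *= 1.0 + beta * c     # win or lose the wagered amount
        past_sum += c
        history.append(wealth)
    return history
```

On a biased sequence (e.g., mostly $+1$ outcomes) the wealth grows quickly, which is the mechanism that yields parameter-free regret bounds via the reduction mentioned in the abstract; the function and variable names here are illustrative only.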

Article information

Source
Electron. J. Statist. Volume 11, Number 2 (2017), 5282-5310.

Dates
First available in Project Euclid: 15 December 2017

https://projecteuclid.org/euclid.ejs/1513306874

Digital Object Identifier
doi:10.1214/17-EJS1379SI

Zentralblatt MATH identifier
06825047

Citation

Jun, Kwang-Sung; Orabona, Francesco; Wright, Stephen; Willett, Rebecca. Online learning for changing environments using coin betting. Electron. J. Statist. 11 (2017), no. 2, 5282--5310. doi:10.1214/17-EJS1379SI. https://projecteuclid.org/euclid.ejs/1513306874
