It is known that deterministic automata are generally not optimal in the problem of learning with finite memory. It is natural to ask how much memory is saved by randomization. In this note it is shown that the memory saving is arbitrarily large in the sense that for any memory size $m < \infty$, and $\delta > 0$, there exist problems such that all $m$-state deterministic algorithms have probability of error $P(e) \geqq \frac{1}{2} - \delta$, while the optimal two-state randomized algorithm has $P(e) \leqq \delta$.
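The claim can be restated compactly as follows. This is only a sketch of the statement in the abstract, not the formulation used in the note itself. The symbols $P^*_{\mathrm{det}}(m)$ and $P^*_{\mathrm{rand}}(2)$ are introduced here as assumptions: they denote the infimum of $P(e)$ over all $m$-state deterministic algorithms and over all two-state randomized algorithms, respectively, for a fixed problem.

$$
\forall\, m < \infty,\ \forall\, \delta > 0,\ \exists \text{ a problem such that}\quad
P^*_{\mathrm{det}}(m) \geqq \tfrac{1}{2} - \delta
\quad\text{and}\quad
P^*_{\mathrm{rand}}(2) \leqq \delta .
$$

Subtracting the two bounds gives, for such a problem,

$$
P^*_{\mathrm{det}}(m) - P^*_{\mathrm{rand}}(2) \geqq \left(\tfrac{1}{2} - \delta\right) - \delta = \tfrac{1}{2} - 2\delta ,
$$

so the gap in error probability approaches $\tfrac{1}{2}$ as $\delta \to 0$, regardless of how large $m$ is taken.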
Martin E. Hellman, Thomas M. Cover. "On Memory Saved by Randomization." Ann. Math. Statist. 42 (3) 1075-1078, June 1971. https://doi.org/10.1214/aoms/1177693334