Prediction of dynamical time series with additive noise using support vector machines or kernel-based regression is consistent for certain classes of discrete dynamical systems. Consistency implies that these methods are effective at computing the expected value of a point at a future time given the present coordinates. However, the present coordinates are themselves noisy, so these methods are not necessarily effective at removing noise. In this article, we consider denoising and prediction as separate problems for flows, as opposed to discrete-time dynamical systems, and show that smooth splines are more effective at removing noise. Combining smooth splines with kernel-based regression yields predictors that are typically more accurate on benchmarks by a factor of 2 or more. We prove that kernel-based regression in combination with smooth splines converges to the exact predictor for time series extracted from any compact invariant set of any sufficiently smooth flow. As a consequence of convergence, one can find examples where the combination of kernel-based regression with smooth splines is superior by a factor of as much as $100$. The predictors that we analyze and compute operate on delay coordinate data and not on the full state vector, which is typically not observable.
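The two-stage idea described above (denoise first, then regress on delay coordinates) can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made for illustration: a Whittaker smoother (a second-difference penalty, the discrete analogue of a smoothing spline) stands in for the spline, kernel ridge regression with a Gaussian kernel stands in for kernel-based regression, and the sine signal, the delay dimension `d`, and the parameters `lam`, `sigma`, and `ridge` are hypothetical choices.

```python
import numpy as np

def whittaker_smooth(y, lam):
    """Minimize ||z - y||^2 + lam * ||D z||^2, with D the second-difference operator.
    Assumption: used here as a discrete stand-in for a smoothing spline."""
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)  # (n-2) x n second differences
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

def delay_vectors(x, d):
    """Row k is (x[k], ..., x[k+d-1]); its regression target is x[k+d]."""
    return np.stack([x[i:len(x) - d + i] for i in range(d)], axis=1)

def gaussian_kernel(A, B, sigma):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2 * sigma ** 2))

def fit_krr(X, y, sigma, ridge):
    """Kernel ridge regression; returns a function mapping delay vectors to predictions."""
    K = gaussian_kernel(X, X, sigma)
    alpha = np.linalg.solve(K + ridge * np.eye(len(X)), y)
    return lambda Z: gaussian_kernel(Z, X, sigma) @ alpha

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

# Hypothetical test signal: a sampled sine wave with additive Gaussian noise.
rng = np.random.default_rng(0)
t = 0.05 * np.arange(200)
clean = np.sin(t)
noisy = clean + 0.1 * rng.standard_normal(t.size)

# Stage 1: denoise the whole series.
denoised = whittaker_smooth(noisy, lam=100.0)

# Stage 2: one-step prediction from denoised delay coordinates.
d, n_train = 3, 150
X, y = delay_vectors(denoised, d), denoised[d:]
predict = fit_krr(X[:n_train], y[:n_train], sigma=1.0, ridge=1e-2)
pred = predict(X[n_train:])
truth = clean[d:][n_train:]  # noise-free values at the predicted times

print("RMSE of noisy signal:   ", rmse(noisy, clean))
print("RMSE of denoised signal:", rmse(denoised, clean))
print("RMSE of prediction:     ", rmse(pred, truth))
```

Keeping the two stages separate means the regression is trained on inputs and targets that are already close to the underlying smooth trajectory, which is the motivation the abstract gives for treating denoising and prediction as distinct problems.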
Electron. J. Statist. 12(2): 2217-2237 (2018). DOI: 10.1214/18-EJS1429