The problem of multiple kernel learning based on penalized empirical risk minimization is discussed. The complexity penalty is determined jointly by the empirical L2 norms and the reproducing kernel Hilbert space (RKHS) norms induced by the kernels with a data-driven choice of regularization parameters. The main focus is on the case when the total number of kernels is large, but only a relatively small number of them is needed to represent the target function, so that the problem is sparse. The goal is to establish oracle inequalities for the excess risk of the resulting prediction rule showing that the method is adaptive both to the unknown design distribution and to the sparsity of the problem.
"Sparsity in multiple kernel learning." Ann. Statist. 38 (6) 3660 - 3695, December 2010. https://doi.org/10.1214/10-AOS825