We connect high-dimensional subset selection and submodular maximization. Our results extend the work of Das and Kempe [In ICML (2011) 1057–1064] from the setting of linear regression to arbitrary objective functions. For greedy feature selection, this connection allows us to obtain strong multiplicative performance bounds on several methods without statistical modeling assumptions. We also derive recovery guarantees of this form under standard assumptions. Our work shows that greedy algorithms perform within a constant factor of the best possible subset-selection solution for a broad class of general objective functions. Our methods allow direct control over the number of selected features, as opposed to regularization parameters that control sparsity only implicitly. Our proof technique uses the concept of weak submodularity initially defined by Das and Kempe. We draw a connection between convex analysis and submodular set function theory, which may be of independent interest for other statistical learning applications that have combinatorial structure.
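The greedy procedure analyzed here can be illustrated as standard forward selection against a generic set-function objective. The sketch below is an assumption-laden toy, not the paper's code: the function names, the synthetic data, and the use of an (uncentered) R² objective for linear regression are all illustrative choices made for this example.

```python
import numpy as np

def greedy_subset_selection(f, ground_set, k):
    """Forward greedy selection: repeatedly add the element
    with the largest marginal gain f(S + {e}) - f(S)."""
    S = []
    for _ in range(k):
        gains = {e: f(S + [e]) - f(S) for e in ground_set if e not in S}
        S.append(max(gains, key=gains.get))
    return S

# Illustrative synthetic data: only columns 0 and 3 carry signal.
rng = np.random.default_rng(0)
n, d = 100, 8
X = rng.standard_normal((n, d))
y = X[:, 0] + 0.5 * X[:, 3] + 0.1 * rng.standard_normal(n)

def r_squared(cols):
    """Uncentered R^2 of a least-squares fit on the chosen columns."""
    if not cols:
        return 0.0
    Xs = X[:, cols]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ beta
    return 1.0 - (resid @ resid) / (y @ y)

selected = greedy_subset_selection(r_squared, list(range(d)), k=2)
```

In the linear-regression setting of Das and Kempe, the submodularity ratio of this R² objective is what quantifies how far greedy can fall short of the best size-k subset; on well-conditioned data like this toy example, greedy recovers the relevant columns.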
"Restricted strong convexity implies weak submodularity." Ann. Statist. 46(6B): 3539–3568, December 2018. https://doi.org/10.1214/17-AOS1679