Density-sensitive semisupervised inference

Martin Azizyan, Aarti Singh, and Larry Wasserman
Source: Ann. Statist. Volume 41, Number 2 (2013), 751-771.

Abstract

Semisupervised methods are techniques for using labeled data $(X_{1},Y_{1}),\ldots,(X_{n},Y_{n})$ together with unlabeled data $X_{n+1},\ldots,X_{N}$ to make predictions. These methods invoke assumptions that link the marginal distribution $P_{X}$ of $X$ to the regression function $f(x)$. For example, it is common to assume that $f$ is very smooth over high-density regions of $P_{X}$. Many of these methods are ad hoc and have been shown to work in specific examples, but they lack a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution $P_{X}$. Our model includes a parameter $\alpha$ that controls the strength of the semisupervised assumption. We then use the data to adapt to $\alpha$.
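
The density-sensitive idea in the abstract can be illustrated with a small sketch. The Python code below is not the estimator analyzed in the paper; it is a hypothetical graph-based approximation in the spirit of density-based distances (cf. Sajama and Orlitsky, 2005; Bijral, Ratliff and Srebro, 2011): edges between labeled, unlabeled and test points are weighted by Euclidean length discounted where the estimated density of $P_{X}$ is high, with a tuning parameter alpha playing the role of the strength parameter $\alpha$ in the abstract. The function name density_sensitive_predict, the kernel bandwidth, and the particular exponential discount are illustrative choices, not the authors' construction.

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.sparse.csgraph import shortest_path
from sklearn.neighbors import KernelDensity


def density_sensitive_predict(X_lab, y_lab, X_unlab, X_test,
                              alpha=1.0, bandwidth=0.5):
    """1-NN prediction under a (hypothetical) density-sensitive graph metric."""
    # Pool all covariate vectors; the unlabeled points are what let the
    # graph metric "see" the high-density regions of P_X.
    X_all = np.vstack([X_lab, X_unlab, X_test])

    # Kernel density estimate of P_X from the labeled + unlabeled covariates.
    kde = KernelDensity(bandwidth=bandwidth).fit(np.vstack([X_lab, X_unlab]))
    dens = np.exp(kde.score_samples(X_all))  # estimated density at each node

    # Edge weight = Euclidean length discounted in high-density regions.
    # alpha = 0 reduces to the ordinary Euclidean 1-NN rule (no use of density);
    # larger alpha enforces the semisupervised assumption more strongly.
    eucl = cdist(X_all, X_all)
    mid_dens = 0.5 * (dens[:, None] + dens[None, :])
    W = eucl * np.exp(-alpha * mid_dens)

    # Density-sensitive distance between two points = shortest path in the graph,
    # so paths routed through high-density regions are cheap.
    D = shortest_path(W, directed=False)

    n_lab = len(X_lab)
    test_idx = np.arange(len(X_all) - len(X_test), len(X_all))
    nearest = np.argmin(D[np.ix_(test_idx, np.arange(n_lab))], axis=1)
    return np.asarray(y_lab)[nearest]
```

In this sketch, setting alpha = 0 recovers a purely supervised nearest-neighbor rule, while increasing alpha makes the prediction rely more heavily on the unlabeled data; choosing alpha from the data is a stand-in for the adaptation to $\alpha$ described in the abstract.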

Primary Subject: 62G15
Secondary Subject: 62G07

Permanent link to this document: http://projecteuclid.org/euclid.aos/1368018172
Digital Object Identifier: doi:10.1214/13-AOS1092
Zentralblatt MATH identifier: 1267.62057
Mathematical Reviews number (MathSciNet): MR3099120

References

Azizyan, M., Singh, A. and Wasserman, L. (2013). Supplement to “Density-sensitive semisupervised inference.” DOI:10.1214/13-AOS1092SUPP.
Belkin, M. and Niyogi, P. (2004). Semi-supervised learning on Riemannian manifolds. Machine Learning 56 209–239.
Ben-David, S., Lu, T. and Pal, D. (2008). Does unlabeled data provably help? Worst-case analysis of the sample complexity of semi-supervised learning. In 21st Annual Conference on Learning Theory (COLT). Available at http://www.informatik.uni-trier.de/~ley/db/conf/colt/colt2008.html.
Bijral, A., Ratliff, N. and Srebro, N. (2011). Semi-supervised learning with density based distances. In 27th Conference on Uncertainty in Artificial Intelligence. Available at http://auai.org/uai2011/accepted.html.
Bousquet, O., Chapelle, O. and Hein, M. (2004). Measure based regularization. In Advances in Neural Information Processing Systems 16. MIT Press, Cambridge, MA.
Castelli, V. and Cover, T. M. (1995). On the exponential value of labeled samples. Pattern Recognition Letters 16 105–111.
Castelli, V. and Cover, T. M. (1996). The relative value of labeled and unlabeled samples in pattern recognition with an unknown mixing parameter. IEEE Trans. Inform. Theory 42 2102–2117.
Craig, C. C. (1933). On the Tchebychef inequality of Bernstein. Ann. Math. Statist. 4 94–102.
Culp, M. (2011a). On propagated scoring for semisupervised additive models. J. Amer. Statist. Assoc. 106 248–259.
Culp, M. (2011b). spa: Semi-supervised semi-parametric graph-based estimation in R. Journal of Statistical Software 40 1–29.
Culp, M. and Michailidis, G. (2008). An iterative algorithm for extending learners to a semi-supervised setting. J. Comput. Graph. Statist. 17 545–571.
Györfi, L., Kohler, M., Krzyżak, A. and Walk, H. (2002). A Distribution-Free Theory of Nonparametric Regression. Springer, New York.
Haupt, J. and Nowak, R. (2006). Signal reconstruction from noisy random projections. IEEE Trans. Inform. Theory 52 4036–4048.
Kpotufe, S. (2011). $k$-NN regression adapts to local intrinsic dimension. In Advances in Neural Information Processing Systems 24 729–737. MIT Press, Cambridge, MA.
Lafferty, J. and Wasserman, L. (2007). Statistical analysis of semi-supervised regression. In Advances in Neural Information Processing Systems 20 801–808. MIT Press, Cambridge, MA.
Lee, A. B. and Wasserman, L. (2008). Spectral connectivity analysis. Preprint. Available at arXiv:0811.0121.
Liang, F., Mukherjee, S. and West, M. (2007). The use of unlabeled data in predictive modeling. Statist. Sci. 22 189–205.
Nadler, B., Srebro, N. and Zhou, X. (2009). Statistical analysis of semi-supervised learning: The limit of infinite unlabelled data. In Advances in Neural Information Processing Systems 22 1330–1338. MIT Press, Cambridge, MA.
Niyogi, P. (2008). Manifold regularization and semi-supervised learning: Some theoretical analyses. Technical Report TR-2008-01, Computer Science Dept., Univ. Chicago. Available at http://people.cs.uchicago.edu/~niyogi/papersps/ssminimax2.pdf.
Ratsaby, J. and Venkatesh, S. S. (1995). Learning from a mixture of labeled and unlabeled examples with parametric side information. In Proceedings of the Eighth Annual Conference on Computational Learning Theory 412–417. ACM, New York.
Rigollet, P. (2007). Generalized error bounds in semi-supervised classification under the cluster assumption. J. Mach. Learn. Res. 8 1369–1392.
Sajama and Orlitsky, A. (2005). Estimating and computing density based distance metrics. In Proceedings of the 22nd International Conference on Machine Learning (ICML 2005) 760–767. ACM, New York.
Singh, A., Nowak, R. D. and Zhu, X. (2008). Unlabeled data: Now it helps, now it doesn’t. Technical report, ECE Dept., Univ. Wisconsin–Madison. Available at http://www.cs.cmu.edu/~aarti/pubs/SSL_TR.pdf.
Sinha, K. and Belkin, M. (2009). Semi-supervised learning using sparse eigenfunction bases. In Advances in Neural Information Processing Systems 22 (Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams and A. Culotta, eds.) 1687–1695. MIT Press, Cambridge, MA.