Abstract
Semisupervised methods are techniques for using labeled data $(X_{1},Y_{1}),\ldots,(X_{n},Y_{n})$ together with unlabeled data $X_{n+1},\ldots,X_{N}$ to make predictions. These methods invoke some assumptions that link the marginal distribution $P_{X}$ of $X$ to the regression function $f(x)$. For example, it is common to assume that $f$ is very smooth over high density regions of $P_{X}$. Many of the methods are ad hoc: they have been shown to work in specific examples but lack a theoretical foundation. We provide a minimax framework for analyzing semisupervised methods. In particular, we study methods based on metrics that are sensitive to the distribution $P_{X}$. Our model includes a parameter $\alpha$ that controls the strength of the semisupervised assumption. We then use the data to adapt to $\alpha$.
Citation
Martin Azizyan, Aarti Singh, Larry Wasserman. "Density-sensitive semisupervised inference." Ann. Statist. 41 (2) 751-771, April 2013. https://doi.org/10.1214/13-AOS1092
Information