The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 12, Number 3 (2018), 1831-1852.
Using missing types to improve partial identification with application to a study of HIV prevalence in Malawi
Empirical studies are frequently plagued by missing data. When the data are missing not at random, the parameter of interest is in general not identifiable. Without additional assumptions, we can derive bounds on the parameter of interest, which, unfortunately, are often too wide to be informative. It is therefore important to sharpen these worst-case bounds by exploiting additional information. Traditional missing data analysis uses only the binary missing data indicator, that is, whether a given data point is missing or not. Real data, however, often carry more information than a binary indicator: they record different types of missingness. In a motivating HIV status survey, data may be missing because of the units’ unwillingness to respond to the survey items or their hospitalization during the visit, or because of the units’ temporary absence or relocation. Some missing types are more likely to be missing not at random, whereas others are more likely to be missing at random. We show that making full use of the missing types results in narrower bounds on the parameters of interest. In a real-life example, we demonstrate a substantial improvement, a reduction of more than 50% in bound widths, for estimating the prevalence of HIV in rural Malawi. As we illustrate using the HIV study, our strategy is also useful for conducting sensitivity analysis by gradually increasing or decreasing the set of types that are assumed to be missing at random. In addition, we propose an easy-to-implement method to construct confidence intervals for partially identified parameters whose bounds are expressed as the minimums and maximums of a finite number of parameters, which is useful not only for our problem but also for many other problems involving bounds.
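The intuition behind the narrower bounds can be illustrated with a minimal sketch. It is not the estimator developed in the paper: it assumes a binary outcome, no covariates, and a deliberately simplified reading of "missing at random" in which units with a given missing type have the same prevalence as respondents. The function name `prevalence_bounds`, the type labels, and all counts are hypothetical and chosen only for illustration.

```python
from typing import Dict, Iterable, Tuple


def prevalence_bounds(
    n_pos: int,
    n_neg: int,
    n_missing_by_type: Dict[str, int],
    mar_types: Iterable[str] = (),
) -> Tuple[float, float]:
    """Bounds on P(Y = 1) for a binary outcome with typed missingness.

    Simplifying assumption (not taken from the paper): units whose
    missing type is in `mar_types` have the same prevalence as the
    respondents. All other missing types are unrestricted, so their
    prevalence may lie anywhere in [0, 1].
    """
    n_obs = n_pos + n_neg
    n_total = n_obs + sum(n_missing_by_type.values())
    p_obs = n_pos / n_obs  # prevalence among respondents

    mar = set(mar_types)
    n_mar = sum(c for t, c in n_missing_by_type.items() if t in mar)
    n_unrestricted = sum(c for t, c in n_missing_by_type.items() if t not in mar)

    # Respondents and "MAR" types contribute p_obs times their population share.
    lower = p_obs * (n_obs + n_mar) / n_total
    # Unrestricted types add between 0 and their full population share.
    upper = lower + n_unrestricted / n_total
    return lower, upper


# Hypothetical counts, for illustration only (not data from the Malawi study).
counts = {"refused": 100, "absent": 200, "moved": 100}

worst_case = prevalence_bounds(60, 540, counts)
with_types = prevalence_bounds(60, 540, counts, mar_types=["absent", "moved"])

print("worst-case bounds:          ", worst_case)
print("absent/moved treated as MAR:", with_types)
```

With no types restricted, the width of the interval equals the overall missing fraction. Each type moved into `mar_types` removes its share from the width, which is also how a sensitivity analysis over the set of restricted types could be organized under these assumptions.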
Received: August 2017
Revised: December 2017
First available in Project Euclid: 11 September 2018
Jiang, Zhichao; Ding, Peng. Using missing types to improve partial identification with application to a study of HIV prevalence in Malawi. Ann. Appl. Stat. 12 (2018), no. 3, 1831-1852. doi:10.1214/17-AOAS1133. https://projecteuclid.org/euclid.aoas/1536652976
- Supplement to “Using missing types to improve partial identification with application to a study of HIV prevalence in Malawi”. The supplementary material consists of four parts. Section S1 gives the proofs of the theorems on the bounds. Section S2 gives the testable conditions with multiple time points. Section S3 gives the proofs of the theorem and corollary for constructing confidence intervals. Section S4 shows the results of the simulation studies.