The Service Annual Survey (SAS) is a business survey conducted annually by the U.S. Census Bureau that collects aggregate and detailed revenues and expenses data. Typical of many business surveys, the SAS population is highly positively skewed, with large companies comprising a large proportion of the published totals. When alternative data are not available, missing data are handled with ratio imputation models that assume missingness is at random. We propose a proxy pattern-mixture (PPM) model that provides a simple framework for assessing nonresponse bias with respect to different nonresponse mechanisms. PPM models were first introduced in this context by Andridge and Little [Journal of Official Statistics 27 (2011) 153–180], but their model assumed the characteristic of interest and the predicted proxy have a bivariate normal distribution, conditional on the missingness indicator. Although often appropriate for large demographic surveys, the normality assumption is less justifiable for the highly skewed SAS data. We propose an alternative PPM model using a bivariate gamma distribution more appropriate for the SAS data. We compare the two PPM models through application to data from six years of data collection in three industries in the health care and transportation sectors of the SAS. Finally, we illustrate properties of the method through simulation.
"Assessing nonresponse bias in a business survey: Proxy pattern-mixture analysis for skewed data." Ann. Appl. Stat. 9 (4) 2237 - 2265, December 2015. https://doi.org/10.1214/15-AOAS878