The Annals of Applied Statistics

Skip sequencing: A decision problem in questionnaire design

Charles F. Manski and Francesca Molinari

Full-text: Open access


This paper studies questionnaire design as a formal decision problem, focusing on one element of the design process: skip sequencing. We propose that a survey planner use an explicit loss function to quantify the trade-off between the cost and the informativeness of the survey, and aim to make the design choice that minimizes loss. We pose a choice among three options: (i) ask all respondents about an item of interest; (ii) use skip sequencing, thereby asking the item only of respondents who give a certain answer to an opening question; or (iii) do not ask the item at all. The first option is the most informative but also the most costly. Skip sequencing reduces respondent burden and the cost of interviewing, but it may spread data quality problems across survey items, thereby reducing informativeness. The last option has no cost but is completely uninformative about the item of interest. We show how the planner may choose among these three options in the presence of two inferential problems: item nonresponse and response error.
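As a rough illustration of the decision problem described in the abstract, the sketch below picks the loss-minimizing option among the three designs. It is not the authors' formulation: the additive loss (cost plus a weight times a measure of uninformativeness), the names `DesignOption`, `loss`, and `choose_design`, and all numeric values are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DesignOption:
    name: str
    cost: float           # hypothetical cost of fielding the item under this design
    bound_width: float    # hypothetical uninformativeness: 0 = fully informative, 1 = uninformative


def loss(option: DesignOption, weight: float) -> float:
    """Assumed additive loss: cost plus weight times uninformativeness."""
    return option.cost + weight * option.bound_width


def choose_design(options: list[DesignOption], weight: float) -> DesignOption:
    """Return the option with the smallest assumed loss."""
    return min(options, key=lambda o: loss(o, weight))


# Illustrative numbers only; they are not taken from the paper.
options = [
    DesignOption("ask_all", cost=1.0, bound_width=0.10),
    DesignOption("skip_sequencing", cost=0.4, bound_width=0.30),
    DesignOption("do_not_ask", cost=0.0, bound_width=1.00),
]

for weight in (0.5, 2.0, 5.0):
    best = choose_design(options, weight)
    print(f"weight={weight}: choose {best.name} (loss={loss(best, weight):.2f})")
```

With these made-up numbers, a planner who places little weight on informativeness drops the item, a planner with a moderate weight uses skip sequencing, and a planner with a high weight asks all respondents.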

Article information

Ann. Appl. Stat., Volume 2, Number 1 (2008), 264-285.

First available in Project Euclid: 24 March 2008

Keywords: skip sequencing; questionnaire design; item nonresponse; response error; partial identification


Manski, Charles F.; Molinari, Francesca. Skip sequencing: A decision problem in questionnaire design. Ann. Appl. Stat. 2 (2008), no. 1, 264-285. doi:10.1214/07-AOAS134.

