September 2024
Quantile regression decomposition analysis of disparity research using complex survey data: Application to disparities in BMI and telomere length between U.S. minority and white population groups
Hyokyoung G. Hong, Barry I. Graubard, Joseph L. Gastwirth, Mi-Ok Kim
Ann. Appl. Stat. 18(3): 2012-2033 (September 2024). DOI: 10.1214/23-AOAS1868

Abstract

We develop a quantile regression decomposition (QRD) method for analyzing observed disparities (OD) between population groups in socioeconomic and health-related outcomes using complex survey data. Conventional decomposition approaches use conditional mean regression to split the disparity into two parts: the part explained by differences between the groups in the distributions of the explanatory covariates, and the remaining part, which is unexplained by the covariates. Many socioeconomic and health outcomes exhibit heteroscedastic distributions, in which the magnitude of the observed disparity varies across quantiles of the outcome. Thus, differences in the explanatory covariates may account for different amounts of the OD at different quantiles of the outcome. The QRD can identify where in the outcome distribution the differences are greater, for example, at the 90th percentile, and how important the covariates are in explaining those differences. Much socioeconomic and health research relies on complex surveys, such as the National Health and Nutrition Examination Survey (NHANES), that oversample individuals from disadvantaged/minority population groups in order to improve the precision of estimates for those groups. QRD has not previously been extended to the complex survey setting. We improve the QRD approach proposed in Machado and Mata (2005) to yield more reliable estimates at quantiles where the data are sparse, and extend it to the complex survey setting. We also propose a perturbation-based variance estimation method. Simulation studies indicate that the estimates of the unexplained portions of the OD across quantiles are unbiased and that the coverage of the confidence intervals is close to the nominal value. This methodology is used to study disparities in body mass index (BMI) and telomere length between race/ethnic groups, estimated from NHANES data.
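
To make the two-part split concrete, the following is a minimal sketch of the conventional mean-based decomposition that the abstract contrasts with QRD (the Peters–Belson approach named in the keywords). It is not the authors' QRD method and not their code. The function names (`weighted_mean`, `wls_fit`, `peters_belson`) are hypothetical. The sketch assumes the survey weights enter only as weights in the point estimates, and it does not attempt design-based or perturbation-based variance estimation.

```python
import numpy as np


def weighted_mean(y, w):
    """Survey-weighted mean of y with weights w."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    return float(np.sum(w * y) / np.sum(w))


def wls_fit(X, y, w):
    """Weighted least squares with an intercept; returns the coefficients."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    Xd = np.column_stack([np.ones(len(y)), X])
    sw = np.sqrt(w)
    # Scaling rows by sqrt(w) turns weighted LS into ordinary LS.
    beta, *_ = np.linalg.lstsq(Xd * sw[:, None], y * sw, rcond=None)
    return beta


def peters_belson(X_ref, y_ref, w_ref, X_cmp, y_cmp, w_cmp):
    """Mean-based decomposition of the disparity (reference minus comparison).

    The regression is fitted in the reference group only. Its predictions
    for the comparison group give the mean outcome that group would have
    if it shared the reference group's covariate-outcome relationship.
    """
    beta = wls_fit(X_ref, y_ref, w_ref)
    Xd_cmp = np.column_stack([np.ones(len(y_cmp)), X_cmp])
    counterfactual = weighted_mean(Xd_cmp @ beta, w_cmp)
    mean_ref = weighted_mean(y_ref, w_ref)
    mean_cmp = weighted_mean(y_cmp, w_cmp)
    return {
        "observed": mean_ref - mean_cmp,           # OD
        "explained": mean_ref - counterfactual,    # due to covariate differences
        "unexplained": counterfactual - mean_cmp,  # not accounted for by covariates
    }
```

By construction the explained and unexplained parts sum to the observed disparity. A quantile-based decomposition replaces these means with quantiles of the observed and counterfactual outcome distributions, so the split can differ from one quantile to another.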

Acknowledgments

All analyses were conducted using the R software (R Core Team (2020)), utilizing the computational resources of the NIH HPC Biowulf cluster (http://hpc.nih.gov).

Citation

Hyokyoung G. Hong, Barry I. Graubard, Joseph L. Gastwirth, Mi-Ok Kim. "Quantile regression decomposition analysis of disparity research using complex survey data: Application to disparities in BMI and telomere length between U.S. minority and white population groups." Ann. Appl. Stat. 18(3): 2012-2033, September 2024. https://doi.org/10.1214/23-AOAS1868

Information

Received: 1 September 2022; Revised: 1 November 2023; Published: September 2024
First available in Project Euclid: 5 August 2024

Digital Object Identifier: 10.1214/23-AOAS1868

Keywords: complex survey data, disparity decomposition, perturbation-based variance estimation, Peters–Belson, quantile regression

Rights: Copyright © 2024 Institute of Mathematical Statistics

Vol.18 • No. 3 • September 2024