The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 12, Number 2 (2018), 915-939.
Clustering the prevalence of pediatric chronic conditions in the United States using distributed computing
This paper presents an approach to clustering the prevalence of chronic conditions among children with public insurance in the United States. The data consist of community-level prevalence estimates for 25 pediatric chronic conditions. We employ a spatial clustering algorithm to identify clusters of communities with similar chronic condition prevalences. The primary challenge is the computational effort needed to estimate the spatial clustering for all communities in the U.S. To address this challenge, we develop a distributed computing approach to spatial clustering. Overall, we find that the burden of chronic conditions tends to be similar across rural communities but differs widely across urban communities. This finding suggests that similar interventions could be used to manage chronic conditions in rural communities, whereas urban areas call for targeted interventions.
Received: November 2017
Revised: April 2018
First available in Project Euclid: 28 July 2018
Zheng, Yuchen; Serban, Nicoleta. Clustering the prevalence of pediatric chronic conditions in the United States using distributed computing. Ann. Appl. Stat. 12 (2018), no. 2, 915--939. doi:10.1214/18-AOAS1173. https://projecteuclid.org/euclid.aoas/1532743481
- Supplement to “Clustering the prevalence of pediatric chronic conditions in the United States using distributed computing”. The Supplementary Materials contain four sections. In Supplementary Material A, we describe the approach for estimating the census tract prevalence of chronic conditions using the Medicaid Analytic eXtract (MAX) claims data. In Supplementary Material B, we provide further details on the selection of the number of clusters. In Supplementary Material C, we present additional mosaic maps showing the composition of each cluster by state and urbanicity for all the states in our analysis. In Supplementary Material D, we share the implementation of the distributed computing approach for spatial clustering, along with a README file explaining how to use the software.