The Annals of Applied Statistics
- Ann. Appl. Stat.
- Volume 11, Number 3 (2017), 1349-1374.
Comparing healthcare utilization patterns via global differences in the endorsement of current procedural terminology codes
The linkage of electronic medical records (EMR) across clinics, hospitals, and healthcare systems is opening new opportunities to evaluate factors associated with both individual treatment benefit and potential harm. For example, the FDA Sentinel initiative seeks to create a surveillance network with over 100 million patient lives (Behrman et al. [N. Engl. J. Med. 364 (2011) 498–499]), while PCORnet has created multiple networks that include linked electronic medical records from geographic regions such as entire cities or states, with the ultimate goal of facilitating comparative effectiveness research (Collins et al. [Journal of the American Medical Informatics Association 4 (2014) 576–577]). However, one key challenge to the use of electronically assembled cohorts is the potential for variation in both the choice of specific healthcare procedures and coding practices due to differences in patient populations and/or financial incentives within care delivery networks. To explore variation in patient care or procedure coding, we review and develop statistical methods that permit testing and estimation of subgroup differences in code assignments. We focus on Current Procedural Terminology (CPT) codes, which are used in a standardized fashion to capture patient treatment details and to record medical histories, but the methods we develop can be used for any structured EMR data. We specifically study testing procedures that can be valid for comparing both rare and common counts, as routinely encountered with medical procedure codes, and we transfer methods from studies of genetic association. Hierarchical structure, in terms of both thematically grouped medical codes and provider-level clustering, adds unique complexity to the analysis of EMR data. We detail penalized regression methods that unify estimation and inference to leverage the hierarchical structure and stabilize rate ratio estimates for rare procedures.
We also expand inference methods to account for potential within-provider correlation of patient utilization. We illustrate the methods by comparing the endorsement of CPT codes among subjects enrolled in a back pain cohort study, where interest lies in differences across recruitment centers in the use of CPT codes (Jarvik [BMC Musculoskelet Disord. 13 (2012)]).
Received: November 2016
Revised: January 2017
First available in Project Euclid: 5 October 2017
Shi, Xu; Pashova, Hristina; Heagerty, Patrick J. Comparing healthcare utilization patterns via global differences in the endorsement of current procedural terminology codes. Ann. Appl. Stat. 11 (2017), no. 3, 1349-1374. doi:10.1214/17-AOAS1028. https://projecteuclid.org/euclid.aoas/1507168832
- Supplement A: Comprehensive discussion on code-wise two-sample testing options. We provide a detailed review of testing strategies that are candidates for the evaluation of variation in code endorsement rates across cohorts.
- Supplement B: Proof of Lemma 3.1. We provide a proof of Lemma 3.1.
- Supplement C: Comprehensive review of simulation results comparing group-wise association tests. We provide a review of relevant results in previous research comparing group-wise association tests.
- Supplement D: Comprehensive plots of type I error and power. We provide additional supporting plots that show the type I error and power of all tests with equal/unequal sample sizes using generated data of independent observations or under provider-level clustering.