Bayesian Analysis

Loss function based ranking in two-stage, hierarchical models

Rongheng Lin, Thomas A. Louis, Susan M. Paddock, and Greg Ridgeway

Full-text: Open access

Abstract

Performance evaluation of health services providers is burgeoning. Similarly, analyzing spatially related health information, ranking teachers and schools, and identifying differentially expressed genes are increasing in prevalence and importance. Goals include valid and efficient ranking of units for profiling and league tables, identification of excellent and poor performers or of the most differentially expressed genes, and determination of "exceedances" (how many and which unit-specific true parameters exceed a threshold). These data and inferential goals require a hierarchical, Bayesian model that accounts for nesting relations and identifies both population values and random effects for unit-specific parameters. Furthermore, the Bayesian approach coupled with optimizing a loss function provides a framework for computing non-standard inferences such as ranks and histograms.

Estimated ranks that minimize Squared Error Loss (SEL) between the true and estimated ranks have been investigated. The posterior mean ranks minimize SEL and are "general purpose," relevant to a broad spectrum of ranking goals. However, other loss functions, and the optimizing ranks tuned to application-specific goals, require identification and evaluation. For example, when the goal is to identify the relatively good (e.g., in the upper 10%) or relatively poor performers, a loss function that penalizes classification errors produces estimates that minimize the error rate. We construct loss functions that address this and other goals, developing a unified framework that facilitates generating candidate estimates, comparing approaches, and producing data-analytic performance summaries. We compare performance for a fully parametric, hierarchical model with a Gaussian sampling distribution under two prior distributions: a Gaussian and a mixture of Gaussians. We illustrate approaches via analysis of standardized mortality ratio data from the United States Renal Data System.
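The ranking ideas in this paragraph can be made concrete with a minimal sketch; it is an illustration under stated assumptions, not the authors' implementation. It assumes a Gaussian-Gaussian model with a known N(0, 1) prior and known sampling variances (so the posterior is available in closed form), simulated rather than real data, percentiles defined as rank / (K + 1), and, for the upper-10% goal, selection of the units with the largest posterior probability of lying above the cut-point. All variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated two-stage model (assumption): theta_k ~ N(0, 1), y_k | theta_k ~ N(theta_k, sigma2_k)
K = 20
sigma2 = rng.uniform(0.1, 2.0, size=K)      # known, unit-specific sampling variances
theta = rng.normal(0.0, 1.0, size=K)        # true unit-specific parameters
y = rng.normal(theta, np.sqrt(sigma2))      # observed estimates

# Closed-form Gaussian posterior for each unit under the N(0, 1) prior
post_mean = y / (1.0 + sigma2)
post_var = sigma2 / (1.0 + sigma2)

# Posterior draws, shape (n_draws, K)
n_draws = 4000
draws = rng.normal(post_mean, np.sqrt(post_var), size=(n_draws, K))

# Rank the units within each draw (1 = smallest); argsort of argsort yields ranks
ranks = draws.argsort(axis=1).argsort(axis=1) + 1

# Posterior mean ranks minimize squared error loss on the ranks
mean_ranks = ranks.mean(axis=0)

# Integer ranks obtained by ranking the posterior mean ranks
sel_ranks = mean_ranks.argsort().argsort() + 1

# Upper-10% goal: posterior probability that each unit's percentile exceeds gamma
gamma = 0.9
above = ranks / (K + 1) > gamma
p_above = above.mean(axis=0)

# Every draw places the same number of units above the cut-point
n_top = int(above[0].sum())

# Flag the units most likely to lie above the cut-point
top_units = np.argsort(-p_above)[:n_top]

print("SEL-based integer ranks:", sel_ranks)
print("Units flagged for the upper 10%:", top_units)
print("Their posterior probabilities:", p_above[top_units])
```

Reporting the posterior probabilities alongside the flagged units gives a simple performance summary: values well below 1 indicate that even the chosen classification is quite uncertain.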

Results show that SEL-optimal ranks perform well over a broad class of loss functions but can be improved upon when classifying units above or below a percentile cut-point. Importantly, even optimal rank estimates can perform poorly in many real-world settings; therefore, data-analytic performance summaries should always be reported.

Article information

Source
Bayesian Anal. Volume 1, Number 4 (2006), 915-946.

Dates
First available in Project Euclid: 22 June 2012

Permanent link to this document
https://projecteuclid.org/euclid.ba/1340370947

Digital Object Identifier
doi:10.1214/06-BA130

Mathematical Reviews number (MathSciNet)
MR2282211

Zentralblatt MATH identifier
1331.62063

Keywords
percentiling; Bayesian models; decision theory; operating characteristic

Citation

Lin, Rongheng; Louis, Thomas A.; Paddock, Susan M.; Ridgeway, Greg. Loss function based ranking in two-stage, hierarchical models. Bayesian Anal. 1 (2006), no. 4, 915-946. doi:10.1214/06-BA130. https://projecteuclid.org/euclid.ba/1340370947
