Abstract
In a range of genomic applications, it is of interest to quantify the evidence that the signal at site i is active, given conditionally independent replicate observations summarized by the sample mean and variance at each site. We study the version of the problem in which the signal distribution is sparse and the error distribution has an unknown site-specific variance, so that the null distribution of the standardized statistic is Student-t rather than Gaussian. The main contribution of this paper is a sparse-mixture approximation to the non-null density of the t-ratio. This formula demonstrates the effect of low degrees of freedom on the Bayes factor, or equivalently on the conditional probability that the site is active. We illustrate some of the differences on an HIV gene-expression dataset.
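As a rough illustration of why the degrees of freedom matter, the sketch below computes a t-ratio from a handful of replicates and compares the Student-t null density with the Gaussian one at that value. It is not the paper's sparse-mixture approximation. The replicate values and the helper names (`t_ratio`, `student_t_pdf`, `normal_pdf`) are assumptions made for this example, and the sketch assumes n replicates per site, giving n - 1 degrees of freedom.

```python
import math
import statistics


def t_ratio(replicates):
    """Return (t, df): sample mean over its estimated standard error, and n - 1."""
    n = len(replicates)
    mean = statistics.fmean(replicates)
    var = statistics.variance(replicates)  # unbiased sample variance (n - 1 divisor)
    return mean / math.sqrt(var / n), n - 1


def student_t_pdf(t, df):
    """Density of the Student-t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def normal_pdf(z):
    """Density of the standard Gaussian distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


# Hypothetical replicate measurements at a single site (illustrative only).
site = [1.9, 0.4, 2.6, 1.1]
t, df = t_ratio(site)

# With few degrees of freedom the Student-t null has much heavier tails,
# so the same statistic is less surprising under the null than a Gaussian
# reference would suggest.
ratio = student_t_pdf(t, df) / normal_pdf(t)
print(f"t = {t:.3f} on {df} df; Student-t / Gaussian null density ratio = {ratio:.2f}")
```

A ratio well above 1 means that a Gaussian null would overstate the evidence for an active site at this value of the statistic. That is the kind of low-degrees-of-freedom effect the abstract refers to.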
Citation
Micól Tresoldi, Daniel Xiang, Peter McCullagh. "Sparse-limit approximation for t-statistics." Electron. J. Statist. 18 (1) 1586-1602, 2024. https://doi.org/10.1214/24-EJS2238
Information