Open Access
Permutation $p$-value approximation via generalized Stolarsky invariance
Hera Y. He, Kinjal Basu, Qingyuan Zhao, Art B. Owen
Ann. Statist. 47(1): 583-611 (February 2019). DOI: 10.1214/18-AOS1702


It is common for genomic data analysis to use $p$-values from a large number of permutation tests. The multiplicity of tests may require very tiny $p$-values in order to reject any null hypotheses and the common practice of using randomly sampled permutations then becomes very expensive. We propose an inexpensive approximation to $p$-values for two sample linear test statistics, derived from Stolarsky’s invariance principle. The method creates a geometrically derived reference set of approximate $p$-values for each hypothesis. The average of that set is used as a point estimate $\hat{p}$ and our generalization of the invariance principle allows us to compute the variance of the $p$-values in that set. We find that in cases where the point estimate is small, the variance is a modest multiple of the square of that point estimate, yielding a relative error property similar to that of saddlepoint approximations. On a Parkinson’s disease data set, the new approximation is faster and more accurate than the saddlepoint approximation. We also obtain a simple probabilistic explanation of Stolarsky’s invariance principle.
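To make the cost argument concrete, the following is a minimal sketch of the baseline the abstract calls "randomly sampled permutations", not the Stolarsky-based approximation the paper proposes. It assumes a one-sided test with the linear statistic $\sum_i x_i y_i$, where $y$ is a 0/1 group label; the function names, the `n_perm` and `seed` parameters, and the add-one correction are illustrative choices rather than details taken from the paper.

```python
import random


def linear_statistic(x, y):
    """Two-sample linear statistic: sum of x_i * y_i (y is a 0/1 group label)."""
    return sum(xi * yi for xi, yi in zip(x, y))


def monte_carlo_permutation_pvalue(x, y, n_perm=9999, seed=0):
    """Estimate a one-sided permutation p-value by sampling random relabelings.

    Hypothetical helper for illustration only. The add-one correction keeps
    the estimate strictly positive, so it can never fall below 1 / (n_perm + 1).
    """
    rng = random.Random(seed)
    observed = linear_statistic(x, y)
    labels = list(y)  # copy so the caller's labels are not shuffled in place
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # one uniformly random permutation of the labels
        if linear_statistic(x, labels) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


# Toy data (made up for this sketch): the first four values form group 1.
x = [5.1, 4.8, 6.0, 5.5, 1.0, 1.2, 0.9, 1.1]
y = [1, 1, 1, 1, 0, 0, 0, 0]
p_hat = monte_carlo_permutation_pvalue(x, y, n_perm=999, seed=0)
print(p_hat)
```

With this estimator the smallest attainable value is $1/(n_{\text{perm}}+1)$, so resolving a $p$-value near $10^{-8}$ would need more than $10^{8}$ sampled permutations for each hypothesis. That is the expense the abstract refers to when many tests are run at once.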



Hera Y. He, Kinjal Basu, Qingyuan Zhao, Art B. Owen. "Permutation $p$-value approximation via generalized Stolarsky invariance." Ann. Statist. 47(1): 583-611, February 2019.


Received: 1 March 2016; Revised: 1 February 2018; Published: February 2019
First available in Project Euclid: 30 November 2018

zbMATH: 07036212
MathSciNet: MR3909943
Digital Object Identifier: 10.1214/18-AOS1702

Primary: 62G10
Secondary: 11K38, 62G09

Keywords: discrepancy, gene sets, hypothesis testing, quasi-Monte Carlo

Rights: Copyright © 2019 Institute of Mathematical Statistics

