February 2024 Dimension-agnostic inference using cross U-statistics
Ilmun Kim, Aaditya Ramdas
Bernoulli 30(1): 683-711 (February 2024). DOI: 10.3150/23-BEJ1613

Abstract

Classical asymptotic theory for statistical inference usually involves calibrating a statistic by fixing the dimension $d$ while letting the sample size $n$ increase to infinity. Recently, much effort has been dedicated towards understanding how these methods behave in high-dimensional settings, where $d$ and $n$ both increase to infinity together. This often leads to different inference procedures, depending on the assumptions about the dimensionality, leaving the practitioner in a bind: given a dataset with 100 samples in 20 dimensions, should they calibrate by assuming $n \gg d$, or $d/n \approx 0.2$? This paper considers the goal of dimension-agnostic inference: developing methods whose validity does not depend on any assumption on $d$ versus $n$. We introduce an approach that uses variational representations of existing test statistics along with sample splitting and self-normalization to produce a refined test statistic with a Gaussian limiting distribution, regardless of how $d$ scales with $n$. The resulting statistic can be viewed as a careful modification of degenerate U-statistics, dropping diagonal blocks and retaining off-diagonal blocks. We exemplify our technique for some classical problems including one-sample mean and covariance testing, and show that our tests have minimax rate-optimal power against appropriate local alternatives. In most settings, our cross U-statistic matches the high-dimensional power of the corresponding (degenerate) U-statistic up to a $\sqrt{2}$ factor.
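To make the sample-splitting and self-normalization idea concrete, the following is a minimal sketch for one-sample mean testing (null hypothesis: the mean vector is zero). It is an illustration built from the abstract's description, not the paper's exact definitions: the function names `cross_mean_statistic` and `cross_mean_test`, the balanced half split, and the one-sided standard normal cutoff are all assumptions made here.

```python
import math
from statistics import NormalDist, fmean, pstdev


def cross_mean_statistic(sample):
    """Studentized cross statistic for H0: mean vector = 0 (illustrative sketch).

    `sample` is a list of n observations, each a list of d floats.
    """
    n = len(sample)
    m = n // 2  # assumed balanced split; other split ratios are possible
    first, second = sample[:m], sample[m:]
    d = len(sample[0])
    # Mean vector of the second half, used as a data-driven direction.
    mean2 = [fmean(row[j] for row in second) for j in range(d)]
    # Project each first-half observation onto that direction. Only
    # cross terms between the two halves appear (the off-diagonal block).
    f = [sum(x * y for x, y in zip(row, mean2)) for row in first]
    # Self-normalize: sqrt(m) * average / standard deviation of projections.
    return math.sqrt(m) * fmean(f) / pstdev(f)


def cross_mean_test(sample, alpha=0.05):
    """Reject H0 when the statistic exceeds the N(0, 1) upper quantile."""
    t = cross_mean_statistic(sample)
    return t, t > NormalDist().inv_cdf(1 - alpha)
```

Because the cutoff comes from the standard normal distribution, the same call, for example `cross_mean_test(data, alpha=0.05)`, is used whether the data have 2 or 2,000 coordinates; no choice between a fixed-dimension and a high-dimensional calibration is needed.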

Funding Statement

Ilmun Kim acknowledges support from the Yonsei University Research Fund of 2022-22-0289 as well as support from the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (2022R1A4A1033384) and the Korea government (MSIT) RS-2023-00211073.

Acknowledgements

We thank Diego Martinez Taboada for the insightful comment that helped in removing an unnecessary eigenvalue condition in Theorem 4.2 of the previous manuscript. We also thank the referees for their constructive comments that significantly improved this paper.

Citation

Ilmun Kim. Aaditya Ramdas. "Dimension-agnostic inference using cross U-statistics." Bernoulli 30 (1) 683 - 711, February 2024. https://doi.org/10.3150/23-BEJ1613

Information

Received: 1 September 2021; Published: February 2024
First available in Project Euclid: 8 November 2023

MathSciNet: MR4665594
zbMATH: 07788900
Digital Object Identifier: 10.3150/23-BEJ1613

Keywords: degenerate U-statistics, high-dimensional limits, minimax optimality, sample splitting, studentization

Vol.30 • No. 1 • February 2024