Abstract
The study of networks leads to a wide range of high-dimensional inference problems. In many practical applications, one needs to draw inference from one or few large sparse networks. The present paper studies hypothesis testing of graphs in this high-dimensional regime, where the goal is to test between two populations of inhomogeneous random graphs defined on the same set of $n$ vertices. The size of each population $m$ is much smaller than $n$, and can even be a constant as small as 1. The critical question in this context is whether the problem is solvable for small $m$.
We answer this question from a minimax testing perspective. Let $P$, $Q$ be the population adjacencies of two sparse inhomogeneous random graph models, and $d$ be a suitably defined distance function. Given a population of $m$ graphs from each model, we derive minimax separation rates for the problem of testing $P=Q$ against $d(P,Q)>\rho $. We observe that if $m$ is small, then the minimax separation is too large for some popular choices of $d$, including total variation distance between corresponding distributions. This implies that some models that are widely separated in $d$ cannot be distinguished for small $m$, and hence, the testing problem is generally not solvable in these cases.
We also show that if $m>1$, then the minimax separation is relatively small if $d$ is the Frobenius norm or operator norm distance between $P$ and $Q$. For $m=1$, only the latter distance provides small minimax separation. Thus, for these distances, the problem is solvable for small $m$. We also present near-optimal two-sample tests in both cases, where tests are adaptive with respect to sparsity level of the graphs.
Citation
Debarghya Ghoshdastidar. Maurilio Gutzeit. Alexandra Carpentier. Ulrike von Luxburg. "Two-sample hypothesis testing for inhomogeneous random graphs." Ann. Statist. 48 (4) 2208 - 2229, August 2020. https://doi.org/10.1214/19-AOS1884
Information