Motif-based tests for bipartite networks

Bipartite networks are a natural representation of the interactions between entities from two different types. The organization (or topology) of such networks gives insight to understand the systems they describe as a whole. Here, we rely on motifs which provide a meso-scale description of the topology. Moreover, we consider the bipartite expected degree distribution (B-EDD) model which accounts for both the density of the network and possible imbalances between the degrees of the nodes. Under the B-EDD model, we prove the asymptotic normality of the count of any given motif, considering sparsity conditions. We also provide close-form expressions for the mean and the variance of this count. This allows to avoid computationally prohibitive resampling procedures. Based on these results, we define a goodness-of-fit test for the B-EDD model and propose a family of tests for network comparisons. We assess the asymptotic normality of the test statistics and the power of the proposed tests on synthetic experiments and illustrate their use on ecological data sets.


Introduction
Bipartite interaction networks are used to represent a diverse range of interactions in various fields such as biology, ecology, sociology or economics. For instance, in ecology, bipartite graphs depict interactions between two groups of species such as plants and pollinators (see e.g. Simmons et al., 2019b;Doré et al., 2020) or host and parasites (see e.g. Vacher et al., 2008;D'Bastiani et al., 2020), in agroethnology, they may involve interactions between farmers and crop species (see Thomas et al., 2015) and in economics, country-product trades as signals of the 2007-2008 financial crisis (see Saracco et al., 2016). Formally, a bipartite interaction network can be viewed as a bipartite graph, the nodes of which being individuals pertaining to two different groups, and an edge between two nodes being present if these two individuals interact. In the sequel, the two types of nodes will be referred to as top nodes and bottom nodes, respectively. Characterizing the general organization of such a network, namely its topology, is key to understand the behavior of the system as a whole.
The topology of a network can be studied at various scales. Micro-scale analyses typically focus on the degree of each node, the betweenness of each edge or on the closeness between each pair of nodes. On the opposite, macro-scale analysis focus on global properties of the network such as its density or its modularity. The reader may refer to Newman (2003) or Simmons et al. (2019b) for a general discussion. In this paper, we are mostly interested in the meso-scale description of the network that is provided by the frequency of motifs (Milo et al., 2002). A motif is defined as a given subgraph depicting the interactions between a small number of nodes; the count of a motif consists in the number of occurrences of this subgraph in the observed network. Figures 7 and 8 display the set of all bipartite motifs involving up to 6 top or bottom nodes. Counting the occurrences of a motif is a computationally challenging task (see Milo et al., 2002;Picard et al., 2008, for simple-i.e. non-bipartite -networks); efficient tools have been recently proposed by Simmons et al. (2019a,b) for bipartite networks.
Whatever the description scale, the analysis must account for a series of characteristics of the network at hand (such as its dimension or its density) to make the results comparable. A convenient way to account for such peculiarities is to define a null model capable to fit the network characteristics. We consider here a bipartite and exchangeable version of the expected degree distribution model proposed by Chung and Lu (2002) for simple binary graphs. The bipartite expected degree distribution (B-EDD) model simply states that each (top or bottom) node is associated with an expected degree and that a pair of nodes is connected with a probability that is proportional to the product of their respective expected degrees. The B-EDD model can obviously accommodate to the network dimension (number of top and bottom nodes), for its density but also for some existing imbalances between the degrees of the nodes. Such imbalances play an important role in many fields: in ecology they are related to the opposition between generalist insects (capable of pollinate a large number of plant species) and specialist insects (interacting with a limited number of plant species) (see e.g. Vázquez and Aizen, 2004;Bascompte and J., 2006;Simmons et al., 2019b). This opposition is one of the most probable cause for nested structures observed in mutualistic networks (see Vázquez and Aizen, 2004;Bascompte, 2009). In addition to its interpretation, this model is attractive because we can calculate the expected frequency of motifs under B-EDD such as their variance.
The distribution of motif counts in simple graphs has been widely studied, especially for simple motifs like triangles (see e.g. Nowicki and Wierman, 1988;Stark, 2001;Picard et al., 2008). In this paper, we prove the asymptotic normality of the count of any given motif under the B-EDD model, under sparsity conditions. One important feature of the B-EDD model is that the mean and the variance of the count have close form expressions. The strategy to derive these moments is related to the one introduced by Picard et al. (2008) for simple networks. This property has a major practical impact as the expectation and the variance of a motif count could not be evaluated via resampling, because of the computational cost of motif counting event for networks with intermediate size. The knowledge of the asymptotic distribution of the motif counts opens a series of possible applications, including goodness-of-fit tests for the B-EDD model and a series of tests for network comparison in the B-EDD framework.
The paper is organized as follows. Section 2 is devoted to the definition and properties of motifs in the B-EDD model and Section 3 to tests for bipartite networks. More specifically, we establish the asymptotic normality of motif frequencies in Section 3.1 and propose a goodness-of-fit test for the B-EDD model and comparison tests for two bipartite networks in Section 3.2 and Section 3.3, respectively. The accuracy of the normal approximation for finite graphs and the power of the proposed tests are assessed via a simulation study in Section 4. Finally, proofs are given in Section 5.

Motifs in the bipartite expected degree model
We consider a bipartite graph  = (, ) with nodes. The set of nodes is  = ( ,  ), where  = J1, K (resp.  = J1, K) stands for the set of top (resp. bottom) nodes, and the set of edges is  ⊂  ×  , meaning than an edge can only connect a top node with a bottom node. The total number of nodes is therefore = + . We denote by the corresponding × incidence matrix where the entry of is 1 if ( , ) ∈ , and 0 otherwise.
The parameter controls the density of the graph ( = ) whereas the function (resp. ℎ) encodes the heterogeneity of the expected degrees of the top (resp. bottom) nodes. More specifically, denoting = ∑ 1≤ ≤ the degree of the top node , we have that ( | ) = ( ). The symmetric property holds for bottom nodes. Remark 1. Lovász and Szegedy (2006) and Diaconis and Janson (2008) introduced a generic model for exchangeable random graphs called the -graph, which is based on a graphon function Φ ∶ [0, 1] 2 ↦ [0, 1]. The B-EDD model is a natural extension of the -graph for bipartite graphs with a product-form graphon function Φ( , ) = ( )ℎ( ). The B-EDD model is obviously exchangeable is the sense that the distribution of the incidence matrix is preserved under permutation of the top nodes and/or the bottom nodes.
Remark 2. The B-EDD model can also be seen as an exchangeable bipartite version of the expected degree sequence model studied in Chung and Lu (2002) and of the configuration model from Newman (2003). Under these two models, the degree of each node is fixed which makes them non exchangeable.

Remark 3.
In the case where the functions ℎ and are both piecewise constant, the B-EDD model actually corresponds to a specific form of latent block model (Govaert and Nadif, 2008), where the connexion probabilities have a product form.

Bipartite motifs in the B-EDD model
Bipartite motifs. We are interested in the distribution of the count of motifs (or subgraphs) in bipartite graphs arising from the B-EDD model. A bipartite motif is defined by its number of top nodes , its number of bottom nodes and a × incidence matrix . Figures 7 and 8 display the 44 bipartite motifs involving between two and six nodes, from which we see that ) .
An important characteristic of a graph motif is its number of automorphisms (Stark, 2001), that is the number of non-redundant permutations of its incidence matrix (see, e.g. section 2.4 in Picard et al. (2008)): Note that, because pairs of permutations ( , ) yielding the same matrix , are not counted twice, we obviously have that ≤ ( !) × ( !). In many cases, turns out to be much smaller: in particular, = 1 for star-motifs, which will be defined later. We further denote by the degree of the top node (1 ≤ ≤ ) within motif , that is = ∑ 1≤ ≤ , . The degree of the bottom node within is defined similarly as = ∑ 1≤ ≤ , .
Motif occurrence. Counting the occurrences of motif in  simply consists in considering all possible of (resp. ) top (resp. bottom) nodes among the (resp. ) and check for each possible automorphism of if an occurrence is observed. More formally, let us define the set  of possible positions for motif as the Cartesian product of the set of the ( )( ) possible locations with the set of the (top, bottom) permutations giving rise to each of the automorphisms of . So, a position results from the combination of a location with a permutation. Because the graph is bipartite, any position from  decomposes as = ( , ) where stands for an ordered list of top nodes and for an ordered list of bottom nodes. The number of positions for motif in  is precisely 298 S. Ouadah et al. Now, for a given position = ( , ) ∈  , we define ( ) as the indicator for motif to occur in position :  Lovász and Szegedy, 2006;Diaconis and Janson, 2008, for simple graphs). For any exchangeable graph model, we may define as the probability for motif to occur in position = ( , ): Importantly, because the model is exchangeable, this probability does not depend on .
Star motifs. We define a star as a bipartite motif for which either = 1 or = 1 (or both). More specifically, we name top stars (resp bottom stars) motifs for which = 1 (resp. = 1). The top stars in Figures 7 and 8 are motifs 1, 2, 7, 17 and 44, and the bottom stars are motifs 1, 3, 4, 8 and 18. Observe that = 1 for all star motifs, that = 1 for all in all top star motifs, and that = 1 for all in all bottom star motifs. Because they will play a central role in the sequel, we adopt a specific notation for the probability of star motifs, denoting the occurrence probability of the top star with degree and for the occurrence probability of the bottom star with degree . As a consequence, we have that 1 = 1 , 2 = 2 , 3 = 7 4 = 17 , 5 = 44 , (5) 1 = 1 , 2 = 3 , 3 = 4 4 = 8 , 5 = 18 .

Moments of motif counts
Expected count. Let us now denote by the count, that is the number of occurrences of a motif in a graph . We simply have that = ∑ ∈ ( ).
As a consequence, the expected count of in  is ( ) = . We also define the normalized frequency of motif as which is an unbiased estimate of .

Illustration.
As an illustration, we consider two of the networks studied by Simmons et al. (2019a), which include both plant-pollinator and seed dispersal networks extracted from the Web of Life database (www.web-of-life.es). More specifically, we consider the two largest networks of each type, which were first published by Robertson (1929) and Silva (2002), respectively. The plantpollinator network involves 546 plant species and 1044 insects and the seed dispersal network 207 plant species and 110 seed dispersers (birds or insects). Our purpose is not to provide a thorough ecological analysis of these networks, but to exemplify the proposed methodology. Table 1 gives the counts and the frequency of the star motifs with up to four branches. For the sake of clarity, we will limit ourselves to motifs up to five nodes in the illustrations. Observe that both the counts and the number of possible positions range over huge order of magnitudes. 1.24 10 6 4.47 10 7 1.20 10 9 2.35 10 6 1.60 10 8 8.17 10 9 1.12 10 3 6.50 10 3 4.07 10 4 2.32 10 5 1.24 10 4 1.31 10 5 1.23 10 6 4.92 10 −2 5.23 10 −3 9.11 10 −4 1.94 10 −4 5.28 10 −3 8.16 10 −4 1.50 10 −4

Main property of motif probabilities under B-EDD.
The tests we propose rely on the comparison between the observed count (or normalized frequency) of a motif, with its theoretical counterpart under a B-EDD model. More specifically, the motif probabilities denoted by have a close form expression under the B-EDD model.
where + ∶= ∑ = ∑ stands for the total number of edges in .
Proof. This follows from the fact that, under B-EDD, the edges are independent conditionally on the latent coordinates and defined in (1), which are all independent with respect to one other. Consider an arbitrary position = ( , ); for the sake of clarity, we identify the elements of with J1, K and the elements of with J1, K. We have ) .
The result then results from the fact that An important consequence of Proposition 1 is that, under B-EDD, the motif probability of any motif can be expressed in terms of probabilities of star motifs. Figure 1 provides an intuition of this: a motif can be decomposed in terms of top and bottom stars arising from each of its nodes.
In the sequel, to distinguish the motif probability under an arbitrary exchangeable model from the probability under the B-EDD model, we will denote by the probability of motif under B-EDD. Figure 7 provides the list of all expressions.
Probability estimate under B-EDD. Proposition 1 suggests a natural plug-in estimator for the B-EDD motif probability : where Γ (resp Λ ) denotes the normalized frequency of the top (resp. bottom) star motif with degree . Obviously, Γ (resp Λ ) is an unbiased estimated of (resp. ).
Variance of the count. We now consider the variance of the count, that is When positions and are equal, the product ( ) ( ) is simply given by ( ), the indicator of the presence of at position . Then, when positions and do not overlap (| ∩ | = 0), the product ( ) ( ) simply indicates that two occurrences of motif occur in position and , which are independent under the B-EDD model. When positions and are different and do overlap (| ∩ | > 0), the product ( ) ( ) becomes the indicator of a super-motif, that is a motif made of two overlapping automorphisms of . We denote by  2 ( ) the set of super-motifs generated by the overlaps of two occurrences of the motif ; Figure 2 provides some examples of super-motifs.
An expression similar to (9) can be derived for the covariance between two counts: Again, the last term corresponds to occurrences of super-motifs resulting from an overlap between an occurrence of motif and an occurrence of motif . We denote by 2 ( , ) the set of these super-motifs. We use the strategy described in Picard et al. (2008) to determine the sets of super-motifs  2 ( ) and ( , ′ ).
Observe that these sets do not depend on the observed networks, so, to alleviate the computational burden, they can be determined and stored once for all.
Eq. (9) shows that ( 2 ) only depends on ( ( ) ( )), which is 2 when positions and do not overlap and the probability of the corresponding supermotif when they overlap. As a consequence, we have that where the , , , ′ , , , ′′ , , , are constants, which depend on the dimensions of the graph, on the motif and on the super-motif . The order of magnitude of ′′ , , , for large and will be studied in Section 5.1.2. Because super-motifs are actually motifs, their respective occurrence probability under B-EDD are given by Proposition 1 as well, so the expectation and the variance of under B-EDD can be expressed as functions of the and { } ∈ 2 ( ) . An estimate of each can be obtained using Eq. (8) in the same way.
Some super-motifs from  2 ( ) for motif = 9 (top left) with = 3 top nodes and = 2 bottom nodes. | ∩ | (resp. | ∩ |): number of top (resp. bottom) nodes shared by the overlapping positions and . Black: nodes from , white: nodes from , black/white: nodes from ∩ . There are actually | 2 (9)| = 396 such super-motifs of motif 9. (8) is only based on empirical quantities (the counts of stars motifs) and does not depend on any parameter estimation. Especially, the functions and ℎ do not need to be estimated as the frequency of star motifs provides all necessary information about the degree distributions. As a consequence, we may define plug-in estimates of the occurrence probability, the expected count and the variance of the count of any motif under B-EDD. Table 2 compares the empirical frequencies of a selection of motifs with their respective estimated probability . The probability estimates are computed according to Equation (8), using the star motifs frequencies Γ and Λ given in Table 1. Observe that the difference between the observed frequency and their estimated expectation under the B-EDD model are of the same order of magnitude, if not smaller, than their estimated standard deviations.

Tests for bipartite networks
Asymptotic framework. We consider a sequence of B-EDD random graphs defined as follows. Table 2 Empirical frequency , estimated probability and estimated standard-deviation of the frequency according to the B-EDD model for a selection of motifs. All estimates are derived from the star motifs frequencies given in Table 1.

Asymptotic normality of motif frequencies
This section is devoted to the asymptotic normality of motif frequencies under the B-EDD model. More precisely, our first main result states the asymptotic normality of the following statistic relying on the empirical frequency of a given motif in : where denotes the estimator of defined in (8) and̂( ) the one of ( ) obtained by the plug-in of ( being any super-motif generated by two occurrences of ) in the expressions of ( ) given in (9)-(11). Theorem 1. If 0 < + < 2∕ + , then for all non-star motif and under the B-EDD model, the statistic is asymptotically normal as ∼ → ∞: The proof is based on three results given hereafter in Proposition 2, Lemma 1 and Lemma 2.
Sketch of proof. Let us first consider the following decomposition of the numer-ator of : Under the null B-EDD model, we show that, ( ) ∕ √ ( ) is asymptotically normal in Proposition 2, it is the leader term, ( ) ∕ √ ( ) is negligible in Lemma 1, it is the reminder term. Then, we conclude using Slutsky Theorem Lemma 2 which states that̂( )∕ ( ) → 1 in probability. ■

Remark 7. Like
, is only based on empirical quantities, that is ) the empirical frequency of motif and ) the empirical frequencies of the stars motifs forming . The expected frequencies of the supermotifs of involved in ( ) also depend only on empirical star frequencies.

Remark 8. Gao and Lafferty (2017b) proved a similar result as Theorem 1 in the EDD model, for a test statistic which is a linear combination of edges, vees and triangles empirical frequencies in the case of simple graphs, and under a specific condition on the graph density. Though their result is not comparable to ours since triangles can not occur in bipartite graphs and we do not account for stars motifs. Although they seem similar, a fair comparison between Theorem 1 and the result from Gao and Lafferty (2017b) is not easy ( ) because the model is not the same (we consider bipartite graphs whereas they consider simple graphs) and ( ) because they only consider vees (which are star-motifs) and triangles (which do not occur in bipartite graphs).
Remark 9. Let us give some insights about condition 0 < + < 2∕ + . Condition + > 0 is necessary to get the convergence of the estimator towards , stated in Lemma 1. This result is also used to prove Lemma 2 and Theorem 1. Condition + < 2∕ + is necessary in Proposition 2 to make sure that the variance of the reminder term of the quantity of interest vanishes. The latter result is used both in Lemma 1 and Theorem 1.
In the following proposition, the asymptotic normality of the statistic ruling the law of is stated under the null. This statistic involves the empirical frequency of a given non star motif and its theoretical expectation and variance. The proof of its asymptotic normality mostly relies on tools of martingale theory. We show that we can exhibit conditional martingale difference sequences relative to a specific filtration. This filtration is generated by the sequence of graphs  (see a proper definition of the filtration in Section 5.1.1). So, we could apply the central limit theorem of Hall and Heyde (2014).

then for all star motif and under the B-EDD
The complete proof is given in Section 5.2, it relies especially on Lemma 6 and Lemma 7.
Sketch of proof. We first consider the decomposition = − = + with being the difference between and its expectation conditionally to the considered filtration and , , and the difference between the latter conditional expectation and ; the proper definitions are given in Section 5.2.1. Lemma 6 shows that, under the null B-EDD model, the reminder term . Then, Lemma 4 shows that ( | , )∕ ( ) tends to 1 in probability for all ( , ), which allows deconditionning. ■ The two following lemmas combined with Proposition 2 permit to conclude to Theorem 1. Their proofs are given in sections 5.3 and 5.4 respectively.

Goodness-of-fit tests for the B-EDD model
We consider a bipartite network  and we want to test if it arises from the B-EDD model: To this aim, we consider the test statistic = ( − )∕ √̂( ) defined in (12). The idea is thus to compare the frequency of a motif observed in the network with its expected value under the B-EDD model.  Table 3 gives the test statistics for goodness of fit to the B-EDD model for the same motifs as in Table 2. According to Theorem 1, these statistics should be compared with the quantiles of standard normal distribution  (0, 1). Almost no motif frequency displays a significant deviation from its expectation under the B-EDD model. Only motif 16 in the plant-pollinator network displays a higher frequency than expected under B-EDD (with -value 7.5 10 −3 ). Table 3 Test statistics for the goodness-of-fit of B-EDD for the same motifs as in Table 2.

Tests for the comparison of two bipartite networks
This section is devoted to a test for network comparison. More specifically, considering two networks assumed to arise from two B-EDD models, we want to test if they arise from the same B-EDD model, or for, instance, from two different B-EDD model with same function . The rational behind the tests we propose is to compare the frequency of a motif observed in one network with its expected value according to the parameters of the other network. To this aim, we need to introduce specific notations.
A global test. We consider two bipartite networks  and  supposed to arise from B-EDD models with respective dimensions and parameters ( , , , , ) and ( , , , , ). We want to test This is to test that, although the two networks may have different dimensions ( , ), they have the same density ( ), the same top node heterogeneity ( ) and the same bottom node heterogeneity (ℎ).
Test statistic. The test statistic is based on and the empirical frequencies of motif in  and  respectively. The superscript (resp. ) is added to all quantities observed in  (resp.  ).
The rest of the proof relies on Slutsky's theorem. ■

Corollary 1. Under the same assumptions as Theorem 2 and under
the statistic has an asymptotic standard normal distribution: Proof. This follows from the fact that, under 0 , we have that ( , , 1 , , ) = ( , , 1 , , ) and ( , , 1 , , ) = ( , , 1 , , ). ■ Corollary 1 enables to define a test procedure with asymptotic level < 1∕2, which consists in rejecting 0 as soon as: where stands for the quantile of order of the standard normal distribution. The asymptotic power of this test is given by the next corollary, which is as straightforward application of Theorem 2.

Corollary 2.
Under the same assumptions as in Theorem 2, the asymptotic (as ∼ ∼ ∼ → ∞) power of the test of: is given by: where Φ(⋅) stands for the cumulative distribution function of the standard normal distribution and ℎ and are the theoretical counter parts of and defined in Theorem 2, that is = 1 ∕ 0 with: Remark 11. The power of the network comparison test is therefore controlled by the ratio ℎ ∕ √ . As expected, the power of the test based on the frequency of motif mostly depends on the departure between the expected frequencies under the null and under the alternative (that is and ). We observe that the power also depends on the the ratio between the variance under the alternative and the null (that is ).
Testing equal top nodes heterogeneity. Suppose we want to test that, although the two networks may have different dimensions, different densities, and different bottom node heterogeneity, they have the same top node heterogeneity, that is Since we allow the two networks to have different densities, one might normalize the probabilities of star motifs given in (5) as follows: 1 = 1,̃2 = 2 ∕ 2 1̃3 = 7 ∕ 3 1̃4 = 17 ∕ 4 1 ,̃5 = 44 ∕ 4 1 , 1 = 1,̃2 = 3 ∕ 2 1 ,̃3 = 4 ∕ 3 1̃4 = 8 ∕ 4 1 ,̃5 = 18 ∕ 4 1 . This allows to see that we can rewrite ( , , , , ) = as an expression of on which relies the test we consider. According to (6) and to the definition of under the B-EDD model, we get: where = ∫ ( ) and ℎ = ∫ ℎ( ) . We may consider the following test statistic: , whereΓ andΛ are the plug-in estimates of̃and̃respectively. Similar statistics can be designed to test = , ℎ = ℎ or any combination.
Illustration. Both the plant-pollinator and the seed dispersal networks involve plants species. Although these species are not the same, one may be interested in comparing if the level of heterogeneity across plants (encoded in the function ) is the same in both networks. From an ecological point of view, this amounts to test if there is the same the degree of imbalance between specialists and generalists among plants regarding pollination and seed dispersion, that are two of the main reproduction means. Table 4 provides the results of the network comparison test presented above. No significant difference is observed, suggesting that, although generalist and specialist plants may exist for both types of interactions (no assumption is made about the shape of ), the degree of imbalance between them is comparable ( ≃ ).  Table 2. Networks: = plant-pollinator, = seed dispersal.̂0(⋅) is a shorthand for the notation (⋯) (idem for̂0(⋅) and (⋯)).

Simulation study
We designed a simulation study to illustrate Theorem 1 and to assess the performance of the goodness-of-fit test and the comparison test described in Section 3.2 and Section 3.3 respectively. More specifically, our purpose is to illustrate the asymptotic normality of the test statistics and evaluate the power of the tests for various graph sizes, densities and sparsity regimes.

Asymptotic normality
Simulation design. We simulated series of networks with parameters ( , , , , ℎ ) varying according to the following design: Network dimension: We simulated networks with equal dimensions = , with values in {50, 100, 200, 500, 1000, 2000}; Sparsity regime: We considered equal parameters = in {1∕3, 1∕4, 1∕5, 1∕6}; Network density: The resulting density is = 0 − − , 0 being fixed so that = .01 when = = 100; Degree imbalance: We considered the functions ( ) = −1 and ℎ( ) = ℎ ℎ −1 ; observe that = 1 means that is constant so no imbalance does exist top nodes (resp. for ℎ , ℎ and bottom nodes). We set = 2, ℎ = 3. For each configuration, = 100 networks were sampled and the test applied. Figure 3 and Figure 4. In Figure 3, the Q-Q plots of the statistic (black dots) defined in (12) and thẽstatistic (blue dots) defined in (14) hereafter, are given for four motifs in a network with dimension = = 1000 and sparsity regime = = 1∕3. Remember that the larger the power , the sparser the graph. We observe that normality of holds for motifs 6 and 15, but not for motifs 5 and 10. Actually, the latter case is due to the fluctuations of . More specifically, for non-star motifs, is not an unbiased estimate of and it is not independent from . As a consequence, for finite dimensions and , we both have that ( ) ≠ = ( ) and ( − ) ≠ ( ). Both the bias of : ( ) = ( )− and the variance of the numerator of : ( − ) can be estimated using the delta method, which requires the covariance given in Equation (10). This enables us to define a corrected versioñof the test statistic :

Results. The results are displayed in
where the biaŝ( ) and̂( − ) are both plug-in estimates.
Illustration. We provide in Table 5 the values of corrected corrected statistics for the plant-pollinator and the seed dispersal networks, to be compared with Table 3. Observe that the correction does not yield in different conclusions, in terms of fit to the B-EDD model for both networks.  Figure 4 displays the Q-Q plots of the corrected test statistics̃gathered according to the order of magnitude of the expected motif frequencies. All network sizes, sparsity regimes and non-star motifs are thus considered here together. As expected, the normality becomes more accurate when the motifs frequency increases.

Power of the goodness-of-fit test
Simulation design. In order to illustrate the power of the goodness-of-fit test, we simulated a series of networks from a mixture of a B-EDD model and a latent block model (LBM: Govaert and Nadif, 2008), characterizing the presence of clusters of rows and columns in incidence matrices. Thus, a mixing weight varying from 0 to 1 was considered so that = 0 corresponds to a B-EDD that is 0 . In details, the following simulation setup was investigated: Network dimension and density: We considered dimensions similar to the pollination and seed dispersal binary networks studied in Simmons et al. (2019b), that is = ∈ {10 1 , … , 10 3 }. To mimic the sparsity of the same networks, we fitted the density via a linear regression and obtained log 10 ( ) = 0.3457 − 0.3958 log 10 ( ); B-EDD model: We used the same functions and ℎ as in Section 4.1, with = 2, ℎ = 3; LBM model: We considered 2 groups in rows and 2 groups in columns, all groups with proportion 1∕2 and all connection probabilities = min for all 1 ≤ , ≤ 2, except 22 = max , with set such that ( max + 3 min )∕4 = 1. Two regimes were considered: max = 0.95 (scenario I: easy) and max = 0.5 (scenario II: hard); For each configuration, = 500 networks were sampled and the test applied. Again the test corrected statistic̃was used. Note that Jin et al. (2018) and Gao and Lafferty (2017a) introduced goodness-offit tests for a symmetric version of B-EDD under a block-structured alternative (similar, but not equivalent to one we consider), using specific motifs ('graphlets' or V's and triangles). Results. The results are given in Figure 5. For illustration purposes, we only present the results we obtained for = ranging from 50 to 500. Moreover, for the sake of clarity, we only consider motifs 5, 6, 10, and 15 which constitute a representative panel of the set of motifs with size 4 and 5.
As the network dimensions increase, we can clearly observe that the tests become more powerful. For small networks with = = 50 and = = 100, the LBM regime with max = 0.95 is easier and leads to tests associated with motifs 5 and 6 with higher power. These differences vanish for larger values of and . Overall, we found that motifs 5 and 6 lead to more powerful tests. These results illustrate that the methodology proposed is relevant and that the goodness-offit tests for different motifs can be used to detect the departure from a B-EDD model.

Power of the network comparison test
Simulation design. We also studied the power of the test for network comparison introduced in Section 3.3. To this aim, we simulated series of networks with parameters ( , , , , ℎ ) varying according to the same design as in Section 4.1, where was set to 2. We focused on the test of 0 = { = } so, for each network , we simulated a sequence of networks with same dimensions ( = , = ), but a with a different parameter . More specifically, setting * = 1 (absence of degree imbalance between top nodes), we sampled networks with = (1 − ) + * , with = 0, 0.1, 0.2, … 1, so that = 0 corresponds to 0 . Regarding the two remaining parameters and ℎ , we considered two scenarios:

I (easy):
= , ℎ = ℎ , so that the two networks only differ with respect to ; II (hard): = ∕2, ℎ = 2, so that the two network differ in all parameters, but only the difference in is tested.
The 'hard' scenario is designed to assess the ability of the proposed test statistic to accommodate to differences in density and bottom node imbalance between the two networks, when testing the equality of their top node imbalance. For each configuration, = 500 pairs of networks ( , ) were sampled and compared. Following the simulation results presented in Section 4.1, we used the deltamethod to derive a corrected versioñof the test statistic defined in Equation (13). Similarly to Section 4.1, the performances of the uncorrected test statistic become similar to these of the corrected versioñfor large networks (results not shown).
Illustration. Again, to illustrate the effect of the proposed correction, we provide in Table 6 the values of corrected statistics̃testing 0 = { = }, network being plant-pollinator and network being seed dispersal. These results can be compared with Table 4: The correction yields in (moderately) higher absolute values, suggesting a gain of power. Table 6 Corrected test statistics̃for 0 = { = } for the same motifs as in Table 2 and same networks as in Table 4. Results. The results are displayed in Figure 6. We only present the results for = = = ranging for 50 to 500. Moreover, as in the previous section, we only consider motifs 5, 6, 10 and 15. As expected, the test becomes more powerful when the networks dimensions increase. More interestingly, for small networks, the smaller motifs (5 and 6, with size 4) turn out to yield a higher power. The difference vanishes when the dimensions increase. These conclusions hold under the two scenarios, which shows that the proposed test statistic does accommodate for departures that may exist between two networks, not being the departure under study (scenario II 'hard'). Still, the power is always better under scenario I: obviously, the test performs better when focusing on the only difference that actually exists (scenario I 'easy').

Definitions and technical lemmas
In this section, we introduce notations and useful technical lemmas for establishing proofs of Proposition 2 in Section 5.2, Lemma 1 in Section 5.3 and Lemma 2 in Section 5.4.

Definitions
Let us remind that we consider a bipartite graph  = (, ) with nodes. The set of nodes is  = ( ,  ), where  = J1, K (resp.  J1, K) stands for the set of top (resp. bottom) nodes, and the set of edges is  ⊂  ×  , meaning than an edge can only connect a top node with a bottom node. The total number of nodes is therefore = + . We denote by the corresponding × incidence matrix where the entry of is 1 if ( , ) ∈ , and 0 otherwise. Let us consider now a collection of bipartite graphs ( ) ∈J1, K = ( ,  ) with nodes. In the following, we introduce notations for subsets of interest and a filtration we will use to construct differences of martingales involving motif counts.

Subsets definitions. Let us introduce the following subsets definitions:
•  = {( 1 , … , ) ⊂  ∪  with at least one top node and one bottom node}, ∈ J2, K, it is the set of nodes of  meaning the selected nodes among , and denotes the -th and last selected one; we will use several times hereafter; • =  ∩  and =  ∩  , these are the sets of top and bottom nodes in  ; otherwise, it is the positions set of motif in  with the particularity that the last node added to  is part of motif .

Technical lemmas
We present here three lemmas which are key arguments in the proofs of Proposition 2, Lemma 1 and Lemma 2.
The following lemma gives the order of magnitude of the variance of a count. Before, its statement Let us give the order of magnitude of the expected count of a motif with top nodes and bottom nodes. It writes ( ) = , with where + stands for the total number of edges in and being defined in (3).
Proof. Let us observe that, for , ∈  , , Thus, a general form for the variance is the following: where  = { , ∈  , ∶ ∩ = ∅ } and  2 ( ) denotes the set of supermotifs of which are formed by two overlapping occurrences of .
Combining the orders of the three terms of assertion (18), we get that the order of magnitude of the variance of a count is ) .

■
The last argument of proof of Proposition 2, Lemma 7 and Lemma 1 relies on the following result. Proof. First let us write that The proof relies on showing the convergence in probability of the two above expectations towards , respectively. Let us now introduce the equivalence relation R and the set defined as follows: Then, we can exhibit the two following quantities which are two-samples U-Statistics (see Section 12.2, p.165 in van der Vaart (2000)): with and being defined in (2), (3), respectively,  denoting the location relative to a given position for motif and where , with 1 (·) and 2 (·) being permutation symmetric kernels in ( ) and ( ) separately. We conclude by applying the central limit theorem for two-sample U-Statistics (see Theorem 12.6 in van der Vaart (2000)) which holds under the assumption that the kernel of the U-statistic has a finite moment of order two.
Here, as it concerns probabilities this assumption is obviously fulfilled. ■ In proofs of Lemma 2, Lemma 6 and Lemma 7, we need to know the cardinal order of the sets  ⊗ , ⧵  ( ) , , = 2, 4 which contains only dependent -uplets of positions of motif on the event for which the last node added to  is a top node. Recall that  , is the positions set of motif in the subgraph of  with nodes in  and the particularity that the last node added to  is part of motif . The definition of the other set of interest is the following: Lemma 5. We have, on , with and denoting respectively top and bottom nodes in  .
Proof. Let us observe that The leader term of order Θ ) obviously vanishes and imply the lost of one order (the calculation omitted here are simply based on the same arguments as in (19)). ■

Proof of Proposition 2
For establishing the proof of Proposition 2, we first consider a decomposition of = − in Section 5.2.1, then we focus on the reminder term of this decomposition in Lemma 6 and finally show the asymptotic normality of the leading term in Lemma 7.

Decomposition of
Let us use the sets introduced in Section 5.1.1 to express as follows: with the random variables , of the B-EDD model (1). Then Let us decompose as the sum of two expressions, the first one corresponding to a martingale difference sequence relative to the filtration ( ) ∈J2, K , the second one being a term of rest: Observe that by construction, , = ∑ ∈ , { ( ) − ( ( )| −1 ; , )} is a conditional martingale difference with respect to ( ) ∈J2, K : ( , ( , )| −1 ; , ) = 0.

Lemma 6. Under the B-EDD model and condition
Proof. The proof consists in showing the two following assertions: → 0 almost surely as tends to infinity under condition + < 2∕ + .
Let us show assertion (A1): Let us focus now on assertion (A2). Let us first observe that, by independance of successive choices of  . Using definition (4) of the indicator motif, we see that Then according to measurability with respect to  −1 and the position (top or bottom) of the last selected node, we get , and using the usual notation of the conditional expectation of 's, we have Then, considering the fact that ( From now, we will work on the set  ⊗2 , ⧵  (2) , which contains only dependent pairs of positions. It follows from the Bernoulli conditional distribution of combined with the fact that .
In order to evaluate the right-hand side term of the above inequality, recall that = Θ( ) by (15), = Θ by (17) and by Lemma 5, and denoting respectively top and bottom nodes in  . Thus, we get By taking the normalization by Lemma 2 and (15), we conclude to → 0 almost surely as tends to infinity under condition + < 2∕ + . ■
Let us verify condition (C1). First observe that it follows from properties of martingale differences, meaning variance decomposition, null conditional expectation and conditional orthogonality of differences, that

Proof of Lemma 1
Proof. Let us show that ( − )∕ √ ( ) → 0 a.s. as → ∞ under the B-EDD model and condition + < 2∕ + ruling the graph density. Recall (8) the definition of : where Γ (resp Λ ) denote the normalized empirical frequencies of the top (resp bottom) star motif with degree and 1 the one of the edge. Let us begin with a Taylor expansion of order 1 of in parameters ( , , 1 ) denoting the top star motif, bottom star motif and edge probabilities respectively: ) .
Given the two following observations: i) the asymptotic normality of ( − )∕ √ ( ) holds for any motif , including star motifs, under the B-EDD model and condition + < 2∕ + by Proposition 2, ii) the empirical frequencies of motifs converge to the expected ones by the law of large numbers, we get Here and only here, Γ (resp. Λ ) and (resp. ) denote, by abuse of notation, the count of top stars (resp. bottom stars) of degree and their number of positions in the graph. Considering only non-star motifs , according to the orders of magnitude of , and ( ) given in (15), (17) and Lemma 3 respectively, we conclude to ( − )∕ √ ( ) → 0 a.s. as → ∞ because −2 ( + ) < 0, with = , or 1. ■