Detecting Tampering in a Random Hypercube

Consider the random hypercube $H_2^n(p_n)$ obtained from the hypercube $H_2^n$ by deleting any given edge with probability $1-p_n$, independently of all the other edges. A diameter path in $H_2^n$ is a longest geodesic path in $H_2^n$. Consider the following two ways of tampering with the random graph $H_2^n(p_n)$: (i) choose a diameter path at random and adjoin all of its edges to $H_2^n(p_n)$; (ii) choose a diameter path at random from among those that start at $0=(0,\ldots,0)$, and adjoin all of its edges to $H_2^n(p_n)$. We study the question of whether these tamperings are detectable asymptotically as $n\to\infty$.


Introduction and Statement of Results
Let $H_2^n=(V_n,e_n)$ denote the $n$-dimensional hypercube. Recall that the vertices $V_n$ of $H_2^n$ are identified with $\{0,1\}^n$, and an edge in $e_n$ connects two vertices if and only if they differ in exactly one component. Denote vertices by $\bar x=(x_1,\cdots,x_n)$. A geodesic path from $\bar x$ to $\bar y$ is a shortest path from $\bar x$ to $\bar y$. A diameter path in $H_2^n$ is a longest geodesic path in $H_2^n$. The set of diameter paths is the set of paths $\bar x_0\bar x_1\cdots\bar x_n$, where $\bar x_n=\bar 1-\bar x_0$ and consecutive vertices $\bar x_{j-1}$ and $\bar x_j$ are joined by an edge, $j=1,\cdots,n$. Let $H_2^n(p_n)$ denote the random hypercube obtained by starting with the graph $H_2^n$ and deleting any given edge with probability $1-p_n$, independently of all the other edges. Let $P_{n,p_n}$ denote the corresponding probability measure; $P_{n,p_n}$ is a measure on $E_n\equiv 2^{e_n}$, the space of all subsets of $e_n$. An element of $E_n$ will be called an edge configuration.
We consider two similar ways of tampering with the random hypercube.
The first way is to choose a diameter path from $H_2^n$ at random and adjoin it to $H_2^n(p_n)$; that is, we "add" to the random graph every edge of this diameter path that is not already in the random graph. Denote the induced measure on $E_n$ by $P^{\text{tam}}_{n,p_n}$. The second way is to consider $\bar 0\equiv(0,\cdots,0)$ as a distinguished vertex in the hypercube, and to adjoin to the random hypercube a diameter path chosen at random from among those diameter paths which start at $\bar 0$. Denote the induced measure on $E_n$ by $P^{\text{tam},0}_{n,p_n}$. Can one detect the tampering asymptotically as $n\to\infty$? Let $Q_n$ be generic notation for either $P^{\text{tam}}_{n,p_n}$ or $P^{\text{tam},0}_{n,p_n}$. Let $\|P_{n,p_n}-Q_n\|_{TV}$ denote the total variation distance between the probability measures $P_{n,p_n}$ and $Q_n$. If $\lim_{n\to\infty}\|P_{n,p_n}-Q_n\|_{TV}=1$, we call the tampering detectable. If $\lim_{n\to\infty}\|P_{n,p_n}-Q_n\|_{TV}=0$, we call the tampering strongly undetectable, while if $\{\|P_{n,p_n}-Q_n\|_{TV}\}_{n=1}^\infty$ is bounded away from 0 and 1, we call the tampering weakly undetectable.

The number of diameter paths in $H_2^n$ is easily seen to be $2^{n-1}n!$, while the number of diameter paths in $H_2^n$ that start from $\bar 0$ is $n!$. Let $m_n$ denote the number of diameter paths in either of these two cases. Numbering the diameter paths from 1 to $m_n$, let $O_{n,j}$ denote the set of edge configurations which contain the $j$-th diameter path. From the above description of the tampered measures $Q_n=P^{\text{tam}}_{n,p_n}$ or $Q_n=P^{\text{tam},0}_{n,p_n}$, it follows that $Q_n$ is the equally weighted average over $j=1,\cdots,m_n$ of the conditional measures $P_{n,p_n}(\cdot\mid O_{n,j})$. The following proposition, which we prove in the next section, shows that the tampered measure is in fact obtained from the original measure by size biasing with respect to the diameter counting function $N_n$.
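The two counts quoted above can be checked by brute force for small $n$. The following is a minimal sketch, not part of the original argument; the function names are illustrative, and a path is represented by its set of edges, so that a path and its reversal are counted once.

```python
from itertools import permutations, product

def path_edges(start, order):
    """Edge set of the path that starts at `start` and flips coordinates in `order`."""
    x = list(start)
    edges = set()
    for i in order:
        u = tuple(x)
        x[i] ^= 1
        edges.add(frozenset((u, tuple(x))))
    return frozenset(edges)

def diameter_paths(n):
    """All diameter paths of the n-cube, as edge sets (orientation ignored)."""
    return {path_edges(start, order)
            for start in product((0, 1), repeat=n)
            for order in permutations(range(n))}

def diameter_paths_from_zero(n):
    """Diameter paths that start at the all-zero vertex."""
    zero = (0,) * n
    return {path_edges(zero, order) for order in permutations(range(n))}
```

For instance, `len(diameter_paths(3))` can be compared with $2^{2}\cdot 3!$ and `len(diameter_paths_from_zero(3))` with $3!$.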
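Proposition 1. Let $Q_n$ denote either of the two tampered measures, and let $N_n$ denote the corresponding diameter counting function. Then
$$Q_n(\omega)=\frac{N_n(\omega)}{E_{n,p_n}N_n}\,P_{n,p_n}(\omega),\quad \omega\in E_n.$$
The following proposition is immediate in light of Proposition 1.
Proposition 2. Let $Q_n$ denote either of the two tampered measures, and let $N_n$ denote the corresponding diameter counting function. Then
$$\lim_{n\to\infty}\|P_{n,p_n}-Q_n\|_{TV}=0$$
if and only if the weak law of large numbers holds for $N_n$ under $P_{n,p_n}$; that is, if and only if
$$\lim_{n\to\infty}P_{n,p_n}\Big(\Big|\frac{N_n}{E_{n,p_n}N_n}-1\Big|>\epsilon\Big)=0,\ \text{for all}\ \epsilon>0.$$
The second moment method then yields the following corollary. Let $\text{Var}_{n,p_n}$ denote the variance with respect to $P_{n,p_n}$.
Corollary 1. Let $Q_n$ denote either of the two tampered measures, and let $N_n$ denote the corresponding diameter counting function.
i. If $\text{Var}_{n,p_n}(N_n)=o\big((E_{n,p_n}N_n)^2\big)$, then the tampering is strongly undetectable;
ii. If $\text{Var}_{n,p_n}(N_n)=O\big((E_{n,p_n}N_n)^2\big)$, then $\|P_{n,p_n}-Q_n\|_{TV}$ is bounded away from 1; thus the tampering is not detectable.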
Part (i) of the corollary of course follows from Chebyshev's inequality; we give a proof of part (ii) in section 2.
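The size-biasing statement of Proposition 1 can be verified exactly on a very small cube. The sketch below, with illustrative names and the assumption that configurations are encoded as bitmasks over the edges, builds the tampered measure for paths starting at 0 in two ways: directly from the tampering procedure, and by reweighting the untampered measure by the path count.

```python
from itertools import permutations

def cube_edges(n):
    """Edges (u, v) of the n-cube with vertices encoded as integers, v = u with one more bit set."""
    return [(u, u | (1 << i)) for u in range(1 << n) for i in range(n) if not (u >> i) & 1]

def path_masks_from_zero(n):
    """Each diameter path from 0 as a bitmask over the edge list."""
    index = {e: k for k, e in enumerate(cube_edges(n))}
    masks = []
    for order in permutations(range(n)):
        u, mask = 0, 0
        for i in order:
            v = u | (1 << i)
            mask |= 1 << index[(u, v)]
            u = v
        masks.append(mask)
    return masks

def untampered(n, p):
    """Probability of every edge configuration under independent edge retention."""
    m = len(cube_edges(n))
    return [p ** bin(w).count("1") * (1 - p) ** (m - bin(w).count("1")) for w in range(1 << m)]

def tampered_by_construction(n, p):
    """Pick a path uniformly, then add its edges to a random configuration."""
    P, paths = untampered(n, p), path_masks_from_zero(n)
    Q = [0.0] * len(P)
    for path in paths:
        for w, pw in enumerate(P):
            Q[w | path] += pw / len(paths)
    return Q

def tampered_by_size_bias(n, p):
    """Reweight the untampered measure by N / E[N], with N the number of paths contained."""
    P, paths = untampered(n, p), path_masks_from_zero(n)
    N = [sum(1 for path in paths if w & path == path) for w in range(len(P))]
    mean = sum(c * pw for c, pw in zip(N, P))
    return [c * pw / mean for c, pw in zip(N, P)]
```

The two lists agree up to floating-point error for any $p\in(0,1)$; the check is only feasible for $n\le 3$ or so, since the number of configurations is $2^{n2^{n-1}}$.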
We will prove the following result.
b. i. If $p_n\le\frac{\gamma}{n}$, with $\gamma<e$, then the tampering is detectable; furthermore, the distribution of $N_n^{\text{diam},0}$ under $P_{n,p_n}$ converges to the δ-distribution at 0; ii. If $p_n\ge\frac{\gamma}{n}$, with $\gamma>e$, and $\limsup_{n\to\infty}np_n<\infty$, then the tampering is weakly undetectable; in particular, the distribution of $N_n^{\text{diam},0}$ under $P_{n,p_n}$ does not satisfy the law of large numbers; iii. If $\lim_{n\to\infty}np_n=\infty$, then the tampering is strongly undetectable; equivalently, the distribution of $N_n^{\text{diam},0}$ under $P_{n,p_n}$ satisfies the law of large numbers.
Remark. If under $P_{n,p_n}$ the distribution of $N_n$ converges to the δ-distribution at 0, then the tampering is detectable, since under the tampered measure one has $N_n\ge1$ a.s. By Proposition 1, if the tampering is strongly undetectable, then the distribution of $N_n$ must converge to the δ-distribution at ∞. Naive intuition might suggest that for a tampering problem of the above type, the above two statements should be if and only if statements, except perhaps in some narrow bifurcation region between two regimes. Theorem 1 shows that this is indeed the case for the tampering problem under consideration. (The proof of the theorem will reveal that in case (b-ii), the distribution of $N_n^{\text{diam},0}$ converges neither to the δ-distribution at 0 nor to the δ-distribution at ∞.) However, we now point out two examples of similar tampering problems where this intuition fails.
Example 1. Let $G(n)$ be the complete graph on $n$ vertices, and let $G(n,p_n)$ be the Erdős-Rényi random graph with edge probabilities $p_n$; that is, $G(n,p_n)$ is obtained from $G(n)$ by deleting any particular edge with probability $1-p_n$, independently of all the other edges. Let $P_{n,p_n}$ denote the corresponding probability measure on edge configurations. As above, denote the space of all edges by $e_n$ and the space of all possible edge configurations by $E_n$. Recall that a Hamiltonian path in $G(n)$ is a path that traverses each of the vertices of the graph exactly once; that is, a path of the form $x_1x_2\cdots x_n$, where the $x_i$ are all distinct. Tamper with the random graph by choosing at random a Hamiltonian path from $G(n)$ and adjoining it to $G(n,p_n)$; that is, "add" to the random graph every edge of this Hamiltonian path that is not already in the random graph. Call the induced measure $P^{\text{ham}}_{n,p_n}$. The number of Hamiltonian paths in Hamiltonian paths, we obtain the result above for any $k$.) The above result shows in particular that under $P_{n,p_n}$, the distribution of the Hamiltonian path counting function $N_n^{\text{ham}}$ converges to the δ-distribution at ∞ if $p_n$ is as above with $\lim_{n\to\infty}\omega_n=\infty$. The naive intuition noted in the remark after Theorem 1 would suggest that the tampering in this case would be strongly undetectable. After all, how much can one additional Hamiltonian path be felt in such a situation? However, we now demonstrate easily that whenever $\lim_{n\to\infty}p_n=0$, the tampering is detectable, while whenever $p_n\equiv p\in(0,1)$ is constant, the tampering is not strongly undetectable. (In fact, it is weakly undetectable, but we will not show that here.) In light of Proposition 1, this also shows that when $p_n\equiv p\in(0,1)$ is constant, the weak law of large numbers does not hold for $N_n^{\text{ham}}$, a fact that has been pointed out by Janson [3], where many additional results concerning $N_n^{\text{ham}}$ can be found.
Label the edges of $G(n)$ from 1 to $|e_n|$. The random graph $G(n,p_n)$ with probability measure $P_{n,p_n}$ is constructed by considering a collection $\{B_j\}_{j=1}^{|e_n|}$ of IID Bernoulli random variables taking on the values 1 and 0 with respective probabilities $p_n$ and $1-p_n$, and declaring the $j$-th edge to exist if and only if $B_j=1$. Note that the variances of the number of edges under the untampered and the tampered measures are on the same order, since $|e_n|$ is on the order $n^2$. Let $SD_n\equiv\sqrt{|e_n|p_n(1-p_n)}$ denote the standard deviation under the untampered measure. Using the central limit theorem, it is easy to show that if $\Delta\text{Exp}_n$ is on a larger order than $SD_n$, then the tampering is detectable, while if $\Delta\text{Exp}_n$ is on the same order as $SD_n$, then the tampering is not strongly undetectable. In the case that $\lim_{n\to\infty}p_n=0$, we have $\Delta\text{Exp}_n$ on the order $n$ and $SD_n=o(n)$, while in the case that $p_n=p\in(0,1)$ is constant, we have both $\Delta\text{Exp}_n$ and $SD_n$ on the order $n$.
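The comparison between the shift in the mean edge count and its standard deviation is easy to tabulate. The sketch below assumes $|e_n|=n(n-1)/2$ for the complete graph and uses the shift $(1-p_n)(n-1)$ stated in the text; the function name is illustrative.

```python
from math import sqrt

def edge_count_shift(n, p):
    """Return (mean shift, untampered standard deviation, their ratio) for the edge count."""
    num_edges = n * (n - 1) // 2          # edges of the complete graph on n vertices
    delta_exp = (1 - p) * (n - 1)         # increase in the mean caused by the tampering
    sd = sqrt(num_edges * p * (1 - p))    # standard deviation under the untampered measure
    return delta_exp, sd, delta_exp / sd
```

For constant $p$ the ratio settles near $\sqrt{2(1-p)/p}$, while for a sequence such as $p_n=n^{-1/2}$ it grows with $n$, which matches the two regimes described above.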
Example 2. Consider a random permutation $\sigma\in S_n$ as a row of $n$ cards labeled from 1 to $n$ and laid out from left to right in random order. Now tamper with the cards as follows. Select $k_n$ of the cards at random, remove them from the row, and then replace them in the vacated spaces in increasing order. Let $U_n$ denote the uniform measure on $S_n$, that is, the measure corresponding to a "random permutation," and let $U_n^{\text{incsubseq},k_n}$ denote the measure on $S_n$ induced from $U_n$ by the above tampering. Note that by construction, a permutation $\sigma\in S_n$ will have an increasing subsequence of length $k_n$ with $U_n^{\text{incsubseq},k_n}$-probability 1. On the other hand, the celebrated result concerning the length of the longest increasing subsequence in a random permutation ([5], [8], [1]) states that the $U_n$-probability of there being an increasing subsequence of length $cn^{1/2}$ goes to 0 as $n\to\infty$, if $c>2$; thus the tampering is detectable if $k_n\ge cn^{1/2}$, with $c>2$. The above-mentioned result also states that the $U_n$-probability of there being an increasing subsequence of length $cn^{1/2}$ goes to 1 as $n\to\infty$, if $c<2$. From this it follows that for $k_n\le cn^{1/2}$, $c<2$, the distribution of the number of increasing subsequences of length $k_n$, which we denote by $N_n^{\text{incr},k_n}$, converges to the δ-distribution at ∞ as $n\to\infty$. The naive intuition in the remark after Theorem 1 would suggest that one can tamper on the order $k_n$ without detection, if $k_n\le cn^{1/2}$ with $c<2$; after all, how much can one additional increasing subsequence be felt in such a situation? However, this turns out to be false. In [6], it was shown that 5 and in [7] it was shown that lim n→∞ ||U n − U incsubseq,kn . So in the former case the tampering is strongly undetectable and in the latter case it is detectable.
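The card tampering in Example 2 is simple to simulate. The following sketch uses an illustrative function name and returns the chosen positions alongside the tampered row, so that the planted increasing subsequence can be inspected.

```python
import random

def tamper_permutation(perm, k, rng=random):
    """Remove k randomly chosen cards and put them back into the vacated slots in increasing order."""
    positions = sorted(rng.sample(range(len(perm)), k))   # slots that are vacated
    values = sorted(perm[i] for i in positions)           # removed cards, sorted
    out = list(perm)
    for pos, val in zip(positions, values):
        out[pos] = val
    return out, positions
```

After the call, the entries of the returned row at the returned positions form an increasing subsequence of length `k`, and the row is still a permutation of the original cards.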
In section 2 we give the proof of Proposition 1 and of part (ii) of Corollary 1. In section 3 we prove Theorem 1. The proofs of parts (a-i) and (b-i) are almost immediate using the first moment method. The proofs of parts (a-ii), (b-ii) and (b-iii) use the second moment method and involve some quite nontrivial computations, some of which may be interesting in their own right.

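Proof of Proposition 1 and Corollary 1-ii.
Proof of Proposition 1. Let $\omega\in E_n$. Then we have $P_{n,p_n}(\omega\mid O_{n,j})=\frac{1_{O_{n,j}}(\omega)}{P_{n,p_n}(O_{n,j})}P_{n,p_n}(\omega)$, and since the $O_{n,j}$ have the same $P_{n,p_n}$-probabilities for all $j$, we have $E_{n,p_n}N_n=m_nP_{n,p_n}(O_{n,1})$. Using these facts along with the definition of $Q_n$ in (1.1) we have
$$Q_n(\omega)=\frac{1}{m_n}\sum_{j=1}^{m_n}\frac{1_{O_{n,j}}(\omega)}{P_{n,p_n}(O_{n,1})}P_{n,p_n}(\omega)=\frac{N_n(\omega)}{E_{n,p_n}N_n}P_{n,p_n}(\omega).$$
Proof of Corollary 1-ii. Let $Y_n=\frac{N_n}{E_{n,p_n}N_n}$. Using Proposition 1 along with an alternative equivalent definition of the total variation distance, we have
$$\|Q_n-P_{n,p_n}\|_{TV}=E_{n,p_n}(1-Y_n)^+,$$
where $a^+=a\vee0$. From this it follows that $\lim_{n\to\infty}\|Q_n-P_{n,p_n}\|_{TV}=1$ if and only if $\lim_{n\to\infty}P_{n,p_n}(Y_n>\epsilon)=0$, for all $\epsilon>0$. By the assumption in part (ii) of the corollary, $E_{n,p_n}Y_n^2\le M$ for some $M$ and all $n$. For every $\epsilon>0$, we have, by the Cauchy-Schwarz inequality,
$$1=E_{n,p_n}Y_n\le\epsilon+E_{n,p_n}(Y_n;Y_n>\epsilon)\le\epsilon+M^{\frac12}\big(P_{n,p_n}(Y_n>\epsilon)\big)^{\frac12}.$$
From this it is not possible that $\lim_{n\to\infty}P_{n,p_n}(Y_n>\epsilon)=0$, if $\epsilon<1$.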
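The "alternative equivalent definition" of total variation used in the proof can be illustrated on any finite probability vector. The sketch below is generic and uses illustrative names: `P` is a probability vector and `N` a nonnegative weight with positive mean, and the size-biased measure is built from them as in Proposition 1.

```python
def tv_two_ways(P, N):
    """Total variation between P and its size-biased version, computed by two formulas."""
    mean = sum(p * c for p, c in zip(P, N))
    Y = [c / mean for c in N]                         # density of the size-biased measure
    Q = [p * y for p, y in zip(P, Y)]
    half_l1 = 0.5 * sum(abs(p - q) for p, q in zip(P, Q))
    positive_part = sum(p * max(1.0 - y, 0.0) for p, y in zip(P, Y))   # E (1 - Y)^+
    return half_l1, positive_part
```

Both returned values coincide, because the positive and negative parts of $P-Q$ carry the same mass when $Q$ is a probability measure.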

Proof of Theorem 1
We begin with the quick proofs of (a-i) and (b-i).

Proof of (a-i).
There is a two-to-one correspondence between $H_2^n\times S_n$ and diameter paths in $H_2^n$. Indeed, for $\bar x\in H_2^n$ and $\sigma\in S_n$, we begin the diameter path at $\bar x$ and use the permutation $\sigma$ to determine the order in which we change the components of $\bar x$. (The correspondence is two-to-one because the diameter path is not oriented.) In particular there are $2^{n-1}n!$ diameter paths. The probability that any particular diameter path is contained in the random hypercube $H_2^n(p_n)$ is $p_n^n$; thus we have
(3.1) $E_{n,p_n}N_n^{\text{diam}}=2^{n-1}n!\,p_n^n$.
From this and Stirling's formula it follows that $\lim_{n\to\infty}E_{n,p_n}N_n^{\text{diam}}=0$, if $p_n\le\frac{\gamma}{n}$, with $\gamma<\frac{e}{2}$. Thus, for such $p_n$, $N_n^{\text{diam}}$ under $P_{n,p_n}$ converges to the δ-distribution at 0 as $n\to\infty$, from which it follows that the tampering is detectable.
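The threshold $\gamma=e/2$ in the first moment computation can be seen numerically. The sketch below evaluates the logarithm of $2^{n-1}n!\,p_n^n$ at $p_n=\gamma/n$, using `lgamma` to avoid overflow; the function name is illustrative.

```python
from math import lgamma, log

def log_mean_diam(n, gamma):
    """log of 2^(n-1) * n! * (gamma / n)^n, the expected number of diameter paths."""
    return (n - 1) * log(2) + lgamma(n + 1) + n * log(gamma / n)
```

For $\gamma$ below $e/2\approx1.359$ the value decreases to $-\infty$ as $n$ grows, while for $\gamma$ above $e/2$ it increases to $+\infty$.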

Proof of (b-i).
There is a one-to-one correspondence between $S_n$ and diameter paths that start at $\bar 0$. The probability that any particular diameter path is contained in the random hypercube $H_2^n(p_n)$ is $p_n^n$; thus we have
(3.2) $E_{n,p_n}N_n^{\text{diam},0}=n!\,p_n^n$.
From this and Stirling's formula it follows that $\lim_{n\to\infty}E_{n,p_n}N_n^{\text{diam},0}=0$, if $p_n\le\frac{\gamma}{n}$, with $\gamma<e$. As in part (a-i), it then follows that the tampering is detectable.
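The counting function for diameter paths starting at 0 can be evaluated on a sampled random hypercube by dynamic programming over vertices, since such a path only ever switches coordinates from 0 to 1. This is a minimal sketch with illustrative names; vertices are encoded as bitmasks and an open edge is stored as the pair (lower endpoint, flipped coordinate).

```python
import random

def sample_open_edges(n, p, rng):
    """Keep each edge of the n-cube independently with probability p."""
    return {(u, i) for u in range(1 << n) for i in range(n)
            if not (u >> i) & 1 and rng.random() < p}

def count_paths_from_zero(n, open_edges):
    """Number of diameter paths from 0 to the all-ones vertex that use only open edges."""
    ways = [0] * (1 << n)
    ways[0] = 1
    for v in range(1, 1 << n):            # every predecessor of v is numerically smaller than v
        for i in range(n):
            u = v ^ (1 << i)
            if (v >> i) & 1 and (u, i) in open_edges:
                ways[v] += ways[u]
    return ways[-1]
```

Averaging `count_paths_from_zero` over many samples gives a Monte Carlo estimate that can be compared with $n!\,p^n$.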
By Corollary 1, to prove (a-ii) it suffices to show that
(3.3) $\text{Var}_{n,p_n}(N_n^{\text{diam}})=o\big((E_{n,p_n}N_n^{\text{diam}})^2\big)$
if $p_n$ is as in (a-ii), and to prove (b-iii) it suffices to show that
(3.4) $\text{Var}_{n,p_n}(N_n^{\text{diam},0})=o\big((E_{n,p_n}N_n^{\text{diam},0})^2\big)$
if $p_n$ is as in (b-iii). With regard to (b-ii), note that under the untampered measure, the probability that $\bar 0$ is an isolated vertex is $(1-p_n)^n$. If $p_n$ is as in (b-ii), then this probability stays bounded away from 0. On the other hand, under the tampered measure, the probability that $\bar 0$ is isolated is 0. Thus, in the case of (b-ii), the tampering cannot be strongly undetectable. Thus, by Corollary 1, to complete the proof that the tampering is weakly undetectable, it suffices to show that
(3.5) $\text{Var}_{n,p_n}(N_n^{\text{diam},0})=O\big((E_{n,p_n}N_n^{\text{diam},0})^2\big)$
if $p_n$ is as in (b-ii). We now give the long and involved proof of (3.3) to prove (a-ii). After that we will only need a single long paragraph to describe the changes required to prove (3.4) and (3.5), which are a bit less involved.
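The diameter paths are labeled from 1 to $m_n=2^{n-1}n!$, and we have defined $O_{n,j}$ to be the set of edge configurations which contain the $j$-th diameter path. We relabel for convenience. Let $O_{\bar x,\sigma}$ denote the set of edge configurations which contain the diameter path corresponding to $(\bar x,\sigma)$ in the above two-to-one correspondence. Then we have
(3.6) $E_{n,p_n}(N_n^{\text{diam}})^2=\frac14\sum_{\bar x,\bar y\in H_2^n,\ \sigma,\tau\in S_n}P_{n,p_n}(O_{\bar x,\sigma}\cap O_{\bar y,\tau})$.
By symmetry considerations, letting id denote the identity permutation and letting $\bar 0\in H_2^n$ denote the element with zeroes in all of its coordinates, we have
(3.7) $\sum_{\bar x,\bar y\in H_2^n,\ \sigma,\tau\in S_n}P_{n,p_n}(O_{\bar x,\sigma}\cap O_{\bar y,\tau})=2^nn!\sum_{\bar x\in H_2^n,\ \sigma\in S_n}P_{n,p_n}(O_{\bar 0,\text{id}}\cap O_{\bar x,\sigma})$.
Let $W_n(\bar x,\sigma)$ denote the number of edges that the diameter path corresponding to $(\bar x,\sigma)$ has in common with the diameter path corresponding to $(\bar 0,\text{id})$.
Then we have
(3.8) $P_{n,p_n}(O_{\bar 0,\text{id}}\cap O_{\bar x,\sigma})=p_n^{2n-W_n(\bar x,\sigma)}$.
Letting the generic $E$ denote the expectation with respect to the uniform measure on $H_2^n\times S_n$, it then follows from (3.1) and (3.6)-(3.8) that
(3.9) $\frac{E_{n,p_n}(N_n^{\text{diam}})^2}{(E_{n,p_n}N_n^{\text{diam}})^2}=E\,p_n^{-W_n}$.
Since $E\,p_n^{-W_n}\le1+E(p_n^{-W_n};W_n\ge1)$, if we show that
(3.10) $\lim_{n\to\infty}E(p_n^{-W_n};W_n\ge1)=0$,
then it will follow from (3.9) that (3.3) holds.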
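The overlap count $W_n$ enters through the identity $E N^2/(EN)^2=E\,p^{-W}$, which follows from expanding $N^2$ as a double sum over pairs of paths. The sketch below checks this identity by exhaustive enumeration in the simpler setting of paths starting at 0, where $W$ is the overlap of a uniformly chosen path with the identity path. The names are illustrative, and the check is only feasible for very small $n$.

```python
from itertools import permutations

def path_masks_from_zero(n):
    """Diameter paths from 0 as bitmasks over the edges of the n-cube."""
    edges = [(u, u | (1 << i)) for u in range(1 << n) for i in range(n) if not (u >> i) & 1]
    index = {e: k for k, e in enumerate(edges)}
    masks = []
    for order in permutations(range(n)):   # the identity order comes first
        u, mask = 0, 0
        for i in order:
            v = u | (1 << i)
            mask |= 1 << index[(u, v)]
            u = v
        masks.append(mask)
    return masks

def ratio_by_enumeration(n, p):
    """E[N^2] / (E[N])^2 computed over all edge configurations."""
    paths, m = path_masks_from_zero(n), n * 2 ** (n - 1)
    first = second = 0.0
    for w in range(1 << m):
        k = bin(w).count("1")
        prob = p ** k * (1 - p) ** (m - k)
        count = sum(1 for path in paths if w & path == path)
        first += count * prob
        second += count * count * prob
    return second / first ** 2

def ratio_by_overlap(n, p):
    """E[p^(-W)], with W the number of edges shared with the identity path."""
    paths = path_masks_from_zero(n)
    ref = paths[0]
    return sum(p ** (-bin(path & ref).count("1")) for path in paths) / len(paths)
```

For $n=2$ both functions return $\frac12+\frac{1}{2p^2}$, since the two paths from 0 share no edges.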
We now estimate P (W n ≥ m), for m ≥ 1, where P denotes the probability corresponding to the expectation E. In fact, in the quite involved estimate that follows, it will be convenient to assume that m ≥ 2; one can show that the estimate obtained below in We now estimate We first determine for which σ ∈ S n one has that ( 0, σ) ∈ A l 1 ,••• ,lm ; this result will be needed for the general case of determining which (x, σ) belong to A l 1 ,••• ,lm .We will say that [j] is a sub-permutation of σ if σ maps [j] onto itself.A moment's thought reveals that the edge e j belongs to the diameter path ( 0, σ) if and only if both [j − 1] and [j] are sub-permutations for σ.
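The sub-permutation criterion can be confirmed by brute force for small $n$. The sketch below assumes that $e_j$ denotes the $j$-th edge of the path $(\bar 0,\text{id})$, that is, the edge joining the vertex whose first $j-1$ coordinates are 1 to the vertex whose first $j$ coordinates are 1; this reading is an assumption, since the definition of $e_j$ is not visible in the surrounding text. Vertices are represented by the set of coordinates equal to 1.

```python
from itertools import permutations

def path_edges(sigma):
    """Edges of the diameter path from 0 that switches on coordinates in the order sigma (1-based)."""
    on, edges = frozenset(), set()
    for c in sigma:
        nxt = on | {c}
        edges.add((on, nxt))
        on = nxt
    return edges

def is_sub_permutation(sigma, j):
    """True if sigma maps {1, ..., j} onto itself."""
    return set(sigma[:j]) == set(range(1, j + 1))

def criterion_holds(n):
    """Check, for every sigma and j, that e_j lies on the path iff [j-1] and [j] are sub-permutations."""
    for sigma in permutations(range(1, n + 1)):
        edges = path_edges(sigma)
        for j in range(1, n + 1):
            e_j = (frozenset(range(1, j)), frozenset(range(1, j + 1)))
            expected = is_sub_permutation(sigma, j - 1) and is_sub_permutation(sigma, j)
            if (e_j in edges) != expected:
                return False
    return True
```

Under the stated reading of $e_j$, `criterion_holds(n)` returns `True` for each small `n` one cares to try.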
Thus, ( 0, σ) [l m ] are all sub-permutations of σ.The number of permutations σ ∈ S n for which this holds is easily seen to be (l We now consider when (x, σ) ∈ A l 1 ,••• ,lm for general x.It is not hard to see that a necessary condition for (x, σ) We will refer to these two conditions on x by K 0;l 1 ,lm and K 1;l 1 ,lm .
If one of these two conditions on x is satisfied, then in order to have , the following conditions are required on σ.Recall that σ gives the order in which the n coordinates of x are changed so that the diameter path moves from x to 1 − x.So if σ = (σ 1 , • • • , σ n ), then the j-th edge in the diameter path will involve changing the σ j -th coordinate.Let Then it is not hard to see that the first r l 1 ,lm (x) coordinates in σ must be reserved for B 0;l 1 (x) ∪ C 1;lm (x); that is, {σ 1 , • • • , σ r l 1 ,lm (x) } = B 0;l 1 (x) ∪ C 1;lm (x).Let x 1;j denote the vertex in H n 2 whose first j components are 1 and whose remaining components are 0. Of course, this vertex belongs to the diameter path ( 0, id).If σ is as above, then the r l 1 ,lm (x)-th vertex of the diameter path (x, σ) will be x 1;l 1 −1 (x) if x satisfies condition K 0;l 1 ,lm , and will be x 1;lm if x satisfies condition K 1;l 1 ,lm .
In the former case, we must then have σ r l 1 ,lm (x)+1 = l 1 , and in the latter case, we must then have σ r l 1 ,lm (x)+1 = l m .In the former case, the (r l 1 ,lm (x)+1)-th vertex of the diameter path (x, σ) will be x 1;l 1 (x) and the r l 1 ,lm (x))-th edge will be e l 1 , and in the latter case, the (r l 1 ,lm (x)+1)-th vertex of the diameter path (x, σ) will be x 1;lm−1 (x) and the r l 1 ,lm (x))-th edge will be e lm .(Recall that a diameter path has n + 1 vertices.) If x satisfies condition K 0;l 1 ,lm , then the next l m − l 1 coordinates of σ must involve the numbers (l 1 + 1, l 1 + 2, • • • , l m ), and must move the diameter path (x, σ) from the vertex x 1;l 1 (x) to the vertex x 1;lm (x) while passing through the edges e l 2 , • • • , e lm .Based on our analysis above, for this to happen one requires that (σ Similarly, if x satisfies condition K 1;l 1 ,lm , then the next l m − l 1 coordinates of σ must involve the numbers (l 1 + 1, l 1 + 2, • • • , l m ), and must move the diameter path (x, σ) from the vertex x 1;lm (x) to the vertex x 1;l 1 (x) while passing through the edges e l m−1 , • • • , e l 1 .Inverting the direction of our analysis above, for this to happen one requires that Then finally, the last n − r l 1 ,lm (x) − 1 − (l m − l 1 ) coordinates of σ can be chosen arbitrarily from the remaining numbers.Putting the above all together, we obtain (3.13) (Given that x satisfies condition K 0;l 1 ,lm or condition K 1;l 1 ,lm , there are where the last inequality follows from the fact that the fraction in the sum above is always less than 1.After completing the current proof, we will prove the following proposition.
Proposition 3.For every δ > 0, there exist a c δ > 0 and an r δ ≥ 0 such From Proposition 3, it follows that for any δ > 0, there exists a c δ > 0 and an r δ ≥ 0 such that (Note that in the sum above, the last subscript, , is strictly less than the superscript l m −l 1 , whereas in the sum in Proposition 3 the last subscript, l m in T n m;l 1 ,••• ,lm , can attain the value n of the superscript; however, this is no problem since the inequality goes in the right direction.)Now (3.13), (3.14) and (3.15) give (3.16) Now summing over l 1 and l m , and denoting k = l m − l 1 + 1, we have (3.17) It is easy to check that h is increasing, which implies that ρ is convex.Thus, ρ attains its maximum at an endpoint.We conclude that the . Using Stirling's formula, it is easy to check that there exists a K such that (n−m+1)!n! ≤ K( e n ) m−1 .Using these facts in (3.17), we obtain (3.18) Using (3.18) We may choose δ > 0 as small as we like in (3.20).For γ > e 2 , choose δ so that (1+δ)e , j ≥ 0, m ≥ 1.
We have and then using (3.21), Continuing in this vein, we obtain In light of (3.22), to complete the proof of Proposition 3, it suffices to show that for every δ > 0, there exist a c δ > 0 and an r δ ≥ 0 such that Let n 0 ≥ 1, and for n > n 0 write We need the following lemma whose proof we defer until the completion of the proof of the proposition.
Lemma 1.For each n there exists a constant c n such that From (3.24) and (3.25), it follows that for each n 0 there exists a constant It is easy to see that (3.28) lim we have from (3.26) and (3.27) that (3.29) however, all we need for our purposes is that this quantity is bounded, and this is very easy to see.Thus, we have It is easy to show that if {x j } k j=0 satisfies the recursive inequalities x 0 ≤ 8 3 and By (3.28), for any δ > 0, there exists an n 0 such that d n 0 ≤ 1 + δ.Using this with (3.31), and using (3.25) with n ≤ n 0 , one concludes that (3.23) holds.
This completes the proof of the proposition.
We now return to prove Lemma 1.
Proof of Lemma 1. Fix n ≥ 1.Let B denote the n × n matrix with entries b ij , 1 ≤ i, j ≤ n, given by b ij = 1 ( i j ) , for i ≥ j, and b ij = 0, for j > i.Let v 0 denote the n-vector with entries v 0 j , 1 ≤ j ≤ n, given by v 0 j = C where v l n is the n-th coordinate of v l .The lemma follows immediately from (3.34).
We have now completed the proof of (3.3), and thus the proof of (a-ii).
To complete the proof of (b-ii) and (b-iii) we need to prove (3.4) and (3.5).
In fact all the work has been done in the above proof. The proof up to (3.11) is the same as before, except that now we work with the space $S_n$ instead of with $H_2^n\times S_n$. In particular then, we now have $W_n=W_n(\sigma)$, and it denotes the number of edges that the diameter path starting from $\bar 0$ and corresponding to $\sigma$ has in common with the diameter path starting from $\bar 0$ and corresponding to id. Similarly, A From (3.37), it follows that as $n\to\infty$, $E(p_n^{-W_n};W_n\ge1)$ converges to 0 if $p_n$ is as in (b-iii), and remains bounded if $p_n$ is as in (b-ii). Thus, it follows from (3.9) that (3.4) holds if $p_n$ is as in (b-iii) and that (3.5) holds if $p_n$ is as in (b-ii).
(1.1) $Q_n=\frac{1}{m_n}\sum_{j=1}^{m_n}P_{n,p_n}(\cdot\mid O_{n,j})$. Let $N_n^{\text{diam}}:E_n\to\{0,1,\cdots,m_n\}$ denote the number of diameter paths in an edge configuration, and let $N_n^{\text{diam},0}:E_n\to\{0,1,\cdots,m_n\}$ denote the number of diameter paths starting from $\bar 0$ in an edge configuration. Let $N_n$ be generic notation for either $N_n^{\text{diam}}$ or $N_n^{\text{diam},0}$. We refer to $N_n$ as the diameter counting function.
Let $N_n^{\text{ham}}$ denote the number of Hamiltonian paths in an edge configuration; we call $N_n^{\text{ham}}$ the Hamiltonian path counting function. Quite sophisticated graph theoretical techniques along with probabilistic analysis have yielded the following beautiful result: if $p_n=\frac{\log n+\log\log n+\omega_n}{n}$, then $\lim_{n\to\infty}P_{n,p_n}(N_n^{\text{ham}}\ge k)=1$, for all $k$, if $\lim_{n\to\infty}\omega_n=\infty$; $\lim_{n\to\infty}P_{n,p_n}(N_n^{\text{ham}}=0)=1$, if $\lim_{n\to\infty}\omega_n=-\infty$. (See [4] and [2, chapter 7 and references]. In fact these references treat Hamiltonian cycles. With regard to the case that $\lim_{n\to\infty}\omega_n=\infty$, it is shown that the limit above holds for Hamiltonian cycles when $k=1$. Since any Hamiltonian cycle can be cut open in $n$ possible locations, yielding $n$
Let $N_n^{\text{edges}}$ count the number of edges present in an edge configuration. So under $P_{n,p_n}$, one has that $N_n^{\text{edges}}$ is the sum of IID random variables: $N_n^{\text{edges}}=\sum_{j=1}^{|e_n|}B_j$. The expected value of $N_n^{\text{edges}}$ under the measure $P_{n,p_n}$ is $|e_n|p_n$. Now the tampering involved selecting $n-1$ edges from $e_n$ and demanding that they exist in the tampered graph. Thus, the expected value of $N_n^{\text{edges}}$ under the tampered measure $P^{\text{ham}}_{n,p_n}$ is $(|e_n|-(n-1))p_n+(n-1)$. The increase in the mean of $N_n^{\text{edges}}$ when using the tampered measure instead of the original one is thus equal to $(1-p_n)(n-1)$. We denote this change in mean by $\Delta\text{Exp}_n$. The variance of $N_n^{\text{edges}}$ under the untampered measure is
Theorem 1. a. Consider the random hypercube $H_2^n(p_n)$ and tamper with it by adding a random diameter path. Let $N^{\text{diam}}$ i. If $p_n\le\gamma$