Conditioning Efron's biased coin design to ensure final balance

Abstract: This paper derives a new randomization procedure by conditioning Efron's (1971) biased coin design to a prespecified final balance. The new procedure remains a function of the original bias parameter, which now controls the probability of intermediate balance rather than final balance. As the sample size increases, the design's selection bias and intermediate balance are similar to those of the original biased coin, but unlike the original biased coin it always guarantees final balance. It is also shown that the permuted block design for equal allocation is a special case of the new procedure when used in blocks. The latter can substitute for the permuted blocks with the added benefit of reducing the expected number of deterministic assignments. The new design is also noteworthy since it shows that a randomization procedure with new properties can be obtained by conditioning an existing one to a subset of its allocation space. New relationships among existing designs can be established in the process, further elucidating the protean nature of randomization.


Introduction
Efron's biased coin design is the oldest restricted randomization procedure proposed to mediate between balance and randomness of treatment assignments in clinical trials. A balanced design, which assigns an equal number of subjects across treatment groups, is optimal under a series of classical statistical models since it minimizes the variance of the treatment effect estimators and maximizes the power of the associated hypothesis tests [1]. An unpredictable allocation procedure is desirable to minimize bias in estimation, particularly with open label trials [1]. Randomness of an allocation procedure can be described by the expected number of deterministic assignments and the selection bias, which is the bias due to the intentional guessing of the treatment allocations by the investigator [5,12,16]. Balance and randomness of allocation procedures, although both desirable features, are typically attained at the expense of one another through a trade-off [1,16].

Fig 1. The allocation probabilities are indicated on the arrows. All sequences start from the node (0, 0) and end at (8, 4). The first element in each node summarizes the number of subjects allocated so far, and the second number tallies how many of these have been randomized to the treatment group.

On one extreme, complete randomization is unpredictable since it randomizes subjects using a fair coin, but it has a nonnegligible probability of imbalance. On the other, the permuted block design eliminates unbalanced designs but is highly predictable. The permuted block design employs equiprobable randomization subsequences of length 2b, i.e., blocks of size 2b that assign b subjects to treatment and b subjects to control at a time [12] (Figure 1). The biased coin design was introduced as an alternative to both complete randomization and the permuted blocks to mitigate the balance-randomness trade-off.
To define the biased coin design, let T_1, ..., T_n be a randomization sequence, where T_j = 1 if subject j is randomized to the treatment group and T_j = 0 otherwise, j = 1, ..., n, and n is a positive integer. Denote N_1(j) = Σ_{i=1}^{j} T_i as the number of subjects randomized to the treatment group after j assignments. The biased coin design with 1/2 ≤ p ≤ 1 makes the (j + 1)st assignment to the treatment group with probability

P{T_{j+1} = 1 | N_1(j)} =
  1/2,    N_1(j) = j/2,
  p,      N_1(j) < j/2,      j = 0, 1, ..., n − 1.      (1.1)
  1 − p,  N_1(j) > j/2,

When p = 1/2, randomization is complete. When p = 1, the randomization corresponds to the permuted block design with block size 2, where every other assignment is deterministic. When p < 1, the design is fully randomized, with each subject being assigned to a treatment randomly. Efron suggested p = 2/3 as a suitable trade-off between randomness and balance [6]. Although the original intent of the biased coin was to force a sequential experiment to be balanced, balance is enforced only probabilistically. The same is true of its direct extensions: the biased coin with imbalance tolerance and the accelerated biased coin design ([16], p. 51).

Fig 2. The allocation space of the maximal procedure with a maximum tolerated imbalance of 2 and n = 8. There are 54 possible sequences, all equiprobable. The most extreme imbalance that may occur prior to the final assignment is 2, in nodes (2, 2), (4, 3), (6, 4), and −2, in nodes (2, 0), (4, 1), (6, 2). By comparison, nodes (4, 3) and (4, 1) are not part of the allocation space of the permuted block design in Figure 1.

For example, with a sequence of 2n_1 = 100 assignments and p = 2/3, the probability of final balance with Efron's biased coin is 0.5, compared to 0.08 for complete randomization. In the same example, when the permuted block design is used in blocks of size 4 [4,13], the probability of final balance is 1, but the expected number of deterministic assignments is on average 2n_1/(b + 1) ≈ 33 [2]. These examples show that the biased coin design improves the probability of balance compared to complete randomization, but does not guarantee it as the permuted block design does. The improvement in the probability of balance comes without adding any deterministic assignments. Despite that, the permuted block design is still the most popular procedure for clinical trials [3]. Aside from being recommended by the International Conference on Harmonization Guideline E9 [8], the permuted block design's popularity could also be explained by the use of centralized allocation schedules in global, multisite trials, which has reduced the perceived risk of selection bias. Balance throughout the trial, on the other hand, has remained fundamental, and the permuted block design guarantees it in theory, if all blocks are filled. Clearly, if selection bias is a concern, as it is in open label and single site studies, the biased coin or other design alternatives could be better than the permuted block design [9]. Berger, Ivanova, and Deloria Knoll (2003) devised the maximal procedure as such an alternative [2,17]. The maximal procedure reduces the expected number of deterministic assignments compared to the permuted blocks by increasing the allocation space relative to the permuted block design while constraining it to the same maximum intermediate and final balance as the permuted block design (see example in Figure 2).
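The final-balance probabilities quoted above (0.5 versus 0.08 for 2n_1 = 100) can be reproduced exactly, without simulation, by propagating the distribution of N_1(j) through rule (1.1). A minimal sketch in Python (the function names are illustrative, not from the paper):

```python
from math import comb

def bcd_prob(j, m, p):
    """Rule (1.1): probability the (j+1)st subject goes to treatment,
    given that m of the first j subjects are on treatment."""
    if 2 * m == j:
        return 0.5
    return p if 2 * m < j else 1.0 - p

def final_balance_prob(n, p):
    """Exact P{N1(n) = n/2} under the biased coin design, by a forward
    recursion over the distribution of N1(j)."""
    dist = {0: 1.0}                      # distribution of N1(j) at j = 0
    for j in range(n):
        nxt = {}
        for m, pr in dist.items():
            f = bcd_prob(j, m, p)
            nxt[m + 1] = nxt.get(m + 1, 0.0) + pr * f
            nxt[m] = nxt.get(m, 0.0) + pr * (1.0 - f)
        dist = nxt
    return dist.get(n // 2, 0.0)

p_cr = final_balance_prob(100, 0.5)      # complete randomization: ≈ 0.08
p_bc = final_balance_prob(100, 2 / 3)    # Efron's coin with p = 2/3: ≈ 0.5
```

For p = 1/2 the recursion reproduces C(100, 50)/2^100 ≈ 0.08 to machine precision, and for p = 2/3 it confirms the 0.5 figure quoted in the text.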
The intermediate imbalance is the unequal number of subjects allocated across treatment groups that may occur at any time prior to the final assignment, while the final balance is an equal number of treatment allocations at the end of the trial. Preventing severe intermediate imbalance is necessary to avoid time trends, and to maintain optimal design properties in case of early stopping [1]. The maximal procedure is also a member of a class of procedures with a maximum tolerated imbalance, which is the most extreme imbalance allowable during the course of a trial [3]. These are called maximum tolerated imbalance procedures.

4030 V. P. Johnson

Fig 3. The random allocation rule for 2n_1 = 8. There are C(8, 4) possible sequences that are equiprobable. The random allocation rule for 2n_1 assignments is the same as one permuted block with b = n_1.
Other well known procedures that guarantee final balance are the random allocation rule [16] and the truncated binomial design [5]. The random allocation rule is the random mechanism used to fill the blocks in the permuted block design, but it can be viewed as a randomization procedure in itself. It yields equiprobable sequences, but does not control the intermediate imbalance (see example in Figure 3). The truncated binomial design allocates successive treatments independently with probability 0.5, as in complete randomization, until n_1 allocations of one kind have been made. Afterwards, all remaining assignments are deterministic. This procedure does not have equiprobable sequences and does not control the intermediate imbalance. To visualize the truncated binomial design for n = 8, one would change all allocation probabilities in Figure 3 to 0.5, except for those at the boundary, which remain 1. A summary of all procedures introduced so far is included in Tables 1 and 2. This paper derives a new procedure that guarantees final balance by conditioning the biased coin design to yield only balanced sequences, as in the case of the random allocation rule, the maximal procedure and the permuted block design. The new randomization scheme remains a function of the original bias parameter p, which now serves as a parameter for controlling the probability of intermediate balance rather than final balance. With this modification, the new design maintains similar selection bias and intermediate balance as the original biased coin but is self-correcting towards the end, when balance must be met. The gain in final balance comes at the expense of adding fewer than 2 deterministic assignments on average, which is less than that of the random allocation rule with the same sample size.
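Both of these final-balance procedures are easy to state in code. A short sketch (illustrative names, standard library only) that draws one sequence from each:

```python
import random

def random_allocation_rule(n1, rng):
    """Draw one sequence: n1 treatment (1) and n1 control (0) labels
    in a uniformly random order."""
    seq = [1] * n1 + [0] * n1
    rng.shuffle(seq)
    return seq

def truncated_binomial(n1, rng):
    """Fair-coin assignments until one arm reaches n1; the tail of the
    sequence is then deterministic."""
    seq, ones, zeros = [], 0, 0
    for _ in range(2 * n1):
        if ones == n1:
            t = 0                      # treatment arm full: force control
        elif zeros == n1:
            t = 1                      # control arm full: force treatment
        else:
            t = rng.randint(0, 1)      # otherwise a fair coin
        seq.append(t)
        ones += t
        zeros += 1 - t
    return seq

rng = random.Random(2024)
# Both procedures always end in exact balance, e.g. for 2*n1 = 8:
assert sum(random_allocation_rule(4, rng)) == 4
assert sum(truncated_binomial(4, rng)) == 4
```

Neither sampler restricts the path, which is why neither controls the intermediate imbalance.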
An important adaptation is the option to use this procedure in multiple blocks, similar to the permuted block design, with

Table 1. Summary of randomization procedures for equal allocation that are discussed in this paper.
The procedures aim to randomize n = 2n_1 subjects to two groups.

CR
Allocates each subject with a fair coin.

BCD
Allocates subjects using a fair coin when the allocations in each arm are equal, and a biased coin otherwise to favor the underrepresented arm.

PBD
Allocates subjects in blocks of size 2b at a time; within a block, b subjects are assigned to each arm in any random order.

MP
Allocates n_1 subjects to each arm in any random order such that the intermediate imbalance is at most b.

RAR
Allocates n_1 subjects to each arm in any random order.

TBD
Allocates subjects with a fair coin until n_1 subjects are randomized to one of the arms, then assigns the rest to the other arm.
the clear benefit of reducing predictability, while ensuring balance periodically throughout the trial. This paper builds on the work of Plamadeala and Rosenberger (2012), who developed the idea of sampling sequences from conditional reference sets under Efron's biased coin design with the purpose of approximating conditional randomization tests. The current paper treats the sampling mechanism obtained in their work as a randomization mechanism, provides closed form expressions for the allocation probabilities, and analyzes several properties. The result in this paper is noteworthy since it shows how a new randomization procedure can be obtained by conditioning an existing one to a subset of its original allocation set. In this process, new connections among existing procedures can be established. Section 2 introduces the new method and describes its properties. Section 3 compares it to other procedures that ensure final balance, while Section 4 outlines several implications for inference. The paper ends with a short discussion.

The conditional biased coin design
The idea of conditioning a randomization procedure to achieve a prespecified final imbalance was formally described in Plamadeala and Rosenberger (2012) in the context of approximating conditional randomization tests. In the simplest and well known case, conditioning complete randomization to the set of sequences that achieve final balance leads to the random allocation rule. For restricted procedures defined as φ_{j+1}(m_j) = P{T_{j+1} = 1 | N_1(j) = m_j}, m_j = 0, ..., j, this can be seen as follows. Assume only sequences with n_1 allocations to treatment and n − n_1 allocations to control are allowed. To achieve such a sequence, T_{j+1} must be conditioned on both N_1(j) = m_j and N_1(n) = n_1, which yields

ζ_{j+1}(m_j, n_1) = P{T_{j+1} = 1 | N_1(j) = m_j, N_1(n) = n_1}
                  = φ_{j+1}(m_j) · P{N_1(n) = n_1 | N_1(j + 1) = m_j + 1} / P{N_1(n) = n_1 | N_1(j) = m_j}.      (2.1)

Table 2. Comparison of randomization procedures for equal allocation that are discussed in this paper. The procedures aim to randomize n = 2n_1 subjects to two groups.
Note that equation (2.1) is Theorem 2.1 in [14]. Thus, for complete randomization, which has φ_{j+1}(m_j) = 1/2, and with n = 2n_1:

ζ_{j+1}(m_j, n_1) = (n_1 − m_j) / (2n_1 − j),

the expression for the random allocation rule. This establishes the relationship between complete randomization and the random allocation rule, in the sense that the random allocation rule for a sequence of size 2n_1 is the complete randomization assignment conditioned on the set of sequences assigning n_1 subjects to each treatment group. For example, to obtain the random allocation rule in Figure 3, one sequentially conditions the complete randomization assignments for a sequence of size n = 8 to the set {N_1(8) = 4}. For the biased coin design, Plamadeala and Rosenberger (2012) derived the exact conditional probabilities necessary to evaluate (2.1) numerically, but did not provide the closed form expression of (2.1). Specifically, they derived the conditional distribution P{N_1(n) = n_1 | N_1(j) = m_j}, where n is a positive integer, n_1 = 0, ..., n, j = 0, ..., n − 1 and m_j = max(0, n_1 − (n − j)), ..., min(j, n_1). For the special case n_1 = n/2, p_{n,j}(m_j) = P{N_1(n) = n_1 | N_1(j) = m_j} is given in (2.2), with 1/2 ≤ p ≤ 1 and q = 1 − p. The exact unconditional distribution of N_1(n) was derived by Markaryan and Rosenberger (2010) and is given in (2.3), where n is a positive integer, n_1 = 0, ..., n, 1/2 ≤ p ≤ 1 and q = 1 − p. Theorem 2.1 below gives the closed form of (2.1) with the biased coin when the final assignments must be in balance, which is introduced as an entirely new randomization procedure and is the main result of this paper. This procedure will be referred to as the conditional biased coin design. In all expressions below, the paper adopts the convention that a sum is treated as 0 when its upper limit is smaller than its lower limit.

Theorem 2.1. Let n_1 be a positive integer, j = 0, ..., 2n_1 − 1, m_j = max(0, j − n_1), ...
, min(j, n_1), 1/2 ≤ p < 1, and q = 1 − p; for p = 1, set the range of m_j as that of N_1(j) in (1.1). Let T_{j+1} be an allocation made according to Efron's biased coin design with bias parameter p, and let N_1(j) = m_j. The new restricted procedure makes an assignment to group 1 with probability ζ_{j+1}(m_j, n_1), whose closed form is given in (2.4).

Proof. See Appendix A.

Figure 4 illustrates the allocation space of the conditional biased coin design with p = 2/3 and a total sample size of 2n_1 = 8 assignments. Each down arrow indicates an assignment to the treatment group, while an arrow to the right indicates an assignment to the control group. All sequences start from the node (0, 0) and end at (8, 4). The first element in each node summarizes the number of subjects allocated so far, and the second number tallies how many subjects have been randomized to the treatment group. The allocation probabilities are indicated on the arrows and were computed exactly using (2.4). In this example, it is observed that the design is symmetric about the allocation ray, which is the sequence of diagonal nodes with intermediate balance: (0, 0), (2, 1), (4, 2), (6, 3), (8, 4). Also, the allocation probabilities are not constant, and at times of intermediate imbalance in favor of the control group, the allocation probabilities to the treatment group are all larger than the bias parameter p = 2/3. In addition, the design in Figure 4 has the same deterministic assignments as the permuted block design with b = 4 and 2n_1 = 8, but the allocation sequences are not equiprobable as in the permuted block design. To generate a random sequence following the conditional biased coin procedure, one simply applies (2.4) sequentially, the same way (1.1) is used to obtain a random sequence following the original biased coin design.
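Since the closed form (2.4) is lengthy, the allocation probabilities can also be evaluated numerically straight from the conditioning identity (2.1): compute h(j, m) = P{N_1(2n_1) = n_1 | N_1(j) = m} under (1.1) by a backward recursion, then reweight Efron's probabilities. A sketch (illustrative names; this reproduces (2.4) numerically rather than implementing its closed form):

```python
from functools import lru_cache

def make_conditional_bcd(n1, p):
    """zeta_{j+1}(m, n1) = phi_{j+1}(m) * h(j+1, m+1) / h(j, m), as in (2.1),
    where h(j, m) = P{N1(2*n1) = n1 | N1(j) = m} under Efron's coin."""
    n = 2 * n1

    def phi(j, m):                      # Efron's rule (1.1)
        if 2 * m == j:
            return 0.5
        return p if 2 * m < j else 1.0 - p

    @lru_cache(maxsize=None)
    def h(j, m):                        # backward recursion for conditioning
        if m < 0 or m > n1 or n1 - m > n - j:
            return 0.0                  # final balance no longer reachable
        if j == n:
            return 1.0
        f = phi(j, m)
        return f * h(j + 1, m + 1) + (1.0 - f) * h(j + 1, m)

    def zeta(j, m):                     # allocation probability in state (j, m)
        return phi(j, m) * h(j + 1, m + 1) / h(j, m)

    return zeta

zeta = make_conditional_bcd(4, 2 / 3)             # 2*n1 = 8, p = 2/3 (Figure 4)
assert abs(zeta(0, 0) - 0.5) < 1e-9               # fair coin at balance
assert abs(zeta(1, 0) + zeta(1, 1) - 1.0) < 1e-9  # symmetry (Corollary 2.1)
assert zeta(1, 0) >= 2 / 3 - 1e-9                 # at least the bias parameter

rar = make_conditional_bcd(4, 0.5)                # p = 1/2: random allocation rule
assert abs(rar(3, 2) - (4 - 2) / (8 - 3)) < 1e-9  # (n1 - m) / (n - j)
```

The assertions mirror the observations from Figure 4 and the p = 1/2 reduction to the random allocation rule, ζ_{j+1}(m_j, n_1) = (n_1 − m_j)/(2n_1 − j).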
Similarly to the original biased coin design, when p = 1 with a total sample size of 2n_1 assignments, the conditional biased coin design becomes the permuted block design with n_1 blocks of size 2. In such a design every other assignment is deterministic, which is undesirable. When p = 1/2, the design is the random allocation rule for equal allocation, or equivalently the permuted block design with a single block of size 2n_1. To see this analytically, one evaluates the conditional probabilities at p = 1/2 when n_1 ≤ n/2, from which a new identity for the binomial coefficient follows as (2.5). One can then use (2.5) in (2.4) to show that the conditional biased coin design with p = 1/2 reduces to the random allocation rule for equal allocation. As already observed in Figure 4, similarly to the original biased coin design, the conditional biased coin design is symmetric in the allocation probabilities. This is formally stated in the following corollary.

Corollary 2.1. Let ζ_{j+1}(m_j, n_1) be defined as in (2.4). For any pair (m_j, m′_j) such that m_j < j/2, m′_j > j/2 and j − m_j = m′_j, ζ_{j+1}(m_j, n_1) = 1 − ζ_{j+1}(m′_j, n_1).

Also, it is evident from (1.1) and (2.4) that following intermediate balance both designs make the next assignment with probability 1/2. It may not be immediately obvious, however, that the comparison under imbalance favors the conditional design, as stated next.

Corollary 2.2. Given the same intermediate imbalance in favor of the control group and the same parameter p, 1/2 ≤ p < 1, the conditional biased coin design makes the next allocation to the treatment group with a probability at least as large as that of the biased coin design, provided this assignment is in the allocation space of the conditional biased coin; the two allocation probabilities are given by (2.4) and (1.1), respectively.

Proof. See Appendix C.
Furthermore, for finite n_1 and the same parameter p, 1/2 ≤ p ≤ 1, the conditional biased coin design has a probability of intermediate balance at least as large as that of the biased coin design (Figure 5). To see this in general, let P{N_1(j) = j/2} be the probability of intermediate balance after j assignments in a sequence of a finite number of 2n_1 assignments for the biased coin, n_1 = 2, 3, ... and j = 2, 4, ..., 2n_1 − 2. In the case of the conditional biased coin with the same p, the probability of intermediate balance at j is given in (2.6), where the second equality is due to the Markovian property of N_1(j). The conclusion follows by noting that the ratio of probabilities in the second line of (2.6) is at least 1, because P{N_1(2n_1) = n_1} is a nonincreasing function of n_1 (by Theorem 1 in Efron (1971)). For the conditional biased coin, P{N_1(j) = j/2} is not monotonic in j by design (Figure 5). For very large even n_1 and j = n_1, the two designs have nearly equal probabilities of intermediate balance, because the ratio of probabilities in (2.6) approaches 1 by equation 3.3 in Efron (1971). This indicates that the designs behave similarly as the sample size increases. Finally, the allocation probabilities for T_{j+1} = 1 when m_j < j/2 in (2.4) are decreasing in q, 0 ≤ q ≤ 1/2, or equivalently increasing in p. As with the original biased coin, this implies that increasing values of p shrink the distribution of N_1(j) toward balance, which leads to the following corollary.
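The claim around (2.6), that conditioning can only improve the probability of intermediate balance, can be checked exactly for small designs: the conditional design's balance probability is the unconditional one reweighted by h(j, j/2)/h(0, 0). A sketch with illustrative names:

```python
from functools import lru_cache

def balance_prob_pairs(n1, p):
    """(j, P_bcd{N1(j) = j/2}, P_cond{N1(j) = j/2}) for even j < 2*n1,
    computed exactly; the conditional value reweights the unconditional
    one by h(j, j/2) / h(0, 0)."""
    n = 2 * n1

    def phi(j, m):
        if 2 * m == j:
            return 0.5
        return p if 2 * m < j else 1.0 - p

    @lru_cache(maxsize=None)
    def h(j, m):              # P{N1(n) = n1 | N1(j) = m} under Efron's coin
        if m < 0 or m > n1 or n1 - m > n - j:
            return 0.0
        if j == n:
            return 1.0
        f = phi(j, m)
        return f * h(j + 1, m + 1) + (1.0 - f) * h(j + 1, m)

    dist, pairs = {0: 1.0}, []
    for j in range(1, n):     # forward distribution under the unconditional coin
        nxt = {}
        for m, pr in dist.items():
            f = phi(j - 1, m)
            nxt[m + 1] = nxt.get(m + 1, 0.0) + pr * f
            nxt[m] = nxt.get(m, 0.0) + pr * (1.0 - f)
        dist = nxt
        if j % 2 == 0:
            p_bcd = dist.get(j // 2, 0.0)
            pairs.append((j, p_bcd, p_bcd * h(j, j // 2) / h(0, 0)))
    return pairs

for j, p_bcd, p_cond in balance_prob_pairs(8, 2 / 3):
    assert p_cond >= p_bcd - 1e-9     # conditioning never hurts balance
```

The reweighting ratio is at least 1 precisely because h(j, j/2) is the balance probability of a shorter biased coin sequence, which Efron's Theorem 1 says is no smaller than h(0, 0).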

Corollary 2.3. The probability of intermediate balance for the conditional biased coin design is nondecreasing in p.
Proof. See Appendix D.
Corollary 2.3 establishes that the role of p in the new design is that of controlling the intermediate imbalance. An important property of any randomization procedure is preserving the allocation ratio at each assignment. In the equal allocation case, this means preserving the 1:1 allocation ratio at each step, such that every subject has the same chance of assignment to the treatment regardless of order of entry (Kuznetsova and Tymofyeyev, 2012). For the biased coin design, the allocation ratio preserving property is expressed as P(T_{j+1} = 1) = 1/2, j = 0, ..., n − 1. For the conditional biased coin design it is P(T_{j+1} = 1) = P(T_{j+1} = 1 | N_1(2n_1) = n_1) = 1/2. This property holds for the conditional biased coin design for all 1/2 ≤ p ≤ 1 and is due to the symmetry in the allocation probabilities.
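The allocation ratio preserving property P(T_{j+1} = 1) = 1/2 can be verified numerically by pushing the forward distribution of N_1(j) through the conditional allocation probabilities (again computed from (2.1) by backward recursion; names are illustrative):

```python
from functools import lru_cache

def marginal_treatment_probs(n1, p):
    """Marginal P(T_{j+1} = 1), j = 0, ..., 2*n1 - 1, under the conditional
    biased coin design, by forward propagation through zeta."""
    n = 2 * n1

    def phi(j, m):
        if 2 * m == j:
            return 0.5
        return p if 2 * m < j else 1.0 - p

    @lru_cache(maxsize=None)
    def h(j, m):
        if m < 0 or m > n1 or n1 - m > n - j:
            return 0.0
        if j == n:
            return 1.0
        f = phi(j, m)
        return f * h(j + 1, m + 1) + (1.0 - f) * h(j + 1, m)

    def zeta(j, m):
        return phi(j, m) * h(j + 1, m + 1) / h(j, m)

    dist, probs = {0: 1.0}, []
    for j in range(n):
        probs.append(sum(pr * zeta(j, m) for m, pr in dist.items()))
        nxt = {}
        for m, pr in dist.items():
            z = zeta(j, m)
            if z > 0.0:
                nxt[m + 1] = nxt.get(m + 1, 0.0) + pr * z
            if z < 1.0:
                nxt[m] = nxt.get(m, 0.0) + pr * (1.0 - z)
        dist = nxt
    return probs

for pj in marginal_treatment_probs(4, 3 / 4):
    assert abs(pj - 0.5) < 1e-9    # every subject: same 1/2 marginal chance
```

The symmetry of Corollary 2.1, combined with a symmetric forward distribution, is what forces each marginal to equal 1/2.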
When a clinical trial is planned, the total sample size is often prespecified as a result of power calculations. If the total number to be randomized is known and randomization is unstratified, the conditional biased coin design can be applied directly, provided equal allocation is also used in the trial. While unstratified randomization does not incorporate any covariates and uses a single randomization, stratified randomization with equal allocation aims to achieve balance between treatment groups within all strata formed by a set of known covariates by implementing a separate randomization within each stratum (see Chapter 7 in [16] on stratified randomization). If stratification by covariates is involved, the total final number of subjects ultimately accrued in each stratum may be unknown. In this case, it may be more practical to randomize subjects in small blocks of size 2b, using the conditional biased coin design to fill the blocks in each stratum, with the anticipation that the last block within any stratum may not be complete. This is similar to stratified blocked randomization, which uses the permuted block design within each stratum to ensure balance ([16], p. 137). In general, the conditional biased coin design in blocks can be used when the total sample size is not known in advance. Figure 6 illustrates the case for 2n_1 = 8, 2b = 4 and p = 2/3. Unlike the permuted block design with 2b = 4 and 2n_1 = 8, the sequences for the design in Figure 6 are not equiprobable. Because the probability of intermediate balance is nondecreasing in p, blocks filled using the conditional biased coin design with p > 1/2 have better intermediate balance than the permuted block design (see Figure 5 and Corollary 2.3). The covariance matrix of T_1, ..., T_n is introduced next. This covariance matrix is used when deriving the variance of test statistics under the randomization model ([16], pp. 109-110).
The result below follows from the allocation ratio preserving property and a direct evaluation of the covariance of T_i and T_j. It is a special case of Lemma 4.2 in [14] when n = 2n_1. Following the notation of Plamadeala and Rosenberger (2012), n_1 is a positive integer and n = 2n_1. For any 1 ≤ i ≤ j ≤ n, the (i, j)th entry of the covariance matrix, σ_ij, admits an exact, albeit computational, evaluation: each matrix component is a sum of probability products, some of which are probabilities describing the original biased coin. The proof is already sketched in Plamadeala and Rosenberger (2012); however, several detailed steps on how to obtain E(T_i T_j) are given in Appendix E.

Selection bias
Selection bias is the bias in estimating the treatment effect due to the intentional guessing of the treatment allocations by the investigator, when the randomization mechanism is known, the randomization is not centralized, and subjects are allocated sequentially [5,12]. Blackwell and Hodges (1956) quantified selection bias by the maximum expected number of correct guesses in excess of what is expected by chance, also called the expected selection bias factor, E(F). The expected selection bias factor is always computed under a guessing strategy that maximizes the expected number of correct guesses. In complete randomization, E(F) = 0 regardless of guessing strategy. Blackwell and Hodges (1956) formally proved that the truncated binomial design minimizes E(F) amongst all procedures sequentially assigning subjects in a block of a prespecified size. This follows since the truncated binomial design behaves like complete randomization for at least the first n_1 assignments. Another metric for selection bias is the expected number of deterministic assignments. This assesses selection bias when the investigator tries to influence patient selection only when the next allocation is known with certainty [12]. It is not necessarily true that a higher expected number of deterministic assignments implies a higher E(F). This section uses both metrics to compare the conditional biased coin design, with and without blocks, to the permuted block design with a block size of 2b = 4, the maximal procedure with a maximum tolerated imbalance of 2, the random allocation rule, and the truncated binomial design. When the total sample size is an even integer, all these procedures achieve final balance with probability 1. The permuted block design with a block size of 2b = 4 was chosen in this comparison since this design is used in practice [4,13].

The expected number of deterministic assignments, E_{n_1}(D)
For the conditional biased coin design, the exact expression of the expected number of deterministic assignments in a sequence of size 2n_1 is given in (3.1), where ζ_{2n_1−i}(n_1 − 1, n_1) is given in (2.4), and the probability in the first line of (3.1) is first expanded and then evaluated with (2.3) and (2.2).
Proof. The proof is provided in Appendix F. As already mentioned, when p = 1 and the total number of allocations is 2n_1, the conditional biased coin design becomes the permuted block design with n_1 blocks of size 2, where every other assignment is deterministic and E_{n_1}(D) = n_1.
For the conditional biased coin design randomizing in blocks of size 4, (3.1) is evaluated first for one block with n_1 = 2, which reduces to (3 − 2p)/(2 − p). This is then multiplied by the total number of blocks to obtain the expected number of deterministic assignments for the entire design. Efron's biased coin is fully randomized when p < 1, and thus E_{n_1}(D) = 0.
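The per-block value (3 − 2p)/(2 − p) can be confirmed by exact enumeration of a single block of size 4: accumulate the probability mass sitting on steps whose allocation probability is 0 or 1. A sketch with illustrative names, with ζ evaluated numerically from (2.1):

```python
from functools import lru_cache

def expected_deterministic_block4(p, eps=1e-9):
    """E(D) for one conditional-biased-coin block of size 4 (n1 = 2),
    by exact enumeration of the allocation space."""
    n1, n = 2, 4

    def phi(j, m):
        if 2 * m == j:
            return 0.5
        return p if 2 * m < j else 1.0 - p

    @lru_cache(maxsize=None)
    def h(j, m):
        if m < 0 or m > n1 or n1 - m > n - j:
            return 0.0
        if j == n:
            return 1.0
        f = phi(j, m)
        return f * h(j + 1, m + 1) + (1.0 - f) * h(j + 1, m)

    def zeta(j, m):
        return phi(j, m) * h(j + 1, m + 1) / h(j, m)

    def walk(j, m, pr):
        """Probability-weighted count of deterministic steps from state (j, m)."""
        if j == n:
            return 0.0
        z = zeta(j, m)
        total = pr if (z <= eps or z >= 1.0 - eps) else 0.0
        if z > eps:
            total += walk(j + 1, m + 1, pr * z)
        if z < 1.0 - eps:
            total += walk(j + 1, m, pr * (1.0 - z))
        return total

    return walk(0, 0, 1.0)

p = 2 / 3
assert abs(expected_deterministic_block4(p) - (3 - 2 * p) / (2 - p)) < 1e-9
```

At p = 2/3 this gives 1.25 deterministic assignments per block, versus 2b/(b + 1) ≈ 1.33 for a permuted block of the same size; at p = 1/2 it returns 4/3, the random allocation rule value for 2n_1 = 4.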
For the permuted block design, E_{n_1}(D) = 2n_1/(b + 1) [2]. It follows that for the random allocation rule this expectation is 2n_1/(n_1 + 1). For the maximal procedure with a maximum tolerated imbalance of 2, the expected number of deterministic assignments is (n_1 + 2)/3 [2]. For the truncated binomial design, the expected number of deterministic assignments is given in [5]. Table 3 compares these procedures with respect to E_{n_1}(D). The analytical limit of E_{n_1}(D) as n_1 increases is provided under n_1 = ∞. From Table 3, the permuted block design has the highest expected number of deterministic assignments, followed by the conditional biased coin design with p = 3/4 and a block size of 4, the maximal procedure and the truncated binomial design. In all four cases the expectation increases with n_1, but at a lower rate for the truncated binomial design. In the case of the random allocation rule the expectation converges to 2 as the sample size increases. The expectation converges to a constant less than 2 in the case of the conditional biased coin design with both p = 2/3 and p = 3/4, the latter having the smallest expected number of deterministic assignments of all considered procedures. The limiting value 1/p is a good approximation even for small sample sizes. The reduction in the expected number of deterministic assignments by the conditional biased coin design compared to the random allocation rule and the permuted block design is a direct consequence of having non-equiprobable sequences. Since randomization in small blocks forces balance periodically with each complete block, at least one assignment within a block is deterministic: the last one. Consequently, the expected number of deterministic assignments within a block is at least one, and over the entire sequence this expectation is at least as large as the total number of blocks.
In the conditional biased coin design without blocks, only the last assignment in the entire sequence is always deterministic, and sequences with additional deterministic assignments are very rare. This explains the large difference in the E_{n_1}(D) values between the conditional biased coin with and without blocks in Table 3.

The expected selection bias factor, E(F)
For all procedures considered, except the truncated binomial design, the convergence strategy maximizes the expected number of correct guesses: always guessing towards balance, that is, guessing the treatment that was least allocated so far. This is reasonable since at each step the least assigned treatment has the larger allocation probability at the next assignment. In the case of the truncated binomial design, to maximize E(F) one can adopt any strategy until n_1 allocations of one kind have been made, since both treatments have an equal probability of assignment at the next allocation, and then guess the tail allocations, which are known with certainty.
For Efron's biased coin, the exact expression of E(F ) is provided in [11] and its asymptotic approximation is n 1 (p/q − 1)/(2p/q) [6]. In the case of the conditional biased coin design, the exact value of E(F ) can be computed as follows. Let I ζj ≥1/2 = 1 if ζ j (m j−1 , n 1 ) ≥ 1/2 and zero otherwise, where ζ j (m j−1 , n 1 ) is given by (2.4). Further, denote: Then, in a sequence of 2n 1 assignments: where the probability inside the sum is first expanded then computed using (2.3) and (2.2). For the conditional biased coin design randomizing in blocks of size 4, E(F ) = n 1 (3 − p)/(8 − 4p).
The closed form expression for E(F) in the case of the truncated binomial design is given in [5]; the quantity following the symbol ∼ in that expression is the approximation using Stirling's formula for large factorials, where a_n ∼ b_n is written when lim_{n→∞}(a_n/b_n) = 1. Matts and Lachin (1988) provided the expression for the permuted block design, where 2b is the block size and B = 2n_1/(2b) is the number of blocks; setting B = 1 and b = n_1 in it yields E(F) for the random allocation rule. For the maximal procedure with a maximum tolerated imbalance of 2, E(F) = (2n_1 + 1)/6. In Table 4, the limit of the average expected selection bias E(F)/n_1 as n_1 increases is provided under n_1 = ∞, where available. The average selection bias of the conditional biased coin design with p = 3/4 is comparable to that of the original biased coin with p = 3/4, even for small values of n_1. For large n_1, they both appear to converge to the same limit, which was confirmed in separate simulations with other values of p > 1/2. This indicates that both biased coins behave similarly for medium to large sample sizes, the main difference being whether final balance is enforced. The conditional biased coin design with p = 3/4 and block size 4 has a higher average selection bias than both the permuted block design with the same block size and the maximal procedure. However, when not used in blocks, the average selection bias of the conditional biased coin design with p = 3/4 is comparable to that of the maximal procedure with a maximum tolerated imbalance of 2 (Table 4). In separate numerical studies, it was observed that the probability of the intermediate imbalance exceeding 2 for the conditional biased coin design with p = 3/4 is less than 0.1 even for n_1 as small as 10, which explains its similarity to the maximal procedure with a maximum tolerated imbalance of 2.
For unstratified randomization, the decision about the choice of p for the conditional biased coin design is similar to that for the biased coin, with the difference that the balance-randomness trade-off is with respect to the intermediate balance rather than the final balance. A larger value of p brings the assignments closer to intermediate balance but higher predictability, as is the case when p = 1, while a lower p leads to a lower probability of intermediate balance but more randomness, as is the case when p = 1/2. Since for medium to large sample sizes both biased coin designs behave similarly in terms of selection bias and intermediate balance, the recommendation of p = 2/3 for the original biased coin is also feasible for the conditional biased coin, but other values may work as well. If selection bias is a concern and balance on several known prognostic covariates is paramount, stratified randomization with the conditional biased coin in blocks of size 6 or even 8 with p = 6/10, p = 2/3 or p = 3/4 is better than the usual permuted blocks with the same block size when it comes to deterministic assignments: in a conditional biased coin block, E_b(D) < 1.5 for any of these block and p combinations, while in a permuted block E_b(D) ≥ 1.5 for block sizes of 6 and above. To assess the overall balance-randomness trade-off, a trade-off plot (Section 8.4.1 in [16]), which graphs the selection bias and a balance criterion simultaneously, can be used to choose the p parameter for a given block size. In Figure 7, the balance criterion on the y-axis is the variance of the imbalance halfway through the block divided by half the block size, which has a minimum of 0 for very large block sizes and p > 1/2, and a maximum value of 1 for a block of 2. The selection bias criterion on the x-axis is the expected selection bias factor within one block, E(F)/b, which approaches 0 for large block sizes with p = 1/2, and has a maximum value of 1/2 for a block size of 2.
In this figure, the selection bias values were rescaled to match the range of the imbalance metric. After rescaling, both criteria are treated as equally important: a one-unit decrease on the x-axis is ascribed the same importance as a one-unit decrease on the y-axis, but other importance rules can be assumed ([16], p. 152). In Figure 7, the points closest to the origin provide the desired trade-off. Thus, a conditional biased coin block of size 6 with p = 6/10 provides the best trade-off between intermediate imbalance and randomness, and also has fewer deterministic assignments than the permuted block. The permuted block with the same block size corresponds to p = 1/2 on the plot. Large block sizes are not recommended with the permuted blocks [9], as balance deteriorates with increasing block size if the block is not filled. With the conditional biased coin design, however, a block size of 8 or 10 with p = 7/10 (points not shown for block size 10) can provide a trade-off and intermediate balance comparable to those with a block size of 6 and p = 6/10. Figure 7 also shows the conditional biased coin design for a sample size of 2n_1 = 100, marked "Block size 100". A value of p = 6/10 gives the best trade-off, which is close to Efron's recommendation for the original biased coin. If selection bias is not a concern, a block size of 4 or 6 with p = 3/4 or p = 4/5 will provide better balance than the stratified permuted blocks counterpart by Corollary 2.3 (see the simulation in Section 4). A block size of 2 is not recommended with either block design since the potential for unblinding is very high. As with other design parameters, the choice of p and the block size has to be optimized given the individual trial requirements.
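The statement that the permuted block corresponds to p = 1/2 can be verified directly for small blocks. The sketch below is an illustrative brute-force enumeration, not the paper's closed form (2.4): it conditions Efron's coin on final balance by enumerating the balanced sequences of one block and renormalizing their path probabilities.

```python
import itertools

def efron_weight(seq, p):
    """Path probability of a 0/1 sequence under Efron's biased coin (1 = treatment)."""
    w, n_t = 1.0, 0
    for i, t in enumerate(seq):
        n_c = i - n_t
        # tied -> fair coin; otherwise p for the underrepresented arm, 1-p for the other
        w *= 0.5 if n_t == n_c else (p if (n_t < n_c) == (t == 1) else 1 - p)
        n_t += t
    return w

def conditional_block_dist(b, p):
    """Exact distribution of the conditional biased coin over one block of size 2b:
    Efron path probabilities restricted to balanced sequences, renormalized."""
    seqs = [s for s in itertools.product((0, 1), repeat=2 * b) if sum(s) == b]
    w = [efron_weight(s, p) for s in seqs]
    z = sum(w)
    return {s: wi / z for s, wi in zip(seqs, w)}
```

At p = 1/2 every balanced sequence receives probability 1/C(2b, b), which is exactly the permuted block distribution; at p > 1/2 sequences that stay closer to intermediate balance receive more mass.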
In summary, the conditional biased coin design in blocks is applicable any time the permuted block design is an option. The conditional biased coin design without blocks is applicable in unstratified designs with prespecified sample sizes and equal allocation, where Efron's biased coin design is applicable. The material gain of the conditional biased coin over Efron's biased coin is the enforcement of perfect final balance, while maintaining comparable or better intermediate balance and comparable selection bias. The conditional biased coin design in blocks emerges as a better option than the permuted block design for equal allocation: it forces intermediate balance periodically and achieves final balance, while also reducing the total number of deterministic assignments.

Inference
Inference about the treatment effect in a clinical trial may be conducted via traditional hypothesis testing based on population models or re-randomization tests [16]. The fundamental difference between the two inference approaches is that re-randomization tests frame the treatment assignments as random and the treatment outcomes as fixed, while in a population model the treatment outcomes are realizations of a random variable at fixed values of the treatment assignments ( [16], p. 101). Various aspects related to the conduct of rerandomization tests under different randomization procedures have already been thoroughly covered in Chapter 6 of Rosenberger and Lachin (2015). Recent developments about the impact of different randomization procedures on popular hypothesis tests have been made by Shao et al. (2010), Shao and Yu (2013), and Ye and Shao (2020). This section addresses the implications of using the conditional biased coin design with hypothesis testing and several problems specific to the conditional biased coin design as a basis for inference.

Model-based inference with the conditional biased coin design
Since common hypothesis tests have been developed under complete randomization, there has been concern that the Type 1 error of these tests is inflated when other randomization procedures are used [19,20,21]. Shao et al. (2010) provide a sufficient condition for a hypothesis test to maintain its Type 1 error when the response depends on a set of covariates and the randomization is not complete: if the allocation procedure and the response are conditionally independent given the set of covariates, a hypothesis test which is valid under fixed treatment allocation will maintain its Type 1 error under the new procedure, and the error rate will be the same as when the allocations are fixed. This is the case for any restricted randomization procedure ([16], p. 179) applied in the unstratified designs discussed in this paper. However, when randomization is stratified, this condition may not hold. Shao et al. (2010) have shown that under stratified randomization with Efron's biased coin design the two-sample t-test is conservative. Shao and Yu (2013) and Ye and Shao (2020) extend the same conclusions to Wald's test and the score test in misspecified generalized linear models and misspecified proportional hazards models, respectively. If all covariates used in the randomization are included in the model and the model is correctly specified, the test from the proper analysis of covariance model, Wald's test, and the score test are valid under stratified randomization [19,20,21].
A simulation is carried out to investigate the Type 1 error rate and power of the test about the treatment effect under the conditional biased coin design and a linear model similar to model (12) in Shao et al. (2010). The covariates Z_i1 and Z_i2 are binary with prevalences 0.1 and 0.9 for Z_i1, and 0.33 and 0.67 for Z_i2; their coefficients are β_1 = 1 and β_2 = 3. The covariate Z_i3 is discrete with four levels with prevalences 0.5, 0.3, 0.1, and 0.1, and coefficient vector β_3 = (0, 1, 2, 5)^T. The covariate Z_i4 is also discrete with three levels, and the model errors are distributed N(0, 1). The simulation investigates both stratified and unstratified randomization. The four covariates form 48 possible independent strata, and stratified randomization is used to balance the two treatment groups across these covariates. Stratified randomization in blocks of 6 is performed with the permuted block design and the conditional biased coin design with p = 4/5, as well as Efron's biased coin design with p = 2/3 and p = 4/5. Since some of the strata have low probabilities of occurring, many blocks will not be filled, potentially leading to treatment imbalances overall in the study and across the covariates. Unstratified randomization is implemented with Efron's biased coin design and the conditional biased coin design with p = 4/5, as well as randomization in blocks of size 6 with the conditional biased coin design with p = 4/5 and the stratified permuted block design. The power and Type 1 error rates for the hypothesis H_0: μ_1 − μ_2 = 0 versus H_1: μ_1 − μ_2 ≠ 0 for the t-test and the analysis of covariance tests with both correct and misspecified models are shown in Table 5. The misspecified analysis of covariance model omits factor Z_i1 but controls for Z_i2 and dichotomized forms of Z_i3 and Z_i4. Factor Z_i3 was dichotomized by combining its three least prevalent levels and Z_i4 by combining its two least prevalent levels.
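The mechanism behind the conservativeness reported below can be illustrated with a much smaller sketch than the 48-stratum simulation. The example below is a simplified stand-in with assumptions of mine throughout (one binary covariate, stratified permuted blocks of size 4, a normal cutoff of 1.96 instead of the exact t reference, and invented function names): because stratification balances treatment within covariate levels, the numerator of the t-statistic has a smaller variance than the pooled-variance estimate implies, so the rejection rate under H_0 falls well below the nominal 0.05.

```python
import math
import random

def permuted_block(b, rng):
    """One permuted block of size 2b for equal allocation (b ones, b zeros)."""
    block = [1] * b + [0] * b
    rng.shuffle(block)
    return block

def one_trial(n, beta, rng):
    """Stratified randomization (blocks of 4) on one binary covariate; returns
    True if the two-sample t-test (normal 1.96 cutoff) rejects H0."""
    y = {0: [], 1: []}
    blocks = {0: [], 1: []}
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        if not blocks[z]:
            blocks[z] = permuted_block(2, rng)
        t = blocks[z].pop()
        y[t].append(beta * z + rng.gauss(0, 1))  # no treatment effect under H0
    m1 = sum(y[1]) / len(y[1])
    m0 = sum(y[0]) / len(y[0])
    v1 = sum((x - m1) ** 2 for x in y[1]) / (len(y[1]) - 1)
    v0 = sum((x - m0) ** 2 for x in y[0]) / (len(y[0]) - 1)
    se = math.sqrt(v1 / len(y[1]) + v0 / len(y[0]))
    return abs(m1 - m0) / se > 1.96

rng = random.Random(7)
# under H0 the stratified design yields a rejection rate well below 0.05
rate = sum(one_trial(80, 3.0, rng) for _ in range(2000)) / 2000
```

Replacing the stratified blocks with complete randomization restores the nominal level, since the allocations are then independent of the covariate.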
The results in Table 5 show that the t-test under all unstratified designs preserves the Type 1 error because the subject allocations are independent of the responses and covariates. The t-test is also equally powerful under these procedures. The same is true when the model is correctly specified under stratified randomization: the test about the treatment effect preserves the nominal Type 1 error and is equally powerful across all stratified procedures. However, the t-test and the misspecified analysis of covariance test are conservative under all stratified randomization procedures in Table 5. Shao et al. (2010) show that under the stratified Efron's biased coin design this is due to the numerator of the t-test statistic having a smaller variance than the one prescribed by the t-test, which produces a null distribution narrower than the T distribution referenced under the null. A similar reduction in the variance of the numerator appears to hold under all other stratified procedures considered in this simulation: the null distributions of the t-statistic under the stratified conditional biased coin design and the stratified permuted block design also incur a variance reduction, which explains the reduced Type 1 error rates in Table 5. Furthermore, the impact on the error rates appears to depend on the magnitude of this variance reduction under each procedure, which remains to be established analytically. For example, Efron's biased coin design with stratified randomization and p = 2/3 has the lowest power at d = 0.7 in Table 5 and the widest distribution in Figure 8. The stratified permuted block design and the conditional biased coin design in blocks with p = 4/5 are the most powerful and have the narrowest distributions in Figure 8.
Although the stratified permuted block design and the conditional biased coin design in blocks with p = 4/5 appear to behave similarly, both in terms of simulated long-run power at d = 0.7 and simulated distributions in Figure 8, the latter produces better overall allocation balance at the end of the study (Figure 9) and across individual covariates (e.g., Figure 12 in Appendix G), which is necessary to minimize biases due to subject heterogeneity ([16], p. 133). The reverse relationship holds for the Type 1 error: the stratified procedure with the widest null distribution for the t-statistic, i.e., the one with the least reduction in the variance of the t-test numerator, is the least conservative in Table 5 (null distribution plots not shown).
Here it must be noted that the conditional biased coin design in blocks produces the most conservative test of all. Robust tests that preserve the nominal Type 1 error under model misspecification and stratified randomization have been proposed [19,20,21], but it is unclear if their power would be impacted the same way as that of conventional hypothesis tests. This is an interesting topic for further research.

Re-randomization tests with the conditional biased coin design
First, the re-randomization test with the conditional biased coin design is defined. Let T = (T_1, …, T_{2n_1}) be a randomization sequence following the conditional biased coin design, where T_i = 1 if subject i is assigned to treatment 1 and T_i = 0 if subject i is randomized to treatment 2, i = 1, …, 2n_1. Also, let x = (x_1, …, x_{2n_1}) be the vector of realized responses for a primary endpoint. A valid level-α test of the hypothesis of no treatment effect can be built by permuting T in all possible ways [14]. Each such permutation has a probability that is obtained by sequentially applying (2.4). While any metric for the treatment effect can be used, the family of linear rank tests is amenable to the conduct of re-randomization tests [16]. Using the score vector a = (a_1 − ā, …, a_{2n_1} − ā)′, where a_i is some function of the ranks of the ith response x_i and ā = Σ_{i=1}^{2n_1} a_i/(2n_1), the standardized linear rank test statistic takes the form given in (4.2) ([16], pp. 105-113; see also Section 6.5 in [16] for different options for the score function). The quantity in the denominator of (4.2) is the square root of the variance of a′T, which involves Σ, the exact covariance matrix of T_1, …, T_{2n_1}, given in Section 2.
The null distribution of the statistic in (4.2) coincides with that of the conditional re-randomization test following the original biased coin given final balance in 2n_1 assignments. This is because, by definition, the conditional re-randomization test following the original biased coin permutes the treatment assignments to include only sequences that have the same imbalance as the one observed ([14], [16]), in this case no imbalance, while by construction the conditional biased coin design generates only those sequences from the original biased coin that achieve final balance. However, this distribution is different from that of the unconditional re-randomization test following the original biased coin, where the statistic is not conditioned on the final balance (see Section 6.9.1 in [16] for more on unconditional re-randomization tests). For small to moderate sample sizes, Hollander and Peña (1988) provide an algorithm to compute exactly the conditional re-randomization distribution following Efron's biased coin, or, as pointed out above, the distribution of (4.2). Plamadeala and Rosenberger (2012) showed how to approximate these tests for large sample sizes via Monte Carlo by drawing sequences T using (2.1) sequentially and evaluating (4.2) for each sequence. Computationally, the improvement provided in this paper is the closed-form expression (2.4) for (2.1) when there is no imbalance with an even number of allocations. It is important to re-emphasize at this point that the sampling mechanism used to approximate the re-randomization test and the randomization mechanism (2.4) are one and the same. The difference is that when used for randomization purposes during the conduct of the trial, (2.4) is applied to obtain only one sequence, while when used for sampling purposes during the analysis of the trial, it is applied to generate as many sequences as the Monte Carlo sample size.
With both the Monte Carlo and the exact method, the test p-value is the proportion of sequences with statistic values as extreme or more extreme than the one observed.
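The Monte Carlo test described above can be sketched as follows. Since the closed form (2.4) is not reproduced in this excerpt, the sketch draws from the conditional design by rejection, generating Efron sequences and keeping only those that end balanced; this samples the same distribution, only less efficiently. The unstandardized statistic a′T is used: the standardization in (4.2) divides by a constant, so the Monte Carlo p-value is unchanged. Function names are mine.

```python
import random

def draw_conditional_sequence(n1, p, rng):
    """Rejection-sample the conditional biased coin: run Efron's coin for
    2*n1 subjects and keep the sequence only if it ends balanced."""
    while True:
        seq, n_t = [], 0
        for i in range(2 * n1):
            n_c = i - n_t
            prob_t = 0.5 if n_t == n_c else (p if n_t < n_c else 1 - p)
            t = 1 if rng.random() < prob_t else 0
            seq.append(t)
            n_t += t
        if n_t == n1:
            return seq

def rerandomization_pvalue(t_obs, x, p, draws=2000, seed=11):
    """Two-sided Monte Carlo p-value for the linear rank statistic a'T,
    using simple-rank scores centered at their mean."""
    rng = random.Random(seed)
    n = len(x)
    order = sorted(range(n), key=lambda i: x[i])
    rank = [0.0] * n
    for r, i in enumerate(order):
        rank[i] = r + 1.0
    abar = sum(rank) / n
    a = [r - abar for r in rank]
    stat = lambda t: sum(ai * ti for ai, ti in zip(a, t))
    s_obs = abs(stat(t_obs))
    hits = sum(abs(stat(draw_conditional_sequence(n // 2, p, rng))) >= s_obs
               for _ in range(draws))
    return (hits + 1) / (draws + 1)
```

For an observed sequence whose treated group contains all the largest responses, the statistic is at its maximum and the Monte Carlo p-value is near its smallest attainable value, as expected for an extreme treatment effect.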
An important question is whether the statistic in (4.2) has an asymptotically standard normal distribution. Smythe and Wei (1983) pointed out that the unconditional re-randomization test following the original biased coin is not asymptotically normal. When investigating the conditional re-randomization test for the original biased coin, Hollander and Peña (1988) conjecture that it also lacks asymptotic normality; that is, (4.2) lacks asymptotic normality. Their claim is based on an example showing the exact distribution of (4.2) for 2n_1 = 50 and 2n_1 = 70, p = 2/3, and an extreme score vector a = (1 − ā, 2 − ā, …, 2n_1 − ā)′, obtained by applying simple ranks to a response vector with a very strong time trend. This example is replicated in Figure 10 using the Monte Carlo method for both sample sizes, alongside the simulated cumulative distribution for a much larger sample size, 2n_1 = 300. While in this example it is not possible to distinguish the 2n_1 = 50 and 2n_1 = 70 curves, either with the exact method, as they have shown, or with the Monte Carlo approximation as in [14], the 2n_1 = 300 curve is somewhat distinct from the 2n_1 = 50 curve but still distant from the standard normal. This additional example also points to a limiting distribution that is likely not standard normal, and supports the conjecture that the re-randomization distribution of the linear rank statistic following the conditional biased coin design with p > 1/2, equivalently the conditional re-randomization distribution following the original biased coin design given final balance, is not asymptotically standard normal. Thus, when computing p-values for these re-randomization tests, it may be more suitable to use the Monte Carlo or the exact method rather than reference the standard normal distribution.

Discussion
The original intent of Efron's biased coin design (1.1) was to force a sequential random assignment to be balanced; however, its probability of final balance is not 1 for a bias p ∈ [1/2, 1). By conditioning (1.1) to the subset of sequences that are balanced, the conditional biased coin design (2.1) guarantees final balance while also controlling the intermediate imbalance. The permuted block design for equal allocation was shown to be a special case of the conditional biased coin design when (2.1) is used to randomize in blocks. In general, the conditional biased coin with blocks and p ∈ (1/2, 1) can substitute the permuted block design for equal allocation with the added bonus of a reduced expected number of deterministic assignments.
Although the parameter space for p in (1.1) is [1/2, 1], the results in (2.3) and (2.2) also hold for 0 ≤ p ≤ 1 since (2.3) and (2.2) find the probability mass on a set by partitioning it into disjoint subsets of sequences having the same number of allocations made with probabilities p, q and 1/2. This implies a conditional biased coin design with an expanded parameter space to include 0 ≤ p < 1/2, which has no practical value for the biased coin design since the probability mass is shifted towards extreme sequences, away from intermediate and final balance. In the conditional biased coin design, however, p < 1/2 impacts only the intermediate balance and enables the search for 0 ≤ p ≤ 1 that minimizes the design's expected selection bias factor E(F ) in a sequence of 2n 1 trials.
One can also derive a result similar to (2.4) for the unequal allocation case and p > 1/2. However, in the unequal allocation case the conditional biased coin with p > 1/2 does not preserve the allocation ratio and thus is not recommended for use in clinical trials. On the other hand, both the random allocation rule and the permuted block design for unequal allocation do preserve the unconditional allocation ratio. Proschan, Brittain, and Kammerman (2011) and Kuznetsova and Tymofyeyev (2012) discuss the problems arising from the use of randomization procedures not preserving the allocation ratio. Biased coin randomization procedures for unequal allocation that also preserve the allocation ratio are provided in [9] and [10].

Appendix E: Deriving E(T_iT_j) for the covariance matrix
Let T_i and T_j be the ith and jth assignments from a sequence of allocations generated by the conditional biased coin design, 1 ≤ i < j ≤ n. Then

Cov(T_i, T_j) = P(T_i = 1, T_j = 1) − P(T_i = 1)P(T_j = 1)
= P(T_i = 1, T_j = 1 | N_1(n) = n_1) − P(T_i = 1 | N_1(n) = n_1) P(T_j = 1 | N_1(n) = n_1),

where, on the right-hand side, T_i and T_j are the corresponding ith and jth assignments from the original biased coin. It remains to expand each component of this difference:

P(T_i = 1, T_j = 1 | N_1(n) = n_1) = P(T_i = 1, T_j = 1, N_1(n) = n_1) / P(N_1(n) = n_1).

The numerator of this expression is expanded separately by rewriting it as a sum of probabilities over disjoint sets and then applying Bayes' rule. The remaining quantity is E(T_i) = P(T_i = 1) = P(T_i = 1 | N_1(n) = n_1) = 1/2, by the allocation ratio preserving property.
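The identities above can be verified numerically for small n. The sketch below is an illustrative brute-force check, not the paper's closed-form expansion: it conditions Efron's coin on final balance by enumeration and computes E(T_i) and Cov(T_i, T_j). The allocation-ratio-preserving property E(T_i) = 1/2 and the fact that each row of the covariance matrix sums to zero (because Σ_i T_i = n_1 is constant) serve as checks.

```python
import itertools

def efron_weight(seq, p):
    """Path probability of a 0/1 sequence under Efron's biased coin (1 = treatment)."""
    w, n_t = 1.0, 0
    for i, t in enumerate(seq):
        n_c = i - n_t
        w *= 0.5 if n_t == n_c else (p if (n_t < n_c) == (t == 1) else 1 - p)
        n_t += t
    return w

def moments(n1, p):
    """Exact E(T_i) and Cov(T_i, T_j) under the conditional biased coin, by
    enumerating all balanced sequences of length 2*n1 and renormalizing."""
    seqs = [s for s in itertools.product((0, 1), repeat=2 * n1) if sum(s) == n1]
    w = [efron_weight(s, p) for s in seqs]
    z = sum(w)
    n = 2 * n1
    e = [sum(wi * s[i] for wi, s in zip(w, seqs)) / z for i in range(n)]
    cov = [[sum(wi * s[i] * s[j] for wi, s in zip(w, seqs)) / z - e[i] * e[j]
            for j in range(n)] for i in range(n)]
    return e, cov
```

Since T_i² = T_i, the diagonal entries equal E(T_i)(1 − E(T_i)) = 1/4, and the off-diagonal entries in each row must sum to −1/4.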
Appendix F: Proof of Proposition 3.1

Some new notation is introduced. Let D_n be the absolute difference after n allocations of Efron's biased coin design between the number of subjects assigned to control, n − N_1(n), and the number assigned to treatment, N_1(n), that is, D_n = |n − 2N_1(n)|. The quantity in question converges to p as n_1 tends to infinity, since the ratio of the two probabilities converges to 1 by (F.1).
For i ≥ 2, the sum in the first line of (3.1) is expressed as: