Improved inference for vaccine-induced immune responses via shape-constrained methods

We study the performance of shape-constrained methods for evaluating immune response profiles from early-phase vaccine trials. The motivating problem for this work involves quantifying and comparing the IgG binding immune responses to the first and second variable loops (V1V2 region) arising in the HVTN 097 and HVTN 100 HIV vaccine trials. We consider unimodal and log-concave shape-constrained methods to compare the immune profiles of the two vaccines, which is reasonable because the data support that the underlying densities of the immune responses could have these shapes. To this end, we develop novel shape-constrained tests of stochastic dominance and shape-constrained plug-in estimators of the Hellinger distance between two densities. Our techniques are either free of tuning parameters or rely on only one, yet their performance is better than (for the tests of stochastic dominance) or comparable with (for the estimators of the Hellinger distance) that of the nonparametric methods. The minimal dependence on tuning parameters is especially desirable in clinical contexts where analyses must be prespecified and reproducible. Our methods are supported by theoretical results and simulation studies.


Introduction
To date, the RV144 trial conducted in Thailand is the only vaccine efficacy trial to show a signal of efficacy (31%) against HIV infection (Rerks-Ngarm et al., 2009). RV144 inspired a phase 1b trial, named HIV Vaccine Trials Network (HVTN) 097, which evaluated the safety and immunogenicity of the same regimen in a South African population (Gray et al., 2019). The predominant subtype of HIV in South Africa is Clade C. Therefore, in an effort to increase the potential for high efficacy against clade C infections, scientists modified the HVTN 097 regimen to include HIV strains matched to the South African clade C infections (Bekker et al., 2018). A phase 1/2 trial named HVTN 100 assessed the safety and immunogenicity of the modified regimen in South Africa. However, the HVTN 702 phase 2B/3 trial of the HVTN 100 regimen in South Africa met its non-efficacy criteria in February 2020 at a planned interim analysis.
In light of the above, the comparison between the immune response profiles of the HVTN 097 and HVTN 100 trials becomes important because such a comparison can shed some light on why the HVTN 100 regimen lacks efficacy. To fix ideas, here we focus on one class of immune responses, namely the binding of IgG antibodies to the first and second variable loops (V1V2 region) of the HIV envelope. This immune response is of particular interest because the RV144 trial revealed an inverse association between HIV infection and this immune response among vaccinees (Haynes et al., 2012). Using data from HVTN 097 and HVTN 100, we focus on answering the following three questions: Q1. How can we estimate the densities of the aggregated IgG binding immune responses (to HIV-1 envelope proteins)? Q2. Is there any ordering between the distributions of the immune responses from the two trials? Q3. How can we measure the discrepancy between the densities of the immune responses from the two trials?
Although Q1 is not directly related to the comparison of the two vaccine trials, answering Q1 is important because the resulting density estimators can help in designing subsequent vaccines. To answer Q2, we resort to testing for stochastic dominance, which addresses the ordering of the underlying distribution functions. To answer Q3, we rely on the squared Hellinger distance as a measure of discrepancy between two densities, where, for densities f and g, the Hellinger distance D(f, g) is defined by D(f, g) = [(1/2) ∫_R {√f(x) − √g(x)}² dx]^{1/2}. The significance of Q3 may not be immediately obvious. However, the squared Hellinger distance between the immune responses of the HVTN 097 and HVTN 100 trials can serve as a benchmark when new pairs of vaccines are compared in future vaccine trials. The reason for choosing the squared Hellinger distance as the measure of discrepancy is discussed in Section 5.
There are numerous nonparametric methods that can be implemented to carry out the aforementioned steps. However, traditional nonparametric methods do not exploit any information on the shape of the underlying densities. The homogeneity of the trial population and our exploratory analyses suggest that the underlying densities can be unimodal. The data are also consistent with the possibility that the densities are log-concave. The latter is an important subclass of unimodal densities, often advocated for use in modelling because it contains most of the well-known subexponential unimodal densities and allows powerful density estimation tools (Walther et al., 2009). Of late, shape-constrained density estimation has gained much attention. Among other reasons, the reduced burden of external tuning parameters (Samworth and Sen, 2018; Johnson et al., 2018) makes shape-constrained density estimation an attractive alternative to traditional nonparametric approaches such as kernel-based or basis-expansion methods, which are known to be sensitive to the choice of tuning parameters (cf. p. 327, Efromovich, 2008; Laha, 2021). Also, shape-constrained density estimation methods require weaker smoothness assumptions for asymptotic consistency than do traditional nonparametric methods. Shape-constrained techniques are widely used in economics and operations research (Johnson et al., 2018), and have seen application in other domains such as circuit design (Hannah and Dunson, 2012). For a detailed account of the recent developments in the shape constraint literature, we refer to the survey articles Samworth and Sen (2018) and Samworth (2018).
Although leveraging shape information can potentially increase efficiency, there is little to no literature on the application of shape-constrained tools in vaccine trials. To answer our motivating questions, therefore, we develop new methods using shape-constrained tools. The application of the new methods is not limited to vaccine trials. For example, our tests can be applied to other areas of medical research where tests of stochastic dominance are relevant. See Leshno and Levy (2004) for an in-depth discussion of potential uses of tests of stochastic dominance in medical research. In particular, our methods are applicable to the mortality data considered in Leshno and Levy (2004) to infer whether a surgery increases the mortality of patients with abdominal aortic aneurysm. See also Stinnett and Mullahy (1998) and DeFauw (2011) for the use of stochastic dominance in cost-effectiveness analysis of healthcare. Outside medical research, our tests have substantial applicability in finance, economics, social welfare, and operations research, where stochastic dominance is a popular tool to compare portfolios, income, utility, poverty, opportunity, etc.; see Levy (1992), Sriboonchita et al. (2009), and Le Breton (1991), among others, for a detailed account. On the other hand, the Hellinger distance has also seen successful application as a measure of discrepancy in various disciplines ranging from machine learning (Cieslak and Chawla, 2009; González-Castro et al., 2013) to ecology (Rao, 1995) to fraud detection (Yamanishi et al., 2004). However, for income data, log-concavity based methods should be used with caution since distributions with log-concave densities are always sub-exponential, whereas income data can have heavier tails. Diagnostic tools such as those in Asmussen and Lehtomaa (2017) can be used to check whether the data have heavier, e.g. regularly varying, tails in such cases.

Organization of article and main contributions
Although the methods developed in this paper are general, the HVTN data, which we describe in Section 2, are central to our application. Below we briefly discuss our methods and main contributions. We write f 097 and f 100 for the densities of the immune responses in the HVTN 097 and HVTN 100 trials, respectively, and F 097 and F 100 for the corresponding distribution functions.
Estimating f 097 and f 100 : In vaccine trials, traditionally a kernel density estimator (KDE) is used for the purpose of density estimation (cf. Miladinovic et al., 2014). However, using cross-validation, in Section 3, we show that the log-concave maximum likelihood estimator (MLE) based estimators of Dümbgen and Rufibach (2009) and Chen and Samworth (2013) minimize the estimated mean integrated squared error (MISE) among a class of shape-constrained density estimators and KDEs.

Shape-constrained tests of stochastic dominance:
The claim of a stochastic ordering between two samples is made stronger when it is backed by a test of stochastic dominance. We say a distribution function F stochastically dominates another distribution function G in the first order (written F ⪰ G) if F(x) ≤ G(x) for all x ∈ R. If X and Y are two random variables with distribution functions F and G, respectively, then in this case we say X stochastically dominates Y, and write X ⪰ Y. The dominance is regarded as "strict" (F ≻ G or X ≻ Y) if, in addition, there exists x ∈ R such that F(x) < G(x). If F does not strictly stochastically dominate G, then this event is defined as the non-dominance (F ⊁ G) of F over G (Whang, 2019, p. 25).
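For readers who prefer a computational view of this definition, the following minimal base-R sketch checks the empirical analogue of first-order dominance on the pooled sample points. The simulated data are placeholders, not the HVTN measurements.

## Empirical check of first-order stochastic dominance F >= G,
## i.e. F(x) <= G(x) for all x, evaluated on the pooled sample points.
set.seed(1)
x <- rnorm(100, mean = 1)   # sample from F (shifted upwards)
y <- rnorm(100, mean = 0)   # sample from G
Fm <- ecdf(x)
Gn <- ecdf(y)
grid <- sort(c(x, y))                 # pooled evaluation points
diff_on_grid <- Gn(grid) - Fm(grid)   # G(z) - F(z); nonnegative under dominance
dominates <- all(diff_on_grid >= 0)
strict    <- dominates && any(diff_on_grid > 0)
c(dominates = dominates, strict = strict)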
The rejection of the null of non-dominance against the alternative of stochastic dominance makes the strongest case for ranking one distribution over the other (Davidson and Duclos, 2013; Álvarez-Esteban et al., 2016; Ledwina and Wyłupek, 2012). However, the resulting test suffers from a lack of power because the overlap of the distribution functions at the tails makes unrestricted stochastic dominance almost impossible to establish via hypothesis testing (cf. Davidson and Duclos, 2013; Whang, 2019). There are many ways to deal with this difficulty. For example, Ledwina and Wyłupek (2012) exploit the fact that dominance between two distribution functions implies a corresponding dominance between their Fourier coefficients under a carefully chosen basis. However, their method rejects the null of non-dominance in some scenarios when the distribution functions cross each other. Another remedy discussed in the literature (Kaur et al., 1994; Davidson and Duclos, 2013; Álvarez-Esteban et al., 2016) involves excluding the tail region because it does not contain enough reliable information for the problem at hand. Tests of this type focus only on a compact set D inside the interior of the combined support {0 < F + G < 2}. In Section 4, we follow the latter strategy because it allows for many ways of incorporating shape constraints with desirable asymptotic properties.
To the best of our knowledge, we are the first to introduce the use of shape constraints in the context of testing the null of non-dominance against stochastic dominance. Moreover, one of our nonparametric test statistics in Section 4.1 has not, to our knowledge, previously been studied in the context of testing the null of non-dominance. In Section 4.2, we show that the shape-constrained and nonparametric versions of our tests control the asymptotic type I error at any null configuration under reasonable conditions. We also show that our tests are asymptotically unbiased, and consistent against all alternatives lying in the interior of the class of alternative distributions. In Section 4.3, we empirically show that the shape-constrained tests have better power than their nonparametric counterparts although they have the same asymptotic critical values. Section 4.4 analyses the application of these tests to our data. The proofs are deferred to Appendix A.
Shape-constrained plug-in estimators of the squared Hellinger distance: In Section 5, we construct plug-in estimators of the squared Hellinger distance. It is well known that, unless bias-corrected, plug-in estimators based on the KDE generally have a first-order bias (cf. Section 2 of Robins et al., 2009). In contrast, for some smooth functionals, shape-constrained MLE based plug-in estimators do not require further bias correction (cf. Jankowski, 2010; Mukherjee et al., 2019). The results in Lopuhaä and Musta (2019) indicate that the squared Hellinger distance is an example of such a smooth functional. When the underlying density is unimodal, we show that our unimodal density based plug-in estimator enjoys the same asymptotic guarantees as the bias-corrected KDE based estimators (Kandasamy et al., 2015) under some regularity conditions. The simulation studies in Section 5.3 suggest that similar results hold for our log-concave MLE based plug-in estimators as well. In fact, our smoothed log-concave MLE based estimator shows stable performance across all settings, whereas even the bias-corrected version of the KDE based estimator struggles in some cases.
In the process, we develop theoretical tools for analyzing the asymptotic behavior of plug-in estimators based on the unimodal density estimator of Birgé, which may be of independent interest. We defer this analysis to Appendix B. The methods developed in this paper are implementable using the R package SDNNtests (Laha and Luedtke, 2020), which is available on GitHub.

Notations and terminologies
Before proceeding further, we introduce some notation that will be used throughout this paper. We consider two independent samples X 1 , . . . , X m and Y 1 , . . . , Y n drawn from distributions with densities f and g. We denote the corresponding distribution functions by F and G, respectively. The respective empirical distribution functions will be denoted by F m and G n . The pooled sample (X 1 , . . . , X m , Y 1 , . . . , Y n ) has size N = m + n. We denote the corresponding empirical distribution function by H N .
For k ≥ 1, we let ‖·‖_k denote the usual L_k norm, i.e. ‖µ‖_k = (∫_R |µ(x)|^k dx)^{1/k}, where µ is a function supported on the real line. Also, we denote ‖µ‖_∞ = sup_{x ∈ R} |µ(x)|. For a density f, denote by supp(f) the set {x : f(x) > 0}. For a concave function f : R → R, the domain dom(f) will be defined as in (Rockafellar, 1970, p. 40), that is, dom(f) = {x ∈ R : f(x) > −∞}. For a sequence of measures {P_n}_{n≥1}, we say P_n converges weakly to P, and write P_n →_d P, if lim_{n→∞} ∫ µ dP_n = ∫ µ dP holds for any bounded continuous function µ : R → R. For any two sets A, B ⊂ R, we denote by dist(A, B) the quantity min{|x − y| : x ∈ A, y ∈ B}. We let int(A) denote the interior of the set A.

Background: HVTN 097 and HVTN 100
This section presents an exploratory analysis of the dataset. For both trials, we consider the magnitude of IgG binding to the V1V2 region of seven clade C glycoprotein 70 antigens. The immune responses were measured by an HIV-1 binding antibody multiplex assay (BAMA). Following Haynes et al. (2012), Gray et al. (2019), and Bekker et al. (2018), we use the log-transformed net median fluorescence intensity (MFI) as the measure of immune response for statistical analysis.
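As a concrete illustration of how such a per-vaccinee response can be assembled, the following R sketch averages the log-transformed net MFI over antigens; the aggregation over the seven clade C antigens is described in the next paragraph. The column names (participant_id, antigen, log_net_mfi) are hypothetical, and the actual HVTN data structure may differ.

aggregate_response <- function(dat) {
  # dat: one row per (participant, antigen) with the log-transformed net MFI
  stopifnot(all(c("participant_id", "antigen", "log_net_mfi") %in% names(dat)))
  agg <- aggregate(log_net_mfi ~ participant_id, data = dat, FUN = mean)
  names(agg)[2] <- "immune_response"   # per-vaccinee aggregated response
  agg
}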
In HVTN 097 and HVTN 100, four injections of HIV vaccines were given at months 0, 1, 3, and 6. In this study, we only consider the responses measured two weeks after the month six vaccination, which is considered to be the peak immune response time point. We include only those vaccinees who (a) completed the first four scheduled vaccinations and provided samples at two weeks after the month six vaccination (known as vaccinated per-protocol participants), and (b) developed a positive immune response for at least one of the seven clade C V1V2 antigens. There are 68 and 180 such vaccinees in the HVTN 097 and HVTN 100 trials, respectively. We base our analysis on the aggregated response, averaged over the seven clade C antigens mentioned above, and refer only to the latter when we say "immune response". We let F 097 and F 100 denote the distribution functions corresponding to the aggregated responses from the two trials. Figure 1 illustrates the empirical CDFs, the histogram, the boxplot, and the KDEs of the immune responses from the two trials. Figure 1i illustrates that F 100 is always greater than F 097 except at the tails, hinting at stochastic dominance of the HVTN 097 immune response over the HVTN 100 immune response. The histogram in Figure 1ii, the boxplot in Figure 1iii, and the plot of the KDEs in Figure 1iv also suggest that the HVTN 097 trial induces a higher immune response. Bekker et al. (2018) also indicated that the magnitude of positive responses in the HVTN 100 trial is lower than that of the RV144 trial, which, in turn, is reported to be slightly lower than the responses observed in HVTN 097.


Density estimation

Unimodal estimator of Birgé: Birgé's unimodal density estimator requires a parameter η, but this parameter does not require actual tuning. Indeed, if we choose η = o(m^{-1}), then our Lemma A.5 in Appendix A ensures that the total variation distance between f̂_m and f̂⁰_m is o(m^{-1/2}) with probability one, where f̂⁰_m is defined to be the MLE of f had the true mode been known. Therefore, we choose η to be the inverse of the combined sample size of the two trials.
Smooth unimodal estimator: There is a substantial body of literature on smooth unimodal density estimators; see, for instance, Eggermont and LaRiccia (2000), Mammen et al. (2001), Hall and Huang (2002), Wolters (2012), Meyer (2012), Turnbull and Ghosh (2014), and Wolters and Braun (2018), among others. Smooth unimodal density estimators generally depend crucially on external tuning parameters, as mentioned previously. Since we opt for shape constraints mainly to avoid tuning parameters, and we already have at our disposal the tuning-free smoothed log-concave MLE, the smooth unimodal density estimators are not particularly attractive to us. Still, since the current section solely focuses on density estimation, we include some smooth unimodal estimators for comparison, namely the estimators of Turnbull and Ghosh (2014), Hall and Huang (2002), Wolters (2012), and Wolters and Braun (2018). Turnbull and Ghosh (2014) approximate the unknown unimodal density using Bernstein polynomials; we use their condition number approach to select the tuning parameters related to the degree of the polynomial. The estimators in Hall and Huang (2002), Wolters (2012), and Wolters and Braun (2018) are kernel based. To compute them, we use the R package scdensity with the default choice of bandwidth.
Log-concave estimators: Our first log-concave estimator is the MLE over the class of all log-concave densities. The MLE is continuous but non-smooth (Dümbgen and Rufibach, 2009). The second estimator is a smoothed version of the MLE (Dümbgen and Rufibach, 2009; Chen and Samworth, 2013). The smoothing parameter for the latter is data dependent and has a closed-form formula, and hence it does not require external tuning. Figure 2 displays these two density estimators. For more on the properties of the log-concave density estimators, see, e.g., Balabdaoui et al. (2009), Dümbgen et al. (2011), and Doss and Wellner (2016).

Kernel density estimators: We consider kernel density estimators (KDEs) with a Gaussian kernel. The bandwidth was chosen either by the univariate plug-in selector of Wand and Jones (1994) or by the univariate least-squares cross-validation (LSCV) selector of Bowman (1984) and Rudemo (1982) (see Figure 1iv).
We prefer the estimator f̂_m of a density f with the smallest mean integrated squared error (MISE), given by MISE(f̂_m) = E ∫_R {f̂_m(x) − f(x)}² dx. Noting that minimizing the MISE with respect to f̂_m is equivalent to minimizing E[∫_R f̂_m(x)² dx − 2 ∫_R f̂_m(x) f(x) dx], we estimate the latter quantity using ten-fold cross-validation. We also estimate the negative log-likelihood risk, which we denote by −ℓ_m. Based on these risks (see Table 1), our recommended estimators are the log-concave estimators, which exhibit the lowest risk in an overall sense. Table 1 indicates that Birgé's estimator excels in minimizing the −ℓ_m risk, but it has a higher estimated MISE than the other estimators. This can be attributed to its spike at the mode (see Figure 2), which contributes large positive terms to ℓ_m and to the estimated MISE. Grenander type unimodal estimators are known to exhibit such a "spike problem" at the mode (Walther et al., 2009), which is caused by the inconsistency of the density estimator at the mode (for more details, see Woodroofe and Sun, 1993; Balabdaoui et al., 2009).
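The cross-validated risk described above can be sketched in R as follows. This is a minimal illustration in which a Gaussian KDE stands in for the estimator being evaluated; any of the estimators above can be substituted through the fit and eval_density arguments.

## Ten-fold cross-validated estimate of E[ int fhat^2 dx - 2 int fhat f dx ],
## a surrogate for the MISE (up to the constant int f^2 dx).
cv_density_risk <- function(x, K = 10,
                            fit = function(tr) density(tr, kernel = "gaussian"),
                            eval_density = function(fhat, pts)
                              approx(fhat$x, fhat$y, xout = pts, rule = 2)$y) {
  folds <- sample(rep_len(1:K, length(x)))
  risks <- sapply(1:K, function(k) {
    fhat <- fit(x[folds != k])                                 # fit on training folds
    grid <- seq(min(x), max(x), length.out = 512)
    int_f2 <- sum(eval_density(fhat, grid)^2) * diff(grid)[1]  # int fhat^2 (Riemann sum)
    cross  <- mean(eval_density(fhat, x[folds == k]))          # estimate of int fhat f
    int_f2 - 2 * cross
  })
  mean(risks)
}

cv_density_risk(rnorm(200))   # example call on placeholder data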

Test of stochastic dominance
To provide an answer to Q2, we construct tests of the null of non-dominance against the alternative of stochastic dominance using the log-concave MLEs and the unimodal estimator of Birgé. We compare the resulting shape-constrained tests with their nonparametric counterparts.
Our shape-restricted methods rely on estimating the densities f and g. We denote the corresponding unimodal estimators of Birgé by f̂_m and ĝ_n, respectively. The construction of Birgé's estimators requires a tuning parameter η, which we set to be N^{-1}, where N = m + n. We let f̃_m and g̃_n denote the log-concave MLEs of f and g (Dümbgen and Rufibach, 2009), and write f̃^sm_m and g̃^sm_n for their respective smoothed versions (Chen and Samworth, 2013). The corresponding distribution functions will be denoted by F̂_m, Ĝ_n, F̃_m, G̃_n, F̃^sm_m, and G̃^sm_n, respectively. As m and n approach ∞, we assume that m/N → λ ∈ (0, 1). Letting H = λF + (1 − λ)G, for p ∈ (0, 1/2), we also define the sets

D_p = [H^{-1}(p), H^{-1}(1 − p)]  and  D_{p,m,n} = [H_N^{-1}(p), H_N^{-1}(1 − p)].   (4.1)

Following Kaur et al. (1994) and Davidson and Duclos (2013), we formulate the hypotheses as follows:

Construction of the tests
H_0: inf_{z ∈ D_p} [G(z) − F(z)] ≤ 0   versus   H_1: inf_{z ∈ D_p} [G(z) − F(z)] > 0.   (4.2)

The null configuration H_0 occurs if F = G, if G stochastically dominates F, or if F and G touch or cross each other on D. Thus our formulation is unable to reject the null when F and G touch at a point in D even if F ≻ G. This limitation seems to be unavoidable because such a configuration (F, G) lies on the common boundary shared by {(F, G) : F ≻ G} and {(F, G) : F ⊁ G}, and hence cannot be discriminated from the null without sacrificing control over the size of the test. Notably, our exploratory analysis (see Fig 1i) suggests that it is unlikely that F 097 and F 100 fall in this category. See Figure 3 for examples of different scenarios associated with our hypotheses.
Regarding the choice of D, we need to ensure that D is inside the combined support of f and g because otherwise inf_{z ∈ D} [G(z) − F(z)] will always be 0. The set D_p defined in (4.1) satisfies this criterion. In practice, we replace the unknown D_p by D_{p,m,n} defined in (4.1), which always utilizes 100(1 − 2p)% of the combined data. Naturally, if p is too small, rejection of the null will be difficult, whereas a large p will exclude a large portion of the data, which might be unnecessary. If only some particular interval of the data is of practical interest (e.g. some particular range of immune responses or biomarkers), we suggest setting D to be the smallest superset of that interval. In the absence of such prior knowledge, we suggest choosing the largest p so that D_{p,m,n} excludes the tail region where the empirical distribution functions overlap. We will return to this issue in Section 4.4, with a demonstration on our motivating dataset.

Now we are in a position to introduce our test statistics.

Minimum t-statistic: This statistic was first introduced by Kaur et al. (1994) in the context of second-order stochastic dominance, and then extended to the first order by Davidson and Duclos (2013). For distribution functions F_1 and F_2, this statistic is given by

T^min_{m,n}(F_1, F_2) = inf_{z ∈ D_{p,m,n}} [F_2(z) − F_1(z)] / √{F_1(z)(1 − F_1(z))/m + F_2(z)(1 − F_2(z))/n}.

Our tests reject H_0 for large values of T^min_{m,n}(F̂_m, Ĝ_n), T^min_{m,n}(F̃_m, G̃_n), and T^min_{m,n}(F_m, G_n). The last test statistic, which is nonparametric, equals the minimum t-statistic of Kaur et al. in the context of first-order stochastic dominance.

Two-sample empirical process (TSEP) type test statistic: Our second test rejects H_0 for large values of T^tsep_{m,n}(F̂_m, Ĝ_n), T^tsep_{m,n}(F̃_m, G̃_n), or T^tsep_{m,n}(F_m, G_n), where, for distribution functions F_1 and F_2, T^tsep_{m,n} is defined by

T^tsep_{m,n}(F_1, F_2) = √(mn/N) · inf_{z ∈ D_{p,m,n}} [F_2(z) − F_1(z)] / √{H_N(z)(1 − H_N(z))}.

Ledwina and Wyłupek (2013) use a test statistic similar to T^tsep_{m,n}(F_m, G_n) (the second test statistic in Section 2.2 of Ledwina and Wyłupek, 2013) for testing the null of stochastic dominance against non-dominance. The pivotal distribution of their test statistic is completely different from ours because they based their critical values on the configuration F = G, which is very different from what we will consider. We are not aware of any existing test which uses T^tsep_{m,n}(F_m, G_n) for testing non-dominance against stochastic dominance.
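A minimal R sketch of the nonparametric minimum t-statistic over D_{p,m,n} is given below; it follows the studentized-difference form recalled above. The shape-constrained versions are obtained by replacing the empirical CDFs with the estimated unimodal or log-concave distribution functions; the handling of ties and degenerate points here is only illustrative.

min_t_stat <- function(x, y, p = 0.05) {
  # x ~ F (size m), y ~ G (size n); reject H0 for large values of the statistic
  m <- length(x); n <- length(y)
  pooled <- sort(c(x, y))
  HN <- ecdf(pooled)
  D <- pooled[HN(pooled) >= p & HN(pooled) <= 1 - p]   # D_{p,m,n}: middle 100(1 - 2p)% of pooled data
  F1 <- ecdf(x)(D)
  F2 <- ecdf(y)(D)
  se <- sqrt(F1 * (1 - F1) / m + F2 * (1 - F2) / n)
  ok <- se > 0
  min((F2 - F1)[ok] / se[ok])
}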
Wilcoxon rank sum (WRS) type test statistic: The Wilcoxon rank sum (WRS) test is widely used for comparing two vaccines (cf. Miladinovic et al., 2014), although the WRS test is actually designed for testing a location shift. It is a popular choice for testing the null F = G against the alternative F = G(· − δ) for δ > 0 (Lee and Wolfe, 1976), and it is the most powerful nonparametric test for certain such location-shift hypotheses (cf. Example 25.46 of Van der Vaart, 1998). Although the WRS test is not designed to test the null of non-dominance, we include it to demonstrate its failure to control the type I error at some null configurations. The one-sided WRS test rejects H_0 for large values of T^wrs_{m,n}(F_m, G_n), where, for distribution functions F_1 and F_2, T^wrs_{m,n}(F_1, F_2) denotes the Mann-Whitney form of the two-sample WRS statistic. The corresponding shape-constrained versions are given by T^wrs_{m,n}(F̂_m, Ĝ_n) and T^wrs_{m,n}(F̃_m, G̃_n).

We excluded tests based on the smoothed log-concave MLE because a rigorous asymptotic analysis of the corresponding tests is out of the scope of the present paper. However, our empirical study in Section 4.3 includes the minimum t-test and the TSEP test based on the smoothed log-concave MLE. Our simulations indicate that the asymptotic critical values of the tests based on T^min_{m,n}(F̃_m, G̃_n) and T^tsep_{m,n}(F̃_m, G̃_n) remain valid for the corresponding smoothed log-concave tests. Our simulations also indicate that the finite sample performance of the tests based on the log-concave MLE and the smoothed log-concave MLE is quite similar. We leave the rigorous analysis of the tests based on the smoothed log-concave MLE for future study.

Remark 1. Since the nonparametric tests use the empirical distribution function, the shape-constrained methods do not gain any advantage in terms of tuning parameters. Also, we will see that the nonparametric and shape-constrained tests are asymptotically equivalent. For moderate sized samples, however, our simulations in Section 4.3 show that the shape-constrained tests exhibit better performance.

Remark 2. Davidson and Duclos (2013) proposed an empirical likelihood ratio approach to test H_0 versus H_1. We did not pursue this nonparametric approach in this paper because it does not extend easily to shape-constrained scenarios. Regardless, we point out that Davidson and Duclos showed that their empirical likelihood ratio test is asymptotically equivalent to the minimum t-test.
In the sequel, we may use the terms "log-concave" or "unimodal" to refer to the test statistics based on the log-concave or unimodal density estimators. For example, we may refer to T min m,n ( F m , G n ) and T min m,n (F m ,G n ) as the unimodal minimum t-statistic and the log-concave minimum t-statistic, respectively. Also, unless otherwise specified, the terms "null" and "alternative" will refer to the H 0 and H 1 defined in (4.2), respectively.
Fig 3: This figure displays plots of two distribution functions F (red) and G (blue). The range of x in these plots corresponds to D_p := D_p(F, G). (i) F = G on D_p. This is a null configuration. (ii) F and G cross each other on D_p. This is also a null configuration. (iii) F strictly stochastically dominates G; in fact G(x) > F(x) for all x ∈ D_p. This is an alternative configuration. (iv) F and G touch each other at an endpoint of D_p. This is a null configuration.

Asymptotic distribution
In this section, we explore the asymptotic distribution of our test statistics. We show that the minimum t-test and the two-sample empirical process (TSEP) test asymptotically control the type I error at each null configuration and are consistent against each (F, G) ∈ H_1. We also show that, with the exception of the test based on the log-concave MLE, the WRS type tests control the type I error for distributions in H^a_0 and are consistent against (F, G) ∈ H^a_1. We first prove the asymptotic results for the nonparametric test statistics. Then, we show that the shape-constrained test statistics are equivalent to their nonparametric counterparts up to an o_p(N^{-1/2}) term, which implies that the same critical values can be used for them.
We exclude the log-concave MLE based WRS statistic from our discussion because we are unable to infer its asymptotic limit. The difficulty arises from our inability to track the asymptotic behaviour of √m ‖F̃_m − F_m‖_∞. In the remainder of this section, by "shape-constrained test statistics" we will therefore refer only to T^min_{m,n}(F̂_m, Ĝ_n), T^tsep_{m,n}(F̂_m, Ĝ_n), T^wrs_{m,n}(F̂_m, Ĝ_n), T^min_{m,n}(F̃_m, G̃_n), and T^tsep_{m,n}(F̃_m, G̃_n). For the sake of clarity, in Table 2 we summarize the current state of results on the different tests discussed in this paper.

Table 2: Summary of the asymptotic and empirical results for the tests discussed in this paper.

Test                          | Asymptotic results                   | Empirical results
Nonparametric minimum t-test  | previously known (Kaur et al., 1994) | included in this paper

Before going into further details, we state a technical condition that will be required by all of our theorems.
Condition N. F and G are continuous. Also, F and G have densities f and g, respectively, such that supp(f ) ∪ supp(g) contains an open neighborhood of D p .
The first requirement of Condition N, i.e., the continuity of F and G, is necessary for the weak convergence of the empirical processes to Brownian bridges. The second requirement ensures that dist(D p,m,n , D p ) approaches 0 as m, n → ∞ with probability one. These assumptions are likely to be satisfied by our immune response data provided p is not too small (see Figure 1i and Figure 1ii). For the rest of the paper, we restrict our attention to F and G that satisfy Condition N.

Asymptotic critical values of the nonparametric tests
We begin our discussion with the minimum t-statistic and the TSEP statistic. Our first objective is to identify the null configurations that lead to the highest asymptotic type I error. Here we remind the readers that (F, G) is a null configuration if there exists x ∈ D_p so that G(x) ≤ F(x). One may guess that the interesting cases appear on the boundary of H_0. However, to formally discuss the boundary of H_0, we need to equip it with a suitable topology. To formalize our discussion, we consider the space F of all continuous distribution functions on R, and equip it with the uniform metric d(F_1, F_2) = sup_{x ∈ R} |F_1(x) − F_2(x)|. Consider the product space F × F, equipped with the corresponding product metric. By an abuse of notation, we denote by H_0 and H_1 the sets of all pairs (F, G) ∈ F × F that satisfy the hypotheses H_0 and H_1, respectively. For i = 0, 1, we denote the closure of H_i in F × F by cl(H_i). Then the boundary of H_i is given by bd(H_i) = cl(H_i) \ int(H_i). The following lemma characterizes bd(H_0) and int(H_0).
Lemma 1. The set H_0 is closed in F × F, and its boundary is given by bd(H_0) = {(F, G) ∈ F × F : inf_{z ∈ D_p}[G(z) − F(z)] = 0}. Moreover, bd(H_0) = bd(H_1). Also, the interior of H_0 is given by int(H_0) = {(F, G) ∈ F × F : inf_{z ∈ D_p}[G(z) − F(z)] < 0}.

Lemma 2. If (F, G) ∈ int(H_0), then T^min_{m,n}(F_m, G_n) →_p −∞ and T^tsep_{m,n}(F_m, G_n) →_p −∞.

The proof of Lemma 2 for the minimum t-statistic can be found in Whang (2019) (see also Davidson and Duclos, 2013); however, we include it in Appendix A for the sake of completeness. Lemma 2 indicates that non-trivial type I errors can originate only at the boundary of H_0, which is a subset of H_0 because the latter is a closed set (see Lemma 1). Lemma 1 also implies that bd(H_0) consists of all those F and G that touch each other on D_p. To concretize this idea, we define the contact set C_p by

C_p = {x ∈ D_p : F(x) = G(x)}.

Theorem 1 shows that the asymptotic distribution of the test statistics on bd(H_0) crucially depends on this contact set C_p and on H(C_p). The proof of Theorem 1 is given in Appendix A.0.2.
Theorem 1. Suppose (F, G) ∈ bd(H_0), m/N → λ, and F and G have continuous densities f and g satisfying inf_{x ∈ D_p} min{f(x), g(x)} > 0. Then the minimum t-statistic and the TSEP statistic converge weakly to limits that are functionals of a centred Gaussian process L_0 and that depend on (F, G) only through the contact set C_p and the set H(C_p); in particular, if C_p is a singleton {x} with x ∈ int(D_p), both statistics converge weakly to the standard Gaussian distribution. Here L_0 is the centred Gaussian process given by (A.17) of Appendix A, and H(C_p) is as in (4.7).
The Gaussian process L_0, which is defined in (A.17) of Appendix A, depends on F and G. We postpone further discussion of the form of L_0 until Appendix A. Next we discuss the implications of Theorem 1 for the minimum t-test, and then turn to the TSEP test.
Asymptotic critical value of the minimum t-test: Theorem 1 reveals an interesting fact: the length of C_p imposes a stochastic ordering among the limiting laws of the minimum t-statistic for the boundary configurations. To elaborate, let us consider (F_1, G_1) and (F_2, G_2) ∈ bd(H_0), with respective contact sets C¹_p and C²_p. Then, under the conditions of Theorem 1, the minimum t-statistic computed under (F_i, G_i) has a limiting law determined by C^i_p, for i = 1, 2.
If the contact sets satisfy the ordering C¹_p ⊂ C²_p, then the limiting law of the minimum t-statistic under (F_1, G_1) stochastically dominates that under (F_2, G_2). The extreme cases for C¹_p are the singleton sets {x}, where x ∈ D_p. In this case, the asymptotic distribution of both test statistics is standard Gaussian. Therefore, we set the critical value of our tests to be z_α, the (1 − α)-th quantile of the standard Gaussian distribution. The class of boundary configurations with a singleton contact set is referred to as "the least favorable class" (LFC) (Davidson and Duclos, 2013). Figure 4 illustrates the difference between an LFC configuration and an ordinary non-LFC boundary configuration.
Theorem 2 of Davidson and Duclos (2013) shows that there is no null configuration under which the law of the minimum t-statistic strictly stochastically dominates that of an LFC configuration. This result, which holds for any m and n, implies that, among the null configurations, the LFC configurations are the ones under which rejection is most likely. This finding, in conjunction with our Theorem 1, implies that our tests, whose critical values are based on the LFC class, are likely to have asymptotic size α. Establishing this stronger assertion, which goes beyond asymptotic control of the type I error, can be an interesting topic for further investigation.
Fig 4: The contact set is a singleton, so this is an LFC configuration.
Remark 3. The asymptotic behavior of the nonparametric minimum t-statistic has been studied previously (Davidson and Duclos, 2013; Kaur et al., 1994). However, previous studies focus only on the asymptotic behaviour of the minimum t-statistic at the LFC configurations and in the interior of H_0, whereas our results show that there are other classes of boundary configurations with non-vanishing type I error. Although the existing results suffice for constructing critical values, our new results provide a more complete understanding of the scenario.

Asymptotic critical value of the TSEP test: The asymptotic distribution of the TSEP statistic under H_0 also exhibits a monotonicity property similar to that of the minimum t-statistic. To elaborate, suppose the pairs (F_1, G_1) and (F_2, G_2) have respective contact sets C¹_p and C²_p satisfying C¹_p ⊂ C²_p. Then the limiting law of the TSEP statistic under (F_1, G_1) stochastically dominates that under (F_2, G_2). Therefore, similar to the case of the minimum t-test, the configurations with singleton H(C_p) constitute the class of LFC configurations for the TSEP test. Under Condition N, H is continuous and strictly increasing on C_p, so H(C_p) is a singleton precisely when C_p is a singleton {x}. To find the critical value, it therefore suffices to study the asymptotics in such LFC cases. If x ∈ int(D_p), the TSEP test statistic weakly converges to a standard Gaussian distribution by part C of Theorem 1. Up to this point, there has been no difference between the TSEP and the minimum t-test statistics. If, however, x ∈ bd(D_p) (see Figure 15iv), the asymptotic distribution of the TSEP test statistic can be different. To this end, we first state a lemma, and then use it to explain the asymptotics of the TSEP test in this boundary case.

Lemma 3. Suppose F and G are as in Theorem 1 and H(C_p) ⊂ {p, 1 − p}. Then the quantity σ²_TSEP defined in (4.9) satisfies the assertions A, B, and C described below.

When H(C_p) ⊂ {p, 1 − p}, using Theorem 1B and Lemma 3 one can show that T^tsep_{m,n} converges weakly to a centred Gaussian distribution with variance σ²_TSEP. Part A of Lemma 3 implies that if F and G touch at C_p, i.e. if f = g at the point of contact, then σ²_TSEP is still one. Hence, the TSEP test statistic is asymptotically standard Gaussian in this case. However, if F and G cross at the point of contact instead of touching, i.e. if f ≠ g at the point of contact, then σ²_TSEP > 1. The precise value of σ²_TSEP is given by (4.9). Moreover, the value of σ²_TSEP increases as the values of f and g at the point of contact move further apart. On one hand, if f is much larger than g, then part B of Lemma 3 implies that σ²_TSEP is close to λ^{-1}. On the other hand, if g is much larger than f, then part C of Lemma 3 indicates that σ²_TSEP is close to (1 − λ)^{-1}. These bounds are tight because, by part B of Lemma 3, σ²_TSEP cannot be larger than max{λ^{-1}, (1 − λ)^{-1}} under the setup of Theorem 1.
The above discussion leads to the following conclusion for the TSEP test. If we want to control the asymptotic type I error of the TSEP test at all null configurations, then we should use the critical value C_{m,n} z_α, where C_{m,n} = max{√(N/m), √(N/n)}. There is a caveat, however. To see this, we begin by noting that C_{m,n} ≥ √2, with equality holding only when m = n. For example, for our motivating dataset, C_{m,n} ≈ 1.9. However, if m/N → 0 or n/N → 0, then C_{m,n} → ∞. Thus if m and n are not close to being equal, C_{m,n} can be a large quantity. Therefore, using C_{m,n} z_α as the critical value yields a conservative test. Hence, we will call the corresponding TSEP tests the conservative TSEP tests. Our simulations indicate that the conservative TSEP tests have poorer power compared to the minimum t-test even when m and n are equal, and their power keeps degrading as C_{m,n} increases.
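For concreteness, the conservative critical value for the motivating sample sizes of Section 2 (m = 68 and n = 180) can be computed in a couple of lines of R.

m <- 68; n <- 180; N <- m + n; alpha <- 0.05
C_mn <- max(sqrt(N / m), sqrt(N / n))        # approximately 1.91 here
c(C_mn = C_mn,
  conservative_critical_value = C_mn * qnorm(1 - alpha),
  standard_critical_value     = qnorm(1 - alpha))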
If instead we use the critical value z_α, then we control the asymptotic type I error at the null configurations with C_p ⊂ int(D_p) or those with f = g at C_p. This only excludes the null cases where F and G may cross at bd(D_p) (see Figure 3 iv). These types of configurations can be considered pathological cases. Moreover, our simulations in Section 4.3 (see case b) show that even when F and G cross at bd(D_p), the TSEP tests with critical value z_α control the type I error. The TSEP tests with critical value z_α have decent power, and their overall performance is comparable with that of the minimum t-tests. In view of the above, we recommend using the asymptotic critical value z_α when using the TSEP test.
Our final result on the minimum t-test and the TSEP test establishes their asymptotic consistency.
The asymptotic distribution of the WRS statistic is well established in the literature. Suppose F and G are continuous distribution functions. In that case, it is well known that, when F = G, the WRS statistic is asymptotically distributed as a standard Gaussian random variable, i.e. T^wrs_{m,n}(F_m, G_n) →_d N(0, 1) (Dwass, 1956).

Asymptotic critical values of the shape-constrained tests
We will show that under some additional conditions, the difference between the nonparametric and the shape-constrained test statistics is o p (1), which automatically implies that the shape-constrained tests enjoy the same asymptotic properties as the nonparametric tests.
For the unimodal case, the additional condition is a curvature condition, which requires f and g to be nowhere flat within their respective domains.
Condition A. For the density µ, the Lebesgue measure of the set {µ′ = 0, µ > 0} is 0, where µ′ denotes the derivative of µ.

For densities satisfying Condition A, √m(F̂⁰_m − F) almost surely weakly converges to V ∘ F, where V is a Brownian bridge. In this case, it can also be shown (Kiefer and Wolfowitz, 1976) that √m ‖F̂⁰_m − F_m‖_∞ converges to zero almost surely. The densities that violate Condition A form the boundary of the class of unimodal densities. For these densities, the limiting behaviour of √m(F̂⁰_m − F) is more convoluted, and we refer to Beare et al. (2017), Carolan and Dykstra (1999), and Carolan (2002) for more details on the limiting process. Just to give an example, if f is the uniform density on [0, 1], then the limiting process is the least concave majorant of a Brownian bridge (Carolan and Dykstra, 1999). Condition A is thus required to ensure that neither f nor g is one of these problematic boundary densities.
Lemma 4. Suppose that f and g are unimodal densities satisfying Condition A. Further suppose that f and g are bounded away from 0 on an open set containing D_p, and that m, n satisfy m/N → λ. Then the unimodal test statistics T^min_{m,n}(F̂_m, Ĝ_n), T^tsep_{m,n}(F̂_m, Ĝ_n), and T^wrs_{m,n}(F̂_m, Ĝ_n) differ from their nonparametric counterparts by o_p(1) terms.

In the case of the log-concave test statistics, however, we require a smoothness condition as well as a curvature condition. We will quantify smoothness via a Hölder condition. For a compact set K ⊂ R and constants β ∈ (1, 2] and L > 0, a function h is said to be in the Hölder class H^{β,L}(K) if h is differentiable on K and |h′(x) − h′(y)| ≤ L|x − y|^{β−1} for all x, y ∈ K. We say that a density µ (µ = f or g) satisfies Condition B1 if the following holds.
In addition to Condition B1, we also require f and g to satisfy a curvature condition.
Condition B2. Let φ = log µ. There exists a constant C > 0 such that φ′(y) − φ′(x) ≤ −C(y − x) for all x < y in K, where φ′ is the left derivative or the right derivative of φ.
Note that, since φ is concave, its left and right derivatives always exist. If φ is twice differentiable on K, Condition B2 reads as φ″(x) ≤ −C for x ∈ K. Conditions of the type of B1 and B2 also appear in Dümbgen and Rufibach (2009).
Lemma 5. Suppose that f and g are log-concave densities satisfying Conditions B1 and B2. Suppose, further, that f and g are bounded away from 0 on an open set containing D_p, and that m/N → λ. Then the log-concave test statistics T^min_{m,n}(F̃_m, G̃_n) and T^tsep_{m,n}(F̃_m, G̃_n) differ from their nonparametric counterparts by o_p(1) terms.

Remark 4. All results of Section 4.2 hold if we replace D_p by a compact set D as long as there is a p > 0 so that D ⊂ D_p and D_p satisfies the conditions of the theorems stated in Section 4.2.

Simulations
This section compares the performance of the shape-constrained tests designed in Section 4.1 with their nonparametric counterparts. We let m = n = 100, which is reflective of the sample sizes anticipated in many phase 1b or phase 2 vaccine trials; for example, our motivating dataset has m + n = 248. For all TSEP tests in this section, we use the critical value z_α. See Appendix C for simulations with TSEP tests that use the critical value C_{m,n} z_α. For the TSEP test and the minimum t-test, we also include the tests based on the smoothed log-concave MLE. We will refer to the corresponding tests as the smoothed log-concave tests. Although we do not have any theoretical results for these tests, we use the asymptotic critical value z_α.
By the design of our hypotheses, H_0 encompasses a broad range of cases, from G ⪰ F and F = G to cases where F and G touch or cross each other. We develop simulation schemes that allow us to explore a wide range of such scenarios. Our simulation schemes involve a parameter γ varying over the range [0, 1]; here γ quantifies the difference between the data generating distribution functions F_γ and G_γ. We evaluate the power ν(γ) at a grid of equally spaced points in [0, 1].
For our simulation study, we consider five cases, (a)-(e). In case (d), F_γ ∼ Gamma(2, 1) and G_γ ∼ Pareto(0.5 + 2γ, 1); here Gamma(a, b) denotes a Gamma distribution with shape parameter a and scale parameter b, and Pareto(a, b) denotes the Pareto distribution with shape parameter a and scale parameter b. Case (a) corresponds to the traditional setting of a null of equality against a shift alternative. The last four cases cover H_0 combinations that include crossing. In those cases, F_γ and G_γ cross each other on int(D_p) at γ = 0. As γ increases, however, F_γ and G_γ eventually touch (cases c, d, and e), or cross (case b), each other at bd(D_p), generating an LFC configuration. We denote the corresponding γ by γ*. Finally, at γ = 1, F_γ strictly dominates G_γ in the sense of H_1. Figures 18i, 17i, 17ii, and 18ii in Appendix D display the plots of F_γ and G_γ for several values of γ in cases (b), (c), (d), and (e), respectively. Figure 5 illustrates the densities in cases (a), (d), and (e) for γ = 1. Cases (d) and (e) are chosen to reflect violations of the shape constraints that we consider. The Pareto density in (d) violates the log-concavity assumption, and the normal mixture in (e) violates both the unimodality and the log-concavity assumptions. All other densities satisfy both shape constraints.
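The following R sketch estimates the rejection rate of the nonparametric minimum t-test in case (d), reusing the min_t_stat function sketched in Section 4.1; it uses far fewer Monte Carlo replicates than the 10,000 used in our study and is meant only to illustrate the simulation scheme.

rpareto <- function(n, shape, scale = 1) scale * runif(n)^(-1 / shape)   # Pareto(shape, scale) sampler

reject_rate_case_d <- function(gamma, m = 100, n = 100, B = 500, p = 0.05, alpha = 0.05) {
  mean(replicate(B, {
    x <- rgamma(m, shape = 2, scale = 1)        # F_gamma ~ Gamma(2, 1)
    y <- rpareto(n, shape = 0.5 + 2 * gamma)    # G_gamma ~ Pareto(0.5 + 2*gamma, 1)
    min_t_stat(x, y, p = p) > qnorm(1 - alpha)
  }))
}

set.seed(1)
sapply(c(0, 0.5, 1), reject_rate_case_d)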
We evaluate the properties of the tests under consideration using 10,000 Monte Carlo replicates. We set the p in D_{p,m,n} to be 0.05, where D_{p,m,n} was defined in (4.1). Also, the level of significance is 0.05 for all our tests. The tests based on the empirical cumulative distribution function will be referred to as NP (nonparametric) tests. For brevity, we will refer to the nonparametric, unimodal, log-concave, and smoothed log-concave tests as the NP, UM, LC, and smoothed LC tests, respectively. Figure 7 displays the power curves for the minimum t-test and the TSEP test. In terms of power, the LC and smoothed LC tests generally outperform the UM tests, which in turn generally outperform the NP tests. In their simulation study, Davidson and Duclos (2013) reported the NP minimum t-test to be conservative, which aligns with our observation. Also, the shape-constrained minimum t-tests have slightly higher power than the shape-constrained TSEP tests, although the difference is not always significant. For the NP tests, the rejection rates of the minimum t-tests and the TSEP tests are almost identical.
Except in case (d), where G_γ is Pareto, all tests control the type I error in all null set-ups, including the LFC configuration where γ = γ*. Since LFC configurations constitute the boundary of H_0, this observation indicates that our tests have size 0.05 in all cases except case (d). Although the Pareto density violates only the log-concavity assumption, apparently no test has the correct size in this case, with the NP and UM tests performing the best in terms of size. Also, for case (d), the smoothed LC test has lower type I error than the LC test. Surprisingly, in case (e), where both log-concavity and unimodality are violated, the shape-constrained tests have overall better performance than the NP tests, although all tests exhibit poor power in this case.
All the WRS type tests, including the LC WRS test (whose asymptotic behavior is yet unknown), exhibit a much larger size than 0.05 in all cases except the null-of-equality type case (a), which is the ideal scenario for WRS type tests. Figure 8 illustrates the power curve ν(γ) in cases (b) and (c), where all three WRS tests exhibit very high type I error at several null configurations. Our findings are consistent with the observation in Ledwina and Wyłupek (2012) that the WRS test can be misleading for testing the null of non-dominance.
In summary, under a correctly specified model, the shape-constrained minimum t-test and TSEP test outperform their nonparametric counterparts, with the log-concave minimum t-test having the best power. When the shape constraints are violated, the nonparametric tests are not distinguishably better than the unimodal tests. On the other hand, the WRS tests do not control the type I error for null configurations with crossing, as expected. Another important takeaway from this section is that, with the critical value z_α, the smoothed log-concave tests seem to perform as well as the log-concave tests under log-concavity.
The current article uses asymptotic critical values for the above tests, but bootstrap critical values could also be an option. Bootstrapping the shape-constrained tests, however, is not straightforward: generating observations from a shape-constrained LFC configuration poses challenges that involve solving non-trivial optimization problems. Further discussion in this direction is out of the scope of the present paper, and we therefore leave bootstrap tests for future research.

Application to HVTN 097 and HVTN 100 data
Our first task is to select a p for choosing D_{p,m,n}. Figure 1 and some inspection show that the empirical distribution functions are very close on the sets (−∞, H^{-1}(0.072)] and [H^{-1}(0.975), ∞), in that they either cross or touch each other on these regions. This is the problematic region we wish to exclude from D_{p,m,n}, because clearly there is not enough evidence of any dominance there. Therefore, we set the p in D_{p,m,n} to be 0.075. We remark that it would be ideal to choose p in a systematic way without looking at the data. However, constructing a rigorous procedure for choosing p is out of the scope of the present paper, and we leave it for future research.

It is natural to ask if we can estimate the power of our tests at (F 097, F 100). Because (F 097, F 100) is unavailable, we analyze the power in a neighborhood of (F̃^sm_097, F̃^sm_100) instead, where F̃^sm_097 and F̃^sm_100 are the distribution functions of the smoothed log-concave MLEs (Chen and Samworth, 2013) of f 097 and f 100, respectively.
Let us denote the smoothed log-concave MLE of the pooled sample by f̃⁰_{m,n}. Letting F̃⁰_{m,n} denote the corresponding distribution function, we consider the mixture distributions

F̃^sm_097(γ) = (1 − γ)F̃⁰_{m,n} + γF̃^sm_097  and  F̃^sm_100(γ) = (1 − γ)F̃⁰_{m,n} + γF̃^sm_100,   (4.11)

where γ ∈ [0, 1]. Note that, similar to case (a) in Section 4.3, here also γ quantifies the departure of the configuration (F̃^sm_097(γ), F̃^sm_100(γ)) from the null of equality of distributions. Also, the distance between F̃^sm_097(γ) and F̃^sm_100(γ) increases as γ approaches 1. We denote the densities of F̃^sm_097(γ) and F̃^sm_100(γ) by f̃^sm_097(γ) and f̃^sm_100(γ), respectively. Now observe that when γ ∈ (0, 1), the mixture densities may not be log-concave or even unimodal. Hence, we compute the log-concave projections (Dümbgen et al., 2011) of f̃^sm_097(γ) and f̃^sm_100(γ). The log-concave projection of a density f is the log-concave density closest to f in Kullback-Leibler (KL) divergence. Since the log-concave projection of an arbitrary density is not directly computable, we adopt a two-step approach to approximate the log-concave projections. In the first step, we simulate 1000 observations from each of f̃^sm_097(γ) and f̃^sm_100(γ). In the second step, we calculate the smoothed log-concave MLE density estimators of Chen and Samworth (2013) based on the samples simulated in the first step. The resulting densities are our approximate smoothed log-concave projections of f̃^sm_097(γ) and f̃^sm_100(γ). Finally, we generate two samples of size 68 and 180 from the projected densities using the methods in Dümbgen and Rufibach (2010) and the R package logcondens, and replicate this process 10,000 times. Figure 9 shows that all the tests exhibit decent power for larger values of γ. The LC minimum t-test exhibits the highest power, which is unsurprising since the underlying data are generated from log-concave densities. Also, since F̃^sm_097(γ) = F̃^sm_100(γ) at γ = 0, the power curves resemble those of case (a) in our simulation schemes.
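Returning to the choice of p at the beginning of this subsection, the heuristic of excluding the tail regions where the empirical distribution functions cross or touch can be sketched in R as follows. This is our own illustration of the idea, not a prescribed procedure; as noted above, a rigorous choice of p is left to future research.

choose_p <- function(x, y, tol = 0) {
  # x: sample whose distribution is conjectured to dominate (here HVTN 097),
  # y: the other sample (here HVTN 100); we want G_n - F_m > 0 on D_{p,m,n}
  pooled <- sort(c(x, y))
  HN <- ecdf(pooled)(pooled)
  d  <- ecdf(y)(pooled) - ecdf(x)(pooled)
  lower <- max(c(0, HN[HN <= 0.5 & d <= tol]))   # last crossing/touch in the lower tail
  upper <- min(c(1, HN[HN >= 0.5 & d <= tol]))   # first crossing/touch in the upper tail
  max(lower, 1 - upper)                          # smallest p excluding both problematic regions
}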

Measures of discrepancy
This section describes an approach for quantifying the difference between f and g via estimates of the squared Hellinger distance D²(f, g), which provides insights complementary to the tests of stochastic dominance presented in Section 4. The Hellinger distance is an example of an f-divergence. These divergences are widely used to measure the similarity or dissimilarity between two probability measures. Compared to other commonly used f-divergences such as the KL divergence or the χ² divergence (Nielsen and Nock, 2014), the Hellinger distance is appealing due to the symmetry in its arguments, which is a desirable property for a measure of discrepancy. Another crucial advantage of the Hellinger distance is that its value is finite for every pair of densities (Gibbs and Su, 2002). In contrast, the KL and χ² divergences can both be infinite if the densities under consideration do not share the same support. It is not always reasonable to assume that the underlying densities of responses collected from different vaccine trials will have the same support. Therefore, the Hellinger distance appeals to us more than the KL divergence or the χ² divergence. The Hellinger distance has also seen successful application in various disciplines ranging from machine learning (Cieslak and Chawla, 2009; González-Castro et al., 2013), to ecology (Rao, 1995), to fraud detection (Yamanishi et al., 2004). Finally, we choose to work with the squared Hellinger distance instead of the Hellinger distance because, due to its simpler form, the squared version lends itself more easily to efficient estimation procedures. Regardless, a 95% confidence interval for the Hellinger distance can always be constructed from one for its squared version.
We estimate the squared Hellinger distance between f and g using the same density estimators involved in the construction of the tests of stochastic dominance, that is, the log-concave MLE (Dümbgen and Rufibach, 2009) and its smoothed version (Chen and Samworth, 2013), or Birgé's estimator. Recalling the definitions of the density estimators f̂_m, ĝ_n, f̃_m, g̃_n, f̃^sm_m, and g̃^sm_n from Section 4.1, we propose the plug-in estimators D²(f̂_m, ĝ_n), D²(f̃_m, g̃_n), and D²(f̃^sm_m, g̃^sm_n) for estimating D²(f, g). We refer to the resulting estimators as the "unimodal", "log-concave", and "smoothed log-concave" estimators, respectively.
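All three plug-in estimators reduce to numerically integrating the Hellinger affinity of the two estimated densities. A minimal R sketch, using the convention D²(f, g) = (1/2)∫(√f − √g)² dx = 1 − ∫√(fg) dx, is given below; any estimated density functions can be passed in, and the parametric densities in the example merely stand in for the estimators above.

hellinger_sq <- function(fhat, ghat, lower, upper, ngrid = 2048) {
  # fhat, ghat: vectorized density functions (e.g. evaluations of the log-concave
  # MLE, its smoothed version, or Birge's estimator)
  grid <- seq(lower, upper, length.out = ngrid)
  step <- diff(grid)[1]
  affinity <- sum(sqrt(pmax(fhat(grid), 0) * pmax(ghat(grid), 0))) * step
  max(0, 1 - affinity)   # guard against small numerical overshoot
}

hellinger_sq(function(z) dnorm(z, 0, 1), function(z) dnorm(z, 1, 1), -8, 9)  # approx 0.118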

Asymptotic properties of the unimodal estimator:
We will show that, under some regularity conditions, the unimodal plug-in estimator D²(f̂_m, ĝ_n) is a √N-consistent, asymptotically normal estimator of D²(f, g). Letting b, B > 0, we denote by P(b, B) the class of densities that are bounded below and above by b and B on their support, that is, P(b, B) = {f : b ≤ f(x) ≤ B for all x ∈ supp(f)}. We will assume that f, g ∈ P(b, B). Simulations suggest that this condition may not be necessary; however, it is required for technical reasons in our proof. Such technical conditions are quite common in the literature, and have appeared in the analysis of plug-in estimators (Kandasamy et al., 2015) and of functionals of Grenander estimators (Mukherjee et al., 2019).
Theorem 3. Suppose f and g are unimodal densities in P(b, B) for some b, B > 0. Further suppose that f and g satisfy Condition A and m/N → λ. Then √N{D²(f̂_m, ĝ_n) − D²(f, g)} converges weakly to a centred Gaussian distribution with variance σ²_{f,g}, where σ²_{f,g} is as defined in (5.1).

Note that, since σ²_{f,g} can be consistently estimated by plugging in the estimator D²(f̂_m, ĝ_n), a Wald-type confidence interval is readily available for D²(f, g). Also, the asymptotic variance σ²_{f,g} equals the lower bound on the asymptotic variance of a regular estimator of the squared Hellinger distance under the nonparametric model (Kandasamy et al., 2015). See Van der Vaart (1998) and Birgé and Massart (1995) for more details on the lower bound and related theory.
We now give a high-level explanation of the idea behind Theorem 3. It can be shown that the squared Hellinger distance is a smooth functional of the underlying distribution functions in a suitable sense. Specifically, we will show that it admits a first-order von Mises expansion (Fernholz, 2012), which has the same essence as a Taylor series expansion (Kandasamy et al., 2015). On the other hand, for F̂⁰_m, the unimodal MLE of F based on the true mode, we show that √m(F̂⁰_m − F) converges weakly, almost surely, to the process V ∘ F, where V is a Brownian bridge, under Condition A. It can then be shown via a delta-method type argument, applied to the squared Hellinger distance functional, that (5.3) holds for the unimodal MLEs of F and G. The final step in proving Theorem 3 is showing that the squared Hellinger distance between Birgé's estimator and the MLE is small.
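To make the expansion step concrete, a schematic first-order von Mises expansion of the squared Hellinger functional takes the following form. The influence functions shown come from the standard calculation for (f, g) ↦ 1 − ∫√(fg) under the convention used here, and are stated only as an illustration of the argument; the formal treatment is in Appendix B.

\[
D^2(\hat F, \hat G) - D^2(F, G) = \int \psi_f \, d(\hat F - F) + \int \psi_g \, d(\hat G - G) + R_{m,n},
\]
\[
\psi_f(x) = \tfrac{1}{2}\Bigl\{\int \sqrt{f g}\,dz - \sqrt{g(x)/f(x)}\Bigr\}, \qquad
\psi_g(y) = \tfrac{1}{2}\Bigl\{\int \sqrt{f g}\,dz - \sqrt{f(y)/g(y)}\Bigr\},
\]
where the remainder R_{m,n} is of second order. Combining such an expansion with the weak convergence of the empirical and shape-constrained distribution function estimators is what drives results of the form (5.3).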

Asymptotic properties of the log-concave estimators:
Recall that our log-concave estimators of the squared Hellinger distance are given by D 2 (f m ,g n ) and D 2 (f sm m ,g sm n ), where we remind the reader thatf m ,g n are the log-concave MLEs andf sm m ,g sm n are the smoothed log-concave MLEs. Lemma 6 implies that these estimators are strongly consistent for D 2 (f, g) provided the shape constraint holds.
Lemma 6. Suppose that the densities f and g are log-concave and continuous. Then, as m, n → ∞, D 2 (f m ,g n ) → a.s. D 2 (f, g) and D 2 (f sm m ,g sm n ) → a.s. D 2 (f, g).
Our simulations indicate that both the log-concave and the smoothed log-concave plug-in estimators are √N-consistent with asymptotic variance σ²_{f,g} when f and g are continuous log-concave densities. Therefore, in our empirical study, we include Wald-type confidence intervals based on these log-concave plug-in estimators as well. Our simulations in Section 5.3 indicate that, under violation of the continuity assumption, the log-concave plug-in estimator may still remain √N-consistent, but the √N-consistency of the smoothed log-concave plug-in estimator may fail to hold.
It may be possible to analyze the √N-consistency of the log-concave estimators along the lines of Kulikov and Lopuhaä (2006) and Groeneboom (1984, 1989), which pertain to the shape restriction of monotonicity. However, we leave this investigation for future research because a detailed treatment of the √N-consistency of the log-concave estimators is out of the scope of the present paper. We remark in passing that proving the √N-consistency of the log-concave plug-in estimator may be easier in some special cases, e.g., when the logarithms of f and g are linear or piecewise affine, by using the results of Kim et al. (2018). We do not pursue this direction because it is unlikely that, for our data, f and g belong to such restricted classes.
Remark 5. The case of model misspecification is of natural interest in the study of shape-constrained estimators. Suppose f and g are not log-concave, but they are bounded continuous densities with finite first moments. Then it follows that there exist unique log-concave densities f* and g*, which are the almost sure limits of f̂_m and ĝ_n in both the uniform and the L₁ metric (Theorem 4). Here f* and g* are the log-concave projections of f and g, respectively, in the sense of Dümbgen et al. (2011). Using the same arguments as in the proof of Lemma 6, it can be shown that the log-concave estimator of the squared Hellinger distance converges almost surely to D²(f*, g*). If f and g additionally have finite second moments, a similar phenomenon takes place for the smoothed log-concave MLEs as well. In this case, however, f̂_m^sm and ĝ_n^sm converge uniformly (and also in L₁) to different limits f** and g**, which can be interpreted as the respective smoothed versions of the log-concave projections f* and g* (cf. Theorem 1, Chen and Samworth, 2013). In this case also, the smoothed log-concave estimator of the squared Hellinger distance converges almost surely to D²(f**, g**). In summary, if the log-concavity assumption is violated, the log-concave plug-in estimators converge to a different limit, whose distance from D²(f, g) depends on the departure of f and g from log-concavity.

KDE based plug-in estimators:

The natural nonparametric comparators of the shape-constrained plug-in estimators are the KDE based plug-in estimators. Though the latter are simple to implement, they can have a bias of order ‖f̂_{m,k} − f‖₂² + ‖ĝ_{n,k} − g‖₂² (cf. Section 2 of Robins et al., 2009), where f̂_{m,k} and ĝ_{n,k} are the KDEs of f and g, respectively. This bias decreases to zero at a rate slower than N^{-1/2} (Stupfler, 2014), thereby leading to suboptimal performance. See Section 5 of Kandasamy et al. (2015) for more discussion on the disadvantages of the KDE based naïve plug-in estimators.
One can improve the naïve plug-in estimator, however, using a one-step Newton–Raphson procedure (cf. Van der Vaart, 1998; Pfanzagl and Wefelmeyer, 1985), which leads to a bias-corrected plug-in estimator. In the context of the Hellinger distance, the bias-corrected estimator (D²)*(f̂_{m,k}, ĝ_{n,k}) takes the form given in Kandasamy et al. (2015), where ψ_f and ψ_g are the influence functions corresponding to the functional (f, g) → D²(f, g). We formally introduce the influence functions in Appendix B. Note that learning the form of the bias-corrected estimator thus requires the explicit computation of the influence functions ψ_f and ψ_g. From a broader perspective, each time one estimates a functional of the underlying distributions using a bias-corrected plug-in estimator, one has to carry out extra analytical calculations that depend on the functional of interest.
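For orientation, a hedged sketch of the generic form that such a one-step correction takes (consistent with the construction in Kandasamy et al. (2015), whose exact display we do not reproduce) is the following, with X₁, …, X_m and Y₁, …, Y_n denoting the two samples:

```latex
% Sketch of the influence-function (one-step) bias correction of the KDE plug-in
% estimator; psi_f and psi_g are evaluated at the kernel density estimates.
(\widehat{D^2})^{*}(\widehat f_{m,k}, \widehat g_{n,k})
  \;=\; D^2(\widehat f_{m,k}, \widehat g_{n,k})
  \;+\; \frac{1}{m}\sum_{i=1}^{m} \psi_f\bigl(X_i;\, \widehat f_{m,k}, \widehat g_{n,k}\bigr)
  \;+\; \frac{1}{n}\sum_{j=1}^{n} \psi_g\bigl(Y_j;\, \widehat f_{m,k}, \widehat g_{n,k}\bigr).
```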
In contrast, Theorem 3 shows that when the shape constraint is satisfied, the unimodal estimator of the squared Hellinger distance does not require any bias correction for √N-consistency. Our analysis in Appendix B indicates that this is not an artifact of the Hellinger distance, but rather the result of Birgé's estimator's proximity to the unimodal MLE, i.e., the Grenander estimator based on the true mode. In fact, Theorem 4 in Appendix B shows that, if T is a smooth functional of the underlying distribution functions, then the plug-in estimator based on the unimodal MLE does not require bias correction for √N-consistency. Our simulations (see Figure 18 therein) indicate that the log-concave estimators do not have any O_p(N^{-1/2}) bias term either. This allows users of correctly specified shape-constrained estimators to avoid analytic calculation of the influence functions entirely.
In our upcoming simulation study, we use both D²(f̂_{m,k}, ĝ_{n,k}) and (D²)*(f̂_{m,k}, ĝ_{n,k}) as comparators, where the KDEs are based on the Gaussian kernel. We refer to these estimators as the KDE estimator and the bias-corrected KDE estimator, respectively. As in Section 3, the kernel bandwidth is chosen using the univariate least squares cross-validation (LSCV) selector of Bowman (1984) and Rudemo (1982); a sketch of this selector is given after the description of the simulation settings below. Generally, kernel-based bias-corrected estimators satisfy √N-consistency results of the type (5.3) (cf. Theorem 6, Kandasamy et al., 2015). Therefore, we use the bias-corrected KDE based confidence intervals to benchmark the performance of our shape-constrained estimators. However, we do not report any confidence interval based on the naïve plug-in estimator because our simulations indicate that its coverage is usually much less than the nominal level. For the sake of clarity, Table 4 summarizes the current state of results on the above-mentioned estimators of the squared Hellinger distance: the √N-consistency of the shape-constrained estimators is proved under correct specification of the shape constraints, empirical studies are provided for all of the estimators listed, and an asterisk indicates that confidence intervals for the corresponding estimator are included in the empirical study of Section 5.3. To compare the performance of different estimators of the squared Hellinger distance, we consider the following combinations of f and g:
(f) f ∼ N(0, 1), and g ∼ Gamma(3.61, 1.41). The plots of the above schemes can be found in Figure 5 and Figure 6. In cases (a), (b), (c), (d), and (f), both shape constraints are satisfied, whereas in case (e), both shape constraints are violated. We will therefore refer to cases (a), (b), (c), (d), and (f) as the "correctly specified" cases and to case (e) as the "misspecified" case. In case (d), the logarithms of the densities are linear on their respective supports. This is also the only case where the densities are discontinuous (see Figure 6; the discontinuity is at zero). In case (f), g was chosen so as to resemble the density estimate of f_100 (see Figure 2 and Figure 6). This is the only correctly specified case where the densities are known to come from different families, and the densities also have very different shapes.
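Returning to the bandwidth selection for the KDE comparators, the following is a minimal sketch (not the authors' code) of the univariate LSCV criterion for a Gaussian-kernel KDE; the candidate bandwidth grid is an implementation choice.

```python
# Least-squares cross-validation (LSCV) bandwidth selection for a univariate
# Gaussian-kernel KDE, in the spirit of Bowman (1984) and Rudemo (1982).
# LSCV(h) = \int \hat f_h^2 - (2/n) \sum_i \hat f_{h,-i}(X_i); the first term has
# a closed form for the Gaussian kernel, and the second uses leave-one-out sums.
import numpy as np
from scipy.stats import norm

def lscv_score(x, h):
    x = np.asarray(x, dtype=float)
    n = len(x)
    d = (x[:, None] - x[None, :]) / h
    # \int \hat f_h^2 = (1 / (n^2 h)) * sum_{i,j} phi_{sqrt(2)}((X_i - X_j) / h)
    int_fhat_sq = np.exp(-d**2 / 4.0).sum() / (2.0 * np.sqrt(np.pi) * n**2 * h)
    k = norm.pdf(d)
    np.fill_diagonal(k, 0.0)                      # leave-one-out evaluation
    loo = k.sum(axis=1) / ((n - 1) * h)
    return int_fhat_sq - 2.0 * loo.mean()

def lscv_bandwidth(x, grid=None):
    x = np.asarray(x, dtype=float)
    if grid is None:
        grid = np.geomspace(0.05, 2.0, 60) * x.std(ddof=1)
    return min(grid, key=lambda h: lscv_score(x, h))

# Example usage on a simulated sample:
h_star = lscv_bandwidth(np.random.default_rng(0).normal(size=200))
```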
We generate two samples of the same size from each simulation setting. We vary the common sample size n from 50 to 500 in increments of 50. We do not consider larger values of n because, in our motivating phase 1b and phase 2 vaccine trial applications, the sample sizes are generally no larger than 500. We consider 10,000 Monte Carlo replications for each sample size. Figures 10 and 11 plot √n times the absolute bias, and Figure 12 plots n times the mean squared error (MSE). Figure 13i and Figure 13ii display, respectively, the coverage and the average length of the confidence intervals, both of which are estimated using 10,000 Monte Carlo samples. In the misspecified case (e), the performance of the shape-constrained estimators deteriorates sharply with n, which is unsurprising given the violation of the shape constraints in this case. In case (d), where the densities are exponential, the scaled bias of the smoothed log-concave plug-in estimator increases with n, which implies that √N-consistency does not hold for this estimator in this case. Closer inspection reveals that, although it is not √N-consistent, the smoothed log-concave plug-in estimator is still consistent in case (d). We found that the smoothed log-concave density estimator fails to approximate the exponential densities near zero, their point of discontinuity (see Figure 19 in Appendix D). It is worth mentioning that although the curvature Condition B2 is violated in case (d), this does not affect the performance of the log-concave plug-in estimator.
For all other cases, the smoothed log-concave estimator exhibits the best performance among the shape-constrained estimators. In all cases, the unimodal estimator underperforms. Importantly, Figure 13i indicates that the unimodal estimator would require a sample size larger than 500 for the Wald-type confidence interval to be valid, whereas for the other confidence intervals this sample size is sufficient for the asymptotics to kick in.
The KDE-based plug-in estimator exhibits scaled bias and scaled MSE that increase with n, which agrees with our earlier discussion of KDE based plug-in estimators. Its bias-corrected version performs comparably to the shape-constrained estimators in all cases except case (f). In case (f), the bias-corrected estimator yields a confidence interval with poor coverage, which is probably due to the large bias incurred by the original KDE-based plug-in estimator in this case (see panel (f) of Figure 11). Case (f) is a case where the underlying densities differ in both shape and scale. We suspect that the cross-validated bandwidth for the KDE estimator does not work well in this case, even with bias correction. Finally, the average lengths of the confidence intervals do not vary noticeably across methods.
In summary, the log-concave plug-in estimator exhibits reliable performance when the shape restriction holds. The smoothed log-concave plug-in estimator additionally appears to require the underlying densities to be continuous, but otherwise it performs comparably with the bias-corrected KDE-based plug-in estimator. The latter performs well in many settings, but it is not always reliable. The weaker performance of the KDE based methods in some settings is probably due to the variability of the optimal bandwidth across settings.

Application to HVTN 097 and HVTN 100 data:
Table 5 tabulates the point estimates of the squared Hellinger distance (between f_100 and f_097) and the corresponding 95% confidence intervals. To give the reader some perspective, if f_0 is a N(0, 1) density and f_µ is a N(µ, 1) density, then, when µ is equal to 1.00, 1.25, 1.50, 1.75, and 2.00, D(f_0, f_µ) is equal to 0.346, 0.424, 0.500, 0.566, and 0.624, respectively. Also, in Figure 14, we display some density pairs (f, g) satisfying a stochastic dominance ordering between F and G, with Hellinger distance in the range 0.35–0.60, which is similar to our data.
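These benchmark values can be checked against the closed form for two normal densities with a common variance. The following is a minimal sketch, assuming the convention D²(f, g) = 1 − ∫√(f g); for N(0, 1) versus N(µ, 1) this reduces to D² = 1 − exp(−µ²/8).

```python
# Closed-form Hellinger distance between N(0, 1) and N(mu, 1) under the
# convention D^2(f, g) = 1 - \int sqrt(f g):  D^2 = 1 - exp(-mu^2 / 8).
import numpy as np

def hellinger_normal_shift(mu):
    return np.sqrt(1.0 - np.exp(-mu**2 / 8.0))

for mu in (1.00, 1.25, 1.50, 1.75, 2.00):
    print(mu, round(float(hellinger_normal_shift(mu)), 3))
# The values increase from roughly 0.34 to 0.63 over this range, in line with
# the benchmarks quoted above (small discrepancies are attributable to rounding).
```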

Discussion
The first contribution of our work is a novel analysis of the data from the HVTN 097 and HVTN 100 trials. All of our tests reject the null of non-dominance in favor of the strict stochastic dominance of F_097 over F_100. To provide further insight into the discrepancy between the two IgG binding response distributions, we estimated the squared Hellinger distance between the corresponding densities, which turns out to be approximately 0.20 (95% CI: 0.10–0.30). We remark that our findings are consistent with those of Bekker et al. (2018), who found that the average magnitude of IgG binding to V1V2 antigens observed in the HVTN 100 trial is lower than that in the RV144 trial. Although the latter used the same regimen as HVTN 097, it was conducted in a different population (Thailand). Our findings indicate that the difference in the magnitude of the IgG binding response between the HVTN 100 and RV144 regimens may be attributable to the HIV clade difference rather than to the difference in populations.
The outcome of our tests becomes meaningful when viewed against the lack of efficacy observed in the phase 2b/3 HVTN 702 trial, which evaluated the safety and efficacy of the HVTN 100 regimen in South Africa. A possible hypothesis for why no efficacy was observed when the HVTN 100 regimen was evaluated in this trial, whereas efficacy was observed when the HVTN 097 regimen was evaluated in the RV144 trial, is that the HVTN 100 regimen leads to a lower magnitude of IgG binding to the V1V2 region. This possibility is supported by the observation made by Haynes et al. (2012) regarding the negative correlation between the rate of infection and the magnitude of IgG binding to the V1V2 region. This hypothesis can be tested when the immune profiles of the participants in the HVTN 702 trial become available.
Another contribution of our work relates to density estimation in the context of vaccine trials. Based on a cross-validated analysis of the HVTN 097 and HVTN 100 data, we believe that the log-concave density estimators of Dümbgen and Rufibach (2009) and Chen and Samworth (2013) may yield improved density estimation in vaccine studies. In future work, it would be worth further validating this claim on other vaccine trial datasets.
We also made several methodological contributions. In Section 4, we introduce three novel shape-constrained tests. These tests have the desirable asymptotic properties of nonparametric tests, and simulations illustrate that the shape-constrained tests have better overall performance than the nonparametric tests. Moreover, even under violation of the shape constraints, their performance is not much worse than that of the nonparametric tests. We also introduce shape-constrained plug-in estimators of the squared Hellinger distance and provide asymptotic consistency and distributional results. Our simulations suggest that, when the shape constraint is satisfied, the log-concave plug-in estimators exhibit overall lower MSE and absolute bias than the KDE based plug-in estimator. In fact, they perform comparably with the bias-corrected KDE based estimator. However, unlike the bias-corrected KDE plug-in estimator, the log-concave plug-in estimators require neither selecting a tuning parameter nor carrying out the analytic calculations needed to derive the bias-correction term. Therefore, the log-concave plug-in estimators may be preferred in settings where this shape constraint is plausible.

Acknowledgements
This work was supported by the National Institutes of Health (NIH) through award numbers DP2-LM013340 and 5UM1AI068635-09. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH. GlaxoSmithKline Biologicals SA was provided the opportunity to review a preliminary version of this manuscript, but the authors are solely responsible for final content and interpretation.

[Displaced figure captions: "KDE (BC)" and "smoothed LC" stand for the bias-corrected KDE estimator and the smoothed log-concave estimator, respectively. In one set of figures, for each estimator either the standard error is less than 0.006 or the relative standard error is less than 2%; in the other, all standard errors are less than 0.005 and hence are not plotted. Density pairs shown include, middle: f ∼ N(1.40, 1), g ∼ N(0, 1); right: f ∼ Gamma(5, 1), g ∼ Gamma(2, 1).]

Appendix A: Proofs for Section 4
Before proceeding any further, we introduce some new notation. We let l∞ denote the collection of all bounded functions on R, which we equip with the uniform metric ‖·‖∞. For any function µ : R → R, we define the norm ‖µ‖_a^b = sup_{x∈[a,b]} |µ(x)|. Also, we denote the boundary of a set A by bd(A). Suppose (Ω, 𝒜, P) is the common probability space corresponding to the X_i's and the Y_j's; for the rest of this section, →_p and →_a.s. refer to this probability space. Since F and G are continuous, using the construction of Section 1.1 of Shorack (1984) (see also p. 93 of Shorack and Wellner, 2009), we can show that there exist two independent Brownian bridges V₁ and V₂ on this probability space with the required approximation properties. Note that U is also distributed as a Brownian bridge, and we will show that the asymptotic distributions of our test statistics depend on U. In the sequel, we will also use the following fact on the convergence of the quantiles of H_N and of D_{p,m,n}, often without explicit mention: as a corollary to Fact 3.1A, it follows that Dist(D_{p,m,n}, D_p) →_a.s. 0 under Condition N. To prove parts B and C of Fact 3.1, it therefore suffices to establish the corresponding convergence of the quantiles. Since the sets D_q, D_p, and D_{p′} are nested, it is enough to show that H⁻¹ is continuous and strictly increasing on the relevant set U; the latter holds if H is continuous and strictly increasing on U. The proof now follows since we already showed that H has a positive density on U.

A.0.1. Proof of Lemma 1
To prove Lemma 1, we will use an alternative characterization of H₀ and H₁. Note that a pair of continuous distribution functions (F, G) belongs to H₀ if and only if sup_{x∈D_p}[F(x) − G(x)] ≥ 0. Because D_p is compact, such a pair is in H₁ if and only if sup_{x∈D_p}[F(x) − G(x)] < 0. We will prove the current lemma in several steps.
Step 1: closure of H₀. Suppose (F_n, G_n) ∈ H₀ converges to (F, G) ∈ F × F with respect to the metric d₂. Then sup_{x∈D_p}[F(x) − G(x)] ≥ sup_{x∈D_p}[F_n(x) − G_n(x)] − ‖F_n − F‖∞ − ‖G_n − G‖∞ ≥ −‖F_n − F‖∞ − ‖G_n − G‖∞, and letting n → ∞ shows that (F, G) ∈ H₀; hence H₀ is closed.

Step 2: interior and boundary of H₀. Suppose (F, G) ∈ H₀ satisfies sup_{x∈D_p}(F(x) − G(x)) > ε for some ε > 0. Consider any (F̃, G̃) such that d₂((F̃, G̃), (F, G)) < ε/3, which means ‖F̃ − F‖∞ < ε/3 and ‖G̃ − G‖∞ < ε/3. Since F̃ − G̃ is continuous and D_p is compact, the supremum of F̃ − G̃ over D_p is attained at some z ∈ D_p. We have sup_{x∈D_p}(F̃(x) − G̃(x)) ≥ sup_{x∈D_p}(F(x) − G(x)) − 2ε/3 > ε/3 > 0, so (F̃, G̃) ∈ H₀; hence (F, G) ∈ int(H₀).

Step 3: boundary of H₁. Our first step is to show (A.6). To this end, consider (F_n, G_n) ∈ H₁ converging to (F, G) ∈ F × F in d₂. Then sup_{x∈D_p}(F(x) − G(x)) − sup_{x∈D_p}(F_n(x) − G_n(x)) is bounded above by ‖F_n − F‖∞ + ‖G_n − G‖∞. Letting n → ∞, we obtain sup_{x∈D_p}(F(x) − G(x)) ≤ 0. Hence, (A.6) holds, which implies cl(H₁) ⊂ A ∪ H₁. Using Lemma A.1 we obtain cl(H₁) ⊂ A ∪ H₁ ⊂ bd(H₁) ∪ H₁ = cl(H₁). Hence, the inclusion in (A.6) is actually an equality and A ∪ H₁ = bd(H₁) ∪ H₁. The proof will be complete if we can show that int(H₁) = H₁, because then bd(H₁) = A follows. To that end, consider (F, G) ∈ H₁ and choose ε > 0 so that sup_{x∈D_p}(F(x) − G(x)) < −ε; for any (F̃, G̃) with d₂((F̃, G̃), (F, G)) < ε/3, arguing as in Step 2, sup_{x∈D_p}(F̃(x) − G̃(x)) is less than −ε/3. Therefore, (F̃, G̃) ∈ H₁, which completes the proof.
Lemma A.1. Under the set-up of Lemma 1, A ⊂ bd(H₁).

Proof. Consider (F, G) ∈ A. Because A ∩ H₁ = ∅, it suffices to show that (F, G) is a limit point of H₁. We will show that, given δ > 0, there exists (F̃, G̃) ∈ H₁ so that d₂((F, G), (F̃, G̃)) < δ.

Since F − G is continuous, its supremum over D_p is attained at some point of D_p. We choose a and β so that the additional conditions required below hold. Letting ε = min(δ/2, p/4), t₁ = F⁻¹(1 − ε), and t₂ = t₁ + 1, we define F̃ as in (A.7). That F̃ is a distribution function is clear from the definition. We claim that (a) F̃ is continuous, (b) ‖F̃ − F‖∞ < δ, and finally (c) (F̃, G) ∈ H₁. Taking G̃ = G, the proof of the current lemma then follows. Hence, it remains to prove Claims (a), (b), and (c). First, we prove Claim (a). Because F is continuous, max(0, F(x) − ε) is continuous in x on (−∞, a). Since F(a) > p/2 > ε, the continuity of F also implies F̃(a−) = F(a) − ε = F̃(a+). Therefore, (A.7) implies that F̃ is continuous on (−∞, a]. The continuity of F(x) − ε on (a, t₁] follows because F is continuous. Moreover, F̃ is linear on (t₁, t₂] and equals F on (t₂, ∞), and the left and right limits of F̃ agree at t₁ and t₂. Therefore F̃ is continuous.

To prove Claim (b), it suffices to show that |F̃(x) − F(x)| ≤ ε for any x ∈ (−∞, a) ∪ (t₁, t₂]. Note that if x < a and F(x) ≤ ε, then F̃(x) = 0. Thus, in every case, |F̃(x) − F(x)| is bounded by 2ε, which is not greater than δ by our choice of ε. Hence, we have shown that ‖F̃ − F‖∞ < δ.
To prove Claim (c), note that (−∞, a] cannot contain any limit point of C_p either. Thus the supremum of F̃ − G over the remaining region equals −ε. Combining the above pieces, we obtain sup_{x∈D_p}(F̃(x) − G(x)) < 0, which completes the proof that (F̃, G) ∈ H₁.

Proof of Lemma 2
Since (F, G) ∈ int(H₀), there exist x₀ ∈ D_p and δ > 0 such that G(x₀) − F(x₀) < −3δ. Because F and G are continuous, we can find x₀ ∈ int(D_p) such that G(x₀) − F(x₀) < −2δ. Hence, (A.41) indicates that, with probability one, G_n(x₀) − F_m(x₀) < −δ for all sufficiently large m and n, which leads to an upper bound on the statistic that approaches −∞ as m, n → ∞. Here we used the fact that x₀ ∈ D_{p,m,n}: because x₀ ∈ int(D_p), Fact 3.1 implies x₀ ∈ D_{p,m,n} almost surely for all large m and n. Therefore, the proof follows for the minimum t-statistic. Now we prove the result for T^tsep_{m,n}. Let us denote q = H(x₀), where H denotes λF + (1 − λ)G. Since either f > 0 or g > 0 on D_p under Condition N, it follows that h > 0 on D_p. Thus H⁻¹(q) = x₀ (see Lemma A.3.7, p. 94, Bobkov and Ledoux, 2016). Hence, since H⁻¹(q) ∈ int(D_p), it follows that q ∈ [p, 1 − p]. Fact 3.1 also implies that the relevant quantity diverges to −∞ because m/N → λ. Hence, the proof follows.

A.0.2. Proof of Theorem 1

Before proving Theorem 1, we introduce some notation and two lemmas. Recall that we defined the underlying probability space to be (Ω, 𝒜, P). There exists A ⊂ Ω with P(A) = 1 such that, as m, n → ∞, the following assertions hold on A. Define ω_{m,n} and ω₀ as in the displays (A.10)–(A.11). Let us denote C_{p,m,n} = D_{p,m,n} ∩ {x : F(x) = G(x)}. Proceeding as in the proof of Fact 3.1, we can show that Dist(D_{p,m,n}, D_p) → 0 and Dist(C_{p,m,n}, C_p) → 0 as m, n → ∞ (A.12) on A, where C_p is the contact set D_p ∩ {x ∈ R : F(x) = G(x)} discussed in Section 4. Now we state the first lemma, which we require for proving part (A).
Lemma A.2. Suppose D ⊂ R is such that F and G are bounded away from 0 and 1 on D. Then, under the conditions of Theorem 1, ‖ω_{m,n} − ω₀‖_D → 0 on A.

Proof. Because F and G are bounded away from 0 and 1 on D, there exist 0 < c < c′ < 1 such that F(x), G(x) ∈ (c, c′) for all x ∈ D. We see that (A.8) and m/N → λ imply that the corresponding lim sup bounds hold on A. Since, on the probability-one set A, ϑ_{m,n} converges to ϑ₀ uniformly, and both functions are bounded below on D, the analogous limit statement also holds; moreover, since lim sup_{m,n→∞} ‖ϑ_{m,n}^{-1/2}‖_D is bounded, the remainder term converges to 0 on A because m/N → λ and (A.9) holds. Also, on A, the Brownian bridges V₁ and V₂ have continuous trajectories, which implies that √λ V₂(G) − √(1 − λ) V₁(F) is a continuous function, and hence bounded on D. Therefore, on A, the remaining difference vanishes in the limit, which completes the proof.
The second lemma, which is required for proving part (B), relies on the objects s₁, s₂, and L₀ defined in (A.17) and (A.18). Because V₁ and V₂ are independent Brownian bridges, it is immediate that L₀ is a Gaussian process. We now state a lemma that concerns the two-sample empirical process: for any p ∈ (0, 1/2), the weak convergence (A.19) holds and, moreover, L₀ has continuous trajectories almost surely.
Proof of Lemma A.3. Theorem 4.1 and Corollary 4.1 of Pyke and Shorack (1968) yield (A.19), where we emphasize that (A.17) represents the corrected formula for L₀, given by (30) of Ledwina and Wyłupek (2012), rather than the original formula for this quantity given in (3.8) of Pyke and Shorack (1968). Now observe that, because h > 0 on D_p under Condition N, H is strictly increasing, which implies that H⁻¹ is a continuous function. Since f, g, and H⁻¹ are continuous, (A.18) implies that s₁ and s₂ are both continuous. Therefore, from (A.17) and the fact that V₁ and V₂ have continuous trajectories almost surely, it is not hard to see that L₀ is continuous almost surely.
Our next lemma characterizes the Gaussian process L 0 on the set C p when C p ⊂ int(D p ).
Lemma A.4. Suppose F and G are as in Theorem 1 and C_p ⊂ int(D_p). Then the Gaussian process {L₀(t) : t ∈ H(C_p)} is distributed as a (rescaled) Brownian bridge restricted to H(C_p).

Proof of Lemma A.4. If C_p = ∅, then the statement is vacuously true, so we will assume that C_p ≠ ∅. We first claim that, if C_p ⊂ int(D_p), then s₁(t) = s₂(t) = 1 for all t ∈ H(C_p). If the claim is true, then from (A.17) it follows that L₀ simplifies accordingly for all t ∈ H(C_p).
Since V₁ and V₂ are independent Brownian bridge processes, it follows that the Gaussian process {L(t) : t ∈ [0, 1]} defined in the display has the stated distribution; in particular, it can be seen that λ/(1 − λ) L is a Brownian bridge. Hence the proof follows if we can prove the claim that s₁(t) = s₂(t) = 1 for all t ∈ H(C_p). Since (F, G) ∈ bd(H₀), Lemma 1 implies that F − G attains its maximum on C_p whenever C_p ≠ ∅. Because F − G is continuously differentiable and C_p ⊂ int(D_p), f − g = 0 on C_p. Therefore, f(H⁻¹(t)) = g(H⁻¹(t)) for all t ∈ H(C_p). The claim now follows from (A.18).

A.0.4. Proof of part (A)
First we will consider the case C_p ≠ D_p. The main steps of the proof are as follows: (a) We fix ε > 0 and choose a set C_{p,m,n}(σ_ε) satisfying C_{p,m,n}(σ_ε) ⊂ D_{p,m,n}. Here C_{p,m,n}(σ_ε) depends on σ_ε, a random positive number that will be chosen appropriately. Next we partition D_{p,m,n} as indicated below (see Figure 15), from which we show that (4.8) follows.
We will restrict our attention only to the set A for this part of the proof. However, because P (A) = 1, this serves our purpose.

Proof of step (a)
For σ > 0 and p ∈ [0, 1], let us define C_p(σ) and C_{p,m,n}(σ) as in (A.26) and (A.27). The first task is to choose σ_ε properly so that certain properties hold on C_p(σ_ε).
Since A ⊂ Ω was chosen so that V₁ and V₂ have continuous trajectories on this set, ω₀ also has continuous trajectories, which are moreover uniformly continuous on D_p because the latter is a compact set. Hence, for each such trajectory, σ₁ = sup{σ > 0 : |x − y| < σ implies |ω₀(x) − ω₀(y)| < ε/2 for all x, y ∈ D_p} is well defined and positive. Note that σ₁ is a random quantity, which can take the value 0 on the set A^c, but σ₁ > 0 on A.
On the other hand, since C_p ≠ D_p, there exists x ∈ D_p such that G(x) − F(x) > 0. Set δ_p = (G(x) − F(x))/4. Because G − F is continuous and D_p is compact, D_p \ C_p contains at least one interval on which G − F > 2δ_p. Using the continuity of G − F, we can choose σ₂ so small that D_p \ C_p(σ₂) contains at least one interval on which G − F > δ_p; in particular, D_p \ C_p(σ₂) ≠ ∅. We will take σ_ε = min(σ₁, σ₂). Since σ₁ is random, σ_ε is also random. Moreover, σ_ε > 0 on A.
Note that B_{p,m,n}(σ_ε) and E_{p,m,n}(σ_ε) can be empty for small m and n; in that case, we define the infimum of ω_{m,n} over those sets to be ∞.
A.0.5. Proof of step (b)

For any σ > 0, let us denote B_p(σ) = D_p \ C_p(σ). We claim that B_p(σ_ε) is non-empty, which follows because our choice of σ_ε in step (a) implies B_p(σ_ε) ⊃ B_p(σ₂), where B_p(σ₂) ≠ ∅ by the definition of σ₂. By the continuity of G − F, there exists a random quantity δ_{σ_ε} > 0 such that G − F > δ_{σ_ε} on B_p(σ_ε). Recall the definition of ϑ_{m,n} from (A.13). Since f and g are bounded away from 0 on D_p, Lemma A.2 implies that the corresponding expansion holds on A, where the o(1) term approaches zero as m, n → ∞. Moreover, by (A.16), ϑ_{m,n}^{-1/2} is bounded above on D_p by a constant depending only on p, F, and G. Noting that the continuous function ω₀ is bounded on D_p, and using m/N → λ, we deduce the required lower bound on the lim inf. On the set A, we thus have D_{p,m,n} ⊂ D_{p′} eventually as m, n → ∞, which implies the desired bound.

Proof of step (c)
We denote the boundary of the set C_p by bd(C_p). One can show that C_p is a closed set, which implies bd(C_p) ⊂ C_p. Therefore, for all y ∈ bd(C_p), we have G(y) = F(y). Hence, by Lemma A.2, for any x ∈ E_{p,m,n}(σ_ε) and y ∈ bd(C_p) ∩ D_{p,m,n}, the displayed bound holds on A for all sufficiently large m and n; by our choice of σ_ε, this bound is not larger than ε/2. Here (a) follows because E_{p,m,n}(σ_ε) ⊂ C_{p,m,n}(σ_ε) ⊂ C_p(σ_ε) for sufficiently large m and n. The above leads to inf_{x∈E_{p,m,n}(σ_ε)} ω_{m,n}(x) ≥ inf_{y∈D_{p,m,n}∩bd(C_p)} ω_{m,n}(y) − ε.
Since bd(C_p) ∩ D_{p,m,n} ⊂ C_{p,m,n} (noting that C_{p,m,n} = D_{p,m,n} ∩ C_p), the right-hand side above is at least inf_{y∈C_{p,m,n}} ω_{m,n}(y) − ε, and we conclude this step.

Proof of step (d)
This step follows from Lemma A.2. To see this, note that Lemma A.2 implies that ‖ω_{m,n} − ω₀‖_{C_{p′}} → 0 on A as m, n → ∞. Because F = G on C_p, the corresponding simplification also follows. Since, on A, D_{p,m,n} ⊂ D_{p′} for all sufficiently large m and n, it follows that C_{p,m,n} ⊂ C_{p′} for all sufficiently large m and n as well. Therefore, on A, the desired convergence holds, where the last step follows from (A.29). The above concludes the proof of (4.8) when C_p ≠ D_p.

Now suppose C_p = D_p. In this case, we will only use steps (c) and (d). Let us restrict our attention to A. We define σ_ε = σ₁. Letting C_p(σ_ε) be as in (A.26), we have D_p ⊂ int(C_p(σ_ε)), and also D_{p,m,n} ⊂ int(C_p(σ_ε)) for sufficiently large m and n. Let us also denote C_{p,m,n} as in (A.27) and E_{p,m,n} = C_{p,m,n}(σ_ε) \ C_{p,m,n} as in step (c). Then, for large m and n, the partition D_{p,m,n} = E_{p,m,n}(σ_ε) ∪ C_{p,m,n} is valid. Therefore, the proof follows from combining steps (c) and (d).

Proof of part (B) of Theorem 1
We now prove part (B). For ease of reference, we let ν_{m,n} denote the process in the display, so that T^tsep_{m,n}(F_m, G_n) = inf_{t∈[p,1−p]} ν_{m,n}(t). We start by studying the numerator of this expression. Noting that N·H_N = m·F_m + n·G_n, we derive the corresponding decomposition. Combining the fact that sup_{t∈[0,1]} |H_N(H_N^{-1}(t)) − t| ≤ 1/N (p. 762 of Pyke and Shorack, 1968) with the fact that m/N → λ, we obtain (A.30) with probability one. Combining the above with (A.19), we arrive at (A.31), whose limiting object can be expressed in terms of the function ν defined in (A.32). If we take any subsequence of the random sequence on the left-hand side of the above, we can find a further subsequence along which it approaches zero almost surely. Suppose that we can show, along the latter subsequence, that T^tsep_{m,n}(F_m, G_n) − inf_{t∈H(C_p)} U(t)/[t(1 − t)]^{1/2} converges almost surely to zero. In light of the fact that the limit does not depend on the choice of sequence or subsequence, Theorem 5.7 of Shorack (2000) would then imply that the whole sequence converges weakly to the same limit, namely zero. Since weak convergence to a constant is equivalent to convergence in probability to that constant, this would complete the proof. Therefore, in what follows, we use m′, n′ to denote members of a subsequence along which (A.31) holds almost surely, and we set out to prove that, as m′, n′ → ∞, (A.33) holds. Hence we assume that there exists A′ ⊂ Ω such that P(A′) = 1 and, as m′, n′ → ∞, the relevant convergences hold on A′, where N′ = m′ + n′. We choose A′ so that L₀ has continuous trajectories on A′, which Lemma A.3 shows is possible. The rest of the proof is similar to the proof of part (A) because the asymptotics of the infimum of ν_{m′,n′} over [p, 1 − p] are largely governed by its numerator. Indeed, replacing the denominator of (A.10) from part (A) by the denominator [t(1 − t)]^{1/2} for part (B) changes little, since this new denominator is also bounded away from 0 on [p, 1 − p]. Nonetheless, there are some differences, which we detail below.
Fix ε > 0. We replace σ_ε from the proof of part (A) by σ′_ε, which we define as follows: if t, t′ ∈ [p, 1 − p] satisfy |t − t′| < σ′_ε, then, on A′, |L₀(t) − L₀(t′)| < ε/2. Note that since [p, 1 − p] is a compact set and the function L₀ has continuous trajectories on A′, the random quantity σ′_ε is positive on A′.
Let b_p denote the supremum of h over D_p, which is clearly positive. Because f and g are continuous, h is also continuous, and therefore b_p < ∞. Taking σ̃_ε = σ′_ε/b_p and, similarly to (A.26), defining C_p(z) = {x ∈ D_p : Dist(x, C_p) < z} for all z > 0, we observe that [p, 1 − p] can be written as the union of the three sets in the display. Note that the D_{p′} used in the proof of part (A) of the current theorem is replaced by D_p in the above partitioning. Part (B) differs from part (A) in that T^tsep_{m,n}(F_m, G_n) is the infimum of a random quantity, namely ν_{m,n}, over the fixed set [p, 1 − p], whereas T^min_{m,n}(F_m, G_n) is the infimum of ω_{m,n} over the random set D_{p,m,n}. To deal with this randomness, the asymptotics in part (A) were analyzed on a set D_{p′} ⊃ D_p constructed so as to ensure D_{p′} ⊃ D_{p,m,n} for all sufficiently large m and n almost surely. Since part (B) does not have this additional difficulty, it suffices to study the behavior of ν_{m,n} on D_p, circumventing the need to consider D_{p′}.
Recall the definition of ν(t) from (A.32). Because ν(t) ≥ 0, one can show that the infimum of ν over the random set B̃_p(σ̃_ε) is bounded below by some random number δ_{σ̃_ε} > 0. Therefore, using (A.34) and imitating the proof of (A.28), we can establish the corresponding lim inf bound. Next, let us consider t ∈ Ẽ_p(σ̃_ε) and t′ ∈ bd(H(C_p)). Note that (A.31) and (A.37) imply that the corresponding bound holds on A′, which indicates that, on this set, the displayed inequality holds for all sufficiently large m′ and n′. Because the above holds for any t ∈ Ẽ_p(σ̃_ε) and t′ ∈ bd(H(C_p)), we obtain the corresponding comparison of infima. The fact that ν is non-negative on [p, 1 − p] yields that inf_{t∈Ẽ_p(σ̃_ε)} ν(t) ≥ 0. Thus, the above shows that inf_{t∈Ẽ_p(σ̃_ε)} ν_{m′,n′}(t) ≥ inf_{t∈H(C_p)} ν_{m′,n′}(t) − ε. Therefore, using (A.38), we derive that the required bound holds on A′ for all sufficiently large m′ and n′. Recall that A′ was chosen to satisfy P(A′) = 1. Thus, the above convergence holds with probability one, from which (A.33) follows. As discussed above (A.33), this completes the proof of part (B). Part (C) follows from Lemma A.4.

The proof of Theorem 2 for the minimum t-statistic can be found in Whang (2019) (see also Davidson and Duclos, 2013); we nevertheless include it here for the sake of completeness.

Proof of Theorem 2
If (F, G) ∈ H₁, we have G(x) > F(x) for all x ∈ D_p. Since D_p is compact, the continuous function G − F attains its minimum, say δ > 0, at some x₀ ∈ D_p. As a result, T^min_{m,n} →_p ∞ follows because (A.39) indicates that, with probability one, the corresponding lower bound holds for all x ∈ D_{p,m,n} and all large m and n; the right-hand side of the last display is bounded below by (mn/N)^{1/2} δ. Since m/N → λ, it follows that T^min_{m,n} →_p ∞. Hence, it suffices to prove (A.39). To this end, note that, since max(f, g) > 0 on an open neighborhood of D_p, H = λF + (1 − λ)G has a positive density on this neighborhood. Therefore, H is continuous and strictly increasing on this neighborhood, and hence H⁻¹ is continuous and strictly increasing there as well. Hence, we can choose p′ < p such that H⁻¹(p′) < H⁻¹(p) and the infimum of G − F over D_{p′} remains positive. For all sufficiently large m and n, D_{p,m,n} ⊂ D_{p′} almost surely by Fact 3.1. Therefore, with probability one, the infimum over D_{p,m,n} of the relevant quantity behaves as required as m, n → ∞. On the other hand, the corresponding supremum over D_{p,m,n} converges to 0 almost surely. Therefore, (A.39) follows, which completes the proof for T^min_{m,n}. For T^tsep_{m,n}, note that (A.39), combined with the fact that (t(1 − t))^{-1/2} is bounded below on [p, 1 − p], yields that the statistic diverges to +∞, thus completing the proof.
A.1. Proofs for the shape-constrained test statistics

Before going into the proofs for the shape-constrained test statistics, we state and prove a useful lemma.
Lemma A.5. Suppose that f is a unimodal density satisfying Condition A. Let f̂_m be the unimodal density estimator of Birgé, based on the independent observations X₁, . . . , X_m with density f. Here we take η = o(m⁻¹), where η is the tuning parameter in Section 3. Denote by F̂_m the distribution function of f̂_m. Further suppose that f̂⁰_m is the Grenander estimator of f based on the true mode M. Then the following assertions hold.

Proof. Suppose that M is the true mode of the density f. In this case F can be written in the form given by Rao (1969). Let us denote the distribution function of f̂⁰_m by F̂⁰_m. From Rao (1969) it follows that F̂⁰_m can be expressed in an analogous form. Since F is continuous, the probability that X_i = M for some i ∈ {1, . . . , m} is 0; hence, there is no ambiguity in the above definition of F̂⁰_m. Also, the empirical distribution function of the X_i's can be written accordingly.

It is well known that, under Condition A, the distribution function of the Grenander estimator stays uniformly close to the empirical distribution function, so that the displayed quantity converges to 0 almost surely (see Theorem 2.1 of Beare et al. (2017); the original result dates back to Kiefer and Wolfowitz (1976)). Similar results hold for F̂^{0,−}_m and F̂^{−}_m. Since α̂_m →_a.s. α, we conclude that (A.43) holds. To prove part (A) of the current lemma, we now invoke Theorem 1 of Birgé, which bounds the distance between Birgé's estimator and the Grenander estimator in terms of η, where η is as defined in Section 3; in our case, η = O(1/m). This, combined with (A.43), proves that the right-hand side of the above display approaches 0 almost surely. Thus part (A) of the current lemma is proved. Finally, part (C) of the current lemma follows by noting that the relevant difference converges to 0 as m → ∞ by part (B) of the current lemma and (A.1).
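For intuition, the phenomenon driving the argument above is of Kiefer–Wolfowitz type: under curvature conditions of the kind imposed here, the Grenander-based distribution function estimator tracks the empirical distribution function at a rate faster than m^{-1/2}. A hedged statement (the exact logarithmic factor is inessential for our purposes), with 𝔽_m denoting the empirical distribution function of X₁, . . . , X_m, reads:

```latex
% Kiefer--Wolfowitz type bound, stated up to the logarithmic factor.
\sup_{x \in \mathbb{R}} \bigl| \widehat F^{0}_m(x) - \mathbb{F}_m(x) \bigr|
   = O_{a.s.}\!\bigl( m^{-2/3} \log m \bigr),
\qquad \text{so that} \qquad
\sqrt{m}\, \sup_{x \in \mathbb{R}} \bigl| \widehat F^{0}_m(x) - \mathbb{F}_m(x) \bigr|
   \;\xrightarrow{\;a.s.\;}\; 0 .
```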

A.1.1. Proof of Lemma 4
Let us define the analogues of ω_{m,n} based on F̂_m and Ĝ_n as in the display below. Recalling the definition of ϑ_{m,n} from (A.13), we obtain the corresponding decomposition.
Since F̂_m, F_m, Ĝ_n, and G_n all take values between 0 and 1, the displayed bound follows. Therefore, another application of part (B) of Lemma A.5, combined with the fact that m/N → λ, entails that the corresponding term converges almost surely to 0. Using part (B) of Lemma A.5 and m/N → λ in the second step, we also deduce a bound that converges to zero almost surely. It remains to prove that |T^wrs_{m,n}(F̂_m, Ĝ_n) − T^wrs_{m,n}(F_m, G_n)| converges to 0 almost surely. To this end, we first note that ξ(F, G) = ∫_R F dG is Hadamard differentiable with respect to the norm ‖·‖∞ at every pair of distribution functions (F, G) (see Section 5, pages 362–371, Lehmann, 1975), where the derivative at (F, G), in the direction (µ_X, µ_Y), is given by the displayed formula; here µ_X : R → R and µ_Y : R → R are bounded continuous functions. Observe that we can write T^wrs_{m,n} in terms of ξ, up to normalization. Part (C) of Lemma A.5 and the fact that m/N → λ imply the required convergence as m, n → ∞. Therefore, the Hadamard differentiability of ξ yields the corresponding expansion. Similarly, using (A.1), (A.2), and m/N → λ, one can show the analogous statement for the remaining term. The proof for T^wrs_{m,n}(F̂_m, Ĝ_n) then follows upon noting that the resulting difference converges to 0 almost surely.

A.1.2. Proof of Lemma 5
We can find p′ < p such that f and g are positive on D_{p′}. Theorem 4.4 of Dümbgen and Rufibach (2009), Condition B1, and Condition B2 imply a uniform bound of the stated form on D_{p′}. Recall the set D_{p,m,n} defined in (4.1). We note that, for sufficiently large N, D_{p,m,n} ⊂ D_{p′} with probability one, which yields (A.46); the latter is o_p(N^{-1/2}) since m/N → λ. This result is similar to Lemma A.5(B) for unimodal densities, which is critical to proving the asymptotic equivalence between T^min_{m,n}(F̂_m, Ĝ_n) and T^min_{m,n}(F_m, G_n), and between T^tsep_{m,n}(F̂_m, Ĝ_n) and T^tsep_{m,n}(F_m, G_n), in Lemma 4. As a consequence, the rest of the proof is nearly identical to the proof of Lemma 4; hence, we only highlight its main steps.
Let us define θ̃_{m,n} analogously. Recalling the definition of ϑ_{m,n} from (A.13), and proceeding as in the proof of Lemma 4, we can prove a log-concave analogue of (A.44). Note that

(ϑ_{m,n}(x)^{1/2} − θ̃_{m,n}(x)^{1/2}) / (ϑ_{m,n}(x)^{1/2} θ̃_{m,n}(x)^{1/2}) = (ϑ_{m,n}(x) − θ̃_{m,n}(x)) / (ϑ_{m,n}(x)^{1/2} θ̃_{m,n}(x)^{1/2} (ϑ_{m,n}(x)^{1/2} + θ̃_{m,n}(x)^{1/2})).

Since ‖ϑ_{m,n}^{-1/2}‖_{D_{p,m,n}} is bounded almost surely, the above implies that ‖θ̃_{m,n}^{-1/2}‖_{D_{p,m,n}} is also bounded almost surely. Therefore, another application of (A.46) yields the desired conclusion.

Appendix B: Proofs for Section 5

Throughout, F₁ and G₁ represent the cumulative distribution functions corresponding to f₁ and g₁, and ψ_f and ψ_g, defined in (B.2), represent the influence functions of T (respectively F- and G-almost surely unique) under the nonparametric model (Van der Vaart, 1998, p. 292). When T(f, g) equals the squared Hellinger distance D²(f, g), the influence functions take an explicit form. We have already mentioned in Section 5 that the von Mises expansion (VME) plays a critical role in the proofs of this section. We define the first order VME of T along the same lines as Kandasamy et al. (2015). Suppose that T is Gâteaux differentiable and that the corresponding influence functions ψ_f and ψ_g (see (B.2)) exist. Then we say that T : P₁² → R has a first order VME if it satisfies the following for all f₁, f₂, g₁, g₂ ∈ P₁:

T(f₂, g₂) = T(f₁, g₁) + ∫_R ψ_f(x; f₁, g₁) f₂(x) dx + ∫_R ψ_g(x; f₁, g₁) g₂(x) dx + O(‖f₁ − f₂‖₂²) + O(‖g₁ − g₂‖₂²).   (B.6)
The first order VME implies that T can be written as a linear term plus a second-order bias term; that is, T is sufficiently smooth. Kandasamy et al. (2015) give examples of many functionals T that admit a first order VME.
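For the squared Hellinger distance in particular, the influence functions can be written in closed form. The following is a sketch under the convention D²(f, g) = 1 − ∫√(f g) (the convention consistent with the normal-location benchmark values quoted earlier): differentiating ε ↦ T(f + ε(f₂ − f), g) at ε = 0 and centring so that ∫ψ_f(x; f, g) f(x) dx = 0 (and symmetrically in g) gives

```latex
% Influence functions of T(f, g) = D^2(f, g) = 1 - \int \sqrt{f g}; both are
% bounded on the common support whenever f, g \in P(b, B).
\psi_f(x; f, g) = \tfrac{1}{2}\Bigl( \int_{\mathbb{R}} \sqrt{f(u)\, g(u)}\, du \;-\; \sqrt{g(x)/f(x)} \Bigr),
\qquad
\psi_g(x; f, g) = \tfrac{1}{2}\Bigl( \int_{\mathbb{R}} \sqrt{f(u)\, g(u)}\, du \;-\; \sqrt{f(x)/g(x)} \Bigr).
```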
Let f_m and g_n be estimators of f and g based on samples of size m and n, respectively. We denote the corresponding distribution functions by F_m and G_n. We aim to show that, under some regularity conditions, the plug-in estimator T(f_m, g_n) is √N-consistent for estimating T(f, g). The first condition we require is related to the weak convergence of the processes √m(F_m − F) and √n(G_n − G) to Brownian processes.
Condition C1. The distribution functions F_m and G_n corresponding to the density estimators f_m and g_n satisfy √m(F_m − F) →_d V₁(F) and √n(G_n − G) →_d V₂(G), where V₁ and V₂ are Brownian bridges.
The second condition involves the order of the L₂ error in estimating f and g. In particular, we require ‖f_m − f‖₂² and ‖g_n − g‖₂² to be of order o_p(m^{-1/2}) and o_p(n^{-1/2}), respectively. If the model is correctly specified and f is bounded, many density estimators f_m are also bounded with high probability, leading to O_p(‖f_m − f‖₂²) = D²(f_m, f) O_p(1). Note that if f_m also satisfies D²(f_m, f) = o_p(m^{-1/2}), Condition C2 follows.
Our next condition requires the influence functions ψ_f(·; f, g) and ψ_g(·; f, g) to be of bounded total variation on R. We say a function µ : R → R is of bounded total variation on R if there exists a generalized derivative (in the sense of distributions) µ′ of µ (cf. Section 3.2 of Ambrosio et al., 2000) such that ∫_R |µ′(x)| dx < ∞. If µ is of bounded total variation on R, then µ is also of bounded variation on R.
Condition I. The maps x → ψ f (x; f, g) and x → ψ g (x; f, g) are of bounded total variation.
We are now ready to state the main theorem of this section, which gives the asymptotic distribution of T (f m , g n ) under the above-stated conditions. Later we will show that when T (f, g) = D 2 (f, g), the conditions are satisfied. Thus Theorem 3 will follow as a corollary to Theorem 4. Related literature (cf. Kandasamy et al., 2015) implies that the asymptotic variance of T (f m , g n ) as given by Theorem 4 agrees with the asymptotic lower bound for this case under the nonparametric model.
Theorem 4. Suppose P₁ ⊂ P. Let T : P₁² → R be a functional of the form (B.1) satisfying the first order VME in (B.6). Consider f, g ∈ P₁. We assume that the influence functions ψ_f and ψ_g defined in (B.2) satisfy Condition I. Let f_m and g_n be estimators of f and g based on two samples of sizes m and n, respectively, where m and n satisfy m/N → λ, and let N = m + n. Further suppose f_m, g_n ∈ P₁ satisfy Conditions C1 and C2. Then we have √N(T(f_m, g_n) − T(f, g)) →_d N(0, σ²_{f,g}), where σ²_{f,g} = λ^{-1} ∫_R ψ_f(x; f, g)² f(x) dx + (1 − λ)^{-1} ∫_R ψ_g(x; f, g)² g(x) dx.
Proof. Since T satisfies the first order VME, (B.6) indicates that T(f_m, g_n) − T(f, g) = ∫_R ψ_f(x; f, g) f_m(x) dx + ∫_R ψ_g(x; f, g) g_n(x) dx + o_p(N^{-1/2}), where the last step follows from Condition C2. Denote by F, F_m, G, and G_n the distribution functions corresponding to f, f_m, g, and g_n, respectively. Since ψ_f(x; f, g) is an influence function with respect to f, it satisfies ∫_R ψ_f(x; f, g) f(x) dx = 0.
Hence we can write ∫_R ψ_f(x; f, g) f_m(x) dx = ∫_R ψ_f(x; f, g) d(F_m − F)(x). Now note that ψ_f(·; f, g) is of bounded total variation on R by Condition I. Therefore, integration by parts applies. The Riemann–Stieltjes integral in the second term on the right-hand side of the resulting display exists because ψ_f is of bounded total variation and F_m − F is continuous. Since ψ_f(·; f, g) is of bounded total variation, it is also bounded, leading to lim_{x→±∞} ψ_f(x; f, g)(F_m(x) − F(x)) = 0.
Therefore, we deduce that ∫_R ψ_f(x; f, g) f_m(x) dx = −∫_R (F_m(x) − F(x)) dψ_f(x; f, g). Similarly, we can show that ∫_R ψ_g(x; f, g) g_n(x) dx = −∫_R (G_n(x) − G(x)) dψ_g(x; f, g).
Since F_m and G_n satisfy Condition C1, it follows that (√m(F_m − F), √n(G_n − G)) →_d (V₁(F), V₂(G)), where V₁ and V₂ are independent standard Brownian bridges. Here the underlying metric space corresponding to the weak convergence is (l∞, ‖·‖∞) × (l∞, ‖·‖∞), where l∞ was defined to be the set of all bounded functions on R. Since m/N → λ, Slutsky's theorem yields (√N(F_m − F), √N(G_n − G)) →_d (λ^{-1/2} V₁(F), (1 − λ)^{-1/2} V₂(G)).
Since Condition I holds, it follows that, for µ₁, µ₂ ∈ l∞, the map (µ₁, µ₂) → ∫_R µ₁(x) dψ_f(x; f, g) + ∫_R µ₂(x) dψ_g(x; f, g) is continuous with respect to the uniform metric ‖·‖∞. Therefore, invoking the continuous mapping theorem, we obtain that ∫_R √N(F_m(x) − F(x)) dψ_f(x; f, g) + ∫_R √N(G_n(x) − G(x)) dψ_g(x; f, g) converges in distribution to λ^{-1/2} ∫_R V₁(F(x)) dψ_f(x; f, g) + (1 − λ)^{-1/2} ∫_R V₂(G(x)) dψ_g(x; f, g). Now, for any continuous distribution function F, any Brownian bridge V, and any function µ with finite total variation, the random variable ∫_R V(F(x)) dµ(x) is Gaussian with mean zero; this follows from the proof of Theorem 2.3 of Mukherjee et al. (2019).
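The way this Gaussian limit produces the variance terms in σ²_{f,g} can be seen via a heuristic empirical-process calculation (a sketch, replacing V₁(F) by its pre-limit √m(𝔽_m − F), with 𝔽_m the empirical distribution function, and using ∫ψ_f(x; f, g) f(x) dx = 0):

```latex
% Heuristic variance identification via integration by parts and the CLT.
\int_{\mathbb{R}} \sqrt{m}\,\bigl(\mathbb{F}_m(x) - F(x)\bigr)\, d\psi_f(x; f, g)
  \;=\; -\,\frac{1}{\sqrt{m}} \sum_{i=1}^{m} \Bigl( \psi_f(X_i; f, g) - \mathbb{E}\,\psi_f(X_1; f, g) \Bigr)
  \;\xrightarrow{\;d\;}\; N\!\Bigl( 0,\; \int_{\mathbb{R}} \psi_f(x; f, g)^2\, f(x)\, dx \Bigr).
```

Combining this with the analogous term in g and the weights λ^{-1/2} and (1 − λ)^{-1/2} recovers the variance σ²_{f,g}.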
Therefore, √N(T(f_m, g_n) − T(f, g)) converges in distribution to the limit identified above, which is a Gaussian random variable with variance σ²_{f,g}, thus completing the proof.

Now we focus on the special case at hand, i.e., T(f, g) = D²(f, g). Towards this end, our first task is to show the existence of the first order VME. We take P₁ to be P(b, B), where P(b, B) is as defined in (5.2).

Proof of Lemma 6
Theorem 4 of Cule and Samworth implies that f̂_m converges uniformly to f almost surely provided (i) f has a finite first moment, (ii) the support of f has nonempty interior, and (iii) ∫ max{log f(x), 0} f(x) dx < ∞. For log-concave f, (i) follows from Lemma 1 of Cule and Samworth, (ii) follows from the continuity of f, and (iii) follows because f is bounded (cf. Lemma 5 of Cule and Samworth). The same result holds for ĝ_n as well. Because uniform convergence implies pointwise convergence, the above implies that f̂_m ĝ_n converges pointwise to fg almost surely. Therefore, an application of Scheffé's theorem (cf. Theorem 16.12, Billingsley, 2013) yields that, as m, n → ∞, the desired convergence D²(f̂_m, ĝ_n) →_a.s. D²(f, g) holds. For the smoothed log-concave MLE, the uniform convergence of f̂_m^sm to f follows from Theorem 1 of Chen and Samworth provided f has a finite second moment, which holds trivially because all moments of a log-concave density are finite (Lemma 1 of Cule and Samworth). The rest of the proof then follows from Scheffé's theorem as in the case of the log-concave MLE.

To summarize, the conservative TSEP tests might have a slight advantage over their ordinary counterparts in terms of type I error in some boundary cases, but this advantage comes at the cost of a drastic loss of power. In view of the above, we do not recommend the conservative TSEP tests for implementation.

Table 6: Summary of the HVTN 097 and HVTN 100 trials. By positive respondents, we refer to vaccinees who developed an immune response to at least one of the seven clade C V1V2 antigens under consideration.

Figure 19 (Appendix D): Plot of the log-concave density estimators based on a sample of size 1000 from the standard exponential distribution. Observe that the smoothed log-concave MLE does not approximate the true density well near the origin, which is also the point of discontinuity.