Statistical Science

Rank Tests from Partially Ordered Data Using Importance and MCMC Sampling Methods

Debashis Mondal and Nina Hinrichs

Full-text: Open access


We discuss distribution-free exact rank tests from partially ordered data that arise in various biological and other applications where the primary objective is to test significance, either to assess linear dependence or to compare different groups. The tests are obtained by treating the usual rank statistics based on the completely ordered data as “latent” or missing, and by conceptualizing the “latent” $p$-value as the random probability, under the null hypothesis, of a test statistic that is as extreme as, or more extreme than, the latent test statistic based on the completely ordered data. The latent $p$-value is then predicted by sampling linear extensions, that is, the complete orderings that are consistent with the observed partially ordered data. The sampling methods explored here include importance sampling methods based on randomized topological sorting algorithms, Gibbs sampling methods, random-walk based Metropolis–Hastings sampling methods and random-walk based modern perfect Markov chain Monte Carlo sampling methods. We discuss the running times of these sampling methods and their strengths and weaknesses. A simulation experiment and three data examples are given. The simulation experiment illustrates how the exact rank tests from partially ordered data work when the desired result is known. The first data example concerns the light-preference behavior of fruit flies and tests whether heterogeneity observed in average light-preference behavior can be explained by manipulations in serotonin signaling. The second is a reanalysis of the lead absorption data in children of employees who worked in a lead battery factory and consolidates the results reported in Rosenbaum [Ann. Statist. 19 (1991) 1091–1097]. The third reexamines the breast cosmesis data from Finkelstein [Biometrics 42 (1986) 845–854].
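
As a minimal sketch of the importance sampling idea mentioned in the abstract, the following Python code draws a linear extension by a randomized topological sort and records an importance weight. It is an illustration under stated assumptions, not the authors' implementation: the function names `random_linear_extension` and `importance_estimate`, the encoding of the partial order as pairs (a, b) meaning "a precedes b", and the uniform choice among the currently minimal elements are all assumptions made here. Since a draw is produced with probability equal to the reciprocal of the product of the numbers of available choices, that product can serve as the weight that corrects toward the uniform distribution on linear extensions.

```python
import random


def random_linear_extension(n, relations, rng):
    """Randomized Kahn-style topological sort (hypothetical helper).

    relations: iterable of (a, b) pairs meaning element a must precede b.
    Returns (order, weight), where weight is the product of the number of
    available minimal elements at each step, i.e. 1 / proposal probability.
    """
    succ = {i: [] for i in range(n)}
    indeg = [0] * n
    for a, b in relations:
        succ[a].append(b)
        indeg[b] += 1
    available = [i for i in range(n) if indeg[i] == 0]
    order, weight = [], 1
    while available:
        weight *= len(available)
        # Pick one of the currently minimal elements uniformly at random.
        k = rng.randrange(len(available))
        available[k], available[-1] = available[-1], available[k]
        v = available.pop()
        order.append(v)
        for w in succ[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                available.append(w)
    if len(order) != n:
        raise ValueError("relations contain a cycle")
    return order, weight


def importance_estimate(n, relations, stat, draws=2000, seed=0):
    """Self-normalized importance sampling estimate of the mean of
    stat(order) under the uniform distribution on linear extensions."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(draws):
        order, w = random_linear_extension(n, relations, rng)
        num += w * stat(order)
        den += w
    return num / den
```

For the three-element partial order with the single relation "0 precedes 1" there are three linear extensions, so `importance_estimate(3, [(0, 1)], lambda o: float(o[0] == 2))` should come out near 1/3, whereas the unweighted frequency of element 2 coming first would be near 1/2. In the testing setting, `stat` would presumably be replaced by the null $p$-value of the rank statistic computed from each sampled complete ordering.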

Article information

Statist. Sci., Volume 31, Number 3 (2016), 325-347.

First available in Project Euclid: 27 September 2016

Keywords: Exact tests; fuzzy $p$-values; Gibbs sampling; interval censoring; linear extensions; linear rank statistics; perfect MCMC; proportional hazard model; topological sorting


Mondal, Debashis; Hinrichs, Nina. Rank Tests from Partially Ordered Data Using Importance and MCMC Sampling Methods. Statist. Sci. 31 (2016), no. 3, 325--347. doi:10.1214/16-STS549.


  • Aho, A. V., Garey, M. R. and Ullman, J. D. (1972). The transitive reduction of a directed graph. SIAM J. Comput. 1 131–137.
  • Baik, J., Deift, P. and Johansson, K. (1999). On the distribution of the length of the longest increasing subsequence of random permutations. J. Amer. Math. Soc. 12 1119–1178.
  • Bayarri, M. J. and Berger, J. O. (2000). $p$ values for composite null models. J. Amer. Statist. Assoc. 95 1127–1142.
  • Bayarri, M. J. and Berger, J. O. (2004). The interplay of Bayesian and frequentist analysis. Statist. Sci. 19 58–80.
  • Besag, J. (2004). Markov Chain Monte Carlo Methods for Statistical Inference. Dept. Statistics, Univ. Washington, Seattle.
  • Besag, J. and Clifford, P. (1989). Generalized Monte Carlo significance tests. Biometrika 76 633–642.
  • Besag, J. and Mondal, D. (2013). Exact goodness-of-fit tests for Markov chains. Biometrics 69 488–496.
  • Besag, J., Green, P., Higdon, D. and Mengersen, K. (1995). Bayesian computation and stochastic systems. Statist. Sci. 10 3–41.
  • Brightwell, G. and Winkler, P. (1991). Counting linear extensions. Order 8 225–242.
  • Bubley, R. and Dyer, M. (1999). Faster random generation of linear extensions. Discrete Math. 201 81–88.
  • Cormen, T. H., Leiserson, C. E., Rivest, R. L. and Stein, C. (2001). Introduction to Algorithms, 2nd ed. MIT Press, Cambridge, MA.
  • Cox, D. R. (1972). Regression models and life-tables. J. R. Statist. Soc. Ser. B 34 187–220.
  • Cox, D. R. and Hinkley, D. V. (1979). Theoretical Statistics. Chapman & Hall, London.
  • Crowley, J. (1974). A note on some recent likelihoods leading to the log rank test. Biometrika 61 533–538.
  • Fay, M. P. and Shaw, P. A. (2010). Exact and asymptotic weighted log-rank tests for interval censored data: The interval R package. J. Stat. Software 36 1–34.
  • Ferrenberg, A. M., Landau, D. P. and Swendsen, R. H. (1995). Statistical errors in histogram reweighting. Phys. Rev. E 51 5092–5100.
  • Finkelstein, D. M. (1986). A proportional hazards model for interval-censored failure time data. Biometrics 42 845–854.
  • Gelman, A., Meng, X.-L. and Stern, H. (1996). Posterior predictive assessment of model fitness via realized discrepancies. Statist. Sinica 6 733–807.
  • Geman, S. and Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Trans. Pattern Anal. Mach. Intell. 6 721–741.
  • Geyer, C. J. and Meeden, G. D. (2005). Fuzzy and randomized confidence intervals and $P$-values. Statist. Sci. 20 358–366.
  • Goggins, W. B. and Finkelstein, D. M. (2000). A proportional hazards model for multivariate interval-censored failure time data. Biometrics 56 940–943.
  • Goggins, W. B., Finkelstein, D. M., Schoenfeld, D. A. and Zaslavsky, A. M. (1998). A Markov chain Monte Carlo EM algorithm for analyzing interval-censored data under the Cox proportional hazards model. Biometrics 54 1498–1507.
  • Gordon, A. D. (1979a). A measure of the agreement between rankings. Biometrika 66 7–15.
  • Gordon, A. D. (1979b). Another measure of the agreement between rankings. Biometrika 66 327–332.
  • Hájek, J. (1968). Asymptotic normality of simple linear rank statistics under alternatives. Ann. Math. Stat. 39 325–346.
  • Hájek, J., Šidák, Z. and Sen, P. K. (1999). Theory of Rank Tests, 2nd ed. Academic Press, San Diego, CA.
  • Huber, M. (2004). Perfect sampling using bounding chains. Ann. Appl. Probab. 14 734–753.
  • Huber, M. (2006). Fast perfect sampling from linear extensions. Discrete Math. 306 420–428.
  • Hunt, J. W. and Szymanski, T. G. (1977). A fast algorithm for computing longest common subsequences. Commun. ACM 20 350–353.
  • Kahn, A. B. (1962). Topological sorting of large networks. Commun. ACM 5 558–562.
  • Kain, J. S., Stokes, C. and de Bivort, B. L. (2012). Phototactic personality in fruit flies and its suppression by serotonin and white. Proc. Natl. Acad. Sci. USA 109 19834–19839.
  • Kalbfleisch, J. D. and Prentice, R. L. (1980). The Statistical Analysis of Failure Time Data. Wiley, New York.
  • Karzanov, A. and Khachiyan, L. (1991). On the conductance of order Markov chains. Order 8 7–15.
  • Kruskal, W. H. and Wallis, W. A. (1952). Use of ranks in one-criterion variance analysis. J. Amer. Statist. Assoc. 47 583–621.
  • Lerche, D., Brüggemann, R., Sørensen, P., Carlsen, L. and Nielsen, O. J. (2002). A comparison of partial order technique with three methods of multi-criteria analysis for ranking of chemical substances. J. Chem. Inf. Comput. Sci. 42 1086–1098.
  • Liu, J. S. (2008). Monte Carlo Strategies in Scientific Computing. Springer, New York.
  • Matthews, P. (1991). Generating a random linear extension of a partial order. Ann. Probab. 19 1367–1392.
  • Meng, X.-L. (1994). Posterior predictive $p$-values. Ann. Statist. 22 1142–1160.
  • Mondal, D. and Hinrichs, N. (2016). Supplement to “Rank tests from partially ordered data using importance and MCMC sampling methods.” DOI:10.1214/16-STS549SUPP.
  • Morton, D. E., Saah, A. J., Silberg, S. L., Owens, W. L., Roberts, M. A. and Saah, M. D. (1982). Lead absorption in children of employees in a lead-related industry. Am. J. Epidemiol. 115 549–555.
  • Page, E. B. (1963). Ordered hypotheses for multiple treatments: A significance test for linear ranks. J. Amer. Statist. Assoc. 58 216–230.
  • Patil, G. P. and Taillie, C. (2004). Multiple indicators, partially ordered sets, and linear extensions: Multi-criterion ranking and prioritization. Environ. Ecol. Stat. 11 199–228.
  • Prentice, R. L. (1978). Linear rank tests with right censored data. Biometrika 65 167–179.
  • Propp, J. G. and Wilson, D. B. (1996). Exact sampling with coupled Markov chains and applications to statistical mechanics. Random Structures Algorithms 9 223–252.
  • Puri, M. L. and Sen, P. K. (1971). Nonparametric Methods in Multivariate Analysis. Wiley, New York.
  • Riggle, J. (2009). The complexity of ranking hypotheses in optimality theory. Comput. Linguist. 35 47–59.
  • Rosenbaum, P. R. (1991). Some poset statistics. Ann. Statist. 19 1091–1097.
  • Rosenbaum, P. R. (2002). Observational Studies, 2nd ed. Springer, New York.
  • Rubin, D. B. (1987). A noniterative sampling/importance resampling alternative to the data augmentation algorithm for creating a few imputations when fractions of missing information are modest: The SIR algorithm. J. Amer. Statist. Assoc. 82 543–546.
  • Satten, G. A. (1996). Rank-based inference in the proportional hazards model for interval censored data. Biometrika 83 355–370.
  • Schensted, C. (1961). Longest increasing and decreasing subsequences. Canad. J. Math. 13 179–191.
  • Self, S. G. and Grossman, E. A. (1986). Linear rank tests for interval-censored data with application to PCB levels in adipose tissue of transformer repair workers. Biometrics 42 521–530.
  • Skare, Ø., Bølviken, E. and Holden, L. (2003). Improved sampling-importance resampling and reduced bias importance sampling. Scand. J. Stat. 30 719–737.
  • Smith, A. F. M. and Gelfand, A. E. (1992). Bayesian statistics without tears: A sampling-resampling perspective. Amer. Statist. 46 84–88.
  • Tarjan, R. E. (1976). Edge-disjoint spanning trees and depth-first search. Acta Inform. 6 171–185.
  • Thompson, E. A. and Geyer, C. J. (2007). Fuzzy $p$-values in latent variable problems. Biometrika 94 49–60.
  • Vandal, A. C., Conder, M. D. E. and Gentleman, R. (2009). Minimal covers of maximal cliques for interval graphs. Ars Combin. 92 97–129.
  • Vandal, A. C. and Gentleman, R. (1998). Weak order partitioning of interval orders with applications to interval censored data. Technical Report STAT9702, Dept. Statistics, Univ. Auckland.
  • Wilson, D. B. (2004). Mixing times of lozenge tiling and card shuffling Markov chains. Ann. Appl. Probab. 14 274–325.
  • Zar, J. H. (1972). Significance testing of the Spearman rank correlation coefficient. J. Amer. Statist. Assoc. 67 578–580.
