## Electronic Journal of Statistics

### Nonparametric density estimation using partially rank-ordered set samples with application in estimating the distribution of wheat yield

#### Abstract

We study nonparametric estimation of an unknown density function $f$ based on rank-based observations obtained from a partially rank-ordered set (PROS) sampling design. The PROS sampling design has many applications in environmental, ecological and medical studies, where exact measurement of the variable of interest is costly but a small number of sampling units can be ordered with respect to that variable at low cost by means other than actual measurement. PROS observations involve independent order statistics that are not identically distributed, so most commonly used nonparametric techniques are not directly applicable to them. We first develop a kernel density estimator of $f$ based on an imperfect PROS sampling procedure and study its theoretical properties. We then consider the case where the underlying distribution is assumed to be symmetric and introduce some plug-in kernel density estimators of $f$. We use an EM-type algorithm to estimate the misplacement probabilities associated with an imperfect PROS design. Finally, we illustrate our results numerically through several simulation studies and a case study that estimates the distribution of wheat yield, using the total acreage of land planted in wheat as easily obtained auxiliary information. Our results show that the PROS density estimate performs better than its simple random sampling (SRS) and ranked set sampling (RSS) counterparts.
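As a rough illustration of the sampling scheme summarized above, the sketch below simulates a balanced PROS sample and applies an ordinary equal-weight Gaussian kernel estimate to it. It is not the estimator developed in the paper: it assumes perfect subsetting (no misplacement), ranks units by their true values, and uses an arbitrary fixed bandwidth. The names `pros_sample` and `kernel_density`, and all parameter values, are hypothetical choices made for this example.

```python
import math
import random


def pros_sample(draw, set_size, n_subsets, n_cycles, rng):
    """Simulate a balanced PROS sample under perfect subsetting (assumption).

    For each cycle and each subset index d, a fresh set of `set_size` units
    is drawn and partitioned by rank into `n_subsets` equal-sized subsets;
    one unit is then picked at random from subset d and "measured".
    """
    if set_size % n_subsets != 0:
        raise ValueError("set_size must be divisible by n_subsets")
    m = set_size // n_subsets  # number of units in each subset
    sample = []
    for _ in range(n_cycles):
        for d in range(n_subsets):
            # Sorting by the true values stands in for error-free judgment ranking.
            units = sorted(draw(rng) for _ in range(set_size))
            subset = units[d * m:(d + 1) * m]
            sample.append(rng.choice(subset))
    return sample


def kernel_density(data, h):
    """Return an equal-weight Gaussian kernel density estimate with bandwidth h."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)

    def f_hat(x):
        return sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in data) / norm

    return f_hat


rng = random.Random(0)  # fixed seed so the simulation is reproducible
sample = pros_sample(lambda r: r.gauss(0.0, 1.0),
                     set_size=6, n_subsets=3, n_cycles=50, rng=rng)
f_hat = kernel_density(sample, h=0.4)  # bandwidth chosen arbitrarily here
```

Equal weights are a natural starting point in this balanced, perfect-subsetting setting because averaging the subset densities over all subsets recovers $f$. Handling misplacement between subsets, as the abstract describes, would require additional machinery that this sketch does not attempt.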

#### Article information

Source
Electron. J. Statist., Volume 8, Number 1 (2014), 738-761.

Dates
First available in Project Euclid: 21 May 2014

Permanent link to this document
https://projecteuclid.org/euclid.ejs/1400703412

Digital Object Identifier
doi:10.1214/14-EJS902

Mathematical Reviews number (MathSciNet)
MR3211030

Zentralblatt MATH identifier
1348.62127

#### Citation

Nazari, Sahar; Jafari Jozani, Mohammad; Kharrati-Kopaei, Mahmood. Nonparametric density estimation using partially rank-ordered set samples with application in estimating the distribution of wheat yield. Electron. J. Statist. 8 (2014), no. 1, 738–761. doi:10.1214/14-EJS902. https://projecteuclid.org/euclid.ejs/1400703412
