Open Access
An Optimal Variable Cell Histogram Based on the Sample Spacings
Yuichiro Kanazawa
Ann. Statist. 20(1): 291-304 (March, 1992). DOI: 10.1214/aos/1176348523

Abstract

Suppose we wish to construct a variable $k$-cell histogram based on an independent identically distributed sample of size $n - 1$ from an unknown density $f$ on an interval of finite length. A variable cell histogram requires the cutpoints and heights of all of its cells to be specified. We propose the following procedure: (i) choose from the order statistics corresponding to the sample a set of $k + 1$ cutpoints that maximize a criterion, a function of the sample spacings; (ii) compute the heights of the $k$ cells according to a formula. The resulting histogram estimates a $k$-cell theoretical histogram that stays constant within a cell and that minimizes the Hellinger distance to the density $f$. The histogram tends to estimate low density regions accurately and is easy to compute. We find that a number of cells of order $n^{1/3}$ minimizes the mean Hellinger distance between the density $f$ and a class of histograms whose cutpoints are chosen from the order statistics.
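
To make the target quantity concrete, the following is a minimal sketch of how the squared Hellinger distance between a density and a piecewise-constant histogram could be approximated numerically. It is not the paper's procedure: the abstract does not state the cutpoint criterion or the height formula, so neither is reproduced here. The function names `histogram_value` and `squared_hellinger`, the midpoint-rule integration, the grid size `m`, and the convention $H^2(f, h) = \tfrac{1}{2}\int (\sqrt{f} - \sqrt{h})^2$ (some authors omit the factor $\tfrac{1}{2}$) are all assumptions made for illustration.

```python
import math
from bisect import bisect_right


def histogram_value(x, cutpoints, heights):
    """Height of a piecewise-constant histogram at x (0 outside the cutpoints).

    Hypothetical helper: `cutpoints` is an increasing list of k + 1 values,
    `heights` is the list of k cell heights.
    """
    if x < cutpoints[0] or x > cutpoints[-1]:
        return 0.0
    # Index of the cell containing x; the right endpoint goes to the last cell.
    i = min(bisect_right(cutpoints, x) - 1, len(heights) - 1)
    return heights[i]


def squared_hellinger(f, cutpoints, heights, a, b, m=20000):
    """Midpoint-rule approximation of 0.5 * integral_a^b (sqrt(f) - sqrt(h))^2 dx.

    Assumed convention: the factor 0.5 keeps the value in [0, 1].
    """
    dx = (b - a) / m
    total = 0.0
    for j in range(m):
        x = a + (j + 0.5) * dx
        diff = math.sqrt(f(x)) - math.sqrt(histogram_value(x, cutpoints, heights))
        total += diff * diff * dx
    return 0.5 * total


# Illustrative density f(x) = 2x on [0, 1] (an assumption, not from the paper).
def f(x):
    return 2.0 * x


one_cell = squared_hellinger(f, [0.0, 1.0], [1.0], 0.0, 1.0)
two_cell = squared_hellinger(f, [0.0, 0.5, 1.0], [0.5, 1.5], 0.0, 1.0)
```

In this toy case the one-cell value can be checked by hand, since $\tfrac{1}{2}\int_0^1 (\sqrt{2x} - 1)^2\,dx = 1 - 2\sqrt{2}/3$. The two-cell heights used above simply match the cell probabilities of $f$ divided by the cell widths; they are not claimed to be the Hellinger-optimal heights.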

Citation


Yuichiro Kanazawa. "An Optimal Variable Cell Histogram Based on the Sample Spacings." Ann. Statist. 20 (1) 291 - 304, March, 1992. https://doi.org/10.1214/aos/1176348523

Information

Published: March, 1992
First available in Project Euclid: 12 April 2007

zbMATH: 0745.62034
MathSciNet: MR1150345
Digital Object Identifier: 10.1214/aos/1176348523

Subjects:
Primary: 62G05
Secondary: 62E20

Keywords: Density estimation, Hellinger distance, Histogram, order statistics, spacing

Rights: Copyright © 1992 Institute of Mathematical Statistics
