Explicit bounds for the approximation error in Benford's law

Benford's law states that for many random variables X > 0, the leading digit D = D(X) approximately satisfies P(D = d) = log_{10}(1 + 1/d) for d = 1, 2, ..., 9. This phenomenon follows from another, perhaps more intuitive fact, applied to Y := log_{10}(X): For many real random variables Y, the remainder U := Y - floor(Y) is approximately uniformly distributed on [0,1). The present paper provides new explicit bounds for the latter approximation in terms of the total variation of the density of Y or some derivative of it. These bounds are an interesting alternative to traditional Fourier methods, which yield mostly qualitative results. As a by-product we obtain explicit bounds for the approximation error in Benford's law.


Introduction
The First Digit Law is the empirical observation that in many tables of numerical data the leading significant digits are not uniformly distributed as one might suspect at first. The following law was first postulated by Simon Newcomb (1881): Prob(leading digit = d) = log_{10}(1 + 1/d) for d = 1, ..., 9. Since the rediscovery of this distribution by physicist Frank Benford (1938), an abundance of additional empirical evidence and various extensions have appeared, see Raimi (1976) and Hill (1995) for a review. Examples of "Benford's law" are one-day returns on stock market indices, the population sizes of U.S. counties, or stream flow data. An interesting application of this law is the detection of accounting fraud (see Nigrini, 1996). Numerous number sequences (e.g. Fibonacci's sequence) are known to follow Benford's law exactly, see Diaconis (1977), Knuth (1969) and Jolissaint (2005).
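The first-digit probabilities and the Fibonacci example can be checked numerically. The following Python sketch is illustrative only: the function names `benford_probs` and `fibonacci_leading_digit_freqs` and the choice of 2000 terms are assumptions made here, not part of the cited works.

```python
import math
from collections import Counter

def benford_probs(base=10):
    """Benford probabilities log_base(1 + 1/d) for d = 1, ..., base - 1."""
    return {d: math.log(1 + 1 / d, base) for d in range(1, base)}

def fibonacci_leading_digit_freqs(n_terms=2000):
    """Relative frequencies of the leading decimal digit among F_1, ..., F_n."""
    counts = Counter()
    a, b = 1, 1
    for _ in range(n_terms):
        counts[int(str(a)[0])] += 1  # leading decimal digit of the current term
        a, b = b, a + b
    return {d: counts[d] / n_terms for d in range(1, 10)}

if __name__ == "__main__":
    probs = benford_probs()
    freqs = fibonacci_leading_digit_freqs()
    for d in range(1, 10):
        print(d, round(probs[d], 4), round(freqs[d], 4))
```

The two columns printed by the script can be compared digit by digit; the frequencies are empirical and depend on the number of terms used.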
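An elegant way to explain and extend Benford's law is to consider a random variable X > 0 and its expansion with integer base b ≥ 2. That means, X = M · b^Z for some integer Z and some number M ∈ [1, b), the mantissa of X. The leading digits of X are determined by M: they are equal to d_0, d_1, ..., d_ℓ, with d_0 ≥ 1, if and only if
d ≤ M < d + b^{−ℓ} with d := Σ_{i=0}^{ℓ} d_i b^{−i}. (1)
In terms of Y := log_b(X) and U := Y − ⌊Y⌋ = log_b(M), one may express the probability of (1) as
P(log_b(d) ≤ U < log_b(d + b^{−ℓ})). (2)
If the distribution of Y is sufficiently "diffuse", one would expect the distribution of U to be approximately uniform on [0, 1), so that (2) is approximately equal to
log_b(d + b^{−ℓ}) − log_b(d) = log_b(1 + b^{−ℓ}/d).
Hill (
This particular and similar results are typically derived via Fourier methods; see, for instance, Pinkham (1961) or Kontorovich and Miller (2005).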
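To illustrate how the remainder U drives the leading-digit distribution, the following sketch computes the exact first-digit probabilities in base 10 when Y = log_{10}(X) is normally distributed. It assumes that the first digit equals d exactly when log_{10}(d) ≤ U < log_{10}(d + 1); the function names and the truncation parameter `z_max` are hypothetical choices made for this example.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """C.d.f. of the normal distribution N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def leading_digit_probs(mu=0.0, sigma=1.0, z_max=40):
    """P(first digit of X = d) for X = 10^Y with Y ~ N(mu, sigma^2).

    Sums P(z + log10(d) <= Y < z + log10(d + 1)) over integers |z| <= z_max.
    """
    probs = {}
    for d in range(1, 10):
        lo, hi = math.log10(d), math.log10(d + 1)
        probs[d] = sum(
            normal_cdf(z + hi, mu, sigma) - normal_cdf(z + lo, mu, sigma)
            for z in range(-z_max, z_max + 1)
        )
    return probs

def max_benford_deviation(mu=0.0, sigma=1.0):
    """Largest absolute gap between the exact probabilities and log10(1 + 1/d)."""
    probs = leading_digit_probs(mu, sigma)
    return max(abs(probs[d] - math.log10(1 + 1 / d)) for d in range(1, 10))

if __name__ == "__main__":
    for sigma in (0.1, 0.5, 1.0):
        print(sigma, max_benford_deviation(0.0, sigma))
```

Running the script for a few values of σ shows how quickly the deviation shrinks as the distribution of Y becomes more diffuse.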
The purpose of the present paper is to study approximate uniformity of the remainder U in more detail. In particular we refine and extend an inequality of Pinkham (1961). Section 2 provides the density and distribution function of U in case of the random variable Y having Lebesgue density f. In case of f having finite total variation or, alternatively, f being k ≥ 1 times differentiable with k-th derivative having finite total variation, the deviation of L(U) (i.e. the distribution of U) from Unif[0, 1) may be bounded explicitly in several ways. Since any density may be approximated in L^1(R) by densities with finite total variation, our approach is no less general than the Fourier method. Section 3 contains some specific applications of our bounds. For instance, we show that in case of Y being normally distributed with variance one or more, the distribution of the remainder U is very close to the uniform distribution on [0, 1).

On the distribution of the remainder U
Throughout this section we assume that Y is a real random variable with c.d.f. F and Lebesgue density f.

The c.d.f. and density of U
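For any Borel set B ⊂ [0, 1),
P(U ∈ B) = Σ_{z∈Z} P(Y ∈ z + B).
In particular, the c.d.f. G of U is given by G(x) = Σ_{z∈Z} (F(z + x) − F(z)) for 0 ≤ x ≤ 1. The corresponding density g is given by
g(x) := Σ_{z∈Z} f(x + z).
Note that the latter equation defines a periodic function g : R → [0, ∞], i.e. g(x + z) = g(x) for arbitrary x ∈ R and z ∈ Z. Strictly speaking, a density of U is given by 1{0 ≤ x < 1} g(x).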
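The periodic density g can be evaluated numerically by truncating the sum over integer shifts. The sketch below assumes g(x) = Σ_z f(x + z) and uses a normal density for f; the helper names and the truncation level `z_max` are assumptions of this example.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def wrapped_density(f, x, z_max=50):
    """Approximate g(x) = sum over integers z of f(x + z), using |z| <= z_max."""
    return sum(f(x + z) for z in range(-z_max, z_max + 1))

def range_of_g(f, n_grid=1000, z_max=50):
    """Grid approximation of sup g - inf g over one period [0, 1)."""
    values = [wrapped_density(f, i / n_grid, z_max) for i in range(n_grid)]
    return max(values) - min(values)

if __name__ == "__main__":
    print(range_of_g(normal_pdf))                              # sigma = 1
    print(range_of_g(lambda x: normal_pdf(x, 0.0, 0.2)))       # sigma = 0.2
```

For a concentrated density (small σ) the wrapped density is far from constant, whereas for σ = 1 it is numerically almost flat.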

Total variation of functions
Let us recall the definition of total variation (cf. Royden 1988, Chapter 5): For any interval J ⊂ R and a function h : J → R, the total variation of h on J is defined as
TV(h, J) := sup { Σ_{i=1}^{m} |h(t_i) − h(t_{i−1})| : m ∈ N; t_0 < t_1 < ... < t_m; t_0, ..., t_m ∈ J }.
In case of J = R we just write TV(h) := TV(h, R). If h is absolutely continuous with derivative h′, then TV(h, J) = ∫_J |h′(x)| dx. An important special case are unimodal probability densities f on the real line, i.e. f is non-decreasing on (−∞, µ] and non-increasing on [µ, ∞) for some real number µ. Then TV(f) = 2 f(µ).
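As a numerical illustration of total variation, the sketch below sums absolute increments over an equispaced partition. It assumes the standard partition-based definition of TV; any such sum is a lower bound for the total variation, and for a unimodal density it should be close to twice the value at the mode once the grid contains the mode. The function names and grid sizes are choices made for this example.

```python
import math

def total_variation_on_grid(h, lo, hi, n=20000):
    """Sum of |h(t_i) - h(t_{i-1})| over an equispaced partition of [lo, hi].

    This is a lower bound for TV(h, [lo, hi]), not the supremum itself.
    """
    step = (hi - lo) / n
    prev = h(lo)
    total = 0.0
    for i in range(1, n + 1):
        cur = h(lo + i * step)
        total += abs(cur - prev)
        prev = cur
    return total

def phi(x):
    """Standard normal density, a unimodal density with mode 0."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

if __name__ == "__main__":
    print(total_variation_on_grid(phi, -10.0, 10.0), 2.0 * phi(0.0))
```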

Main results
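We shall quantify the distance between L(U) and Unif[0, 1) by means of the range of g,
R(g) := sup_{x,y∈R} |g(x) − g(y)| ≥ sup_{x∈R} |g(x) − 1|.
The latter inequality follows from sup_{x∈R} g(x) ≥ ∫_0^1 g(x) dx = 1 ≥ inf_{x∈R} g(x). In addition we shall consider the Kuiper distance between L(U) and Unif[0, 1),
KD(G) := sup_{0≤x<y≤1} |G(y) − G(x) − (y − x)|,
where G denotes the c.d.f. of U, and the maximal relative approximation error,
MRAE(G) := sup_{0≤x<y≤1} |(G(y) − G(x))/(y − x) − 1|.
Expression (2) shows that these distance measures are canonical in connection with Benford's law.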
Note that KD(G) is bounded from below by the more standard Kolmogorov-Smirnov distance, and it is not greater than twice the Kolmogorov-Smirnov distance.
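The relation between the two distances can be made concrete. Assuming the usual definitions, with H(x) := G(x) − x and H(0) = H(1) = 0, the Kolmogorov-Smirnov distance is max(sup H, −inf H), while the Kuiper distance is sup H − inf H, which explains both inequalities. The sketch below approximates both on a grid; `example_cdf` is a hypothetical c.d.f. chosen only for illustration.

```python
import math

def ks_and_kuiper(G, n_grid=1000):
    """Grid approximations of the Kolmogorov-Smirnov and Kuiper distances
    between a c.d.f. G on [0, 1] and the uniform distribution."""
    h = [G(i / n_grid) - i / n_grid for i in range(n_grid + 1)]
    ks = max(abs(v) for v in h)      # sup |G(x) - x|
    kuiper = max(h) - min(h)         # sup H - inf H
    return ks, kuiper

def example_cdf(x):
    """Hypothetical c.d.f. on [0, 1]; its derivative 1 + 0.1*pi*cos(2*pi*x) is positive."""
    return x + 0.05 * math.sin(2.0 * math.pi * x)

if __name__ == "__main__":
    ks, kuiper = ks_and_kuiper(example_cdf)
    print(ks, kuiper, ks <= kuiper <= 2.0 * ks)
```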
Remark. The inequalities in Theorem 1 are sharp in the sense that for each number τ > 0 there exists a density f such that the corresponding density g satisfies
A simple example, mentioned by the referee, is the uniform density f(x) = 1{0 < x < τ}/τ.
Writing τ = m + a for some integer m ≥ 0 and a ∈ (0, 1], one can easily verify that
g(x) = (m + 1{0 < x < a})/τ for 0 ≤ x < 1,
and this entails (3).
Here is another example with continuous densities f and g: For given τ > 0 consider a continuous, even density f with f(0) = τ such that for all integers z ≥ 0,
As a corollary to Theorem 1 we obtain a refinement of the inequality which was obtained by Pinkham (1961, corollary to Theorem 2) via Fourier techniques:
Corollary 2 Under the conditions of Theorem 1, for 0 ≤ x < y ≤ 1, In particular,
The previous results are for the case of TV(f) being finite. Next we consider smooth densities for some version of f^{(k)}. Then g is Lipschitz-continuous on R. Precisely, for x, y ∈ R with |x − y| ≤ 1,
Corollary 4 Under the conditions of Theorem 3, for 0 ≤ x < y ≤ 1, In particular,
Finally, let us note that Theorem 1 entails a short proof of the qualitative result mentioned in the introduction:

Some applications
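We start with a general remark on location-scale families. Let f_o be a probability density on the real line such that TV(f_o^{(k)}) < ∞ for some integer k ≥ 0. For µ ∈ R and σ > 0 let f(x) = f_{µ,σ}(x) := f_o((x − µ)/σ)/σ. Then one verifies easily that
TV(f^{(k)}) = TV(f_o^{(k)})/σ^{k+1}.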
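The scaling behaviour of total variation in a location-scale family can be derived in two lines. The following is a sketch under the assumption that f = f_{µ,σ} denotes the density x ↦ f_o((x − µ)/σ)/σ with µ ∈ R and σ > 0; the last step uses that total variation is unchanged by the increasing change of variable x ↦ (x − µ)/σ.

```latex
\begin{align*}
f(x) &= \sigma^{-1} f_o\bigl(\sigma^{-1}(x - \mu)\bigr), \\
f^{(k)}(x) &= \sigma^{-(k+1)} f_o^{(k)}\bigl(\sigma^{-1}(x - \mu)\bigr), \\
\mathrm{TV}\bigl(f^{(k)}\bigr) &= \sigma^{-(k+1)} \, \mathrm{TV}\bigl(f_o^{(k)}\bigr).
\end{align*}
```

In particular, every additional derivative with finite total variation gains one further power of 1/σ.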

Normal and log-normal distributions
For φ(x) := (2π)^{−1/2} exp(−x^2/2), elementary calculations reveal that
In general, with the Hermite type polynomial of degree k. Via partial integration and induction one may show that for arbitrary integers j, k ≥ 0 (cf. Abramowitz and Stegun 1964). Hence the Cauchy-Schwarz inequality entails that
These bounds yield the following results:
Theorem 6 Let f(x) = f_{µ,σ}(x) = φ((x − µ)/σ)/σ for µ ∈ R and σ ≥ 1/6. Then the corresponding functions g = g_{µ,σ} and G = G_{µ,σ} satisfy the inequalities
for all normal densities f with standard deviation at least one.
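The total variation of the derivatives of φ can be approximated numerically. The sketch below assumes the standard representation φ^{(k)}(x) = (−1)^k He_k(x) φ(x) with the probabilists' Hermite polynomials He_k, generated by He_{k+1}(x) = x He_k(x) − He_k′(x); all function names are hypothetical. For k = 0 and k = 1 the results can be compared with 2φ(0) and 4φ(1), which follow from the monotonicity pattern of φ and φ′.

```python
import math

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def hermite_coeffs(k):
    """Coefficients (lowest degree first) of He_k via He_{j+1} = x*He_j - He_j'."""
    coeffs = [1.0]
    for _ in range(k):
        shifted = [0.0] + coeffs                              # x * He_j(x)
        deriv = [i * c for i, c in enumerate(coeffs)][1:]     # He_j'(x)
        deriv += [0.0] * (len(shifted) - len(deriv))
        coeffs = [s - d for s, d in zip(shifted, deriv)]
    return coeffs

def make_phi_derivative(k):
    """Return the function x -> phi^{(k)}(x) = (-1)^k He_k(x) phi(x)."""
    coeffs = hermite_coeffs(k)
    sign = (-1) ** k
    def derivative(x):
        he = sum(c * x ** i for i, c in enumerate(coeffs))
        return sign * he * phi(x)
    return derivative

def total_variation_on_grid(h, lo=-12.0, hi=12.0, n=24000):
    """Sum of absolute increments of h over an equispaced grid (lower bound for TV)."""
    step = (hi - lo) / n
    prev, total = h(lo), 0.0
    for i in range(1, n + 1):
        cur = h(lo + i * step)
        total += abs(cur - prev)
        prev = cur
    return total

if __name__ == "__main__":
    for k in range(4):
        print(k, total_variation_on_grid(make_phi_derivative(k)))
```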

Corollary 7 For an integer base
with σ ≥ 1/6. Then the leading digits D_0, D_1, D_2, ... of X satisfy the following inequalities: For

Gumbel and Weibull distributions
Let X > 0 be a random variable with Weibull distribution, i.e. for some parameters γ, τ > 0,
P(X ≤ x) = 1 − exp(−(x/γ)^τ) for x ≥ 0.
Then the standardized random variable Y_o := τ log(X/γ) satisfies
F_o(y) := P(Y_o ≤ y) = 1 − exp(−e^y) for y ∈ R.
Elementary calculations reveal that for any integer n ≥ 1,
F_o^{(n)}(y) = p_n(e^y) exp(−e^y)
with p_n(t) being a polynomial in t of degree n. Precisely, p_1(t) = t, and
p_{n+1}(t) = t (p_n′(t) − p_n(t))
for n = 1, 2, 3, .... In particular, p_2(t) = t(1 − t) and p_3(t) = t(1 − 3t + t^2). These considerations lead already to the following conclusion:
Corollary 8 Let X > 0 have Weibull distribution with parameters γ, τ > 0 as above. Then for arbitrary integers k, ℓ ≥ 0 and digits d_0, d_1, d_2, ... as in Corollary 7.
Explicit inequalities as in the Gaussian case seem to be out of reach. Nevertheless some numerical bounds can be obtained. Table 1 contains numerical approximations for TV(f
Hence the S_{n,k} are Stirling numbers of the second kind (see [6], chapter 6.1).
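The link to Stirling numbers can be checked by machine. The sketch below assumes the recursion p_{n+1}(t) = t (p_n′(t) − p_n(t)) with p_1(t) = t, which reproduces the polynomials p_2 and p_3 stated above, and it assumes that the coefficient of t^k in p_n equals (−1)^{k−1} S_{n,k}. Both the recursion and this sign convention are assumptions of this example, as are the function names.

```python
def next_poly(p):
    """Given the coefficients of p_n (lowest degree first), return those of
    t * (p_n'(t) - p_n(t))."""
    deriv = [i * c for i, c in enumerate(p)][1:] + [0]   # p_n', padded to len(p)
    diff = [d - c for d, c in zip(deriv, p)]             # p_n' - p_n
    return [0] + diff                                    # multiply by t

def poly_p(n):
    """Coefficients of p_n, starting from p_1(t) = t."""
    p = [0, 1]
    for _ in range(n - 1):
        p = next_poly(p)
    return p

def stirling2(n, k):
    """Stirling number of the second kind via S(n,k) = k*S(n-1,k) + S(n-1,k-1)."""
    table = [[0] * (k + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, min(i, k) + 1):
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][k]

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, poly_p(n), [stirling2(n, k) for k in range(1, n + 1)])
```

Comparing the two printed lists for each n shows the same absolute values, with alternating signs in the polynomial coefficients.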

Some useful facts about total variation
In our proofs we shall utilize some basic properties of the total variation of functions h : J → R (cf. Royden 1988, Chapter 5). Note first that
In both cases we would get a contradiction to h^{(0)} = h being integrable over R.