Random Discrete Distributions Derived From Self-Similar Random Sets

A model is proposed for a decreasing sequence of random variables $(V_1, V_2, \cdots)$ with $\sum_n V_n = 1$, which generalizes the Poisson-Dirichlet distribution and the distribution of ranked lengths of excursions of a Brownian motion or recurrent Bessel process. Let $V_n$ be the length of the $n$th longest component interval of $[0,1]\backslash Z$, where $Z$ is an a.s. non-empty random closed subset of $(0,\infty)$ of Lebesgue measure $0$, and $Z$ is self-similar, i.e. $cZ$ has the same distribution as $Z$ for every $c > 0$. Then for $0 \le a < b \le 1$ the expected number of $n$'s such that $V_n \in (a,b)$ equals $\int_a^b v^{-1} F(dv)$ where the structural distribution $F$ is identical to the distribution of $1 - \sup ( Z \cap [0,1] )$. Then $F(dv) = f(v)dv$ where $(1-v) f(v)$ is a decreasing function of $v$, and every such probability distribution $F$ on $[0,1]$ can arise from this construction.

While the last of these applications was the main source of inspiration for the present study, the results described here admit interesting interpretations in some of the other applications as well.
Let $(V_n) = (V_1, V_2, \cdots)$ be a sequence of random variables such that $V_n \ge 0$ and $\sum_n V_n = 1$ a.s., where $n$ always ranges over $\{1, 2, \cdots\}$. Call $(V_n)$ a random discrete distribution, or rdd. Call a random variable $V$ a size-biased pick from $(V_n)$ if $V = V_N$ for a positive integer valued random variable $N$ such that

$$P(N = n \mid V_1, V_2, \cdots) = V_n \qquad (n = 1, 2, \cdots). \tag{1}$$

This construction, and its iteration to define a size-biased random permutation of $(V_n)$, play a key role in both theory and applications of random discrete distributions [14,8,33]. Denote by $F$ the distribution on $(0,1]$ of a size-biased pick $V$ from $(V_n)$. Following Engen [10], call $F$ the structural distribution of $(V_n)$. It is well known that many probabilities and expectations related to $(V_n)$ can be expressed in terms of this one distribution $F$. For example, (1) implies that for any positive measurable function $g$

$$E\Big[\sum_n g(V_n)\, 1(V_n > 0)\Big] = \int_{(0,1]} v^{-1} g(v)\, F(dv). \tag{2}$$

Taking $g(v) = 1(a < v < b)$ gives an expression in terms of $F$ for the mean number of $n$'s such that $a < V_n < b$. This shows in particular that if $(V_n)$ is ranked, i.e. if $V_1 \ge V_2 \ge \cdots$ a.s., then the distribution of $V_1$ restricted to the interval $(1/2, 1]$ can be recovered from $F$, because at most one term can exceed $1/2$:

$$P(V_1 \in dv) = v^{-1} F(dv) \qquad (1/2 < v \le 1). \tag{3}$$

As noted in [33], the structural distribution $F$ also appears in formulae related to Kingman's partition structure induced by $(V_n)$, which is a natural construction of interest in several of the applications listed above. Call a distribution $F$ on $(0,1]$ a structural distribution if $F$ is the structural distribution of some rdd. Pitman [33] posed the problem of characterizing the set of all structural distributions, and gave a simple necessary condition for a distribution $F$ to be structural, namely a condition required to hold for every $0 < a \le 1$ (or, equivalently, for every $0 < a \le 1/2$). This paper introduces a class of models for a rdd with the feature that the structural distribution can be identified explicitly.
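The definition of a size-biased pick and the way expected counts are read off from the structural distribution can be illustrated for a single fixed realisation of $(V_n)$. The following is a minimal sketch, not taken from the paper: the function names and the example sequence are hypothetical, and the sequence is treated as non-random, so that $F$ simply puts mass $v_n$ at the point $v_n$.

```python
import random
from collections import defaultdict

def size_biased_pick(v, rng):
    """Return (N, V_N) with P(N = n) = v[n-1], for a fixed realisation v."""
    n = rng.choices(range(1, len(v) + 1), weights=v, k=1)[0]
    return n, v[n - 1]

def structural_distribution(v):
    """Law of a size-biased pick from the fixed sequence v: mass v_n at v_n."""
    F = defaultdict(float)
    for vn in v:
        if vn > 0:
            F[vn] += vn
    return dict(F)

def count_via_structural(v, a, b):
    """Number of terms with a < v_n < b, computed by integrating 1/x against F."""
    F = structural_distribution(v)
    return sum(mass / x for x, mass in F.items() if a < x < b)

v = [0.4, 0.4, 0.2]           # hypothetical realisation, with a tie
rng = random.Random(0)
n, picked = size_biased_pick(v, rng)
direct_count = sum(1 for vn in v if 0.3 < vn < 0.5)
print(direct_count, count_via_structural(v, 0.3, 0.5))
```

For a random $(V_n)$ the same computation would be averaged over realisations, which is what turns the count into an expectation.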
Analysis of these rdd's shows that the following condition is sufficient for $F$ to be a structural distribution. We note however that this condition is far from necessary, even assuming $F$ has a density (see Example 19).
Condition 1 $F$ admits a density $f(u) = F(du)/du$ such that $(1-u) f(u)$ is a decreasing function of $u$ for $0 < u < 1$.
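Condition 1, read as the requirement that $(1-u) f(u)$ be decreasing, is easy to probe numerically for a given density. The sketch below is an illustration only: a check on a finite grid is not a proof, and the helper names, grid size and parameter values are assumptions made for the example. It tests two beta densities that satisfy the condition and one (the density $2u$) that does not.

```python
import math

def beta_pdf(u, a, b):
    """Density of the beta(a, b) distribution at 0 < u < 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * u ** (a - 1) * (1 - u) ** (b - 1)

def satisfies_condition_1(f, n_grid=2000, tol=1e-12):
    """Check on a grid that u -> (1 - u) * f(u) is non-increasing on (0, 1)."""
    grid = [(i + 0.5) / n_grid for i in range(n_grid)]
    values = [(1 - u) * f(u) for u in grid]
    return all(later <= earlier + tol
               for earlier, later in zip(values, values[1:]))

theta, alpha = 2.0, 0.5   # hypothetical parameter values
print(satisfies_condition_1(lambda u: beta_pdf(u, 1, theta)))
print(satisfies_condition_1(lambda u: beta_pdf(u, 1 - alpha, alpha)))
print(satisfies_condition_1(lambda u: beta_pdf(u, 2, 1)))
```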
From a mathematical point of view, it is natural to represent a rdd by the lengths of a random collection of disjoint open sub-intervals of [0, 1]. The complement of such a random collection of intervals is then a random closed subset of [0, 1], as defined more formally in Section 2.
The assumption that Lebesgue$(Z) = 0$ ensures that $\sum_n V_n = 1$ a.s. So $(V_n)$ derived from $Z$ is a ranked rdd. Think of each point of $Z$ as the location of a cut in the line. Then $(V_n)$ is defined by the ranked lengths of the intervals that remain after cutting $[0,1]$ at the points of $Z$. One natural construction of such a $Z$, corresponding to an arbitrary prescribed distribution for $(V_n)$, is obtained from the exchangeable interval partition considered by Berbee [3] and Kallenberg [17]. Here we consider constructions with a different sort of symmetry:

Definition 3 Call a random closed subset $Z$ of $\mathbb{R}_+$ self-similar if

$$Z \overset{d}{=} cZ \quad \text{for every } c > 0, \tag{6}$$

where $cZ = \{cz, z \in Z\}$, and $\overset{d}{=}$ denotes equality in distribution. (See Section 2 for the formal definition of the distribution of $Z$.) Call $Z$ a self-sim$_0$ set if $Z$ is an a.s. non-empty self-similar random closed subset of $\mathbb{R}_+$ with Lebesgue$(Z) = 0$ a.s.
That Condition 1 is sufficient for $F$ to be a structural distribution is implied by the following theorem:

Theorem 4 A distribution $F$ on $(0,1]$ is the structural distribution of $(V_n)$ derived from some self-sim$_0$ set if and only if $F$ satisfies Condition 1.
This result is derived in Section 4 using the characterization of the structural distribution of a self-sim$_0$ set provided by Theorem 7 below. Our formulation of this theorem was guided by the following two examples of self-sim$_0$ sets which have been extensively studied. Both examples involve the beta$(a,b)$ distribution on $(0,1)$, which is defined for $a > 0$, $b > 0$ by the probability density proportional to $x^{a-1}(1-x)^{b-1}$ for $0 < x < 1$. For a random closed subset $Z$ of $\mathbb{R}$, define

$$G_t := \sup (Z \cap (-\infty, t]), \qquad D_t := \inf (Z \cap (t, \infty)), \qquad A_t := t - G_t.$$

Following terminology from renewal theory, when $Z$ is a random discrete set of renewal times, we call $(A_t)$ the age process derived from $Z$. If $(V_n)$ is derived from $Z$, and $A_1 > 0$, then $A_1$ is one of the lengths in the sequence $(V_n)$.

Example 5 poisson-dirichlet($\theta$). Suppose $Z$ is the set of points of a Poisson process on $(0,\infty)$ with intensity measure $\theta x^{-1} dx$ for some $\theta > 0$. Then the points of $Z \cap (0,1]$ can be ranked in decreasing order, say

$$Z \cap (0,1] = \{Z_1 > Z_2 > \cdots\}. \tag{10}$$

It is known that $Z_n$ may be represented as

$$Z_n = (1 - X_1)(1 - X_2) \cdots (1 - X_n),$$

where $X_1 = A_1$ and the $X_i$ are i.i.d. beta$(1,\theta)$ variables [16]. In terms of the $X_i$ the sequence $(V_n)$ is obtained by ranking the terms $\tilde V_n$ defined by

$$\tilde V_n = (1 - X_1) \cdots (1 - X_{n-1}) X_n. \tag{12}$$

The distribution of $(V_n)$ derived from this $Z$ is known as the Poisson-Dirichlet distribution with parameter $\theta$ [19,16]. It is known that $(\tilde V_n)$ is a size-biased permutation of $(V_n)$ [27,28,8,33]. In particular, $\tilde V_1 = A_1$ is a size-biased pick from $(V_n)$, so the structural distribution of $(V_n)$ is identical to the beta$(1,\theta)$ distribution of $A_1$.
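Example 5 can be simulated directly from the i.i.d. beta$(1,\theta)$ variables. The sketch below assumes the usual stick-breaking form of the spacings, $\tilde V_n = (1-X_1)\cdots(1-X_{n-1})X_n$, truncates the sequence after finitely many terms, and ranks what has been generated. The function name, the truncation level and the value of $\theta$ are illustrative assumptions, and ranking a truncated sequence only approximates the ranked sequence $(V_n)$.

```python
import random

def stick_breaking_spacings(theta, n_terms, rng):
    """First n_terms spacings (1-X_1)...(1-X_{n-1}) X_n, X_i i.i.d. beta(1, theta).

    Also returns the mass not yet allocated after n_terms steps.
    """
    spacings, remaining = [], 1.0
    for _ in range(n_terms):
        x = rng.betavariate(1.0, theta)
        spacings.append(remaining * x)
        remaining *= 1.0 - x
    return spacings, remaining

rng = random.Random(0)
spacings, leftover = stick_breaking_spacings(theta=1.5, n_terms=200, rng=rng)
ranked = sorted(spacings, reverse=True)   # approximation to (V_1, V_2, ...)
print(ranked[:5], leftover)
```

The first spacing is the size-biased pick $A_1$, so averaging it over many runs should give a value near the beta$(1,\theta)$ mean $1/(1+\theta)$.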
Example 6 stable($\alpha$). Let $Z$ be the closure of the set of zeros of a self-similar strong-Markov process $B$, such as a Brownian motion or a recurrent Bessel process, started at $B_0 = 0$. It is well known that $Z$ is then the closure of the range of a stable subordinator of index $\alpha$ for some $0 < \alpha < 1$. For example, $\alpha = 1/2$ for Brownian motion, and $\alpha = (2 - \delta)/2$ for a Bessel process of dimension $\delta$. The distribution of $(V_n)$ in this case is an analog of the Poisson-Dirichlet distribution that has been studied by several authors [44,29,35]. It is well known that this $Z$ is a.s. perfect, i.e. $Z$ contains no isolated points. Consequently, $Z$ is uncountable, and its points cannot be simply ranked as in the previous example. Still, it was shown in [35] that $A_1$ is a size-biased pick from $(V_n)$, as in the previous example. So the structural distribution of $(V_n)$ is again identical to the distribution of $A_1$, in this case beta$(1-\alpha, \alpha)$, also known as generalized arcsine [9]. It was shown further in [30] that in this example a size-biased random permutation $(\tilde V_n)$ of $(V_n)$, constructed with extra randomization, admits the representation (12) for independent beta$(1-\alpha, n\alpha)$ random variables $X_n$.
has the same joint distribution as if $N$ were a size-biased pick from $(V_n)$. Consequently: the structural distribution of $(V_n)$ equals the distribution of $A_1$, and the distribution of $N$, the rank of $A_1$ in $(V_n)$, is given by

$$P(N = n) = E(V_n) \qquad (n = 1, 2, \cdots).$$

Theorem 7 is proved in Section 2. Note the subtle phrasing of the conclusion of Theorem 7. It is not claimed, nor is it true for every self-sim$_0$ set $Z$, that $A_1$ is a size-biased pick from $(V_n)$, as was observed in Examples 5 and 6. Spelled out in detail, the conclusion of Theorem 7 is that the rank $N$ of $A_1$ in $(V_n)$ has the following property, call it the weak sampling property:

$$P(N = n \mid V_n) = V_n \qquad (n = 1, 2, \cdots).$$

Equivalently, by definition of conditional probabilities, for every $n$ and every positive measurable function $g$,

$$\text{(weak sampling):} \qquad E[g(V_n)\, 1(N = n)] = E[g(V_n)\, V_n]. \tag{16}$$

Compare with the strong sampling property which was observed in Examples 5 and 6:

$$\text{(strong sampling):} \qquad E[g(V_1, V_2, \cdots)\, 1(N = n)] = E[g(V_1, V_2, \cdots)\, V_n]. \tag{17}$$

To paraphrase Theorem 7, every self-sim$_0$ set has the weak sampling property.
Example 20 in Section 5 shows that not every self-sim$_0$ set has the strong sampling property.
Proposition 23 can be used to generate a large class of self-sim$_0$ sets with the strong sampling property. But we do not know a nice sufficient condition for a self-sim$_0$ set to have this property. The most important conclusion of Theorem 7 is the identification of the structural distribution of $(V_n)$ with the distribution of $A_1$. We provide another approach to this result in Section 4. The idea is to exploit the fact that $Z$ is self-similar iff $\log Z$ is stationary, and make use of the generalizations to stationary random sets [31] of some standard formulae for stationary renewal processes. An advantage of this approach is that it gives an explicit description of all possible joint distributions of $(G_t, D_t)$ derived from a self-sim$_0$ set, which leads to Theorem 4.

Self-Similar Random Sets
Call $Z$ a random closed subset of $\mathbb{R}$ if $\omega \to Z(\omega)$ is a map from a probability space $(\Omega, \mathcal{F}, P)$ to closed subsets of $\mathbb{R}$, and $A_t$ is $\mathcal{F}$-measurable for every $t > 0$, where we define $D_t$, $G_t$ and $A_t$ in terms of $Z$ as below Definition 2. To emphasize the $Z$ underlying these definitions, we may write e.g. $A_t(Z)$ instead of $A_t$. Define the distribution of $Z$ to be the distribution of the age process $(A_t(Z), t \ge 0)$ on the usual path space of cadlag paths. We refer to Azéma [2] for a general treatment of random closed subsets of $\mathbb{R}$. A real or vector-valued process $(X_t, t \ge 0)$ is called $\beta$-self-similar, for a real number $\beta$, if for every $c > 0$

$$(X_{ct}, t \ge 0) \overset{d}{=} (c^{\beta} X_t, t \ge 0).$$

Such processes were studied by Lamperti [24,25], who called them semistable. See [40] for a survey of the literature of these processes. A random closed subset $Z$ of $\mathbb{R}$ is self-similar in the sense (6) iff its age process is 1-self-similar. A natural example of a self-similar random closed subset of $(0,\infty)$ is provided by the closure of the zero set of a $\beta$-self-similar process for any $\beta$.
Assume now that $Z$ is a self-sim$_0$ set as in Definition 3. Let $V_n(t)$ be the length of the $n$th longest component interval of $[0,t] \setminus Z$. Then the sequence valued process $((V_n(t), n \in \mathbb{N}), t \ge 0)$ is 1-self-similar, and

$$\sum_n V_n(t) = t \qquad (t \ge 0). \tag{21}$$

The random sequence $(V_n(t)/t, n \in \mathbb{N})$ then defines a ranked rdd which has the same distribution for every $t > 0$.
Proof of Theorem 7. Let $N_t$ denote the rank of $A_t$ in the sequence of ranked lengths $(V_n(t), n = 1, 2, \cdots)$ of $[0,t] \setminus Z$, defined with the convention $\sup \emptyset = 0$. It is a key observation that

$$V_n(t) = \int_0^t 1(N_s = n)\, ds \qquad (n = 1, 2, \cdots;\ t \ge 0). \tag{24}$$

To check (24), start from the identity (21). Fix an $m \in \mathbb{N}$ and integrate $1(A_t = V_m(t))$ with respect to both sides of (21). Since for each $n$, $dV_n(t)$ is carried by the set $\{t : A_t = V_n(t)\}$, and this set differs from $\{t : N_t = n\}$ by at most the discrete set of times $\{t$ such that $A_t = V_k(t)$ for more than one $k\}$, we obtain (24) with $m$ instead of $n$. It is clear that $(N_t, t \ge 0)$ satisfies the assumptions of the following Lemma. Theorem 7 follows immediately from the conclusion of the Lemma.
Let $V_n(t)$ be defined by (24) for all $n \in \mathbb{N}_0$. Then for every $n \in \mathbb{N}_0$ and every $t > 0$ In particular, if (21) holds, then for every $t > 0$,

Proof. Apply the next Lemma to the 0-self-similar process $X_t = 1(N_t = n)$. $\Box$

It will be shown that (27) follows from this identity. As a first consequence of (28), it suffices to prove (27) for $t = 1$. Let $f : \mathbb{R}_+ \to \mathbb{R}_+$ be a $C^1$ function with $f(0) = 0$. The chain rule for Lebesgue integrals (see e.g. [39], Chapter 0, Prop. (4.6)) gives for all $t \ge 0$ Take $t = 1$ and use (28) to obtain Now assume further that $f'(0) = 0$, so there is a vanishing contribution from the event $(X_1 = 0)$ in the rightmost expectation above. Then we see that for By standard measure theory, (31) must hold also for every Borel $f$ such that both expectations in (31) are well-defined and finite. Taking shows that for every Borel subset $B$ of $\mathbb{R}$ that does not contain $0$ But (32) holds also for $B = \mathbb{R}$ because, using (28) again Subtracting (32) (33) gives (32)

That is to say, $Z$ is stationary iff its age process $(A_t)$ is a stationary process in the usual strict sense. Then the R

Definition 10 Call $Z$ a stationary$_0$ set if $Z$ is an a.s. non-empty stationary random closed subset of $\mathbb{R}$ with Lebesgue$(Z) = 0$ a.s. Clearly, $Z$ is a self-sim$_0$ set iff $\log Z$ is a stationary$_0$ set. The following proposition is the restriction to stationary$_0$ sets of the result stated in Section VI of [31].
where V has uniform distribution on (0, 1), and V is independent of L.
Given a probability distribution $F_0$ on $(0,\infty)$, it is easy to construct a stationary$_0$ set such that $L$ has distribution $F_0$.
If $F_0$ is degenerate at $c > 0$, so $L = c$ is constant, it is obvious that stationary-lattice$(F_0)$ is the only possible distribution of a stationary$_0$ set such that $L$ has distribution $F_0$. But there are many other possible distributions of $Z$ corresponding to each non-degenerate $F_0$. Among these is the following:

Example 13 stationary-regen$_0(F_0)$. This is the unique distribution of a stationary$_0$ set $Z$ that is a regenerative subset of $\mathbb{R}$ in the sense of [26], and such that $L$ has distribution $F_0$. See Fristedt [13], who gives the following construction of $Z$, and further references. Let

$$Z := (-A_0 - Z_1) \cup (D_0 + Z_2), \tag{35}$$

where $A_0$ and $D_0$ are defined by (34) in terms of $L$ with distribution $F_0$ and an independent uniform $V$, and, independent of these variables, $Z_1$ and $Z_2$ are two independent copies of the closed range of a pure jump subordinator with Lévy measure $\Lambda(dx) = c x^{-1} F_0(dx)$, for an arbitrary constant $c > 0$. It is easily seen that this yields the same distribution as various other constructions of $Z$ that can be found in the literature. It is immediate from the above construction that this $Z$ is reversible: $Z \overset{d}{=} -Z$. If $\Lambda$ has total mass 1, then the stationary-regen$_0$ set $Z$ defined by (35) is just the stationary point process derived as in renewal theory from i.i.d. spacings with distribution $\Lambda$.
Another method of constructing a stationary-regen$_0(F_0)$ set is to let $Z$ be the closure of the zero set of a suitable stationary strong-Markov process $X$. One can always take $X$ to be the stationary version of the age process derived from the subordinator with Lévy measure $\Lambda$ described above. This is the method of Horowitz [15]. But zero sets of other Markov processes $X$ may be considered. For example, the zero set $Z$ of a stationary diffusion process $X$ on the line, for which $0$ is recurrent, gives a stationary-regen$_0$ set with $\Lambda(0,\infty) = \infty$. See Knight [22] and Kotani-Watanabe [23] regarding which Lévy measures $\Lambda$ can be obtained by this construction from a diffusion.
Example 14 Suppose $W = \log Z$ where $Z$ is the stable($\alpha$) regenerative self-sim$_0$ set. If we represent $Z$ as the zero set of $(B(t), t \ge 0)$ for a Brownian motion $B$ (in case $\alpha = 1/2$) or Bessel process $B$ of dimension $\delta = 2 - 2\alpha$, then $W$ is the zero set of the process $(X_s, s \ge 0)$ defined by $X_s = e^{-s/2} B(e^s)$. It is well known that for $B$ a Brownian motion, $X$ is a one-dimensional Ornstein-Uhlenbeck process. See Section 6 of [35] regarding the Bessel case.
4 The joint law of $(G_1, D_1)$ for a self-sim$_0$ set

We start this section by presenting an alternative derivation of the key identity (13) that is part of the conclusion of Theorem 7.

Another proof of (13). Let $Z$ be a self-sim$_0$ set. Let $V_n := V_n(1)$, the length of the $n$th longest component interval of $[0,1] \setminus Z$. Let $U$ be uniform$(0,1)$ independent of $Z$. A size-biased pick from $(V_n)$ is provided by the length of the component interval covering $U$, that is $(D_U \wedge 1) - G_U$. So an equivalent of (13) is which, by scaling, amounts to Ignoring null sets, the event $(D_U \ge 1)$ is identical to $(U > G_1)$. On this event $G_U = G_1$, so the left side of (36) reduces to $1 - G_1$, and (36) can be rewritten as That is to say (39) or equivalently, by scaling, Consider now the random subset $\log Z$ of $\mathbb{R}$. Since $Z$ is a self-sim$_0$ set, $\log Z$ is a stationary$_0$ set. Since $z \to \log z$ is increasing with $\log 1 = 0$, and $V$ is uniform on $(0,1)$, and independent of $L$ and $U$. Thus the identity (37) reduces to the following: for such $L$, $V$ and $U$, As noted in Section 3, $L$ can have an arbitrary distribution $F_0$ on $(0,\infty)$. By conditioning on $L$, (44) holds no matter what the distribution of $L$ iff (44) holds for $L$ an arbitrary positive constant, say $L = \log C$. Now (44) reduces to this: for any constant $C > 1$, and independent uniform$(0,1)$ variables $U$ and $V$, Put $W = 1 - V$, so $U$ and $W$ are i.i.d. uniform$(0,1)$ too. By the same reduction made earlier in (40), it is enough to check that for $0 < x < 1$, Since $1 - C^{-W} < 1 - C^{-1}$, the distribution on the right side of (46) vanishes for $x > 1 - C^{-1}$. So it does on the left, since the condition $U C^{-W} < C^{-1}$ makes as well. So it is enough to compute the densities of both sides in (46) relative to $dx$ for $0 < x \le 1 - C^{-1}$, and show they are equal. On the one hand, conditioning on $W$ shows that the density on the left side of (46) is constant over this range and equal to $E[C^W/(C-1)]$.
On the other hand, the change of variables shows that the density on the right side of (46) is constant and equal to $1/\log C$. An easy integration confirms that the two constants are equal: $E[C^W] = \int_0^1 C^w\, dw = (C-1)/\log C$. $\Box$

Proof of Theorem 4. Consider the distribution of $A_1$ for a self-sim$_0$ set. By application of Proposition 11 as in the preceding argument,

$$A_1 \overset{d}{=} 1 - e^{-Y},$$

where $Y := LW$ for a uniform$(0,1)$ variable $W$ independent of $L$, and, from the discussion below Proposition 11, the random variable $L := \log(D_1/G_1)$ can have an arbitrary distribution $F_0$ on $(0,\infty)$. The distribution of $Y$ is given by the density

$$P(Y \in dy) = h(y)\, dy \qquad (y > 0),$$

where

$$h(y) := \int_{(y,\infty)} x^{-1} F_0(dx). \tag{49}$$

The conclusion of Theorem 4 is now clear: the density $f$ of $A_1 = 1 - e^{-Y}$ satisfies $(1-v) f(v) = h(-\log(1-v))$, which is decreasing in $v$ because $h$ is decreasing. $\Box$

As a complement to Theorem 4, the following corollary summarizes the collection of distributional identities implicit in the above argument:

Corollary 15
The distribution of $D_1/G_1$ derived from a self-sim$_0$ set can be any distribution on $(1,\infty)$. Let $F_0$ denote the corresponding distribution of $L := \log(D_1/G_1)$, which can be any distribution on $(0,\infty)$. Then

(i) the joint distribution of $(G_1, D_1)$ is determined by $F_0$ via the formula

$$(G_1, D_1) = (e^{-LW}, e^{L(1-W)}), \tag{54}$$

where $W := -(\log G_1)/L$ has uniform$(0,1)$ distribution, independent of $L$ with distribution $F_0$;

(ii) the random variables $-\log G_1$ and $\log D_1$ are identically distributed with a decreasing density $h(y)$ on $(0,\infty)$;

(iii) the random variables $G_1$ and $1/D_1$ are identically distributed, with density $g(u)$ on $(0,1)$ such that $u g(u)$ is an increasing function of $u$;

(iv) the random variables $1/G_1$ and $D_1$ are identically distributed, with density $k(x)$ on $(1,\infty)$ such that $x k(x)$ is a decreasing function of $x$;

(v) the right-continuous version of the density $h$ is related to $F_0$ by (49); each of the densities $h$, $g$ and $k$ is arbitrary subject to the constraints stated above, and each of these densities can be recovered from any of the others via the formulae

$$h(y) = e^{-y} g(e^{-y}) = e^{y} k(e^{y}) \qquad (0 < y < \infty). \tag{56}$$

Remark 16 Inversion. The identity in distribution $D_1 \overset{d}{=} 1/G_1$ for any self-sim$_0$ set, implied by (ii) above, can be seen immediately by scaling, using the relation $(G_t < u) = (t < D_u)$. In case $Z$ is the stable($\alpha$) self-sim$_0$ set, as in Example 6, the joint distribution of $(G_1, D_1)$ is well known [9]. In particular, the distribution of $G_1$ is beta$(\alpha, 1-\alpha)$. The identity in distribution $D_1 \overset{d}{=} 1/G_1$ can be strengthened to $(G_t, t \ge 0) \overset{d}{=} (1/D_{1/t}, t \ge 0)$ in this case, and more generally whenever $Z$ is invertible, that is $Z \overset{d}{=} 1/Z$ where $1/Z = \{1/z, z \in Z\}$. This amounts to reversibility of $\log Z$, which holds whenever $\log Z$ is regenerative [26], and also if $\log Z$ is a stationary random lattice. However, as shown by an example in [31], not all stationary$_0$ sets are reversible, so not all self-sim$_0$ sets are invertible.
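Part (i) of Corollary 15 can be read as a sampling recipe: draw $L$ from $F_0$ and an independent uniform $W$, then set $G_1 = e^{-LW}$ and $D_1 = G_1 e^{L} = e^{L(1-W)}$, which is what the definitions $L = \log(D_1/G_1)$ and $W = -(\log G_1)/L$ amount to. The sketch below is illustrative only. The choice of a standard exponential $F_0$, the sample size and the function name are assumptions, not taken from the paper.

```python
import math
import random

def sample_g1_d1(sample_L, rng):
    """One draw of (G_1, D_1, L) with G_1 = exp(-L W), D_1 = exp(L (1 - W)).

    W is uniform(0, 1) and independent of L, which is drawn by sample_L(rng).
    """
    L = sample_L(rng)
    W = rng.random()
    return math.exp(-L * W), math.exp(L * (1.0 - W)), L

rng = random.Random(1)
# Hypothetical F_0: the standard exponential distribution on (0, infinity).
draws = [sample_g1_d1(lambda r: r.expovariate(1.0), rng) for _ in range(20000)]

# By (ii), -log G_1 and log D_1 share one law, so their sample means should agree.
mean_neg_log_g = sum(-math.log(g) for g, _, _ in draws) / len(draws)
mean_log_d = sum(math.log(d) for _, d, _ in draws) / len(draws)
print(mean_neg_log_g, mean_log_d)
```

With this $F_0$ both means estimate $E(L)/2 = 1/2$; comparing full empirical distributions rather than means would be a sharper check of (ii).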
Remark 17 Distribution of $D_1 - G_1$. For the stable($\alpha$) set $Z$, the distribution of the length $D_1 - G_1$ of the complementary interval covering 1 is found by integration from the joint law of $(G_1, D_1)$ to be where the last factor equals $(1-x)^{\alpha}$ for $0 < x < 1$ and $1$ for $x \ge 1$. It can also be shown that this distribution of $D_1 - G_1$ is the distribution of $Z_{1-\alpha,\alpha}/Z_{\alpha,1}$ for independent $Z_{1-\alpha,\alpha}$ and $Z_{\alpha,1}$, where $Z_{a,b}$ has beta$(a,b)$ distribution.
The distribution of $D_1 - G_1$ is also easily described for the set of points $Z$ of a Poisson point process with intensity $\theta x^{-1} dx$ as considered in Example 5. In that case $D_1 - G_1$ is the sum of independent variables $D_1 - 1$ and $1 - G_1$, where $1 - G_1$ has beta$(1,\theta)$ distribution, and $D_1 \overset{d}{=} 1/G_1$. So the density of $D_1 - G_1$ can be expressed as a convolution.
For a general self-sim$_0$ set, it is clear from (54) that the joint law of $(G_1, D_1)$ has a density relative to Lebesgue measure in the plane iff $F_0$ is absolutely continuous. Still, it can be seen as follows that $D_1 - G_1$ always has a density, no matter what $F_0$. Use (54) to write

$$D_1 - G_1 = e^{-LW}(e^{L} - 1).$$

Conditioning on $L$ gives

$$P(D_1 - G_1 \in dx \mid L = \ell) = (\ell x)^{-1}\, 1(1 - e^{-\ell} < x < e^{\ell} - 1)\, dx.$$

Integrating out with respect to $F_0(d\ell)$ gives a general if unwieldy formula for the density of $D_1 - G_1$. We do not know whether $F_0$ can be recovered from the density of $D_1 - G_1$, or if there is any nice description of all possible densities for $D_1 - G_1$ as $Z$ ranges over all self-sim$_0$ sets.

Example 18
The zero set of a perturbed Brownian motion. The above formulae can be applied to the self-sim$_0$ set $Z$ defined by the zero set of the perturbed Brownian motion $(|B_t| - \mu L_t, t \ge 0)$ studied in [5]. Here $B$ is a standard Brownian motion, $(L_t, t \ge 0)$ is its local time process at zero, and $\mu > 0$ is a parameter. The law of $G_1$, found explicitly in [5], turns out to be fairly complicated. Still, without further calculation, the above results show how this law determines the structural distribution of $(V_n)$ derived from this $Z$, the law of $D_1$, and the joint law of $(G_1, D_1)$. It seems intuitively clear that this $Z$ is not invertible, but we do not see a proof.

Examples
Example 19 A $(V_n)$ with $V_n > 0$ for all $n$ such that the structural distribution has a density $f$ not satisfying Condition 1. Let $V_1$ have a density with support $[q, 1]$ for some $1/2 < q < 1$. Let $V_{n+1} = (1 - V_1) W_n$ for $n \ge 1$ where $(W_n)$ is any rdd with $W_n > 0$ for all $n$, whose structural distribution has a density that does not vanish on $(0,1)$. Then $(V_n)$ is a rdd whose structural distribution $F$ has a density $f$ that is strictly positive on $(0, 1-q)$ and $(q, 1)$, but which vanishes on $(1-q, q)$. Obviously this $f$ does not satisfy Condition 1.
Example 20 A self-sim$_0$ set $Z$ that does not have the strong sampling property. For every possible structural density $f$ for $(V_n)$ derived from a self-sim$_0$ set, as described in Theorem 4, there is a self-sim$_0$ set that generates a $(V_n)$ with the given structural density $f$, and which does not have the strong sampling property (17). Such a self-sim$_0$ set $Z$ is obtained as $Z = \exp(W)$ where $W$ is the stationary-lattice$(F_0)$ for $F_0$ the distribution in (49) derived from $f$ as in (53). Let $L$ be the length of the component interval of $W^c$ that contains $0$. So $L$ has distribution $F_0$, and $Z = \{\exp(L(z - V)), z \in \mathbb{Z}\}$ where $V$ is uniform$(0,1)$ independent of $L$, and $\mathbb{Z}$ is the set of integers. Consequently, $Z \cap (0,1] = \{Z_1 > Z_2 > \cdots\}$ a.s. where $Z_n = e^{-L(n-1+V)}$ for $n = 1, 2, \cdots$, and the spacings $\tilde V_n := Z_{n-1} - Z_n$, where $Z_0 := 1$, are given by

$$\tilde V_1 = 1 - e^{-LV}, \qquad \tilde V_n = e^{-L(n-2+V)}(1 - e^{-L}) \quad (n \ge 2).$$

The sequence $(V_n)$ is obtained by ranking $(\tilde V_n)$. The expression for $\tilde V_n$ shows that $\tilde V_2, \tilde V_3, \cdots$ is a geometric progression with common ratio $e^{-L}$. Let $N$ be the rank of $\tilde V_1$ in $(V_n)$, so $\tilde V_1 = V_N$. Clearly $V_n = \tilde V_n$ for all $n > N$, so

$$P\left(\frac{V_{n+1}}{V_n} = e^{-L} \text{ for all sufficiently large } n\right) = 1,$$

and $N$ can likewise be recovered from $(V_n)$. Thus both $L$ and $N$ are measurable functions of $(V_n)$. In particular, $N$ is not a size-biased pick from $(V_n)$ in the sense of (17).
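The mechanism behind Example 20 can be seen on one realisation of the lattice set. The sketch below builds the points $Z_n = e^{-L(n-1+V)}$ and their spacings for fixed values of $L$ and $V$, ranks the spacings, and reads both $L$ and the rank of the first spacing off the ranked sequence. The numerical values of $L$ and $V$, the truncation level and the function name are illustrative assumptions.

```python
import math

def lattice_spacings(L, V, n_terms):
    """Spacings Z_{n-1} - Z_n of the points Z_n = exp(-L (n - 1 + V)), Z_0 = 1."""
    points = [1.0] + [math.exp(-L * (n - 1 + V)) for n in range(1, n_terms + 1)]
    return [points[n - 1] - points[n] for n in range(1, n_terms + 1)]

L, V = 0.7, 0.1                      # hypothetical fixed values of L and V
spacings = lattice_spacings(L, V, n_terms=30)
ranked = sorted(spacings, reverse=True)

# Rank N of the first spacing within the ranked sequence.
rank_of_first = ranked.index(spacings[0]) + 1

# Far out in the ranked sequence consecutive terms have ratio exp(-L),
# so L is a function of the ranked sequence alone.
recovered_L = -math.log(ranked[-1] / ranked[-2])
print(rank_of_first, recovered_L)
```

Because the ranked sequence determines the rank of the first spacing, that rank cannot also behave like an independent size-biased draw given the whole sequence, which is the point of the example.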
But we do not know if this holds for any other choices of $(\alpha, \theta)$. It is easily checked that for $(\alpha, \theta)$ in the range (63) this beta distribution on $(0,1)$ satisfies the necessary Condition 1 for existence of a self-sim$_0$ set generating a $(V_n)$ with this structural distribution. In Corollary 26 we show how to derive pd$(\alpha, \theta)$ from a self-sim$_0$ set for $0 < \alpha < 1$ and $\theta > 0$, but we do not know whether this is possible for $0 < \alpha < 1$ and $-\alpha < \theta \le 0$.

Operations
There are some natural operations related to both random discrete distributions and self-similar random sets, which allow examples to be combined in some interesting ways.
Define the ranked product of two rdd's $(U_n)$ and $(V_n)$ defined on the same probability space to be the rdd $(W_n)$ obtained by ranking the collection of products $\{U_m V_n, m \in \mathbb{N}, n \in \mathbb{N}\}$. As noted in [33], if $\tilde U_1$ is a size-biased pick from $(U_n)$ and $\tilde V_1$ is a size-biased pick from $(V_n)$, and $(\tilde U_1, U_1, U_2, \cdots)$ and $(\tilde V_1, V_1, V_2, \cdots)$ are independent, then $\tilde W_1 := \tilde U_1 \tilde V_1$ is a size-biased pick from the ranked product $(W_n)$ of $(U_n)$ and $(V_n)$. So the set of structural distributions on $(0,1]$ is closed under the multiplicative analog of convolution.
In particular, if $f$ and $g$ are two structural densities then so is $h$ defined by

$$h(w) = \int_w^1 u^{-1} f(u)\, g(w/u)\, du \qquad (0 < w < 1).$$

Let $P \bullet Q$ denote the distribution of the ranked product of a rdd $(P)$, i.e. a rdd with distribution $P$, and an independent rdd $(Q)$. Let str$(P)$ denote the structural distribution on $(0,1]$ of a rdd $(P)$, and let $*$ denote the multiplicative convolution operation on distributions on $(0,1]$. Then the above remarks may be summarized as follows: str$(P \bullet Q)$ = str$(P)$ $*$ str$(Q)$.
Note that the operation $\bullet$ on distributions of rdd's is commutative: $P \bullet Q = Q \bullet P$. Note also that with mild non-degeneracy assumptions on $P$ and $Q$, if $(W_n)$ is a rdd $(P \bullet Q)$ then with probability 1 there are distinct positive integers $(k, \ell, m, n)$ with

$$W_k / W_\ell = W_m / W_n. \tag{66}$$
So, for example, $P \bullet Q$ could not be pd$(\alpha, \theta)$ for any $(\alpha, \theta)$. A more interesting operation on laws of rdd's is the composition operation $\otimes$ defined as follows. Given two laws $P$ and $Q$ for a rdd, let $(U_n)$ be a rdd $(P)$, and, independent of $(U_n)$, let $(V_{mn}, n = 1, 2, \cdots)$, $m = 1, 2, \cdots$ be a sequence of i.i.d. copies of a rdd $(Q)$. Let $P \otimes Q$ be the law of the rdd obtained by ranking the collection of products $\{U_m V_{mn}, m \in \mathbb{N}, n \in \mathbb{N}\}$. It is easily seen that the operation $\otimes$, like $\bullet$, has the property str$(P \otimes Q)$ = str$(P)$ $*$ str$(Q)$. However, except in trivial cases, $P \otimes Q \ne P \bullet Q$. This is clear because mild conditions on $P$ and $Q$ ensure that the probability considered in (66) becomes 0 for $P \otimes Q$ instead of $P \bullet Q$. Indeed, the composition operation $\otimes$ is not even commutative. This is easily seen as follows. Take one of the laws, say $P$, to be the degenerate distribution that assigns probability 1 to the sequence $(1/2, 1/2, 0, \cdots)$, and let $Q$ be the law of any $(V_n)$ such that $V_n$ has a continuous distribution for each $n$ and $V_1 > V_2 > \cdots$ a.s. Then $(W_n)$ governed by $P \otimes Q$ has $W_1 > W_2 > \cdots$ a.s. whereas $(W_n)$ governed by $Q \otimes P$ has $W_1 = W_2 > W_3 = W_4 > \cdots$ a.s. Typically then, $P \bullet Q$, $P \otimes Q$ and $Q \otimes P$ will be three distinct laws for a rdd with the same structural distribution str$(P)$ $*$ str$(Q)$.
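The difference between the ranked product and the composition operation, and the failure of commutativity for the latter, can be checked on finite sequences. The sketch below works with fixed realisations rather than laws: the function names and the numerical sequences are hypothetical, and the two "independent copies" of the second rdd are simply two different hand-picked sequences.

```python
def ranked_product(u, v):
    """Ranked collection of all products u_m * v_n (one realisation of P • Q)."""
    return sorted((um * vn for um in u for vn in v), reverse=True)

def composition(u, v_copies):
    """Ranked collection of products u_m * v_{mn}, with one copy of the
    second rdd attached to each term u_m (one realisation of P ⊗ Q)."""
    if len(v_copies) != len(u):
        raise ValueError("need exactly one copy of the second rdd per term of u")
    return sorted((um * vmn for um, vm in zip(u, v_copies) for vmn in vm),
                  reverse=True)

half = [0.5, 0.5]                 # the degenerate sequence (1/2, 1/2, 0, ...)
q_first, q_second = [0.6, 0.4], [0.7, 0.3]   # two hypothetical draws from Q

p_then_q = composition(half, [q_first, q_second])   # all terms distinct
q_then_p = composition(q_first, [half, half])       # terms come in equal pairs
print(p_then_q)
print(q_then_p)
print(ranked_product(half, q_first))
```

In the first composition each half is split by its own draw from the second law, so ties disappear; in the second every term is split into two equal halves, which forces the pattern of equal pairs.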
A nice illustration of the composition operation is provided by the following result of [34]: for $0 < \alpha < 1$ and $\theta > 0$,

$$\text{pd}(0, \theta) \otimes \text{pd}(\alpha, 0) = \text{pd}(\alpha, \theta). \tag{67}$$

If, as in the above examples, both $P$ and $Q$ can be derived from a self-sim$_0$ set, it is natural to ask whether $P \bullet Q$ and $P \otimes Q$ can be so derived. This is achieved for $\otimes$ by the following construction.

Construction 22
Let $X$ and $Y$ be two random closed subsets of $\mathbb{R}$ of Lebesgue measure 0. Let $((\gamma_n, \delta_n), n \in \mathbb{N})$ be a list of all the component intervals of the complement of $X$. Let $Y_1, Y_2, \cdots$ be a sequence of independent copies of $Y$, independent also of $X$. Let

$$Z := X \cup \bigcup_n \big[ (\gamma_n + Y_n) \cap (\gamma_n, \delta_n) \big].$$

Informally, the new set $Z$ contains all the points of $X$, and, within each component interval of $X^c$, the new set also contains points derived from a copy of $Y$ shifted to start at the left end of the interval. Some basic properties of this construction are stated in the following Proposition, whose proof is straightforward and left to the reader:

Proposition 23 Let $P$ and $Q$ denote the distributions of the rdd's derived from self-sim$_0$ sets $X$ and $Y$ respectively. Let $Z$ be constructed from $X$ and $Y$ as in Construction 22. Then $Z$ is a self-sim$_0$ set, the distribution of the rdd derived from $Z$ is $P \otimes Q$, with structural distribution str$(P \otimes Q)$ = str$(P)$ $*$ str$(Q)$. Moreover, if both $X$ and $Y$ have the strong sampling property (17) then so does $Z$.
A consequence of this proposition, which can also be checked directly, is the following:

Corollary 24
The set of densities on (0, 1) satisfying Condition 1 is closed under multiplicative convolution.
Remark 25 Finite Unions. If $Z_1, \cdots, Z_m$ are $m$ independent self-sim$_0$ sets, it is easily seen that their union $Z$ is also a self-sim$_0$ set. Since $G_1(Z) = \max_i G_1(Z_i)$, we have $A_1(Z) = \min_i A_1(Z_i)$, so Theorem 7 identifies the structural distribution of the rdd derived from $Z$ as the distribution of $\min_i \tilde V_i$ where the $\tilde V_i$ are independent and $\tilde V_i$ is a size-biased pick from the rdd derived from $Z_i$. Let $f_i$ denote the structural density of the rdd derived from $Z_i$, and let $\bar F_i(v) := \int_v^1 f_i(u)\, du$. Then the structural density $f$ derived from $Z$ is

$$f(v) = \sum_{i=1}^m f_i(v) \prod_{j \ne i} \bar F_j(v).$$

So the class of densities satisfying Condition 1 is closed under this operation too.
Note that if X is discrete, e.g. the Poisson process generating pd(0, θ) as in Example 21, and Y is perfect, like the stable (α) set generating pd(α, 0), then laying down shifted copies of Y in the component intervals of X, as in Construction 22, yields a perfect set Z. But if the roles of X and Y are switched, laying down shifted copies of X in the component intervals of Y yields a set that is a.s. neither discrete nor perfect. Certainly, the distributions of the sets Z so obtained are different, but whether or not the derived rdd's have the same distribution is not so obvious.
By combining the identity (67) with the above proposition, we obtain:

Corollary 26 For every $0 < \alpha < 1$ and $\theta > 0$, there exists an a.s. perfect self-sim$_0$ set $Z$ with the strong sampling property such that the rdd derived from $Z$ has pd$(\alpha, \theta)$ distribution.

Open problems
In the setting of Lemma 8, fix $t$ and write simply $N$ for $N_t$ and $V(B)$ for $\sum_{n \in B} V_n(t)/t$ for a subset $B$ of $\mathbb{N}$. Applying Lemma 9 to $X_t = 1(N_t \in B)$ shows that for every subset $B$ of $\mathbb{N}$, $P(N \in B \mid V(B)) = V(B)$. However, as in the discussion around (16) and (17), Example 20 shows that it does not necessarily follow that $P(N \in B \mid V(C), C \subseteq \mathbb{N})$ equals $V(B)$, as it does in Examples 5 and 6. See [36] for some applications of this property in the setting of Example 6. It is natural to ask what additional hypothesis is appropriate for this stronger conclusion to hold in a more general setting, but we do not have an answer to this question. In essence, the problem is the following:

Problem 27 Find a condition that implies the identity (27) for a vector-valued 0-self-similar process $X$.

See [37] for a number of reformulations of (27) and further discussion, including a simple example of an $\mathbb{R}^2$-valued 0-self-similar process $X$ for which (27) fails to hold.
We do not know much about rdd's derived from self-sim$_0$ sets besides the results presented in this paper. Some obvious questions seem very difficult to tackle. For instance:

Problem 28 Is it possible to characterize the set of all possible laws of rdd's that can be derived from a self-sim$_0$ set $Z$?

Less abstractly, given some description of the distribution of a self-sim$_0$ set or perhaps another random closed $Z$, there is the problem of how to describe the distribution of $(V_n)$ derived from $Z$. Several papers in the literature can be viewed as treating instances of this problem for $Z$'s of various special forms [16,29,38]. Problem 30 describes a self-sim$_0$ set $Z$ for which this question remains to be answered. For the random closed subset $Z$ of $(0,1)$ associated with an exchangeable interval partition of $(0,1)$ derived from a rdd as in Berbee [3], Kallenberg [17], it is obvious that the law of $Z$ is uniquely determined by that of the rdd. But if there exists a self-sim$_0$ set $Z$ that generates the rdd, uniqueness of the law of $Z$ is not so obvious:

Problem 29 Given that a rdd with a particular distribution can be derived from some self-sim$_0$ set $Z$, is the distribution of such a $Z$ unique?

We do not even know if there is uniqueness in law for the two most basic examples 5 and 6. To conclude, we pose the following:

Problem 30 Suppose $Z = \exp W$ for $W$ a stationary-regen$_0(F_0)$ with $\int_0^\infty x^{-1} F_0(dx) < \infty$, so $W$ is the set of points in a stationary renewal sequence. Let $\tilde V_n$ be the sequence of spacings between the points of the discrete set $Z$, as defined in (10) and (12). For which $F_0$ is it the case, as it is for $W$ a homogeneous Poisson process, that $(\tilde V_n)$ is a size-biased random permutation of the ranked sequence $(V_n)$?

Example 20 shows that there are discrete self-sim$_0$ sets $Z$ such that $(\tilde V_n)$ does not have the same distribution as a size-biased random permutation of $(V_n)$, despite the identity in distribution of first terms implied by (13). It would be interesting to know if there were any other discrete self-sim$_0$ sets besides $\exp(W)$ for homogeneous Poisson $W$ which had this property. If there were, it would presumably be possible to explicitly describe the joint law of the size-biased sequence $(\tilde V_n)$, and then derive a sampling formula for the corresponding partition structure, as in [34].