CONVERGENCE IN INCOMPLETE MARKET MODELS

The problem of pricing and hedging of contingent claims in incomplete markets has led to the development of various valuation methodologies. This paper examines the mean-variance approach to risk-minimisation and shows that it is robust under the convergence from discrete- to continuous-time market models. This property yields new convergence results for option prices, trading strategies and value processes in incomplete market models. Techniques from nonstandard analysis are used to develop new results for the lifting property of the minimal martingale density and risk-minimising strategies. These are applied to a number of incomplete market models: it is shown that the convergence of the underlying models implies the convergence of strategies and value processes for multinomial models and approximations of the Black-Scholes model by direct discretisation of the price process. The concept of $D^2$-convergence is extended to these classes of models, including the construction of discretisation schemes. This yields new standard convergence results for these models. For ease of reference, a summary of the main results from nonstandard analysis in the context of stochastic analysis is given, as well as a brief introduction to mean-variance hedging and pricing.


1. Introduction
Weak convergence results for option values and their associated trading strategies have so far been established in the context of complete market models, where unambiguous pricing of contingent claims is possible by means of hedging techniques. The results surveyed by Taqqu and Willinger [TW87] and the more recent work on stronger modes of convergence by Cutland, Kopp and Willinger [CKW93a] rely on market completeness.
For incomplete markets, convergence results for option values were obtained by Runggaldier and Schweizer [RS95], and in a more general framework in a recent work by Prigent [Pri97]. Both use the minimal martingale densities introduced by Föllmer and Schweizer [FS91] as their starting point and use the weak convergence of these densities to derive the convergence of option values. However, in neither case is there a discussion of hedging strategies associated with the various optimality criteria, although the formulation of the general mean-variance hedging problem for incomplete market models provides explicit expressions for such strategies under certain additional conditions on the price process. These conditions are satisfied in all the models discussed in the above papers, but it appears that the techniques developed there do not suffice to settle the question of convergence of trading strategies in this context.
In this paper we extend the techniques developed in [CKW93a] to show that the associated discrete-time strategies do indeed converge to their counterparts in the limiting continuous-time model, provided that the limiting model is complete. As in [CKW93a] we employ techniques from nonstandard analysis in proving our results, but are able to state these results in standard terms by means of the stronger concept of $D^2$-convergence developed in [CKW93a]. The necessary nonstandard tools are described briefly in the primer provided in [CKW91]; in this paper we outline new nonstandard results required for the extension of their results to incomplete market models.
2. Mean-Variance Hedging

2.1. Discrete Time.
The calculation of the minimal martingale measures and trading strategies we wish to use in our convergence theory is particularly simple in the discrete case. We review briefly the discrete-time setup, introduced by Schweizer in [Sch88], which will serve us throughout this paper.
Define a discrete time economy on the set $\mathbb{T} = \{0, 1, \ldots, T\}$ ($T \in \mathbb{N}$) of trading dates; the price of some risky asset is given as a stochastic process $S$ on a complete probability space $(\Omega, \mathcal{F}, P)$ with a filtration $\mathbb{F} = (\mathcal{F}_t)_{t\in\mathbb{T}}$; assume that $\mathcal{F}_0$ is trivial (up to $P$-null sets) and that $\mathcal{F}_T = \mathcal{F}$. Without loss of generality we suppose that $S$ already represents the discounted price process with respect to some numeraire (equivalently, the risk-free discount rate is zero).
We also assume that the price process $S$ is $\mathbb{F}$-adapted and $S_t \in L^2(P)$ for $t \in \mathbb{T}$. Define $\Delta X_t = X_{t+1} - X_t$ for any process $X$ on $(\Omega, \mathcal{F}, P)$. Note that, contrary to customary practice, we use the 'forward' increments.
A trading strategy is a pair $\phi = (\theta, \psi)$ of processes, where $\theta$ represents the number of units of the risky asset while $\psi$ represents the amount held in a risk-free account. The usual definitions for the associated value process $V_t(\phi) = \theta_t S_t + \psi_t$, gains process $G_t(\phi) = \sum_{u=0}^{t-1} \theta_u \Delta S_u$ and cost process $C_t(\phi) = V_t(\phi) - G_t(\phi)$ apply. Recall that a strategy $\phi$ is self-financing if and only if $\Delta V(\phi) = \Delta G(\phi)$.
Let $H \in L^2(P)$ be a contingent claim; a strategy $\phi$ is called $H$-admissible if $V_T(\phi) = H$; we then say that $\phi$ generates $H$. A market model is complete if for any claim $H$ there exists a self-financing strategy generating $H$. By contrast, in an incomplete model a claim $H$ cannot in general be generated by a self-financing strategy (note that there always exists an $H$-admissible strategy for any $H \in L^2$: take $\theta \equiv 0$, $\psi_t = H \cdot 1_{\{t=T\}}$). We can therefore only expect to find an $H$-admissible strategy which is optimal with respect to some optimality criterion.

Here we use the notion of local risk-minimisation which was introduced in [Sch88]: the local risk process $r(\phi)$ for a strategy $\phi$ is defined as
$$r_t(\phi) := E\big[(\Delta C_t(\phi))^2 \mid \mathcal{F}_t\big]$$
for $t \in \mathbb{T} \setminus \{T\}$. We are looking for a strategy $\phi$ which minimises this local risk by an appropriate choice of $\theta_t$ and $V_t$ (see [Sch88] for details). Note that $r \equiv 0$ for a self-financing strategy. It can be shown that under the nondegeneracy condition (1) there exists a unique $H$-admissible locally risk-minimising strategy $\phi^H$. This strategy is given explicitly by backward sequential regression: setting $V^H_T := H$ and then defining recursively
$$\theta^H_t := \frac{\mathrm{Cov}(V^H_{t+1}, \Delta S_t \mid \mathcal{F}_t)}{\mathrm{Var}(\Delta S_t \mid \mathcal{F}_t)}, \tag{2}$$
$$V^H_t := E\big[V^H_{t+1} - \theta^H_t \Delta S_t \mid \mathcal{F}_t\big], \tag{3}$$
$$\psi^H_t := V^H_t - \theta^H_t S_t \tag{4}$$
for $t \in \mathbb{T} \setminus \{T\}$. Finally, set $\theta^H_T := 0$ and $\psi^H_T := H$. As defined in (2), $\theta^H_t$ can be viewed as the best linear estimate for $V^H_{t+1}$ based on the information at time $t$, while equations (3) and (4) ensure that $\phi^H$ is mean-self-financing, i.e. the cost process $C(\phi^H)$ is an $\mathbb{F}$-martingale.
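The backward sequential regression can be tried out on a small finite model. The following is a minimal sketch, not the construction used in the paper: it assumes a non-recombining tree in which the price is multiplied at every step by one of finitely many factors, and it takes the regression to be the usual conditional least-squares fit of the next value on the forward price increment. The function name `local_risk_min` and all parameter names are hypothetical.

```python
from itertools import product

def local_risk_min(s0, factors, probs, T, payoff):
    """Backward sequential regression on a non-recombining tree (illustrative).

    A node is the tuple of move indices taken so far; from every node the
    price is multiplied by factors[i] with probability probs[i].
    Returns (V, theta, dC): values and holdings per node, and the cost
    increment on each edge, keyed by the child node.
    """
    moves = range(len(factors))

    def price(node):
        s = s0
        for i in node:
            s *= factors[i]
        return s

    # Terminal condition V_T = H.
    V = {node: payoff(price(node)) for node in product(moves, repeat=T)}
    theta, dC = {}, {}
    for t in range(T - 1, -1, -1):
        for node in product(moves, repeat=t):
            s = price(node)
            dS = [s * f - s for f in factors]        # forward increments of S
            Vn = [V[node + (i,)] for i in moves]     # values one step ahead
            m_dS = sum(p * x for p, x in zip(probs, dS))
            m_V = sum(p * v for p, v in zip(probs, Vn))
            cov = sum(p * (v - m_V) * (x - m_dS)
                      for p, v, x in zip(probs, Vn, dS))
            var = sum(p * (x - m_dS) ** 2 for p, x in zip(probs, dS))
            theta[node] = cov / var                  # conditional regression slope
            V[node] = m_V - theta[node] * m_dS       # E[V_next - theta * dS | node]
            for i in moves:
                # Cost increment: change in value minus trading gains.
                dC[node + (i,)] = Vn[i] - V[node] - theta[node] * dS[i]
    return V, theta, dC
```

With two factors (a binomial tree) every cost increment vanishes, so the strategy is self-financing; with three factors the increments are in general non-zero but have conditional mean zero, which is the mean-self-financing property.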
The locally risk-minimising strategy given by equations (2)-(4) gives rise to a natural decomposition of the contingent claim $H$ (cf. [Sch94a]). We define the Doob decomposition of the price process $S = S_0 + M + A$, where
$$A_t := \sum_{u=0}^{t-1} E[\Delta S_u \mid \mathcal{F}_u], \qquad M_t := S_t - S_0 - A_t.$$
Then $M$ is a martingale and $A$ is a predictable process; both $M$ and $A$ are square-integrable. We can then define the square-integrable martingale $L^H$ by
$$L^H_t := \sum_{u=0}^{t-1} \big(V^H_{u+1} - E[V^H_{u+1} \mid \mathcal{F}_u] - \theta^H_u \Delta M_u\big).$$
It is easy to see that the martingales $M$ and $L^H$ are orthogonal, i.e. their product is a martingale. Note that $\Delta L^H_t = \Delta C_t(\phi^H)$, so that $L^H$ represents the extra cost required by the strategy $\phi^H$. Using (3) we have
$$\Delta V^H_t = \theta^H_t \Delta S_t + \Delta L^H_t,$$
so we have obtained a decomposition
$$H = V^H_0 + \sum_{t=0}^{T-1} \theta^H_t \Delta S_t + L^H_T \tag{5}$$
where $L^H$ is a square-integrable martingale orthogonal to the martingale part of $S$. Equation (5) is the discrete-time version of the Föllmer-Schweizer decomposition (see below); in the case where the price process $S$ is already a martingale under $P$ the decomposition (5) simplifies to the (discrete-time) Kunita-Watanabe decomposition of $H$.

2.2. Continuous Time.
In continuous time the set of trading dates is $\mathbb{T} = [0, T]$, and $\mathbb{F} = (\mathcal{F}_t)_{t\in\mathbb{T}}$ is a filtration on some complete probability space $(\Omega, \mathcal{F}, P)$ which satisfies the usual conditions (right-continuity and completeness) with $\mathcal{F}_0$ trivial and $\mathcal{F}_T = \mathcal{F}$. The price process is given by a semimartingale $S \in \mathcal{S}^2(P)$ with Doob-Meyer decomposition $S = S_0 + A + M$ where $M$ is a square-integrable martingale and $A$ is a predictable process with square-integrable variation (see [Ell82] for definitions). We also assume that $A$ is absolutely continuous with respect to $\langle M \rangle$; this is a nondegeneracy condition akin to (1) above. A trading strategy $\phi$ is a pair $(\theta, \psi)$ with $\theta$ predictable and $\psi$ adapted, satisfying suitable integrability conditions. We can then define the gains process $G(\phi) := \int \theta\, dS$ and the cost process $C(\phi) := V(\phi) - G(\phi)$ analogous to the discrete-time situation. The notion of local risk-minimisation for the generating strategy of a contingent claim $H$ as defined above can be transferred to this setting, although this involves sophisticated results on the differentiation of semimartingales (see [Sch90, Sch91]). It can be shown, however, that the existence of a locally risk-minimising strategy is equivalent to the existence of a Föllmer-Schweizer decomposition
$$H = V_0 + \int_0^T \theta^H_u\, dS_u + L^H_T \tag{6}$$
where $V_0 \in \mathbb{R}$, $\theta^H \in L^2(S)$ and $L^H$ is a square-integrable martingale orthogonal to $M$. The question of existence and uniqueness of the decomposition (6) for a general claim $H$ was settled by Monat and Stricker [MS95].
Note that our market model is complete if $L^H \equiv 0$ for any $H \in L^2(P)$.

2.3. Minimal Martingale Measures.
We wish to express the initial value $V^H_0$ associated with the locally risk-minimising trading strategy $\phi^H$ as the expectation of the claim $H$ under a signed martingale measure $\hat P$ for the price process $S$: the minimal martingale measure for $S$ (cf. [Sch94b]). This measure will be described by its density relative to $P$. In the discrete-time setting the density is given as the final value (i.e. the value at time $T$) of the process
$$\hat Z_t := \prod_{u=0}^{t-1} (1 - \alpha_u \Delta M_u), \qquad \alpha_u := \frac{\Delta A_u}{E[(\Delta M_u)^2 \mid \mathcal{F}_u]}. \tag{8}$$
The density $\frac{d\hat P}{dP} = \hat Z_T \in L^2(P)$ defines a signed measure $\hat P$ under which $S$ is a martingale and
$$V^H_0 = E[\hat Z_T H].$$
In the continuous-time case we concentrate on conditions which ensure that $\hat P$ is a true measure equivalent to $P$ (thus coinciding with $P$ on $\mathcal{F}_0$) such that any $L^2(P)$-martingale orthogonal to $M$ remains a martingale under $\hat P$. In this case the density for the minimal martingale measure is defined via the stochastic exponential
$$\hat z_t := \mathcal{E}\Big(-\int \alpha\, dM\Big)_t$$
where $\alpha$ is the density of $A$ with respect to $\langle M \rangle$ in the Doob-Meyer decomposition of $S$. It is then shown in [MR97, Chapter 26] that, under the assumption that $\alpha_t (M_t - M_{t-}) < 1$ for all $t \in \mathbb{T}$, the integrability condition $\hat z_t \in L^2(P)$ is equivalent to the existence of a unique minimal martingale measure $\hat P$, whose density relative to $P$ is given by $\hat z_T$.
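To see why the discrete-time minimal martingale measure is in general only a signed measure, one can compute a single factor of the density on a one-step model. The sketch below assumes the standard one-step form $1 - \alpha\,\Delta M$ with $\alpha = \Delta A / E[(\Delta M)^2]$; the function name `minimal_density_step` and the numerical inputs are hypothetical.

```python
def minimal_density_step(dS, probs):
    """One-step factor 1 - alpha * dM of a discrete minimal martingale density.

    dS    : possible price increments over one step
    probs : their probabilities under P
    """
    dA = sum(p * x for p, x in zip(probs, dS))        # predictable part E[dS]
    dM = [x - dA for x in dS]                         # martingale part of the increment
    var = sum(p * m * m for p, m in zip(probs, dM))   # E[dM^2]
    alpha = dA / var
    return [1.0 - alpha * m for m in dM]

# Moderate drift: all weights stay positive.
z_pos = minimal_density_step([10.0, 0.0, -5.0], [0.3, 0.4, 0.3])

# Large drift relative to the variance: one weight becomes negative.
z_signed = minimal_density_step([30.0, 1.0, 0.0], [0.1, 0.45, 0.45])
```

In both cases the weights integrate to one and make the increment of $S$ centred; only in the first case do they define a probability measure.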

3. Nonstandard martingales
To compare the measures defined by these densities, we generalize the results of Section 3.1 of [CKW91], where techniques from nonstandard analysis were used to provide alternative derivations of Black-Scholes option pricing theory. We refer to Section 2 of [CKW91] for a primer in hyperfinite probability theory and the basic definitions needed for a nonstandard description of Brownian motion and related concepts. However, we shall need to extend the setup somewhat in order to allow consideration of arbitrary càdlàg processes rather than restrict the discussion to path-continuous processes as in [CKW91]. Two fundamental references for these developments are [AFHL86] and [HP83]; to set the scene for these extensions we recall (largely following [Cut00]) the principal ideas and definitions here.
3.1. The Nonstandard Universe.
We shall assume given (as in [CKW91]) a fixed nonstandard extension $^*\mathbb{R}$ of the real line $\mathbb{R}$. The extension $^*\mathbb{R}$ includes non-zero 'infinitesimals' ($x \in {}^*\mathbb{R}$ satisfying $|x| < \varepsilon$ for all $\varepsilon > 0$ in $\mathbb{R}$) and their 'infinite' multiplicative inverses. The extension itself is not unique: $^*\mathbb{R}$ can, for example, be defined as an ultrapower $\mathbb{R}^{\mathbb{N}}/\mathcal{U}$ of the reals by any non-principal ultrafilter $\mathcal{U}$ on $\mathbb{N}$ (i.e. a collection of subsets of $\mathbb{N}$ closed under intersections and supersets, containing no finite sets, and such that for any $A \subset \mathbb{N}$, either $A$ or $\mathbb{N} \setminus A$ belongs to $\mathcal{U}$). The existence of such ultrafilters follows from the Axiom of Choice. What is important here is that the arithmetical and order operations valid in $\mathbb{R}$ extend to $^*\mathbb{R}$: the tuple $({}^*\mathbb{R}, +, \times, <)$ is an ordered field.
We may view all mathematical objects as sets. For any set $S$ the superstructure $V(S)$ over $S$ is defined as $V(S) = \bigcup_{n\in\mathbb{N}} V_n(S)$, where $V_0(S) = S$ and $V_{n+1}(S) = V_n(S) \cup \mathcal{P}(V_n(S))$ ($n \in \mathbb{N}$).
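The recursion defining the levels of a superstructure can be made concrete on a toy example. The sketch below is purely illustrative: it replaces the base set by a small finite set of atoms (the construction in the text starts from the reals, which cannot be enumerated), and the helper names `powerset` and `superstructure_level` are hypothetical.

```python
from itertools import chain, combinations

def powerset(xs):
    """All subsets of xs, each as a frozenset so that it can itself be an element."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return {frozenset(c) for c in subsets}

def superstructure_level(atoms, n):
    """Level n of the recursion: level 0 is the atom set, level k+1 = level k united with its power set."""
    level = set(atoms)
    for _ in range(n):
        level = level | powerset(level)
    return level
```

Starting from a single atom, the levels grow very quickly, which is why the sketch is only usable for the first few values of `n`.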
This construction can be applied in turn to $\mathbb{R}$ and to $^*\mathbb{R}$. Their superstructures $V = V(\mathbb{R})$ and $V({}^*\mathbb{R})$ are then connected by a map $* : V(\mathbb{R}) \to V({}^*\mathbb{R})$ which associates with each object $M$ in $V(\mathbb{R})$ its nonstandard extension $^*M$ in $V({}^*\mathbb{R})$. The nonstandard universe (whose members are known as the internal sets) is then simply given by
$$^*V = \{x : x \in {}^*M \text{ for some } M \in V\}.$$
The Transfer Principle states that any bounded quantifier statement holds in $V$ iff it holds in $^*V$.
(A bounded quantifier statement is a mathematical statement which can be written to ensure that all quantifiers range over a prescribed set. This includes most statements used in practice.) This result enables us to 'switch' from the 'standard world' V to internal objects (elements of * V) and back again: in proofs we can therefore 'translate' a statement into the language of internal sets, manipulate it within * V and then (hope to) translate the results into the context of V. For the finite part {x ∈ * R : |x| < r for some r ∈ R} of the set * R of hyperreals the important (topological) connection is made very simply via the Standard Part Theorem, which asserts that each finite x ∈ * R can be written uniquely as x = r + δ for some r ∈ R and an infinitesimal δ. We write x ≈ y if x − y is infinitesimal. Associated with (Ω, F , P ) is a standard measure space, its Loeb space (Ω, F L , P L ), and Loeb-measurable functions f : Ω → R are those which have liftings to internal F−measurable functions F : Ω → * R. The interplay between these (hyper-) discrete and (standard) continuous-time formulations enables us to derive new convergence results, generalizing those of [CKW93a].

Internal Probability Spaces and Loeb
Briefly, the Loeb measure construction proceeds as follows: the internal map $P : \mathcal{F} \to {}^*[0, 1]$ is finitely additive, and we may define the map $^\circ P : \mathcal{F} \to [0, 1]$ by $^\circ P(A) := \mathrm{st}(P(A))$. This makes $(\Omega, \mathcal{F}, {}^\circ P)$ into a (standard) finitely additive measure space. By the Carathéodory extension theorem $^\circ P$ will have a unique σ-additive extension to the σ-algebra $\sigma(\mathcal{F})$ provided that $^\circ P$ is σ-additive on $\mathcal{F}$. But this is a consequence of the $\aleph_1$-saturation of $^*V$: this property, which is shared by any countable ultrapower, states that for any countable decreasing family $(A_m)_{m\in\mathbb{N}}$ of non-empty internal sets we have $\bigcap_{m\in\mathbb{N}} A_m \neq \emptyset$.
Hence the completion P L of • P , defined on the completion F L of σ(F ), defines the Loeb space (Ω, F L , P L ) as a standard probability space. A key property of B ∈ F L is the existence of a set A ∈ F such that the symmetric difference A∆B is P L -null.
Example 3.1. Lebesgue measure on $[0, 1]$ is the Loeb measure of counting measure $\Lambda$ on $\mathbb{T}$ (with $T = 1$) as defined above. The Lebesgue σ-algebra consists of sets $B \subset [0, 1]$ for which $\mathrm{st}^{-1}(B)$ is Loeb measurable.

Example 3.2. Wiener measure $W$ on $C = C_0(0, T)$ can be defined similarly as the Loeb measure associated with counting probability $W_N$ on the set $C_N$ of all polygonal paths $(B_t)_{t\in\mathbb{T}}$ formed by joining the points of $\mathbb{T}$ linearly and satisfying $B_0 = 0$ and $\Delta B_t = B_{t+\Delta t} - B_t = \pm\sqrt{\Delta t}$. Wiener space is then the Loeb space associated with the internal probability space $(C_N, {}^*\mathcal{P}(C_N), W_N)$. More precisely: write $\mathcal{F}_N$ for the completion of $\sigma({}^*\mathcal{P}(C_N))$,

for almost all $(t, \omega)$ in $\mathbb{T} \times \Omega$. We call $F$ a nonanticipating lifting of $f$. Such liftings exist for Itô integrals:

These ideas extend to the various functionals of special semimartingales used in continuous-time option pricing models, so that we can approximate these by appropriate 'discrete' hyperfinite counterparts in order to establish convergence results for option prices and the associated optimal trading strategies. The analogues in this context of the usual path-regularity properties of martingales are discussed exhaustively in [HP83], and we shall not repeat their technical results here. The concept of SDJ-functions is crucial: in essence this describes conditions under which an internal function $F : \mathbb{T} \to {}^*\mathbb{R}$ has a càdlàg standard part:

This definition is compatible with standard parts defined in the Skorohod topology on the space $^*D[0, T]$ of $*$-càdlàg functions. For this we need to demand that SDJ-functions should have at most one noninfinitesimal jump in each monad ('infinitesimal neighborhood') in $\mathbb{T}$ and be S-continuous at 0, i.e. $t \approx 0$ should imply that $F(t) \approx 0$ (see [HP83] for details). For an internal stochastic process $X$ : a. paths $t \mapsto X_t(\omega)$ have this property.
The internal probability space $(\Omega, \mathcal{F}, P)$ becomes an internal filtered space if it is endowed with an increasing internal sequence of algebras $\mathcal{A} = (\mathcal{A}_t)_{t\in\mathbb{T}}$ on $\Omega$, and the process $X$ is nonanticipating

Defining the standard part $\mathrm{st}(M)$ as above we obtain a (standard) $L^2(P_L)$-martingale with respect to the filtration

Stochastic integrals relative to $M$ can again be defined as hyperfinite Stieltjes integrals by setting, for internal $\Theta : \Omega \times \mathbb{T} \to {}^*\mathbb{R}$, $(\sum \Theta \Delta M)_t = \sum_{s<t} \Theta_s \Delta M_s$ for $t \in \mathbb{T}$. For this 'integral' to have a standard counterpart we need to introduce the internal Doléans measure $\nu_M$ on $\Omega \times \mathbb{T}$ endowed with the internal algebra $\mathcal{A}_{\Omega\times\mathbb{T}}$ generated by sets

The most useful class of integrands is the class $SL^2(\nu_M)$ of nonanticipating processes $X$ for which $X^2$ is S-integrable relative to $\nu_M$ on $\Omega \times \mathbb{T}$. The following results, taken largely from [HP83], relate the nonstandard and standard formulations to each other:

Theorem 3.6. Let $M$ be an $SL^2$-martingale of class SDJ. Let $m := \mathrm{st}(M)$ be its standard part and assume that $\theta \in L^2(\nu_m)$. Then $\theta$ has a 2-lifting $\Theta$ (i.e. an internal function $\Theta$ :

3.4. $D^2$-Convergence of Wiener Functionals.
Using nonstandard methods, the papers [CKW93a, CKW95] introduced a mode of convergence for functionals on Wiener space which is stable under stochastic differentiation and integration. This was motivated by the wish to consider convergence of a sequence of contingent claims in Cox-Ross-Rubinstein option pricing models (based upon simple random walks) together with convergence of their generating strategies and value processes. Thus the desired mode of convergence would need to be stable under both stochastic differentiation and stochastic integration, which weak convergence is not.
The appropriate mode of convergence, called $D^2$-convergence in [CKW93a] (for reasons which will become clear below), can be formulated for arbitrary separable metric spaces and is stated very simply in terms of liftings:

Theorem 3.7. Let $Y$ be a separable metric space with a Borel probability $\mu$, and suppose that $(\mu_n)$ is a family of probabilities on Borel sets $Y_n \subset Y$ converging weakly to $\mu$.
If $F_n : Y_n \to \mathbb{R}$ is a family of measurable functions and $f : Y \to \mathbb{R}$ is measurable then the following are equivalent: (ii) $(F_n(y), y) \to (f(y), y)$ weakly as $n \to \infty$. This means that the distribution of $(F_n(y), y) \in \mathbb{R} \times Y_n$ with $y$ distributed according to $\mu_n$ converges to the distribution of $(f(y), y) \in \mathbb{R} \times Y$ with $y$ distributed under $\mu$.
Example 3.8. This result is applied in [CKW95] to Wiener space: the associated discrete spaces are taken as $(C_n, \mathcal{F}_n, \mathcal{A}_n, W_n)$, where $C_n$ is the set of paths of a simple random walk based on $\mathbb{T}_n$ of step size $\pm\sqrt{\Delta_n}$ (with $\Delta_n = \frac{1}{n}$) and where we interpolate linearly between points of $\mathbb{T}_n$. Write $B_n(t, X) = X(t)$ for $X \in C_n$; then $B_n(X)$ denotes the path $(B_n(t, X))_{t\in\mathbb{T}_n}$ and $\Delta B_n(t, X) = \Delta X(t)$. The filtration $\mathcal{A}_n$ is generated by $\{B_n(s, \cdot) : s \le t\}$, and $W_n$ is counting probability on $C_n$.
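The discrete path space with its counting probability is small enough to enumerate for modest $n$. The following sketch lists all random-walk paths with steps $\pm\sqrt{\Delta_n}$ at the grid points only (the linear interpolation is omitted) and computes the first two moments of the terminal value under counting probability; the function names are hypothetical.

```python
from itertools import product
from math import sqrt

def random_walk_paths(n, T=1.0):
    """Yield every path of the simple random walk with n steps of size +-sqrt(T/n)."""
    step = sqrt(T / n)
    for signs in product((-1, 1), repeat=n):
        path = [0.0]
        for s in signs:
            path.append(path[-1] + s * step)
        yield path

def terminal_moments(n, T=1.0):
    """Mean and variance of the terminal value under counting probability."""
    paths = list(random_walk_paths(n, T))
    w = 1.0 / len(paths)                       # every path has the same weight
    mean = sum(w * p[-1] for p in paths)
    var = sum(w * p[-1] ** 2 for p in paths) - mean ** 2
    return mean, var
```

For every $n$ the terminal value has mean zero and variance $T$, matching the corresponding moments of Brownian motion at time $T$; the enumeration has $2^n$ paths, so it is only practical for small $n$.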
Thus, given a sequence of measurable functions F n : C n → R and a measurable function f : C → R, we have: F N is an SL 2 -lifting of f for all infinite N iff the sequence (F n ) converges to f 'weakly along the graphs' in the sense of Theorem 3.7.
Example 3.9. Convergence of stochastic processes can be handled similarly: discretise the time interval $[0,1]$ as $\mathbb{T}_n = \{0, \Delta_n, 2\Delta_n, \ldots, 1\}$ and use the counting probability $\Lambda_n$ on $\mathbb{T}_n$ and Lebesgue measure $\lambda$ on $[0,1]$. Given $\phi \in L^2(\lambda \times W)$ and $\Phi_n : \mathbb{T}_n \times C_n \to \mathbb{R}$ for $n \ge 1$, we then have:

Specializing to the convergence of claims and strategies based upon a Black-Scholes model, a second standard interpretation of the definition links it to $L^2$-convergence by factoring the functions through a discretisation of the path space; this explains the terminology (here $Q$ denotes the equivalent martingale measure in the Black-Scholes model):

(here $\|\cdot\|$ denotes the supremum norm on $C$). The existence of an adapted $Q$-discretisation scheme for Wiener space is shown in [CKW93a]. This leads to the following extension of Theorem 3.7 and definition of $D^2$-convergence:

Theorem 3.11. Let $(H_n)_{n\in\mathbb{N}}$ with $H_n : \Omega_n \to \mathbb{R}$ be a sequence of claims and let $h \in L^2(Q)$. Then the following are equivalent: , $\omega$ weakly and $E_{Q_n}[H_n^2] \to E_Q[h^2]$.

Definition 3.12. Let $(H_n)_{n\in\mathbb{N}}$ and $h$ be claims as in Theorem 3.11. We say that $H_n$ $D^2$-converges to $h$ if any of the equivalent conditions (i)-(iii) in Theorem 3.11 hold. We then write $H_n \xrightarrow{D^2} h$.

4. The Minimal Martingale Measure
We saw in 2.3 that for a semimartingale $s$ with Doob-Meyer decomposition
$$s = s_0 + \int \alpha\, d\langle m \rangle + m \tag{10}$$
the density for the minimal martingale measure is defined via the stochastic exponential
$$\hat z = \mathcal{E}\Big(-\int \alpha\, dm\Big). \tag{11}$$
In discrete time the density is given by the process
$$\hat Z_t = \prod_{u<t} (1 - \alpha_u \Delta M_u) \tag{12}$$
(cf. equation (8)). Proposition 4.1 below will help to establish the connection between (11) and (12); here $\mathcal{E}(x)$ denotes the stochastic exponential of $x$.
Proof. The proof of this proposition uses essentially the same techniques as the proof of Lemma 3.1 in [CKW91]. The only additional technical result that is required here is the approximation of the 'pure jump' part of a càdlàg function by an internal function. This can be achieved pathwise by approximating the (at most countably many) jumps of $x(\omega)$ by points in $\mathbb{T}$ and employing Countable Comprehension to extend this to an internal subset; a complete proof is given in the Appendix.

Proof. It follows from Theorems 3.5 and 3.6 that $\sum A \Delta M$ is a martingale of class SDJ and $\mathrm{st}(\sum A \Delta M) = \int a\, dm$. The result then follows from Proposition 4.1.
Remark 4.3. It will be shown in Section 6 how Corollary 4.2 implies convergence results for option prices in a sequence of discrete-time models. Similar results are obtained in [Pri97]; however the above formulation was derived independently and makes use of different techniques. In [Pri97] the notion of uniform tightness of a sequence of martingales is used together with the results in [JMP89] to obtain the weak convergence of the sequence of associated stochastic integrals (see also [DP92]).
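The link between a discrete product of the form $\prod(1 + \Delta X)$ and a stochastic exponential can be illustrated numerically in the simplest path-continuous case. The sketch below assumes a scaled random walk with increments $\pm a\sqrt{\Delta t}$, for which the limiting exponential along a path is $\exp(x_T - \tfrac12 a^2 T)$; it is an illustration under these assumptions and does not cover the jump case treated in the text. The function names are hypothetical.

```python
from math import exp, sqrt

def discrete_exponential(increments):
    """Product of (1 + dX) over the increments of a discrete path."""
    z = 1.0
    for dx in increments:
        z *= 1.0 + dx
    return z

def compare_exponentials(a, n, ups, T=1.0):
    """Discrete product versus exp(x_T - a^2 T / 2) along one random-walk path.

    The path has `ups` upward steps and n - ups downward steps of size a*sqrt(T/n);
    the order of the steps does not affect either quantity.
    """
    h = a * sqrt(T / n)
    increments = [h] * ups + [-h] * (n - ups)
    x_T = sum(increments)
    return discrete_exponential(increments), exp(x_T - 0.5 * a * a * T)
```

Keeping the terminal value of the path fixed while refining the grid, the two numbers move closer together, which is the finite-$n$ shadow of the lifting statement.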

5. Trading Strategies and Value Processes
In this section we present a general result (extending Theorem 3.5 in [CKW95]) which will be used to relate the optimal trading strategies for discrete incomplete market models to those in complete continuous time models. We will illustrate these results in the next section when we consider (incomplete) discrete time approximations of the Black-Scholes model.
Suppose the process S : Ω×T → * R is an internal SL 2 (P )-martingale of class SDJ. As usual we denote the internal Doléans measure of S by ν S . Let s := st(S) be the standard part of S, so that s : Ω × [0, T ] → R is an L 2 (P L )-martingale.
We make the following assumptions (S1) and (S2) on the processes S and s. In the next section we will introduce a class of models for which these are satisfied. Furthermore, note that (S1) and (S2) are satisfied for the models considered in [CKW91,CKW93b].
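(S2) Any $h \in L^2(P_L)$ can be represented as
$$h = v_0 + \int_0^T \theta\, ds \tag{14}$$
with $v_0 \in \mathbb{R}$ and $\theta \in L^2(\nu_s)$.

Now let $h \in L^2(P_L)$ and $v_0$, $\theta$ be defined as in (14). Define processes $g, v, \psi : \Omega \times [0, T] \to \mathbb{R}$ by
$$g := \int \theta\, ds, \qquad v := v_0 + g, \qquad \psi := v - \theta s.$$
Let $H \in SL^2(P)$ be a lifting of $h$. We want to obtain an internal decomposition of $H$ of the form
$$H = V_0 + \sum_{t<T} \Theta_t \Delta S_t + L_T \tag{16}$$
with $V_0 \in {}^*\mathbb{R}$ and $L$ an internal martingale orthogonal to $S$. We can obtain this decomposition by the sequential regression described in Section 2: let $V_T := H$ and, for $t < T$,
$$\Theta_t := \frac{E[V_{t+1} \Delta S_t \mid \mathcal{A}_t]}{E[(\Delta S_t)^2 \mid \mathcal{A}_t]}, \qquad V_t := E[V_{t+1} \mid \mathcal{A}_t].$$
The internal martingale $L$ defined by $L_0 := 0$ and $\Delta L_t := V_{t+1} - V_t - \Theta_t \Delta S_t$ is orthogonal to $S$, since
$$E[\Delta L_t \Delta S_t \mid \mathcal{A}_t] = E[V_{t+1} \Delta S_t \mid \mathcal{A}_t] - V_t E[\Delta S_t \mid \mathcal{A}_t] - \Theta_t E[(\Delta S_t)^2 \mid \mathcal{A}_t] = 0$$
(where the last equality uses the martingale property of $S$ and the definition of $\Theta_t$). Then
$$H = V_T = V_0 + \sum_{t<T} \Theta_t \Delta S_t + L_T.$$
Finally, we define internal processes $G, \Psi : \Omega \times \mathbb{T} \to {}^*\mathbb{R}$ by
$$G_t := \sum_{u<t} \Theta_u \Delta S_u, \qquad \Psi := V - \Theta S.$$
We need the following lemmata. The first characterizes orthogonality of martingales in terms of orthogonality with respect to the space generated by stochastic integrals. The following two give a simple criterion for an internal martingale to be infinitesimal.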
For the converse note that the process is a martingale.
Lemma 5.2. Let X : Ω × T → * R be an S-continuous process and P L (X t ≈ 0) = 1 for each t ∈ T. Then P L (X t ≈ 0 for all t ∈ T) = 1.
Proof. This is a nonstandard version of the standard result that two continuous processes which are versions of each other are already indistinguishable. Then P L (D) = 0 and P L (Ω \ D) = 1. For fixed ω ∈Ω \ D the path X · (ω) is S-continuous and X t (ω) ≈ 0 for all t ∈ S. Let x := st(X(ω)), so x is a continuous function which is zero on Q ∩[0, T ]. Hence X t (ω) ≈ 0 for all t ∈ T.
Proof. Let [X] be the internal quadratic variation of X. We know that is an increasing process, hence the path [X] · (ω) is S-continuous for P L -a.a. ω. Therefore X is S-continuous.

Since [X] is an increasing process we have
Hence, for fixed $t \in \mathbb{T}$, $X_t \approx 0$ $P_L$-a.s. Lemma 5.2 then implies that $X_t \approx 0$ for all $t \in \mathbb{T}$, $P_L$-a.s.
We can now prove the main result of this section: The internal process Therefore the final term in (17) disappears and Hence Θ is also a 2-lifting of θ with respect to ν S and Θ ∈ SL 2 (ν S ). By assumption (S1) this is equivalent to ΘS being a SL 2 (P × Λ)-lifting of θs. Furthermore, G is an SL 2 -martingale of class SDJ and hence st(G) = st Θ∆S = θds = g, P L -a.s..
As E[L 2 T ] ≈ 0 it follows from Lemma 5.3 that the paths of st(L) are constant zero, P L -a.s., hence Finally, this implies that Ψ = V − ΘS is an SL 2 -lifting of ψ = v − θs.
Remark 5.5. In the language of mathematical finance Theorem 5.4 shows that the lifting property of a claim $H$ implies the lifting property of the associated locally risk-minimising (or variance-optimal) strategy and its value and gains process.
Theorem 5.4 therefore includes the results of [CKW91, Theorem 3.5] and [CKW93b, Theorem 4.1] as special cases. It should be noted however that the models in [CKW91, CKW93b] are internally complete. It is therefore possible to obtain an internally self-financing trading strategy generating the claim $H$, so that equation (16) is automatically satisfied: for self-financing strategies we have $V = V_0 + G$.
The crucial point in Theorem 5.4 is that the additional cost process L is infinitesimal, so that the internal strategies here are self-financing 'in the limit'. This last statement will be made precise in the next section.
We summarize the results of this section in the following theorem which is a generalisation of the main result in [CKW93a]: Theorem 5.6. With the above notation and assumptions (S1) and (S2) the following are equivalent:

6. Applications
In this section we apply the results of Section 5 to two alternative approximations of the BS model. These models were introduced in [MV96] and [RS95] where convergence results for certain classes of contingent claims have been obtained. Using our results we can extend these to a wider class of claims and, more importantly, to the associated trading strategies and value processes.
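Remark 6.1. It will be useful later to have some notation for the binomial model on $\mathbb{T}^\beta_n$: let $\Omega^\beta_n := \{-1, +1\}^{\mathbb{T}^\beta_n \setminus \{T\}}$ and $Q^\beta_n$ be the measure on $(\Omega^\beta_n, \mathcal{P}(\Omega^\beta_n))$ given by the binomial probabilities $q$ and $1 - q$. Denote by $W^\beta_n$ the binomial random walk on $\mathbb{T}^\beta_n$ with step size $\pm\sqrt{\Delta_n t/\beta}$; the price process $S^\beta_n$ is given by
$$S^\beta_{n,t+\Delta_n t/\beta} := S^\beta_{n,t}\Big(1 + \mu \frac{\Delta_n t}{\beta} + \sigma \Delta W^\beta_{n,t}\Big)$$
where $\Delta W^\beta_{n,t} := W^\beta_{n,t+\Delta_n t/\beta} - W^\beta_{n,t}$.
We want to specify a probability measure $Q_n$ on $\Omega_n$ such that the price process
$$S_{n,t+\Delta_n t} := S_{n,t} \cdot u^{\omega_t} d^{\beta-\omega_t}, \qquad S_{n,0} := s_0 \tag{19}$$
with $u := 1 + \mu \frac{\Delta_n t}{\beta} + \sigma\sqrt{\Delta_n t/\beta}$ and $d := 1 + \mu \frac{\Delta_n t}{\beta} - \sigma\sqrt{\Delta_n t/\beta}$ is a martingale under $Q_n$. One possible choice would be the minimal martingale measure as defined in Section 2.3; however, we choose the unique martingale measure for the price process on $\mathbb{T}^\beta_n$ given by the probabilities $q := \frac{1-d}{u-d}$ and $1 - q$ for an 'up-' or 'down-movement' between times in $\mathbb{T}^\beta_n$. We then define the measure $Q_n$ on $(\Omega_n, \mathcal{F}_n)$ by
$$Q_n(\omega_t = k) := \binom{\beta}{k} q^k (1-q)^{\beta-k}, \qquad k = 0, \ldots, \beta,$$
for $\omega = (\omega_0, \ldots, \omega_{T-\Delta_n t}) \in \Omega_n$ and $(\omega_t)_{t\in\mathbb{T}_n\setminus\{T\}}$ independent. We denote the expectation with respect to $Q_n$ by $E_n[\,\cdot\,]$. A filtration $\mathcal{A}_n = (\mathcal{A}_{n,t})_{t\in\mathbb{T}_n}$ is again generated by the multinomial random walk $W_n : \Omega_n \times \mathbb{T}_n \to \mathbb{R}$ with $\Delta W_{n,t} := (-\beta + 2\omega_t)\sqrt{\Delta_n t/\beta}$, $W_{n,0} := 0$.
Since $E_n[\Delta W_{n,t} \mid \mathcal{A}_{n,t}] = -\frac{\mu}{\sigma}\Delta_n t$, $W_n$ is not a martingale under $Q_n$. We therefore define an adjusted multinomial process $\tilde W_n : \Omega_n \times \mathbb{T}_n \to \mathbb{R}$ with $\Delta \tilde W_{n,t} := \Delta W_{n,t} + \frac{\mu}{\sigma}\Delta_n t$, $\tilde W_{n,0} := 0$, so that
$$E_n[\Delta \tilde W_{n,t} \mid \mathcal{A}_{n,t}] = 0. \tag{20}$$
The $n$-th $(\beta+1)$-nomial CRR model satisfies the assumptions of Section 2.1, so that any claim $H : \Omega_n \to \mathbb{R}$ in this model can be replicated by a mean-self-financing strategy $\Phi^H = (\Theta^H, \Psi^H)$ which is risk-minimising since $S_n$ is a martingale.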
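The one-step quantities of the $(\beta+1)$-nomial model are easy to tabulate. The sketch below assumes that $\omega_t$ counts the up-moves among $\beta$ binomial sub-steps and is therefore binomially distributed with parameter $q = (1-d)/(u-d)$; the function name `multinomial_crr_step` and the parameter values used in the checks are hypothetical.

```python
from math import comb, sqrt

def multinomial_crr_step(mu, sigma, dt, beta):
    """One-step weights, price factors and walk increments of a (beta+1)-nomial step.

    Outcome k = number of up-moves among the beta binomial sub-steps of length dt/beta.
    """
    h = dt / beta
    u = 1 + mu * h + sigma * sqrt(h)
    d = 1 + mu * h - sigma * sqrt(h)
    q = (1 - d) / (u - d)                     # binomial martingale probability
    ks = range(beta + 1)
    weights = [comb(beta, k) * q ** k * (1 - q) ** (beta - k) for k in ks]
    factors = [u ** k * d ** (beta - k) for k in ks]
    dW = [(-beta + 2 * k) * sqrt(h) for k in ks]
    return weights, factors, dW
```

Under these weights the price factor has mean one, the walk increment has mean $-\frac{\mu}{\sigma}\Delta_n t$, and adding $\frac{\mu}{\sigma}\Delta_n t$ to the increment centres it.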
The Hyperfinite Version.
For any infinite $N \in {}^*\mathbb{N} \setminus \mathbb{N}$ we have an internal $(\beta+1)$-nomial CRR model on the hyperfinite filtered probability space $(\Omega_N, \mathcal{F}_N, \mathcal{A}_N, Q_N)$ with associated Loeb space $(\Omega_N, L(\mathcal{F}_N), L(Q_N))$.
Proof. This is analogous to the proof of Theorem 3.3.5 in [AFHL86].
By Lemma 6.2(i) $W_{N,t}$ is finite, $L(Q_N)$-a.s. It then follows from (19) and the proof of Lemma 3.1(a) in [CKW91] that, $L(Q_N)$-a.s.,

for all $t \in \mathbb{T}_N$. Hence, $S_N$ is S-continuous and

Since $\tilde w$ is a standard Brownian motion, $s$ is indeed the price process in a Black-Scholes model under the unique martingale measure $Q := L(Q_N)$. Hence each claim $h \in L^2(Q)$ can be represented as
$$h = v^h_0 + \int_0^T \theta^h\, ds$$
for some $\theta^h \in L^2(\nu_s)$ and $v^h_0 \in \mathbb{R}$. We now verify assumption (S1) of Section 5 by employing the following lemma:

For any nonanticipating process $\Theta$ and any $\theta \in L^2(\nu_s)$ we then have

Proof. The internal algebra $\mathcal{A}_{\Omega_N\times\Lambda_N}$ on $\Omega_N \times \mathbb{T}_N$ is generated by the sets $\{A \times \{t\} : t \in \mathbb{T}_N, A \in \mathcal{A}_{N,t}\}$ (cf. [AFHL86]). Using (22)

Since $S_{N,t}$ is non-infinitesimal $L(Q_N)$-a.s., and square-S-integrable we have

by (22) and the tower property for conditional expectations. Hence, $\Theta \in SL^2(\nu_{S_N})$ if and only if $\Theta S_N \in SL^2(Q_N \times \Lambda_N)$.
In order to check (22) we calculate

We are now in a situation where we can apply Theorem 5.6, so that the valuation of claims and calculation of mean-self-financing trading strategies in the hyperfinite $(\beta+1)$-nomial CRR model are equivalent to the corresponding operations in the BS model.

Theorem 6.4. Let $H_n : \Omega_n \to \mathbb{R}$ be a sequence of claims in the $(\beta+1)$-nomial CRR models and $h \in L^2(Q)$ a claim in the BS model. Then the following are equivalent:

An important aspect of $D^2$-convergence is the existence of a discretisation scheme which maps paths in $C$ back into $C_n$ (see Definition 3.10).
Proof. Let $(d^\beta_n)_{n\in\mathbb{N}}$ be the adapted $Q$-discretisation scheme for the binomial CRR model on $\Omega^\beta_n$ (cf. [CKW93a]), so that $d^\beta_n : C \to C^\beta_n$ where $C^\beta_n := \{W^\beta_n(\omega) : \omega \in \Omega^\beta_n\}$ denotes the path space for the binomial CRR model. Define a map $\tilde d_n : C^\beta_n \to C_n$ by $(\tilde d_n(\omega))(t) := \omega(t)$ for $\omega \in C^\beta_n$ and $t \in \mathbb{T}_n$ and filling in linearly between points in $\mathbb{T}_n$. So $\tilde d_n$ samples paths in $C^\beta_n$ at points in $\mathbb{T}_n$ and 'forgets' what happens between these points. We now define $d_n : C \to C_n$ as $d_n := \tilde d_n \circ d^\beta_n$. To see that $(d_n)_{n\in\mathbb{N}}$ is an adapted $Q$-discretisation scheme we note that $d_n$ is $\mathcal{A}_n$-adapted and $Q$-measure preserving since $d^\beta_n$ and $\tilde d_n$ are. It only remains to show that $d_n(\omega) \to \omega$ in $Q$-probability. Fix $\varepsilon > 0$ and let $\delta > 0$. As $(d^\beta_n)_{n\in\mathbb{N}}$ is a discretisation scheme there exists $n_\delta \in \mathbb{N}$ such that

Furthermore, $\|\tilde d_n(\omega) - \omega\| \le 2(\beta+1)\sqrt{\Delta_n t/\beta}$ for all $\omega \in C^\beta_n$. We therefore have $\tilde n \in \mathbb{N}$ such that $\|\tilde d_n(\omega) - \omega\| < \varepsilon/2$ for all $n \ge \tilde n$ and $\omega \in C^\beta_n$. Hence, for $n \ge \max\{n_\delta, \tilde n\}$,

Remark 6.6. The convergence result in Theorem 6.4 could also have been obtained under the minimal martingale measure for the multinomial CRR model. Therefore our choice of the measure $Q_n$ might seem arbitrary; however, note that the minimal martingale measure depends on the specification of the 'physical' probability in the underlying model. We may therefore also use a martingale measure from the beginning. Furthermore, our choice of $Q_n$ allows the construction of an adapted discretisation scheme from the already existing one for the binomial CRR model as in the proof of Proposition 6.5.
It was shown in [CKW93a, Theorem 4.4] that a sequence of $D^2$-convergent claims can always be obtained by means of an adapted discretisation scheme. However, when considering a specific claim $h$ in the BS model a $D^2$-convergent sequence $(H_n)_{n\in\mathbb{N}}$ approximating $h$ can often be found in a more direct and natural way:

Example 6.7. If the claim $h$ only depends on the price of the risky asset at maturity, i.e. $h = f(s_T)$ for some function $f : \mathbb{R} \to \mathbb{R}$, then $H_n := f(S_{n,T})$ is a natural choice of an approximating sequence for $h$. Indeed, if $f$ is piecewise continuous and satisfies a polynomial growth condition it is easy to see that

Summing up the results of this section we have shown that, for any $\beta \in \mathbb{N}$, the $(\beta+1)$-nomial CRR model has exactly the same convergence properties as the complete binomial CRR model, provided we are using mean-variance hedging for pricing and replicating claims in these models. When using finite models as approximations of the continuous time BS model the use of binomial models therefore does not offer any advantage over multinomial models. This gives a theoretical justification for the use of trinomial models in numerical methods for derivatives pricing and hedging; trinomial models are often preferred because they allow a more efficient calculation of prices (see [Wel98] for examples) and hedging parameters (the 'greeks'); it should also be noted that the use of trinomial models is equivalent to explicit finite difference methods for the numerical solutions of the pricing PDE (cf. for example [CS98]).

It is well-known that this model is complete (see e.g. [BK98, MR97]), so that there is a unique equivalent martingale measure $Q$ for the price process $s$, and under this measure $s$ is the solution to the stochastic differential equation
$$ds_t = \sigma(t)\, s_t\, d\tilde w_t$$
where $\tilde w$ is a standard Brownian motion under $Q$, i.e. $s$ is given as
$$s_t = s_0 \exp\Big(\int_0^t \sigma(u)\, d\tilde w_u - \frac12 \int_0^t \sigma^2(u)\, du\Big).$$
From now on we will work with the measure $Q$; a filtration $\mathbb{F}$ on $(\Omega, \mathcal{F})$ is generated by $\tilde w$.
For any $h \in L^2(Q)$ let $\varphi^h$ be the unique self-financing strategy generating $h$.
To define a sequence of discrete-time models approximating (25) we follow the 'direct discretisation' approach in [RS95]. We first approximate the volatility function $\sigma$ by a sequence of piecewise constant functions: for $n \in \mathbb N$ and $\Delta_n t := T/n$, the function $\sigma_n : [0, T] \to \mathbb R$ is defined as a sum over the grid $T_n = \{0, \Delta_n t, \dots, T\}$ such that $\sigma_n(t) = \sigma(t)$ for $t \in T_n$ and $\sigma_n$ remains constant between points in $T_n$. So $\sigma_n$ is left-continuous with right limits, bounded and strictly positive.
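We define another $Q$-martingale $s_n$ by using $\sigma_n$ in place of $\sigma$ in the representation of $s$. We can then obtain a discrete-time process $S_n : \Omega \times T_n \to \mathbb R$ by evaluating $s_n$ at the discretisation points $t \in T_n$, so that $S_n$ satisfies
$$S_{n,t+\Delta_n t} = S_{n,t} \exp\bigl(-\tfrac12 \sigma_{n,t}^2\, \Delta_n t + \sigma_{n,t}\, \Delta\tilde w_t\bigr), \qquad t \in T_n \setminus \{T\}, \tag{27}$$
where $\Delta\tilde w_t := \tilde w_{t+\Delta_n t} - \tilde w_t \sim N(0, \Delta_n t)$.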
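As a quick consistency check on the recursion for $S_n$, the following one-step computation (a sketch added for illustration, not part of the original argument) confirms the martingale property on the grid. It assumes that $\sigma_{n,t}$ is deterministic, that $\Delta\tilde w_t$ is independent of $\mathcal F_t$, and it uses the Gaussian moment generating function $E[e^{aZ}] = e^{a^2 v/2}$ for $Z \sim N(0, v)$.

```latex
\begin{align*}
E_Q\bigl[S_{n,t+\Delta_n t} \,\big|\, \mathcal F_t\bigr]
  &= S_{n,t}\, e^{-\frac12 \sigma_{n,t}^2 \Delta_n t}\,
     E_Q\bigl[e^{\sigma_{n,t} \Delta\tilde w_t}\bigr] \\
  &= S_{n,t}\, e^{-\frac12 \sigma_{n,t}^2 \Delta_n t}\,
     e^{\frac12 \sigma_{n,t}^2 \Delta_n t}
   = S_{n,t}, \qquad t \in T_n \setminus \{T\}.
\end{align*}
```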

The Discrete Time Model and its Internal Version.
We define the $n$-th discretised BS model as follows: let $\Omega_n := \mathbb R^{T_n \setminus \{T\}}$, $\mathcal F_n := \mathcal B(\Omega_n)$ and $Q_n$ the probability defined by $\omega_t \sim N(0, \Delta_n t)$ with $\omega_0, \dots, \omega_{T-\Delta_n t}$ independent. A filtration $\mathcal A_n$ is generated by the process $W_n : \Omega_n \times T_n \to \mathbb R$ with $\Delta W_{n,t} := \omega_t$, $W_{n,0} := 0$. The price process $S_n$ is then defined as in (27), with $\Delta\tilde w_t$ replaced by $\Delta W_{n,t}$. The discretised BS model satisfies the assumptions of Section 2.1, so that any claim $H \in L^2(Q_n)$ can be replicated by a risk-minimising mean-self-financing strategy $\Phi^H = (\Theta^H, \Psi^H)$. Again the variance-optimal strategy $\xi^H$ coincides with $\Theta^H$.
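The construction of the discretised model can be illustrated numerically. The following is a minimal sketch, assuming a user-supplied volatility function and using the value of $\sigma$ at the left grid point for each step; the names `simulate_discretised_bs` and `estimate_claim_mean`, the callable `sigma` and the `seed` parameter are hypothetical and chosen only for this illustration. It draws independent Gaussian increments, builds one path of the price process by the exponential recursion, and estimates the mean of a claim of the form $f(S_{n,T})$ by plain Monte Carlo.

```python
import math
import random


def simulate_discretised_bs(s0, sigma, T, n, rng):
    """Simulate one path of S_n on the grid {0, dt, ..., T}.

    sigma is a hypothetical callable t -> sigma(t) > 0 (an assumption of
    this sketch); each step uses sigma at the left grid point t.
    """
    dt = T / n
    path = [s0]
    for k in range(n):
        vol = sigma(k * dt)
        # increment omega_t ~ N(0, dt), independent across steps
        dw = rng.gauss(0.0, math.sqrt(dt))
        path.append(path[-1] * math.exp(-0.5 * vol * vol * dt + vol * dw))
    return path


def estimate_claim_mean(f, s0, sigma, T, n, n_paths, seed=0):
    """Monte Carlo estimate of the mean of f(S_{n,T}) under the model measure."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += f(simulate_discretised_bs(s0, sigma, T, n, rng)[-1])
    return total / n_paths


if __name__ == "__main__":
    # Illustrative parameters only: a call-type payoff with strike 100.
    call = lambda x: max(x - 100.0, 0.0)
    print(estimate_claim_mean(call, 100.0, lambda t: 0.2, 1.0, 10, 20000))
```

Taking `f` to be the identity gives a simple sanity check: since the price process is a martingale, the estimate should be close to `s0` up to Monte Carlo error.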
As before, for infinite $N$, this gives rise to an internal model on $(\Omega_N, \mathcal A_N, \mathcal F_N, \ldots)$ [...] Due to the piecewise continuity of $\sigma$ we have [...] shows that $S_N$ is an $SL^r$-martingale for any $r \in [1, \infty)$. Finally [...] with $\varepsilon_t \approx 0$ for all $t \in T_N$, so that we can use the proof of Lemma 6.3 to show that assumption (S1) in Section 5 is satisfied. Hence Theorem 5.6 holds for this internal model and the BS model (25); we can now use it to obtain standard convergence results for the sequence of discretised BS models (27).

Convergence Results.
As in Section 6.1 we assume that $\Omega = C$ for the BS model and we consider the subspaces $C_n \subset C$ of polygonal paths of $W_n$. We therefore have the following analogue to Theorem 6.4:
Theorem 6.9. Let $H_n : \Omega_n \to \mathbb R$ be a sequence of claims in the discretised BS models and $h \in L^2(Q)$ a claim in the BS model. Then the following are equivalent: [...]
We will see below how we can obtain an almost trivial adapted $Q$-discretisation scheme $(d_n)_{n\in\mathbb N}$ for these models, so that we also have the alternative characterization of $D^2$-convergence in terms of '$L^2(d_n(\cdot))$-convergence' given by Theorem 3.11:
Proposition 6.10. The family of mappings $d_n : C \to C_n$ defined by
$$(d_n(\omega))(t) := \omega(t) \quad \text{for } \omega \in C \text{ and } t \in T_n, \tag{28}$$
with $d_n(\omega)$ filled in linearly between points in $T_n$, is an adapted $Q$-discretisation scheme.
Proof. Firstly, $d_n$ is $\mathcal A_n$-adapted by definition. Furthermore, $Q(d_n^{-1}(A)) = Q_n(A)$ for all $A \in \mathcal F_n$, since the finite-dimensional distributions of a Brownian motion are multivariate normal. It therefore only remains to show that $d_n(\omega) \to \omega$ in $Q$-probability.
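Using [Cut87, Theorem 2.2] once again we see that, for infinite $N \in {}^*\mathbb N \setminus \mathbb N$, the standard part of the process $d_N : {}^*C \times T_N \to {}^*\mathbb R$ defined by (28) is a standard Brownian motion on $({}^*C, L({}^*\mathcal F), L({}^*Q))$. Hence $L({}^*Q)\bigl(\|d_N({}^*\omega) - {}^*\omega\| \approx 0\bigr) = 1$; in particular, for any positive $\varepsilon \in \mathbb R$, ${}^*Q\bigl(\|d_N({}^*\omega) - {}^*\omega\| < \varepsilon\bigr) \approx 1$. By the nonstandard characterization of convergence of a sequence in $\mathbb R$ this means that $Q\bigl(\|d_n(\omega) - \omega\| < \varepsilon\bigr) \to 1$ as $n \to \infty$.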
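The sampling map of Proposition 6.10 is easy to illustrate numerically. The sketch below is only an illustration and not the nonstandard argument used in the proof: it assumes a deterministic continuous function as a stand-in for a path $\omega \in C$, and the helper names `discretise` and `sup_distance` are hypothetical. It samples the path on the grid, interpolates linearly, and measures the sup-norm distance on a fine evaluation grid; for a continuous path this distance shrinks as $n$ grows, by uniform continuity on $[0, T]$.

```python
import math


def discretise(omega, T, n):
    """Return the polygonal path agreeing with omega on {0, T/n, ..., T}.

    omega is a hypothetical callable t -> omega(t), assumed continuous
    on [0, T]; the result is again a callable on [0, T].
    """
    dt = T / n
    grid = [k * dt for k in range(n + 1)]
    values = [omega(t) for t in grid]

    def polygonal(t):
        # index of the grid cell [grid[k], grid[k + 1]] containing t
        k = min(int(t / dt), n - 1)
        lam = (t - grid[k]) / dt
        # linear interpolation between the two neighbouring samples
        return (1.0 - lam) * values[k] + lam * values[k + 1]

    return polygonal


def sup_distance(f, g, T, m=10_000):
    """Approximate the sup-norm distance of f and g on [0, T] using m + 1 points."""
    return max(abs(f(j * T / m) - g(j * T / m)) for j in range(m + 1))


if __name__ == "__main__":
    # Stand-in path chosen only for illustration.
    omega = lambda t: math.sin(10.0 * t)
    for n in (10, 100, 1000):
        print(n, sup_distance(omega, discretise(omega, 1.0, n), 1.0))
```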

Conclusion
Although the proofs employed in this paper make extensive use of nonstandard stochastic analysis, as developed in [HP83] and [AFHL86], the main results themselves (including the formulation of $D^2$-convergence) are expressed entirely within the framework and terminology of standard stochastic analysis. The nonstandard tools used here, however, provide a degree of insight into the essential structure of the models and the relationship between them which is not easily obtained by purely standard methods.
The techniques developed in this paper enable us to extend to certain incomplete market models the convergence theory developed in [CKW93a] for approximations to the Black-Scholes model. By utilizing the stability properties of the minimal martingale measure approach and developing the appropriate liftings, we were able to demonstrate the equivalence of the $D^2$-convergence of contingent claims and that of their associated strategies, which distinguishes this mode of convergence from the more usual weak convergence methodology.
Weak convergence results for strategies do not, to our knowledge, exist in the literature for the various models dealt with in this paper. The $D^2$-convergence results obtained in the final section of this paper provide a theoretical justification for the multinomial approximations often used in practice and for direct sampling methods applied to continuous-time models.
However, our methodology still requires the limiting model to be complete, even though this restriction is not placed on the approximating discrete models. It would be interesting to see an extension of this approach to situations where the limit model is also allowed to be incomplete.

Recall that $|\Delta f_u| < \frac12$ for all $u \in J$, so that, for $s \in T$, $|{}^\circ\Delta F_s| \le \frac12$, which implies that $1 + {}^\circ\Delta F_s \ge \frac12$. Since $|\log(1 + x) - x| \le |x|^2$ for $|x| \le \frac12$ we see that $|\varepsilon_s| \le 4\,|\Delta F_s - {}^\circ\Delta F_s|^2$ and hence the final sum is bounded by [...] An analogous argument (using (31) and (33)) then shows that $\mathrm{st}(K) = k$.