Lecture 8. Doob’s stopping theorem

The next four lectures will be devoted to the foundational theorems of the theory of continuous time martingales. All of these theorems are due to Joseph Doob.

The first of these theorems shows that martingales behave in a very nice way with respect to stopping times.

Theorem (Doob’s stopping theorem) Let (\mathcal{F}_t)_{t \ge 0} be a filtration defined on a probability space (\Omega, \mathcal{F},\mathbb{P}) and let (M_t)_{t \ge 0} be a stochastic process that is adapted to the filtration (\mathcal{F}_t)_{t \ge 0}, whose paths are right continuous and locally bounded. The following properties are equivalent:

  • (M_t)_{t \ge 0} is a martingale with respect to the filtration (\mathcal{F}_t)_{t \ge 0};
  • For any almost surely bounded stopping time T of the filtration (\mathcal{F}_t)_{t \ge 0} such that \mathbb{E}(\mid M_T \mid)<+\infty, we have \mathbb{E} (M_T)=\mathbb{E} (M_0).

Proof:

Let us assume that (M_t)_{t \ge 0} is a martingale with respect to the filtration (\mathcal{F}_t)_{t \ge 0}, whose paths are right continuous and locally bounded. Let now T be a stopping time of the filtration (\mathcal{F}_t)_{t \ge 0} that is almost surely bounded by K>0. Let us first assume that T takes its values in a finite set: 0 \le t_1 <...<t_n \le K. Thanks to the martingale property, we have
\mathbb{E} (M_T)
= \mathbb{E} (\sum_{i=1}^n M_T 1_{T=t_i})
=\sum_{i=1}^n \mathbb{E} (M_{t_i} 1_{T=t_i})
=\sum_{i=1}^n \mathbb{E} (M_{t_n} 1_{T=t_i})
=\mathbb{E} (M_{t_n})
=\mathbb{E} (M_{0}).
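The identity \mathbb{E}(M_T)=\mathbb{E}(M_0) for a finitely-valued bounded stopping time can also be checked numerically. The following sketch uses a discrete-time analogue: a simple symmetric random walk (a martingale with respect to its natural filtration), with the stopping time T given by the first visit to \pm 3, capped at time 20; the walk, the level and the cap are arbitrary choices made for the illustration.

```python
from fractions import Fraction

# Exact check of E(M_T) = E(M_0) for a simple symmetric random walk
# started at 0 and the bounded stopping time
# T = min(first time |M| = 3, 20).  Illustration only, not a proof.
K = 20
half = Fraction(1, 2)
alive = {0: Fraction(1)}    # law of the not-yet-stopped walk
stopped = Fraction(0)       # accumulates M_T * P over already stopped paths
for _ in range(K):
    new = {}
    for x, p in alive.items():
        for y in (x - 1, x + 1):
            if abs(y) == 3:          # the stopping time triggers at level 3
                stopped += half * p * y
            else:
                new[y] = new.get(y, Fraction(0)) + half * p
    alive = new
# paths that survive until time K are stopped there (T = K)
expectation = stopped + sum(x * p for x, p in alive.items())
print(expectation)  # 0
```

The computation is exact (rational arithmetic) and returns \mathbb{E}(M_T)=0=\mathbb{E}(M_0), as the theorem predicts.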

The theorem is therefore proved if T takes its values in a finite set. If T takes an infinite number of values, we approximate T by the following sequence of stopping times:

\tau_n =\sum_{k=1}^{2^n} \frac{kK}{2^n} 1_{ \left\{\frac{(k-1)K}{2^n} \le T < \frac{kK}{2^n} \right\} }.

The stopping time \tau_n takes its values in a finite set and \tau_n \downarrow T when n \rightarrow +\infty. To conclude the proof of the first implication, it is therefore enough to show that

\lim_{n \rightarrow +\infty} \mathbb{E} ( M_{\tau_n} )=\mathbb{E} (M_{T} ).
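The behavior of the dyadic approximations \tau_n can be checked on a concrete value; the sketch below uses the arbitrary choices K=1 and T=0.3 and verifies that \tau_n decreases to T from above.

```python
import math

# Quick illustration that the dyadic stopping times
# tau_n = (k K / 2^n) on {(k-1)K/2^n <= T < kK/2^n}
# decrease to T from above.  K = 1 and T = 0.3 are arbitrary choices.
K, T = 1.0, 0.3
taus = [(math.floor(2**n * T / K) + 1) * K / 2**n for n in range(1, 21)]
assert all(t > T for t in taus)                      # tau_n > T
assert all(a >= b for a, b in zip(taus, taus[1:]))   # tau_n is non-increasing
assert taus[-1] - T < 2**-20                         # tau_n -> T
print(taus[:4])  # [0.5, 0.5, 0.375, 0.3125]
```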
For this, we are going to prove that the family (M_{\tau_n})_{n \in\mathbb{N}} is uniformly integrable. Let A \ge 0.

Since \tau_n takes its values in a finite set, by using the martingale property and Jensen’s inequality, it is easily checked that

\mathbb{E} (| M_{K} |1_{ | M_{\tau_n} | \ge A }) \ge \mathbb{E} (| M_{\tau_n}| 1_{ | M_{\tau_n} | \ge A }).

Therefore, we have

\mathbb{E} ( | M_{\tau_n} | 1_{ | M_{\tau_n} | \ge A }) \le \mathbb{E} ( | M_{K} | 1_{\sup_{0 \le s \le K} | M_s | \ge A }).

The right-hand side does not depend on n and tends to 0 when A \rightarrow +\infty by dominated convergence, because \mathbb{E}( | M_K | )<+\infty and \sup_{0 \le s \le K} | M_s | <+\infty almost surely (the paths are locally bounded). This proves the uniform integrability of the family (M_{\tau_n})_{n \in \mathbb{N}}.

Since \tau_n \downarrow T and the paths of (M_t)_{t \ge 0} are right continuous, M_{\tau_n} \rightarrow M_T almost surely. By uniform integrability, we deduce that

\lim_{n \rightarrow +\infty} \mathbb{E} ( M_{\tau_n} )=\mathbb{E} (M_{T} ),

from which it is concluded that

\mathbb{E} (M_T)=\mathbb{E} (M_0).

Conversely, let us now assume that for any almost surely bounded stopping time T of the filtration (\mathcal{F}_t)_{t \ge 0} such that \mathbb{E}(\mid M_T \mid)<+\infty, we have \mathbb{E}(M_T)=\mathbb{E} (M_0).

Let 0 \le s \le t and A \in \mathcal{F}_s. By using the stopping time

T=s 1_A +t 1_{^c A},

together with the constant stopping time T'=t, for which \mathbb{E}(M_{T'})=\mathbb{E}(M_0), we are led to

\mathbb{E} \left( (M_t - M_s) 1_A \right)=0,

which implies the martingale property for (M_t)_{t \ge 0}. \square

The hypothesis that the paths of (M_t)_{t \ge 0} be right continuous and locally bounded is actually not strictly necessary; however, the hypothesis that the stopping time T be almost surely bounded is essential, as is shown in the following exercise.

Exercise: Let (\mathcal{F}_t)_{t \ge 0} be a filtration defined on a probability space (\Omega, \mathcal{F},\mathbb{P}) and let (M_t)_{t \ge 0} be a continuous martingale (that is a martingale with continuous paths) with respect to the filtration (\mathcal{F}_t)_{t \ge 0} such that M_0=0 almost surely. For a>0, we denote T_a=\inf \{ t >0, M_t=a \}. Show that T_a is a stopping time of the filtration (\mathcal{F}_t)_{t \ge 0}. Prove that T_a is not almost surely bounded.

Exercise: Let (\mathcal{F}_t)_{t \ge 0} be a filtration defined on a probability space (\Omega, \mathcal{F},\mathbb{P}) and let (M_t)_{t \ge 0} be a continuous martingale with respect to the filtration (\mathcal{F}_t)_{t \ge 0}. By mimicking the proof of Doob’s stopping theorem, show that if T_1 and T_2 are two almost surely bounded stopping times of the filtration (\mathcal{F}_t)_{t \ge 0} such that T_1 \le T_2 and \mathbb{E}(\mid M_{T_1} \mid)<+\infty, \mathbb{E}(\mid M_{T_2} \mid)<+\infty, then \mathbb{E} (M_{T_2} \mid \mathcal{F}_{T_1})=M_{T_1}.

Deduce that the stochastic process (M_{t\wedge T_2})_{t \ge 0} is a martingale with respect to the filtration (\mathcal{F}_{t \wedge T_2})_{t \ge 0}.
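A discrete-time illustration of this last statement: for a simple symmetric random walk stopped at the first visit to \pm 2 (an arbitrary choice), the stopped process should satisfy \mathbb{E}(M_{n \wedge T})=\mathbb{E}(M_0)=0 for every n, and this can be verified exactly.

```python
from fractions import Fraction

# Exact check that the stopped walk M_{min(n, T)} keeps expectation 0,
# with T = first time |M| = 2 (an arbitrary choice for the illustration).
half = Fraction(1, 2)
law = {0: Fraction(1)}               # law of M_{min(n, T)}
for n in range(30):
    new = {}
    for x, p in law.items():
        if abs(x) == 2:              # already stopped: the path is frozen
            new[x] = new.get(x, Fraction(0)) + p
        else:
            for y in (x - 1, x + 1):
                new[y] = new.get(y, Fraction(0)) + half * p
    law = new
    assert sum(x * p for x, p in law.items()) == 0   # E(M_{min(n, T)}) = 0
```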

Exercise: Let (\mathcal{F}_t)_{t \ge 0} be a filtration defined on a probability space (\Omega, \mathcal{F},\mathbb{P}) and let (M_t)_{t \ge 0} be a submartingale with respect to the filtration (\mathcal{F}_t)_{t \ge 0} whose paths are continuous. By mimicking the proof of Doob’s stopping theorem, show that if T_1 and T_2 are two almost surely bounded stopping times of the filtration (\mathcal{F}_t)_{t \ge 0} such that T_1 \le T_2 and \mathbb{E}(\mid M_{T_1} \mid)<+\infty, \mathbb{E}(\mid M_{T_2} \mid)<+\infty, then \mathbb{E} (M_{T_1} ) \le \mathbb{E} (M_{T_2} ).

Posted in Stochastic Calculus lectures

Lecture 7. Stopping times, Martingales

In the study of a stochastic process, it is often useful to consider some properties of the process that hold up to a random time. A natural question is, for instance: how long does the process stay below a given constant?

Definition. Let (\mathcal{F}_t)_{t \ge 0} be a filtration on a probability space (\Omega,\mathcal{F},\mathbb{P}). Let T be a random variable, measurable with respect to \mathcal{F} and valued in \mathbb{R}_{\ge 0} \cup \{+\infty \}. We say that T is a stopping time of the filtration (\mathcal{F}_t)_{t \ge 0} if for t \ge 0, \{ T \le t \} \in \mathcal{F}_t.

Often, a stopping time will be the first time at which a stochastic process adapted to the filtration (\mathcal{F}_t)_{t \ge 0} satisfies a given property. The above definition means that for any t \ge 0, one is able to decide at time t whether this property has already been satisfied or not.

Among the most important examples of stopping times are the (first) hitting times of closed sets by continuous stochastic processes.

Exercise (First hitting time of a closed set by a continuous stochastic process)
Let (X_t)_{t \ge 0} be a continuous process adapted to a filtration (\mathcal{F}_t)_{t \ge 0}. Let

T=\inf \{ t \ge 0, X_t \in F \},

where F is a closed subset of \mathbb{R}. Show that T is a stopping time of the filtration (\mathcal{F}_t)_{t\ge 0}.
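The key idea of the exercise is that, by continuity of the paths, the hitting time of a closed set is determined by the values of the process on a countable dense set of times. The following sketch illustrates this on a deterministic path, with the arbitrary choices x(t)=t^2 and F=[1/2, +\infty), for which the hitting time is \sqrt{1/2}.

```python
import math

# Illustration on a deterministic continuous path: the hitting time of a
# closed set can be recovered from the values on the dyadic numbers.
# x(t) = t^2 and F = [1/2, +infinity) are arbitrary choices; T = sqrt(1/2).
x = lambda t: t * t
T_exact = math.sqrt(0.5)
approximations = []
for n in range(1, 18):
    grid = [k / 2**n for k in range(2**n + 1)]      # dyadics in [0, 1]
    T_n = min(t for t in grid if x(t) >= 0.5)       # first dyadic hit
    approximations.append(T_n)
assert all(T_n >= T_exact for T_n in approximations)   # never too early
assert approximations[-1] - T_exact < 2**-17           # refines to T
```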

Given a stopping time T, we may define the \sigma-algebra of events that occur before the time T:

Proposition. Let T be a stopping time of the filtration (\mathcal{F}_t)_{t \ge 0}. Let
\mathcal{F}_T=\{ A \in \mathcal{F}, \forall t \ge 0, A \cap \{ T \le t \} \in \mathcal{F}_t \}.
Then \mathcal{F}_T is a \sigma-algebra.

Proof:

Since for every t \ge 0, \emptyset \in \mathcal{F}_t, we have that \emptyset \in \mathcal{F}_T. Let us now consider A \in \mathcal{F}_T. We have

^c A \cap \{ T \le t \} =\{ T \le t \} \backslash \left(  A \cap \{ T \le t \} \right) \in \mathcal{F}_t,

and thus ^c A \in \mathcal{F}_T. Finally, if (A_n)_{ n \in \mathbb{N}} is a sequence of elements of \mathcal{F}_T,

(\cap_{n \in \mathbb{N} } A_n )\cap \{ T \le t \} =\cap_{n \in \mathbb{N} } ( A_n \cap \{ T \le t \}) \in  \mathcal{F}_t,

so that \cap_{n \in \mathbb{N}} A_n \in \mathcal{F}_T. Stability under countable unions then follows by taking complements. \square

If T is a stopping time of a filtration with respect to which a given process is adapted, then it is possible to stop this process in a natural way at the time T. We leave the proof of the corresponding proposition as an exercise to the reader.

Proposition. Let (\mathcal{F}_t)_{t \ge 0} be a filtration on a probability space (\Omega, \mathcal{F},\mathbb{P}) and let T be an almost surely finite stopping time of the filtration (\mathcal{F}_t)_{t \ge 0}. Let (X_t)_{t \ge 0} be a stochastic process that is adapted and progressively measurable with respect to the filtration (\mathcal{F}_t)_{t \ge 0}. The stopped stochastic process (X_{t \wedge T})_{t \ge 0} is progressively measurable with respect to the filtration (\mathcal{F}_{t \wedge T})_{t \ge 0}.

We are now ready to introduce martingales in continuous time. Such processes were first extensively studied by Joseph Doob. Together with the Markov processes, which we will study later, they form one of the most important classes of stochastic processes and lie at the heart of the theory of stochastic integration.

Definition. Let (\mathcal{F}_t)_{t \ge 0} be a filtration defined on a probability space (\Omega, \mathcal{F},\mathbb{P}). A process (M_t)_{t \ge 0} that is adapted to (\mathcal{F}_t)_{t \ge 0} is called a submartingale with respect to this filtration if:

  • For every t \ge 0, \mathbb{E} \left( \mid M_t \mid \right) < + \infty;
  • For every t \ge s \ge 0, \mathbb{E} \left(  M_t \mid \mathcal{F}_s  \right) \ge M_s.

A stochastic process (M_t)_{t \ge 0} that is adapted to (\mathcal{F}_t)_{t \ge 0} and such that (-M_t)_{t \ge 0} is a submartingale, is called a supermartingale. Finally, a stochastic process (M_t)_{t \ge 0} that is adapted to (\mathcal{F}_t)_{t \ge 0} and that is at the same time a submartingale and a supermartingale is called a martingale.

The following exercises provide some first properties of these processes.

Exercise. (Closed martingale)
Let (\mathcal{F}_t)_{t \ge 0} be a filtration defined on a probability space (\Omega, \mathcal{F},\mathbb{P}) and let X be an integrable and \mathcal{F}-measurable random variable. Show that the process \left( \mathbb{E}(X\mid \mathcal{F}_t) \right)_{t \ge 0} is a martingale with respect to the filtration (\mathcal{F}_t)_{t \ge 0}.

Exercise. Let (\mathcal{F}_t)_{t \ge 0} be a filtration defined on a probability space (\Omega, \mathcal{F},\mathbb{P}) and let (M_t)_{t \ge 0} be a submartingale with respect to the filtration (\mathcal{F}_t)_{t \ge 0}. Show that the function t \rightarrow \mathbb{E} (M_t) is non-decreasing.

Exercise. Let (\mathcal{F}_t)_{t \ge 0} be a filtration defined on a probability space (\Omega, \mathcal{F},\mathbb{P}) and let (M_t)_{t \ge 0} be a martingale with respect to the filtration (\mathcal{F}_t)_{t \ge 0}. Let now \psi : \mathbb{R} \rightarrow \mathbb{R} be a convex function such that for t \ge 0, \mathbb{E} \left( \mid \psi(M_t) \mid \right) < + \infty. Show that the process (\psi(M_t))_{t \ge 0} is a submartingale.
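For instance, taking \psi(x)=x^2 and replacing (M_t)_{t \ge 0} by a simple symmetric random walk (S_n)_{n \ge 0}, the submartingale property reduces to the one-step identity \mathbb{E}(S_{n+1}^2 \mid \mathcal{F}_n)=S_n^2+1 \ge S_n^2, which the following sketch verifies.

```python
# One-step check that the square of a simple symmetric random walk is a
# submartingale: conditionally on S_n = x, the next value is x+1 or x-1
# with probability 1/2 each, so E(S_{n+1}^2 | F_n) = x^2 + 1 >= x^2.
gaps = []
for x in range(-50, 51):
    conditional = ((x + 1) ** 2 + (x - 1) ** 2) / 2
    gaps.append(conditional - x ** 2)
assert all(g == 1 for g in gaps)  # the conditional increase is exactly 1
```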


Lecture 6. The Kolmogorov continuity theorem

The Daniell-Kolmogorov theorem seen in Lecture 5 is a very useful tool since it provides existence results for stochastic processes. Nevertheless, this theorem does not say anything about the paths of these processes. The following theorem, due to Kolmogorov, shows that, under mild conditions, we can work with processes whose paths are quite regular.

Definition. A function f:\mathbb{R}_{\ge 0} \rightarrow \mathbb{R}^d is said to be Hölder continuous with exponent \alpha >0 if there exists a constant C >0 such that for s,t \in \mathbb{R}_{\ge 0},

\| f(t)-f(s) \| \le C \mid t-s \mid^{\alpha}.

Hölder continuous functions are, in particular, continuous.

Definition. A stochastic process (\tilde{X}_t)_{t \geq 0} is called a modification of the process (X_t)_{t \geq 0} if for every t \geq 0, \mathbb{P} \left( X_t = \tilde{X}_t \right)=1.

We can observe that if (\tilde{X}_t)_{t \geq 0} is a modification of (X_t)_{t \geq 0}, then (\tilde{X}_t)_{t \geq 0} has the same distribution as (X_t)_{t \geq 0} (because the two processes have the same finite-dimensional distributions).

Theorem. (Kolmogorov continuity theorem) Let \alpha, \varepsilon, c >0. If a d-dimensional process (X_t)_{t\in [0,1]} defined on a probability space (\Omega, \mathcal{F}, \mathbb{P}) satisfies for s,t \in [0,1],

\mathbb{E} \left( \| X_t - X_s \|^{\alpha} \right) \leq c \mid t-s \mid^{1+\varepsilon},

then there exists a modification of the process (X_t)_{t \in [0,1]} that is a continuous process and whose paths are \gamma-Hölder for every \gamma \in [0, \frac{\varepsilon}{\alpha} ).

Proof:

We give the proof for d=1 and leave the extension to the case d \ge 2 as an exercise to the reader. For n \in \mathbb{N}, we denote

\mathcal{D}_n=\left\{ \frac{k}{2^n}, k=0,...,2^{n} \right\}

and

\mathcal{D}=\cup_{n \in \mathbb{N}} \mathcal{D}_n.

Let \gamma \in [0, \frac{\varepsilon}{\alpha} ). From Chebyshev’s inequality:

\mathbb{P} \left( \max_{1 \le k \le 2^n} | X_{\frac{k}{2^n}} -X_{\frac{k-1}{2^n}} | \ge 2^{-\gamma n}\right)
=\mathbb{P} \left( \cup_{1 \le k \le 2^n} \left\{ | X_{\frac{k}{2^n}} -X_{\frac{k-1}{2^n}} | \ge 2^{-\gamma n} \right\} \right)
\le \sum_{k=1}^{2^n} \mathbb{P} \left(  | X_{\frac{k}{2^n}} -X_{\frac{k-1}{2^n}} | \ge 2^{-\gamma n}\right)
\le \sum_{k=1}^{2^n} \frac{\mathbb{E}\left( | X_{\frac{k}{2^n}} -X_{\frac{k-1}{2^n}} |^{\alpha}\right)}{2^{-\gamma \alpha n}}
\le c 2^{-n(\varepsilon-\gamma \alpha)}.

Therefore, since \gamma \alpha < \varepsilon, we deduce

\sum_{n=1}^{+\infty} \mathbb{P} \left( \max_{1 \le k \le 2^n} | X_{\frac{k}{2^n}} -X_{\frac{k-1}{2^n}} | \ge 2^{-\gamma n}\right)<+\infty.

From the Borel-Cantelli lemma, we can thus find a set \Omega^* \in \mathcal{F} such that \mathbb{P} ( \Omega^*)=1 and such that for \omega \in \Omega^*, there exists N(\omega) such that for n \ge N(\omega),

\max_{1 \le k \le 2^n} | X_{\frac{k}{2^n}} (\omega) -X_{\frac{k-1}{2^n}} (\omega)| \le  2^{-\gamma n}.

In particular, there exists an almost surely finite random variable C such that for \omega \in \Omega^* and every n \ge 0,

\max_{1 \le k \le 2^n} | X_{\frac{k}{2^n}} (\omega) -X_{\frac{k-1}{2^n}} (\omega)| \le C (\omega)  2^{-\gamma n}.

We now claim that the paths of the restricted process X_{/\Omega^*} are consequently \gamma-Hölder on \mathcal{D}. Indeed, let s,t \in \mathcal{D}, t \neq s. We can find n \ge 0 such that

\frac{1}{2^{n+1}} \le \mid s-t \mid \le \frac{1}{2^n}.

We now pick an increasing and stationary sequence (s_k)_{k \ge n} converging toward s, such that s_k \in \mathcal{D}_k and

\mid s_{k+1}-s_k \mid =2^{-(k+1)} \quad \text{or} \quad 0.

In the same way, we can find an analogous sequence (t_k)_{k \ge n} that converges toward t and such that s_n and t_n are neighbors in \mathcal{D}_n. We have then:

X_t - X_s=\sum_{i=n}^{+\infty}(X_{s_{i+1}} -X_{s_{i}}) +(X_{s_n}-X_{t_n})+\sum_{i=n}^{+\infty}(X_{t_{i}} -X_{t_{i+1}}),

where the above sums are actually finite.

Therefore,
| X_t - X_s |
\le C  2^{-\gamma n}+ 2 \sum_{k=n}^{+\infty} C 2^{-\gamma(k+1)}
\le 2C \sum_{k=n}^{+\infty} 2^{- \gamma k}
\le \frac{2C}{1-2^{-\gamma}} 2^{-\gamma n}
\le \frac{2^{1+\gamma}C}{1-2^{-\gamma}} \mid t-s \mid^{\gamma},

where we used 2^{-n} \le 2 \mid t -s \mid in the last step. Hence the paths of X_{/\Omega^*} are \gamma-Hölder on the set \mathcal{D}. For \omega \in \Omega^*, let t\rightarrow \tilde{X}_t (\omega) be the unique continuous function that agrees with t\rightarrow X_t (\omega) on \mathcal{D}. For \omega \notin \Omega^*, we set \tilde{X}_t (\omega)=0. The process (\tilde{X}_t)_{t \in [0,1]} is the desired modification of (X_t)_{t \in [0,1]}. \square
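As an illustration of how the theorem is applied, consider (anticipating the construction of Brownian motion in a later lecture) a process (X_t)_{t \in [0,1]} whose increments X_t-X_s are centered Gaussian random variables with variance \mid t-s \mid. For every integer p \ge 1, the Gaussian moment formula gives

\mathbb{E} \left( \mid X_t - X_s \mid^{2p} \right)=\frac{(2p)!}{2^p p!} \mid t-s \mid^{p},

so the theorem applies with \alpha=2p and \varepsilon=p-1: such a process admits a modification whose paths are \gamma-Hölder for every \gamma \in [0, \frac{p-1}{2p}). Letting p \rightarrow +\infty, we obtain \gamma-Hölder paths for every \gamma < \frac{1}{2}.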


Lecture 5. The Daniell-Kolmogorov existence theorem

The Daniell-Kolmogorov extension theorem is one of the first deep theorems of the theory of stochastic processes. It provides existence results for nice probability measures on path (function) spaces. It is however non-constructive and relies on the axiom of choice. In what follows, in order to avoid heavy notations, we restrict ourselves to the one-dimensional case d=1. The multidimensional extension is straightforward and left to the reader.

Definition. Let (X_t)_{t \ge 0} be a stochastic process. For t_1 , ... , t_n \in \mathbb{R}_{\ge 0} we denote by \mu_{t_1,...,t_n} the probability distribution of the random variable (X_{t_1},...,X_{t_n}). It is therefore a probability measure on \mathbb{R}^n. This probability measure is called a finite dimensional distribution of the process (X_t)_{t \ge 0}.

If two processes have the same finite dimensional distributions, then it is clear that the two processes induce the same distribution on the path space \mathcal{A} (\mathbb{R}_{\ge 0}, \mathbb{R}) because cylinders generate the \sigma-algebra \mathcal{T} (\mathbb{R}_{\ge 0}, \mathbb{R}) (see Lecture 2).

The finite dimensional distributions of a given process satisfy the two following properties: If t_1,...,t_n \in \mathbb{R}_{\ge 0} and if \tau is a permutation of the set \{1,...,n\}, then:

  • \mu_{t_1,...,t_n} (A_1 \times ... \times A_n)=\mu_{t_{\tau (1)},...,t_{\tau (n)}} (A_{\tau(1)} \times ... \times A_{\tau(n)}),\quad A_i \in \mathcal{B}(\mathbb{R}).
  • \mu_{t_1,...,t_n} (A_1 \times ... \times A_{n-1} \times \mathbb{R})=\mu_{t_1,...,t_{n-1}} (A_1 \times ... \times A_{n-1}),\quad A_i \in \mathcal{B}(\mathbb{R}).

Conversely,

Theorem (Daniell-Kolmogorov theorem). Assume that we are given for every t_1,...,t_n \in \mathbb{R}_{\ge 0} a probability measure \mu_{t_1,...,t_n} on \mathbb{R}^n. Let us assume that these probability measures satisfy:

  • \mu_{t_1,...,t_n} (A_1 \times ... \times A_n)=\mu_{t_{\tau (1)},...,t_{\tau (n)}} (A_{\tau(1)} \times ... \times A_{\tau(n)}),\quad A_i \in \mathcal{B}(\mathbb{R}).
  • \mu_{t_1,...,t_n} (A_1 \times ... \times A_{n-1} \times \mathbb{R})=\mu_{t_1,...,t_{n-1}} (A_1 \times ... \times A_{n-1}),\quad A_i \in \mathcal{B}(\mathbb{R}).

Then, there is a unique probability measure \mu on (\mathcal{A}(\mathbb{R}_{\ge 0}, \mathbb{R}), \mathcal{T}(\mathbb{R}_{\ge 0},\mathbb{R})) such that for t_1,...,t_n \in \mathbb{R}_{\ge 0}, A_1,...,A_n \in \mathcal{B}(\mathbb{R}):

\mu (\pi_{t_1} \in A_1,...,\pi_{t_n} \in A_n)=\mu_{t_1,...,t_n}(A_1 \times ... \times A_n).
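As a simple example of a family satisfying these assumptions, fix a probability measure \nu on \mathbb{R} and take \mu_{t_1,...,t_n}=\nu^{\otimes n}, the n-fold product measure. Both conditions are immediately checked, and the resulting measure \mu makes the coordinate maps \pi_t, t \ge 0, independent and identically distributed with common law \nu.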

The Daniell-Kolmogorov theorem is often used to construct processes thanks to the following corollary:

Corollary. Assume that we are given for every t_1,...,t_n \in \mathbb{R}_{\ge 0} a probability measure \mu_{t_1,...,t_n} on \mathbb{R}^n. Let us further assume that these measures satisfy the assumptions of the Daniell-Kolmogorov theorem. Then, there exists a probability space \left( \Omega , \mathcal{F}, \mathbb{P} \right) as well as a process (X_t)_{t \ge 0} defined on this space such that the finite dimensional distributions of (X_t)_{t \ge 0} are given by the \mu_{t_1,...,t_n}'s.

Proof of the corollary:

As a probability space, we choose

\left( \Omega , \mathcal{F}, \mathbb{P} \right)=\left( \mathcal{A} ( \mathbb{R}_{\ge 0} ,\mathbb{R}), \mathcal{T}(\mathbb{R}_{\ge 0},\mathbb{R}), \mu \right)

where \mu is the probability measure given by the Daniell-Kolmogorov theorem. The canonical coordinate process (\pi_t)_{t \ge 0} defined on \mathcal{A}( \mathbb{R}_{\ge 0} ,\mathbb{R}) by \pi_t (f)=f(t) satisfies the required property. \square

We now turn to the proof of the Daniell-Kolmogorov theorem. This proof proceeds in several steps.

As a first step, let us recall the Carathéodory extension theorem, which is often useful for the effective construction of measures (for instance, the construction of the Lebesgue measure on \mathbb{R}):

Theorem (Carathéodory theorem). Let \Omega be a non-empty set and let \mathcal{A} be a family of subsets of \Omega that satisfies:

  • \Omega \in \mathcal{A};
  • If A,B \in \mathcal{A}, A \cup B \in \mathcal{A};
  • If A \in \mathcal{A}, \Omega \backslash A \in \mathcal{A}.

Let \sigma ( \mathcal{A} ) be the \sigma-algebra generated by \mathcal{A}. If \mu_0 is a \sigma-additive and \sigma-finite measure on ( \Omega, \mathcal{A} ), then there exists a unique \sigma-additive measure \mu on ( \Omega, \sigma(\mathcal{A}) ) such that for A \in \mathcal{A}, \mu_0 (A)=\mu(A).

As a second step, we prove the following lemma:

Lemma. Let B_n \subset \mathbb{R}^n, n \in \mathbb{N} be a sequence of Borel sets that satisfy B_{n+1} \subset B_n \times \mathbb{R}. Let us assume that for every n \in \mathbb{N} a probability measure \mu_n is given on (\mathbb{R}^n, \mathcal{B} (\mathbb{R}^n)) and that these probability measures are compatible in the sense that

\mu_{n} (A_1 \times ... \times A_{n-1} \times \mathbb{R})=\mu_{n-1} (A_1 \times ... \times A_{n-1}),\quad A_i \in \mathcal{B}(\mathbb{R})

and satisfy, for every n \in \mathbb{N}:

\mu_n (B_n) > \varepsilon,

where 0<\varepsilon<1. There exists a sequence of compact sets K_n \subset \mathbb{R}^n, n \in \mathbb{N}, such that:

  • K_n \subset B_n
  • K_{n+1} \subset K_n \times \mathbb{R}.
  • \mu_n (K_n) \ge \frac{\varepsilon}{2}.

Proof of the lemma:

For every n, by inner regularity of the measure \mu_n, we can find a compact set K_n^* \subset \mathbb{R}^n such that

K^*_n \subset B_n

and

\mu_n (B_n \backslash K^*_n )\le \frac{\varepsilon}{2^{n+1}}.

Let now

K_n=(K^*_1 \times \mathbb{R}^{n-1} ) \cap ... \cap (K^*_{n-1} \times \mathbb{R}) \cap K^*_n.

It is easily checked that:

  • K_n \subset B_n
  • K_{n+1} \subset K_n \times \mathbb{R}

Moreover,

\mu_n (K_n)
=\mu_n (B_n) - \mu_n (B_n \backslash K_n)
=\mu_n (B_n)-\mu_n \left( B_n \backslash \left( (K^*_1 \times \mathbb{R}^{n-1} ) \cap ... \cap (K^*_{n-1} \times\mathbb{R}) \cap K^*_n \right) \right)
\ge \mu_n (B_n)-\mu_n \left( B_n \backslash \left( (K^*_1 \times \mathbb{R}^{n-1} )\right) \right)-...-\mu_n \left( B_n \backslash \left( K^*_{n-1} \times \mathbb{R}  \right) \right)-\mu_n (B_n \backslash K_n^*)
\ge \mu_n (B_n)-\mu_1 (B_1 \backslash K_1^*)-...-\mu_n (B_n \backslash K_n^*)
\ge \varepsilon-\frac{\varepsilon}{4}-...-\frac{\varepsilon}{2^{n+1}}
\ge \frac{\varepsilon}{2}.

\square

With this in hand, we can now turn to the proof of the Daniell-Kolmogorov theorem.

Proof of the Daniell-Kolmogorov theorem:

For the cylinder

\mathcal{C}_{t_1,...,t_n} (B)= \{ f \in \mathcal{A}(\mathbb{R}_{\ge 0},\mathbb{R}), (f(t_1) ,...,f(t_n) )\in B \}

where t_1,...,t_n \in  \mathbb{R}_{\ge 0} and where B is a Borel subset of \mathbb{R}^n, we define

\mu \left(\mathcal{C}_{t_1,...,t_n} (B)\right)=\mu_{t_1,...,t_n} (B).

Thanks to the assumptions on the \mu_{t_1,...,t_n}'s, it is easily seen that such a \mu is well defined and satisfies:

\mu \left( \mathcal{A}( \mathbb{R}_{\ge 0}, \mathbb{R}) \right)=1.

The set \mathcal{A} of all the possible cylinders \mathcal{C}_{t_1,...,t_n} (B) satisfies the assumptions of Carathéodory’s theorem. Therefore, in order to conclude, we have to show that \mu is \sigma-additive, that is, if \left( C_n \right)_{n \in \mathbb{N}} is a sequence of pairwise disjoint cylinders such that C=\cup_{n \in \mathbb{N}} C_n is a cylinder, then

\mu \left( C \right)=\sum_{n=0}^{+\infty} \mu (C_n).

This is the difficult part of the theorem. Since for N\in \mathbb{N},

\mu (C)=\mu \left( C \backslash \cup_{n =0}^N C_n \right)+\mu \left( \cup_{n =0}^N C_n \right),

we just have to show that

\lim_{N \rightarrow + \infty} \mu \left( D_N \right)=0,

where D_N=C \backslash \cup_{n =0}^N C_n.

The sequence (\mu(D_N))_{N\in \mathbb{N}} is nonnegative and decreasing, and therefore converges. Let us assume that it converges toward some \varepsilon >0. We shall prove that in that case

\cap_{N \in \mathbb{N}} D_N \neq \emptyset,

which is absurd, since \cap_{N \in \mathbb{N}} D_N = C \backslash \cup_{n \in \mathbb{N}} C_n=\emptyset.

Since each D_N is a cylinder, the sequence (D_N)_{N \in \mathbb{N}} only involves a countable sequence of times t_1<...<t_n<..., and we may assume (otherwise, we may insert additional sets into the sequence of the D_N's) that every D_N can be described as follows:

D_N =\{ f \in \mathcal{A} (\mathbb{R}_{\ge 0},\mathbb{R}), (f(t_1),...,f(t_N)) \in B_N \}

where B_n \subset \mathbb{R}^n, n \in \mathbb{N}, is a sequence of Borel sets such that

B_{n+1} \subset B_n \times \mathbb{R}.

Since we assumed \mu (D_N) \ge \varepsilon, we can use the previous lemma to construct a sequence of compact sets K_n \subset \mathbb{R}^n, n \in \mathbb{N}, such that:

  • K_n \subset B_n
  • K_{n+1} \subset K_n \times \mathbb{R}
  • \mu_{t_1,...,t_n} (K_n) \ge \frac{\varepsilon}{2}

Since K_n is non-empty, we pick (x_1^n,...,x_n^n) \in K_n.

The sequence (x_1^n)_{n\in \mathbb{N}} has a convergent subsequence (x_1^{j_1(n)})_{n\in \mathbb{N}} that converges toward some x_1 \in K_1. The sequence ((x_1^{j_1(n)},x_2^{j_1 (n)}))_{n\in \mathbb{N}} has a convergent subsequence that converges toward some (x_1,x_2) \in K_2. By pursuing this process, we obtain a sequence (x_n)_{n \in \mathbb{N}} such that for every n, (x_1,...,x_n) \in K_n.

The function f \in \mathcal{A} (\mathbb{R}_{\ge 0},\mathbb{R}) defined by f(t_i)=x_i (and f=0 elsewhere) then satisfies, for every N,

(f(t_1),...,f(t_N)) =(x_1,...,x_N) \in K_N \subset B_N,

so that f \in \cap_{N \in \mathbb{N}} D_N. This leads to the expected contradiction. Therefore, the sequence (\mu(D_N))_{N\in \mathbb{N}} converges toward 0, which implies the \sigma-additivity of \mu. \square

As has been stressed, the Daniell-Kolmogorov theorem is the basic tool to prove the existence of a stochastic process with given finite dimensional distributions. As an example, let us illustrate how it may be used to prove the existence of so-called Gaussian processes.

Definition. A real-valued stochastic process (X_t)_{t \ge 0} defined on (\Omega , \mathcal{F}, \mathbb{P}) is said to be a Gaussian process if all of its finite dimensional distributions are Gaussian.

If (X_t)_{t \ge 0} is a Gaussian process, its finite dimensional distributions can be characterized, through Fourier transform, by its mean function

m(t)=\mathbb{E} (X_t)

and its covariance function

R(s,t)=\mathbb{E} \left( (X_t -m(t)) (X_s -m(s)) \right).

We can observe that the covariance function R(s,t) is symmetric (R(s,t)=R(t,s)) and positive, that is for a_1,...,a_n \in \mathbb{R} and t_1,...,t_n \in  \mathbb{R}_{\ge 0},

\sum_{1 \le i,j \le n} a_i a_j R(t_i,t_j)
=\sum_{1 \le i,j \le n} a_i a_j \mathbb{E} \left( (X_{t_i} -m(t_i)) (X_{t_j}-m(t_j)) \right)
= \mathbb{E} \left( \left( \sum_{i=1}^n a_i (X_{t_i}-m(t_i)) \right)^2 \right) \ge 0.

Conversely, as an application of the Daniell-Kolmogorov theorem, we let the reader prove as an exercise the following proposition.

Proposition. Let m:\mathbb{R}_{\ge 0}  \rightarrow \mathbb{R} and let R: \mathbb{R}_{\ge 0} \times \mathbb{R}_{\ge 0} \rightarrow \mathbb{R} be a symmetric and positive function. There exists a probability space \left( \Omega , \mathcal{F}, \mathbb{P} \right) and a Gaussian process (X_t)_{t \ge 0} defined on it, whose mean function is m and whose covariance function is R.
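In practice, the Gaussian vector (X_{t_1},...,X_{t_n}) on a finite grid can be realized as L Z, where Z is a standard Gaussian vector and L L^{T} is a Cholesky factorization of the covariance matrix (R(t_i,t_j))_{i,j}. The sketch below computes this factorization for the arbitrary choices R(s,t)=\min(s,t) and the grid \{1,2,3\}.

```python
import math

# Cholesky factorization R = L L^T of the covariance matrix of
# R(s,t) = min(s,t) on the grid {1, 2, 3} (arbitrary choices).
times = [1.0, 2.0, 3.0]
R = [[min(s, t) for t in times] for s in times]

def cholesky(a):
    """Plain Cholesky factorization of a positive definite matrix."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(a[i][i] - s) if i == j else (a[i][j] - s) / l[j][j]
    return l

L = cholesky(R)
# for R(s,t) = min(s,t), the factor has unit lower triangular entries,
# which mirrors the independent-increments picture of the process
assert L == [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
```

A realization of the process on the grid is then obtained by drawing three independent standard Gaussian variables and applying L to them.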


Lecture 4. Filtrations

A stochastic process (X_t)_{t \ge 0} may also be seen as a random system evolving in time. This system carries some information. More precisely, if one observes the paths of a stochastic process up to a time t\ge 0, one is able to decide if an event

A \in \sigma( X_s,s\le t)

has occurred (here and in the sequel, \sigma( X_s,s\le t) denotes the smallest \sigma-field that makes all the random variables \left\{ (X_{t_1}, \cdots, X_{t_n} ), 0 \le t_1 \le \cdots \le t_n \le t \right\} measurable). This notion of information carried by a stochastic process is modeled by filtrations.

Definition. Let (\Omega, \mathcal{F},\mathbb{P}) be a probability space. A filtration (\mathcal{F}_t)_{t \ge 0} is a non-decreasing family of sub-\sigma-algebras of \mathcal{F}.

As a basic example, if (X_t)_{t \ge 0} is a stochastic process defined on (\Omega, \mathcal{F},\mathbb{P}), then

\mathcal{F}_t=\sigma( X_s,s\le t)

is a filtration. This filtration is called the natural filtration of the process X and will often be denoted by (\mathcal{F}^X_t)_{t \ge 0}.

Definition. A stochastic process (X_t)_{t \ge 0} is said to be adapted to a filtration (\mathcal{F}_t)_{t \ge 0} if for every t \ge 0, the random variable X_t is measurable with respect to \mathcal{F}_t.

Of course, a stochastic process is always adapted to its natural filtration. We may observe that if a stochastic process (X_t)_{t \ge 0} is adapted to a filtration (\mathcal{F}_t)_{t \ge 0} and if \mathcal{F}_0 contains all the events of \mathcal{F} that have zero probability, then every process (\tilde{X}_t)_{t \ge 0} that satisfies

\mathbb{P}(\tilde{X}_t=X_t)=1, \quad t \ge 0,

is still adapted to the filtration (\mathcal{F}_t)_{t \ge 0}.

We previously defined the notion of measurability for a stochastic process. In order to take into account the dynamic aspect associated with a filtration, the notion of progressive measurability is needed.

Definition. A stochastic process (X_t)_{t \ge 0} that is adapted to a filtration (\mathcal{F}_t)_{t \ge 0}, is said to be progressively measurable with respect to the filtration (\mathcal{F}_t)_{t \ge 0} if for every t \ge 0,

\forall A \in \mathcal{B}(\mathbb{R}), \{(s,\omega) \in [0,t] \times \Omega, X_s (\omega) \in A \} \in \mathcal{B}([0,t])\otimes \mathcal{F}_t.

By using the diagonal method, it is possible to construct adapted but not progressively measurable processes. However, the next proposition, whose proof is left as an exercise to the reader, shows that an adapted and continuous stochastic process is automatically progressively measurable.

Proposition. A continuous stochastic process (X_t)_{t \ge 0}, that is adapted with respect to a filtration (\mathcal{F}_t)_{t \ge 0}, is also progressively measurable with respect to it.


Lecture 3. Stochastic processes

Let (\Omega, \mathcal{F}, \mathbb{P}) be a probability space.

Definition. A (d-dimensional) stochastic process on (\Omega, \mathcal{F}, \mathbb{P}) is a family (X_t)_{t \ge 0} of \mathbb{R}^d-valued random variables that are measurable with respect to \mathcal{F}.

A stochastic process (X_t)_{t \ge 0} can also be seen as a map

X: \Omega \rightarrow \mathcal{A}( \mathbb{R}_{\ge 0}, \mathbb{R}^d), \quad \omega \rightarrow (t \rightarrow X_t (\omega)).

The maps t \rightarrow X_t (\omega) are called the paths of the process. The map

X:(\Omega, \mathcal{F}) \rightarrow (\mathcal{A}( \mathbb{R}_{\ge 0}, \mathbb{R}^d), \mathcal{T}( \mathbb{R}_{\ge 0}, \mathbb{R}^d))

is easily seen to be measurable, where \mathcal{A}( \mathbb{R}_{\ge 0}, \mathbb{R}^d) denotes the set of functions \mathbb{R}_{\ge 0} \to  \mathbb{R}^d endowed with the \sigma-field generated by the cylinders (see Lecture 2). The probability measure defined by

\mu (A)=\mathbb{P} (X^{-1}(A)), A \in \mathcal{T}(\mathbb{R}_{\ge 0}, \mathbb{R}^d)

is then called the law (or distribution) of (X_t)_{t \ge 0}.

For t \geq 0, we denote by \pi_t the map that transforms f \in \mathcal{A}( \mathbb{R}_{\ge 0}, \mathbb{R}^d) into f(t). The stochastic process (\pi_t)_{t \in \mathbb{R}_{\ge 0}}, which is defined on the probability space (\mathcal{A}( \mathbb{R}_{\ge 0},\mathbb{R}^d), \mathcal{T}( \mathbb{R}_{\ge 0}, \mathbb{R}^d),\mu), is called the canonical process associated with X. It is a process with distribution \mu.

Definition. A process (X_t)_{t \ge 0} is said to be measurable if the map

(t,\omega) \rightarrow X_t (\omega)

is measurable with respect to the \sigma-algebra \mathcal{B}(\mathbb{R}_{\ge 0} ) \otimes \mathcal{F} that is, if

\forall A \in \mathcal{B}(\mathbb{R}^d), \{ (t,\omega), X_t (\omega) \in A \} \in \mathcal{B}(\mathbb{R}_{\ge 0} ) \otimes \mathcal{F}.

\mathcal{B}(\mathbb{R}_{\ge 0} ) denotes here the Borel \sigma-field on \mathbb{R}_{\ge 0}.

The paths of a measurable process are, of course, measurable functions \mathbb{R}_{\ge 0} \rightarrow  \mathbb{R}^d.

Definition. If a process X takes its values in \mathcal{C}(\mathbb{R}_{\ge 0} ,  \mathbb{R}^d), that is if the paths of X are continuous functions, then we say that X is a continuous process.

If (X_t)_{t \ge 0} is a continuous process, then the map

X:(\Omega, \mathcal{F}) \rightarrow (\mathcal{C}(\mathbb{R}_{\ge 0} , \mathbb{R}^d), \mathcal{B}(\mathbb{R}_{\ge 0} , \mathbb{R}^d))

is measurable and the distribution of X is a probability measure on (\mathcal{C}(\mathbb{R}_{\ge 0} , \mathbb{R}^d), \mathcal{B}(\mathbb{R}_{\ge 0} , \mathbb{R}^d)). Moreover, a continuous process is measurable:

Proposition
A continuous stochastic process is measurable.

Proof.

Let (X_t)_{t \ge 0} be a continuous process. Let us first prove that if A is a Borel set in \mathbb{R}^d, then

\{ (t,\omega) \in [0,1]\times \Omega, X_t (\omega) \in A \} \in \mathcal{B}(\mathbb{R}_{\ge 0}) \otimes \mathcal{F}.

For n \in \mathbb{N}, let

X_t^n=X_{\frac{[2^n t]}{2^n}}, t \in [0,1],

where [x] denotes the integer part of x. Since the paths of X^n are piecewise constant, the set \{ (t,\omega) \in [0,1]\times \Omega, X^n_t (\omega) \in A \} is a countable union of products of the form [\frac{k}{2^n},\frac{k+1}{2^n}) \times \{ X_{\frac{k}{2^n}} \in A \}, and thus we have

\{ (t,\omega) \in [0,1]\times \Omega, X^n_t (\omega) \in A \} \in \mathcal{B}(\mathbb{R}_{\ge 0} ) \otimes \mathcal{F}.

Moreover, for every t \in [0,1] and \omega \in \Omega, the continuity of the paths gives

\lim_{n \rightarrow +\infty} X^n_t (\omega)=X_t (\omega).

Since a pointwise limit of measurable maps is again measurable, this implies

\{ (t,\omega) \in [0,1]\times \Omega, X_t (\omega) \in A \} \in \mathcal{B}(\mathbb{R}_{\ge 0}) \otimes \mathcal{F}.

In the same way we obtain that \forall k \in \mathbb{N},

\{ (t,\omega) \in [k,k+1]\times \Omega, X_t (\omega) \in A \} \in \mathcal{B}(\mathbb{R}_{\ge 0} ) \otimes \mathcal{F}.

Observing

\{ (t,\omega) \in \mathbb{R}_{\ge 0} \times \Omega, X_t (\omega) \in A \}=\cup_{k \in \mathbb{N}} \{ (t,\omega) \in [k,k+1] \times \Omega, X_t (\omega) \in A \},
yields the desired conclusion. \square
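The dyadic discretization used in this proof is easy to illustrate numerically. The following Python sketch (the path t \mapsto \sin t is an arbitrary choice for illustration) evaluates X^n_t = X_{[2^n t]/2^n} and shows the pointwise convergence to X_t as n grows:

```python
import math

def dyadic_approx(path, t, n):
    """Evaluate X^n_t = X_{[2^n t]/2^n}, the piecewise-constant approximation."""
    return path(math.floor(2**n * t) / 2**n)

# An arbitrary continuous path t -> sin(t): the approximations
# converge pointwise to the true value as n -> +infinity.
t = 0.7
errors = [abs(dyadic_approx(math.sin, t, n) - math.sin(t)) for n in range(1, 25)]
print(errors[0], errors[-1])  # the error shrinks as n grows
```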

Posted in Stochastic Calculus lectures

Lecture 2. Measure theory in function spaces

Stochastic processes can be seen as random variables taking their values in a function space. It is therefore important to understand the naturally associated \sigma-algebras.

Let \mathcal{A}(\mathbb{R}_{\ge 0}, \mathbb{R}^d), d \ge 1, be the set of functions \mathbb{R}_{\ge 0} \rightarrow \mathbb{R}^d. We denote by \mathcal{T}(\mathbb{R}_{\ge 0},\mathbb{R}^d) the \sigma-algebra generated by the so-called cylindrical sets

\{ f \in \mathcal{A}(\mathbb{R}_{\ge 0}, \mathbb{R}^d), f(t_1) \in  I_1,...,f(t_n) \in I_n \}

where t_1,...,t_n \in \mathbb{R}_{\ge 0} and where I_1,...,I_n are products of intervals: I_i=\Pi_{k=1}^d (a^k_i,b^k_i].
As a \sigma-algebra \mathcal{T}(\mathbb{R}_{\ge 0},\mathbb{R}^d) is also generated by the following families:

  • \{ f \in \mathcal{A}(\mathbb{R}_{\ge 0}, \mathbb{R}^d ), f(t_1) \in  B_1,...,f(t_n) \in B_n \}  where t_1,...,t_n \in \mathbb{R}_{\ge 0} and where B_1,...,B_n are Borel sets in \mathbb{R}^d.
  • \{ f \in \mathcal{A}(\mathbb{R}_{\ge 0}, \mathbb{R}^d),  (f(t_1),...,f(t_n)) \in B \}  where t_1,...,t_n \in \mathbb{R}_{\ge 0} and where B is a Borel set in (\mathbb{R}^{d})^{n}.

Exercise: Show that the following sets are not in \mathcal{T}  ([0,1],\mathbb{R}):

  • \{ f \in \mathcal{A}([0,1], \mathbb{R}), \sup_{t\in [0,1]} f(t) <1  \}
  • \{ f \in \mathcal{A}([0,1], \mathbb{R}), \exists t\in [0,1], f(t)=0 \}

The above exercise shows that the \sigma-algebra \mathcal{T}(\mathbb{R}_{\ge 0},\mathbb{R}^d) is not rich enough to contain natural events; this is due to the fact that the space \mathcal{A}(\mathbb{R}_{\ge 0}, \mathbb{R}^d) is far too big.

In these lectures, we shall mainly be interested in processes with continuous paths. In that case, we use the space of continuous functions \mathcal{C}(\mathbb{R}_{\ge 0},  \mathbb{R}^d) endowed with the \sigma-algebra \mathcal{B}(\mathbb{R}_{\ge 0},\mathbb{R}^d) that is generated by

\{ f \in \mathcal{C}(\mathbb{R}_{\ge 0}, \mathbb{R}^d), f(t_1) \in  I_1,...,f(t_n) \in I_n \}

where

t_1,...,t_n \in \mathbb{R}_{\ge 0}

and where I_1,...,I_n are products of intervals \Pi_{k=1}^d (a^k_i,b^k_i]. This \sigma-algebra enjoys nice properties. It is for instance generated by the open sets of the (metric) topology of uniform convergence on compact sets.

Proposition
The \sigma-algebra \mathcal{B} ( \mathbb{R}_{\ge 0}, \mathbb{R}^d) is generated by the open sets of the topology of uniform convergence on compact sets.

Proof:

We give the proof in dimension d=1 and let the reader adapt it to higher dimensions. Let us first recall that, on \mathcal{C}(\mathbb{R}_{\ge 0},\mathbb{R}), the topology of uniform convergence on compact sets is given by the distance

d(f,g)=\sum_{n=1}^{+\infty} \frac{1}{2^n} \min (\sup_{0 \le t \le n} \mid f(t) -g(t) \mid ,1).

This distance endows \mathcal{C}(\mathbb{R}_{\ge 0}, \mathbb{R}) with the structure of a complete, separable, metric space (that is of a Polish space). Let us denote by \mathcal{O} the \sigma-field generated by the open sets of this metric space.

First, it is clear that the cylinders

\{ f \in \mathcal{C} (\mathbb{R}_{\ge 0}, \mathbb{R}), f(t_1) < a_1,...,f(t_n) < a_n \}

are open sets that generate \mathcal{B} (\mathbb{R}_{\ge 0},\mathbb{R}). Thus, we have

\mathcal{B} (\mathbb{R}_{\ge 0}, \mathbb{R}) \subset \mathcal{O}.

On the other hand, since for every g \in \mathcal{C}(\mathbb{R}_{\ge 0}, \mathbb{R}), n \in \mathbb{N}, n \geq 1 and \rho >0,

\{ f \in \mathcal{C}(\mathbb{R}_{\ge 0}, \mathbb{R}), \sup_{0 \le t \le  n} \mid f(t) -g(t) \mid \le \rho \}

=\cap_{t \in \mathbb{Q}, 0 \le t\le n} \{ f \in \mathcal{C}(\mathbb{R}_{\ge 0}, \mathbb{R}), \mid f(t)-g(t) \mid \le \rho \},

we deduce that

\{ f \in \mathcal{C}(\mathbb{R}_{\ge 0}, \mathbb{R}), \sup_{0 \le t \le  n} \mid f(t) -g(t) \mid \le \rho \} \in \mathcal{B} (\mathbb{R}_{\ge 0},  \mathbb{R}).

Since \mathcal{O} is generated by the above sets (the space being separable, every open set is a countable union of such balls), this implies

\mathcal{O} \subset \mathcal{B} (\mathbb{R}_{\ge 0}, \mathbb{R})

and concludes the proof. \square
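The distance recalled in this proof is straightforward to approximate numerically. In the Python sketch below (a numerical approximation only, with arbitrary test functions), the series is truncated after n_terms terms, so the tail contributes at most 2^{-n_terms}, and each supremum over [0, n] is approximated on a finite grid:

```python
import math

def dist(f, g, n_terms=30, grid_points=1000):
    """Approximate d(f,g) = sum_{n>=1} 2^{-n} min(sup_{0<=t<=n} |f(t)-g(t)|, 1)."""
    total = 0.0
    for n in range(1, n_terms + 1):
        # Approximate the supremum over [0, n] on a finite grid.
        sup = max(abs(f(n * k / grid_points) - g(n * k / grid_points))
                  for k in range(grid_points + 1))
        total += min(sup, 1.0) / 2**n
    return total

# d(f, f) = 0, and uniformly close functions are close for d:
print(dist(math.sin, math.sin))                       # 0.0
print(dist(math.sin, lambda t: math.sin(t) + 0.001))  # about 0.001
```

Note that the weights 2^{-n} make the distance small whenever two functions are uniformly close on a large initial interval, which is exactly uniform convergence on compact sets.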

Exercise.
Show that the following sets are in \mathcal{B} ([0,1],\mathbb{R}):

  • \{ f \in \mathcal{C}([0,1], \mathbb{R}), \sup_{t\in [0,1]} f(t) <1 \}
  • \{ f \in \mathcal{C}([0,1], \mathbb{R}), \exists t\in [0,1], f(t)=0 \}
Posted in Stochastic Calculus lectures

Lecture 1. Introduction to the Brownian motion

The first stochastic process that has been extensively studied is the Brownian motion, named in honor of the botanist Robert Brown (1773-1858), who observed and described in 1828 the random movement of particles suspended in a liquid or gas. One of the first mathematical studies of this process goes back to the mathematician Louis Bachelier (1870-1946), who in 1900 presented a stochastic modelling of the stock and option markets. But, mainly due to the lack of rigorous foundations of probability theory at that time, the seminal work of Bachelier was ignored for a long time by mathematicians. However, in his 1905 paper, Albert Einstein (1879-1955) brought this stochastic process to the attention of physicists by presenting it as a way to indirectly confirm the existence of atoms and molecules.

The rigorous mathematical study of stochastic processes really began with the mathematician Andrei Kolmogorov (1903-1987). His monograph, published in Russian in 1933, built up probability theory from fundamental axioms in a way comparable to Euclid’s treatment of geometry. From these axioms, Kolmogorov gave a precise definition of stochastic processes. His point of view stresses the fact that a stochastic process is nothing else but a random variable valued in a space of functions (or a space of curves). For instance, if an economist reads a financial newspaper because he is interested in the price of a barrel of oil over the last year, then he will focus on the curve of these prices. According to Kolmogorov’s point of view, saying that these prices form a stochastic process is equivalent to saying that the curve that is seen is the realization of a random variable defined on a suitable probability space. This point of view is mathematically quite deep and provides existence results for stochastic processes (the Daniell-Kolmogorov existence theorem) as well as pathwise regularity results (the Kolmogorov continuity theorem).

Joseph Doob (1910-2004) writes in his introduction to his famous book “Stochastic processes”:

[a stochastic process is] any process running along in time and controlled by probability laws…[more precisely] any family of random variables X_t [where] a random variable … is simply a measurable function…

Doob’s point of view, which is consistent with Kolmogorov’s and built on the work of Paul Lévy (1886-1971), is nowadays commonly given as the definition of a stochastic process. Relying on this point of view, which emphasizes the role of time, Doob’s work, developed during the 1940s and the 1950s, quickly became one of the most powerful tools available to study stochastic processes.

Let us now describe the seminal considerations of Bachelier. Let X_t denote the price at time t of a given asset on a financial market. We will assume that X_0=0 (otherwise, we work with X_t-X_0). The first observation is that the price X_t cannot be predicted with absolute certainty. It seems therefore reasonable to assume that X_t is a random variable defined on some probability space. One of the initial problems of Bachelier was to understand the distribution of prices at given times, that is, the distribution of the random vector (X_{t_1},...,X_{t_n}), where t_1,...,t_n are fixed.

The two following fundamental observations of Bachelier were based on empirical observations:

  1. If \tau is very small then, in absolute value, the price variation X_{t+\tau}-X_t is of order \sigma \sqrt{\tau} where \sigma is a positive parameter (nowadays called the volatility of the asset);
  2. The expectation of a speculator is always zero. Quoted and translated from Bachelier: It seems that the market, the aggregate of speculators, can believe in neither a market rise nor a market fall, since, for each quoted price, there are as many buyers as sellers. (Nowadays, a generalization of this principle is called the absence of arbitrage.)

Next, Bachelier assumes that for every t>0, X_t has a density with respect to the Lebesgue measure, let us say p(t,x). The two above observations imply that for \tau small,

p(t+\tau,x)=\frac{1}{2} p(t,x-\sigma \sqrt{\tau})+\frac{1}{2} p(t,x+\sigma \sqrt{\tau}).

Indeed, due to the first observation, if the price is x at time t+\tau, then at time t the price was equal to x-\sigma \sqrt{\tau} or to x+\sigma \sqrt{\tau}. According to the second observation, each of these cases occurs with probability \frac{1}{2}.
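Bachelier's two observations can be checked on a simulation. The Python sketch below (all parameter values are arbitrary choices) runs the binomial walk with steps \pm \sigma \sqrt{\tau}, each taken with probability 1/2, and verifies that at time t = n\tau the empirical mean is close to 0 and the empirical variance close to \sigma^2 t:

```python
import math
import random
import statistics

random.seed(0)
sigma = 2.0      # volatility (arbitrary value)
tau = 0.002      # time step (arbitrary value)
n_steps = 500    # so that t = n_steps * tau = 1.0
t = n_steps * tau

def final_price():
    # Binomial walk: each step is +/- sigma*sqrt(tau) with probability 1/2.
    x = 0.0
    for _ in range(n_steps):
        x += sigma * math.sqrt(tau) * (1 if random.random() < 0.5 else -1)
    return x

samples = [final_price() for _ in range(2000)]
print(statistics.mean(samples))      # close to 0 (second observation)
print(statistics.variance(samples))  # close to sigma^2 * t = 4.0 (first observation)
```

Each step has variance \sigma^2 \tau, so after n independent steps the exact variance is n \sigma^2 \tau = \sigma^2 t, in agreement with the Gaussian density derived below.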

Now Bachelier assumes that p(t,x) is regular enough and uses the following approximations coming from a Taylor expansion:

p(t+\tau,x) \simeq p(t,x)+\tau \frac{\partial p}{\partial t} (t,x)
p(t,x-\sigma \sqrt{\tau}) \simeq p(t,x)-\sigma \sqrt{\tau} \frac{\partial p}{\partial x} (t,x)+\frac{1}{2} \sigma^2 \tau \frac{\partial^2 p}{\partial x^2} (t,x)
p(t,x+\sigma \sqrt{\tau}) \simeq p(t,x)+\sigma \sqrt{\tau} \frac{\partial p}{\partial x} (t,x)+\frac{1}{2} \sigma^2 \tau \frac{\partial^2 p}{\partial x^2} (t,x).

This gives

\frac{\partial p}{\partial t}(t,x) =\frac{1}{2} \sigma^2 \frac{\partial^2 p}{\partial x^2} (t,x).

This is the so-called heat equation, which is the primary example of a diffusion equation. Explicit solutions to this equation are known and, using the fact that at time 0, p is the Dirac distribution at 0, one obtains:

p(t,x)=\frac{e^{-\frac{x^2}{2\sigma^2 t} } }{\sigma \sqrt{2 \pi t}}.
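One can check numerically, via finite differences, that this density indeed solves the heat equation above (a sketch; the sample point, step size and \sigma = 1 are arbitrary choices):

```python
import math

def p(t, x, sigma=1.0):
    """Gaussian density p(t,x) = exp(-x^2 / (2 sigma^2 t)) / (sigma sqrt(2 pi t))."""
    return math.exp(-x**2 / (2 * sigma**2 * t)) / (sigma * math.sqrt(2 * math.pi * t))

# Central finite differences at an arbitrary point (t, x) = (1, 0.5).
t, x, h = 1.0, 0.5, 1e-4
dp_dt = (p(t + h, x) - p(t - h, x)) / (2 * h)
d2p_dx2 = (p(t, x + h) - 2 * p(t, x) + p(t, x - h)) / h**2
residual = dp_dt - 0.5 * d2p_dx2
print(residual)  # essentially 0, up to discretization error
```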

This formula means that X_t has a Gaussian distribution with mean 0 and variance \sigma^2 t. Let now 0<t_1<...<t_n be fixed times and I_1,...,I_n be fixed intervals. In order to compute

\mathbb{P} (X_{t_1} \in I_1,...,X_{t_n} \in I_n)

the next step is to assume that the above analysis does not depend on the origin of time or, more precisely, that the best information available at time t is given by the price X_t. This leads first to the following computation

\mathbb{P} (X_{t_1} \in I_1,X_{t_2} \in I_2)
=\int_{I_1} \mathbb{P}(X_{t_2} \in I_2 \mid X_{t_1}=x_1) p(t_1,x_1) dx_1
=\int_{I_1} \mathbb{P}(X_{t_2-t_1} +x_1 \in I_2 \mid X_{t_1}=x_1) p(t_1,x_1) dx_1
=\int_{I_1 \times I_2} p(t_2-t_1,x_2-x_1) p(t_1,x_1) dx_1dx_2

which is easily generalized to

\mathbb{P} (X_{t_1} \in I_1,...,X_{t_n} \in I_n)
= \int_{I_1 \times \cdots \times I_n} p(t_n-t_{n-1},x_n-x_{n-1}) \cdots p(t_2-t_1,x_2-x_1) p(t_1,x_1) dx_1dx_2 \cdots dx_n.
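This formula is easy to test numerically. The Python sketch below (the times, intervals and \sigma = 1 are arbitrary choices) evaluates the double integral for n = 2 by a Riemann sum and compares it with a Monte Carlo estimate built from independent Gaussian increments:

```python
import math
import random

sigma = 1.0

def p(t, x):
    """Transition density: centered Gaussian with variance sigma^2 t."""
    return math.exp(-x**2 / (2 * sigma**2 * t)) / (sigma * math.sqrt(2 * math.pi * t))

t1, t2 = 1.0, 2.0
I1, I2 = (0.0, 1.0), (0.0, 2.0)

# Riemann (midpoint) sum for the double integral over I1 x I2.
m = 200
h1, h2 = (I1[1] - I1[0]) / m, (I2[1] - I2[0]) / m
prob = sum(p(t2 - t1, (I2[0] + (j + 0.5) * h2) - (I1[0] + (i + 0.5) * h1))
           * p(t1, I1[0] + (i + 0.5) * h1)
           for i in range(m) for j in range(m)) * h1 * h2

# Monte Carlo estimate using independent Gaussian increments.
random.seed(1)
N = 20000
hits = 0
for _ in range(N):
    x1 = random.gauss(0.0, sigma * math.sqrt(t1))
    x2 = x1 + random.gauss(0.0, sigma * math.sqrt(t2 - t1))
    if I1[0] <= x1 <= I1[1] and I2[0] <= x2 <= I2[1]:
        hits += 1

print(prob, hits / N)  # the two estimates roughly agree
```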

In many regards, the previous computations were heuristic rather than rigorous. One of the main issues is that the sequence of random variables X_t is not well defined from a mathematical point of view. In the next posts, we will provide a rigorous construction of this object studied by Bachelier, which is called a Brownian motion.

Posted in Stochastic Calculus lectures