## Abstract

This chapter introduces probability theory as a system of models, based on measure theory, of some real-world phenomena. The models are measure spaces of total measure 1 and usually have certain distinguished measurable functions defined on them.

Section 1 begins by establishing the measure-theoretic framework and a short dictionary for passing back and forth between terminology in measure theory and terminology in probability theory. The latter terminology includes events, random variables, mean, probability distribution of a random variable, and joint probability distribution of several random variables. An important feature of probability is that it is possible to work with random variables without any explicit knowledge of the underlying measure space, the joint probability distributions of random variables being the objects of importance.

Section 2 introduces conditional probability and uses it to motivate the mathematical definition of independence of events. In turn, independence of events leads naturally to a definition of independent random variables. Independent random variables are of great importance in the subject and play a much larger role than their counterparts in abstract measure theory. Examples at the end of the section indicate the extent to which functions of independent random variables can remain independent. The techniques in the examples are of use in the subject of statistical inference, which is introduced in Section 10.
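As a brief sketch of that motivation in standard notation (the symbols $P$, $A$, and $B$ are supplied here for illustration and are not taken from the chapter): for events $A$ and $B$ with $P(B) > 0$, the conditional probability of $A$ given $B$ is

$$
P(A \mid B) = \frac{P(A \cap B)}{P(B)} .
$$

Saying that knowledge of $B$ does not affect the probability of $A$ means $P(A \mid B) = P(A)$, which rearranges to

$$
P(A \cap B) = P(A)\,P(B) .
$$

The second formula makes sense even when $P(B) = 0$, and it is the form usually adopted as the definition of independence.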

Section 3 states and proves the Kolmogorov Extension Theorem, a foundational result allowing one to create stochastic processes involving infinite sets of times out of data corresponding to finite subsets of those times. A special case of the theorem provides the existence of infinite sets of independent random variables with specified probability distributions.

Section 4 establishes the celebrated Strong Law of Large Numbers, which says that the Cesàro means of a sequence of identically distributed independent random variables with finite mean converge almost everywhere to a constant random variable, the constant being the mean. The theorem is vaguely known to the general public and is widely misunderstood. The proof is based on Kolmogorov’s inequality.
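For orientation, the law can be written as follows in standard notation; the symbols $X_n$ and $\mu$ are assumptions introduced here and need not match the chapter's own. If $X_1, X_2, \dots$ are independent and identically distributed with finite mean $\mu = E(X_1)$, then

$$
\lim_{n \to \infty} \frac{X_1 + X_2 + \cdots + X_n}{n} = \mu \qquad \text{almost surely.}
$$

The statement concerns the limit of the running averages. It does not say that the sums $X_1 + \cdots + X_n$ themselves stay close to $n\mu$.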

Sections 5–8 provide background for the Central Limit Theorem, whose statement and proof are in Section 9. Section 5 discusses three successively weaker kinds of convergence for random variables: almost sure convergence, convergence in probability, and convergence in distribution. Convergence in distribution will be the appropriate kind for the Central Limit Theorem. Section 6 contains the Portmanteau Lemma, which gives some equivalent formulations of convergence in distribution. Section 7 introduces characteristic functions as Fourier transforms of probability distributions. Section 8 proves the Lévy Continuity Theorem, which formulates convergence in distribution in terms of characteristic functions.

Section 9 contains the statement and proof of the Central Limit Theorem, followed by some simple examples. This theorem is the most celebrated result in probability theory and has many applications in mathematics and other fields.
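In its most commonly quoted form, the theorem reads as follows; the notation $X_n$, $\mu$, and $\sigma$ is supplied here as an assumption, and the hypotheses used in the chapter may be stated differently. If $X_1, X_2, \dots$ are independent and identically distributed with mean $\mu$ and finite variance $\sigma^2 > 0$, then for every real $x$,

$$
\lim_{n \to \infty} P\!\left( \frac{X_1 + \cdots + X_n - n\mu}{\sigma \sqrt{n}} \le x \right)
= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-t^2/2} \, dt .
$$

In other words, the normalized sums converge in distribution to a standard normal random variable.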

Section 10 is a brief introduction to the subject of statistical inference, showing how the Central Limit Theorem plays a role in practice through the $t$ test of W. S. Gosset.

## Information

Digital Object Identifier: 10.3792/euclid/9781429799911-9