Universality of the Uniform
In this section, we will discuss a remarkable property of the Uniform distribution: given a $\mathrm{Unif}(0,1)$ r.v., we can construct an r.v. with any continuous distribution we want. Conversely, given an r.v. with an arbitrary continuous distribution, we can create a $\mathrm{Unif}(0,1)$ r.v. We call this the universality of the Uniform, because it tells us the Uniform is a universal starting point for building r.v.s with other distributions. Universality of the Uniform also goes by many other names, such as the probability integral transform, inverse transform sampling, the quantile transformation, and even the fundamental theorem of simulation.
To keep the proofs simple, we will state the universality of the Uniform for a case where we know that the inverse of the desired CDF exists. More generally, similar ideas can be used to simulate a random draw from any desired CDF, as a function of a $\mathrm{Unif}(0,1)$ r.v.
Theorem: Universality of the Uniform
Let $F$ be a CDF which is a continuous function and strictly increasing on the support of the distribution. This ensures that the inverse function $F^{-1}$ exists, as a function from $(0,1)$ to $\mathbb{R}$. We then have the following results.
- Let $U \sim \mathrm{Unif}(0,1)$ and $X = F^{-1}(U)$. Then $X$ is an r.v. with CDF $F$.
- Let $X$ be an r.v. with CDF $F$. Then $F(X) \sim \mathrm{Unif}(0,1)$.
Let's make sure we understand what each part of the theorem is saying.
The first part of the theorem says that if we start with $U \sim \mathrm{Unif}(0,1)$ and a CDF $F$, then we can create an r.v. whose CDF is $F$ by plugging $U$ into the inverse CDF $F^{-1}$. Since $F^{-1}$ is a function (known as the quantile function), $U$ is a random variable, and a function of a random variable is a random variable, $F^{-1}(U)$ is a random variable; universality of the Uniform says its CDF is $F$.
The second part of the theorem goes in the reverse direction, starting from an r.v. $X$ whose CDF is $F$ and then creating a $\mathrm{Unif}(0,1)$ r.v. Again, $F$ is a function, $X$ is a random variable, and a function of a random variable is a random variable, so $F(X)$ is a random variable. Since any CDF is between $0$ and $1$ everywhere, $F(X)$ must take values between $0$ and $1$. Universality of the Uniform says that the distribution of $F(X)$ is Uniform on $(0,1)$.
Warning
The second part of universality of the Uniform involves plugging a random variable $X$ into its own CDF $F$. This may seem strangely self-referential, but it makes sense because $F$ is just a function (that satisfies the properties of a valid CDF), so $F(X)$ is a function of a random variable and hence is itself a random variable. There is a potential notational confusion, however: $F(x) = P(X \le x)$ by definition, but it would be incorrect to say "$F(X) = P(X \le X) = 1$". Rather, we should first find an expression for the CDF as a function of $x$, then replace $x$ with $X$ to obtain a random variable. For example, if the CDF of $X$ is $F(x) = 1 - e^{-x}$ for $x > 0$, then $F(X) = 1 - e^{-X}$.
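To see the second part of the theorem in action, here is a minimal simulation sketch using only Python's standard library. We draw from an $\mathrm{Expo}(1)$ distribution, whose CDF is $F(x) = 1 - e^{-x}$, and check that $F(X)$ behaves like a $\mathrm{Unif}(0,1)$ r.v.; the sample size and seed are arbitrary choices.

```python
import math
import random

random.seed(42)

# If X ~ Expo(1), its CDF is F(x) = 1 - e^(-x), so F(X) = 1 - e^(-X)
# should be Uniform on (0, 1) by the second part of the theorem.
samples = [random.expovariate(1.0) for _ in range(100_000)]
u = [1 - math.exp(-x) for x in samples]

# Sanity checks: a Unif(0,1) r.v. has mean 1/2, and P(U <= 0.25) = 0.25.
mean_u = sum(u) / len(u)
frac_below_quarter = sum(v <= 0.25 for v in u) / len(u)
print(round(mean_u, 2), round(frac_below_quarter, 2))
```

The same check works for any continuous, strictly increasing CDF; only the formula for $F$ changes.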
Proof: First let $U \sim \mathrm{Unif}(0,1)$ and $X = F^{-1}(U)$. For all real $x$,
$$P(X \le x) = P(F^{-1}(U) \le x) = P(U \le F(x)) = F(x),$$
using the fact that $F$ is increasing and that $P(U \le u) = u$ for $u \in (0,1)$. So the CDF of $X$ is $F$, as claimed. Conversely, let $X$ have CDF $F$ and let $Y = F(X)$. For $u \in (0,1)$,
$$P(Y \le u) = P(F(X) \le u) = P(X \le F^{-1}(u)) = F(F^{-1}(u)) = u,$$
so $Y \sim \mathrm{Unif}(0,1)$.
To gain more insight into what the quantile function and universality of the Uniform mean, let's consider an example that is familiar to millions of students: percentiles on an exam.
Example: Percentiles
Example: Universality with Logistic
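As a computational companion to the Logistic example, the sketch below implements inverse transform sampling for the Logistic distribution, whose CDF is $F(x) = \frac{e^x}{1+e^x}$ with quantile function $F^{-1}(u) = \log\frac{u}{1-u}$ (the logit). By the first part of the theorem, $F^{-1}(U)$ with $U \sim \mathrm{Unif}(0,1)$ should follow the Logistic distribution; the sample size, seed, and checkpoints are arbitrary choices.

```python
import math
import random

random.seed(0)

def logistic_cdf(x: float) -> float:
    """Logistic CDF: F(x) = e^x / (1 + e^x) = 1 / (1 + e^(-x))."""
    return 1 / (1 + math.exp(-x))

def logistic_quantile(u: float) -> float:
    """Quantile function F^{-1}(u) = log(u / (1 - u)), the logit."""
    return math.log(u / (1 - u))

# Part 1 of the theorem: if U ~ Unif(0,1), then F^{-1}(U) ~ Logistic.
u = [random.random() for _ in range(100_000)]
x = [logistic_quantile(v) for v in u]

# Compare the empirical CDF of the samples with the true Logistic CDF
# at a few arbitrary points; the two columns should nearly agree.
emp = {t: sum(xi <= t for xi in x) / len(x) for t in (-1.0, 0.0, 2.0)}
for t, e in emp.items():
    print(t, round(e, 3), round(logistic_cdf(t), 3))
```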
Normal
The Normal distribution is a famous continuous distribution with a bell-shaped PDF. It is extremely widely used in statistics because of a theorem, the central limit theorem, which says that under very weak assumptions, the sum of a large number of i.i.d. random variables has an approximately Normal distribution, regardless of the distribution of the individual r.v.s. This means we can start with independent r.v.s from almost any distribution, discrete or continuous, but once we add up a bunch of them, the distribution of the resulting r.v. looks like a Normal distribution.
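As an informal illustration of this (not part of the text's development), the sketch below sums i.i.d. $\mathrm{Unif}(0,1)$ r.v.s: a sum of $n$ of them has mean $n/2$ and variance $n/12$, and for moderately large $n$ the distribution of the sum is already close to Normal. The choice $n = 48$ (mean 24, variance 4) and the seed are arbitrary.

```python
import random

random.seed(1)

# A sum of n i.i.d. Unif(0,1) r.v.s has mean n/2 and variance n/12.
# The central limit theorem says the sum is approximately Normal.
n = 48  # mean 24, variance 4, standard deviation 2
sums = [sum(random.random() for _ in range(n)) for _ in range(20_000)]

# If the sums are approximately N(24, 4), about 95% of them should
# fall within 2 standard deviations (i.e., within 4) of 24.
mean = sum(sums) / len(sums)
within_2sd = sum(abs(s - 24) < 4 for s in sums) / len(sums)
print(round(mean, 1), round(within_2sd, 2))
```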
Definition: Standard Normal Distribution
A continuous r.v. $Z$ is said to have the standard Normal distribution if its PDF $\varphi$ is given by
$$\varphi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}, \quad -\infty < z < \infty.$$
We write this as $Z \sim \mathcal{N}(0,1)$.
The constant $\frac{1}{\sqrt{2\pi}}$ in front of the PDF may look surprising (why is something with $\pi$ needed in front of something with $e$, when there are no circles in sight?), but it turns out to be exactly what is needed to make the PDF integrate to 1. Such constants are called normalizing constants because they normalize the total area under the PDF to 1. The standard Normal CDF $\Phi$ is the accumulated area under the PDF:
$$\Phi(z) = \int_{-\infty}^{z} \varphi(t)\,dt = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\,dt.$$
We need to leave this as an integral: it turns out to be mathematically impossible to find a closed-form expression for the antiderivative of $e^{-z^2/2}$, meaning that we cannot express $\Phi$ as a finite sum of more familiar functions like polynomials or exponentials. But closed-form or no, it's still a well-defined function: if we give it an input $z$, it returns the accumulated area under the PDF from $-\infty$ up to $z$.
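Although the standard Normal CDF has no closed form, it can be evaluated numerically to high accuracy. One standard route, sketched here, uses the error function, which Python's `math` module provides, via the identity $\Phi(z) = \frac{1}{2}\bigl(1 + \mathrm{erf}(z/\sqrt{2})\bigr)$.

```python
import math

def Phi(z: float) -> float:
    """Standard Normal CDF, computed from the error function:
    Phi(z) = (1 + erf(z / sqrt(2))) / 2."""
    return (1 + math.erf(z / math.sqrt(2))) / 2

print(Phi(0))                # 0.5, by symmetry of the PDF about 0
print(round(Phi(1.96), 3))   # 0.975, a familiar critical value
```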
Notation
By convention, we use $\varphi$ for the standard Normal PDF and $\Phi$ for the standard Normal CDF. We will often use $Z$ to denote a standard Normal r.v. The standard Normal PDF and CDF are plotted in the following figure. The PDF is bell-shaped and symmetric about 0, and the CDF is S-shaped. These have the same general shape as the Logistic PDF and CDF that we saw in a couple of previous examples, but the Normal PDF decays to 0 much more quickly: notice that nearly all of the area under $\varphi$ is between $-3$ and $3$, whereas we had to go out to $-5$ and $5$ for the Logistic PDF.
There are several important symmetry properties that can be deduced from the standard Normal PDF and CDF.
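For instance, two such properties, which follow from the evenness of the PDF, are $\varphi(-z) = \varphi(z)$ and $\Phi(-z) = 1 - \Phi(z)$. A quick numerical spot-check (a sketch, with arbitrarily chosen test points, computing $\Phi$ via the error function):

```python
import math

def phi(z: float) -> float:
    """Standard Normal PDF."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def Phi(z: float) -> float:
    """Standard Normal CDF, via the error function."""
    return (1 + math.erf(z / math.sqrt(2))) / 2

# Symmetry of the PDF about 0, and the tail identity Phi(-z) = 1 - Phi(z).
for z in (0.5, 1.0, 2.3):
    assert math.isclose(phi(z), phi(-z))
    assert math.isclose(Phi(-z), 1 - Phi(z))
print("symmetry checks passed")
```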
The general Normal distribution has two parameters, denoted $\mu$ and $\sigma^2$, which are the mean and variance (the mean and variance of a distribution are measures of the average and how spread out the distribution is, respectively; these are defined and explored in the next unit). Starting with a standard Normal r.v. $Z \sim \mathcal{N}(0,1)$, we can convert to a Normal r.v. with any desired parameters $\mu$ and $\sigma^2$ by a location-scale transformation.
Definition: Normal Distribution
If $Z \sim \mathcal{N}(0,1)$, then
$$X = \mu + \sigma Z$$
is said to have the Normal distribution with mean parameter $\mu$ and variance parameter $\sigma^2$, for any real $\mu$ and $\sigma$ with $\sigma > 0$. We denote this by $X \sim \mathcal{N}(\mu, \sigma^2)$.
Of course, if we can get from $Z$ to $X$, then we can get from $X$ back to $Z$. The process of getting a standard Normal from a non-standard Normal is called, appropriately enough, standardization. For $X \sim \mathcal{N}(\mu, \sigma^2)$, the standardized version of $X$ is
$$\frac{X - \mu}{\sigma} \sim \mathcal{N}(0,1).$$
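A small simulation sketch of the location-scale transformation and its inverse: we build $X = \mu + \sigma Z$ from standard Normal draws and confirm that standardizing recovers (approximately) mean 0 and variance 1. The parameter values, sample size, and seed are arbitrary choices.

```python
import random

random.seed(7)

mu, sigma = 10.0, 3.0  # arbitrary example parameters

# Location-scale: X = mu + sigma * Z with Z ~ N(0, 1).
x = [mu + sigma * random.gauss(0, 1) for _ in range(50_000)]

# Standardization reverses the transformation: (X - mu) / sigma ~ N(0, 1),
# so the standardized sample should have mean near 0 and variance near 1.
z = [(xi - mu) / sigma for xi in x]
mean_z = sum(z) / len(z)
var_z = sum(v * v for v in z) / len(z) - mean_z ** 2
print(round(mean_z, 2), round(var_z, 2))
```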
We can use standardization to find the CDF and PDF of $X$ in terms of the standard Normal CDF and PDF.
Theorem: Normal CDF and PDF
Let $X \sim \mathcal{N}(\mu, \sigma^2)$. Then the CDF of $X$ is
$$F(x) = \Phi\!\left(\frac{x - \mu}{\sigma}\right),$$
and the PDF of $X$ is
$$f(x) = \varphi\!\left(\frac{x - \mu}{\sigma}\right)\frac{1}{\sigma}.$$
Proof: For the CDF, we standardize:
$$F(x) = P(X \le x) = P\!\left(\frac{X - \mu}{\sigma} \le \frac{x - \mu}{\sigma}\right) = \Phi\!\left(\frac{x - \mu}{\sigma}\right).$$
Then by the chain rule, the PDF is
$$f(x) = \frac{d}{dx}\,\Phi\!\left(\frac{x - \mu}{\sigma}\right) = \varphi\!\left(\frac{x - \mu}{\sigma}\right)\frac{1}{\sigma}.$$
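This result translates directly into code. The sketch below builds the $\mathcal{N}(\mu, \sigma^2)$ CDF and PDF from the standard Normal ones, computing the standard Normal CDF via the error function; the evaluation points are arbitrary.

```python
import math

def phi(z: float) -> float:
    """Standard Normal PDF."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)

def Phi(z: float) -> float:
    """Standard Normal CDF, via the error function."""
    return (1 + math.erf(z / math.sqrt(2))) / 2

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of N(mu, sigma^2): F(x) = Phi((x - mu) / sigma)."""
    return Phi((x - mu) / sigma)

def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """PDF of N(mu, sigma^2): f(x) = phi((x - mu) / sigma) / sigma."""
    return phi((x - mu) / sigma) / sigma

# At x = mu the CDF is exactly 1/2 and the PDF attains its maximum.
print(normal_cdf(5, 5, 2))             # 0.5
print(round(normal_pdf(5, 5, 2), 3))
```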
Three important benchmarks for the Normal distribution are the probabilities of falling within one, two, and three standard deviations of the mean parameter $\mu$. The 68-95-99.7 rule tells us that these probabilities are what the name suggests.
Theorem: 68-95-99.7 Rule
If $X \sim \mathcal{N}(\mu, \sigma^2)$, then
$$P(|X - \mu| < \sigma) \approx 0.68,$$
$$P(|X - \mu| < 2\sigma) \approx 0.95,$$
$$P(|X - \mu| < 3\sigma) \approx 0.997.$$
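By standardization, $P(|X - \mu| < k\sigma) = P(|Z| < k) = 2\Phi(k) - 1$, which does not depend on $\mu$ or $\sigma$. A quick numerical check of the three benchmarks, computing the standard Normal CDF via the error function:

```python
import math

def Phi(z: float) -> float:
    """Standard Normal CDF, via the error function."""
    return (1 + math.erf(z / math.sqrt(2))) / 2

# P(|X - mu| < k*sigma) = 2*Phi(k) - 1 for any mu and sigma.
for k in (1, 2, 3):
    print(k, round(2 * Phi(k) - 1, 4))
```

Rounded to four decimal places, the three probabilities are 0.6827, 0.9545, and 0.9973, matching the rule's name.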
Example
Let $X \sim \mathcal{N}(-1, 4)$. What is $P(|X| < 3)$, exactly (in terms of $\Phi$) and approximately?
Solution: Writing $|X| < 3$ as $-3 < X < 3$ and standardizing with $\mu = -1$ and $\sigma = 2$,
$$P(|X| < 3) = P\!\left(\frac{-3 - (-1)}{2} < \frac{X - (-1)}{2} < \frac{3 - (-1)}{2}\right) = \Phi(2) - \Phi(-1).$$
By the symmetry identity $\Phi(-1) = 1 - \Phi(1)$, this is $\Phi(2) + \Phi(1) - 1 \approx 0.9772 + 0.8413 - 1 = 0.8185$, so the probability is about $0.82$.
One more useful property of the Normal distribution is that the sum of independent Normals is Normal.
Theorem: Sum of Independent Normals
If $X \sim \mathcal{N}(\mu_1, \sigma_1^2)$ and $Y \sim \mathcal{N}(\mu_2, \sigma_2^2)$ are independent, then
$$X + Y \sim \mathcal{N}(\mu_1 + \mu_2,\ \sigma_1^2 + \sigma_2^2).$$
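A simulation sketch of this theorem, with arbitrarily chosen parameters and seed: if $X \sim \mathcal{N}(1, 4)$ and $Y \sim \mathcal{N}(-2, 9)$ are independent, the sum should behave like a draw from $\mathcal{N}(-1, 13)$.

```python
import random

random.seed(3)

# X ~ N(1, 2^2) and Y ~ N(-2, 3^2) independent; the theorem says
# X + Y ~ N(1 + (-2), 2^2 + 3^2) = N(-1, 13).
# random.gauss takes the standard deviation, not the variance.
s = [random.gauss(1, 2) + random.gauss(-2, 3) for _ in range(100_000)]

mean_s = sum(s) / len(s)
var_s = sum((v - mean_s) ** 2 for v in s) / len(s)
print(round(mean_s, 1), round(var_s, 1))
```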