Probability density function
So far we have been working with discrete random variables, whose possible values can be written down as a list. In this unit we will introduce continuous r.v.s, which can take on any real value in an interval (possibly of infinite length, such as $(0, \infty)$ or the entire real line).
First we'll look at properties of continuous r.v.s in general. Then we'll introduce three famous continuous distributions---the Uniform, Normal, and Exponential---which, in addition to having important stories in their own right, serve as building blocks for many other useful continuous distributions.
Recall that for a discrete r.v., the CDF jumps at every point in the support, and is flat everywhere else. In contrast, for a continuous r.v. the CDF increases smoothly; see the above figure for a comparison of discrete vs. continuous CDFs.
Definition: Continuous r.v.
An r.v. has a continuous distribution if its CDF is differentiable. We also allow there to be endpoints (or finitely many points) where the CDF is continuous but not differentiable, as long as the CDF is differentiable everywhere else. A continuous random variable is a random variable with a continuous distribution.
Definition: Probability Density Function
For a continuous r.v. $X$ with CDF $F$, the probability density function (PDF) of $X$ is the derivative $f$ of the CDF, given by $f(x) = F'(x)$. The support of $X$, and of its distribution, is the set of all $x$ where $f(x) > 0$.
An important way in which continuous r.v.s differ from discrete r.v.s is that for a continuous r.v. $X$, $P(X = x) = 0$ for all $x$. This is because $P(X = x)$ is the height of a jump in the CDF at $x$, but the CDF of $X$ has no jumps! Since the PMF of a continuous r.v. would just be 0 everywhere, we work with a PDF instead.
The PDF is analogous to the PMF in many ways, but there is a key difference: for a PDF $f$, the quantity $f(x)$ is not a probability, and in fact it is possible to have $f(x) > 1$ for some values of $x$. To obtain a probability, we need to integrate the PDF. The fundamental theorem of calculus tells us how to get from the PDF back to the CDF.
Proposition: PDF to CDF
Let $X$ be a continuous r.v. with PDF $f$. Then the CDF of $X$ is given by
$$F(x) = \int_{-\infty}^{x} f(t)\,dt.$$
Proof: By the definition of PDF, $F$ is an antiderivative of $f$. So by the fundamental theorem of calculus,
$$\int_{-\infty}^{x} f(t)\,dt = F(x) - F(-\infty) = F(x).$$
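The passage from PDF to CDF can be checked numerically. The following is a minimal sketch, assuming a hypothetical example density $f(x) = 2x$ on $(0, 1)$ (not taken from the text) and a simple midpoint rule; the names `pdf` and `cdf_from_pdf` are illustrative. It also shows a density value above 1, which is allowed because $f(x)$ is not a probability.

```python
def pdf(x):
    """Hypothetical example PDF (an assumption): f(x) = 2x on (0, 1), 0 elsewhere."""
    return 2.0 * x if 0.0 < x < 1.0 else 0.0


def cdf_from_pdf(density, x, lower=0.0, steps=10_000):
    """Approximate F(x) as the integral of `density` from `lower` to x.

    Uses the midpoint rule. `lower` must lie at or below the left end of the
    support, so that no probability mass is missed.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    return sum(density(lower + (i + 0.5) * h) for i in range(steps)) * h


print(pdf(0.9))                 # a density value greater than 1
print(cdf_from_pdf(pdf, 0.5))   # close to F(0.5) = 0.5**2 = 0.25
print(cdf_from_pdf(pdf, 2.0))   # close to 1: all of the area is accumulated
```

For this density the exact CDF is $F(x) = x^2$ on $(0, 1)$, so the accumulated area can be compared against a closed form.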
The above result is analogous to how we obtained the value of a discrete CDF at $x$ by summing the PMF over all values less than or equal to $x$; here we integrate the PDF over all values up to $x$, so the CDF is the accumulated area under the PDF. Since we can freely convert between the PDF and the CDF using the inverse operations of integration and differentiation, both the PDF and CDF carry complete information about the distribution of a continuous r.v.
Since the PDF determines the distribution, we should be able to use it to find the probability of $X$ falling into an interval $(a, b)$. A handy fact is that we can include or exclude the endpoints as we wish without altering the probability, since the endpoints have probability 0:
$$P(a < X < b) = P(a < X \leq b) = P(a \leq X < b) = P(a \leq X \leq b).$$
Warning: Including or Excluding Endpoints
We can be carefree about including or excluding endpoints as above for continuous r.v.s, but we must not be careless about this for discrete r.v.s.
By the definition of CDF and the fundamental theorem of calculus,
$$P(a < X \leq b) = F(b) - F(a) = \int_{a}^{b} f(x)\,dx.$$
Therefore, to find the probability of $X$ falling in the interval $(a, b]$ (or $(a, b)$, $[a, b)$, or $[a, b]$) using the PDF, we simply integrate the PDF from $a$ to $b$. Just as a valid PMF must be nonnegative and sum to 1, a valid PDF must be nonnegative and integrate to 1.
Theorem: Valid PDFs
The PDF $f$ of a continuous r.v. must satisfy the following two criteria:
- Nonnegative: $f(x) \geq 0$;
- Integrates to 1: $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
Proof: The first criterion is true because probability is nonnegative; if $f(x_0)$ were negative, then we could integrate over a tiny region around $x_0$ and get a negative probability. Alternatively, note that the PDF at $x_0$ is the slope of the CDF at $x_0$, so $f(x_0) < 0$ would imply that the CDF is decreasing at $x_0$, which is not allowed. The second criterion is true since $\int_{-\infty}^{\infty} f(x)\,dx$ is the probability of $X$ falling somewhere on the real line, which is 1.
For practice, let's now look at a specific example of a PDF.
Example: Logistic
The following figure shows the Logistic PDF (left) and CDF (right). On the PDF, the probability of an interval is represented by the shaded area; on the CDF, it is represented by the height of the curly brace. You can check that the properties of a valid PDF and CDF are satisfied.
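The validity checks can also be done numerically. The sketch below assumes that "Logistic" refers to the standard Logistic distribution, with CDF $F(x) = e^x / (1 + e^x)$ and PDF $f(x) = e^x / (1 + e^x)^2$; the interval $(-2, 2)$ and the integration range $[-40, 40]$ are illustrative choices, and `integrate` is a plain midpoint rule.

```python
import math


def logistic_cdf(x):
    """Standard Logistic CDF, F(x) = e^x / (1 + e^x), written as 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def logistic_pdf(x):
    """Derivative of the CDF: f(x) = F(x) * (1 - F(x)) = e^x / (1 + e^x)^2."""
    F = logistic_cdf(x)
    return F * (1.0 - F)


def integrate(f, a, b, steps=200_000):
    """Midpoint-rule approximation of the integral of f from a to b."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


grid = [i / 10 for i in range(-400, 401)]

# Valid PDF: nonnegative, and total area (over a wide range) close to 1.
nonnegative = all(logistic_pdf(x) >= 0.0 for x in grid)
total_area = integrate(logistic_pdf, -40.0, 40.0)

# Valid CDF: nondecreasing along the grid.
nondecreasing = all(logistic_cdf(u) <= logistic_cdf(v) for u, v in zip(grid, grid[1:]))

# Interval probability two ways: shaded area under the PDF vs. rise in the CDF.
area = integrate(logistic_pdf, -2.0, 2.0)
rise = logistic_cdf(2.0) - logistic_cdf(-2.0)
print(nonnegative, nondecreasing, total_area, area, rise)
```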
Although the height of a PDF at $x$ does not represent a probability, it is closely related to the probability of falling into a tiny interval around $x$, as the following intuition explains.
Intuition
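One standard way to make this precise is sketched here, under the assumptions that $F$ is the CDF and $f$ the PDF of a continuous r.v. $X$, that $f$ is continuous at $x$, and that $\epsilon > 0$ is small (the symbol $\epsilon$ is notation introduced for this sketch):

$$P\left(x - \frac{\epsilon}{2} < X < x + \frac{\epsilon}{2}\right) = \int_{x - \epsilon/2}^{x + \epsilon/2} f(t)\,dt \approx \epsilon\, f(x).$$

In this sense $f(x)$ acts as a probability per unit length near $x$: multiplying it by the width of a tiny interval gives an approximate probability, which is why $f(x)$ itself may exceed 1.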
Uniform
Intuitively, a Uniform r.v. on the interval $(a, b)$ is a completely random number between $a$ and $b$. We formalize the notion of "completely random" on an interval by specifying that the PDF should be constant over the interval.
Definition: Uniform Distribution
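A continuous r.v. $U$ is said to have the Uniform distribution on the interval $(a, b)$ if its PDF is
$$f(x) = \begin{cases} \dfrac{1}{b-a} & \text{if } a < x < b, \\ 0 & \text{otherwise.} \end{cases}$$
We denote this by $U \sim \text{Unif}(a, b)$.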
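This is a valid PDF because the area under the curve is just the area of a rectangle with width $b - a$ and height $1/(b - a)$. The CDF is the accumulated area under the PDF:
$$F(x) = \begin{cases} 0 & \text{if } x \leq a, \\ \dfrac{x-a}{b-a} & \text{if } a < x < b, \\ 1 & \text{if } x \geq b. \end{cases}$$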
The Uniform distribution that we will most frequently use is the $\text{Unif}(0, 1)$ distribution, also called the standard Uniform. The PDF and CDF are particularly simple: $f(x) = 1$ and $F(x) = x$ for $0 < x < 1$. The following figure shows the PDF and CDF side by side.
For a general $\text{Unif}(a, b)$ distribution, the PDF is constant on $(a, b)$, and the CDF is ramp-shaped, increasing linearly from 0 to 1 as $x$ ranges from $a$ to $b$.
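The Uniform PDF and CDF translate directly into code. This is a minimal sketch; the function names `unif_pdf` and `unif_cdf` are illustrative, not part of any library.

```python
def unif_pdf(x, a, b):
    """PDF of Unif(a, b): constant 1/(b - a) on (a, b), 0 elsewhere."""
    if not a < b:
        raise ValueError("need a < b")
    return 1.0 / (b - a) if a < x < b else 0.0


def unif_cdf(x, a, b):
    """CDF of Unif(a, b): a ramp rising linearly from 0 at a to 1 at b."""
    if not a < b:
        raise ValueError("need a < b")
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


# Standard Uniform: f(x) = 1 and F(x) = x on (0, 1).
print(unif_pdf(0.5, 0.0, 1.0), unif_cdf(0.5, 0.0, 1.0))

# A general case: one quarter of the way from a = 2 to b = 6.
print(unif_pdf(3.0, 2.0, 6.0), unif_cdf(3.0, 2.0, 6.0))
```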
For Uniform distributions, probability is proportional to length.
Proposition
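Concretely, for $U \sim \text{Unif}(a, b)$ and a subinterval $(c, d)$ of $(a, b)$, integrating the constant PDF gives $P(c < U < d) = (d - c)/(b - a)$, so a subinterval twice as long has twice the probability. The following minimal simulation sketch illustrates this; the endpoints, sample size, and seed are arbitrary choices made for the example.

```python
import random


def interval_prob_exact(c, d, a, b):
    """P(c < U < d) for U ~ Unif(a, b), assuming a <= c < d <= b."""
    return (d - c) / (b - a)


def interval_prob_simulated(c, d, a, b, n=200_000, seed=0):
    """Monte Carlo estimate of P(c < U < d) from n draws of Unif(a, b)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if c < rng.uniform(a, b) < d)
    return hits / n


a, b = 2.0, 10.0
short = interval_prob_simulated(3.0, 4.0, a, b)   # subinterval of length 1
long_ = interval_prob_simulated(5.0, 7.0, a, b)   # subinterval of length 2
print(short, interval_prob_exact(3.0, 4.0, a, b))
print(long_, interval_prob_exact(5.0, 7.0, a, b))
```

The simulated frequencies should land near the exact values, with the longer subinterval receiving about twice the probability of the shorter one.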
Definition: Location-scale Transformation
Let $X$ be an r.v. and $Y = \sigma X + \mu$, where $\sigma$ and $\mu$ are constants with $\sigma > 0$. Then we say that $Y$ has been obtained as a location-scale transformation of $X$. Here $\mu$ controls how the location is changed and $\sigma$ controls how the scale is changed.
Warning
In a location-scale transformation, starting with $X \sim \text{Unif}(a, b)$ and transforming it to $Y = cX + d$ where $c$ and $d$ are constants with $c > 0$, $Y$ is a linear function of $X$ and Uniformity is preserved: $Y \sim \text{Unif}(ca + d, cb + d)$.
But if $Y$ is defined as a nonlinear transformation of $X$, then $Y$ will not be Uniform in general. For example, for $X \sim \text{Unif}(a, b)$ with $0 \leq a < b$, the transformed r.v. $Y = X^2$ has support $(a^2, b^2)$ but is not Uniform on that interval.
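A simulation makes the contrast visible. The sketch below assumes $X \sim \text{Unif}(1, 3)$, the location-scale transformation $2X + 1$, and $X^2$ as an illustrative nonlinear transformation; these specific numbers, the sample size, and the seed are choices made for the example. It uses the fact that a Uniform r.v. puts exactly half of its probability below the midpoint of its support.

```python
import random

rng = random.Random(0)
a, b = 1.0, 3.0
n = 200_000
xs = [rng.uniform(a, b) for _ in range(n)]   # draws of X ~ Unif(a, b)


def frac_below_midpoint(ys, lo, hi):
    """Fraction of the values ys that fall below the midpoint of (lo, hi)."""
    mid = (lo + hi) / 2
    return sum(1 for y in ys if y < mid) / len(ys)


c, d = 2.0, 1.0
linear = [c * x + d for x in xs]     # location-scale: support (ca + d, cb + d)
squared = [x * x for x in xs]        # nonlinear: support (a^2, b^2)

frac_linear = frac_below_midpoint(linear, c * a + d, c * b + d)
frac_squared = frac_below_midpoint(squared, a * a, b * b)
print(frac_linear)    # near 0.5, as Uniformity requires
print(frac_squared)   # noticeably above 0.5, so X^2 is not Uniform
```

For these choices the exact value for $X^2$ is $P(X^2 < 5) = (\sqrt{5} - 1)/2 \approx 0.618$, which differs from the $0.5$ that a Uniform distribution on $(1, 9)$ would give.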
In studying Uniform distributions, a useful strategy is to start with an r.v. that has the simplest Uniform distribution, figure things out in the friendly simple case, and then use a location-scale transformation to handle the general case.
Warning
When using location-scale transformations, the shifting and scaling should be applied to the random variables themselves, not to their PDFs. For example, let $U \sim \text{Unif}(0, 1)$, so the PDF $f$ has $f(x) = 1$ on $(0, 1)$ (and $f(x) = 0$ elsewhere). Then $3U + 1 \sim \text{Unif}(1, 4)$, but $3f + 1$ is the function that equals 4 on $(0, 1)$ and 1 elsewhere, which is not a valid PDF since it does not integrate to 1.
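To see the failure explicitly, here is a short worked calculation, with $g$ introduced here as a name for the PDF of $3U + 1$. The function $3f + 1$ equals 1 outside $(0, 1)$, so its total area is infinite:

$$\int_{-\infty}^{\infty} \bigl(3f(x) + 1\bigr)\,dx = 3\int_{-\infty}^{\infty} f(x)\,dx + \int_{-\infty}^{\infty} 1\,dx = 3 + \infty = \infty.$$

By contrast, the definition of the Uniform distribution gives the PDF of $3U + 1 \sim \text{Unif}(1, 4)$ directly, and it can be written in terms of $f$ by rescaling the argument and the height:

$$g(y) = \frac{1}{3}\, f\!\left(\frac{y - 1}{3}\right) = \begin{cases} \dfrac{1}{3} & \text{if } 1 < y < 4, \\ 0 & \text{otherwise,} \end{cases}$$

which is nonnegative and integrates to $3 \cdot \tfrac{1}{3} = 1$.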