Continuous Random Variables 1


Probability density function

So far we have been working with discrete random variables, whose possible values can be written down as a list. In this unit we will introduce continuous r.v.s, which can take on any real value in an interval (possibly of infinite length, such as $(0,\infty)$ or the entire real line).

First we'll look at properties of continuous r.v.s in general. Then we'll introduce three famous continuous distributions (the Uniform, Normal, and Exponential) which, in addition to having important stories in their own right, serve as building blocks for many other useful continuous distributions.

4_discretevscont.png

Recall that for a discrete r.v., the CDF jumps at every point in the support, and is flat everywhere else. In contrast, for a continuous r.v. the CDF increases smoothly; see the above figure for a comparison of discrete vs. continuous CDFs.

Definition: Continuous r.v.

An r.v. has a continuous distribution if its CDF is differentiable. We also allow there to be endpoints (or finitely many points) where the CDF is continuous but not differentiable, as long as the CDF is differentiable everywhere else. A continuous random variable is a random variable with a continuous distribution.

Definition: Probability Density Function

For a continuous r.v. $X$ with CDF $F$, the probability density function (PDF) of $X$ is the derivative $f$ of the CDF, given by $f(x) = F'(x)$. The support of $X$, and of its distribution, is the set of all $x$ where $f(x) > 0$.

An important way in which continuous r.v.s differ from discrete r.v.s is that for a continuous r.v. $X$, $P(X=x) = 0$ for all $x$. This is because $P(X=x)$ is the height of a jump in the CDF at $x$, but the CDF of $X$ has no jumps! Since the PMF of a continuous r.v. would just be 0 everywhere, we work with a PDF instead.

The PDF is analogous to the PMF in many ways, but there is a key difference: for a PDF $f$, the quantity $f(x)$ is not a probability, and in fact it is possible to have $f(x) > 1$ for some values of $x$. To obtain a probability, we need to integrate the PDF. The fundamental theorem of calculus tells us how to get from the PDF back to the CDF.

Proposition: PDF to CDF

Let $X$ be a continuous r.v. with PDF $f$. Then the CDF of $X$ is given by

$$F(x) = \int_{-\infty}^x f(t)\,dt.$$

Proof: By the definition of PDF, $F$ is an antiderivative of $f$. So by the fundamental theorem of calculus,

$$\int_{-\infty}^x f(t)\,dt = F(x) - F(-\infty) = F(x).$$
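As a rough numerical illustration of this proposition, the sketch below accumulates area under a density with a midpoint rule. The density $f(t) = 2t$ on $(0,1)$, whose CDF is $F(x) = x^2$ there, is an assumed example chosen for simplicity rather than one taken from the text, and the helper names are hypothetical.

```python
def pdf(t):
    """Illustrative density (an assumption): f(t) = 2t on (0, 1), 0 elsewhere."""
    return 2.0 * t if 0.0 < t < 1.0 else 0.0


def cdf_from_pdf(x, steps=10_000):
    """Approximate F(x) = integral of pdf from -infinity to x (midpoint rule)."""
    if x <= 0.0:
        return 0.0           # no area has accumulated yet
    upper = min(x, 1.0)      # the density is 0 outside (0, 1)
    h = upper / steps
    return sum(pdf((i + 0.5) * h) for i in range(steps)) * h


# Compare the accumulated area with the known CDF x^2 at a few points.
for x in (0.25, 0.5, 0.9):
    print(x, round(cdf_from_pdf(x), 6), x ** 2)
```

Note that this density has $f(0.9) = 1.8 > 1$, yet the accumulated area never exceeds 1.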

The above result is analogous to how we obtained the value of a discrete CDF at $x$ by summing the PMF over all values less than or equal to $x$; here we integrate the PDF over all values up to $x$, so the CDF is the accumulated area under the PDF. Since we can freely convert between the PDF and the CDF using the inverse operations of integration and differentiation, both the PDF and CDF carry complete information about the distribution of a continuous r.v.

Since the PDF determines the distribution, we should be able to use it to find the probability of $X$ falling into an interval $(a, b)$. A handy fact is that we can include or exclude the endpoints as we wish without altering the probability, since the endpoints have probability 0:

$$P(a < X < b) = P(a < X \leq b) = P(a \leq X < b) = P(a \leq X \leq b).$$

Warning: Including or Excluding Endpoints

We can be carefree about including or excluding endpoints as above for continuous r.v.s, but we must not be careless about this for discrete r.v.s.

By the definition of CDF and the fundamental theorem of calculus,

$$P(a < X \leq b) = F(b) - F(a) = \int_a^b f(x)\,dx.$$

Therefore, to find the probability of $X$ falling in the interval $(a, b]$ (or $(a, b)$, $[a, b]$, or $[a, b)$) using the PDF, we simply integrate the PDF from $a$ to $b$. Just as a valid PMF must be nonnegative and sum to 1, a valid PDF must be nonnegative and integrate to 1.
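The two routes to an interval probability can be compared numerically. The sketch below is only an illustration: it assumes an Expo(1) density, $f(x) = e^{-x}$ for $x > 0$, which is not defined in this section, and the helper `integrate` is a hypothetical midpoint-rule routine.

```python
import math


def pdf(x):
    """Expo(1) density (illustrative assumption): e^(-x) for x > 0."""
    return math.exp(-x) if x > 0 else 0.0


def cdf(x):
    """Expo(1) CDF: 1 - e^(-x) for x > 0."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


a, b = 0.5, 2.0
via_cdf = cdf(b) - cdf(a)        # F(b) - F(a)
via_pdf = integrate(pdf, a, b)   # area under the PDF from a to b
print(via_cdf, via_pdf)
```

The two numbers should agree up to the small error of the numerical integration.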

Theorem: Valid PDFs

The PDF $f$ of a continuous r.v. must satisfy the following two criteria:

  • Nonnegative: $f(x) \geq 0$;
  • Integrates to 1: $\int_{-\infty}^\infty f(x)\,dx = 1$.

Proof: The first criterion is true because probability is nonnegative; if $f(x_0)$ were negative, then we could integrate over a tiny region around $x_0$ and get a negative probability. Alternatively, note that the PDF at $x_0$ is the slope of the CDF at $x_0$, so $f(x_0) < 0$ would imply that the CDF is decreasing at $x_0$, which is not allowed. The second criterion is true since $\int_{-\infty}^\infty f(x)\,dx$ is the probability of $X$ falling somewhere on the real line, which is 1.
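The two criteria can be checked numerically for a candidate function. The sketch below is a hypothetical helper, not a proof: it only samples the function on a grid, and it assumes the candidate is 0 outside the interval passed in. The three candidates are made-up examples.

```python
def check_pdf(f, lo, hi, steps=10_000, tol=1e-6):
    """Check both criteria on a midpoint grid over (lo, hi).

    Assumes f is 0 outside (lo, hi). Returns (nonnegative, integrates_to_one).
    """
    h = (hi - lo) / steps
    values = [f(lo + (i + 0.5) * h) for i in range(steps)]
    nonnegative = all(v >= 0 for v in values)
    integrates_to_one = abs(sum(values) * h - 1.0) < tol
    return nonnegative, integrates_to_one


def tall(x):
    return 2.0 if 0 < x < 0.5 else 0.0         # height 2 > 1, but area 1


def half(x):
    return x if 0 < x < 1 else 0.0             # nonnegative, but area only 1/2


def dips(x):
    return 3 * x - 0.5 if 0 < x < 1 else 0.0   # area 1, but negative near 0


print(check_pdf(tall, 0, 0.5))
print(check_pdf(half, 0, 1))
print(check_pdf(dips, 0, 1))
```

Only `tall` passes both checks, even though its height exceeds 1; `half` fails the area criterion and `dips` fails nonnegativity.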

For practice, let's now look at a specific example of a PDF.

Example: Logistic

捕获.JPG

The following figure shows the Logistic PDF (left) and CDF (right). On the PDF, $P(-2 < X < 2)$ is represented by the shaded area; on the CDF, it is represented by the height of the curly brace. You can check that the properties of a valid PDF and CDF are satisfied.

4_logistic.png
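The shaded area and the brace height can be compared numerically. The formulas in the example are in a screenshot that is not reproduced as text here, so the sketch below assumes the standard Logistic CDF $F(x) = e^x/(1+e^x)$ and its derivative $f(x) = e^x/(1+e^x)^2$; the function names are hypothetical.

```python
import math


def logistic_cdf(x):
    """Assumed standard Logistic CDF: e^x / (1 + e^x), written as 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def logistic_pdf(x):
    """Derivative of the CDF above: e^x / (1 + e^x)^2."""
    return math.exp(x) / (1.0 + math.exp(x)) ** 2


def integrate(f, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


brace_height = logistic_cdf(2) - logistic_cdf(-2)   # read off the CDF
shaded_area = integrate(logistic_pdf, -2, 2)        # area under the PDF
print(brace_height, shaded_area)
```

Under these assumed formulas both values come out to roughly 0.76.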

Although the height of a PDF at $x$ does not represent a probability, it is closely related to the probability of falling into a tiny interval around $x$, as the following intuition explains.

Intuition

捕获.JPG
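The screenshot above is not reproduced as text. One common way to state this kind of intuition, assumed here rather than transcribed from the image, is that for small $\epsilon$, $P(x - \epsilon/2 < X < x + \epsilon/2) \approx f(x)\,\epsilon$. The sketch below checks this for an Expo(1) density, which is an illustrative choice and not part of this section.

```python
import math


def pdf(x):
    """Expo(1) density (illustrative assumption)."""
    return math.exp(-x) if x > 0 else 0.0


def cdf(x):
    """Expo(1) CDF."""
    return 1.0 - math.exp(-x) if x > 0 else 0.0


x = 1.0
ratios = []
for eps in (0.1, 0.01, 0.001):
    prob = cdf(x + eps / 2) - cdf(x - eps / 2)   # exact interval probability
    approx = pdf(x) * eps                        # density times interval width
    ratios.append(prob / approx)
    print(eps, prob, approx, prob / approx)
```

The ratio of the exact probability to $f(x)\,\epsilon$ moves closer to 1 as $\epsilon$ shrinks.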

Uniform

Intuitively, a Uniform r.v. on the interval $(a, b)$ is a completely random number between $a$ and $b$. We formalize the notion of "completely random" on an interval by specifying that the PDF should be constant over the interval.

Definition: Uniform Distribution

A continuous r.v. $U$ is said to have the Uniform distribution on the interval $(a,b)$ if its PDF is

$$f(x) = \left\{ \begin{array}{ll} \frac{1}{b-a} & \textrm{if } a < x < b, \\[2pt] 0 & \textrm{otherwise.} \end{array} \right.$$

We denote this by $U \sim \textrm{Unif}(a,b)$.

This is a valid PDF because the area under the curve is just the area of a rectangle with width $b-a$ and height $1/(b-a)$. The CDF is the accumulated area under the PDF:

$$F(x) = \left\{ \begin{array}{ll} 0 & \textrm{if } x \leq a, \\[2pt] \frac{x-a}{b-a} & \textrm{if } a < x < b, \\[2pt] 1 & \textrm{if } x \geq b. \end{array} \right.$$

The Uniform distribution that we will most frequently use is the $\textrm{Unif}(0,1)$ distribution, also called the standard Uniform. The $\textrm{Unif}(0,1)$ PDF and CDF are particularly simple: $f(x) = 1$ and $F(x) = x$ for $0 < x < 1$. The following figure shows the $\textrm{Unif}(0,1)$ PDF and CDF side by side.

4_Unifpdfcdf.png

For a general $\textrm{Unif}(a,b)$ distribution, the PDF is constant on $(a, b)$, and the CDF is ramp-shaped, increasing linearly from 0 to 1 as $x$ ranges from $a$ to $b$.
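The piecewise definitions of the Uniform PDF and CDF translate directly into code. The function names below are hypothetical, and the sketch assumes $a < b$.

```python
def unif_pdf(x, a, b):
    """PDF of Unif(a, b): constant 1/(b - a) on (a, b), 0 elsewhere."""
    return 1.0 / (b - a) if a < x < b else 0.0


def unif_cdf(x, a, b):
    """CDF of Unif(a, b): 0 up to a, a linear ramp on (a, b), 1 from b on."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


print(unif_pdf(2.0, 1, 4), unif_cdf(2.5, 1, 4))   # Unif(1, 4)
print(unif_pdf(0.3, 0, 1), unif_cdf(0.3, 0, 1))   # standard Uniform
```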

For Uniform distributions, probability is proportional to length.

Proposition

捕获.JPG
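The statement that probability is proportional to length can be checked by simulation. The sketch below is only an illustration under assumed values: it draws from $\textrm{Unif}(0, 10)$ with Python's `random.uniform`, and compares the fraction of draws in a subinterval $(c, d)$ with the ratio of lengths $(d-c)/(b-a)$. The seed, interval choices, and function names are arbitrary.

```python
import random

random.seed(0)                      # fixed seed so the run is repeatable
a, b = 0.0, 10.0
draws = [random.uniform(a, b) for _ in range(100_000)]


def empirical_prob(c, d):
    """Fraction of draws that land in the interval (c, d)."""
    return sum(c < u < d for u in draws) / len(draws)


def exact_prob(c, d):
    """Length of (c, d) over length of (a, b), assuming a <= c <= d <= b."""
    return (d - c) / (b - a)


for c, d in [(1, 3), (6, 8), (2, 7)]:
    print((c, d), empirical_prob(c, d), exact_prob(c, d))
```

The intervals $(1, 3)$ and $(6, 8)$ sit in different places but have the same length, so their estimated probabilities should be close to each other.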

Definition: Location-scale Transformation

Let $X$ be an r.v. and $Y = \sigma X + \mu$, where $\sigma$ and $\mu$ are constants with $\sigma > 0$. Then we say that $Y$ has been obtained as a location-scale transformation of $X$. Here $\mu$ controls how the location is changed and $\sigma$ controls how the scale is changed.

Warning

When a location-scale transformation is applied to $X \sim \textrm{Unif}(a,b)$, giving $Y = cX + d$ where $c$ and $d$ are constants with $c > 0$, $Y$ is a linear function of $X$ and Uniformity is preserved: $Y \sim \textrm{Unif}(ca+d, cb+d)$.

But if $Y$ is defined as a nonlinear transformation of $X$, then $Y$ will not be Uniform in general. For example, for $X \sim \textrm{Unif}(a,b)$ with $0 \leq a < b$, the transformed r.v. $Y = X^2$ has support $(a^2, b^2)$ but is not Uniform on that interval.
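A small simulation can make this warning concrete. The sketch below assumes $X \sim \textrm{Unif}(0,1)$ and arbitrary constants $c = 3$, $d = 1$; it compares empirical CDF values of $3X + 1$ and $X^2$ with what a Uniform distribution on the same support would give. The names are hypothetical.

```python
import random

random.seed(1)                                 # fixed seed for repeatability
n = 100_000
xs = [random.random() for _ in range(n)]       # X ~ Unif(0, 1)


def frac_at_most(values, t):
    """Empirical CDF: fraction of values that are <= t."""
    return sum(v <= t for v in values) / len(values)


# Linear case: Y = 3X + 1 should behave like Unif(1, 4), whose CDF at 2.5 is 0.5.
ys = [3 * x + 1 for x in xs]
linear_est = frac_at_most(ys, 2.5)

# Nonlinear case: Z = X^2 has support (0, 1), but P(Z <= 0.25) = P(X <= 0.5) = 0.5,
# whereas a Unif(0, 1) r.v. would give 0.25.
zs = [x * x for x in xs]
square_est = frac_at_most(zs, 0.25)

print(linear_est, square_est)
```

Both estimates should land near 0.5, which matches $\textrm{Unif}(1,4)$ in the linear case but contradicts Uniformity on $(0,1)$ in the squared case.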

In studying Uniform distributions, a useful strategy is to start with an r.v. that has the simplest Uniform distribution, figure things out in the friendly simple case, and then use a location-scale transformation to handle the general case.

Warning

When using location-scale transformations, the shifting and scaling should be applied to the random variables themselves, not to their PDFs. For example, let $U \sim \textrm{Unif}(0,1)$, so the PDF $f$ has $f(x) = 1$ on $(0,1)$ (and $f(x) = 0$ elsewhere). Then $3U+1 \sim \textrm{Unif}(1,4)$, but $3f+1$ is the function that equals 4 on $(0, 1)$ and 1 elsewhere, which is not a valid PDF since it does not integrate to 1.
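The difference between transforming the r.v. and transforming its PDF can be seen by integrating both functions. The sketch below uses a hypothetical midpoint-rule helper and an arbitrary window $(-5, 5)$; since $3f + 1$ equals 1 outside $(0,1)$, its area keeps growing as the window widens.

```python
def f(x):
    """PDF of U ~ Unif(0, 1)."""
    return 1.0 if 0 < x < 1 else 0.0


def g(x):
    """The wrong construction: shifting and scaling the PDF itself, 3f + 1."""
    return 3 * f(x) + 1


def pdf_of_3u_plus_1(x):
    """PDF of 3U + 1 ~ Unif(1, 4): height 1/3 on (1, 4)."""
    return 1.0 / 3.0 if 1 < x < 4 else 0.0


def integrate(func, a, b, steps=10_000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return sum(func(a + (i + 0.5) * h) for i in range(steps)) * h


area_wrong = integrate(g, -5, 5)                  # 4 on (0, 1) plus 1 on the other 9 units
area_right = integrate(pdf_of_3u_plus_1, -5, 5)   # rectangle of width 3 and height 1/3
print(area_wrong, area_right)
```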