Continuous Random Variables 3


Exponential

The Exponential distribution is a continuous distribution that is widely used as a simple model for the waiting time for a certain kind of event to occur, e.g., the time until the next email arrives.

Definition: Exponential Distribution

A continuous r.v. $X$ is said to have the Exponential distribution with parameter $\lambda$, where $\lambda > 0$, if its PDF is

$$f(x) = \lambda e^{-\lambda x}, \quad x > 0.$$

We denote this by $X \sim \textrm{Expo}(\lambda)$.

The corresponding CDF is

$$F(x) = 1 - e^{-\lambda x}, \quad x > 0.$$
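As a quick check (this one-line derivation is mine, not in the original text), the CDF follows by integrating the PDF from 0 to $x$:

```latex
F(x) = \int_0^x \lambda e^{-\lambda u}\,du
     = \left[ -e^{-\lambda u} \right]_0^x
     = 1 - e^{-\lambda x}, \qquad x > 0.
```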

The $\textrm{Expo}(1)$ PDF and CDF are plotted in the following figure.

[Figure: Expo(1) PDF and CDF]

We've seen how all Uniform and Normal distributions are related to one another via location-scale transformations, and we might wonder whether the Exponential distribution allows this too. Exponential r.v.s are defined to have support $(0, \infty)$, and shifting would change the left endpoint. But scale transformations work nicely, and we can use scaling to get from the simple $\textrm{Expo}(1)$ to the general $\textrm{Expo}(\lambda)$: if $X \sim \textrm{Expo}(1)$, then

$$Y = \frac{X}{\lambda} \sim \textrm{Expo}(\lambda),$$

since

$$P(Y \leq y) = P\left(\frac{X}{\lambda} \leq y\right) = P(X \leq \lambda y) = 1 - e^{-\lambda y}, \quad y > 0.$$

Conversely, if $Y \sim \textrm{Expo}(\lambda)$, then $\lambda Y \sim \textrm{Expo}(1)$.

The Exponential distribution has a very special property called the memoryless property. If the waiting time for a certain event to occur is Exponential, then the memoryless property says that no matter how long you have waited so far, your additional waiting time is still Exponential (with the same parameter).
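This scale relation is easy to check by simulation. The sketch below is my own illustration (not from the text), using only the standard library; `random.expovariate(1.0)` draws from $\textrm{Expo}(1)$, and the rate `lam` is an arbitrary choice.

```python
import random

random.seed(0)
lam = 2.0          # arbitrary rate, chosen for illustration
n = 100_000

# Draw X ~ Expo(1) and scale: Y = X / lam should be Expo(lam).
ys = [random.expovariate(1.0) / lam for _ in range(n)]
mean_y = sum(ys) / n

# An Expo(lam) r.v. has mean 1/lam, so mean_y should be near 0.5 here.
print(mean_y)
```

The sample mean of the scaled draws lands near $1/\lambda$, as the $\textrm{Expo}(\lambda)$ distribution predicts.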

Definition: Memoryless Property

A distribution is said to have the memoryless property if a random variable $X$ from that distribution satisfies

$$P(X \geq s+t \mid X \geq s) = P(X \geq t)$$

for all $s, t > 0$.

Here $s$ represents the time you've already spent waiting; the definition says that after you've waited $s$ minutes, the probability you'll have to wait another $t$ minutes is exactly the same as the probability of having to wait $t$ minutes with no previous waiting time under your belt. Another way to state the memoryless property is that conditional on $X \geq s$, the additional waiting time $X - s$ is still distributed $\textrm{Expo}(\lambda)$.

Using the definition of conditional probability, we can directly verify that the Exponential distribution has the memoryless property. Let $X \sim \textrm{Expo}(\lambda)$. Then

$$P(X \geq s+t \mid X \geq s) = \frac{P(X \geq s+t)}{P(X \geq s)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = P(X \geq t).$$
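The same identity can be checked numerically. The sketch below (my own illustration, with arbitrary values of $\lambda$, $s$, and $t$) estimates $P(X \geq s+t \mid X \geq s)$ by conditioning a large sample and compares it with $e^{-\lambda t}$:

```python
import math
import random

random.seed(1)
lam, s, t = 1.5, 0.4, 0.7      # arbitrary rate and times, for illustration
n = 200_000

xs = [random.expovariate(lam) for _ in range(n)]

# Condition on X >= s, then see how often X >= s + t as well.
survived = [x for x in xs if x >= s]
cond_prob = sum(x >= s + t for x in survived) / len(survived)

# Memorylessness predicts this equals P(X >= t) = e^{-lam t}.
print(cond_prob, math.exp(-lam * t))
```

The two printed values agree to within simulation error, regardless of how large $s$ is.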

What are the implications of the memoryless property? If you're waiting at a bus stop and the time until the bus arrives has an Exponential distribution, then conditional on your having waited 30 minutes, the bus isn't due to arrive soon. The distribution simply forgets that you've been waiting for half an hour, and your remaining wait time is the same as if you had just shown up to the bus stop. If the lifetime of a machine has an Exponential distribution, then no matter how long the machine has been functional, conditional on having lived that long, the machine is as good as new: there is no wear-and-tear effect that makes the machine more likely to break down soon. If human lifetimes were Exponential, then conditional on having survived to the age of 80, your remaining lifetime would have the same distribution as that of a newborn baby!

Clearly, the memoryless property is not an appropriate description for human or machine lifetimes. Why then do we care about the Exponential distribution?

  1. Some physical phenomena, such as radioactive decay, truly do exhibit the memoryless property, so the Exponential is an important model in its own right.
  2. The Exponential distribution is well-connected to other named distributions. In the next section, we'll see how the Exponential and Poisson distributions can be united by a shared story.
  3. The Exponential serves as a building block for more flexible distributions, such as a distribution known as the Weibull, that allow for a wear-and-tear effect (where older units are due to break down) or a survival-of-the-fittest effect (where the longer you've lived, the stronger you get). To understand these distributions, we first have to understand the Exponential.

Poisson Processes

The Exponential distribution is closely connected to the Poisson distribution, as suggested by our use of $\lambda$ for the parameters of both distributions. In this section we will see that the Exponential and Poisson are linked by a common story, the Poisson process.

Definition: Poisson Process

A process of arrivals in continuous time is called a Poisson process with rate $\lambda$ if the following two conditions hold.

  1. The number of arrivals that occur in an interval of length $t$ is a $\textrm{Pois}(\lambda t)$ random variable.
  2. The numbers of arrivals that occur in disjoint intervals are independent of each other. For example, the numbers of arrivals in the intervals $(0, 10)$, $[10, 12)$, and $[15, \infty)$ are independent.

A sketch of a Poisson process is pictured in the following figure. Each X marks the spot of an arrival.

[Figure: sketch of a Poisson process]

For concreteness, suppose the arrivals are emails landing in an inbox according to a Poisson process with rate $\lambda$. There are several things we might want to know about this process. One question we could ask is: in one hour, how many emails will arrive? The answer comes directly from the definition, which tells us that the number of emails in an hour follows a $\textrm{Pois}(\lambda)$ distribution. Notice that the number of emails is a nonnegative integer, so a discrete distribution is appropriate.

But we could also flip the question around and ask: how long does it take until the first email arrives (measured relative to some fixed starting point)? The waiting time for the first email is a positive real number, so a continuous distribution on $(0, \infty)$ is appropriate. Let $T_1$ be the time until the first email arrives. To find the distribution of $T_1$, we just need to understand one crucial fact: saying that the waiting time for the first email is greater than $t$ is the same as saying that no emails have arrived between 0 and $t$. In other words, if $N_t$ is the number of emails that arrive at or before time $t$, then

$$T_1 > t \textrm{ is the same event as } N_t = 0.$$

We call this the count-time duality because it connects a discrete r.v., $N_t$, which counts the number of arrivals, with a continuous r.v., $T_1$, which marks the time of the first arrival.

If two events are the same, they have the same probability. Since $N_t \sim \textrm{Pois}(\lambda t)$ by the definition of a Poisson process,

$$P(T_1 > t) = P(N_t = 0) = \frac{e^{-\lambda t} (\lambda t)^0}{0!} = e^{-\lambda t}.$$

Therefore $P(T_1 \leq t) = 1 - e^{-\lambda t}$, so $T_1 \sim \textrm{Expo}(\lambda)$! The time until the first arrival in a Poisson process of rate $\lambda$ has an Exponential distribution with parameter $\lambda$.
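The count-time duality can also be seen numerically. In this sketch (my own illustration), the count $N_t$ is sampled with Knuth's multiplication method for Poisson variates, $T_1$ is sampled as an Exponential, and the two estimates of $P(N_t = 0) = P(T_1 > t)$ are compared against $e^{-\lambda t}$:

```python
import math
import random

random.seed(2)
lam, t = 3.0, 0.5              # arbitrary rate and time, for illustration
n = 100_000

def poisson(mu):
    """Sample from Pois(mu) via Knuth's multiplication method."""
    limit = math.exp(-mu)
    k, p = 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

# Estimate P(N_t = 0): fraction of runs with no arrivals in (0, t].
p_empty = sum(poisson(lam * t) == 0 for _ in range(n)) / n

# Estimate P(T_1 > t): fraction of Expo(lam) draws exceeding t.
p_late = sum(random.expovariate(lam) > t for _ in range(n)) / n

# Both should be close to e^{-lam t}.
print(p_empty, p_late, math.exp(-lam * t))
```

Both empirical probabilities cluster around $e^{-\lambda t}$, matching the identity derived above.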

What about $T_2 - T_1$, the time between the first and second arrivals? Since disjoint intervals in a Poisson process are independent by definition, the past is irrelevant once the first arrival occurs. Thus $T_2 - T_1$ is independent of the time until the first arrival, and by the same argument as before, $T_2 - T_1$ also has an Exponential distribution with rate $\lambda$. Similarly, $T_3 - T_2 \sim \textrm{Expo}(\lambda)$ independently of $T_1$ and $T_2 - T_1$. Continuing in this way, we deduce that all the interarrival times are i.i.d. $\textrm{Expo}(\lambda)$ random variables. To summarize what we've learned: in a Poisson process of rate $\lambda$,

  • the number of arrivals in an interval of length 1 is $\textrm{Pois}(\lambda)$, and
  • the times between arrivals are i.i.d. $\textrm{Expo}(\lambda)$.

Thus, Poisson processes tie together two important distributions, one discrete and one continuous, and the use of a common symbol $\lambda$ for both the Poisson and Exponential parameters is felicitous notation, for $\lambda$ is the arrival rate in the process that unites the two distributions.
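Both summary facts can be illustrated in one simulation (my own sketch, with an arbitrary rate): build the process from i.i.d. Exponential interarrival times, then check that the count of arrivals in a unit interval behaves like a Poisson variable, whose mean and variance both equal $\lambda$.

```python
import random

random.seed(3)
lam, n = 2.0, 50_000           # arbitrary rate, for illustration

counts = []
for _ in range(n):
    clock, k = 0.0, 0
    # Arrival times are partial sums of i.i.d. Expo(lam) interarrivals.
    while True:
        clock += random.expovariate(lam)
        if clock > 1.0:        # stop once we pass the unit interval
            break
        k += 1
    counts.append(k)

mean_count = sum(counts) / n
var_count = sum((c - mean_count) ** 2 for c in counts) / n

# For Pois(lam), mean and variance both equal lam = 2.0.
print(mean_count, var_count)
```

The sample mean and variance of the counts both land near $\lambda$, as the $\textrm{Pois}(\lambda)$ distribution requires.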

The story of the Poisson process provides intuition for the fact that the minimum of independent Exponential r.v.s is another Exponential r.v.

Example

Let $X_1, \dots, X_n$ be independent, with $X_j \sim \textrm{Expo}(\lambda_j)$. Let $L = \min(X_1, \dots, X_n)$. Show that $L \sim \textrm{Expo}(\lambda_1 + \dots + \lambda_n)$, and interpret this intuitively.

Solution:

We compute the survival function of $L$. Saying $L > t$ is the same as saying all of the $X_j$ exceed $t$, so by independence,

$$P(L > t) = P(X_1 > t, \dots, X_n > t) = P(X_1 > t) \cdots P(X_n > t) = e^{-\lambda_1 t} \cdots e^{-\lambda_n t} = e^{-(\lambda_1 + \dots + \lambda_n) t},$$

which is the survival function of the $\textrm{Expo}(\lambda_1 + \dots + \lambda_n)$ distribution. Hence $L \sim \textrm{Expo}(\lambda_1 + \dots + \lambda_n)$.

Intuitively, imagine $n$ independent Poisson processes, where the $j$th has rate $\lambda_j$ and $X_j$ is the time of its first arrival. Then $L$ is the time of the first arrival in the combined stream, which behaves like a single Poisson process whose rate is the sum $\lambda_1 + \dots + \lambda_n$; by the count-time duality, its first arrival time is $\textrm{Expo}(\lambda_1 + \dots + \lambda_n)$.
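A quick simulation corroborates the result (my own sketch; the rates are hypothetical choices):

```python
import random

random.seed(4)
lams = [0.5, 1.0, 2.5]         # hypothetical rates; total rate is 4.0
total = sum(lams)
n = 200_000

# L = min of independent Expo(lam_j) draws.
mins = [min(random.expovariate(l) for l in lams) for _ in range(n)]
mean_min = sum(mins) / n

# Expo(total) has mean 1 / total = 0.25.
print(mean_min)
```

The sample mean of the minima is close to $1/(\lambda_1 + \dots + \lambda_n)$, consistent with $L \sim \textrm{Expo}(\lambda_1 + \dots + \lambda_n)$.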