Averages, Law of Large Numbers, and Central Limit Theorem 2


Geometric and Negative Binomial

We now introduce two more famous discrete distributions, the Geometric and Negative Binomial, and calculate their expected values.

Story: Geometric Distribution

Consider a sequence of independent Bernoulli trials, each with the same success probability $p \in (0,1)$, with trials performed until a success occurs. Let $X$ be the number of failures before the first successful trial. Then $X$ has the Geometric distribution with parameter $p$; we denote this by $X \sim \textrm{Geom}(p)$.

For example, if we flip a fair coin until it lands Heads for the first time, then the number of Tails before the first occurrence of Heads is distributed as $\textrm{Geom}(1/2)$. To get the Geometric PMF from the story, imagine the Bernoulli trials as a string of 0's (failures) ending in a single 1 (success). Each 0 has probability $q = 1-p$ and the final 1 has probability $p$, so a string of $k$ failures followed by one success has probability $q^k p$.

Theorem: Geometric PMF

If $X \sim \textrm{Geom}(p)$, then the PMF of $X$ is

$$P(X=k) = q^k p$$

for $k = 0, 1, 2, \dots$, where $q = 1-p$.

This is a valid PMF because

$$\sum_{k=0}^\infty q^k p = p \sum_{k=0}^\infty q^k = p \cdot \frac{1}{1-q} = 1.$$

Just as the binomial theorem shows that the Binomial PMF is valid, a geometric series shows that the Geometric PMF is valid!
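As a quick numerical sanity check on this PMF (a minimal simulation sketch, not from the text; `sim_geom_pmf` is a name of my own choosing), we can generate Geometric draws directly from the story and compare the empirical frequencies with $q^k p$:

```python
import random

def sim_geom_pmf(p, trials=100_000, seed=0):
    """Empirical PMF of Geom(p): count failures before the first success."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        k = 0
        while rng.random() >= p:  # failure, probability q = 1 - p
            k += 1
        counts[k] = counts.get(k, 0) + 1
    return {k: c / trials for k, c in counts.items()}

p, q = 0.5, 0.5
pmf_hat = sim_geom_pmf(p)
for k in range(4):
    print(k, round(pmf_hat.get(k, 0.0), 3), q**k * p)
```

With 100,000 trials the empirical frequencies should match $q^k p$ (here 0.5, 0.25, 0.125, 0.0625) to within a few thousandths.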

Warning

There are differing conventions for the definition of the Geometric distribution; some sources define the Geometric as the total number of trials, including the success. In our convention, the Geometric distribution excludes the success, and the First Success distribution includes the success.

Definition: First Success Distribution

In a sequence of independent Bernoulli trials with success probability $p$, let $Y$ be the number of trials until the first successful trial, including the success. Then $Y$ has the First Success distribution with parameter $p$; we denote this by $Y \sim \textrm{FS}(p)$.

It is easy to convert back and forth between the two conventions, but it is important to be careful about which one is being used. By definition, if $Y \sim \textrm{FS}(p)$ then $Y - 1 \sim \textrm{Geom}(p)$, and we can convert between the PMFs of $Y$ and $Y-1$ by writing $P(Y=k) = P(Y-1=k-1)$. Conversely, if $X \sim \textrm{Geom}(p)$, then $X+1 \sim \textrm{FS}(p)$.
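To make the shift concrete (a small illustrative simulation; `sample_fs` is a name of my own choosing), sampling $Y \sim \textrm{FS}(p)$ and shifting by 1 recovers the Geometric PMF:

```python
import random

rng = random.Random(1)
p, q, n = 0.3, 0.7, 100_000

def sample_fs():
    """Number of trials up to and including the first success."""
    y = 1
    while rng.random() >= p:  # keep going until a success occurs
        y += 1
    return y

ys = [sample_fs() for _ in range(n)]
# If Y ~ FS(p), then Y - 1 ~ Geom(p), so P(Y = k) should equal q^(k-1) * p.
for k in (1, 2, 3):
    print(k, round(ys.count(k) / n, 3), round(q ** (k - 1) * p, 3))
```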

Example: Geometric Expectation

Let $X \sim \textrm{Geom}(p)$. By the definition of expectation,

$$E(X) = \sum_{k=0}^\infty k q^k p.$$

To evaluate this sum, start from the geometric series $\sum_{k=0}^\infty q^k = \frac{1}{1-q}$ and differentiate both sides with respect to $q$, which gives $\sum_{k=0}^\infty k q^{k-1} = \frac{1}{(1-q)^2}$. Multiplying both sides by $pq$ then yields

$$E(X) = \frac{pq}{(1-q)^2} = \frac{pq}{p^2} = \frac{q}{p}.$$

Example: First Success Expectation

Let $Y \sim \textrm{FS}(p)$. Then $Y - 1 \sim \textrm{Geom}(p)$, so by linearity,

$$E(Y) = E(Y-1) + 1 = \frac{q}{p} + 1 = \frac{1}{p}.$$

Story: Negative Binomial Distribution

In a sequence of independent Bernoulli trials with success probability $p$, if $X$ is the number of failures before the $r$th success, then $X$ is said to have the Negative Binomial distribution with parameters $r$ and $p$, denoted $X \sim \textrm{NBin}(r,p)$. Both the Binomial and the Negative Binomial distributions are based on independent Bernoulli trials; they differ in the stopping rule and in what they are counting: the Binomial counts the number of successes in a fixed number of trials, while the Negative Binomial counts the number of failures until a fixed number of successes.

Theorem: Negative Binomial PMF

If $X \sim \textrm{NBin}(r,p)$, then the PMF of $X$ is

$$P(X=n) = {n+r-1 \choose r-1} p^r q^n$$

for $n = 0, 1, 2, \dots$, where $q = 1-p$.

Proof: Imagine the trials as a string of 0's and 1's, with 1's representing successes. The event $X = n$ says that the experiment ends after $n$ failures and $r$ successes, with the final trial a success. Any specific such string has probability $p^r q^n$, and there are ${n+r-1 \choose r-1}$ such strings, since the positions of the first $r-1$ successes can be chosen freely among the first $n+r-1$ trials. $\blacksquare$

Just as a Binomial r.v. can be represented as a sum of i.i.d. Bernoullis, a Negative Binomial r.v. can be represented as a sum of i.i.d. Geometrics.

Theorem

Let $X \sim \textrm{NBin}(r,p)$, viewed as the number of failures before the $r$th success in a sequence of independent Bernoulli trials with success probability $p$. Then we can write $X = X_1 + \dots + X_r$, where the $X_i$ are i.i.d. $\textrm{Geom}(p)$.

Proof: Let $X_1$ be the number of failures before the first success, $X_2$ the number of failures between the first and second successes, and in general $X_i$ the number of failures between the $(i-1)$st and $i$th successes. Then $X_1 + \dots + X_r$ is the total number of failures before the $r$th success, which is exactly $X$. Since the trials are all independent with the same success probability $p$, each $X_i \sim \textrm{Geom}(p)$, and the $X_i$ are independent of one another. $\blacksquare$
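The representation can also be checked numerically (a quick sketch under the number-of-failures convention used here): summing $r$ independent Geometric draws should reproduce the Negative Binomial PMF.

```python
import random
from math import comb

rng = random.Random(7)
r, p, q, n = 3, 0.4, 0.6, 100_000

def sample_geom():
    """Failures before the first success in Bernoulli(p) trials."""
    k = 0
    while rng.random() >= p:
        k += 1
    return k

# X = X_1 + ... + X_r with X_i i.i.d. Geom(p) should follow NBin(r, p).
xs = [sum(sample_geom() for _ in range(r)) for _ in range(n)]
for k in range(4):
    theory = comb(k + r - 1, r - 1) * p**r * q**k
    print(k, round(xs.count(k) / n, 3), round(theory, 3))
```

Each printed line compares the empirical frequency of $X = k$ with ${k+r-1 \choose r-1} p^r q^k$; the two columns should agree to within simulation noise.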

Example: Negative Binomial Expectation

Let $X \sim \textrm{NBin}(r,p)$. By the previous theorem we can write $X = X_1 + \dots + X_r$ with the $X_i$ i.i.d. $\textrm{Geom}(p)$, so by linearity,

$$E(X) = E(X_1) + \dots + E(X_r) = \frac{rq}{p}.$$

Example: Coupon Collector

Suppose there are $n$ types of toys, which you are collecting one by one, with the goal of getting a complete set. The toy types are random (as is sometimes the case, for example, with toys included in cereal boxes or with kids' meals from a fast-food restaurant). Assume that each time you collect a toy, it is equally likely to be any of the $n$ types. What is the expected number of toys needed until you have a complete set?

Solution: Let $N$ be the number of toys needed; we want to find $E(N)$. Our strategy will be to break $N$ into a sum of simpler r.v.s so that we can apply linearity. So write

$$N = N_1 + N_2 + \dots + N_n,$$

where $N_1$ is the number of toys until the first toy type you haven't seen before (which is always 1, as the first toy is always a new type), $N_2$ is the additional number of toys until the second toy type you haven't seen before, and so forth. The following figure illustrates these definitions with $n=3$ toy types.

(Figure: a sample sequence of toy draws with $n=3$ types, divided into the stretches counted by $N_1$, $N_2$, and $N_3$.)

By the story of the First Success distribution, after $j-1$ distinct toy types have been collected, each new toy has probability $(n-j+1)/n$ of being a type you haven't seen before, so $N_j \sim \textrm{FS}((n-j+1)/n)$ and $E(N_j) = \frac{n}{n-j+1}$. By linearity,

$$E(N) = E(N_1) + E(N_2) + \dots + E(N_n) = \frac{n}{n} + \frac{n}{n-1} + \dots + \frac{n}{1} = n \sum_{j=1}^n \frac{1}{j}.$$
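The standard coupon-collector answer, $E(N) = n \sum_{j=1}^n \frac{1}{j}$, can be checked by simulation (a minimal sketch; `collect` is a name of my own choosing):

```python
import random

rng = random.Random(0)

def collect(n):
    """Draw uniformly random toy types until all n types have been seen."""
    seen, draws = set(), 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws

n, trials = 10, 20_000
avg = sum(collect(n) for _ in range(trials)) / trials
exact = n * sum(1 / j for j in range(1, n + 1))
print(round(avg, 2), round(exact, 2))  # the two should agree closely
```

For $n = 10$, the exact value is $10 \sum_{j=1}^{10} 1/j \approx 29.29$, and the simulated average lands nearby.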

Warning

Expectation is linear, but in general we do not have $E(g(X)) = g(E(X))$ for arbitrary functions $g$. We must be careful not to move the $E$ around when $g$ is not linear. The next example shows a situation in which $E(g(X))$ is very different from $g(E(X))$.

Example: St. Petersburg Paradox

Suppose a wealthy stranger offers to play the following game with you. You will flip a fair coin until it lands Heads for the first time, and you will receive \$2 if the game lasts for 1 round, \$4 if the game lasts for 2 rounds, \$8 if the game lasts for 3 rounds, and in general, $\$2^n$ if the game lasts for $n$ rounds. What is the fair value of this game (the expected payoff)? How much would you be willing to pay to play this game once?

Solution:

The game lasts $n$ rounds if the first $n-1$ flips are Tails and the $n$th flip is Heads, which has probability $(1/2)^n$. The expected payoff is therefore

$$\sum_{n=1}^\infty 2^n \cdot \left(\frac{1}{2}\right)^n = \sum_{n=1}^\infty 1 = \infty.$$

So the fair value of the game is infinite, yet most people would pay only a modest amount to play. This is the promised illustration of the warning above: if $N$ is the number of rounds, the payoff is $g(N) = 2^N$, and $E(2^N) = \infty$ even though $2^{E(N)} = 2^2 = 4$, since $N \sim \textrm{FS}(1/2)$ has $E(N) = 2$.
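The divergence is easy to see numerically (an illustrative sketch, not from the text): every term of the truncated expectation contributes exactly 1, and the sample mean of simulated games is unstable no matter how many games are played.

```python
import random

rng = random.Random(3)

def play():
    """One St. Petersburg game: flip until Heads, payoff 2^n after n flips."""
    n = 1
    while rng.random() < 0.5:  # Tails: the game continues
        n += 1
    return 2**n

# Truncated expectation: sum_{n=1}^{N} 2^n * (1/2)^n = N, unbounded in N.
truncated = sum(2**k * 0.5**k for k in range(1, 31))
print(truncated)  # 30.0

for games in (10**3, 10**4, 10**5):
    mean = sum(play() for _ in range(games)) / games
    print(games, round(mean, 1))
```

The heavy tail means a single very long game can dominate the average, so the sample mean never converges to any finite value.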