Discrete Random Variables 4


Functions of Random Variables

In this section we will discuss what it means to take a function of a random variable, and we will build an understanding of why a function of a random variable is a random variable. That is, if $X$ is a random variable, then $X^2$, $e^X$, and $\sin(X)$ are also random variables, as is $g(X)$ for any function $g: \mathbb{R} \to \mathbb{R}$.

Definition: Function of an r.v.

For an experiment with sample space $S$, an r.v. $X$, and a function $g: \mathbb{R} \to \mathbb{R}$, $g(X)$ is the r.v. that maps $s$ to $g(X(s))$ for all $s \in S$.

Taking $g(x) = \sqrt{x}$ for concreteness, the following figure represents $g(X)$ by directly labeling the sample outcomes. If $X$ crystallizes to 4, then $g(X)$ crystallizes to 2.

*[Figure: each outcome $s \in S$ is mapped by $X$ to a number $X(s)$, and then by $g$ to $g(X(s))$.]*
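As a quick sketch (using a hypothetical die-roll experiment), composing an r.v. with a function is just composing the two maps:

```python
from math import sqrt

# Hypothetical experiment: roll a fair die, so the sample space is S = {1, ..., 6}.
S = [1, 2, 3, 4, 5, 6]

def X(s):
    return s          # the r.v. X records the number rolled

def g(x):
    return sqrt(x)    # g(x) = sqrt(x), as in the text

def gX(s):
    return g(X(s))    # g(X) is the r.v. mapping s to g(X(s))

print(gX(4))  # if X crystallizes to 4, then g(X) crystallizes to 2.0
```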

Warning: Category Errors

Many common mistakes in probability can be traced to confusing two of the following fundamental objects with each other: distributions, random variables, events, and numbers. Such mistakes are examples of category errors. In general, a category error is a mistake that doesn't just happen to be wrong, but in fact is necessarily wrong since it is based on the wrong category of object.

For example, answering the question "How many people live in Boston?" with "$-42$" or "$\pi$" or "pink elephants" would be a category error - we may not know the population size of a city, but we do know that it is a nonnegative integer at any point in time. To help avoid being categorically wrong, always think about what category an answer should have.

An especially common category error is to confuse a random variable with its distribution. The following saying sheds light on the distinction between a random variable and its distribution:

''The word is not the thing; the map is not the territory.'' - Alfred Korzybski

We can think of the distribution of a random variable as a map or blueprint describing the r.v. Just as different houses can share the same blueprint, different r.v.s can have the same distribution, even if the experiments they summarize, and the sample spaces they map from, are not the same.
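To make the blueprint analogy concrete, here is a small sketch (a fair coin is an assumed example): the indicator of heads and the indicator of tails are different r.v.s on the same sample space, yet both have the $\text{Bern}(1/2)$ distribution.

```python
from fractions import Fraction

S = ["H", "T"]                      # fair coin: each outcome has probability 1/2
X = {"H": 1, "T": 0}                # indicator of heads
Y = {"H": 0, "T": 1}                # indicator of tails

def pmf(rv):
    # PMF obtained from the equally likely outcomes in S
    p = {}
    for s in S:
        p[rv[s]] = p.get(rv[s], 0) + Fraction(1, len(S))
    return p

print(pmf(X) == pmf(Y))                # True: identical Bern(1/2) blueprints
print(any(X[s] == Y[s] for s in S))    # False: the r.v.s disagree on every outcome
```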

Independence of Random Variables

Just as we had the notion of independence of events, we can define independence of random variables. Intuitively, if two r.v.s XX and YY are independent, then knowing the value of XX gives no information about the value of YY, and vice versa.

Definition: Independence of two r.v.s

Random variables XX and YY are said to be independent if

$$P(X \leq x, Y \leq y) = P(X \leq x) P(Y \leq y),$$

for all $x, y \in \mathbb{R}$. In the discrete case, this is equivalent to the condition

$$P(X=x, Y=y) = P(X=x) P(Y=y),$$

for all $x, y$ with $x$ in the support of $X$ and $y$ in the support of $Y$.
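The discrete condition can be checked mechanically from a joint PMF. A minimal sketch (the joint PMF of two independent fair coin flips is an assumed example):

```python
from fractions import Fraction
from itertools import product

# Joint PMF of two independent fair coin flips (illustrative assumption).
joint = {(x, y): Fraction(1, 4) for x, y in product([0, 1], repeat=2)}

def is_independent(joint):
    # Compute the marginal PMFs of X and Y.
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    # Check P(X=x, Y=y) == P(X=x) P(Y=y) for all x, y in the supports.
    return all(joint.get((x, y), 0) == px[x] * py[y]
               for x in px for y in py)

print(is_independent(joint))  # True
```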

Example

In a roll of two fair dice, if $X$ is the number on the first die and $Y$ is the number on the second die, then $X + Y$ is not independent of $X - Y$. To see why, note that

$$0 = P(X+Y=12, X-Y=1) \neq P(X+Y=12) P(X-Y=1) = \frac{1}{36} \cdot \frac{5}{36}.$$

Since we have found a pair of values $(s, d)$ for which

$$P(X+Y=s, X-Y=d) \neq P(X+Y=s) P(X-Y=d),$$

$X + Y$ and $X - Y$ are dependent. This also makes sense intuitively: knowing that the sum of the dice is 12 tells us that their difference must be 0, so the r.v.s provide information about each other.
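The two probabilities above can be verified by brute force over the 36 equally likely outcomes; a sketch:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of a roll of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)

def prob(event):
    # Probability of an event, as a sum over the outcomes in it.
    return sum(p for (x, y) in outcomes if event(x, y))

lhs = prob(lambda x, y: x + y == 12 and x - y == 1)
rhs = prob(lambda x, y: x + y == 12) * prob(lambda x, y: x - y == 1)

print(lhs)  # 0: the sum cannot be 12 while the difference is 1
print(rhs)  # 5/1296, i.e. (1/36)(5/36)
```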

If $X$ and $Y$ are independent, then it is also true, for example, that $X^2$ is independent of $Y^3$, since if $X^2$ provided information about $Y^3$ then $X$ would give information about $Y$ (using $X^2$ and $Y^3$ as intermediaries). In general, if $X$ and $Y$ are independent, then any function of $X$ is independent of any function of $Y$.
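For instance, with the two-dice setup above, the defining product condition can be checked directly for $X^2$ and $Y^3$; a sketch:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two independent fair dice
p = Fraction(1, 36)

def joint_pmf(f, g):
    # Joint PMF of the r.v.s f(X) and g(Y), where X and Y are the two dice.
    out = {}
    for x, y in outcomes:
        key = (f(x), g(y))
        out[key] = out.get(key, 0) + p
    return out

pmf = joint_pmf(lambda x: x ** 2, lambda y: y ** 3)

# Marginal PMFs of X^2 and Y^3.
px, py = {}, {}
for (a, b), q in pmf.items():
    px[a] = px.get(a, 0) + q
    py[b] = py.get(b, 0) + q

# The joint PMF factors into the product of the marginals.
print(all(pmf[(a, b)] == px[a] * py[b] for a in px for b in py))  # True
```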

i.i.d.

Definition: i.i.d.

We will often work with random variables that are independent and have the same distribution. We call such r.v.s independent and identically distributed, or i.i.d. for short.

By taking a sum of i.i.d. Bernoulli r.v.s, we can write down the story of the Binomial distribution in an algebraic form.

Theorem

If $X \sim \text{Bin}(n,p)$, viewed as the number of successes in $n$ independent Bernoulli trials with success probability $p$, then we can write $X = X_1 + \dots + X_n$ where the $X_i$ are i.i.d. $\text{Bern}(p)$.

Proof:

*(The proof appears as an image in the original.)*
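The representation can also be checked numerically: convolving $n$ copies of the $\text{Bern}(p)$ PMF reproduces the $\text{Bin}(n,p)$ PMF exactly. A sketch, with arbitrary illustrative parameters:

```python
from fractions import Fraction
from math import comb

def convolve(pmf1, pmf2):
    # PMF of the sum of two independent discrete r.v.s with the given PMFs.
    out = {}
    for a, p in pmf1.items():
        for b, q in pmf2.items():
            out[a + b] = out.get(a + b, 0) + p * q
    return out

n, p = 5, Fraction(1, 3)   # hypothetical parameters
bern = {1: p, 0: 1 - p}    # Bern(p) PMF

# PMF of X_1 + ... + X_n for i.i.d. Bern(p): convolve n copies.
pmf = {0: Fraction(1)}
for _ in range(n):
    pmf = convolve(pmf, bern)

# Bin(n, p) PMF: P(X = k) = C(n, k) p^k (1-p)^(n-k).
binom = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
print(pmf == binom)  # True
```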

Theorem

If $X \sim \text{Bin}(n,p)$, $Y \sim \text{Bin}(m,p)$, and $X$ is independent of $Y$, then $X + Y \sim \text{Bin}(n+m,p)$.

Proof: We present three proofs, since each illustrates a useful technique.

*(The three proofs appear as images in the original.)*
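As a quick numerical check of the theorem (not one of the book's proofs): convolving the two binomial PMFs yields the $\text{Bin}(n+m,p)$ PMF exactly, here for illustrative parameters.

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, p):
    # Bin(n, p) PMF: P(X = k) = C(n, k) p^k (1-p)^(n-k).
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def convolve(f, g):
    # PMF of X + Y for independent X ~ f, Y ~ g.
    out = {}
    for a, pa in f.items():
        for b, pb in g.items():
            out[a + b] = out.get(a + b, 0) + pa * pb
    return out

n, m, p = 4, 6, Fraction(2, 5)  # hypothetical parameters
lhs = convolve(binom_pmf(n, p), binom_pmf(m, p))
print(lhs == binom_pmf(n + m, p))  # True
```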

Definition: Conditional Independence of r.v.s

Random variables $X$ and $Y$ are conditionally independent given an r.v. $Z$ if for all $x, y \in \mathbb{R}$ and all $z$ in the support of $Z$,

$$P(X \leq x, Y \leq y \mid Z=z) = P(X \leq x \mid Z=z) P(Y \leq y \mid Z=z).$$

For discrete r.v.s, an equivalent definition is to require

$$P(X = x, Y = y \mid Z=z) = P(X = x \mid Z=z) P(Y = y \mid Z=z).$$

This is the definition of independence, except that we condition on $Z = z$ everywhere, and require the equality to hold for all $z$ in the support of $Z$.
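Here is a sketch of checking this definition on a small joint distribution. With $X, Y$ i.i.d. fair coin flips and $Z = X + Y$ (a hypothetical choice of $Z$), the product condition fails given $Z = 1$, so $X$ and $Y$ are independent but not conditionally independent given $Z$:

```python
from fractions import Fraction
from itertools import product

# X, Y i.i.d. fair coin flips; Z = X + Y (an assumed example).
triples = {}
for x, y in product([0, 1], repeat=2):
    triples[(x, y, x + y)] = Fraction(1, 4)

def cond_prob(event, z):
    # P(event involving X, Y | Z = z), computed from the joint distribution.
    pz = sum(p for (x, y, zz), p in triples.items() if zz == z)
    pe = sum(p for (x, y, zz), p in triples.items() if zz == z and event(x, y))
    return pe / pz

z = 1
lhs = cond_prob(lambda x, y: x == 1 and y == 1, z)
rhs = cond_prob(lambda x, y: x == 1, z) * cond_prob(lambda x, y: y == 1, z)
print(lhs, rhs)  # 0 and 1/4: the conditional product condition fails at z = 1
```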

Definition: Conditional PMF

For any discrete r.v.s $X$ and $Z$, the function $P(X=x \mid Z=z)$, when considered as a function of $x$ for fixed $z$, is called the conditional PMF of $X$ given $Z = z$.
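As a sketch, a conditional PMF can be tabulated directly from a joint PMF. With $X$ the first of two fair dice and $Z$ their sum (an illustrative choice):

```python
from fractions import Fraction
from itertools import product

# X = first die, Z = X + (second die). Build the joint PMF of (X, Z).
p = Fraction(1, 36)
joint = {}
for x, y in product(range(1, 7), repeat=2):
    joint[(x, x + y)] = joint.get((x, x + y), 0) + p

def cond_pmf_X_given_Z(z):
    # Conditional PMF of X given Z = z: renormalize the slice of the joint PMF.
    pz = sum(q for (x, zz), q in joint.items() if zz == z)
    return {x: q / pz for (x, zz), q in joint.items() if zz == z}

# Given a sum of 4, X is uniform on {1, 2, 3}.
print(cond_pmf_X_given_Z(4))
```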

Independence of r.v.s does not imply conditional independence, nor vice versa. First let us show why independence does not imply conditional independence.

Example: Matching Pennies

*(The example appears as an image in the original.)*

Example: Mystery Opponent

*(The example appears as an image in the original.)*