Discrete Random Variables 4


Functions of Random Variables

In this section we will discuss what it means to take a function of a random variable, and we will build an understanding of why a function of a random variable is a random variable. That is, if $X$ is a random variable, then $X^2$, $e^X$, and $\sin(X)$ are also random variables, as is $g(X)$ for any function $g: \mathbb{R} \to \mathbb{R}$.

Definition: Function of an r.v.

For an experiment with sample space $S$, an r.v. $X$, and a function $g: \mathbb{R} \to \mathbb{R}$, $g(X)$ is the r.v. that maps $s$ to $g(X(s))$ for all $s \in S$.

Taking $g(x) = \sqrt{x}$ for concreteness, the following figure represents $g(X)$ by directly labeling the sample outcomes. If $X$ crystallizes to 4, then $g(X)$ crystallizes to 2.

*[Figure: each outcome $s \in S$ is mapped by $X$ to a number $X(s)$, and then by $g$ to $g(X(s))$.]*
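As a quick sketch (using a hypothetical die-roll experiment), composing an r.v. with a function is just composing the two maps:

```python
from math import sqrt

# Hypothetical experiment: roll a fair die, so the sample space is S = {1, ..., 6}.
S = [1, 2, 3, 4, 5, 6]

def X(s):
    return s          # the r.v. X records the number rolled

def g(x):
    return sqrt(x)    # g(x) = sqrt(x), as in the text

def gX(s):
    return g(X(s))    # g(X) is the r.v. mapping s to g(X(s))

print(gX(4))  # if X crystallizes to 4, then g(X) crystallizes to 2.0
```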

Warning: Category Errors

Many common mistakes in probability can be traced to confusing two of the following fundamental objects with each other: distributions, random variables, events, and numbers. Such mistakes are examples of category errors. In general, a category error is a mistake that doesn't just happen to be wrong, but in fact is necessarily wrong since it is based on the wrong category of object.

For example, answering the question "How many people live in Boston?" with "$-42$" or "$\pi$" or "pink elephants" would be a category error - we may not know the population size of a city, but we do know that it is a nonnegative integer at any point in time. To help avoid being categorically wrong, always think about what category an answer should have.

An especially common category error is to confuse a random variable with its distribution. The following saying sheds light on the distinction between a random variable and its distribution:

''The word is not the thing; the map is not the territory.'' - Alfred Korzybski

We can think of the distribution of a random variable as a map or blueprint describing the r.v. Just as different houses can share the same blueprint, different r.v.s can have the same distribution, even if the experiments they summarize, and the sample spaces they map from, are not the same.
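To make the blueprint analogy concrete, here is a small sketch (a fair coin is an assumed example): the indicator of heads and the indicator of tails are different r.v.s on the same sample space, yet both have the $\text{Bern}(1/2)$ distribution.

```python
from fractions import Fraction

S = ["H", "T"]                      # fair coin: each outcome has probability 1/2
X = {"H": 1, "T": 0}                # indicator of heads
Y = {"H": 0, "T": 1}                # indicator of tails

def pmf(rv):
    # PMF obtained from the equally likely outcomes in S
    p = {}
    for s in S:
        p[rv[s]] = p.get(rv[s], 0) + Fraction(1, len(S))
    return p

print(pmf(X) == pmf(Y))                # True: identical Bern(1/2) blueprints
print(any(X[s] == Y[s] for s in S))    # False: the r.v.s disagree on every outcome
```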

Independence of Random Variables

Just as we had the notion of independence of events, we can define independence of random variables. Intuitively, if two r.v.s XX and YY are independent, then knowing the value of XX gives no information about the value of YY, and vice versa.

Definition: Independence of two r.v.s

Random variables XX and YY are said to be independent if

$$P(X \leq x, Y \leq y) = P(X \leq x) P(Y \leq y),$$

for all $x, y \in \mathbb{R}$. In the discrete case, this is equivalent to the condition

$$P(X=x, Y=y) = P(X=x) P(Y=y),$$

for all $x, y$ with $x$ in the support of $X$ and $y$ in the support of $Y$.
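The discrete condition can be checked mechanically from a joint PMF. A minimal sketch (the joint PMF of two independent fair coin flips is an assumed example):

```python
from fractions import Fraction
from itertools import product

# Joint PMF of two independent fair coin flips (illustrative assumption).
joint = {(x, y): Fraction(1, 4) for x, y in product([0, 1], repeat=2)}

def is_independent(joint):
    # Compute the marginal PMFs of X and Y.
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0) + p
        py[y] = py.get(y, 0) + p
    # Check P(X=x, Y=y) == P(X=x) P(Y=y) for all x, y in the supports.
    return all(joint.get((x, y), 0) == px[x] * py[y]
               for x in px for y in py)

print(is_independent(joint))  # True
```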

Example

In a roll of two fair dice, if $X$ is the number on the first die and $Y$ is the number on the second die, then $X + Y$ is not independent of $X - Y$. To see why, note that

$$0 = P(X+Y=12, X-Y=1) \neq P(X+Y=12) P(X-Y=1) = \frac{1}{36} \cdot \frac{5}{36}.$$

Since we have found a pair of values $(s, d)$ for which

$$P(X+Y=s, X-Y=d) \neq P(X+Y=s) P(X-Y=d),$$

$X + Y$ and $X - Y$ are dependent. This also makes sense intuitively: knowing that the sum of the dice is 12 tells us that their difference must be 0, so the r.v.s provide information about each other.
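The two probabilities above can be verified by brute force over the 36 equally likely outcomes; a sketch:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of a roll of two fair dice.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, 36)

def prob(event):
    # Probability of an event, as a sum over the outcomes in it.
    return sum(p for (x, y) in outcomes if event(x, y))

lhs = prob(lambda x, y: x + y == 12 and x - y == 1)
rhs = prob(lambda x, y: x + y == 12) * prob(lambda x, y: x - y == 1)

print(lhs)  # 0: the sum cannot be 12 while the difference is 1
print(rhs)  # 5/1296, i.e. (1/36)(5/36)
```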

If $X$ and $Y$ are independent, then it is also true, for example, that $X^2$ is independent of $Y^3$, since if $X^2$ provided information about $Y^3$ then $X$ would give information about $Y$ (using $X^2$ and $Y^3$ as intermediaries). In general, if $X$ and $Y$ are independent, then any function of $X$ is independent of any function of $Y$.
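For instance, with the two-dice setup above, the defining product condition can be checked directly for $X^2$ and $Y^3$; a sketch:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # two independent fair dice
p = Fraction(1, 36)

def joint_pmf(f, g):
    # Joint PMF of the r.v.s f(X) and g(Y), where X and Y are the two dice.
    out = {}
    for x, y in outcomes:
        key = (f(x), g(y))
        out[key] = out.get(key, 0) + p
    return out

pmf = joint_pmf(lambda x: x ** 2, lambda y: y ** 3)

# Marginal PMFs of X^2 and Y^3.
px, py = {}, {}
for (a, b), q in pmf.items():
    px[a] = px.get(a, 0) + q
    py[b] = py.get(b, 0) + q

# The joint PMF factors into the product of the marginals.
print(all(pmf[(a, b)] == px[a] * py[b] for a in px for b in py))  # True
```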

i.i.d.

Definition: i.i.d.

We will often work with random variables that are independent and have the same distribution. We call such r.v.s independent and identically distributed, or i.i.d. for short.

By taking a sum of i.i.d. Bernoulli r.v.s, we can write down the story of the Binomial distribution in an algebraic form.

Theorem

If $X \sim \text{Bin}(n,p)$, viewed as the number of successes in $n$ independent Bernoulli trials with success probability $p$, then we can write $X = X_1 + \dots + X_n$ where the $X_i$ are i.i.d. $\text{Bern}(p)$.

Proof:

*(The proof appears as an image in the original.)*
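The representation can also be checked numerically: convolving $n$ copies of the $\text{Bern}(p)$ PMF reproduces the $\text{Bin}(n,p)$ PMF exactly. A sketch, with arbitrary illustrative parameters:

```python
from fractions import Fraction
from math import comb

def convolve(pmf1, pmf2):
    # PMF of the sum of two independent discrete r.v.s with the given PMFs.
    out = {}
    for a, p in pmf1.items():
        for b, q in pmf2.items():
            out[a + b] = out.get(a + b, 0) + p * q
    return out

n, p = 5, Fraction(1, 3)   # hypothetical parameters
bern = {1: p, 0: 1 - p}    # Bern(p) PMF

# PMF of X_1 + ... + X_n for i.i.d. Bern(p): convolve n copies.
pmf = {0: Fraction(1)}
for _ in range(n):
    pmf = convolve(pmf, bern)

# Bin(n, p) PMF: P(X = k) = C(n, k) p^k (1-p)^(n-k).
binom = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}
print(pmf == binom)  # True
```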

Theorem

If $X \sim \text{Bin}(n,p)$, $Y \sim \text{Bin}(m,p)$, and $X$ is independent of $Y$, then $X + Y \sim \text{Bin}(n+m,p)$.

Proof: We present three proofs, since each illustrates a useful technique.

*(The three proofs appear as images in the original.)*
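As a quick numerical check of the theorem (not one of the book's proofs): convolving the two binomial PMFs yields the $\text{Bin}(n+m,p)$ PMF exactly, here for illustrative parameters.

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, p):
    # Bin(n, p) PMF: P(X = k) = C(n, k) p^k (1-p)^(n-k).
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

def convolve(f, g):
    # PMF of X + Y for independent X ~ f, Y ~ g.
    out = {}
    for a, pa in f.items():
        for b, pb in g.items():
            out[a + b] = out.get(a + b, 0) + pa * pb
    return out

n, m, p = 4, 6, Fraction(2, 5)  # hypothetical parameters
lhs = convolve(binom_pmf(n, p), binom_pmf(m, p))
print(lhs == binom_pmf(n + m, p))  # True
```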

Definition: Conditional Independence of r.v.s

Random variables $X$ and $Y$ are conditionally independent given an r.v. $Z$ if for all $x, y \in \mathbb{R}$ and all $z$ in the support of $Z$,

$$P(X \leq x, Y \leq y \mid Z=z) = P(X \leq x \mid Z=z) P(Y \leq y \mid Z=z).$$

For discrete r.v.s, an equivalent definition is to require

$$P(X = x, Y = y \mid Z=z) = P(X = x \mid Z=z) P(Y = y \mid Z=z).$$

This is the definition of independence, except that we condition on $Z = z$ everywhere, and require the equality to hold for all $z$ in the support of $Z$.
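Here is a sketch of checking this definition on a small joint distribution. With $X, Y$ i.i.d. fair coin flips and $Z = X + Y$ (a hypothetical choice of $Z$), the product condition fails given $Z = 1$, so $X$ and $Y$ are independent but not conditionally independent given $Z$:

```python
from fractions import Fraction
from itertools import product

# X, Y i.i.d. fair coin flips; Z = X + Y (an assumed example).
triples = {}
for x, y in product([0, 1], repeat=2):
    triples[(x, y, x + y)] = Fraction(1, 4)

def cond_prob(event, z):
    # P(event involving X, Y | Z = z), computed from the joint distribution.
    pz = sum(p for (x, y, zz), p in triples.items() if zz == z)
    pe = sum(p for (x, y, zz), p in triples.items() if zz == z and event(x, y))
    return pe / pz

z = 1
lhs = cond_prob(lambda x, y: x == 1 and y == 1, z)
rhs = cond_prob(lambda x, y: x == 1, z) * cond_prob(lambda x, y: y == 1, z)
print(lhs, rhs)  # 0 and 1/4: the conditional product condition fails at z = 1
```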

Definition: Conditional PMF

For any discrete r.v.s $X$ and $Z$, the function $P(X=x \mid Z=z)$, when considered as a function of $x$ for fixed $z$, is called the conditional PMF of $X$ given $Z = z$.
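As a sketch, a conditional PMF can be tabulated directly from a joint PMF. With $X$ the first of two fair dice and $Z$ their sum (an illustrative choice):

```python
from fractions import Fraction
from itertools import product

# X = first die, Z = X + (second die). Build the joint PMF of (X, Z).
p = Fraction(1, 36)
joint = {}
for x, y in product(range(1, 7), repeat=2):
    joint[(x, x + y)] = joint.get((x, x + y), 0) + p

def cond_pmf_X_given_Z(z):
    # Conditional PMF of X given Z = z: renormalize the slice of the joint PMF.
    pz = sum(q for (x, zz), q in joint.items() if zz == z)
    return {x: q / pz for (x, zz), q in joint.items() if zz == z}

# Given a sum of 4, X is uniform on {1, 2, 3}.
print(cond_pmf_X_given_Z(4))
```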

Independence of r.v.s does not imply conditional independence, nor vice versa. First let us show why independence does not imply conditional independence.

Example: Matching Pennies

*(The example appears as an image in the original.)*

Example: Mystery Opponent

*(The example appears as an image in the original.)*