Averages, Law of Large Numbers, and Central Limit Theorem 4


Variance

One important application of LOTUS is for finding the *variance* of a random variable. Like expected value, variance is a single-number summary of the distribution of a random variable. While the expected value tells us the center of mass of a distribution, the variance tells us how spread out the distribution is.

Definition: Variance and Standard Deviation

The variance of an r.v. $X$ is

$$\textrm{Var}(X) = E(X-EX)^2.$$

The square root of the variance is called the standard deviation (SD):

$$\textrm{SD}(X) = \sqrt{\textrm{Var}(X)}.$$

Recall that when we write $E(X-EX)^2$, we mean the expectation of the random variable $(X-EX)^2$, not $(E(X-EX))^2$ (which is $0$ by linearity).

The variance of $X$ measures how far $X$ is from its mean on average, but instead of simply taking the average difference between $X$ and its mean $EX$, we take the average squared difference. To see why, note that the average deviation from the mean, $E(X-EX)$, always equals $0$ by linearity; positive and negative deviations cancel each other out. By squaring the deviations, we ensure that both positive and negative deviations contribute to the overall variability. However, because variance is an average squared distance, it has the wrong units: if $X$ is in dollars, $\textrm{Var}(X)$ is in squared dollars. To get back to our original units, we take the square root; this gives us the standard deviation.
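As a small illustration of these definitions, the sketch below computes the mean, the average deviation, the variance, and the standard deviation of a discrete r.v. directly from its PMF. The fair six-sided die and the helper name `expectation` are assumptions chosen for the example, not part of the original text.

```python
from fractions import Fraction
from math import sqrt

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete r.v. given as {value: probability} (LOTUS)."""
    return sum(g(x) * p for x, p in pmf.items())

# Hypothetical example distribution: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = expectation(die)                                  # EX
mean_deviation = expectation(die, lambda x: x - mean)    # E(X - EX), cancels to 0
variance = expectation(die, lambda x: (x - mean) ** 2)   # E(X - EX)^2
sd = sqrt(variance)                                      # back in the units of X

print(mean, mean_deviation, variance, sd)
```

Exact fractions are used so that the cancellation in $E(X-EX)$ is visible without floating-point noise.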

One might wonder why variance isn't defined as $E|X-EX|$, which would achieve the goal of counting both positive and negative deviations while maintaining the same units as $X$. This measure of variability isn't nearly as popular as $E(X-EX)^2$, for a variety of reasons. The absolute value function isn't differentiable at $0$, so it doesn't have as nice properties as the squaring function. Squared distances are also connected to geometry via the distance formula and the Pythagorean theorem, which turn out to have corresponding statistical interpretations.
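To make the comparison concrete, here is a minimal sketch that computes both $E|X-EX|$ and the standard deviation for the same distribution. The fair die is again an assumed example; both quantities are in the units of $X$, but they are different numbers.

```python
from fractions import Fraction
from math import sqrt

# Hypothetical example distribution: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in die.items())

# Mean absolute deviation, E|X - EX|.
mean_abs_dev = sum(abs(x - mean) * p for x, p in die.items())

# Standard deviation, sqrt(E(X - EX)^2).
sd = sqrt(sum((x - mean) ** 2 * p for x, p in die.items()))

print(mean_abs_dev, sd)
```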

An equivalent expression for variance is $\textrm{Var}(X) = E(X^2) - (EX)^2$. This formula is often easier to work with when doing actual calculations. Since this is the variance formula we will use over and over again, we state it as its own theorem.

Theorem

For any r.v. $X$,

$$\textrm{Var}(X) = E(X^2) - (EX)^2.$$

Proof: Let $\mu = EX$. Expand $(X-\mu)^2$ and use linearity:

$$\textrm{Var}(X) = E(X-\mu)^2 = E(X^2 - 2 \mu X + \mu^2) = E(X^2) - 2\mu EX + \mu^2 = E(X^2) - \mu^2.$$
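The theorem can be checked numerically on any small PMF. The sketch below does so with exact arithmetic; the three-point distribution is a made-up example, chosen to be asymmetric so that the agreement is not an accident of symmetry.

```python
from fractions import Fraction

# Hypothetical asymmetric PMF: P(X=0)=1/2, P(X=1)=1/3, P(X=4)=1/6.
pmf = {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(1, 6)}

mu = sum(x * p for x, p in pmf.items())               # EX
second_moment = sum(x ** 2 * p for x, p in pmf.items())  # E(X^2), by LOTUS

# Variance from the definition, E(X - mu)^2.
var_definition = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Variance from the theorem, E(X^2) - (EX)^2.
var_shortcut = second_moment - mu ** 2

print(mu, second_moment, var_definition, var_shortcut)
```

Both routes give the same value; the second needs only the first two moments, which is why it tends to be more convenient in hand calculations.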

Variance has the following properties. The first two are easily verified from the definition, the third will be addressed in a later chapter, and the last one is proven just after stating it.

[Image: 捕获.JPG (list of variance properties)]

To prove the last property, note that $\textrm{Var}(X)$ is the expectation of the nonnegative r.v. $(X-EX)^2$, so $\textrm{Var}(X) \geq 0$. If $P(X=a)=1$ for some constant $a$, then $E(X)=a$ and $E(X^2)=a^2$, so $\textrm{Var}(X)=0$.

Conversely, suppose that $\textrm{Var}(X) = 0$. Then $E(X-EX)^2 = 0$, which shows that $(X-EX)^2=0$ has probability $1$, which in turn shows that $X$ equals its mean with probability $1$.
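The two directions of this argument can be illustrated with a pair of assumed toy distributions: a constant has variance exactly $0$, while even a tiny amount of probability on a second value makes the variance strictly positive. The helper `variance` and both PMFs are hypothetical names and values used only for this sketch.

```python
from fractions import Fraction

def variance(pmf):
    """Var(X) = E(X - EX)^2 for a discrete r.v. given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

# P(X = 5) = 1: a constant r.v.
constant = {5: Fraction(1)}

# Almost constant: a small probability of taking a different value.
two_point = {5: Fraction(999, 1000), 6: Fraction(1, 1000)}

var_constant = variance(constant)
var_two_point = variance(two_point)

print(var_constant, var_two_point)
```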

Example Geometric and Negative Binomial Variance

[Image: 捕获.JPG (worked example)]

[Image: 捕获.JPG (worked example, continued)]
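The worked example survives here only as images, so the following is an independent numerical check, not a reproduction of it. It assumes the convention in which $\textrm{Geom}(p)$ counts failures before the first success and $\textrm{NBin}(r,p)$ counts failures before the $r$th success; under that convention the standard results are $q/p^2$ and $rq/p^2$ with $q = 1-p$. The values of `p` and `r`, the truncation point, and the helper `moments` are arbitrary choices for illustration.

```python
from math import comb

def moments(pmf, k_max):
    """Mean and variance of a PMF on {0, 1, 2, ...}, truncating the sums at k_max."""
    mean = sum(k * pmf(k) for k in range(k_max))
    second = sum(k * k * pmf(k) for k in range(k_max))
    return mean, second - mean ** 2   # Var(X) = E(X^2) - (EX)^2

p, r = 0.3, 5   # assumed example parameters
q = 1 - p

def geom_pmf(k):
    # P(X = k): k failures, then one success.
    return q ** k * p

def nbin_pmf(k):
    # P(X = k): k failures before the r-th success.
    return comb(k + r - 1, k) * p ** r * q ** k

geom_mean, geom_var = moments(geom_pmf, 2000)
nbin_mean, nbin_var = moments(nbin_pmf, 2000)

print(geom_var, q / p ** 2)
print(nbin_var, r * q / p ** 2)
```

The tail beyond the truncation point is negligible for these parameters, so the truncated sums agree with the closed forms up to floating-point error.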

Example Binomial Variance

[Image: 捕获.JPG (worked example)]
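Since the example itself is only available as an image, here is a separate sanity check of the standard result $\textrm{Var}(X) = np(1-p)$ for $X \sim \textrm{Bin}(n,p)$, computed from the PMF with $\textrm{Var}(X) = E(X^2) - (EX)^2$. The parameters `n = 10` and `p = 1/4` are assumed for illustration.

```python
from fractions import Fraction
from math import comb

n, p = 10, Fraction(1, 4)   # assumed example parameters
q = 1 - p

# Binomial PMF: P(X = k) = C(n, k) p^k q^(n-k), kept exact with fractions.
pmf = {k: comb(n, k) * p ** k * q ** (n - k) for k in range(n + 1)}

mean = sum(k * pk for k, pk in pmf.items())                       # EX
variance = sum(k ** 2 * pk for k, pk in pmf.items()) - mean ** 2  # E(X^2) - (EX)^2

print(mean, variance, n * p * q)
```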