CHAPTER 3 Stochastic Least-Squares Problems
3.1 poses the problem.
3.2 gives a simple first result.
3.3 gives a geometric interpretation.
3.4 introduces linear models.
3.5 gives the equivalence and duality relations between stochastic and deterministic least squares.
3.1 THE PROBLEM OF STOCHASTIC ESTIMATION
Given two jointly distributed random variables $x$ and $y$, where $y$ (the observation) is known and $x$ is unknown.

Problem 1: from an observation $y$ of $y$, form an estimate $\hat{x}$ of the corresponding value of $x$,

$\hat{x} = h(y)$,   (3.1.1)

or, more generally, estimate the random variable $x$ from the random variable $y$,

$\hat{x} = h(y)$.   (3.1.2)
This raises Problem 2: how should the function $h(\cdot)$ be chosen?

We seek an estimator satisfying an optimality criterion: the least-mean-squares criterion (analogous to the least-squares criterion of Ch. 2). The solution is the conditional expectation $\hat{x} = E[x \mid y]$ (see Sec. 3.A.1 for the derivation).

Computing it requires the joint probability distribution of $\{x, y\}$, which is hard to obtain. So we simplify: restrict $h(\cdot)$ to be a linear function. We may also point out that this is often a reasonable assumption; in particular, when $\{x, y\}$ are jointly Gaussian, the unconstrained least-mean-squares estimator turns out to be linear anyway.
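As support for the Gaussian claim (a standard fact, stated here without proof): for zero-mean jointly Gaussian $\{x, y\}$ with $R_y > 0$,

$E[x \mid y] = R_{xy} R_y^{-1}\, y$,

so the unconstrained l.m.s. estimator of Sec. 3.A coincides with the linear estimator derived below in Sec. 3.2.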
3.2 LINEAR LEAST-MEAN-SQUARES ESTIMATORS
For complex-valued, zero-mean random variables $x$ and $y$, we define the (cross-)covariance matrix $R_{xy} = E\,xy^*$ (the product-of-means term drops out because each mean is zero). Here ${}^*$ denotes complex conjugation for scalar random variables and complex-conjugate transposition (the so-called Hermitian transpose) for vector-valued random variables. The main reason for this definition is to ensure that $R_x = E\,xx^*$ is a nonnegative scalar when $x$ is a scalar, and a nonnegative-definite matrix when $x$ is a vector random variable.
3.2.1 The Fundamental Equations
Our goal is to estimate the value assumed by the random variable $x$, given that the random variables $\{y_i\}$ have assumed certain values $\{y_i\}$. We are interested in linear estimators of $x$, i.e., estimators obtained by linear operations on the $\{y_i\}$. Assume the estimate is formed with a (yet unknown) coefficient matrix $K_o$:

$\hat{x} = K_o\, y$.   (3.2.1)

The optimal estimate is the one that minimizes the error covariance matrix

$P(K) = E\,(x - Ky)(x - Ky)^*$.   (3.2.2)

Note: the expectation of a matrix is taken entrywise, i.e., it is the matrix of the expectations of the individual entries.
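For later reference, expanding the quadratic in (3.2.2) term by term (using linearity of expectation) gives

$P(K) = R_x - K R_{yx} - R_{xy} K^* + K R_y K^*$,

where $R_x = E\,xx^*$, $R_{xy} = E\,xy^* = R_{yx}^*$, and $R_y = E\,yy^*$.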
Theorem 3.2.1 (Optimal Linear L.M.S. Estimators) Given two complex zero-mean random variables $x$ and $y$, the l.l.m.s. estimator of $x$ given $y$, defined by (3.2.1)-(3.2.2), is given by any solution $K_o$ of the so-called normal equations

$K_o R_y = R_{xy}$,   (3.2.3)

where $R_y = E\,yy^*$ and $R_{xy} = E\,xy^*$. The corresponding minimum-mean-square-error matrix (or error covariance matrix) is

$P(K_o) = R_x - K_o R_{yx}$.   (3.2.4)
Proof: $K_o$ is a solution of the optimization problem (3.2.1)-(3.2.2) if, and only if, for all row vectors $a$, $aK_o$ is a minimum of

$E\,|ax - aKy|^2 = a\,P(K)\,a^*$.

Note that $a\,P(K)\,a^*$ is a scalar function of the complex-valued (row) vector quantity $aK$. Then (see App. A.6) differentiating $a\,P(K)\,a^*$ with respect to $aK$ and setting the derivative equal to zero at $aK = aK_o$ leads to the equations $a\,(K_o R_y - R_{xy}) = 0$ for every $a$, i.e., to the normal equations (3.2.3). The corresponding minimum-mean-square-error (or, m.m.s.e. for short) matrix is

$P(K_o) = R_x - K_o R_{yx}$,

where

$a\,P(K)\,a^* \geq a\,P(K_o)\,a^*$ for every $K$ and for every row vector $a$.

That is, $K_o$ also minimizes the mean-square error in the estimator of each component of the vector $x$.

The solution of the normal equations when $R_y$ is invertible is given by the following theorem.
Theorem 3.2.2 (Unique Solutions) Assume that $R_y > 0$ (a positive-definite square matrix, hence invertible). Then the optimum choice $K_o$ that minimizes $P(K)$ is given uniquely by

$K_o = R_{xy} R_y^{-1}$.

The m.m.s.e. (see (3.2.4)) can be written as

$P(K_o) = R_x - R_{xy} R_y^{-1} R_{yx}$,

which is always nonnegative-definite (being the covariance matrix of the error $x - K_o y$).
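As a sanity check on Theorem 3.2.2, here is a minimal numerical sketch (my own illustration, assuming NumPy; the partition sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a positive-definite joint covariance for (x, y), x in R^2, y in R^3.
A = rng.standard_normal((5, 5))
R = A @ A.T                       # joint covariance [[Rx, Rxy], [Ryx, Ry]]
Rx, Rxy = R[:2, :2], R[:2, 2:]
Ryx, Ry = R[2:, :2], R[2:, 2:]

Ko = Rxy @ np.linalg.inv(Ry)      # optimal gain: Ko = Rxy Ry^{-1}
P = Rx - Ko @ Ryx                 # m.m.s.e.: P(Ko) = Rx - Ko Ryx

assert np.allclose(Ko @ Ry, Rxy)                    # normal equations hold
assert np.all(np.linalg.eigvalsh(P) >= -1e-10)      # P is nonnegative-definite
```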
3.2.2 Stochastic Interpretation of Triangular Factorization (LDL*, UDU*, Schur Complements)
The result of the previous section can also be viewed through block matrices, triangular factorization, and Schur complements (an alternative, convenient route to the solution). Assuming $R_y > 0$,

$P(K_o) = R_x - R_{xy} R_y^{-1} R_{yx}$,

i.e., $P(K_o)$ is the Schur complement of $R_y$ in the joint covariance matrix

$R = E \begin{bmatrix} x \\ y \end{bmatrix} \begin{bmatrix} x^* & y^* \end{bmatrix} = \begin{bmatrix} R_x & R_{xy} \\ R_{yx} & R_y \end{bmatrix}$.

This joint covariance matrix admits convenient block LDL* and UDU* factorizations. When $R_y > 0$, the UDU* decomposition is

$\begin{bmatrix} R_x & R_{xy} \\ R_{yx} & R_y \end{bmatrix} = \begin{bmatrix} I & K_o \\ 0 & I \end{bmatrix} \begin{bmatrix} P(K_o) & 0 \\ 0 & R_y \end{bmatrix} \begin{bmatrix} I & 0 \\ K_o^* & I \end{bmatrix}$,

which follows from the representation of the pair of correlated random variables $\{x, y\}$ in terms of the (obviously uncorrelated) pair $\{x - K_o y,\ y\}$:

$\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} I & K_o \\ 0 & I \end{bmatrix} \begin{bmatrix} x - K_o y \\ y \end{bmatrix}$,

where $K_o = R_{xy} R_y^{-1}$.
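A quick numerical check of the block UDU* factorization (same kind of illustrative NumPy setup as above; in the real case ${}^*$ is plain transposition):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
R = A @ A.T                                   # joint covariance of (x, y)
Rx, Rxy, Ryx, Ry = R[:2, :2], R[:2, 2:], R[2:, :2], R[2:, 2:]

Ko = Rxy @ np.linalg.inv(Ry)
P = Rx - Ko @ Ryx                             # Schur complement of Ry in R

U = np.block([[np.eye(2), Ko], [np.zeros((3, 2)), np.eye(3)]])
D = np.block([[P, np.zeros((2, 3))], [np.zeros((3, 2)), Ry]])

assert np.allclose(U @ D @ U.T, R)            # R = U D U*
```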
The same picture extends to estimating one random variable from several: say, estimating $x$ from both $y$ and $z$. Partitioning the joint covariance of $\{x, y, z\}$ accordingly, one can verify that the covariance matrix of the error in estimating $x$ given both $y$ and $z$ is the Schur complement

$P = R_x - \begin{bmatrix} R_{xy} & R_{xz} \end{bmatrix} \begin{bmatrix} R_y & R_{yz} \\ R_{zy} & R_z \end{bmatrix}^{-1} \begin{bmatrix} R_{yx} \\ R_{zx} \end{bmatrix}$.

This result is useful for the problem of combining estimators based on different observations, which we take up in Sec. 3.4.3 and Prob. 3.23.
3.2.3 Singular Data Covariance Matrices
So far we have assumed $R_y > 0$; equivalently, $R_y$ is invertible, equivalently the entries of $y$ are linearly independent. Indeed, if the entries of $y$ are linearly independent, assuming $R_y$ singular leads to a contradiction: there would be a vector $a \neq 0$ with $a^* R_y a = E\,|a^* y|^2 = 0$, forcing $a^* y = 0$; hence under that assumption $R_y$ must be positive definite.

Suppose now that the entries of $y$ may be linearly dependent, so that $R_y$ is genuinely singular, and ask what the solutions of the normal equations look like. It turns out that $R_y > 0$ is not necessary: although the normal equations then admit many solutions, the optimal estimate and the minimum cost are unique, as the following theorem shows.
Theorem 3.2.3 (Non-unique Solutions) Even if $R_y$ is singular, the normal equations $K_o R_y = R_{xy}$ will be consistent, and there will be many solutions $K_o$. No matter which solution $K_o$ is used, the corresponding l.l.m.s. estimator $\hat{x} = K_o y$ will, however, be unique, and so of course will the m.m.s.e. $P(K_o)$.
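A sketch of the singular case (the pseudo-inverse choice below is my own; any solution of the normal equations yields the same estimate):

```python
import numpy as np

rng = np.random.default_rng(2)

# Make y have linearly dependent entries: y3 = y1 + y2, so Ry is singular.
B = rng.standard_normal((2, 4))
C = B @ B.T                                 # covariance of the basis (y1, y2)
T = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Ry = T @ C @ T.T                            # 3x3 covariance of y, rank 2
Rxy = rng.standard_normal((1, 2)) @ C @ T.T # consistent cross-covariance

# Two different solutions of the normal equations Ko Ry = Rxy:
Ko1 = Rxy @ np.linalg.pinv(Ry)              # minimum-norm solution
null = np.array([[1.0, 1.0, -1.0]])         # null vector of Ry (Ry @ null.T = 0)
Ko2 = Ko1 + 0.7 * null                      # another valid solution

assert np.allclose(Ko1 @ Ry, Rxy) and np.allclose(Ko2 @ Ry, Rxy)
# The gains differ, yet they agree on every realizable observation
# y = T @ (y1, y2), so the estimator Ko y itself is unique:
assert np.allclose(Ko1 @ T, Ko2 @ T)
```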
3.2.4 Nonzero-Mean Values and Centering
The discussion so far, and the normal equations, assumed that $x$ and $y$ have zero means. If the means $\bar{x} = E\,x$ and $\bar{y} = E\,y$ are nonzero, we center the variables, i.e., apply the affine change of variables $x - \bar{x}$, $y - \bar{y}$. The covariance and cross-covariance matrices are then

$R_y = E\,(y - \bar{y})(y - \bar{y})^*$, the covariance matrix of $y$,

$R_{xy} = E\,(x - \bar{x})(y - \bar{y})^*$, the cross-covariance matrix of $x$ and $y$,

and the optimal estimate is

$\hat{x} - \bar{x} = K_o\,(y - \bar{y})$, or, equivalently, $\hat{x} = \bar{x} + R_{xy} R_y^{-1} (y - \bar{y})$.

Strictly speaking, then, the linear least-mean-squares estimator of $x$ given $y$ is an affine function of $y$ rather than a linear one. However, it is easy to justify continuing to call $\hat{x}$ a linear function of $y$.
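A minimal sketch of the centered (affine) estimator using sample statistics in place of the exact moments (the function name and the sample-based setup are my own):

```python
import numpy as np

def affine_llmse(x_samples, y_samples, y_new):
    """x_hat = x_bar + Rxy Ry^{-1} (y_new - y_bar), with all moments
    replaced by their sample estimates."""
    x_bar = x_samples.mean(axis=0)
    y_bar = y_samples.mean(axis=0)
    Xc, Yc = x_samples - x_bar, y_samples - y_bar
    n = len(y_samples)
    Ry = Yc.T @ Yc / n               # sample covariance of y
    Rxy = Xc.T @ Yc / n              # sample cross-covariance of x and y
    return x_bar + Rxy @ np.linalg.solve(Ry, y_new - y_bar)
```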
3.2.5 Estimators for Complex-Valued Random Variables

There are two ways to handle complex-valued random variables: reduce the problem to real random variables, or work directly with the complex variables. Either way, the normal equations take the same form; the only difference is whether we work with the given vectors $\{x, y\}$ or with their extended (real and imaginary part) versions $\{x_R, x_I, y_R, y_I\}$.
3.3 A GEOMETRIC FORMULATION
3.3.1 The Orthogonality Condition

The normal equations (3.2.3) can be rewritten as

$E\,(x - K_o y)\,y^* = 0$,   (3.3.2)

which admits a geometric reading: the estimation error is orthogonal to the data, so $\hat{x} = K_o y$ is the projection of $x$ onto the linear space spanned by the entries of $y$.

Compare this with the geometric view of deterministic least squares in Ch. 2. Note that the projection spaces are different: in Figure 3.1 the entries of $K_o$ are the coefficients and the random variables $\{y_i\}$ span the space, whereas in Figure 2.1 the entries of the parameter estimate are the coefficients and the columns of the data matrix span the space.
This geometric viewpoint is stated as a lemma:

Lemma 3.3.1 (The Orthogonality Condition) The linear least-mean-squares estimator (l.l.m.s.e.) of a random variable $x$ given a set of other random variables $\{y_i\}$ is characterized by the fact that the error in the estimator is orthogonal to (i.e., uncorrelated with) each of the random variables used to form the estimator. Equivalently, the l.l.m.s.e. is the projection of $x$ onto $\mathcal{L}\{y\}$.

Projection onto the linear space (which we denote here by $\mathcal{L}\{y\}$) has the important properties of linearity and idempotence: the projection of a linear combination of random variables is the same linear combination of their projections, and projecting an element already in $\mathcal{L}\{y\}$ returns that element itself. These geometrically intuitive properties can be formally verified by using the explicit formula $\hat{x} = R_{xy} R_y^{-1} y$.
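The equivalence between the orthogonality condition and the normal equations is a one-line computation:

$E\,(x - K_o y)\,y^* = E\,xy^* - K_o\,E\,yy^* = R_{xy} - K_o R_y = 0 \iff K_o R_y = R_{xy}$.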
3.3.2 Examples
3.4 LINEAR MODELS
An extremely important special case, which will often arise in our analysis, occurs when $x$ and $y$ are linearly related, say as

$y = H x + v$,

where $H$ is a known matrix and $v$ is a zero-mean random-noise vector uncorrelated with $x$. Assume that $R_x$ and $R_v$ are known and also that $R_v > 0$. Then the l.l.m.s.e. and the corresponding m.m.s.e. can be written as

$\hat{x} = R_x H^* (H R_x H^* + R_v)^{-1}\, y$

and

$P = R_x - R_x H^* (H R_x H^* + R_v)^{-1} H R_x$.
These formulas will be encountered in many different contexts in later chapters.
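A small numerical sketch of the linear-model formulas (the dimensions and NumPy usage are my own illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

n, m = 2, 4                                  # x in R^n, y in R^m
H = rng.standard_normal((m, n))              # known model matrix
Rx = np.eye(n)                               # prior covariance of x
Rv = 0.5 * np.eye(m)                         # noise covariance, Rv > 0

S = H @ Rx @ H.T + Rv                        # Ry = H Rx H* + Rv
K = Rx @ H.T @ np.linalg.inv(S)              # gain: Rx H* (H Rx H* + Rv)^{-1}
P = Rx - K @ H @ Rx                          # m.m.s.e. covariance

# Draw one realization and form the estimate x_hat = K y.
x = rng.multivariate_normal(np.zeros(n), Rx)
v = rng.multivariate_normal(np.zeros(m), Rv)
y = H @ x + v
x_hat = K @ y
```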
3.4.1 Information Forms When $R_x > 0$ and $R_v > 0$

We may remark that formulas using inverses of covariance matrices are sometimes called information form results because, loosely speaking, the amount of information obtained by observing a random variable varies inversely with its variance.

Using the matrix inversion lemma,

$(A + BCD)^{-1} = A^{-1} - A^{-1} B\,(C^{-1} + D A^{-1} B)^{-1} D A^{-1}$,

the m.m.s.e. can be re-expressed in information form as

$P^{-1} = R_x^{-1} + H^* R_v^{-1} H$,

which leads to the nice formula

$\hat{x} = P H^* R_v^{-1}\, y$.
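The gain formula follows from a push-through identity (a short verification, added since the notes skip this step): multiplying out,

$H^* R_v^{-1} (H R_x H^* + R_v) = (R_x^{-1} + H^* R_v^{-1} H)\, R_x H^*$,

and multiplying by the appropriate inverses on both sides gives

$P H^* R_v^{-1} = (R_x^{-1} + H^* R_v^{-1} H)^{-1} H^* R_v^{-1} = R_x H^* (H R_x H^* + R_v)^{-1}$,

which is exactly the gain in the l.l.m.s.e. formula above.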
3.4.2 The Gauss-Markov Theorem
3.4.3 Combining Estimators

Suppose we wish to estimate an unknown random variable from several separate observations:
Lemma 3.4.1 (Combining Estimators) Let $y$ and $z$ be two separate observations of a zero-mean random variable $x$, such that

$y = H_1 x + v_1$ and $z = H_2 x + v_2$,

where $\{v_1, v_2\}$ are mutually uncorrelated zero-mean random variables, uncorrelated with $x$, with covariance matrices $R_1$ and $R_2$, respectively. Denote by $\hat{x}_1$ and $\hat{x}_2$ the l.l.m.s. estimators of $x$ given $y$ and given $z$, respectively, and likewise define the error covariance matrices $P_1$ and $P_2$. Then the l.l.m.s. estimator of $x$ given both $y$ and $z$ can be found as

$\hat{x} = P\,(P_1^{-1} \hat{x}_1 + P_2^{-1} \hat{x}_2)$,

where $P$, the corresponding error covariance matrix, is given by

$P^{-1} = P_1^{-1} + P_2^{-1} - R_x^{-1}$.
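A numerical sanity check of the combination formula (the setup dimensions are my own assumptions):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2
Rx = np.eye(n)                                # prior covariance of x, Rx > 0
H1 = rng.standard_normal((3, n)); R1 = 0.4 * np.eye(3)
H2 = rng.standard_normal((4, n)); R2 = 0.6 * np.eye(4)

def llmse(H, Rv):
    """Gain and error covariance for the model obs = H x + v."""
    K = Rx @ H.T @ np.linalg.inv(H @ Rx @ H.T + Rv)
    return K, Rx - K @ H @ Rx

K1, P1 = llmse(H1, R1)
K2, P2 = llmse(H2, R2)

# Direct joint estimate from the stacked observation [y; z]:
Hs = np.vstack([H1, H2])
Rs = np.block([[R1, np.zeros((3, 4))], [np.zeros((4, 3)), R2]])
Ks, Ps = llmse(Hs, Rs)

# Lemma 3.4.1: P^{-1} = P1^{-1} + P2^{-1} - Rx^{-1}, and
# x_hat = P (P1^{-1} x_hat1 + P2^{-1} x_hat2).
Pinv = np.linalg.inv(P1) + np.linalg.inv(P2) - np.linalg.inv(Rx)
assert np.allclose(np.linalg.inv(Pinv), Ps)

y = rng.standard_normal(3); z = rng.standard_normal(4)   # any observed values
x1, x2 = K1 @ y, K2 @ z
x_comb = np.linalg.inv(Pinv) @ (np.linalg.inv(P1) @ x1 + np.linalg.inv(P2) @ x2)
assert np.allclose(x_comb, Ks @ np.concatenate([y, z]))
```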
3.5 EQUIVALENCE WITH DETERMINISTIC LEAST SQUARES
Appendix for Chapter 3
3.A LEAST-MEAN-SQUARES ESTIMATION
In this appendix we consider the more general problem of determining a possibly nonlinear function $h(\cdot)$ that provides the best estimate, in the least-mean-squares sense, of a random variable $x$ from observations of another random variable $y$.

Define the error random variable $\tilde{x} = x - h(y)$. The least-mean-squares criterion minimizes the "variance" of this error variable, i.e., the cost function (whose value is the (error) covariance matrix)

$P(h) = E\,\tilde{x}\tilde{x}^* = E\,(x - h(y))(x - h(y))^*$.
Theorem 3.A.1 (The Optimal Least-Mean-Squares Estimator) The optimal least-mean-squares (l.m.s.) estimator of a random variable $x$, given the value of another random variable $y$, is given by the conditional expectation

$\hat{x} = E[x \mid y]$.

In particular, if $x$ and $y$ are independent random variables, then the optimal estimator of $x$ is its mean, $\hat{x} = E[x] = \bar{x}$.
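A Monte-Carlo sketch contrasting the conditional-mean estimator with the best linear one (the nonlinear model $x = y^2 + w$ is my own illustration; here $x$ and $y$ are dependent but uncorrelated, so the linear estimate degenerates to the mean):

```python
import numpy as np

rng = np.random.default_rng(5)
N = 200_000

# Nonlinear dependence: x = y**2 + w, so E[x | y] = y**2 (w zero-mean, indep.).
y = rng.standard_normal(N)
w = 0.3 * rng.standard_normal(N)
x = y**2 + w

mse_cond = np.mean((x - y**2) ** 2)   # conditional-mean estimator, MSE ~ 0.09
# Here E[(x - E[x]) y] = E[y^3] = 0, so the l.l.m.s. estimate reduces to the
# mean E[x] = 1, with MSE ~ Var(x) ~ 2.09.
mse_lin = np.mean((x - 1.0) ** 2)

print(mse_cond, mse_lin)              # the conditional mean wins decisively
```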