Lagrange Dual Problem


1. Preface

For the past few months, I have been troubled with the following questions:

  • How do we construct the dual problem of a linear program? Is there a general method for doing so?
  • Why is the number of variables in the dual problem equal to the number of constraints in the primal problem?
  • Why is the number of constraints in the dual problem equal to the number of variables in the primal problem?
  • Why is the sign of the $i$-th variable of the dual problem determined by the direction of the $i$-th constraint of the primal problem?
  • Why is the direction of the $j$-th constraint of the dual problem determined by the sign of the $j$-th variable of the primal problem?

If you also want to know the reasons behind them, go on reading and you will have a deeper understanding of the dual problem.

2. Lagrange Dual Problem

2.1 Dual Pair

The two problems

$$z=\max\{c(x)\,|\,x\in X\}$$
$$w=\min\{w(u)\,|\,u\in U\}$$

form a (weak)-dual pair if $c(x)\leq w(u)$ for all $x \in X$ and all $u\in U$. When $z=w$, they form a strong-dual pair.
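
As a concrete illustration of the definition, here is a small brute-force sketch. The matching/vertex-cover example is an assumption chosen for illustration and is not part of the derivation that follows: in any graph, the size of a matching never exceeds the size of a vertex cover, so the two problems form a weak-dual pair. The function names are hypothetical.

```python
from itertools import combinations

def matchings(edges):
    """Yield every subset of edges in which no two edges share a vertex."""
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            endpoints = [v for e in subset for v in e]
            if len(endpoints) == len(set(endpoints)):
                yield subset

def vertex_covers(vertices, edges):
    """Yield every subset of vertices that touches every edge."""
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            if all(u in subset or v in subset for u, v in edges):
                yield subset

def dual_pair_values(vertices, edges):
    """Return (z, w, weak): z = max matching size, w = min cover size,
    weak = True if every matching is no larger than every cover."""
    m_sizes = [len(m) for m in matchings(edges)]
    c_sizes = [len(s) for s in vertex_covers(vertices, edges)]
    weak = all(m <= s for m in m_sizes for s in c_sizes)
    return max(m_sizes), min(c_sizes), weak

triangle = dual_pair_values([1, 2, 3], [(1, 2), (2, 3), (1, 3)])
path = dual_pair_values([1, 2, 3], [(1, 2), (2, 3)])
print(triangle)  # weak-dual pair with a gap between z and w
print(path)      # weak-dual pair with z == w
```

On the triangle the two optimal values differ, so the pair is only weak; on the path they coincide, which is the strong case.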

2.2 Primal Problem

Consider an optimization problem

$$\begin{aligned} &p^*=\min f_0(x) \\ \text{s.t.}\,\, &f_i(x)\leq 0,\ i=1,\cdots,m \end{aligned}$$

We denote by $D$ the domain of the problem, with $D\subseteq \mathbb{R}^n$. The above is referred to as the primal problem.

One purpose of Lagrange duality is to find a lower bound on the optimal value $p^*$ of this minimization problem.

2.3 Dual Problem

To the problem we associate the Lagrangian $\mathcal{L}: \mathbb{R}^n\times \mathbb{R}^m\rightarrow \mathbb{R}$,

$$\mathcal{L}(x,\lambda) = f_0(x) + \sum_{i=1}^m\lambda_if_i(x)$$

The variables $\lambda\in \mathbb{R}^m$ are called Lagrange multipliers.

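For every feasible $x$ (that is, $x\in D$ with $f_i(x)\leq 0$ for all $i$) and every $\lambda\geq 0$, each term $\lambda_if_i(x)$ is nonpositive, so it is easily verified that

$$f_0(x)\geq \mathcal{L}(x,\lambda),\quad\forall\, \text{feasible } x\in D,\ \lambda\geq 0$$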

So the primal problem can be expressed exactly as

$$p^* = \min_{x\in D}\max_{\lambda\geq 0} \mathcal{L}(x,\lambda)$$
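
To see why the min-max form reproduces the primal problem, it may help to evaluate the inner maximization for a fixed $x$. A short sketch, using only the definitions above:

$$\max_{\lambda\geq 0}\mathcal{L}(x,\lambda) = \begin{cases} f_0(x), & \text{if } f_i(x)\leq 0 \text{ for all } i \\ +\infty, & \text{otherwise} \end{cases}$$

If some $f_i(x)>0$, letting $\lambda_i\rightarrow\infty$ makes the Lagrangian arbitrarily large. If $x$ is feasible, every term $\lambda_if_i(x)$ is nonpositive, so the maximum is attained at $\lambda=0$ and equals $f_0(x)$. The outer minimization over $x\in D$ therefore never selects an infeasible point.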

We then define the Lagrange dual function

$$g(\lambda) = \min_{x\in \mathbb{R}^n} \mathcal{L}(x,\lambda)$$

Thus, for every feasible $x$ and every $\lambda\geq 0$, we obtain

$$f_0(x)\geq \mathcal{L}(x,\lambda) \geq g(\lambda),\quad\forall\, \text{feasible } x\in D,\ \lambda\geq 0$$

so the problem

$$\begin{aligned} &d^* = \max g(\lambda) \\ \text{s.t.}\,\,& \lambda \geq 0 \end{aligned}$$

and the primal problem form a (weak)-dual pair. The above problem is called the Lagrange dual problem.
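
The construction can be checked numerically on a one-dimensional problem. The sketch below assumes a hypothetical instance, $\min x^2$ subject to $1-x\leq 0$, which is not part of the original text. Its Lagrangian is $x^2+\lambda(1-x)$, minimized at $x=\lambda/2$, so $g(\lambda)=\lambda-\lambda^2/4$.

```python
def lagrangian(x, lam):
    """L(x, lam) = f0(x) + lam * f1(x) with f0(x) = x**2 and f1(x) = 1 - x."""
    return x**2 + lam * (1.0 - x)

def dual_function(lam):
    """g(lam) = min_x L(x, lam); setting dL/dx = 0 gives the minimizer x = lam / 2."""
    return lagrangian(lam / 2.0, lam)

# Primal optimum: the constraint x >= 1 is active, so p* = f0(1) = 1.
p_star = 1.0

# Maximize g over a grid of lam >= 0 (step 0.01 on [0, 5]).
grid = [k / 100.0 for k in range(501)]
lam_best = max(grid, key=dual_function)
d_star = dual_function(lam_best)

# Weak duality: every g(lam) with lam >= 0 is a lower bound on p*.
weak_duality_holds = all(dual_function(lam) <= p_star + 1e-12 for lam in grid)
print(lam_best, d_star, weak_duality_holds)
```

In this instance the best lower bound matches the primal optimum, so the pair happens to be strong; in general only the inequality $d^*\leq p^*$ is guaranteed.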

2.4 Cases With Equality Constraints

Generally, consider cases with equality constraints:

$$\begin{aligned} &p^*=\min f_0(x) \\ \text{s.t.}& \begin{cases} f_i(x)\leq 0,&i=1,\cdots,m \\ h_i(x) = 0,&i=1,\cdots,p \end{cases} \end{aligned}$$

Rewrite the problem as:

$$\begin{aligned} &p^*=\min f_0(x) \\ \text{s.t.}& \begin{cases} f_i(x)\leq 0,&i=1,\cdots,m \\ h_i(x) \leq 0,&i=1,\cdots,p \\ -h_i(x) \leq 0,&i=1,\cdots,p \end{cases} \end{aligned}$$

Using multipliers $v^+_i\geq 0$ and $v^-_i\geq 0$ for the constraints $h_i(x) \leq 0$ and $-h_i(x) \leq 0$ respectively, we write the associated Lagrangian as

$$\begin{aligned} \mathcal{L}(x,\lambda,v^+,v^-) &= f_0(x) + \sum_{i=1}^m\lambda_if_i(x) + \sum_{i=1}^pv^+_ih_i(x)+\sum_{i=1}^pv^-_i(-h_i(x)) \\ & = f_0(x) + \sum_{i=1}^m\lambda_if_i(x) + \sum_{i=1}^pv_ih_i(x) \end{aligned}$$

where $v=v^+ - v^-$ has no sign constraint, because the difference of two nonnegative numbers can take any real value.

Thus, inequality constraints in the original problem are associated with sign-constrained multipliers, while equality constraints are associated with free multipliers.
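
A small numerical sketch may make the free multiplier tangible. It assumes a hypothetical instance, $\min x_1^2+x_2^2$ subject to $h(x)=x_1+x_2-1=0$, which is not from the original text. Setting the gradient of the Lagrangian to zero gives $x_1=x_2=-v/2$, and the best multiplier turns out to be negative.

```python
def dual_function(v):
    """g(v) = min_x L(x, v) for f0(x) = x1^2 + x2^2 and h(x) = x1 + x2 - 1.

    Setting the gradient of L to zero gives x1 = x2 = -v / 2.
    """
    x1 = x2 = -v / 2.0
    return x1**2 + x2**2 + v * (x1 + x2 - 1.0)

# Primal optimum: x = (1/2, 1/2), so p* = 1/2.
p_star = 0.5

# v is free, so the grid covers negative and positive values: [-3, 3], step 0.01.
grid = [k / 100.0 for k in range(-300, 301)]
v_best = max(grid, key=dual_function)
d_star = dual_function(v_best)

# For comparison only: wrongly restricting v to be nonnegative weakens the bound.
d_restricted = max(dual_function(v) for v in grid if v >= 0)
print(v_best, d_star, d_restricted)
```

Here the unrestricted maximum recovers $p^*$, whereas forcing $v\geq 0$ would only give the bound $0$.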

3. Examples

Based on the above theory, let's construct the dual problem of a linear program.

3.1 Inequality Form

Consider the following form:

$$\begin{aligned} &\max c^Tx \\ \text{s.t.}& \begin{cases} Ax\leq b \\ x\geq 0 \end{cases} \end{aligned}$$

Change the form into

$$\begin{aligned} &\min -c^Tx \\ \text{s.t.}& \begin{cases} Ax-b\leq 0 \\ -x\leq 0 \end{cases} \end{aligned}$$

Construct the Lagrangian

$$\mathcal{L}(x,\lambda,v) = -c^Tx+\lambda^T(Ax-b)+v^T(-x) = (-c^T+\lambda^TA-v^T)x-\lambda^Tb$$
$$-c^Tx\geq \mathcal{L}(x,\lambda,v),\quad\forall\,\lambda\geq 0,\ v\geq 0 \text{ and every feasible } x$$

If $-c^T+\lambda^TA-v^T =0$, the Lagrangian no longer depends on $x$ (otherwise it is unbounded below in $x$, so $g(\lambda,v)=-\infty$ and the bound is useless), and then

$$g(\lambda,v) = \min_x \mathcal{L}(x,\lambda,v) = -\lambda^Tb$$
$$-c^Tx\geq \mathcal{L}(x,\lambda,v)\geq g(\lambda,v),\quad\forall\, \lambda\geq 0,\ v\geq 0 \text{ and every feasible } x$$

So the dual problem is

$$\begin{aligned} &\max -\lambda^Tb \\ \text{s.t.}& \begin{cases} -c^T+\lambda^TA = v^T \\ \lambda\geq 0,\ v\geq 0 \end{cases} \end{aligned}$$

Eliminating the slack multiplier $v\geq 0$ and replacing $\max -\lambda^Tb$ by the equivalent $\min \lambda^Tb$, we obtain the final form:

$$\begin{aligned} &\min \lambda^Tb \\ \text{s.t.}& \begin{cases} A^T\lambda \geq c\\ \lambda\geq 0 \end{cases} \end{aligned}$$
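
The primal and dual forms can be compared on a small instance. The sketch below uses hypothetical data (`A`, `b`, `c` and the candidate solutions are assumptions chosen for illustration) and exact fractions, so the comparison involves no rounding. It checks feasibility on both sides and evaluates the two objectives.

```python
from fractions import Fraction as F

# Hypothetical data for  max c^T x  s.t.  A x <= b, x >= 0.
A = [[1, 0], [0, 2], [3, 2]]
b = [4, 12, 18]
c = [3, 5]

def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def primal_feasible(x):
    """Check A x <= b and x >= 0."""
    return all(dot(row, x) <= bi for row, bi in zip(A, b)) and all(xj >= 0 for xj in x)

def dual_feasible(lam):
    """Check A^T lam >= c and lam >= 0."""
    columns = list(zip(*A))  # columns of A are the rows of A^T
    return all(dot(col, lam) >= cj for col, cj in zip(columns, c)) and all(l >= 0 for l in lam)

x_opt = [F(2), F(6)]               # candidate primal solution
lam_opt = [F(0), F(3, 2), F(1)]    # candidate dual solution
lam_other = [F(3), F(5, 2), F(0)]  # another dual-feasible point

z = dot(c, x_opt)            # primal objective c^T x
w = dot(b, lam_opt)          # dual objective lam^T b
w_other = dot(b, lam_other)  # a weaker upper bound on z
print(primal_feasible(x_opt), dual_feasible(lam_opt), z, w, w_other)
```

Because both candidates are feasible and their objective values coincide, weak duality implies that each is optimal for its own problem; the other dual-feasible point only gives a looser upper bound.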

3.2 A General Form

Consider the following problem

$$\begin{aligned} &\max\, c_1x_1+c_2x_2+c_3x_3 \\ \text{s.t.}& \begin{cases} a_{11}x_1+a_{12}x_2+a_{13}x_3 \leq b_1 \\ a_{21}x_1+a_{22}x_2+a_{23}x_3 \geq b_2 \\ a_{31}x_1+a_{32}x_2+a_{33}x_3 = b_3 \\ x_1\geq 0,\ x_2\leq 0,\ x_3\text{ free} \end{cases} \end{aligned}$$

Rewrite the problem as:

$$\begin{aligned} &\min\,-(c_1x_1+c_2x_2+c_3x_3) \\ \text{s.t.}& \begin{cases} a_{11}x_1+a_{12}x_2+a_{13}x_3-b_1 \leq 0 \\ -(a_{21}x_1+a_{22}x_2+a_{23}x_3-b_2)\leq 0 \\ a_{31}x_1+a_{32}x_2+a_{33}x_3 - b_3 \leq 0 \\ -(a_{31}x_1+a_{32}x_2+a_{33}x_3 - b_3)\leq 0 \\ -x_1\leq 0\\ x_2\leq 0 \end{cases} \end{aligned}$$

Construct the Lagrangian

$$\begin{aligned} \mathcal{L}(x,\lambda,v) =\,& -c_1x_1-c_2x_2-c_3x_3 + \\ &\lambda_1^+(a_{11}x_1+a_{12}x_2+a_{13}x_3-b_1) +\\ &\lambda_2^-(-(a_{21}x_1+a_{22}x_2+a_{23}x_3-b_2))+\\ &\lambda_3^+(a_{31}x_1+a_{32}x_2+a_{33}x_3 - b_3) +\\ &\lambda_3^-(-(a_{31}x_1+a_{32}x_2+a_{33}x_3 - b_3))+\\ &v_1^-(-x_1) + v_2^+x_2 \\ =\, &(-c_1+\lambda_1^+a_{11}-\lambda_2^-a_{21} + (\lambda_3^+ -\lambda_3^-)a_{31} - v_1^-)x_1 +\\ &(-c_2+\lambda_1^+a_{12}-\lambda_2^-a_{22} + (\lambda_3^+ -\lambda_3^-)a_{32}+v_2^+)x_2 +\\ &(-c_3+\lambda_1^+a_{13}-\lambda_2^-a_{23} + (\lambda_3^+ -\lambda_3^-)a_{33})x_3- \\ & (\lambda_1^+b_1-\lambda_2^-b_2+(\lambda_3^+ -\lambda_3^-)b_3) \end{aligned}$$
$$-c_1x_1-c_2x_2-c_3x_3\geq \mathcal{L}(x,\lambda,v),\quad\forall\,\lambda_1^+,\lambda_2^-,\lambda_3^+,\lambda_3^-,v_1^-,v_2^+\geq 0 \text{ and every feasible } x$$

If

$$\begin{aligned} &-c_1+\lambda_1^+a_{11}-\lambda_2^-a_{21} + (\lambda_3^+ -\lambda_3^-)a_{31} - v_1^-=0\\ &-c_2+\lambda_1^+a_{12}-\lambda_2^-a_{22} + (\lambda_3^+ -\lambda_3^-)a_{32}+v_2^+=0\\ &-c_3+\lambda_1^+a_{13}-\lambda_2^-a_{23} + (\lambda_3^+ -\lambda_3^-)a_{33}=0 \end{aligned}$$

the Lagrangian no longer depends on $x$. Let $\lambda_1 = \lambda_1^+,\ \lambda_2 = -\lambda_2^-,\ \lambda_3 = \lambda_3^+ -\lambda_3^-$, so that $\lambda_1\geq 0,\ \lambda_2\leq 0,\ \lambda_3\text{ free}$. The conditions become

$$\begin{aligned} &-c_1+\lambda_1a_{11}+\lambda_2a_{21} +\lambda_3a_{31} = v_1^- \geq 0\\ &-c_2+\lambda_1a_{12}+\lambda_2a_{22} +\lambda_3a_{32} = -v_2^+ \leq 0\\ &-c_3+\lambda_1a_{13}+\lambda_2a_{23} +\lambda_3a_{33} = 0 \end{aligned}$$

and

$$g(\lambda,v) = \min_x \mathcal{L}(x,\lambda,v) =- (\lambda_1^+b_1-\lambda_2^-b_2+(\lambda_3^+ -\lambda_3^-)b_3) = -(\lambda_1b_1+\lambda_2b_2+\lambda_3b_3)$$

Maximizing $g$ is equivalent to minimizing $\lambda_1b_1+\lambda_2b_2+\lambda_3b_3$, so the dual problem is

$$\begin{aligned} &\min\, \lambda_1b_1+\lambda_2b_2+\lambda_3b_3 \\ \text{s.t.}& \begin{cases} \lambda_1a_{11}+\lambda_2a_{21} +\lambda_3a_{31} \geq c_1 \\ \lambda_1a_{12}+\lambda_2a_{22} +\lambda_3a_{32} \leq c_2 \\ \lambda_1a_{13}+\lambda_2a_{23} +\lambda_3a_{33} = c_3 \\ \lambda_1\geq 0,\ \lambda_2\leq 0,\ \lambda_3\text{ free} \end{cases} \end{aligned}$$
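
The correspondence read off this derivation can be written down as a lookup table. The sketch below is only an illustration for a primal stated as a maximization; the function name, the dictionary layout, and the numeric data are hypothetical.

```python
# Rules read off the derivation for a primal written as a maximization:
# the i-th primal constraint fixes the sign of the i-th dual variable, and
# the j-th primal variable fixes the direction of the j-th dual constraint.
CONSTRAINT_TO_DUAL_VAR = {"<=": ">=0", ">=": "<=0", "=": "free"}
VAR_TO_DUAL_CONSTRAINT = {">=0": ">=", "<=0": "<=", "free": "="}

def dual_of_max_lp(c, A, b, senses, var_signs):
    """Build the dual (a minimization) of  max c^T x  s.t.  A x (senses) b.

    senses[i]    is "<=", ">=" or "=" for the i-th primal constraint.
    var_signs[j] is ">=0", "<=0" or "free" for the j-th primal variable.
    """
    A_T = [list(col) for col in zip(*A)]  # one dual constraint per primal variable
    return {
        "objective": "min",
        "c": list(b),   # dual objective coefficients are the primal right-hand sides
        "A": A_T,
        "b": list(c),   # dual right-hand sides are the primal objective coefficients
        "senses": [VAR_TO_DUAL_CONSTRAINT[s] for s in var_signs],
        "var_signs": [CONSTRAINT_TO_DUAL_VAR[s] for s in senses],
    }

# Same pattern of constraints and signs as in this section, with made-up numbers.
dual = dual_of_max_lp(
    c=[1, 2, 3],
    A=[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    b=[10, 11, 12],
    senses=["<=", ">=", "="],
    var_signs=[">=0", "<=0", "free"],
)
print(dual["senses"], dual["var_signs"])
```

The transpose makes the counting explicit: each row of the primal matrix becomes a dual variable, and each column becomes a dual constraint.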

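Comparing this dual with the primal answers the questions in the preface. Each primal constraint contributes one multiplier, hence one dual variable, whose sign is fixed by the direction of that constraint. Each primal variable contributes one condition on the coefficients of the Lagrangian, hence one dual constraint, whose direction is fixed by the sign of that variable. Thanks for your attention!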
