Abstract Duality in Optimization (Part III)

14 minute read

Published:

Difficulty: Experts & Advanced Graduate Students

We now look at some specialized applications of our previous results. This particular post covers the general framework of Rockafellar and Fenchel and derives the classic Karush-Kuhn-Tucker (KKT) conditions of Arrow and Hurwicz.

Coupled Composite Functions

Consider the specialization

\[\begin{aligned} F(u) & :=J(u,Au),\\ \Phi(u,p) & :=J(u,Au-p),\end{aligned}\]

where $J$ is a function of $V\times Y$ into $\overline{\mathbb{R}}$ and $A$ is a linear operator. It is straightforward to show that

\[\Phi^{*}(0,p^{*})=J^{*}(A^{*}p^{*},-p^{*})\]

where $A^{*}$ is the adjoint of $A$. Furthermore, if $J$ is convex (resp. in $\Gamma_{0}(V\times Y)$) then $\Phi$ is convex (resp. in $\Gamma_{0}(V\times Y)$).

Remark V.1. The Lagrangian associated with this specialization is

\[L(u,p^{*})=J_{u}^{*}(-p^{*})-\left\langle p^{*},Au\right\rangle\]

where $J_{u}$ is the function $p\mapsto J(u,p)$ and $J_{u}^{*}$ is its conjugate function.

The next result specializes the extremality conditions in part III.

Proposition V.2. The following statements are equivalent:

  1. $\bar{u}$ is a solution of ${\cal P}$, $\bar{p}^{*}$ is a solution of ${\cal P}^{*}$, and $\inf\ {\cal P}=\sup\ {\cal P}^{*}$.

  2. $J(\bar{u},A\bar{u})+J^{*}(A^{*}\bar{p}^{*},-\bar{p}^{*})=0$.

  3. $(A^{*}\bar{p}^{*},-\bar{p})\in\partial J(\bar{u},A\bar{u})$.

Decoupled Conjugate Functions

Let us further specialize $J$ as \(J(u,Au)=G(u)+H(Au).\) It is easy to verify that

\[\begin{aligned} J^{*}(u^{*},p^{*}) & =G^{*}(u^{*})+H^{*}(p^{*}),\\ {\cal P}^{*} & \equiv\sup_{p^{*}\in Y^{*}}\left\{ -G^{*}(A^{*}p^{*})-H^{*}(-p^{*})\right\} .\end{aligned}\]

Furthermore, if $G$ and $H$ are convex (resp. convex and proper lower semicontinuous) then $J$ is convex (resp. convex and proper lower semicontinuous).

Remark V.3. The extremality conditions for strong duality of ${\cal P}$ can be simplified to

\[\begin{aligned} G(\bar{u})+G^{*}(A^{*}\bar{p}^{*})-\left\langle A^{*}\bar{p}^{*},\bar{u}\right\rangle & =0,\\ H(A\bar{u})+H^{*}(-\bar{p}^{*})+\left\langle \bar{p}^{*},A\bar{u}\right\rangle & =0,\end{aligned}\]

or equivalently,

\[\begin{aligned} A^{*}\bar{p}^{*} & \in\partial G(\bar{u}),\\ -\bar{p}^{*} & \in\partial H(A\bar{u}).\end{aligned}\]

Remark V.4. When $Y=\prod_{i=1}^{m}Y_{i}$ and $Y^{*}=\prod_{i=1}^{m}Y_{i}^{*}$ then the extremality conditions for $H$ can be further decoupled into

\[H_{i}(A_{i}\bar{u})+H_{i}^{*}(-\bar{p}_{i}^{*})+\left\langle \bar{p}_{i}^{*},A_{i}\bar{u}\right\rangle =0,\]

and

\[-\bar{p}_{i}^{*}\in\partial H_{i}(A_{i}\bar{u})\]

for $i=1,\ldots,m$.

Cone-Constrained Programming

We first derive some results for general cone-convex constraint functions.

Let ${\cal C}\subseteq V$ be nonempty closed convex, let $J:V\mapsto\overline{\mathbb{R}}$ be convex and lower semicontinuous with domain ${\cal C}$, and let ${\cal K}\subseteq Y$ to be a closed convex cone with partial ordering $\leq$ and polar cone ${\cal K}^{*}$.

Furthermore, let $B:{\cal C}\mapsto Y$ be a (possibly nonlinear) mapping such that

  • $B$ is ${\cal K}$-convex on ${\cal C}$, i.e., $B(\lambda u+[1-\lambda]v)\leq\lambda B(u)+(1-\lambda)B(v)$ for every $u,v\in{\cal C}$ and $\lambda\in(0,1)$.

  • For every $p^{*}\in Y^{*}$ where $p^{*}\geq0$, the mapping $u\mapsto\left\langle p^{*},B(u)\right\rangle$ of ${\cal C}$ into $\mathbb{R}$ is lower semicontinuous.

  • ${u\in{\cal C}:B(u)\leq0}\neq\emptyset$.

Denoting $\delta_{{\cal F}_{p}}(u)$ to be the indicator function

\[\delta_{\{\cal F\}_{p}}(u):=\begin{cases} 0, & u\in{\cal C}\text{ and }B(u)\leq p,\\ +\infty, & \text{otherwise}, \end{cases}\]

of the set ${\cal F}_{p}:={u\in V:u\in{\cal C},\ B(u)\leq p}$, we are interested in the specialization

\[\begin{aligned} F(u) & :=J(u)+\delta_{\{\cal F\}_{0}}(u),\\ \Phi(u,p) & :=J(u)+\delta_{\{\cal F\}_{p}}(u).\end{aligned}\]

Some basic properties of $\Phi$ and ${\cal F}_{p}$ are as follows.

Proposition VI.1. $\Phi\in\Gamma(V\times Y)$ and ${\cal F}_{p}$ is closed and convex in $V$ for every $p\in Y$. Moreover, the set ${\cal F}={(u,p)\in V\times Y:u\in{\cal C},\ B(u)\leq p}$ is closed convex in $V\times Y$.

Remark VI.2. The Lagrangian $L$ of ${\cal P}$ is given by

\[L(u,p^{*})=J(u)-\left\langle p^{*},B(u)\right\rangle \overset{\cdot}{-}\delta_{\{\cal K\}^{*}}(-p^{*})\]

where addition and subtraction in $\overline{\mathbb{R}}$ are given by

\[\begin{aligned} (+\infty)\overset{\cdot}{+}(-\infty) & =(+\infty)\overset{\cdot}{-}(+\infty)=+\infty,\\ (+\infty)+(-\infty) & =(+\infty)\underset{\cdot}{-}(+\infty)=-\infty.\end{aligned}\]

The next result gives a sufficient condition for the stability of ${\cal P}$. It is often called the constraint qualification hypothesis or Slater’s condition.

Proposition VI.3. Suppose $\inf\ {\cal P}$ is finite and there exists $u_{0}\in{\cal C}$ such that $B(u_{0})<0$, i.e., $-B(u_{0})\in{\rm int}\ {\cal K}=$ the interior of ${\cal K}$. Then, ${\cal P}$ is stable.

Convex Programming

Let us now specialize the previous results to convex programming and, subsequently, derive the classic KKT conditions. Suppose $V=V^{*}=\mathbb{R}^{n}$, $Y=Y^{*}=\mathbb{R}^{m}$, and ${\cal K}$ equal to the nonnegative orthant ${\mathbb{R}}_{+}^{m}$. Moreover, suppose $B$ is defined by its components $B_{1},\ldots,B_{m}:{\cal C}\mapsto\mathbb{R}$ which are assumed to be convex and lower semicontinuous.

The specialization of Slater’s condition reads: $\inf\ {\cal P}$ is finite and there exists $u_{0}\in{\cal C}$ such that $B_{i}(u_{0})$ for $i=1,\ldots,m$. On the other hand, the specialization of the Lagrangian is of the form \(L(u,p)=J(u)-\sum_{i=1}^{m}p_{i}B_{i}(u)\) for $p\leq0$ and $u\in{\cal C}$.

We now finish with the Kuhn-Tucker Theorem, and the associated KKT conditions.

Theorem VI.4. The point $\bar{u}\in{\cal C}$ is a solution of ${\cal P}$ if and only if there exists $\bar{p}\in\mathbb{R}^{m}$ such that $\bar{p}\leq0$ and

\[L(\bar{u},p)\leq L(\bar{u},\bar{p})\leq L(u,\bar{p})\]

for every $u\in{\cal C}$ and $p\geq0$. In this case, $\sum_{i=1}^{m}\langle\bar{p}_{i},B_{i}(\bar{u})\rangle=0$ which implies that for every $i=1,\ldots,m$ we have

\[\begin{cases} \text{either }B_{i}(\bar{u})<0\text{ and }\bar{p}_{i}=0,\\ \text{or }B_{i}(\bar{u})=0\text{ and }\bar{p}_{i}\leq0. \end{cases}\]

Leave a Comment