Abstract Duality in Optimization (Part III)
Published:
Difficulty: Experts & Advanced Graduate Students
We now look at some specialized applications of our previous results. This particular post covers the general framework of Rockafellar and Fenchel and derives the classic Karush-Kuhn-Tucker (KKT) conditions of Arrow and Hurwicz.
Coupled Composite Functions
Consider the specialization
\[\begin{aligned} F(u) & :=J(u,Au),\\ \Phi(u,p) & :=J(u,Au-p),\end{aligned}\]where $J$ is a function of $V\times Y$ into $\overline{\mathbb{R}}$ and $A$ is a linear operator. It is straightforward to show that
\[\Phi^{*}(0,p^{*})=J^{*}(A^{*}p^{*},-p^{*})\]where $A^{*}$ is the adjoint of $A$. Furthermore, if $J$ is convex (resp. in $\Gamma_{0}(V\times Y)$) then $\Phi$ is convex (resp. in $\Gamma_{0}(V\times Y)$).
Remark V.1. The Lagrangian associated with this specialization is
\[L(u,p^{*})=J_{u}^{*}(-p^{*})-\left\langle p^{*},Au\right\rangle\]where $J_{u}$ is the function $p\mapsto J(u,p)$ and $J_{u}^{*}$ is its conjugate function.
The next result specializes the extremality conditions in part III.
Proposition V.2. The following statements are equivalent:
$\bar{u}$ is a solution of ${\cal P}$, $\bar{p}^{*}$ is a solution of ${\cal P}^{*}$, and $\inf\ {\cal P}=\sup\ {\cal P}^{*}$.
$J(\bar{u},A\bar{u})+J^{*}(A^{*}\bar{p}^{*},-\bar{p}^{*})=0$.
$(A^{*}\bar{p}^{*},-\bar{p})\in\partial J(\bar{u},A\bar{u})$.
Decoupled Conjugate Functions
Let us further specialize $J$ as \(J(u,Au)=G(u)+H(Au).\) It is easy to verify that
\[\begin{aligned} J^{*}(u^{*},p^{*}) & =G^{*}(u^{*})+H^{*}(p^{*}),\\ {\cal P}^{*} & \equiv\sup_{p^{*}\in Y^{*}}\left\{ -G^{*}(A^{*}p^{*})-H^{*}(-p^{*})\right\} .\end{aligned}\]Furthermore, if $G$ and $H$ are convex (resp. convex and proper lower semicontinuous) then $J$ is convex (resp. convex and proper lower semicontinuous).
Remark V.3. The extremality conditions for strong duality of ${\cal P}$ can be simplified to
\[\begin{aligned} G(\bar{u})+G^{*}(A^{*}\bar{p}^{*})-\left\langle A^{*}\bar{p}^{*},\bar{u}\right\rangle & =0,\\ H(A\bar{u})+H^{*}(-\bar{p}^{*})+\left\langle \bar{p}^{*},A\bar{u}\right\rangle & =0,\end{aligned}\]or equivalently,
\[\begin{aligned} A^{*}\bar{p}^{*} & \in\partial G(\bar{u}),\\ -\bar{p}^{*} & \in\partial H(A\bar{u}).\end{aligned}\]Remark V.4. When $Y=\prod_{i=1}^{m}Y_{i}$ and $Y^{*}=\prod_{i=1}^{m}Y_{i}^{*}$ then the extremality conditions for $H$ can be further decoupled into
\[H_{i}(A_{i}\bar{u})+H_{i}^{*}(-\bar{p}_{i}^{*})+\left\langle \bar{p}_{i}^{*},A_{i}\bar{u}\right\rangle =0,\]and
\[-\bar{p}_{i}^{*}\in\partial H_{i}(A_{i}\bar{u})\]for $i=1,\ldots,m$.
Cone-Constrained Programming
We first derive some results for general cone-convex constraint functions.
Let ${\cal C}\subseteq V$ be nonempty closed convex, let $J:V\mapsto\overline{\mathbb{R}}$ be convex and lower semicontinuous with domain ${\cal C}$, and let ${\cal K}\subseteq Y$ to be a closed convex cone with partial ordering $\leq$ and polar cone ${\cal K}^{*}$.
Furthermore, let $B:{\cal C}\mapsto Y$ be a (possibly nonlinear) mapping such that
$B$ is ${\cal K}$-convex on ${\cal C}$, i.e., $B(\lambda u+[1-\lambda]v)\leq\lambda B(u)+(1-\lambda)B(v)$ for every $u,v\in{\cal C}$ and $\lambda\in(0,1)$.
For every $p^{*}\in Y^{*}$ where $p^{*}\geq0$, the mapping $u\mapsto\left\langle p^{*},B(u)\right\rangle$ of ${\cal C}$ into $\mathbb{R}$ is lower semicontinuous.
${u\in{\cal C}:B(u)\leq0}\neq\emptyset$.
Denoting $\delta_{{\cal F}_{p}}(u)$ to be the indicator function
\[\delta_{\{\cal F\}_{p}}(u):=\begin{cases} 0, & u\in{\cal C}\text{ and }B(u)\leq p,\\ +\infty, & \text{otherwise}, \end{cases}\]of the set ${\cal F}_{p}:={u\in V:u\in{\cal C},\ B(u)\leq p}$, we are interested in the specialization
\[\begin{aligned} F(u) & :=J(u)+\delta_{\{\cal F\}_{0}}(u),\\ \Phi(u,p) & :=J(u)+\delta_{\{\cal F\}_{p}}(u).\end{aligned}\]Some basic properties of $\Phi$ and ${\cal F}_{p}$ are as follows.
Proposition VI.1. $\Phi\in\Gamma(V\times Y)$ and ${\cal F}_{p}$ is closed and convex in $V$ for every $p\in Y$. Moreover, the set ${\cal F}={(u,p)\in V\times Y:u\in{\cal C},\ B(u)\leq p}$ is closed convex in $V\times Y$.
Remark VI.2. The Lagrangian $L$ of ${\cal P}$ is given by
\[L(u,p^{*})=J(u)-\left\langle p^{*},B(u)\right\rangle \overset{\cdot}{-}\delta_{\{\cal K\}^{*}}(-p^{*})\]where addition and subtraction in $\overline{\mathbb{R}}$ are given by
\[\begin{aligned} (+\infty)\overset{\cdot}{+}(-\infty) & =(+\infty)\overset{\cdot}{-}(+\infty)=+\infty,\\ (+\infty)+(-\infty) & =(+\infty)\underset{\cdot}{-}(+\infty)=-\infty.\end{aligned}\]The next result gives a sufficient condition for the stability of ${\cal P}$. It is often called the constraint qualification hypothesis or Slater’s condition.
Proposition VI.3. Suppose $\inf\ {\cal P}$ is finite and there exists $u_{0}\in{\cal C}$ such that $B(u_{0})<0$, i.e., $-B(u_{0})\in{\rm int}\ {\cal K}=$ the interior of ${\cal K}$. Then, ${\cal P}$ is stable.
Convex Programming
Let us now specialize the previous results to convex programming and, subsequently, derive the classic KKT conditions. Suppose $V=V^{*}=\mathbb{R}^{n}$, $Y=Y^{*}=\mathbb{R}^{m}$, and ${\cal K}$ equal to the nonnegative orthant ${\mathbb{R}}_{+}^{m}$. Moreover, suppose $B$ is defined by its components $B_{1},\ldots,B_{m}:{\cal C}\mapsto\mathbb{R}$ which are assumed to be convex and lower semicontinuous.
The specialization of Slater’s condition reads: $\inf\ {\cal P}$ is finite and there exists $u_{0}\in{\cal C}$ such that $B_{i}(u_{0})$ for $i=1,\ldots,m$. On the other hand, the specialization of the Lagrangian is of the form \(L(u,p)=J(u)-\sum_{i=1}^{m}p_{i}B_{i}(u)\) for $p\leq0$ and $u\in{\cal C}$.
We now finish with the Kuhn-Tucker Theorem, and the associated KKT conditions.
Theorem VI.4. The point $\bar{u}\in{\cal C}$ is a solution of ${\cal P}$ if and only if there exists $\bar{p}\in\mathbb{R}^{m}$ such that $\bar{p}\leq0$ and
\[L(\bar{u},p)\leq L(\bar{u},\bar{p})\leq L(u,\bar{p})\]for every $u\in{\cal C}$ and $p\geq0$. In this case, $\sum_{i=1}^{m}\langle\bar{p}_{i},B_{i}(\bar{u})\rangle=0$ which implies that for every $i=1,\ldots,m$ we have
\[\begin{cases} \text{either }B_{i}(\bar{u})<0\text{ and }\bar{p}_{i}=0,\\ \text{or }B_{i}(\bar{u})=0\text{ and }\bar{p}_{i}\leq0. \end{cases}\]
Leave a Comment