Publications

You can also find my articles on my Google Scholar profile.

2024

  • Global Complexity Bound of a Proximal ADMM for Linearly-Constrained Nonseparable Nonconvex Composite Programming
    Weiwei Kong & Renato D. C. Monteiro
    SIAM Journal on Optimization 34 (2024)

    This paper proposes and analyzes a dampened proximal alternating direction method of multipliers (DP.ADMM) for solving linearly-constrained nonconvex optimization problems where the smooth part of the objective function is nonseparable. Each iteration of DP.ADMM consists of: (i) a sequence of partial proximal augmented Lagrangian (AL) updates, (ii) an under-relaxed Lagrange multiplier update, and (iii) a novel test to check whether the penalty parameter of the AL function should be updated. Under a basic Slater condition and some requirements related to the dampening factor and under-relaxation parameter, it is shown that DP.ADMM obtains a first-order stationary point of the constrained problem in $O(\varepsilon^{-3})$ iterations for a given numerical tolerance $\varepsilon>0$. One of the main novelties of the paper is that convergence of the method is obtained without requiring any rank assumptions on the constraint matrices.
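
    As a rough sketch in generic notation (not the paper's exact scheme), for an objective $f+h$ and constraint $Ax=b$, a dampened proximal AL method of this type is built around a function of the form $\mathcal{L}_c^\theta(x;p) = f(x) + h(x) + \theta\langle p, Ax-b\rangle + \frac{c}{2}\|Ax-b\|^2$, where $\theta\in(0,1)$ is the dampening factor and $c>0$ is the penalty parameter, and it follows the primal updates with an under-relaxed multiplier step of the form $p^+ = p + \chi c(Ax^+ - b)$ for some under-relaxation parameter $\chi\in(0,1)$.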

2023

  • Iteration-complexity of an inner accelerated inexact proximal augmented Lagrangian method based on the classical Lagrangian function
    Weiwei Kong, Renato D. C. Monteiro, & Jefferson G. Melo
    SIAM Journal on Optimization 33 (2023)

    This paper establishes the iteration-complexity of an inner accelerated inexact proximal augmented Lagrangian (IAIPAL) method, based on the classical augmented Lagrangian (AL) function, for solving linearly-constrained smooth nonconvex composite optimization problems. More specifically, each IAIPAL iteration consists of inexactly solving a proximal AL subproblem by an accelerated composite gradient (ACG) method followed by a classical Lagrange multiplier update. Under the assumption that the domain of the composite function is bounded and the problem has a Slater point, it is shown that IAIPAL generates an approximate stationary solution in $O(\varepsilon^{-5/2}\log^2 \varepsilon^{-1})$ ACG iterations, where $\varepsilon>0$ is a tolerance for both stationarity and feasibility. Moreover, the above bound is derived without assuming that the initial point is feasible. Finally, numerical results are presented to demonstrate the strong practical performance of IAIPAL.
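
    Schematically, in generic notation rather than the paper's, iteration $k$ inexactly solves the proximal AL subproblem $x_k \approx \mathrm{argmin}_x \{\mathcal{L}_c(x;p_{k-1}) + \frac{1}{2\lambda}\|x - x_{k-1}\|^2\}$ by the ACG method, where $\mathcal{L}_c(x;p) = f(x) + h(x) + \langle p, Ax-b\rangle + \frac{c}{2}\|Ax-b\|^2$ is the classical AL function for the constraint $Ax=b$, and then performs the multiplier update $p_k = p_{k-1} + c(Ax_k - b)$.
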
  • An Accelerated Inexact Dampened Augmented Lagrangian Method for Linearly-Constrained Nonconvex Composite Optimization Problems
    Weiwei Kong & Renato D. C. Monteiro
    Computational Optimization and Applications (2023)

    This paper proposes and analyzes an accelerated inexact dampened augmented Lagrangian (AIDAL) method for solving linearly-constrained nonconvex composite optimization problems. Each iteration of the AIDAL method consists of: (i) inexactly solving a dampened proximal augmented Lagrangian (AL) subproblem by calling an accelerated composite gradient (ACG) subroutine; (ii) applying a dampened and under-relaxed Lagrange multiplier update; and (iii) using a novel test to check whether the penalty parameter of the AL function should be increased. Under several mild assumptions involving the dampening factor and the under-relaxation constant, it is shown that the AIDAL method generates an approximate stationary point of the constrained problem in $O(\varepsilon^{-5/2}\log^2\varepsilon^{-1})$ iterations of the ACG subroutine, for a given tolerance $\varepsilon>0$. Numerical experiments are also given to show the computational efficiency of the proposed method.
  • A Unified Fast Gradient Clipping Framework for DP-SGD
    Weiwei Kong & Andres Muñoz Medina
    Advances in Neural Information Processing Systems 36 (2023)

    A well-known numerical bottleneck in the differentially-private stochastic gradient descent (DP-SGD) algorithm is the computation of the gradient norm for each example in a large input batch. When the loss function in DP-SGD consists of an intermediate linear operation, existing methods in the literature have proposed decompositions of gradients that are amenable to fast norm computations. In this paper, we present a framework that generalizes the above approach to arbitrary (possibly nonlinear) intermediate operations. Moreover, we show that for certain operations, such as fully-connected and embedding layer computations, further improvements to the runtime and storage costs of existing decompositions can be deduced using certain components of our framework. Finally, preliminary numerical experiments are given to demonstrate the substantial effects of the aforementioned improvements.
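
    As a minimal illustration of the kind of decomposition involved (shown here only for a plain linear layer, with hypothetical array names; not the paper's general framework), the per-example gradient of the weights is the outer product $x_i g_i^\top$, so its Frobenius norm factors as $\|x_i\|\,\|g_i\|$ and the full gradient never needs to be materialized:

      import numpy as np

      rng = np.random.default_rng(0)
      n, d, m = 8, 5, 3
      X = rng.standard_normal((n, d))  # per-example inputs to a linear layer
      G = rng.standard_normal((n, m))  # per-example gradients w.r.t. the layer outputs

      # Naive: materialize each d-by-m per-example gradient and take its norm.
      naive = np.array([np.linalg.norm(np.outer(X[i], G[i])) for i in range(n)])

      # Fast: ||x_i g_i^T||_F = ||x_i|| * ||g_i||, computed from two vectors of row norms.
      fast = np.linalg.norm(X, axis=1) * np.linalg.norm(G, axis=1)

      assert np.allclose(naive, fast)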

2022

  • Accelerated Inexact Composite Gradient Methods for Nonconvex Spectral Optimization Problems
    Weiwei Kong & Renato D. C. Monteiro
    Computational Optimization and Applications 82 (2022)

    This paper presents two inexact composite gradient methods, one inner accelerated and another doubly accelerated, for solving a class of nonconvex spectral composite optimization problems. More specifically, the objective function for these problems is of the form $f_1+f_2+h$, where $f_1$ and $f_2$ are differentiable nonconvex matrix functions with Lipschitz continuous gradients, $h$ is a proper closed convex matrix function, and both $f_2$ and $h$ can be expressed as functions that operate on the singular values of their inputs. The methods essentially use an accelerated composite gradient method to solve a sequence of proximal subproblems involving the linear approximation of $f_1$ and the singular value functions underlying $f_2$ and $h$. Unlike other composite gradient-based methods, the proposed methods take advantage of both the composite and spectral structure underlying the objective function in order to efficiently generate their solutions. Numerical experiments are presented to demonstrate the practicality of these methods on a set of real-world and randomly generated spectral optimization problems.
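
    A concrete instance of the spectral structure being exploited: if $h = \phi \circ \sigma$ for an absolutely symmetric convex function $\phi$ acting on the singular values $\sigma(\cdot)$, then by a classical result (not specific to this paper) its proximal operator reduces to a vector computation, $\mathrm{prox}_{\lambda h}(X) = U\,\mathrm{diag}(\mathrm{prox}_{\lambda\phi}(\sigma(X)))\,V^\top$, where $X = U\,\mathrm{diag}(\sigma(X))\,V^\top$ is an SVD of $X$.
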
  • Iteration-complexity of a proximal augmented Lagrangian method for solving nonconvex composite optimization problems with nonlinear convex constraints
    Weiwei Kong & Renato D. C. Monteiro
    Mathematics of Operations Research (2022)

    This paper proposes and analyzes a proximal augmented Lagrangian (NL-IAPIAL) method for solving smooth nonconvex composite optimization problems with nonlinear ${\cal K}$-convex constraints, i.e., the constraints are convex with respect to the order given by a closed convex cone ${\cal K}$. Each NL-IAPIAL iteration consists of inexactly solving a proximal augmented Lagrangian subproblem by an accelerated composite gradient (ACG) method followed by a Lagrange multiplier update. Under some mild assumptions, it is shown that NL-IAPIAL generates an approximate stationary solution of the constrained problem in $O(\log(1/\rho)/\rho^{3})$ inner iterations, where $\rho>0$ is a given tolerance. Numerical experiments are also given to illustrate the computational efficiency of the proposed method.
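
    For reference, ${\cal K}$-convexity of a constraint map $g$ means $g(\alpha x + (1-\alpha)y) \preceq_{\cal K} \alpha g(x) + (1-\alpha)g(y)$ for all $x,y$ and $\alpha\in[0,1]$, where $u \preceq_{\cal K} v$ means $v - u \in {\cal K}$; taking ${\cal K} = \mathbb{R}^m_+$ recovers ordinary componentwise convex inequality constraints.
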
  • Complexity-optimal and Curvature-free First-order Methods for Finding Stationary Points of Composite Optimization Problems
    Weiwei Kong
    Submitted to SIAM Journal on Optimization (2022)

    This paper develops and analyzes an accelerated proximal descent method for finding stationary points of nonconvex composite optimization problems. The objective function is of the form $f+h$ where $h$ is a proper closed convex function, $f$ is a differentiable function on the domain of $h$, and $\nabla f$ is Lipschitz continuous on the domain of $h$. The main advantage of this method is that it is "curvature-free" in the sense that it does not require knowledge of the Lipschitz constant of $\nabla f$ or of any global topological properties of $f$. It is shown that the proposed method can obtain a $\rho$-approximate stationary point with iteration complexity bounds that are optimal, up to logarithmic factors in $\rho$, in both the convex and nonconvex settings. Some discussion is also given about how the proposed method can be leveraged in other existing optimization frameworks, such as min-max smoothing and penalty frameworks for constrained programming, to create more specialized curvature-free methods. Finally, numerical experiments on a set of nonconvex quadratic semidefinite programming problems are given to support the practical viability of the method.
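
    The "curvature-free" idea can be illustrated in its simplest, non-accelerated form by a backtracking proximal gradient step that estimates the local curvature on the fly instead of requiring the Lipschitz constant of $\nabla f$ (a standard device, not the paper's accelerated method; the names below are hypothetical):

      import numpy as np

      def prox_grad_step(x, f, grad_f, prox_h, L=1.0, growth=2.0):
          """One proximal gradient step on f + h, backtracking on the curvature estimate L."""
          fx, gx = f(x), grad_f(x)
          while True:
              y = prox_h(x - gx / L, 1.0 / L)  # prox_h(v, t) = argmin_u h(u) + ||u - v||^2 / (2t)
              # Descent-lemma test: accept L once it upper-bounds the curvature along x -> y.
              if f(y) <= fx + gx @ (y - x) + 0.5 * L * np.sum((y - x) ** 2):
                  return y, L
              L *= growth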

2021

  • An accelerated inexact proximal point method for solving nonconvex-concave min-max problems
    Weiwei Kong & Renato D. C. Monteiro
    SIAM Journal on Optimization 31 (2021)

    This paper presents smoothing schemes for obtaining approximate stationary points of unconstrained or linearly constrained composite nonconvex-concave min-max (and hence nonsmooth) problems by applying well-known algorithms to composite smooth approximations of the original problems. More specifically, in the unconstrained (resp., constrained) case, approximate stationary points of the original problem are obtained by applying, to its composite smooth approximation, an accelerated inexact proximal point (resp., quadratic penalty) method presented in a previous paper by the authors. Iteration complexity bounds for both smoothing schemes are also established. Finally, numerical results are given to demonstrate the efficiency of the unconstrained smoothing scheme.
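
    The underlying smoothing can be sketched in generic notation as follows: given an inner part $\max_{y\in Y}\Phi(x,y)$ with $\Phi(x,\cdot)$ concave, one replaces it with $\hat{f}_\mu(x) = \max_{y\in Y}\{\Phi(x,y) - \frac{\mu}{2}\|y - y_0\|^2\}$ for a smoothing parameter $\mu>0$ and a fixed anchor $y_0$; the added strongly concave term makes the maximizer unique and $\hat{f}_\mu$ differentiable, at the cost of an $O(\mu)$ approximation error that is controlled by choosing $\mu$ against the tolerance.
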
  • FISTA and Extensions - Review and New Insights
    Weiwei Kong, Jefferson G. Melo, & Renato D. C. Monteiro
    Technical Report (2021)

    The purpose of this technical report is to review the main properties of an accelerated composite gradient (ACG) method commonly referred to as the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA). In addition, we state a version of FISTA for solving both convex and strongly convex composite minimization problems and derive its iteration complexities to generate iterates satisfying various stopping criteria, including one which arises in the course of solving other composite optimization problems via inexact proximal point schemes. This report also discusses different reformulations of the convex version of FISTA and how they relate to other formulations in the literature.
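
    For concreteness, here is a minimal implementation of the textbook convex variant reviewed in the report, assuming a known Lipschitz constant $L$ of $\nabla f$ and a user-supplied proximal oracle prox_h(v, t) for $t\,h$ (the names are illustrative):

      import numpy as np

      def fista(grad_f, prox_h, x0, L, n_iter=500):
          """Textbook FISTA for min_x f(x) + h(x), with grad f being L-Lipschitz."""
          x_prev, y, t = x0.copy(), x0.copy(), 1.0
          for _ in range(n_iter):
              x = prox_h(y - grad_f(y) / L, 1.0 / L)             # composite gradient step
              t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0  # momentum schedule
              y = x + ((t - 1.0) / t_next) * (x - x_prev)        # extrapolation
              x_prev, t = x, t_next
          return x_prev
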
  • Accelerated inexact first-order methods for solving nonconvex composite optimization problems
    Weiwei Kong
    Doctoral Thesis (2021)

    This thesis focuses on developing and analyzing accelerated and inexact first-order methods for solving or finding stationary points of various nonconvex composite optimization (NCO) problems. The main tools come from variational and convex analysis, and the key results are iteration complexity bounds together with comparisons of these bounds against others in the literature.

2020

  • An efficient adaptive accelerated inexact proximal point method for solving linearly constrained nonconvex composite problems
    Weiwei Kong, Jefferson G. Melo, & Renato D. C. Monteiro
    Computational Optimization and Applications 76 (2020)

    This paper proposes an efficient adaptive variant of the quadratic penalty accelerated inexact proximal point (QP-AIPP) method introduced earlier by the authors. Both the QP-AIPP method and its variant solve linearly set-constrained nonconvex composite optimization problems using a quadratic penalty approach, where the generated penalized subproblems are solved by a variant of the underlying AIPP method. The AIPP variant, in turn, solves a given penalized subproblem by generating a sequence of proximal subproblems which are then solved by an accelerated composite gradient algorithm. The main difference between AIPP and its variant is that the proximal subproblems of the former are always convex, while those of the latter are not necessarily convex because their prox parameters are chosen as aggressively as possible so as to improve efficiency. The possibly nonconvex proximal subproblems generated by the AIPP variant are tentatively solved by a novel adaptive accelerated composite gradient algorithm that accepts or rejects its computed solutions based on the validity of some key convergence inequalities. As a result, the variant generates a sequence of proximal subproblems whose stepsizes are adaptively changed according to the responses obtained from the calls to the accelerated composite gradient algorithm. Finally, numerical results are given to demonstrate the efficiency of the proposed AIPP and QP-AIPP variants.
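
    The convexity issue can be made precise in generic notation: each AIPP-style proximal subproblem has the form $\min_x \{\lambda[f(x)+h(x)] + \frac{1}{2}\|x - x_0\|^2\}$, which is guaranteed convex whenever the prox stepsize satisfies $\lambda \le 1/m$ for an $m$-weakly-convex $f$; the adaptive variant deliberately tries larger $\lambda$ and falls back only when the accelerated composite gradient method's convergence checks fail.
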
  • Rankmax: An Adaptive Projection Alternative to the Softmax Function
    Weiwei Kong, Walid Krichene, Nicolas Mayoraz, Steffen Rendle, & Li Zhang
    Advances in Neural Information Processing Systems 33 (2020)

    Several machine learning models involve mapping a score vector to a probability vector. Usually, this is done by projecting the score vector onto a probability simplex, and such projections are often characterized as Lipschitz continuous approximations of the argmax function, whose Lipschitz constant is controlled by a parameter that is similar to a softmax temperature. The aforementioned parameter has been observed to affect the quality of these models and is typically either treated as a constant or decayed over time. In this work, we propose a method that adapts this parameter to individual training examples. The resulting method exhibits desirable properties, such as sparsity of its support and numerically efficient implementation, and we find that it significantly outperforms competing non-adaptive projection methods. In our analysis, we also derive the general solution of (Bregman) projections onto the $(n, k)$-simplex, a result which may be of independent interest.
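
    As a related classical building block (the plain Euclidean case rather than the paper's general $(n,k)$-simplex Bregman result), projection onto the probability simplex admits a simple sort-based solution:

      import numpy as np

      def project_simplex(v):
          """Euclidean projection of v onto the probability simplex {p >= 0, sum(p) = 1}."""
          u = np.sort(v)[::-1]                        # scores in decreasing order
          css = np.cumsum(u)
          j = np.arange(1, v.size + 1)
          rho = np.nonzero(u * j > css - 1.0)[0][-1]  # last index kept in the support
          theta = (css[rho] - 1.0) / (rho + 1.0)      # shared shift
          return np.maximum(v - theta, 0.0)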

2019

  • Complexity of a quadratic penalty accelerated inexact proximal point method for solving linearly constrained nonconvex composite programs
    Weiwei Kong, Jefferson G. Melo, & Renato D. C. Monteiro
    SIAM Journal on Optimization 29 (2019)

    This paper analyzes the iteration complexity of a quadratic penalty accelerated inexact proximal point method for solving linearly constrained nonconvex composite programs. More specifically, the objective function is of the form $f+h$, where $f$ is a differentiable function whose gradient is Lipschitz continuous and $h$ is a closed convex function with possibly unbounded domain. The method consists of applying an accelerated inexact proximal point method to approximately solve a sequence of quadratic penalized subproblems associated with the linearly constrained problem. Each subproblem of the proximal point method is in turn approximately solved by an accelerated composite gradient (ACG) method. It is shown that the proposed scheme generates a $\rho$-approximate stationary point in at most $O(\rho^{-3})$ ACG iterations. Finally, numerical results showing the efficiency of the proposed method are also given.
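
    Schematically, in generic notation, the outer loop replaces the constrained problem $\min\{f(x)+h(x) : Ax=b\}$ by penalized subproblems $\min_x f(x) + h(x) + \frac{c}{2}\|Ax-b\|^2$ for an increasing sequence of penalty parameters $c$, each of which is handed to the AIPP method; the $O(\rho^{-3})$ bound counts all ACG iterations accumulated across these subproblems.
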
  • A new dog learns old tricks: RL finds classic optimization algorithms
    Weiwei Kong, Christopher Liaw, Aranyak Mehta, & D. Sivakumar
    International Conference on Learning Representations 7 (2019)

    This paper introduces a novel framework for learning algorithms to solve online combinatorial optimization problems. Towards this goal, we introduce a number of key ideas from traditional algorithms and complexity theory. First, we draw a new connection between primal-dual methods and reinforcement learning. Next, we introduce the concept of adversarial distributions (universal and high-entropy training sets), which are distributions that encourage the learner to find algorithms that work well in the worst case. We test our new ideas on a number of optimization problems, such as the AdWords problem, the online knapsack problem, and the secretary problem. Our results indicate that the models have learned behaviours that are consistent with the traditional optimal algorithms for these problems.