Stochastic optimization with biased oracles and hidden convexity
Hu, Yifan
Permalink
https://hdl.handle.net/2142/116224
Description
- Title
- Stochastic optimization with biased oracles and hidden convexity
- Author(s)
- Hu, Yifan
- Issue Date
- 2022-07-14
- Director of Research (if dissertation) or Advisor (if thesis)
- Chen, Xin
- He, Niao
- Doctoral Committee Chair(s)
- Chen, Xin
- He, Niao
- Committee Member(s)
- Srikant, Rayadurgam
- Sun, Ruoyu
- Department of Study
- Industrial & Enterprise Sys Eng
- Discipline
- Industrial Engineering
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Stochastic Optimization, Sample Average Approximation, Gradient Descent, Nonconvex Optimization
- Abstract
- Stochastic optimization is the engine of modern machine learning. However, many modern machine learning tasks, e.g., distributionally robust optimization, invariant learning, personalized federated learning, meta-learning, and continual learning, cannot be modeled as classical stochastic optimization. Consequently, it is hard to obtain unbiased zeroth-order or first-order information for these problems. For many such problems, it remains unclear how to handle the potential bias and how to establish the sample and oracle complexities of finding approximately optimal solutions for (strongly) convex objectives and approximate stationary points for nonconvex objectives.
In the first part of the thesis, we first introduce conditional stochastic optimization (CSO). It provides strong modeling capability for a wide spectrum of applications, including portfolio selection, reinforcement learning, robust learning, causal inference, and model-agnostic meta-learning. We study the sample complexities of a modified sample average approximation and of a biased stochastic gradient descent method for CSO. In addition, we establish lower bounds on the expected error of the CSO problem under a biased oracle model. On various applications, we demonstrate the superior performance of the proposed methods. We further consider stochastic optimization with biased oracles, a generalization of CSO that models settings in which one only has access to biased stochastic oracles of the objective and obtaining stochastic gradients with low bias is costly. We examine a family of multi-level Monte Carlo (MLMC) gradient methods that exploit a delicate trade-off among the bias, the variance, and the oracle cost (a minimal illustration of such an estimator is sketched after this record). We provide a systematic study of their convergence and total computational complexity for strongly convex, convex, and nonconvex objectives, and demonstrate their superiority over the naive biased stochastic gradient method. We show that applying the MLMC gradient methods to CSO significantly improves on the sample complexity of the biased gradient method proposed earlier. When applied to Wasserstein distributionally robust optimization and contextual stochastic optimization, the MLMC gradient methods also outperform the naive biased gradient methods.
In the second part of the thesis, we consider stochastic nonconvex optimization with hidden convexity, i.e., problems that admit a convex reformulation of the original nonconvex problem via an (implicit) change of variables. A notable difference compared with the first part is that we aim to find an approximately globally optimal solution rather than an approximate stationary point of the nonconvex problem. In particular, we study a special case in which the nonconvex objective is the expectation of the composition of a convex function with a random function, i.e., $\min_{x\in\mathcal{X}} F(x):=\mathbb{E}_\xi[f(\phi(x,\xi))]$. Leveraging an (implicit) convex reformulation via the variable transformation $u=\mathbb{E}[\phi(x,\xi)]$, we develop a regularized stochastic gradient method and a mirror stochastic gradient (MSG) method that converge to an $\epsilon$-globally optimal solution, and we establish their sample and gradient complexities. Interestingly, both methods operate only in the original $x$-space, using gradient estimators of the original nonconvex objective $F$, and the MSG method achieves $\tilde{\mathcal{O}}(\epsilon^{-2})$ sample and gradient complexities, matching the lower bounds for solving stochastic convex optimization problems. Under booking limit controls, we formulate the air-cargo network revenue management (NRM) problem with random two-dimensional capacity, random consumption, and routing flexibility as a special case of such stochastic nonconvex optimization, where the random function is $\phi(x,\xi)=x\wedge\xi$, i.e., the random demand $\xi$ truncates the booking limit decision $x$ (a small sketch of this truncation structure also follows this record). Extensive numerical experiments demonstrate the superior performance of the proposed MSG algorithm for booking limit control, with higher revenue and lower computational cost than state-of-the-art bid-price-based control policies, especially when the variance of the random capacity is large.
- Graduation Semester
- 2022-08
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Yifan Hu
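The first part of the abstract revolves around gradient oracles whose bias shrinks only as more inner samples are drawn, and around multi-level Monte Carlo (MLMC) estimators that randomize over accuracy levels to balance bias, variance, and per-query cost. The Python sketch below is a minimal illustration of that construction on a toy nested-expectation objective; the specific objective, the geometric level distribution, the fine/coarse coupling, and the step-size choices are all illustrative assumptions, not the estimators analyzed in the thesis.

```python
import numpy as np

# Toy nested-expectation problem F(x) = f(E_eta[g(x, eta)]) with f(u) = sqrt(1 + u^2)
# and g(x, eta) = x - 1 + eta, E[eta] = 0, so F(x) = sqrt(1 + (x - 1)^2) and x* = 1.
# Plugging an inner sample mean into f' gives a gradient estimate whose bias shrinks
# as the inner sample size grows -- the kind of biased oracle discussed in the abstract.

def grad_from_noise(x, noise):
    """Biased gradient oracle built from one batch of inner samples."""
    u_hat = x - 1.0 + noise.mean()
    return u_hat / np.sqrt(1.0 + u_hat ** 2)

def naive_biased_grad(x, rng, n_inner=4):
    """Fixed-inner-batch estimator: cheap, but its bias never vanishes."""
    return grad_from_noise(x, rng.normal(size=n_inner))

def mlmc_grad(x, rng, max_level=6):
    """Randomized MLMC gradient estimate: draw a level l with probability
    proportional to 2**(-l); use 2**l inner samples for the fine estimate and
    their first half for the coupled coarse estimate; reweight the difference
    by the inverse level probability."""
    probs = 2.0 ** (-np.arange(max_level + 1))
    probs /= probs.sum()
    level = rng.choice(max_level + 1, p=probs)
    noise = rng.normal(size=2 ** level)
    diff = grad_from_noise(x, noise)
    if level > 0:
        diff -= grad_from_noise(x, noise[: 2 ** (level - 1)])
    return diff / probs[level]

def averaged_sgd(grad_fn, x0=3.0, steps=20000, lr=0.02, seed=0):
    """Plain SGD driven by the supplied gradient estimator; returns the averaged iterate."""
    rng = np.random.default_rng(seed)
    x, x_avg = x0, 0.0
    for t in range(steps):
        x -= lr * grad_fn(x, rng)
        x_avg += (x - x_avg) / (t + 1)
    return x_avg

if __name__ == "__main__":
    print("SGD with MLMC gradients:", averaged_sgd(mlmc_grad))            # near x* = 1
    print("SGD with naive biased gradients:", averaged_sgd(naive_biased_grad))
```

Averaging the randomized level differences telescopes to the finest oracle, so the estimator inherits the bias of the most accurate level while paying only the expected cost of a geometric level draw.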
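The second part's formulation $\min_{x\in\mathcal{X}} \mathbb{E}_\xi[f(\phi(x,\xi))]$ with the truncation $\phi(x,\xi)=x\wedge\xi$ can likewise be illustrated with a one-dimensional, single-leg toy. The convex $f$, the exponential demand distribution, the box feasible set, and the plain projected stochastic gradient loop below are illustrative stand-ins; they are not the thesis's air-cargo NRM model or its regularized and mirror stochastic gradient (MSG) methods.

```python
import numpy as np

# One-dimensional illustration of min_x E_xi[ f(phi(x, xi)) ] with phi(x, xi) = min(x, xi):
# f(u) = (u - 2)**2 (convex), demand xi ~ Exponential(mean 3), booking limit x in [0, 10].
# F(x) = E[(min(x, xi) - 2)**2] is nonconvex in x (its curvature turns negative for large x)
# but becomes convex in the transformed variable u = E[min(x, xi)] -- the hidden-convexity
# structure described in the abstract.

def stoch_grad(x, xi, target=2.0):
    """Sample gradient of f(min(x, xi)) with respect to x (valid almost everywhere):
    the truncation depends on x only on the event that demand xi exceeds x."""
    return 2.0 * (min(x, xi) - target) * (1.0 if x < xi else 0.0)

def projected_sgd(x0=8.0, steps=20000, lo=0.0, hi=10.0, seed=0):
    """Projected stochastic gradient in the original x-space with a diminishing
    step size; returns the averaged iterate."""
    rng = np.random.default_rng(seed)
    x, x_avg = x0, 0.0
    for t in range(steps):
        xi = rng.exponential(scale=3.0)                        # random demand draw
        x = np.clip(x - stoch_grad(x, xi) / np.sqrt(t + 1), lo, hi)
        x_avg += (x - x_avg) / (t + 1)
    return x_avg

if __name__ == "__main__":
    # The population gradient is 2 * (x - 2) * P(xi > x), so the global minimizer
    # is x* = 2; the averaged booking limit should settle near that value.
    print("averaged booking limit:", projected_sgd())
```

The loop never forms the transformed variable $u$ explicitly; like the methods described in the abstract, it only queries stochastic gradients of the original nonconvex objective in the $x$-space.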
Owning Collections
Graduate Dissertations and Theses at Illinois (Primary)