Stochastic optimization with biased oracles and hidden convexity
Hu, Yifan
Permalink
https://hdl.handle.net/2142/116224
Description
- Title
- Stochastic optimization with biased oracles and hidden convexity
- Author(s)
- Hu, Yifan
- Issue Date
- 2022-07-14
- Director of Research (if dissertation) or Advisor (if thesis)
- Chen, Xin
- He, Niao
- Doctoral Committee Chair(s)
- Chen, Xin
- He, Niao
- Committee Member(s)
- Srikant, Rayadurgam
- Sun, Ruoyu
- Department of Study
- Industrial & Enterprise Sys Eng
- Discipline
- Industrial Engineering
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Stochastic Optimization, Sample Average Approximation, Gradient Descent, Nonconvex Optimization
- Abstract
- Stochastic optimization is the engine of modern machine learning. However, many modern machine learning tasks, e.g., distributionally robust optimization, invariant learning, personalized federated learning, meta-learning, and continual learning, cannot be modeled as classical stochastic optimization. Consequently, it is hard to obtain unbiased zeroth-order or first-order information for these problems. For many such problems, it remains unclear how to handle the potential bias and how to establish the sample and oracle complexities of finding approximately optimal solutions for (strongly) convex objectives and approximate stationary points for nonconvex objectives.
In the first part of the thesis, we first introduce conditional stochastic optimization (CSO). It provides strong modeling capability for a wide spectrum of applications, including portfolio selection, reinforcement learning, robust learning, causal inference, and model-agnostic meta-learning. We study the sample complexities of a modified sample average approximation and of a biased stochastic gradient descent method for CSO. In addition, we establish lower bounds on the expected error of the CSO problem under a biased oracle model. On various applications, we demonstrate the superior performance of the proposed methods. We further consider stochastic optimization with biased oracles, a generalization of CSO that models settings in which one only has access to biased stochastic oracles of the objective and obtaining stochastic gradients with low bias is costly. We examine a family of multi-level Monte Carlo (MLMC) gradient methods that exploit a delicate trade-off among the bias, the variance, and the oracle cost (a minimal illustration of such an estimator is sketched after this record). We provide a systematic study of their convergence and total computational complexity for strongly convex, convex, and nonconvex objectives, and demonstrate their superiority over the naive biased stochastic gradient method. We show that applying the MLMC gradient methods to CSO significantly improves on the sample complexity of the biased gradient method proposed earlier. When applied to Wasserstein distributionally robust optimization and contextual stochastic optimization, the MLMC gradient methods also outperform the naive biased gradient methods.
In the second part of the thesis, we consider stochastic nonconvex optimization with hidden convexity, i.e., problems that admit a convex reformulation of the original nonconvex problem via an (implicit) change of variables. A notable difference compared with the first part is that we aim to find an approximately globally optimal solution rather than an approximate stationary point of the nonconvex problem. In particular, we study a special case in which the nonconvex objective is the expectation of the composition of a convex function with a random function, i.e., $\min_{x\in\mathcal{X}} F(x):=\mathbb{E}_\xi[f(\phi(x,\xi))]$. Leveraging an (implicit) convex reformulation via the variable transformation $u=\mathbb{E}[\phi(x,\xi)]$, we develop a regularized stochastic gradient method and a mirror stochastic gradient (MSG) method that converge to an $\epsilon$-globally optimal solution, and we establish their sample and gradient complexities. Interestingly, both methods operate only in the original $x$-space, using gradient estimators of the original nonconvex objective $F$, and the MSG method achieves $\tilde{\mathcal{O}}(\epsilon^{-2})$ sample and gradient complexities, matching the lower bounds for solving stochastic convex optimization problems. Under booking limit controls, we formulate the air-cargo network revenue management (NRM) problem with random two-dimensional capacity, random consumption, and routing flexibility as a special case of such stochastic nonconvex optimization, where the random function is $\phi(x,\xi)=x\wedge\xi$, i.e., the random demand $\xi$ truncates the booking limit decision $x$ (a small sketch of this truncation structure also follows this record). Extensive numerical experiments demonstrate the superior performance of the proposed MSG algorithm for booking limit control, with higher revenue and lower computational cost than state-of-the-art bid-price-based control policies, especially when the variance of the random capacity is large.
- Graduation Semester
- 2022-08
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Yifan Hu
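The first part of the abstract revolves around gradient oracles whose bias shrinks only as more inner samples are drawn, and around multi-level Monte Carlo (MLMC) estimators that randomize over accuracy levels to balance bias, variance, and per-query cost. The Python sketch below is a minimal illustration of that construction on a toy nested-expectation objective; the specific objective, the geometric level distribution, the fine/coarse coupling, and the step-size choices are all illustrative assumptions, not the estimators analyzed in the thesis.

```python
import numpy as np

# Toy nested-expectation problem F(x) = f(E_eta[g(x, eta)]) with f(u) = sqrt(1 + u^2)
# and g(x, eta) = x - 1 + eta, E[eta] = 0, so F(x) = sqrt(1 + (x - 1)^2) and x* = 1.
# Plugging an inner sample mean into f' gives a gradient estimate whose bias shrinks
# as the inner sample size grows -- the kind of biased oracle discussed in the abstract.

def grad_from_noise(x, noise):
    """Biased gradient oracle built from one batch of inner samples."""
    u_hat = x - 1.0 + noise.mean()
    return u_hat / np.sqrt(1.0 + u_hat ** 2)

def naive_biased_grad(x, rng, n_inner=4):
    """Fixed-inner-batch estimator: cheap, but its bias never vanishes."""
    return grad_from_noise(x, rng.normal(size=n_inner))

def mlmc_grad(x, rng, max_level=6):
    """Randomized MLMC gradient estimate: draw a level l with probability
    proportional to 2**(-l); use 2**l inner samples for the fine estimate and
    their first half for the coupled coarse estimate; reweight the difference
    by the inverse level probability."""
    probs = 2.0 ** (-np.arange(max_level + 1))
    probs /= probs.sum()
    level = rng.choice(max_level + 1, p=probs)
    noise = rng.normal(size=2 ** level)
    diff = grad_from_noise(x, noise)
    if level > 0:
        diff -= grad_from_noise(x, noise[: 2 ** (level - 1)])
    return diff / probs[level]

def averaged_sgd(grad_fn, x0=3.0, steps=20000, lr=0.02, seed=0):
    """Plain SGD driven by the supplied gradient estimator; returns the averaged iterate."""
    rng = np.random.default_rng(seed)
    x, x_avg = x0, 0.0
    for t in range(steps):
        x -= lr * grad_fn(x, rng)
        x_avg += (x - x_avg) / (t + 1)
    return x_avg

if __name__ == "__main__":
    print("SGD with MLMC gradients:", averaged_sgd(mlmc_grad))            # near x* = 1
    print("SGD with naive biased gradients:", averaged_sgd(naive_biased_grad))
```

Averaging the randomized level differences telescopes to the finest oracle, so the estimator inherits the bias of the most accurate level while paying only the expected cost of a geometric level draw.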
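The second part's formulation $\min_{x\in\mathcal{X}} \mathbb{E}_\xi[f(\phi(x,\xi))]$ with the truncation $\phi(x,\xi)=x\wedge\xi$ can likewise be illustrated with a one-dimensional, single-leg toy. The convex $f$, the exponential demand distribution, the box feasible set, and the plain projected stochastic gradient loop below are illustrative stand-ins; they are not the thesis's air-cargo NRM model or its regularized and mirror stochastic gradient (MSG) methods.

```python
import numpy as np

# One-dimensional illustration of min_x E_xi[ f(phi(x, xi)) ] with phi(x, xi) = min(x, xi):
# f(u) = (u - 2)**2 (convex), demand xi ~ Exponential(mean 3), booking limit x in [0, 10].
# F(x) = E[(min(x, xi) - 2)**2] is nonconvex in x (its curvature turns negative for large x)
# but becomes convex in the transformed variable u = E[min(x, xi)] -- the hidden-convexity
# structure described in the abstract.

def stoch_grad(x, xi, target=2.0):
    """Sample gradient of f(min(x, xi)) with respect to x (valid almost everywhere):
    the truncation depends on x only on the event that demand xi exceeds x."""
    return 2.0 * (min(x, xi) - target) * (1.0 if x < xi else 0.0)

def projected_sgd(x0=8.0, steps=20000, lo=0.0, hi=10.0, seed=0):
    """Projected stochastic gradient in the original x-space with a diminishing
    step size; returns the averaged iterate."""
    rng = np.random.default_rng(seed)
    x, x_avg = x0, 0.0
    for t in range(steps):
        xi = rng.exponential(scale=3.0)                        # random demand draw
        x = np.clip(x - stoch_grad(x, xi) / np.sqrt(t + 1), lo, hi)
        x_avg += (x - x_avg) / (t + 1)
    return x_avg

if __name__ == "__main__":
    # The population gradient is 2 * (x - 2) * P(xi > x), so the global minimizer
    # is x* = 2; the averaged booking limit should settle near that value.
    print("averaged booking limit:", projected_sgd())
```

The loop never forms the transformed variable $u$ explicitly; like the methods described in the abstract, it only queries stochastic gradients of the original nonconvex objective in the $x$-space.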
Owning Collections
Graduate Dissertations and Theses at Illinois (Primary)