Efficient and robust algorithms for training machine learning models
Thekumparampil, Kiran Koshy
Permalink
https://hdl.handle.net/2142/117818
Description
- Title
- Efficient and robust algorithms for training machine learning models
- Author(s)
- Thekumparampil, Kiran Koshy
- Issue Date
- 2022-12-02
- Director of Research (if dissertation) or Advisor (if thesis)
- Oh, Sewoong
- Doctoral Committee Chair(s)
- Oh, Sewoong
- Hajek, Bruce
- Committee Member(s)
- Srikant, Rayadurgam
- Sun, Ruoyu
- Department of Study
- Electrical & Computer Engineering
- Discipline
- Electrical & Computer Engineering
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Machine Learning
- Mathematical Optimization
- Algorithms
- Deep Learning
- Abstract
- Deep Learning (DL) models have been widely successful at solving many large-scale and challenging tasks. However, to achieve state-of-the-art performance, these models need to be extremely large and must be trained on massive amounts of data. The best DL models are therefore compute- and data-hungry, which makes training them very expensive, sometimes prohibitively so, and can also lead to high energy usage. Motivated by this, we study whether we can improve the computational and sample complexities of Machine Learning (ML) training algorithms for a few specific problems. The computational complexity of an algorithm characterizes the number of computational operations required to run it, in terms of the problem size and parameters. In the first part of this dissertation, we investigate algorithms for solving minimax optimization and constrained minimization problems, which have several applications in modern ML. We propose improved algorithms for solving these problems and prove that they achieve better computational complexity than baseline algorithms. The sample complexity of a learning algorithm and model characterizes its achievable test loss for a given number of potentially noisy samples, i.e., its statistical efficiency. In the second part of the dissertation, we investigate the statistical efficiency of solving some modern DL tasks. First, we propose an architecture and loss for learning unbiased conditional generative adversarial networks from noisily labeled samples, and characterize their sample complexity in terms of the noise level in the labels. Next, we study meta-representation learning across many related tasks in the few-shot regime, where only very few samples are available per task. We prove that a recently popular DL algorithm can faithfully learn a linear meta-representation for regression tasks from very few samples each.
- Graduation Semester
- 2022-12
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Kiran Koshy Thekumparampil
Owning Collections
Graduate Dissertations and Theses at Illinois (Primary)