Communication efficient large scale distributed optimization with curvature acceleration
Li, Yichuan
Permalink
https://hdl.handle.net/2142/116050
Description
- Title
- Communication efficient large scale distributed optimization with curvature acceleration
- Author(s)
- Li, Yichuan
- Issue Date
- 2022-07-11
- Director of Research (if dissertation) or Advisor (if thesis)
- Freris, Nikolaos M.
- Voulgaris, Petros G.
- Doctoral Committee Chair(s)
- Hovakimyan, Naira
- Committee Member(s)
- Stipanovic, Dusan M.
- Salapaka, Srinivasa M.
- Department of Study
- Mechanical Science & Engineering
- Discipline
- Mechanical Engineering
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Decentralized optimization
- large scale machine learning
- Federated Learning
- Abstract
- Distributed optimization has proven to be an effective scheme for tackling the numerous problems that have emerged with the development of digital technologies and mobile smart devices. Examples include control of robot swarms, distributed and large-scale machine learning, estimation in sensor networks, and management of smart grids. In such setups, the individual interests of agents (machines) must be collectively optimized under stringent restrictions on communication bandwidth and computation resources, as well as security concerns. Balancing the effort spent on optimizing local objectives against cooperation across the network is challenging, yet necessary for reaching a global solution. Many endeavors have been made to design efficient algorithms that can harness the computational power of a network that is heterogeneous in terms of both machine hardware and data distribution. The majority of existing algorithms can be categorized as first-order methods, in which the computation is guided only by objective (sub)gradient information. Such methods have found many successes owing to their economical computation costs and simple implementation. Nevertheless, first-order algorithms suffer from a common drawback: slow convergence, especially when the objective curvature is skewed. This induces a large overall iteration count and precludes the use of first-order algorithms in applications where accurate solutions are sought within a few rounds, such as real-time computation on embedded systems and communication-expensive learning tasks.

In this dissertation, we shift our attention to second-order methods, which aim to accelerate the iterative process through the use of objective curvature. The main challenge in designing second-order methods for large-scale networks is threefold: (i) distributed implementation, (ii) balancing computation and communication, and (iii) network synchronization. These challenges are not unique to decentralized second-order methods, but they become more pronounced due to the involvement of Hessian matrices, the solution of linear systems, backtracking line search, and the approximation of curvature information, among others. We address these challenges in several steps. Chapter \ref{chapter_acc} tackles the distributed convex composite problem by presenting two algorithms that approximate the curvature information in a distributed fashion, from the perspectives of the Method of Multipliers and the Alternating Direction Method of Multipliers. By increasing the accuracy of the approximation, we show both theoretically and empirically that the proposed algorithms achieve faster convergence rates, thus providing a tunable trade-off between convergence speed and algorithmic expenditure. Chapter \ref{chapter_druid} presents a unified framework for designing algorithms that harness curvature information with communication efficiency. Several algorithms are designed and analyzed in a unified fashion, including computation schemes based on gradient descent, the Newton method, and BFGS. Their differences and advantages are discussed, analyzed, and demonstrated, offering agents significant flexibility in choosing the updating scheme best suited to their local environments. Chapter \ref{chapter_asy} further extends this framework to the asynchronous setting, where agents participate in the computation without coordination, eliminating the need for network-wide synchronization. The proposed framework imposes minimal assumptions on agents' participation frequencies and on the asynchrony among them, which further broadens the applicability of the proposed algorithms. Chapter \ref{chapter_FL} studies an emerging topic termed Federated Learning, which embodies the challenges considered in the previous chapters: stringent requirements on communication costs in terms of both frequency and message size, highly heterogeneous networks with a massive number of agents operating in drastically different local environments, non-i.i.d. data distributions with imbalanced data volumes, and strict limits on privacy. We present a novel algorithm that accommodates all of these difficulties and show that it achieves optimal algorithmic complexity. Extensive experiments demonstrate its superiority over existing methods.
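As a point of reference for the first-order/second-order distinction drawn in the abstract, the classical centralized updates for minimizing a smooth objective $f$ can be written as follows (a generic illustration only, not the specific distributed schemes developed in the dissertation):

\[ x_{k+1} = x_k - \alpha \nabla f(x_k) \quad \text{(gradient descent)}, \qquad x_{k+1} = x_k - \big[\nabla^2 f(x_k)\big]^{-1} \nabla f(x_k) \quad \text{(Newton)}. \]

The gradient step uses only first-order information and a step size $\alpha > 0$, whereas the Newton step rescales the gradient by the inverse Hessian; this is what makes it robust to skewed curvature but expensive to form, invert, and communicate across a network.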
- Graduation Semester
- 2022-08
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Yichuan Li
Owning Collections
Graduate Dissertations and Theses at Illinois (Primary)