Toward communication-efficient and secure distributed machine learning
Xie, Cong
Description
- Title
- Toward communication-efficient and secure distributed machine learning
- Author(s)
- Xie, Cong
- Issue Date
- 2021-04-14
- Director of Research (if dissertation) or Advisor (if thesis)
- Koyejo, Oluwasanmi
- Gupta, Indranil
- Doctoral Committee Chair(s)
- Koyejo, Oluwasanmi
- Gupta, Indranil
- Committee Member(s)
- Raginsky, Maxim
- McMahan, Hugh Brendan
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- distributed
- machine learning
- Abstract
- "In recent years, there is an increasing interest in distributed machine learning. On one hand, distributed machine learning is motivated by assigning the training workload to multiple devices for acceleration and better throughput. On the other hand, there are machine-learning tasks requiring distributed training locally on remote devices due to privacy concerns. Stochastic Gradient Descent (SGD) and its variants are commonly used for training large-scale deep neural networks, as well as the distributed training. Unlike machine learning on a single device, distributed machine learning requires collaboration and communication among the devices, which incur the new challenges: 1) the heavy communication overhead can be the bottleneck that slows down the training; 2) the unreliable communication and weaker control over the remote entities makes the distributed system vulnerable to systematic failures and malicious attacks. In this dissertation, we aim to find new approaches to make distributed SGD faster and more secure. We present four main parts of research. We first study approaches for reducing the communication overhead, including message compression and infrequent synchronization. Then, we investigate the possibility of combining asynchrony with infrequent synchronization. To address security in distributed SGD, we study the tolerance to Byzantine failures. Finally, we explore the possibility of combining both communication efficiency and security techniques into one distributed learning system. Specifically, we present the following techniques to improve the communication efficiency and security of distributed SGD: 1) a technique called ""error reset"" to adapt both infrequent synchronization and message compression to distributed SGD, to reduce the communication overhead; 2) federated optimization in asynchronous mode; 3) a framework of score-based approaches for Byzantine tolerance in distributed SGD; 4) a distributed learning system integrating all these three techniques. The proposed system provides communication reduction, both synchronous and asynchronous training, and Byzantine tolerance, with both theoretical guarantees and empirical evaluations."
- Graduation Semester
- 2021-05
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/110466
- Copyright and License Information
- Copyright 2021 Cong Xie
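The abstract above describes distributed SGD with message compression and infrequent synchronization. As background only, the sketch below simulates that generic setting in plain Python/NumPy: each worker runs several local SGD steps between synchronizations and sends a top-k-sparsified update to a server that averages the updates. The per-worker quadratic objectives, the constants, and the top_k helper are illustrative assumptions; this is not the dissertation's "error reset" method, its asynchronous federated optimizer, or its Byzantine-tolerant scoring framework.

# Illustrative sketch: distributed SGD with infrequent synchronization and
# message compression (top-k sparsification), simulated on one machine with
# NumPy. All objectives and constants are assumptions made for illustration.
import numpy as np

rng = np.random.default_rng(0)
DIM, WORKERS, ROUNDS, LOCAL_STEPS, LR, TOPK = 50, 8, 20, 5, 0.1, 5

# Each worker i holds a quadratic objective f_i(x) = 0.5 * ||x - t_i||^2,
# standing in for a local data shard; the global objective is their average.
targets = rng.normal(size=(WORKERS, DIM))

def local_grad(x, worker):
    return x - targets[worker]

def top_k(vec, k):
    # Message compression: keep only the k largest-magnitude coordinates.
    out = np.zeros_like(vec)
    idx = np.argpartition(np.abs(vec), -k)[-k:]
    out[idx] = vec[idx]
    return out

x_global = np.zeros(DIM)
for rnd in range(ROUNDS):
    deltas = []
    for w in range(WORKERS):
        x = x_global.copy()
        # Infrequent synchronization: LOCAL_STEPS local SGD steps per round,
        # so workers communicate once per round instead of once per step.
        for _ in range(LOCAL_STEPS):
            x -= LR * local_grad(x, w)
        # Compress the local model update before "sending" it to the server.
        deltas.append(top_k(x - x_global, TOPK))
    # Server averages the compressed updates and applies them to the model.
    x_global += np.mean(deltas, axis=0)

# The minimizer of the averaged quadratics is the mean of the targets.
print("distance to optimum:", np.linalg.norm(x_global - targets.mean(axis=0)))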
Owning Collections
Graduate Dissertations and Theses at Illinois (PRIMARY)
Dissertations and Theses - Computer Science