Accelerating distributed neural network training with a network-centric approach
Yuan, Yifan
Permalink
https://hdl.handle.net/2142/106434
Description
- Title
- Accelerating distributed neural network training with a network-centric approach
- Author(s)
- Yuan, Yifan
- Issue Date
- 2019-10-17
- Director of Research (if dissertation) or Advisor (if thesis)
- Kim, Nam Sung
- Department of Study
- Electrical & Computer Engineering
- Discipline
- Electrical & Computer Engineering
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- distributed training
- accelerator
- Abstract
- Distributed training of deep neural networks (DNNs) is an important technique for reducing the training time of large DNNs across a wide range of applications. In existing distributed training approaches, however, the communication time spent periodically exchanging parameters (i.e., weights) and gradients among compute nodes over the network constitutes a large fraction of the total training time. To reduce the communication time, we propose an algorithm/hardware co-design, INCEPTIONN. More specifically, observing that gradients are much more tolerant of precision loss than parameters, we first propose a gradient-centric distributed training algorithm. Because it is designed to exchange only gradients among nodes in a distributed manner, it transfers less information, overlaps communication with computation more effectively, and can apply a more aggressive lossy compression algorithm to all information exchanged among nodes than traditional distributed algorithms. Second, exploiting the unique characteristics of gradient values, we propose a lossy compression algorithm optimized for compressing gradients. It achieves high compression ratios without notably affecting the accuracy of the trained DNNs. Lastly, we demonstrate that compression algorithms consume a large amount of CPU time, which in turn increases the total training time despite the reduced communication time. To tackle this, we propose an in-network computing approach that delegates the lossy compression task to hardware integrated with a Network Interface Card (NIC). Our experiments show that INCEPTIONN substantially reduces the communication time, and thus the total training time, of DNNs with little degradation in the accuracy of the trained DNNs. (An illustrative sketch of the gradient-compression idea follows the record below.)
- Graduation Semester
- 2019-12
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/106434
- Copyright and License Information
- Copyright 2019 Yifan Yuan
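To make the abstract's compression idea concrete, here is a minimal software-only sketch, assuming a simple mantissa-truncation scheme. It is not the thesis's actual INCEPTIONN compressor, whose gradient-tailored encoding is offloaded to NIC hardware; the function name `truncate_gradients` and the 8-bit mantissa budget are illustrative assumptions. It demonstrates only the key observation the abstract relies on: gradient values tolerate aggressive precision loss.

```python
import numpy as np

def truncate_gradients(grads: np.ndarray, keep_mantissa_bits: int = 8) -> np.ndarray:
    """Lossily compress float32 gradients by zeroing low-order mantissa bits.

    Illustrative stand-in only (hypothetical helper, not the thesis's
    algorithm). Truncating the 23-bit float32 mantissa down to
    `keep_mantissa_bits` bounds the worst-case relative error below
    2**-keep_mantissa_bits, regardless of the gradient's magnitude.
    """
    assert 0 <= keep_mantissa_bits <= 23          # float32 has a 23-bit mantissa
    drop = 23 - keep_mantissa_bits
    mask = np.uint32((0xFFFFFFFF >> drop) << drop)
    bits = np.ascontiguousarray(grads, dtype=np.float32).view(np.uint32)
    return (bits & mask).view(np.float32)

if __name__ == "__main__":
    # Tiny demo: small, near-zero values (typical of gradients) survive
    # truncation with bounded relative error, which is why trained-model
    # accuracy is barely affected.
    rng = np.random.default_rng(0)
    g = rng.normal(0.0, 1e-3, size=1_000_000).astype(np.float32)
    g_lossy = truncate_gradients(g, keep_mantissa_bits=8)
    rel_err = np.abs(g - g_lossy) / np.maximum(np.abs(g), np.float32(1e-12))
    print(f"max relative error: {rel_err.max():.2e}")  # < 2**-8, i.e. < 0.4%
```

A design note on why truncation is a reasonable proxy: zeroing low-order bits keeps the error relative rather than absolute (important because gradient magnitudes vary widely), and the resulting runs of zero bits make the stream highly compressible by any downstream lossless coder. Per the abstract, the thesis's compressor goes further by exploiting additional characteristics of gradient values and by moving the work off the CPU into the NIC.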
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)
Dissertations and Theses - Electrical and Computer Engineering