Fully-distributed transfer learning with large DNNs on micro-controllers
Zhang, Yichi
Permalink
https://hdl.handle.net/2142/120443
Description
Title
Fully-distributed transfer learning with large DNNs on micro-controllers
Author(s)
Zhang, Yichi
Issue Date
2023-05-02
Director of Research (if dissertation) or Advisor (if thesis)
Kumar, Rakesh
Department of Study
Electrical & Computer Engineering
Discipline
Electrical & Computer Engineering
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
transfer learning
distributed computing
deep learning
neural networks
Internet of Things
edge computing
Abstract
Large convolutional neural networks have shown promising performance and reliability in recent decades. Deploying them in everyday settings through IoT networks is an attractive idea and has been practiced for many years. However, most IoT devices do not have the memory to host large models. As a solution to this problem, fully distributed edge learning has received a great deal of attention in recent years due to its scalability and privacy-preserving properties. Most recent work uses peer-to-peer (P2P) federated learning or ensemble learning techniques to train models. However, most of these methods require either an extensive exchange of gradients during training or careful preparation of input data to ensure accuracy.
We propose a design that uses model partitioning and transfer learning to enable the deployment of arbitrarily large feedforward convolutional neural networks onto IoT networks with many nodes. The deployed networks can self-adjust through on-device learning, reducing the manual effort required for model updates. We take a state-of-the-art model partitioning method as the starting point for our design and add a transfer learning classifier to each node. We use MobileNetV1 as our base neural network and the Arduino Nano 33 BLE as our hardware platform. We achieved a speedup of over 10.94x in end-to-end inference latency compared to a theoretical node with infinite memory, and a 1.3x speedup compared to the state-of-the-art design. For model performance, we achieved over 90% top-5 accuracy, exceeding the original MobileNetV1 model by around 6%. We also showed that distributing the task across multiple smaller transfer learning classifiers significantly reduces training time, and that the individual classifiers are resistant to concept drift.
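As an illustration of the design sketched in this abstract, the following is a minimal, self-contained NumPy sketch of the two main ingredients: a frozen backbone partitioned across nodes, and a small per-node softmax classifier trained locally on that node's intermediate activations. All class names, layer sizes, and data below are hypothetical stand-ins rather than the thesis implementation; the random linear stages merely stand in for slices of MobileNetV1's depthwise-separable blocks.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical backbone partition: each node holds one frozen stage of the
# feature extractor, modeled here as a fixed random linear map plus ReLU.
class NodeStage:
    def __init__(self, in_dim, out_dim):
        self.W = rng.standard_normal((in_dim, out_dim)) * 0.05  # frozen weights

    def forward(self, x):
        return np.maximum(x @ self.W, 0.0)  # activations forwarded to the next node

# Per-node transfer-learning classifier: a small softmax head trained on the
# node's own frozen intermediate features with plain gradient descent.
class NodeClassifier:
    def __init__(self, feat_dim, n_classes, lr=0.1):
        self.W = np.zeros((feat_dim, n_classes))
        self.b = np.zeros(n_classes)
        self.lr = lr

    def _softmax(self, z):
        z = z - z.max(axis=1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=1, keepdims=True)

    def train_step(self, feats, labels):
        probs = self._softmax(feats @ self.W + self.b)
        onehot = np.eye(self.W.shape[1])[labels]
        grad = probs - onehot                          # dL/dlogits for cross-entropy
        self.W -= self.lr * feats.T @ grad / len(labels)
        self.b -= self.lr * grad.mean(axis=0)

    def predict(self, feats):
        return (feats @ self.W + self.b).argmax(axis=1)

# Toy pipeline: four nodes, each with a frozen backbone slice and its own head.
dims = [64, 48, 32, 24, 16]                # activation sizes between nodes (made up)
n_classes = 5
stages = [NodeStage(dims[i], dims[i + 1]) for i in range(4)]
heads = [NodeClassifier(dims[i + 1], n_classes) for i in range(4)]

# Synthetic data; in the thesis setting each node would instead receive the
# previous node's activations over the IoT network.
X = rng.standard_normal((256, dims[0]))
y = rng.integers(0, n_classes, size=256)

for epoch in range(20):
    acts = X
    for stage, head in zip(stages, heads):
        acts = stage.forward(acts)          # frozen partition: forward pass only
        head.train_step(acts, y)            # on-device transfer-learning update

acts = X
for i, (stage, head) in enumerate(zip(stages, heads)):
    acts = stage.forward(acts)
    acc = (head.predict(acts) == y).mean()
    print(f"node {i}: local classifier accuracy = {acc:.2f}")

Consistent with the abstract, only the small per-node classifier is ever updated; each node keeps its backbone partition frozen and forwards activations onward, which is what keeps per-node memory and training cost small.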