Efficient transformer-based panoptic segmentation via knowledge distillation
Zhang, Wentao
Permalink
https://hdl.handle.net/2142/120107
Description
- Title
- Efficient transformer-based panoptic segmentation via knowledge distillation
- Author(s)
- Zhang, Wentao
- Issue Date
- 2023-04-27
- Director of Research (if dissertation) or Advisor (if thesis)
- Gui, Liangyan
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Computer Science, Machine Learning, Deep Learning, Computer Vision, Knowledge Distillation
- Abstract
- Knowledge distillation has been applied to a variety of models across domains; however, knowledge distillation for panoptic segmentation has not yet been studied. In this work, we focus on knowledge distillation for transformer-based models. More specifically, we perform a thorough analysis of the Mask2Former model, one of the state-of-the-art panoptic segmentation models, and find that both the backbone and the segmentation head are bottlenecks for model performance. To build an efficient transformer-based panoptic segmentation model, one of the best practices is to directly initialize the student model with part of the teacher's parameters. We first worked on layer parameter initialization and on parameter selection that is consistent within each parameter group. We then explored different matching schemes between the layers of the teacher and the student. Finally, we investigated different distillation losses, including an adaptive matching-based prediction loss, a masked generative distillation-based image feature loss, a standard attention distillation loss, and a deformable attention distillation loss. With all of the distillation approaches above, we trained Mask2Former-S(hallow), Mask2Former-T(hin), and Mask2Former-ST. Our ResNet-50-based models outperformed previous strong baselines, including Panoptic SegFormer, MaX-DeepLab, MaskFormer, DETR, Panoptic-DeepLab, and Panoptic-FPN, with far fewer parameters and GFLOPs on the MS COCO dataset. Additionally, our ResNet-18-based model outperformed ResNet-50-based Panoptic-DeepLab and Panoptic-FPN with only 29.3% of the parameters. (A minimal illustrative sketch of the initialization and distillation ideas appears below this record.)
- Graduation Semester
- 2023-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2023 Wentao Zhang
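
The two core mechanisms in the abstract lend themselves to a brief illustration. Below is a minimal, hypothetical PyTorch sketch, not code from the thesis: the decoder depths, the uniform layer-matching scheme, and the MSE feature loss are all assumptions made for illustration. It shows (a) directly initializing a shallow student from a subset of teacher layers and (b) a layer-matched feature-distillation loss.

```python
# Hypothetical sketch, not the thesis code: layer counts, the matching
# scheme, and the MSE feature loss are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

def make_decoder(num_layers: int, dim: int = 256) -> nn.ModuleList:
    """Stand-in for a transformer decoder stack (Mask2Former uses 9 layers)."""
    return nn.ModuleList(
        nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
        for _ in range(num_layers)
    )

teacher = make_decoder(num_layers=9)  # teacher depth, as in Mask2Former
student = make_decoder(num_layers=3)  # "shallow" student (assumed depth)
teacher.eval()                        # freeze teacher behavior (no dropout)

# (a) Direct initialization: copy uniformly spaced teacher layers into the
# student, so each copied parameter group stays internally consistent.
matched = [2, 5, 8]  # assumed teacher-layer indices matched to student layers
for s_idx, t_idx in enumerate(matched):
    student[s_idx].load_state_dict(teacher[t_idx].state_dict())

# (b) Layer-wise feature distillation on the matched layers.
def feature_distill_loss(queries: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
    t_feats, s_feats = [], []
    t = queries
    with torch.no_grad():  # the teacher only provides regression targets
        for i, layer in enumerate(teacher):
            t = layer(t, memory)
            if i in matched:
                t_feats.append(t)
    s = queries
    for layer in student:
        s = layer(s, memory)
        s_feats.append(s)
    # MSE between each student layer output and its matched teacher output
    return sum(F.mse_loss(sf, tf) for sf, tf in zip(s_feats, t_feats))

queries = torch.randn(2, 100, 256)  # 100 object queries, model dim 256
memory = torch.randn(2, 1024, 256)  # flattened image features from the backbone
print(feature_distill_loss(queries, memory))  # scalar distillation loss
```

Copying uniformly spaced teacher layers keeps each copied layer's parameters internally consistent while giving the student a warm start at the teacher's intermediate depths; the same `matched` indices then define which teacher features supervise which student layers.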
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY