Learning-based saliency-aware compression framework
Chen, Bo
Permalink
https://hdl.handle.net/2142/115495
Description
- Title
- Learning-based saliency-aware compression framework
- Author(s)
- Chen, Bo
- Issue Date
- 2022-04-19
- Director of Research (if dissertation) or Advisor (if thesis)
- Nahrstedt, Klara
- Doctoral Committee Chair(s)
- Nahrstedt, Klara
- Committee Member(s)
- Abdelzaher, Tarek
- Sundaram, Hari
- Yan, Zhisheng
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Video Compression
- Machine Learning
- Networking
- Video Saliency
- Abstract
- Networked vision systems are prevalent nowadays due to the dominance of video traffic over the Internet. Videos in a networked vision system are transmitted over the Internet to a vision application, which either processes them with deep neural networks (DNNs), i.e., computation offloading, or presents them to human viewers, i.e., video streaming. The video codec is an essential component of the networked vision system: it compresses videos to a small size with minimal impact on video quality. Traditional video codecs such as MPEG, AVC, and HEVC have been widely adopted in networked vision systems over the past few decades. Recently, learned video codecs built on DNNs have been gaining attention due to their superior coding efficiency compared to traditional codecs. Nonetheless, DNNs in networked vision systems introduce rate and quality heterogeneity, which leads to issues such as a low processing rate and poor coding efficiency. This dissertation aims to address the heterogeneity in networked vision systems. The thesis statement is that the rate and quality heterogeneity in networked vision systems should be addressed by leveraging spatiotemporal saliency with learning-based adaptation in video compression. Spatiotemporal saliency characterizes how important the pixels in a video are, which, by our observation, varies across networked vision systems; in a given system, certain regions of interest (ROI) or keyframes are more significant than others. The key intuition is to adapt how information in a video is encoded spatially and temporally based on spatiotemporal saliency, i.e., saliency-aware adaptation, to improve the processing rate or coding efficiency with minimal impact on other metrics. The challenge lies in properly designing and training the adaptation module for a given networked vision system.
To prove the thesis, this dissertation provides a suite of saliency-aware adaptation modules for the video codec, addressing the aforementioned challenge: temporal adaptation (Learned Frame Sampling), spatial adaptation (Context-aware Compression and Spatial-adaptive Filter), and spatiotemporal adaptation (Space-time-aware Entropy Module). Our evaluation covers various video categories, diverse vision applications, and state-of-the-art compression algorithms. The results demonstrate that our approaches effectively improve the processing rate or coding efficiency in networked vision systems compared to existing solutions, without harming other performance metrics.
- Graduation Semester
- 2022-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Bo Chen
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY