Viewing enhancement of 360° videos in diverse contexts
Sarkar, Ayush Guha
Permalink
https://hdl.handle.net/2142/115757
Description
- Title
- Viewing enhancement of 360° videos in diverse contexts
- Author(s)
- Sarkar, Ayush Guha
- Issue Date
- 2022-04-28
- Director of Research (if dissertation) or Advisor (if thesis)
- Nahrstedt, Klara
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- 360° video
- video streaming
- super-resolution
- edge computing
- bandwidth
- latency
- firefighting
- object detection
- Abstract
- 360° videos have revolutionized the way we perceive multimedia content. However, 2D videos remain the standard, with many file formats, algorithms, and approaches structured for and compliant with 2D formats. 360° videos, in general, face difficulties when applied to these established 2D frameworks. For example, convolutional neural networks (CNNs) are trained on images taken from cameras that rely on a perspective projection model, and thus cannot be easily applied to 360° imagery, which relies on spherical formats. In this thesis, we aim to enhance the viewing experience in two diverse contexts that rely on 360° content, easing the computational challenges of 360° formats by exploring ways to bridge the gap between CNN applications tailored to the 2D space and 360° imagery.

  In the first context, we examine 360° video streaming. 360° videos have higher resolutions than 2D videos, so streaming them consumes more bandwidth and strains the network capacity of cloud providers. To address this problem, we introduce L3BOU, a novel three-tier distributed software framework that reduces cloud-edge bandwidth in the backhaul network and lowers average end-to-end latency for 360° video streaming applications. L3BOU leverages edge-based, optimized upscaling: it streams downscaled MPEG-DASH-encoded 360° video data, known as Ultra Low Resolution (ULR) data, to which the L3BOU edge applies distributed super-resolution techniques, delivering high-quality video to the client (an upscaling step of this kind is sketched after this record).

  In the second context, we pivot to the emergency response domain. Demonstrating firefighting operations in search and rescue missions through videos is a common approach to in-classroom firefighter training. Unfortunately, traditional 2D cameras have a fundamental weakness: they capture only a narrow field of view and miss much of the information in the firefighter's surroundings. We therefore propose a system that combines the advantages of 360° video and deep learning to automatically detect important objects in the panoramic scene, assisting firefighting instructors in classroom teaching scenarios. Specifically, we summarize the salient objects and events relevant to firefighting through an interview with an experienced firefighting instructor. Leveraging this knowledge, we investigate the detection of firefighting objects in 360° videos through a transfer learning approach (a sketch of such a setup also follows this record). We report insightful results for object detectors trained on generic objects and 2D videos and discuss next steps in designing a customized object detector.
- Graduation Semester
- 2022-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Ayush Sarkar
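
The abstract's first context describes an edge node that receives downscaled (ULR) MPEG-DASH video and applies super-resolution before serving the client. The following is a minimal illustrative sketch of that upscaling step only, not the thesis's L3BOU implementation; it assumes OpenCV's contrib dnn_superres module and a pretrained FSRCNN x2 model file, both stand-ins for whatever models L3BOU actually deploys.

```python
# Minimal sketch of an edge-side super-resolution step (illustrative only,
# not the L3BOU implementation). Assumes opencv-contrib-python and a
# pretrained FSRCNN x2 model file obtained separately.
import cv2

# Build the super-resolution engine once and reuse it across frames.
sr = cv2.dnn_superres.DnnSuperResImpl_create()
sr.readModel("FSRCNN_x2.pb")   # pretrained weights; path is a placeholder
sr.setModel("fsrcnn", 2)       # model family and 2x scale are assumptions

def upscale_ulr_frame(frame):
    """Upscale one decoded Ultra Low Resolution (ULR) frame at the edge."""
    return sr.upsample(frame)  # returns the upscaled BGR image

if __name__ == "__main__":
    ulr = cv2.imread("ulr_frame.png")  # hypothetical decoded ULR frame
    cv2.imwrite("upscaled.png", upscale_ulr_frame(ulr))
```

Loading the model once and reusing it per frame keeps the per-frame cost to a single upsampling pass, which matters when the edge serves many concurrent streams.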
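For the second context, the abstract describes transfer learning to detect firefighting-relevant objects in 360° footage. Below is a minimal sketch of the standard torchvision recipe for repurposing a COCO-pretrained detector for new classes; the class list and the equirectangular frame are hypothetical placeholders, not the detector design reported in the thesis.

```python
# Sketch of a transfer-learning setup for a custom object detector
# (illustrative; the thesis's actual detector design may differ).
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

# Hypothetical firefighting-relevant classes; index 0 is background.
CLASSES = ["__background__", "firefighter", "hose", "door", "victim"]

# Start from a detector pretrained on generic 2D imagery (COCO).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
    weights=torchvision.models.detection.FasterRCNN_ResNet50_FPN_Weights.DEFAULT
)

# Replace only the classification head so it predicts the custom classes;
# reusing the pretrained backbone is the transfer-learning step.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, len(CLASSES))

# Equirectangular 360° frames can be fed like ordinary images, though their
# projection distortion is exactly why 2D-trained detectors struggle on them.
model.eval()
frame = torch.rand(3, 960, 1920)      # stand-in 2:1 equirectangular frame
with torch.no_grad():
    detections = model([frame])[0]    # dict of boxes, labels, scores
```

After swapping the head, the model would be fine-tuned on labeled 360° frames; running it zero-shot corresponds to the abstract's evaluation of detectors trained only on generic objects and 2D videos.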
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY