Exploring knowledge in generative models
Bhattad, Anand
Permalink
https://hdl.handle.net/2142/124347
Description
- Title
- Exploring knowledge in generative models
- Author(s)
- Bhattad, Anand
- Issue Date
- 2024-04-20
- Director of Research (if dissertation) or Advisor (if thesis)
- Forsyth, David A
- Doctoral Committee Chair(s)
- Forsyth, David A
- Committee Member(s)
- Efros, Alexei A
- Hoiem, Derek W
- Wang, Shenlong
- Lazebnik, Svetlana
- Freeman, William T
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- visual knowledge
- generative models
- intrinsic images
- relighting
- image decomposition
- normals
- depth
- albedo
- shading
- segmentation
- Abstract
- Generative models, such as StyleGAN, have demonstrated a remarkable ability to produce realistic and controllable images. However, the underlying representations and mechanisms employed by these models remain largely unexplored. This thesis delves into the intrinsic properties and manipulability of StyleGAN, focusing on image relighting and decomposition. We begin by exploring the impact of image decompositions on image-based relighting. By analyzing the role of intrinsic image components such as reflectance, shading, and normals, we gain insight into the fundamental properties that contribute to realistic relighting. This understanding lays the foundation for our subsequent investigations into StyleGAN.
Building upon these insights, we introduce StyLitGAN, a method that enables StyleGAN to generate scenes under novel lighting conditions. StyLitGAN produces realistic lighting effects, including cast shadows, soft shadows, inter-reflections, and glossy effects, without requiring labeled, paired, or CGI data. Moreover, it extends seamlessly to manipulating surface properties such as colors and materials.
Next, we present Make It So, a near-perfect GAN inversion technique that significantly outperforms previous state-of-the-art methods. Make It So can invert and relight real scenes, including out-of-domain images, demonstrating its generalizability and robustness.
Finally, we uncover hidden gems within StyleGAN, providing strong evidence that it encodes easily accessible and accurate internal representations of familiar scene properties, known as "intrinsic images," as defined by Barrow and Tenenbaum in their seminal 1978 work. We demonstrate that StyleGAN has encodings for intrinsic images such as reflectance, shading, and normals, which can be extracted and manipulated for various applications. Through these discoveries, we shed light on the implicit understanding of worldly knowledge present within generative models like StyleGAN.
Our findings pave the way for improved manipulability, understanding, and refinement of generative models, with potential applications in computer vision, computational photography, computer graphics, and machine learning. This thesis contributes to the broader goal of leveraging generative models for advanced image manipulation and scene understanding tasks.
- Graduation Semester
- 2024-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2024 Anand Bhattad
Owning Collections
Graduate Dissertations and Theses at Illinois