Exploring knowledge in generative models
Bhattad, Anand
Permalink
https://hdl.handle.net/2142/124347
Description
- Title
- Exploring knowledge in generative models
- Author(s)
- Bhattad, Anand
- Issue Date
- 2024-04-20
- Director of Research (if dissertation) or Advisor (if thesis)
- Forsyth, David A
- Doctoral Committee Chair(s)
- Forsyth, David A
- Committee Member(s)
- Efros, Alexei A
- Hoiem, Derek W
- Wang, Shenlong
- Lazebnik, Svetlana
- Freeman, William T
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- visual knowledge
- generative models
- intrinsic images
- relighting
- image decomposition
- normals
- depth
- albedo
- shading
- segmentation
- Abstract
- Generative models, such as StyleGAN, have demonstrated a remarkable ability to produce realistic and controllable images. However, the underlying representations and mechanisms employed by these models remain largely unexplored. This thesis delves into the intrinsic properties and manipulability of StyleGAN, focusing on image relighting and decomposition. We begin by exploring the impact of image decompositions on image-based relighting. By analyzing the role of intrinsic image components such as reflectance, shading, and normals, we gain insight into the fundamental properties that contribute to realistic relighting. This understanding lays the foundation for our subsequent investigations into StyleGAN.
Building upon these insights, we introduce StyLitGAN, a method that enables StyleGAN to generate scenes under novel lighting conditions. StyLitGAN produces realistic lighting effects, including cast shadows, soft shadows, inter-reflections, and glossy effects, without requiring labeled, paired, or CGI data. Moreover, it extends seamlessly to manipulating surface properties such as colors and materials.
Next, we present Make It So, a near-perfect GAN inversion technique that significantly outperforms previous state-of-the-art methods. Make It So can invert and relight real scenes, including out-of-domain images, demonstrating its generalizability and robustness.
Finally, we uncover hidden gems within StyleGAN, providing strong evidence that it encodes easily accessible and accurate internal representations of familiar scene properties, known as "intrinsic images," as defined by Barrow and Tenenbaum in their seminal 1978 work. We demonstrate that StyleGAN has encodings for intrinsic images such as reflectance, shading, and normals, which can be extracted and manipulated for various applications. Through these discoveries, we shed light on the implicit understanding of worldly knowledge present within generative models like StyleGAN.
Our findings pave the way for improved manipulability, understanding, and refinement of generative models, with potential applications in computer vision, computational photography, computer graphics, and machine learning. This thesis contributes to the broader goal of leveraging generative models for advanced image manipulation and scene understanding tasks.
- Graduation Semester
- 2024-05
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2024 Anand Bhattad
Owning Collections
Graduate Dissertations and Theses at Illinois