Making image generation and manipulation simple and effective
Chong, Min Jin
Permalink
https://hdl.handle.net/2142/117765
Description
- Title
- Making image generation and manipulation simple and effective
- Author(s)
- Chong, Min Jin
- Issue Date
- 2022-11-21
- Director of Research (if dissertation) or Advisor (if thesis)
- Forsyth, David
- Doctoral Committee Chair(s)
- Forsyth, David
- Committee Member(s)
- Schwing, Alexander
- Hoiem, Derek
- Wang, Yuxiong
- Fidler, Sanja
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Generative Models
- Generative Adversarial Networks
- Stylization
- Image Manipulation
- Image Generation
- Accelerating GANs
- Abstract
- In this thesis, we explore techniques that allow casual users to perform complex semantic image manipulation easily and intuitively. Enabling such interactions requires deep knowledge of images and their underlying structure. One way to model this knowledge is with Generative Adversarial Networks (GANs), neural networks trained to generate a target data distribution via an adversarial game between a generator and a discriminator. By exploiting the internal representation of a state-of-the-art StyleGAN, we expose convenient control mechanisms for complex manipulations of natural images. We specifically tackle two popular forms of image manipulation: image editing and image stylization. For image editing, our goal is to allow users to make locally and globally consistent edits intuitively and with minimal effort. These range from simple local edits, such as adjusting one's smile, to complex, highly semantic edits, such as changing pose or hairstyle with a click of a button. We also allow users to edit images through intuitive spatial operations such as compositing, copy-pasting, and resizing. For image stylization, our goal is to allow users to stylize themselves given a reference style image in a fast, simple, and controllable fashion. Our method enables high-quality, state-of-the-art face stylization using only a single reference style and under 30 seconds of training time. Having established that a well-trained GAN is very useful for image manipulation, we note that GANs are also widely used for problems such as dataset labeling, image restoration, and data augmentation. Most state-of-the-art GANs, however, are very expensive to train, making it impossible for most machine learning practitioners to train their own. To address this issue, we introduce a plug-and-play discriminator that significantly accelerates GAN training across different frameworks. Using foundation models as feature extractors and a novel regularization loss to stabilize training, we demonstrate up to a 22x increase in training speed for StyleGAN2 on FFHQ. This reduces the resources needed to train a state-of-the-art GAN to a single GPU, making such training accessible to most users. This thesis thus presents an entire pipeline for image manipulation, from training a GAN quickly to exploiting it to manipulate images.
- Graduation Semester
- 2022-12
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Min Jin Chong
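The abstract above describes accelerating GAN training with a plug-and-play discriminator built on pretrained feature extractors plus a stabilizing regularization loss. The sketch below is a minimal illustration of that general idea, not the thesis implementation: the backbone choice (torchvision's EfficientNet-B0), the head architecture, and the R1 gradient penalty used as a stand-in stabilizer are all assumptions made for this example.

```python
# Minimal sketch of the "plug-and-play discriminator" idea from the abstract.
# Assumptions (not from the thesis): EfficientNet-B0 as the frozen feature
# extractor, a small MLP head, and an R1 gradient penalty as the stabilizer.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as tvm


class PluggableDiscriminator(nn.Module):
    """Trainable head on top of a frozen, pretrained feature extractor."""

    def __init__(self):
        super().__init__()
        # Frozen pretrained backbone used purely as a feature extractor.
        backbone = tvm.efficientnet_b0(weights=tvm.EfficientNet_B0_Weights.DEFAULT)
        self.features = backbone.features
        for p in self.features.parameters():
            p.requires_grad = False  # only the head below is trained
        # Small trainable head mapping pooled features to a realness score.
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(1280, 256),  # EfficientNet-B0 features have 1280 channels
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, x):
        return self.head(self.features(x))


def discriminator_loss(disc, real, fake, r1_weight=10.0):
    """Non-saturating GAN loss for the discriminator, plus an R1 gradient
    penalty on real images as a stand-in for the thesis's own stabilizing
    regularization loss (which is not reproduced here)."""
    real = real.detach().requires_grad_(True)
    real_logits = disc(real)
    fake_logits = disc(fake.detach())
    adv = F.softplus(-real_logits).mean() + F.softplus(fake_logits).mean()
    # R1 penalty: squared gradient norm of the real logits w.r.t. real images.
    (grad_real,) = torch.autograd.grad(real_logits.sum(), real, create_graph=True)
    r1 = grad_real.pow(2).flatten(1).sum(1).mean()
    return adv + 0.5 * r1_weight * r1
```

In a training loop this discriminator would simply replace the framework's own, with the generator updated against `F.softplus(-disc(fake)).mean()` as usual; keeping the backbone frozen leaves only a small head to train, which is one plausible source of the speedups the abstract reports.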
Owning Collections
Graduate Dissertations and Theses at Illinois (Primary)