Making image generation and manipulation simple and effective
Chong, Min Jin
Permalink
https://hdl.handle.net/2142/117765
Description
- Title
- Making image generation and manipulation simple and effective
- Author(s)
- Chong, Min Jin
- Issue Date
- 2022-11-21
- Director of Research (if dissertation) or Advisor (if thesis)
- Forsyth, David
- Doctoral Committee Chair(s)
- Forsyth, David
- Committee Member(s)
- Schwing, Alexander
- Hoiem, Derek
- Wang, Yuxiong
- Fidler, Sanja
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Generative Models
- Generative Adversarial Networks
- Stylization
- Image Manipulation
- Image Generation
- Accelerating GANs
- Abstract
- In this thesis, we explore techniques that allow casual users to perform complex semantic image manipulation easily and intuitively. Enabling such interactions requires deep knowledge of images and their underlying structure. One way to model this knowledge is with Generative Adversarial Networks (GANs), neural networks trained to generate a target data distribution via an adversarial game between a generator and a discriminator. By exploiting the internal representation of a state-of-the-art StyleGAN, we expose convenient control mechanisms for complex manipulations of natural images. We specifically tackle two popular forms of image manipulation: image editing and image stylization. For image editing, our goal is to allow users to make locally and globally consistent edits intuitively and with minimal effort. These range from simple local edits, such as adjusting one's smile, to complex, highly semantic edits, such as changing pose or hairstyle with a click of a button. We also allow users to edit images through intuitive spatial operations such as compositing, copy-pasting, and resizing. For image stylization, our goal is to allow users to stylize themselves given a reference style image in a fast, simple, and controllable fashion. Our method enables high-quality, state-of-the-art face stylization using only a single reference style and under 30 seconds of training time. Having established that a well-trained GAN is very useful for image manipulation, we note that GANs are also widely used for problems such as dataset labeling, image restoration, and data augmentation. Most state-of-the-art GANs, however, are very expensive to train, making it impossible for most machine learning practitioners to train their own. To address this issue, we introduce a plug-and-play discriminator that significantly accelerates GAN training across different frameworks. Using foundation models as feature extractors and a novel regularization loss to stabilize training, we demonstrate up to a 22x increase in training speed for StyleGAN2 on FFHQ. This reduces the resources needed to train a state-of-the-art GAN to a single GPU, making such training accessible to most users. This thesis thus presents an entire pipeline for image manipulation, from training a GAN quickly to exploiting it to manipulate images.
- Graduation Semester
- 2022-12
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Min Jin Chong
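The abstract above describes accelerating GAN training with a plug-and-play discriminator built on pretrained feature extractors plus a stabilizing regularization loss. The sketch below is a minimal illustration of that general idea, not the thesis implementation: the backbone choice (torchvision's EfficientNet-B0), the head architecture, and the R1 gradient penalty used as a stand-in stabilizer are all assumptions made for this example.

```python
# Minimal sketch of the "plug-and-play discriminator" idea from the abstract.
# Assumptions (not from the thesis): EfficientNet-B0 as the frozen feature
# extractor, a small MLP head, and an R1 gradient penalty as the stabilizer.
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as tvm


class PluggableDiscriminator(nn.Module):
    """Trainable head on top of a frozen, pretrained feature extractor."""

    def __init__(self):
        super().__init__()
        # Frozen pretrained backbone used purely as a feature extractor.
        backbone = tvm.efficientnet_b0(weights=tvm.EfficientNet_B0_Weights.DEFAULT)
        self.features = backbone.features
        for p in self.features.parameters():
            p.requires_grad = False  # only the head below is trained
        # Small trainable head mapping pooled features to a realness score.
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(1280, 256),  # EfficientNet-B0 features have 1280 channels
            nn.LeakyReLU(0.2),
            nn.Linear(256, 1),
        )

    def forward(self, x):
        return self.head(self.features(x))


def discriminator_loss(disc, real, fake, r1_weight=10.0):
    """Non-saturating GAN loss for the discriminator, plus an R1 gradient
    penalty on real images as a stand-in for the thesis's own stabilizing
    regularization loss (which is not reproduced here)."""
    real = real.detach().requires_grad_(True)
    real_logits = disc(real)
    fake_logits = disc(fake.detach())
    adv = F.softplus(-real_logits).mean() + F.softplus(fake_logits).mean()
    # R1 penalty: squared gradient norm of the real logits w.r.t. real images.
    (grad_real,) = torch.autograd.grad(real_logits.sum(), real, create_graph=True)
    r1 = grad_real.pow(2).flatten(1).sum(1).mean()
    return adv + 0.5 * r1_weight * r1
```

In a training loop this discriminator would simply replace the framework's own, with the generator updated against `F.softplus(-disc(fake)).mean()` as usual; keeping the backbone frozen leaves only a small head to train, which is one plausible source of the speedups the abstract reports.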
Owning Collections
Graduate Dissertations and Theses at Illinois (Primary)