Concepts from unclear textual embeddings for text-to-image synthesis
Kumar, Maghav
Permalink
https://hdl.handle.net/2142/106392
Description
Title
Concepts from unclear textual embeddings for text-to-image synthesis
Author(s)
Kumar, Maghav
Issue Date
2019-12-10
Director of Research (if dissertation) or Advisor (if thesis)
Schwing, Alexander
Department of Study
Computer Science
Discipline
Computer Science
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
Computer Vision
deep learning
GANs
text-to-image
machine learning
Abstract
Automatically generating images from a natural language description is a challenging problem with key applications in retail, marketing, education, and entertainment. In recent years, progress has been made in this direction, particularly through Generative Adversarial Networks (GANs). Although current state-of-the-art models can generate images that roughly adhere to a textual description, producing high-quality images that capture the nuances of a sentence remains difficult. To this end, we propose CuteGAN, a simple text-to-image generation approach that encourages the model to leverage attribute information while attending to the more relevant words in a sentence during generation. We perform experiments on the competitive CUB-200 and MS-COCO datasets and achieve state-of-the-art performance on the standard metrics of inception score and R-precision, indicating that our method produces more photo-realistic images that are better correlated with the text.
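The inception score mentioned in the abstract is a standard GAN evaluation metric: IS = exp(E_x[KL(p(y|x) || p(y))]), computed from the class probabilities a pretrained Inception classifier assigns to generated images. A minimal sketch of the computation, assuming the per-image probabilities have already been obtained from the classifier (the function name and signature here are illustrative, not from the thesis):

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """Inception Score from per-image class probabilities.

    probs: array of shape [N, num_classes], each row the classifier's
    predicted distribution p(y|x) for one generated image.
    Returns exp( mean over images of KL(p(y|x) || p(y)) ), where p(y)
    is the marginal class distribution averaged over all images.
    """
    probs = np.asarray(probs, dtype=np.float64)
    # Marginal class distribution p(y): average the rows.
    marginal = probs.mean(axis=0)
    # Per-image KL divergence KL(p(y|x) || p(y)); eps avoids log(0).
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    return float(np.exp(kl.mean()))
```

Higher is better: confident, diverse predictions (each image assigned to one class, classes varied across images) push the score toward the number of classes, while identical uniform predictions give a score of 1.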