Using the visual denotations of image captions for semantic inference

Young, Peter

Content Files

Peter_Young.pdf

Permalink

https://hdl.handle.net/2142/46633

Description

Title

Using the visual denotations of image captions for semantic inference

Author(s)

Young, Peter

Issue Date

2014-01-16T17:56:53Z

Director of Research (if dissertation) or Advisor (if thesis)

Hockenmaier, Julia C.

Doctoral Committee Chair(s)

Hockenmaier, Julia C.

Committee Member(s)

DeJong, Gerald F.
Palmer, Martha
Roth, Dan

Department of Study

Computer Science

Discipline

Computer Science

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Date of Ingest

2014-01-16T17:56:53Z

Keyword(s)

visual denotation
natural language processing (NLP)
image caption corpus
denotation graph

Abstract

Semantic inference is essential to natural language understanding. There are two different traditional approaches to semantic inference. The logic-based approach translates utterances into a formal meaning representation that is amenable to logical proofs. The vector-based approach maps words to vectors that are based on the contexts in which the words appear in utterances. Real-valued similarities are used in place of logical inferences. We introduce the notion of the visual denotation of an utterance, which is the set of images that it describes. This notion borrows the abstract concept of a denotation of an utterance as the set of possible worlds in which the utterance is true from the logic-based approach, and instantiates possible worlds as images. In this dissertation, we also show how visual denotations can be created for descriptions of everyday entities and events. Additionally, we demonstrate that visual denotations can be used as a new model of semantic similarity, and that this model is better at identifying entailment relations between descriptions of images than traditional distributional similarities. In order to do this, we create an image caption corpus consisting of captions and images depicting everyday actions. This corpus has a number of useful features that would assist in investigating everyday events and the different ways in which they can be described. We use the captions in the corpus as the starting point for producing caption fragments with larger visual denotations. We accomplish that by creating a denotation graph, a subsumption hierarchy over the captions that links captions and images that depict them, that also allows for the visualization and navigation of the image caption corpus in an intuitive manner.
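The set-based idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the dissertation: the toy image ids and descriptions, the function names `build_denotations`, `subsumption_edges`, and `conditional_overlap`, the use of proper-subset tests on image sets as a stand-in for the subsumption hierarchy, and the overlap ratio as a stand-in for the similarity model.

```python
from itertools import combinations

# Hypothetical toy data: each image id maps to descriptions assumed to hold
# for it. These strings are invented for illustration only.
IMAGE_DESCRIPTIONS = {
    "img1": {"man plays guitar", "person plays instrument", "person plays"},
    "img2": {"woman plays violin", "person plays instrument", "person plays"},
    "img3": {"man plays soccer", "person plays"},
    "img4": {"dog runs"},
}


def build_denotations(image_descriptions):
    """Map each description to the set of images it describes."""
    denotations = {}
    for image_id, descriptions in image_descriptions.items():
        for description in descriptions:
            denotations.setdefault(description, set()).add(image_id)
    return denotations


def subsumption_edges(denotations):
    """Return (general, specific) pairs whose image sets are properly nested.

    Assumption: nesting of image sets is used here as a simple proxy for
    subsumption between descriptions.
    """
    edges = set()
    for a, b in combinations(denotations, 2):
        if denotations[a] < denotations[b]:
            edges.add((b, a))
        elif denotations[b] < denotations[a]:
            edges.add((a, b))
    return edges


def conditional_overlap(denotations, premise, hypothesis):
    """Fraction of the premise's images that the hypothesis also describes."""
    premise_images = denotations[premise]
    if not premise_images:
        return 0.0
    return len(premise_images & denotations[hypothesis]) / len(premise_images)


if __name__ == "__main__":
    den = build_denotations(IMAGE_DESCRIPTIONS)
    for general, specific in sorted(subsumption_edges(den)):
        print(f"{general!r} subsumes {specific!r}")
    print(conditional_overlap(den, "man plays guitar", "person plays instrument"))
```

In this toy setting, a more general description such as "person plays" ends up with a larger image set than the captions it subsumes, and the overlap ratio is asymmetric, which is the kind of signal an entailment test between image descriptions could plausibly draw on.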

Graduation Semester

2013-12

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois

Dissertations and Theses - Computer Science

Dissertations and Theses from the Siebel School of Computer Science
