Learning and adapting visual models for multiple specialized tasks
Mallya, Arun Mohanray
Permalink
https://hdl.handle.net/2142/101314
Description
- Title
- Learning and adapting visual models for multiple specialized tasks
- Author(s)
- Mallya, Arun Mohanray
- Issue Date
- 2018-04-15
- Director of Research (if dissertation) or Advisor (if thesis)
- Lazebnik, Svetlana
- Doctoral Committee Chair(s)
- Lazebnik, Svetlana
- Committee Member(s)
- Forsyth, David
- Hoiem, Derek
- Shakhnarovich, Gregory
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Action Recognition, Visual Relationship Detection, Image Situations, Multi-Task Training
- Abstract
- A key requirement for any agent that wishes to interact with the visual world is the ability to understand the behavior of objects in the scene, primarily through visual means. We humans, through our cognitive system, are able to localize other people and objects in scenes, understand their relationship to the surrounding environment, and reason not only about their actions and attributes, but also about concepts which require knowledge beyond what is afforded by the pixels in visual input, such as possible future states, motion, a person’s motivations, and so on. In this thesis, we outline work that takes small steps towards solving this daunting task of replicating the human visual cognitive system. This dissertation presents methods for predicting actions, interactions with objects, and increasingly structured scenarios from single images. We devise simple methods that make use of a variety of cues by taking into account the structure inherent in the tasks we aim to solve. We show that by solving these tasks as an intermediate step and using their outputs as features, we can develop methods that operate on visual and language inputs to improve performance on tasks that require high-level image information, such as answering questions about images and producing captions for images. One issue that accompanies the learning of multiple tasks with separate deep networks, such as the work described above, is the need to store separate models, which increases storage requirements and affects scalability. We formulate and present two novel methods, drawing inspiration from network pruning and weight quantization, that can reuse parts of an existing network for learning new tasks with minimal additional overhead, without hurting performance on tasks that were learned earlier.
- Graduation Semester
- 2018-05
- Type of Resource
- text
- Copyright and License Information
- Copyright 2018 Arun Mallya
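
The abstract's closing idea, reusing a frozen network for new tasks via pruning- and quantization-inspired masks, can be illustrated with a minimal sketch. This is a hypothetical toy example, not the dissertation's actual implementation: a shared weight matrix `W` stands in for a trained backbone, and each new task stores only a binary mask over those weights, so the per-task overhead is roughly one bit per parameter and earlier tasks are untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "backbone" weights, assumed already trained on the original task.
W = rng.standard_normal((4, 4))

# Per-task binary masks (illustrative): each new task stores only 1 bit
# per weight, selecting which backbone weights it reuses.
masks = {
    "task_a": rng.random((4, 4)) > 0.5,
    "task_b": rng.random((4, 4)) > 0.5,
}

def effective_weights(task):
    # The task-specific network is the frozen backbone gated by its mask.
    # W itself is never modified, so performance on earlier tasks is preserved.
    return W * masks[task]
```

In the dissertation's setting the masks themselves are learned per task; here they are random purely to show the storage and isolation argument: adding a task adds a mask, not a full copy of the model.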
Owning Collections
- Graduate Dissertations and Theses at Illinois (PRIMARY)
  Graduate Theses and Dissertations at Illinois
- Dissertations and Theses - Computer Science
  Dissertations and Theses from the Dept. of Computer Science