Scene understanding with complete scenes and structured representations
Guo, Ruiqi
Permalink
https://hdl.handle.net/2142/50564
Description
- Title
- Scene understanding with complete scenes and structured representations
- Author(s)
- Guo, Ruiqi
- Issue Date
- 2014-09-16
- Director of Research (if dissertation) or Advisor (if thesis)
- Hoiem, Derek W.
- Doctoral Committee Chair(s)
- Hoiem, Derek W.
- Committee Member(s)
- Forsyth, David A.
- Roth, Dan
- Urtasun, Raquel
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Scene Understanding
- Computer Vision
- Machine Learning
- Computer Graphics
- Image Parsing
- Image Segmentation
- RGB-D images
- Abstract
- Humans understand scenes in abundant detail: they perceive layouts, surfaces, and the shapes of objects, among other things. By contrast, many machine-based scene analysis algorithms parse scenes with simple representations, mainly bounding boxes and pixel labels, and apply only to visible regions. We believe scene analysis should move to a deeper level, embracing a more comprehensive, structured representation. In this dissertation, we focus on analyzing scenes to their complete extent and in structured detail. First, our work uses a structured representation that is closer to human interpretation, with a mixture of layout, functional objects, and clutter. We developed annotation tools and collected a dataset of 1449 rooms annotated with detailed 3D models. Another feature of our work is that we understand scenes to their complete extent, including parts beyond the line of sight. We present a simple framework that detects the visible portions of surfaces with appearance-based models and then infers the occluded portions with a contextual approach, integrating context from surrounding regions, a spatial prior, and the shape regularity of background surfaces. Our method is applicable to 2D images and can also be used to infer support surfaces in 3D scenes. Our complete-surface predictions quantitatively outperform relevant baselines, especially when surfaces are occluded. Finally, we present a system that interprets single-view RGB-D images of indoor scenes into our proposed representation. Such a scene interpretation is useful for robotics and visual reasoning but is difficult to produce because of the well-known challenge of segmenting objects, the high degree of occlusion, and the diversity of objects in indoor scenes. We take a data-driven approach: generating sets of potential object regions, matching them with regions in training images, and transferring and aligning the associated 3D models while encouraging them to be consistent with the observed depths.
To the best of our knowledge, this is the first automatic system capable of interpreting scenes into 3D models at this level of detail.
- Graduation Semester
- 2014-08
- Permalink
- http://hdl.handle.net/2142/50564
- Copyright and License Information
- Copyright 2014 Ruiqi Guo
Owning Collections
- Graduate Dissertations and Theses at Illinois (PRIMARY): Graduate Theses and Dissertations at Illinois
- Dissertations and Theses - Computer Science: Dissertations and Theses from the Dept. of Computer Science