Unsupervised video segmentation and its application to activity recognition
Cheng, Hsien Ting
Permalink
https://hdl.handle.net/2142/72891
Description
- Title
- Unsupervised video segmentation and its application to activity recognition
- Author(s)
- Cheng, Hsien Ting
- Issue Date
- 2015-01-21
- Director of Research (if dissertation) or Advisor (if thesis)
- Ahuja, Narendra
- Doctoral Committee Chair(s)
- Ahuja, Narendra
- Committee Member(s)
- Forsyth, David A.
- Hasegawa-Johnson, Mark A.
- Huang, Thomas
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- segmentation
- Video segmentation
- Unsupervised clustering
- Activity recognition
- Multiple instance learning
- Abstract
- We addressed the fundamental computer vision problems of segmentation and recognition in the space-time domain. Because generic image segmentation produces unstable regions due to illumination, compression, etc., we utilized temporal information to achieve consistent 3D video segmentation. Exploiting non-local structure in both space and time alleviated the instabilities of the segmented regions. A segmentation tree was built within every frame, and label consistency was enforced within each subtree (i.e., spatial clique). By roughly tracking 2D regions across frames, a temporal clique was built in which label consistency was enforced as well. The resulting high-order (more than binary) Conditional Random Field (CRF) was designed and solved efficiently. Experimental results demonstrated high-quality segmentation both quantitatively and qualitatively. Taking segmented 3D regions, called tubes, as input, we developed an activity recognition framework that not only determines which activity occurs in a video but also locates where it happens. A robust tube feature was extracted from photometric and shape dynamics information. Activity was described by a Parts Activity Model (PAM) with a root template and a four-part template under the root. Because only some parts of the video determine the activity label, we formulated the problem using Multiple Instance Learning (MIL). Latent variables included a tube index and the part locations under the root template. Experiments were conducted on three well-known datasets, and a state-of-the-art result was achieved.
- Graduation Semester
- 2014-12
- Permalink
- http://hdl.handle.net/2142/72891
- Copyright and License Information
- Copyright 2014 Hsien Ting Cheng
Owning Collections
Graduate Dissertations and Theses at Illinois (primary)
Dissertations and Theses - Electrical and Computer Engineering