Efficient mining of informative descriptors from data with scarce annotations
Zhuang, Furen
This item's files can only be accessed by the Administrator group.
Permalink
https://hdl.handle.net/2142/117596
Description
- Title
- Efficient mining of informative descriptors from data with scarce annotations
- Author(s)
- Zhuang, Furen
- Issue Date
- 2022-12-02
- Director of Research (if dissertation) or Advisor (if thesis)
- Moulin, Pierre
- Doctoral Committee Chair(s)
- Moulin, Pierre
- Committee Member(s)
- Veeravalli, Venugopal V
- Schwing, Alexander G
- Wang, Yuxiong
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Metric Learning
- Retrieval
- Hashing
- Abstract
- In this thesis, we learn informative descriptors under which similar data lie closer together than dissimilar data. Such descriptors are commonly used in content-based information retrieval applications, where the amount of data to be handled is very large. It is therefore desirable for the descriptors to be compact, for the training method to make effective use of scarce annotations, and for the representations to generalize to unseen data. The second chapter shows how very short hash codes that perform well in image retrieval can be learnt. Such hash codes should be similarity-preserving, balanced, and pairwise uncorrelated. Similarity-preserving means that the hash codes of similar images are separated by a shorter Hamming distance than those of dissimilar images. Balanced means that each bit is equally likely to be 0 or 1. Together, balanced and uncorrelated bits encourage an equal number of data items to be mapped to each code. We use a variational autoencoder (VAE) and show how all three properties can be incorporated into the VAE so that hash bits are obtained directly from an intermediate layer. We also extend the framework to improve generalizability, allowing the hash codes to perform well even on classes that were unseen during training. This is achieved by explicitly tolerating variation in the hash codes of similar data while ensuring that label consistency is maintained. The third chapter proposes a semi-supervised metric learning method that is computationally efficient because it uses proxies, and that harnesses unlabeled data effectively by identifying far-apart similar pairs and close dissimilar pairs. The fourth chapter shows how this identification can be improved using the provided labels. It also proposes a new mixed label propagation method that incorporates negative edge information into label propagation.
- Graduation Semester
- 2022-12
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Furen Zhuang
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois