Improving utilization, granularity, and interpretability in visual representation learning
Huang, Edward Z
Description
- Title: Improving utilization, granularity, and interpretability in visual representation learning
- Author(s): Huang, Edward Z
- Issue Date: 2021-04-26
- Director of Research (if dissertation) or Advisor (if thesis): Wang, Yuxiong
- Department of Study: Computer Science
- Discipline: Computer Science
- Degree Granting Institution: University of Illinois at Urbana-Champaign
- Degree Name: M.S.
- Degree Level: Thesis
- Keyword(s): computer vision, style transfer, deep learning, representation learning, PLCA
- Abstract: This thesis presents three works that revolve around improving the learning and usage of deep model features in computer vision. The first work improves style transfer, a generative artistic method that leverages pretrained deep model features. Style transfer boils down to a distribution matching problem: the generated image must match the feature distribution of the style image within the same hidden layers of the pretrained model. To that end, we propose using statistical moments as metrics for assessing distribution matching. Current style transfer methods match the feature distributions using second-order statistics, an approach with two major limitations: (1) it cannot match third- or higher-order moments, and (2) it cannot match the non-linear relationships between dimensions. We propose two new style transfer methods that address these limitations respectively and significantly improve the quality of the mid-level and high-level textures in the style transfer. The second work is a semi-supervised contrastive learning method we call hierarchical contrastive learning. The essence of contrastive learning is to differentiate between pairs of images that are deemed similar or dissimilar. A large body of literature shows that contrastive learning helps deep models learn a rich set of features that are useful for downstream tasks. Our method extends this technique to a more granular level: rather than learning a binary categorization of similar or dissimilar pairs, our method trains the model to understand a hierarchy of similarities between pairs of images. We hypothesize that such a learning scheme improves the representational quality of the features. Our analysis shows that our method outperforms current self- and semi-supervised methods on transfer learning from ImageNet to other image datasets. The third work improves the interpretability of deep model features on sparse image data. We integrate a decomposition method known as shift-invariant probabilistic latent component analysis (PLCA) into deep convolutional neural networks (CNNs); hence we call our method Deep PLCA. Intuitively, PLCA decomposes image data into local structures (kernels) and their spatial locations (latent components). Compared to PLCA, Deep PLCA achieves the same reconstruction performance and has two key advantages: (1) it generalizes to unseen data, and (2) it converges faster. All three works are open-sourced on GitHub: https://github.com/aigagror/style-transfer-quality, https://github.com/aigagror/hiercon, and https://github.com/aigagror/deep-plca. (Code sketches illustrating the baseline techniques referenced here appear after this record.)
- Graduation Semester: 2021-05
- Type of Resource: Thesis
- Permalink: http://hdl.handle.net/2142/110564
- Copyright and License Information: Copyright 2021 Edward Z. Huang
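The first work critiques second-order feature-statistics matching. For context, here is a minimal sketch of that baseline: matching the per-channel means and covariances of pretrained-model features from one hidden layer. The function name, shapes, and exact loss form are illustrative assumptions, not code from the thesis repository.

```python
# Sketch of the second-order matching baseline (assumed form, not the thesis code).
import torch

def second_order_style_loss(feat_gen: torch.Tensor, feat_style: torch.Tensor) -> torch.Tensor:
    """Match the first two moments of one hidden layer's features.

    Both tensors are assumed to have shape (channels, height, width).
    """
    def mean_and_cov(feat):
        c = feat.shape[0]
        x = feat.reshape(c, -1)           # (channels, pixels)
        mu = x.mean(dim=1, keepdim=True)  # first moment per channel
        xc = x - mu
        cov = xc @ xc.t() / x.shape[1]    # second moment (channel covariance)
        return mu, cov

    mu_g, cov_g = mean_and_cov(feat_gen)
    mu_s, cov_s = mean_and_cov(feat_style)
    # By construction this objective ignores third/higher-order moments and
    # non-linear channel relationships -- the limitations the abstract points out.
    return (mu_g - mu_s).pow(2).mean() + (cov_g - cov_s).pow(2).mean()
```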
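For the second work, the binary similar/dissimilar objective that hierarchical contrastive learning generalizes can be sketched as a standard InfoNCE-style loss over two augmented views per image. Names, shapes, and the temperature value are assumptions; this is the common baseline formulation, not the thesis's hierarchical method.

```python
# Sketch of a standard binary contrastive (InfoNCE-style) baseline.
import torch
import torch.nn.functional as F

def binary_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                            temperature: float = 0.1) -> torch.Tensor:
    """z1[i] and z2[i] embed two augmented views of image i; shape (batch, dim)."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature  # cosine similarities of all cross-view pairs
    # Each image's only positive is its other view; every other image in the
    # batch is treated as equally dissimilar -- the binary notion of similarity
    # that the hierarchical variant refines into graded levels.
    targets = torch.arange(z1.shape[0], device=z1.device)
    return F.cross_entropy(logits, targets)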
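For the third work, the abstract describes shift-invariant PLCA as decomposing an image into local kernels and their spatial locations. A minimal sketch of that reconstruction view (kernels stamped at locations weighted by per-component activation maps) follows; the probabilistic EM updates and the deep CNN parameterization of Deep PLCA are not shown, and all names and shapes are assumptions.

```python
# Sketch of the shift-invariant kernel-times-location reconstruction in PLCA.
import torch
import torch.nn.functional as F

def plca_reconstruction(kernels: torch.Tensor, activations: torch.Tensor) -> torch.Tensor:
    """Reconstruct an image as local kernels placed at spatial locations.

    kernels:     (num_components, kh, kw), non-negative local structures.
    activations: (num_components, H, W), non-negative spatial weights.
    Returns a (H + kh - 1, W + kw - 1) reconstruction.
    """
    weight = kernels.unsqueeze(1)   # (k, 1, kh, kw): k components -> 1 image
    inp = activations.unsqueeze(0)  # (1, k, H, W)
    # A transposed convolution places each kernel at every location, scaled by
    # that location's activation, and sums the contributions over components.
    recon = F.conv_transpose2d(inp, weight)
    return recon[0, 0]
```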
Owning Collections
- Graduate Dissertations and Theses at Illinois (primary)
- Dissertations and Theses - Computer Science