Improving utilization, granularity, and interpretability in visual representation learning
Huang, Edward Z
Description
- Title: Improving utilization, granularity, and interpretability in visual representation learning
- Author(s): Huang, Edward Z
- Issue Date: 2021-04-26
- Director of Research (if dissertation) or Advisor (if thesis): Wang, Yuxiong
- Department of Study: Computer Science
- Discipline: Computer Science
- Degree Granting Institution: University of Illinois at Urbana-Champaign
- Degree Name: M.S.
- Degree Level: Thesis
- Keyword(s): computer vision, style transfer, deep learning, representation learning, PLCA
- Abstract: This thesis presents three works that revolve around improving the learning and usage of deep model features in computer vision. The first work improves style transfer, a generative artistic method that leverages pretrained deep model features. Style transfer boils down to a distribution matching problem: the generated image must match the feature distribution of the style image within the same hidden layers of the pretrained model. To that end, we propose using statistical moments as metrics for assessing distribution matching. Current style transfer methods match the feature distributions using second-order statistics, an approach with two major limitations: (1) it cannot match third- or higher-order moments, and (2) it cannot match the non-linear relationships between dimensions. We propose two new style transfer methods that address these limitations respectively and significantly improve the quality of the mid-level and high-level textures in the style transfer. The second work is a semi-supervised contrastive learning method we call hierarchical contrastive learning. The essence of contrastive learning is to differentiate between pairs of images that are deemed similar or dissimilar. A large body of literature shows that contrastive learning helps deep models learn a rich set of features that are useful for downstream tasks. Our method extends this technique to a more granular level: rather than learning a binary categorization of similar or dissimilar pairs, our method trains the model to understand a hierarchy of similarities between pairs of images. We hypothesize that such a learning scheme improves the representational quality of the features. Our analysis shows that our method outperforms current self- and semi-supervised methods on transfer learning from ImageNet to other image datasets. The third work improves the interpretability of deep model features on sparse image data. We integrate a decomposition method known as shift-invariant probabilistic latent component analysis (PLCA) into deep convolutional neural networks (CNNs); hence we call our method Deep PLCA. Intuitively, PLCA decomposes image data into local structures (kernels) and their spatial locations (latent components). Compared to PLCA, Deep PLCA achieves the same reconstruction performance and has two key advantages: (1) it generalizes to unseen data, and (2) it converges faster. All three works are open-sourced on GitHub: https://github.com/aigagror/style-transfer-quality, https://github.com/aigagror/hiercon, and https://github.com/aigagror/deep-plca. (Code sketches illustrating the baseline techniques referenced here appear after this record.)
- Graduation Semester: 2021-05
- Type of Resource: Thesis
- Permalink: http://hdl.handle.net/2142/110564
- Copyright and License Information: Copyright 2021 Edward Z. Huang
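The first work critiques second-order feature-statistics matching. For context, here is a minimal sketch of that baseline: matching the per-channel means and covariances of pretrained-model features from one hidden layer. The function name, shapes, and exact loss form are illustrative assumptions, not code from the thesis repository.

```python
# Sketch of the second-order matching baseline (assumed form, not the thesis code).
import torch

def second_order_style_loss(feat_gen: torch.Tensor, feat_style: torch.Tensor) -> torch.Tensor:
    """Match the first two moments of one hidden layer's features.

    Both tensors are assumed to have shape (channels, height, width).
    """
    def mean_and_cov(feat):
        c = feat.shape[0]
        x = feat.reshape(c, -1)           # (channels, pixels)
        mu = x.mean(dim=1, keepdim=True)  # first moment per channel
        xc = x - mu
        cov = xc @ xc.t() / x.shape[1]    # second moment (channel covariance)
        return mu, cov

    mu_g, cov_g = mean_and_cov(feat_gen)
    mu_s, cov_s = mean_and_cov(feat_style)
    # By construction this objective ignores third/higher-order moments and
    # non-linear channel relationships -- the limitations the abstract points out.
    return (mu_g - mu_s).pow(2).mean() + (cov_g - cov_s).pow(2).mean()
```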
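For the second work, the binary similar/dissimilar objective that hierarchical contrastive learning generalizes can be sketched as a standard InfoNCE-style loss over two augmented views per image. Names, shapes, and the temperature value are assumptions; this is the common baseline formulation, not the thesis's hierarchical method.

```python
# Sketch of a standard binary contrastive (InfoNCE-style) baseline.
import torch
import torch.nn.functional as F

def binary_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                            temperature: float = 0.1) -> torch.Tensor:
    """z1[i] and z2[i] embed two augmented views of image i; shape (batch, dim)."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature  # cosine similarities of all cross-view pairs
    # Each image's only positive is its other view; every other image in the
    # batch is treated as equally dissimilar -- the binary notion of similarity
    # that the hierarchical variant refines into graded levels.
    targets = torch.arange(z1.shape[0], device=z1.device)
    return F.cross_entropy(logits, targets)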
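For the third work, the abstract describes shift-invariant PLCA as decomposing an image into local kernels and their spatial locations. A minimal sketch of that reconstruction view (kernels stamped at locations weighted by per-component activation maps) follows; the probabilistic EM updates and the deep CNN parameterization of Deep PLCA are not shown, and all names and shapes are assumptions.

```python
# Sketch of the shift-invariant kernel-times-location reconstruction in PLCA.
import torch
import torch.nn.functional as F

def plca_reconstruction(kernels: torch.Tensor, activations: torch.Tensor) -> torch.Tensor:
    """Reconstruct an image as local kernels placed at spatial locations.

    kernels:     (num_components, kh, kw), non-negative local structures.
    activations: (num_components, H, W), non-negative spatial weights.
    Returns a (H + kh - 1, W + kw - 1) reconstruction.
    """
    weight = kernels.unsqueeze(1)   # (k, 1, kh, kw): k components -> 1 image
    inp = activations.unsqueeze(0)  # (1, k, H, W)
    # A transposed convolution places each kernel at every location, scaled by
    # that location's activation, and sums the contributions over components.
    recon = F.conv_transpose2d(inp, weight)
    return recon[0, 0]
```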
Owning Collections
- Graduate Dissertations and Theses at Illinois (primary)
- Dissertations and Theses - Computer Science