Multimodal data analysis applied to a medical setting
Vaishnavi Subramanian
Permalink
https://hdl.handle.net/2142/100986
Description
- Title
- Multimodal data analysis applied to a medical setting
- Author(s)
- Vaishnavi Subramanian
- Issue Date
- 2018-04-16
- Director of Research (if dissertation) or Advisor (if thesis)
- Do, Minh N.
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- histopathology
- image
- multimodal
- data
- spatial
- correlation
- CCA
- medical
- genes
- TCGA
- cancer
- sparsity
- Abstract
- Complex diseases, such as cancer, have traditionally been studied using either genetic data or images alone. To understand the biology of such diseases, joint analysis of multiple data modalities could provide interesting insights. We propose the use of canonical correlation analysis (CCA) as a preliminary discovery tool for identifying connections across modalities, specifically between gene expression and features describing cell and nucleus shape, texture, and stain intensity in histopathological images. It is also important to capture the interaction between different types of cells, which is an indicator of disease status. To that end, it is crucial to quantify and utilize the spatial distribution of various cell types within the examined tissue at different scales. We employ Ripley's K-statistic, a traditional feature employed in geographical information systems, which captures spatial distribution patterns of individual point sets and interactions between multiple point sets. We propose to improve the histopathology image features by incorporating this descriptor to capture the spatial distribution of the cells and the interactions between lymphocytes and epithelial cells. Applied to 615 breast cancer samples from The Cancer Genome Atlas (TCGA), CCA and Sparse CCA revealed significant correlations of 0.736 (p ≈ 1e-14) and 0.471 (p ≈ 7e-3), respectively, between several image features and the expression of PAM50 genes, which are known to be linked to outcome. Sparse CCA, an extension of CCA based on sparsity, revealed associations with enrichment of pathways implicated in cancer without leveraging prior biological understanding. The utility of Ripley's K-statistic on histopathology images of 710 TCGA breast invasive carcinoma (BRCA) patients, in the context of imaging-genetics, is demonstrated by its superior correlations with gene expression. These findings affirm the utility of CCA for joint phenotype-genotype analysis of cancer and the importance of capturing spatial features at multiple scales.
- Graduation Semester
- 2018-05
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/100986
- Copyright and License Information
- Copyright 2018 - Vaishnavi Subramanian
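The two techniques named in the abstract can be illustrated with a minimal sketch. This is not the thesis code: the function names `canonical_correlations` and `cross_k`, the small ridge term `reg`, and the choice to omit edge correction in the K-statistic are all assumptions made for illustration, and Sparse CCA is not shown. The first function computes classical canonical correlations between two feature matrices (for example, image features and gene expression, with one row per sample). The second computes an uncorrected cross-type Ripley's K between two point sets (for example, lymphocyte and epithelial cell centroids) at several radii.

```python
import numpy as np


def canonical_correlations(X, Y, reg=1e-8):
    """Classical CCA: canonical correlations between the columns of X and Y.

    X has shape (n_samples, p), Y has shape (n_samples, q).
    `reg` is a small ridge term (an assumption of this sketch) that keeps
    the covariance matrices invertible.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Whiten each block with its Cholesky factor: M = Lx^{-1} Sxy Ly^{-T}.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    # The singular values of M are the canonical correlations, largest first.
    return np.linalg.svd(M, compute_uv=False)


def cross_k(points_a, points_b, radii, area):
    """Cross-type Ripley's K without edge correction (a simplification).

    K_ab(r) = area / (n_a * n_b) * #{(i, j) : dist(a_i, b_j) <= r}
    """
    points_a = np.asarray(points_a, dtype=float)
    points_b = np.asarray(points_b, dtype=float)
    # Pairwise Euclidean distances between every point in A and every point in B.
    diff = points_a[:, None, :] - points_b[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    counts = np.array([(dist <= r).sum() for r in radii], dtype=float)
    return area * counts / (len(points_a) * len(points_b))
```

Evaluating `cross_k` over a range of radii gives one feature per scale, which is one way to read the abstract's emphasis on spatial features "at multiple scales". For two spatially independent point sets, values are expected to be close to πr², so departures from that baseline suggest clustering or avoidance between the two cell types.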
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Electrical and Computer Engineering
Dissertations and Theses in Electrical and Computer Engineering