Single-cell multi-omic data analysis with mathematical and statistical methods
Zhang, Shuyi
Permalink
https://hdl.handle.net/2142/116075
Description
- Title
- Single-cell multi-omic data analysis with mathematical and statistical methods
- Author(s)
- Zhang, Shuyi
- Issue Date
- 2022-07-13
- Director of Research (if dissertation) or Advisor (if thesis)
- Song, Jun S
- Doctoral Committee Chair(s)
- Golding, Ido
- Committee Member(s)
- Kim, Sangjin
- Zhao, Sihai Dave
- Department of Study
- Physics
- Discipline
- Physics
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Sequencing analysis
- Stochastic processes
- Information geometry
- Spectral graph theory
- Abstract
- Recent advances in next-generation sequencing-based single-cell technologies have allowed high-throughput quantitative detection of cell-surface proteins along with the transcriptome in individual cells, extending our understanding of the heterogeneity of cell populations in diverse tissues that are in different diseased states or under different experimental conditions. From the cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) technology, in particular, count data of surface proteins allow for immunophenotyping of cells yet pose new computational challenges; there is currently a dearth of rigorous mathematical tools for analyzing the data. In this thesis, we seek to address three issues in data analysis for CITE-seq, namely, removing the systematic biases between samples, calling true signals from noise, and merging information from multiple modalities. First, we utilize concepts and ideas from Riemannian geometry to remove batch effects between samples. Subsequently, we develop a framework for distinguishing positive signals from background noise using statistical inference and multiple testing. Lastly, we use the ideas of Hamiltonian operators and density matrices from physics and introduce a unified graph-based learning scheme for effectively merging information from multiple modalities. The strengths of these approaches are demonstrated on CITE-seq data sets of mouse and human tissue samples. The geometrical methods for batch correction, the statistical methods for signal detection, and the graph-based methods for effectively merging the multiple modalities that we introduce in this thesis provide promising frameworks based on ideas from mathematics, statistics, and physics for analyzing the multi-omic data generated using the CITE-seq technology.
- Graduation Semester
- 2022-08
- Type of Resource
- Thesis
- Copyright and License Information
- Copyright 2022 Shuyi Zhang
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at Illinois