Algorithms for infection and cancer genomics
Sashittal, Palash
Permalink
https://hdl.handle.net/2142/113071
Description
- Title
- Algorithms for infection and cancer genomics
- Author(s)
- Sashittal, Palash
- Issue Date
- 2021-07-19
- Director of Research (if dissertation) or Advisor (if thesis)
- El-Kebir, Mohammed
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- Infection genomics
- Cancer genomics
- Combinatorial optimization
- Transcript assembly
- Doublet detection
- Abstract
- Continuous innovations and advances in sequencing technologies have led to the birth and development of several fields of research. In this thesis we propose four methods to address open problems in two such fields, infection genomics and cancer genomics. The first problem we address is reconstruction of the transmission history of an outbreak using genomic and epidemiological data collected from infected hosts. It is challenging to account for all the relevant biological processes that occur during evolution and transmission of the pathogens in the outbreak while also addressing the uncertainty in the most likely solution. Our method, TiTUS, overcomes these challenges by first uniformly sampling from the set of all feasible transmission histories of the outbreak under a realistic model of evolution and transmission. Then, a consensus-based solution is generated that summarizes the candidate solutions in a biologically meaningful way. We show that TiTUS efficiently samples the solution space, enabling accurate reconstruction of the transmission history of an outbreak. The second method we introduce, Jumper, reconstructs viral transcripts using RNA-sequencing data from infected cells. In this study, we focus our attention on viruses in the Coronaviridae family, such as SARS-CoV-2, that express genes by a process of discontinuous transcription mediated by the viral RNA-dependent RNA polymerase. The viral transcriptome provides valuable information with clinical implications such as differential expression of viral genes, the host cell response to viral infection and the viral life cycle. We show that Jumper accurately infers the viral transcripts, outperforming existing transcript assembly methods, and facilitates the study of coronavirus transcriptomes under varying conditions. The third problem we address is doublet detection in single-cell DNA-sequencing data. Our method, doubletD, is the first stand-alone doublet detection method for single-cell DNA-sequencing data. We use a simple probabilistic model allowing a closed-form maximum likelihood solution that efficiently and accurately detects doublets by identifying a characteristic signal in the variant allele frequency (VAF) distribution in the data. On simulations and multiple real datasets, we show that doublet identification and removal using doubletD improves downstream analysis such as genotype calling and phylogeny reconstruction. Finally, we present a new method, PACTION, which proposes a solution to the tumor phylogeny inference problem in cancer. Due to technological and methodological limitations, existing methods are restricted to identifying tumor clones and phylogenies based only on either small-scale mutations, such as single nucleotide variations (SNVs), or large-scale mutations, such as copy number aberrations (CNAs), preventing a comprehensive characterization of a tumor’s clonal composition. To overcome these challenges, we formulate the identification of clones in terms of both SNVs and CNAs as a reconciliation problem. We show that PACTION reliably identifies tumor clones and their evolutionary relationships even in the presence of noise or error in input SNVs and CNAs.
- Graduation Semester
- 2021-08
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/113071
- Copyright and License Information
- Copyright 2021 Palash Sashittal
Owning Collections
Graduate Dissertations and Theses at Illinois (PRIMARY)
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science