Withdraw
Loading…
Improving quality of high-throughput sequencing reads
Heo, Yun
Loading…
Permalink
https://hdl.handle.net/2142/88064
Description
- Title
- Improving quality of high-throughput sequencing reads
- Author(s)
- Heo, Yun
- Issue Date
- 2015-07-17
- Director of Research (if dissertation) or Advisor (if thesis)
- Chen, Deming
- Doctoral Committee Chair(s)
- Chen, Deming
- Committee Member(s)
- Hwu, Wen-Mei W.
- Ma, Jian
- Wong, Martin D.F.
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- Next-Generation Sequencing (NGS)
- Third-Generation Sequencing (TGS)
- error correction
- variant calling
- germline variant
- somatic variant
- Abstract
- Rapid advances in high-throughput sequencing (HTS) technologies have led to an exponential increase in the amount of sequencing data. HTS sequencing reads, however, contain far more errors than does data collected through traditional sequencing methods. Errors in HTS reads degrade the quality of downstream analyses. Correcting errors has been shown to improve the quality of these analyses. Correcting errors in sequencing data is a time-consuming and memory-intensive process. Even though many methods for correcting errors in HTS data have been developed, no one could correct errors with high accuracy while using a small amount of memory and in a short time. Another problem in using error correction methods is that no standard or comprehensive method is yet available to evaluate the accuracy and effectiveness of these error correction methods. To alleviate these limitations and analyze error correction outputs, this dissertation presents three novel methods. The first one, known as BLESS (Bloom-filter-based error correction solution for high-throughput sequencing reads), is a new error correction method that uses a Bloom filter as the main data structure. Compared to previous methods, it allows for the correction of errors with the highest accuracy at an average of 40 X memory usage reduction. BLESS is parallelized using hybrid OpenMP and MPI programming, which makes BLESS one of the fastest error correction tools. The second method, known as SPECTACLE (Software Package for Error Correction Tool Assessment on Nucleic Acid Sequences), supplies a standard way to evaluate error correction methods. SPECTACLE is the comprehensive method that can (1) do a quantitative analysis on both DNA and RNA corrected reads from any sequencing platforms and (2) handle diploid genomes and differentiate heterozygous alleles from sequencing errors. Lastly, this research analyzes the effect of sequencing errors on variant calling, which is one of the most important clinical applications for HTS data. For this, the environments for tracing the effect of sequencing errors on germline and somatic variant calling was developed. Using the environment, this research studies how sequencing errors degrade the results of variant calling and how the results can be improved. Based on the new findings, ROOFTOP (RemOve nOrmal reads From TumOr samPles) that can improve the accuracy of somatic variant calling by removing normal cells in tumor samples. A series of studies on sequencing errors in this dissertation would be helpful to understand how sequencing errors degrade downstream analysis outputs and how the quality of sequencing data could be improved by removing errors in the data.
- Graduation Semester
- 2015-8
- Type of Resource
- text
- Permalink
- http://hdl.handle.net/2142/88064
- Copyright and License Information
- Copyright 2015 Yun Heo
Owning Collections
Graduate Dissertations and Theses at Illinois PRIMARY
Graduate Theses and Dissertations at IllinoisDissertations and Theses - Electrical and Computer Engineering
Dissertations and Theses in Electrical and Computer EngineeringManage Files
Loading…
Edit Collection Membership
Loading…
Edit Metadata
Loading…
Edit Properties
Loading…
Embargoes
Loading…