Improved GPU implementations of the Pair-HMM forward algorithm for DNA sequence alignment
Li, Enliang
Permalink
https://hdl.handle.net/2142/110760
Description
- Title
- Improved GPU implementations of the Pair-HMM forward algorithm for DNA sequence alignment
- Author(s)
- Li, Enliang
- Issue Date
- 2021-04-30
- Director of Research (if dissertation) or Advisor (if thesis)
- Chen, Deming
- Department of Study
- Electrical & Computer Eng
- Discipline
- Electrical & Computer Engr
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- M.S.
- Degree Level
- Thesis
- Keyword(s)
- GPU
- Hardware Acceleration
- Pair-HMM
- CUDA implementation
- Computational Genomics
- Abstract
- With the rise of Next-Generation Sequencing (NGS), clinical sequencing services have become more accessible but also face new challenges. As the close connection between key deoxyribonucleic acid (DNA) mutation sites and major diseases or conditions has been discovered, the need for computational genomics has increased significantly. The surging demand motivates the development of more efficient algorithms for genome assembly, error correction, k-mer counting, etc. In this thesis, we focus on DNA sequencing analysis, one of the fastest-growing markets in NGS, and its related alignment problems. In recent years, many new hardware technologies and algorithms have been researched for their potential applications in massively parallel sequencing. The emerging hardware includes GPUs, FPGAs, and ASICs that provide parallel processing resources. In this thesis, we choose the GPU as our computation platform for its massively parallel processing capabilities. The Forward Algorithm (FA) remains one of the most commonly used methods for solving sequence alignment problems modeled as a Pair-Hidden Markov Model (Pair-HMM). The Pair-HMM FA is both computation-intensive and data-intensive. Several previous works have accelerated the FA by massively parallelizing the workload. In this thesis, we introduce further optimizations, not only improving the computational concurrency of both the initialization process and the Pair-HMM FA but also reducing the communication overhead between the host and the devices. We discuss the general principles of optimizing the Forward Algorithm on GPUs and present an improved implementation of the Pair-HMM FA in native CUDA C++. Our design achieves a speedup of 25.10x over the C++ baseline on the GATK HaplotypeCaller Pair-HMM workload, using a portion of the real human genome dataset NA12878. This is a major improvement that outperforms the state-of-the-art implementation by a margin of 60%.
- Graduation Semester
- 2021-05
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/110760
- Copyright and License Information
- Copyright 2021 Enliang Li
Owning Collections
Graduate Dissertations and Theses at Illinois (PRIMARY)
Graduate Theses and Dissertations at Illinois
Dissertations and Theses - Electrical and Computer Engineering
Dissertations and Theses in Electrical and Computer Engineering