Withdraw
Loading…
Exploration of GPU acceleration for pair-HMM algorithm and its application in the DNA alignment problem
Li, Enliang
Loading…
Permalink
https://hdl.handle.net/2142/104023
Description
- Title
- Exploration of GPU acceleration for pair-HMM algorithm and its application in the DNA alignment problem
- Author(s)
- Li, Enliang
- Contributor(s)
- Chen, Deming
- Issue Date
- 2019-05
- Keyword(s)
- Hardware Acceleration
- DNA Alignment
- Pair-HMM
- Forward Algorithm
- CUDA Implementation
- Abstract
- The hidden Markov model, known as HMM, is an important type of statistical model with extensive application in estimating hidden parameters and decoding observed Markov chains. On top of the HMM, the Pair-HMM Algorithm with Halotype-Caller is developed as a popular solution for the DNA alignment problem. For two aligned sequences of DNA observations, one named as reference, and the other one named as read, there are only three possible hidden states, i.e. match (A , A), insertion (- , A), and deletion (A , -). However, what we could observe by DNA sequencing in real-life is the summation of the possibilities for match, insertion, and deletion as macrostates. In order to determine the alignment with maximum probability, we need to score each possible pairwise alignment and which leads to a computationally intensive problem that usually contributes to the most latency in a variant calling with the GATK HaplotypeCaller. In the CPU implementation of a proper Pair-HMM forward algorithm, there are 7 multiply-accumulate operations for each ( i , j ) location on the read-reference matrix. Moreover, since transitions and emission matrices are fixed throughout a single alignment process, a CUDA implementation with single-precision floating-point is proposed to accelerate the Pair-HMM forward algorithm. CUDA implementation with minibatch and states-parallelization, along with the use of float32, gives us an around 22.6x speedup compared to the CPU implementation. While it comes with a price, using single-precision instead of double-precision floating-point introduces a more serious under flow problem at the beginning of the alignment scoring process. A normalization technique is used to help fix this problem.
- Type of Resource
- text
- Language
- en
- Permalink
- http://hdl.handle.net/2142/104023
Owning Collections
Senior Theses - Electrical and Computer Engineering PRIMARY
The best of ECE undergraduate researchManage Files
Loading…
Edit Collection Membership
Loading…
Edit Metadata
Loading…
Edit Properties
Loading…
Embargoes
Loading…