OpenMP-CUDA implementation of the moment method and multilevel fast multipole algorithm on multi-GPU computing systems
Guan, Jian
Permalink
https://hdl.handle.net/2142/44803
Description
Title
OpenMP-CUDA implementation of the moment method and multilevel fast multipole algorithm on multi-GPU computing systems
Author(s)
Guan, Jian
Issue Date
2013-05-28T19:20:32Z
Director of Research (if dissertation) or Advisor (if thesis)
Jin, Jianming
Department of Study
Electrical & Computer Eng
Discipline
Electrical & Computer Engr
Degree Granting Institution
University of Illinois at Urbana-Champaign
Degree Name
M.S.
Degree Level
Thesis
Keyword(s)
CUDA
electromagnetic scattering
hybrid parallel programming model
moment method
multilevel fast multipole algorithm
multi-GPU
OpenMP
radar cross section
Abstract
In this thesis, the method of moments (MoM) and the multilevel fast multipole algorithm (MLFMA) are implemented for GPU computation based on the hybrid OpenMP-CUDA parallel programming model. The resultant algorithms are called the OpenMP-CUDA-MoM and the OpenMP-CUDA-MLFMA, respectively. Both of the proposed methods are applied to compute electromagnetic scattering by a three-dimensional conducting object.
For the OpenMP-CUDA-MoM, the multi-GPU parallelization of system matrix assembly, iterative solution, and fast evaluation of the radar cross section (RCS) is discussed in detail. The parallel efficiency versus the number of devices is investigated by computing scattering from a conducting sphere on different numbers of GPUs. The parallel efficiency of the total computation is over 87%. The total speedup for the monostatic RCS calculation of a NASA almond on 4 GPUs is between 80 and 260 times.
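For reference, parallel efficiency on N devices is commonly defined as the speedup over a single device divided by N. The abstract does not state the exact definition used in the thesis, so the formula below is a sketch under that assumption, and the symbols T_1, T_N, S_N, and E_N are introduced here for illustration only.

```latex
% Assumed (standard) definitions; not taken from the thesis itself.
% T_1 : wall-clock time on one GPU
% T_N : wall-clock time on N GPUs
\[
  S_N = \frac{T_1}{T_N}, \qquad
  E_N = \frac{S_N}{N} = \frac{T_1}{N\,T_N}.
\]
% If the reported efficiency E_N > 0.87 applies at N = 4, then
\[
  S_4 = 4\,E_4 > 4 \times 0.87 = 3.48 .
\]
```

Under this reading, the 87% figure describes scaling across GPUs, whereas the 80 to 260 times total speedup is presumably measured against a CPU baseline, so the two numbers are not directly comparable.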
For the GPU-accelerated MLFMA, a hierarchical parallelization strategy is employed, which ensures a high computational throughput for the GPU calculation. The resulting OpenMP-based multi-GPU implementation is capable of solving real-life problems with over 1 million unknowns with a remarkable speedup. The RCSs of a few benchmark objects are calculated to demonstrate the accuracy of the solution. The results are compared with those from the CPU-based MLFMA and with measurements. The capability of the proposed method is analyzed through the examples of a sphere, an aircraft, and a missile-like object. The total speedup achieved by 4 GPUs is between 20 and 80 times.