Accelerating graph pattern mining algorithms on modern graphics processing units

AlMasri, Mohammad Abed Alnaser

Accelerating graph pattern mining algorithms on modern graphics processing units

AlMasri, Mohammad Abed Alnaser

Permalink

https://hdl.handle.net/2142/115439

Description

Title

Accelerating graph pattern mining algorithms on modern graphics processing units

Author(s)

AlMasri, Mohammad Abed Alnaser

Issue Date

2022-04-22

Director of Research (if dissertation) or Advisor (if thesis)

Hwu, Wen-mei

Doctoral Committee Chair(s)

Hwu, Wen-mei

Committee Member(s)

Chen, Deming
Xiong, Jinjun
Nagu, Rakesh
Patel, Sanjay

Department of Study

Electrical & Computer Eng

Discipline

Electrical & Computer Engr

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

GPU
CUDA
Graph
Graph Pattern Mining
Graph Pattern Matching
Clique
Maximal Clique
Triangle Counting
Subgraph Matching
Graph Analysis

Abstract

Graphical processing units (GPUs) have been widely used as a general-purpose processing unit. More recently, there has been fast growing interest in accelerating graph pattern mining tasks on GPUs. At the core of graph pattern mining, graph pattern matching (GPM) is a fundamental operation that aims to find patterns of interest in large data graphs. GPM is typically performed by traversing search trees starting at each vertex in the graph and looking for neighborhood subgraph patterns of interest. A search tree in GPM has a depth equal to the number of vertices in the pattern graph. Each path from the root of a search tree to one of the leaves defines a potential mapping from the pattern graph vertices to a group of neighboring vertices in the data graph. Patterns can be of different sizes and shapes, such as triangles, cliques, wheels, cycles, etc. The larger the pattern size is, the deeper the search tree is. Searching for patterns in graphs is known to be hard due to the exponential growth of intermediate results and the number of instances of patterns. Therefore, execution time and memory requirements grow exponentially with the pattern size. State-of-the-art GPU solutions for GPM traverse the search trees using the Breath-First Search (BFS) approach that exhibits high degrees of parallelism at the cost of high memory requirement, and presents a great challenge for processing large graphs and patterns since the memory capacity of GPUs is significantly smaller than that of CPUs. This work presents novel techniques to implement GPM tree traversal efficiently on GPUs. The tree traversal presented herein uses a hybrid breadth-first and depth-first approach where the top level(s) of the search trees are traversed in a fully parallel, breadth-first manner while each subtree is traversed in a more space-efficient, depth-first manner. The depth-first traversal of subtrees requires less memory but presents more challenges for parallel execution. The techniques presented in this work overcome the less parallel nature of depth-first traversal by exploring vertex-centric and edge-centric parallelization schemes, using multi-level hierarchical organization of threads, improving to the set intersection operation that is fundamental to graph pattern mining algorithms, improving fine-grain parallelism using sub-warp partitioning, and replacing the depth-first recursive traversal with an iterative traversal based on an explicitly-managed shared stack. Moreover, various optimizations are applied to reduce memory consumption and improve data structure reuse. The effectiveness of the proposed techniques is demonstrated using four fundamental graph pattern mining algorithms: (a) subgraph matching for exact GPM for arbitrary patterns, (b) k-clique counting, (c) maximal clique enumeration, and (d) triangle counting. The detailed evaluation results show that the proposed GPU solutions substantially improve performance compared to existing state-of-the-art CPU and GPU solutions. The implementations presented in this thesis can efficiently process the graphs that previously cannot be processed by the state-of-the-art GPU solutions due to their excessive memory requirement. Detailed evaluations of the trade-offs involved in the choice of parallelization scheme, and the incremental speedup of each optimization to provide an in-depth understanding of the optimization space, are provided. The insights from the proposed optimization flow in this work can be useful for optimizing graph mining solutions on GPUs.

Graduation Semester

2022-05

Type of Resource

Thesis

Copyright and License Information

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois

Accelerating graph pattern mining algorithms on modern graphics processing units

AlMasri, Mohammad Abed Alnaser

Permalink

Description

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Log In