High-level automation of custom hardware design for high-performance computing

Papakonstantinou, Alexandros

High-level automation of custom hardware design for high-performance computing

Papakonstantinou, Alexandros

Permalink

https://hdl.handle.net/2142/42137

Description

Title

High-level automation of custom hardware design for high-performance computing

Author(s)

Papakonstantinou, Alexandros

Issue Date

2013-02-03T19:16:53Z

Director of Research (if dissertation) or Advisor (if thesis)

Chen, Deming

Doctoral Committee Chair(s)

Chen, Deming

Committee Member(s)

Cong, Jason
Hwu, Wen-Mei W.
Wong, Martin D.F.

Department of Study

Electrical & Computer Eng

Discipline

Electrical & Computer Engr

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

High-level synthesis
Field-Programmable Gate Array (FPGA)
CUDA
parallel programming
High Performance Computing (HPC)

Abstract

This dissertation focuses on efficient generation of custom processors from high-level language descriptions. Our work exploits compiler-based optimizations and transformations in tandem with high-level synthesis (HLS) to build high-performance custom processors. The goal is to offer a common multiplatform high-abstraction programming interface for heterogeneous compute systems where the benefits of custom reconfigurable (or fixed) processors can be exploited by the application developers. The research presented in this dissertation supports the following thesis: In an increasingly heterogeneous compute environment it is important to leverage the compute capabilities of each heterogeneous processor efficiently. In the case of FPGA and ASIC accelerators this can be achieved through HLS-based flows that (i) extract parallelism at coarser than basic block granularities, (ii) leverage common high-level parallel programming languages, and (iii) employ high-level source-to-source transformations to generate high-throughput custom processors. First, we propose a novel HLS flow that extracts instruction level parallelism beyond the boundary of basic blocks from C code. Subsequently, we describe FCUDA, an HLS-based framework for mapping fine-grained and coarse-grained parallelism from parallel CUDA kernels onto spatial parallelism. FCUDA provides a common programming model for acceleration on heterogeneous devices (i.e. GPUs and FPGAs). Moreover, the FCUDA framework balances multilevel granularity parallelism synthesis using efficient techniques that leverage fast and accurate estimation models (i.e. do not rely on lengthy physical implementation tools). Finally, we describe an advanced source-to-source transformation framework for throughput-driven parallelism synthesis (TDPS), which appropriately restructures CUDA kernel code to maximize throughput on FPGA devices. We have integrated the TDPS framework into the FCUDA flow to enable automatic performance porting of CUDA kernels designed for the GPU architecture onto the FPGA architecture.

Graduation Semester

2012-12

Permalink

http://hdl.handle.net/2142/42137

Copyright and License Information

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois

Dissertations and Theses - Electrical and Computer Engineering

Dissertations and Theses in Electrical and Computer Engineering

High-level automation of custom hardware design for high-performance computing

Papakonstantinou, Alexandros

Permalink

Description

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Dissertations and Theses - Electrical and Computer Engineering

Log In