Automatic test generation for the detection of performance bugs in code optimization
Phaosawasdi, Amarin
Description
- Title
- Automatic test generation for the detection of performance bugs in code optimization
- Author(s)
- Phaosawasdi, Amarin
- Issue Date
- 2021-02-01
- Director of Research (if dissertation) or Advisor (if thesis)
- Padua, David
- Doctoral Committee Chair(s)
- Padua, David
- Committee Member(s)
- Marinov, Darko
- Gropp, William D
- Kuck, David J
- Sadayappan, Ponnuswamy
- Maleki, Saeed
- Department of Study
- Computer Science
- Discipline
- Computer Science
- Degree Granting Institution
- University of Illinois at Urbana-Champaign
- Degree Name
- Ph.D.
- Degree Level
- Dissertation
- Keyword(s)
- compiler evaluation
- performance testing
- random testing
- metamorphic testing
- differential testing
- automated testing
- test generation
- compiler optimization
- Abstract
- Software is everywhere in our daily lives, and it is important that it behaves as expected. Testing is a widely accepted method for improving software quality: it detects the presence of bugs by comparing the actual outcome of a computation against its expected outcome. Testing for correctness is a well-studied problem. The expected output of a computation is typically unambiguous, since computations in computer software have clear semantics defined by the programming language. Testing for performance, however, is less studied. The expected outcome of a performance test may require contextual knowledge not apparent in the test program itself; for example, by simply inspecting the code of a web server, one cannot determine its expected throughput. This makes testing for performance a challenging task. Testing compilers adds another layer of complexity. For compilers, a correctness bug during compiler optimization may introduce a bug in the resulting binary even though the bug was not present in the source code. Similarly, a performance bug during optimization may cause inconsistencies in the runtimes of equivalent programs, where equivalent programs are defined as programs with identical outcomes but whose sources may differ through semantics-preserving transformations. Performance bugs prevent compilers from producing efficient code when they have the ability to do so. Many testing techniques have been proposed. Random testing is a powerful technique often associated with test generation. It allows a large testing space to be explored efficiently through sampling and is suitable for large and complex software with a large testing space, such as compilers. Random test generation for compilers has been shown to be effective in detecting correctness bugs. However, to the best of our knowledge, there is no previous study on random test generation for performance bugs in compilers. We believe one of the main reasons is the context-dependent nature of quantifying performance headroom. We propose a random test generation infrastructure for evaluating the performance of compilers. We quantify the performance headroom of tests by borrowing ideas from previous studies: when a set of equivalent programs is compiled by one compiler, every program should aim to perform as well as the best-performing program in the set; and when one program is compiled by a set of compilers, every compiler should aim to generate code that performs as well as the code generated by the best-performing compiler. We define metrics to evaluate compilers based on these ideas (a minimal illustrative sketch of such an equivalent-program comparison appears after this record). We used our system to evaluate four modern compilers -- Intel's ICC, GNU's GCC, the Portland Group Inc.'s PGI compiler, and Clang -- on how well they handle loop unrolling, loop interchange, and loop unroll-and-jam. Results suggest that ICC typically performs better than the other three compilers. On the other hand, our system also identified extreme outliers for ICC where, for example, one program becomes 180,000x slower after unrolling a loop. Because of the nature of random testing, we also study the methodologies required to achieve reproducible results using statistical methods. We apply these methodologies to our compiler evaluation and provide evidence that our experiments are reproducible across different randomly generated collections of code segments.
- Graduation Semester
- 2021-05
- Type of Resource
- Thesis
- Permalink
- http://hdl.handle.net/2142/110412
- Copyright and License Information
- Copyright 2021 Amarin Phaosawasdi
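
The evaluation idea sketched in the abstract can be illustrated with a small, self-contained example. The C program below is a minimal sketch, not the dissertation's actual test generator; the array size N, the repetition count REPS, and the function names are illustrative assumptions. It pairs an original reduction loop with a hand-unrolled but semantically equivalent version, times both under whatever compiler builds the file, and reports each variant's slowdown relative to the best-performing one. A ratio far above 1.0 for either variant is the kind of performance headroom the dissertation's metrics are designed to expose.

    /* Minimal illustrative sketch: two equivalent programs related by a
     * semantics-preserving transformation (manual loop unrolling). */
    #include <stdio.h>
    #include <time.h>

    #define N (1 << 20)
    #define REPS 200

    static double a[N];

    /* Original loop: a straightforward reduction over the array. */
    static double sum_plain(void) {
        double s = 0.0;
        for (int i = 0; i < N; i++)
            s += a[i];
        return s;
    }

    /* Equivalent program: the same reduction unrolled by hand by a factor
     * of 4.  N is a multiple of 4, so no cleanup loop is needed, and the
     * array holds small integers, so the result is identical. */
    static double sum_unrolled(void) {
        double s = 0.0;
        for (int i = 0; i < N; i += 4)
            s += a[i] + a[i + 1] + a[i + 2] + a[i + 3];
        return s;
    }

    /* Time REPS calls of a reduction and return elapsed seconds. */
    static double time_reduction(double (*f)(void), double *checksum) {
        clock_t start = clock();
        double s = 0.0;
        for (int r = 0; r < REPS; r++)
            s += f();
        *checksum = s;
        return (double)(clock() - start) / CLOCKS_PER_SEC;
    }

    int main(void) {
        for (int i = 0; i < N; i++)
            a[i] = (double)(i % 7);

        double c1, c2;
        double t_plain    = time_reduction(sum_plain, &c1);
        double t_unrolled = time_reduction(sum_unrolled, &c2);

        /* Correctness check: both variants must produce the same result. */
        printf("checksums: %.1f vs %.1f\n", c1, c2);

        /* Performance headroom: each variant's slowdown relative to the
         * best-performing equivalent program.  Ratios far above 1.0 are
         * the outliers a performance-bug detector would report. */
        double best = t_plain < t_unrolled ? t_plain : t_unrolled;
        printf("plain:    %6.3f s  (%.2fx of best)\n", t_plain, t_plain / best);
        printf("unrolled: %6.3f s  (%.2fx of best)\n", t_unrolled, t_unrolled / best);
        return 0;
    }

Building the same source with several compilers (for example, gcc -O3 and clang -O3) and comparing the resulting runtimes illustrates the second metric from the abstract: every compiler should generate code that performs as well as the code from the best-performing compiler on that program.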
Owning Collections
Graduate Dissertations and Theses at Illinois (PRIMARY)
Dissertations and Theses - Computer Science
Dissertations and Theses from the Dept. of Computer Science