Regression test selection: theory and practice

Gligoric, Milos Zivko

Permalink

https://hdl.handle.net/2142/88038

Description

Title

Regression test selection: theory and practice

Author(s)

Gligoric, Milos Zivko

Issue Date

2015-07-15

Director of Research (if dissertation) or Advisor (if thesis)

Marinov, Darko

Doctoral Committee Chair(s)

Marinov, Darko

Committee Member(s)

Roşu, Grigore
Torrellas, Josep
Khurshid, Sarfraz
Majumdar, Rupak

Department of Study

Computer Science

Discipline

Computer Science

Degree Granting Institution

University of Illinois at Urbana-Champaign

Degree Name

Ph.D.

Degree Level

Dissertation

Keyword(s)

Regression test selection
Regression testing
Ekstazi
Distributed software histories

Abstract

Software affects every aspect of our lives, and software developers write tests to check software correctness. Software also evolves rapidly due to never-ending requirement changes, and software developers practice regression testing – running tests against the latest project revision to check that project changes did not break any functionality. While regression testing is important, it is also time-consuming due to the number of both tests and revisions. Regression test selection (RTS) speeds up regression testing by selecting to run only tests that are affected by project changes. RTS is efficient if the time to select tests is smaller than the time to run unselected tests; RTS is safe if it guarantees that unselected tests cannot be affected by the changes; and RTS is precise if tests that are not affected are also unselected. Although many RTS techniques have been proposed in research, these techniques have not been adopted in practice because they do not provide efficiency and safety at once. This dissertation presents three main bodies of research to motivate, introduce, and improve a novel, efficient, and safe RTS technique, called Ekstazi. Ekstazi is the first RTS technique being adopted by popular open-source projects.

First, this dissertation reports on the first field study of test selection. The study of logs, recorded in real time from a diverse group of developers, finds that almost all developers perform manual RTS, i.e., manually select to run a subset of tests at each revision, and they select these tests in mostly ad hoc ways. Specifically, the study finds that manual RTS is not safe 74% of the time and not precise 73% of the time. These findings show the urgent need for better automated RTS techniques that could be adopted in practice.

Second, this dissertation introduces Ekstazi, a novel RTS technique that is efficient and safe. Ekstazi tracks dynamic dependencies of tests on files, and unlike most prior RTS techniques, Ekstazi requires no integration with version-control systems. Ekstazi computes for each test what files it depends on; the files can be either executable code or external resources. A test need not be run in the new project revision if none of the files it depends on changed. This dissertation also describes an implementation of Ekstazi for the Java programming language and the JUnit testing framework, and presents an extensive evaluation of Ekstazi on 615 revisions of 32 open-source projects (totaling almost 5M lines of code) with shorter- and longer-running test suites. The results show that Ekstazi reduced the testing time by 32% on average (and by 54% for longer-running test suites) compared to executing all tests. Ekstazi also yields lower testing time than the existing RTS techniques, despite the fact that Ekstazi may select more tests. Ekstazi is the first RTS tool adopted by several popular open-source projects, including Apache Camel, Apache Commons Math, and Apache CXF.

Third, this dissertation presents a novel approach that improves the precision of any RTS technique for projects with distributed software histories. The approach considers multiple old revisions, unlike all prior RTS techniques, which reasoned about changes between two revisions – an old revision and a new revision – when selecting tests, effectively assuming a development process where changes occur in a linear sequence (as was common for CVS and SVN). However, most projects nowadays follow a development process that uses distributed version-control systems (such as Git). Software histories are generally modeled as directed graphs; in addition to changes occurring linearly, multiple revisions can be related by other commands such as branch, merge, rebase, cherry-pick, and revert. The novel approach reasons about the commands that create each revision and selects tests for a new revision by considering multiple old revisions. This dissertation also proves the safety of the approach and presents an evaluation on several open-source projects. The results show that the approach can reduce the number of selected tests by over an order of magnitude for merge revisions.
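
The file-dependency rule described in the abstract (a test need not be run if none of the files it depends on changed) can be illustrated with a minimal sketch. This is not the Ekstazi implementation: the function names, the use of SHA-256 checksums to detect changes, and the hand-written dependency lists are all assumptions made for illustration. The abstract states that dependencies are tracked dynamically while tests run, which this sketch does not attempt.

```python
import hashlib
import tempfile
from pathlib import Path


def checksum(path):
    """Return a SHA-256 digest of a file, or a sentinel if the file is missing."""
    path = Path(path)
    if not path.exists():
        return "<missing>"
    return hashlib.sha256(path.read_bytes()).hexdigest()


def record_dependencies(files):
    """Snapshot the checksums of the files a test depended on (hypothetical helper)."""
    return {str(f): checksum(f) for f in files}


def select_tests(all_tests, stored):
    """Select tests whose dependencies changed, plus tests with no recorded data.

    stored maps a test name to {file path: checksum at the last run}.
    A test with no stored entry is selected, since nothing is known about it.
    """
    selected = []
    for test in all_tests:
        deps = stored.get(test)
        if deps is None or any(checksum(p) != old for p, old in deps.items()):
            selected.append(test)
    return selected


def demo():
    """Two revisions of a toy project: only config.xml changes between them."""
    with tempfile.TemporaryDirectory() as tmp:
        code = Path(tmp) / "A.class"
        resource = Path(tmp) / "config.xml"
        code.write_text("version 1 of A")
        resource.write_text("<config/>")

        # Old revision: assume TestA touched only the code file,
        # while TestB touched both the code file and the external resource.
        stored = {
            "TestA": record_dependencies([code]),
            "TestB": record_dependencies([code, resource]),
        }

        # New revision: the external resource is edited.
        resource.write_text("<config debug='true'/>")

        return select_tests(["TestA", "TestB", "TestNew"], stored)


if __name__ == "__main__":
    print(demo())
```

In this toy run, TestA is skipped because its only dependency is unchanged, TestB is selected because the resource file changed, and TestNew is selected because it has no recorded dependencies. Comparing file contents rather than consulting a commit log mirrors the abstract's point that no integration with a version-control system is required.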

Graduation Semester

2015-8

Type of Resource

text

Owning Collections

Graduate Dissertations and Theses at Illinois PRIMARY

Graduate Theses and Dissertations at Illinois

Dissertations and Theses - Computer Science

Dissertations and Theses from the Dept. of Computer Science
