Multiple-Implementation Testing of Supervised Learning Software

Alebiosu, Oreoluwa; Srisakaokul, Siwakorn; Astorga, Angello; Xie, Tao

Permalink

https://hdl.handle.net/2142/91645

Description

Title

Multiple-Implementation Testing of Supervised Learning Software

Author(s)

Alebiosu, Oreoluwa
Srisakaokul, Siwakorn
Astorga, Angello
Xie, Tao

Issue Date

2016-10-10

Keyword(s)

Multiple-Implementation Testing, Machine Learning Software, Supervised Learning Software

Abstract

Machine learning (ML) software, which implements an ML algorithm, is widely used in application domains such as finance, business, and engineering. Faults in ML software can cause substantial losses in these domains, so it is critical to test ML software effectively in order to detect and eliminate its faults. However, testing ML software is difficult, especially when it comes to producing test oracles for checking behavioral correctness (such as expected properties or expected test outputs). To tackle the test-oracle issue, in this paper we present a novel black-box approach of multiple-implementation testing for supervised learning software. The insight underlying our approach is that there can be multiple independently written implementations of a supervised learning algorithm, and a majority of them may produce the expected output for a test input even if none of these implementations is fault-free. In particular, our approach derives a pseudo-oracle for a test input by running the test input on n implementations of the supervised learning algorithm and then using the common test output produced by a majority (determined by a percentage threshold) of these n implementations. Our approach includes techniques to address two challenges in multiple-implementation testing (or, more generally, testing) of supervised learning software: defining a test case for supervised learning software, and resolving inconsistent algorithm configurations across implementations. Our evaluations show that multiple-implementation testing is effective in detecting real faults in real-world ML software (including popular software): it detects 5 faults from 10 NaiveBayes implementations and 4 faults from 20 k-nearest neighbor implementations.
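The majority-vote pseudo-oracle described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' tool: the function names `pseudo_oracle` and `find_deviating`, the representation of an implementation as a plain callable, the strict "greater than threshold" comparison, and the default threshold of 0.5 are all assumptions made for illustration only.

```python
from collections import Counter
from typing import Callable, Hashable, List, Optional, Sequence, Tuple


def pseudo_oracle(outputs: Sequence[Hashable], threshold: float = 0.5) -> Optional[Hashable]:
    """Return the output shared by more than `threshold` of the implementations.

    Returns None when no single output reaches the threshold (assumed policy).
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count / len(outputs) > threshold else None


def find_deviating(
    implementations: Sequence[Callable[[object], Hashable]],
    test_input: object,
    threshold: float = 0.5,
) -> Tuple[Optional[Hashable], List[int]]:
    """Run one test input on every implementation and list those that
    disagree with the majority output (candidates for manual inspection)."""
    outputs = [impl(test_input) for impl in implementations]
    expected = pseudo_oracle(outputs, threshold)
    if expected is None:
        return None, []  # no majority, so no pseudo-oracle for this input
    deviating = [i for i, out in enumerate(outputs) if out != expected]
    return expected, deviating


# Hypothetical stand-ins for four implementations of the same classifier.
toy_implementations = [
    lambda x: "spam",
    lambda x: "spam",
    lambda x: "spam",
    lambda x: "ham",  # disagrees with the majority
]

expected, deviating = find_deviating(toy_implementations, test_input=None)
print(expected, deviating)  # spam [3]
```

A deviating implementation is only a suspect, not a confirmed fault: the majority itself can be wrong, and differences may stem from inconsistent algorithm configurations, which is why the abstract lists configuration resolution as a separate challenge.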

Type of Resource

text

Owning Collections

Research and Tech Reports - Computer Science PRIMARY
