| Authors | A. Arcuri and L. Briand |
|---|---|
| Title | Adaptive Random Testing: an Illusion of Effectiveness |
| Project(s) | No Simula project |
| Publication Type | Technical reports |
| Year of Publication | 2010 |
| Number | 2010-09, Version 2 |
| Publisher | Simula Research Laboratory |
Adaptive Random Testing (ART) has been proposed as an enhancement to random testing, based on assumptions about how failing test cases are distributed in the input domain. The main assumption is that failing test cases are usually grouped into contiguous regions. Several papers have described ART as an effective alternative to random testing when evaluated with the F-measure, i.e., the average number of test case executions needed to find a failure. However, all the work in the literature is based either on simulations or on biased case studies with unreasonably high failure rates. In this paper, we report on the largest empirical analysis of ART in the literature, involving 3727 mutated programs and nearly five trillion test cases. Results show that, once the distance calculations among test cases are accounted for, ART is highly inefficient even on trivial problems, to an extent that probably prevents its practical use in most situations. For example, on the infamous Triangle Classification program, random testing finds failures in a few milliseconds, whereas ART's execution time is prohibitive. Even when assuming a small, fixed-size test set and looking at the probability of finding a failure (P-measure), ART does not fare well, being only slightly better than random testing. We provide precise explanations of this phenomenon based on rigorous empirical analyses. For the simpler case of single-dimension input domains, we also perform formal analyses to support our claim that ART is of little use unless drastic enhancements are developed. We clearly identify the components of ART that need to be improved to make it a viable option in practice.
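To make the distance-calculation overhead mentioned in the abstract concrete, here is a minimal Python sketch of one common ART variant, fixed-size-candidate-set ART (FSCS-ART). This is an illustrative assumption, not the report's exact implementation or subject programs: the input domain, candidate-set size, and the toy contiguous failure region are all hypothetical. The point it shows is that every new test requires distances to all previously executed tests, so the total number of distance computations grows quadratically with the test count.

```python
import random

def fscs_art(is_failure, dim=2, candidates=10, max_tests=10_000, seed=0):
    """FSCS-ART sketch: each new test is the random candidate farthest
    (by minimum Euclidean distance) from all previously executed tests.
    Returns the F-measure: the number of tests run until the first failure,
    or None if no failure is found within max_tests."""
    rng = random.Random(seed)
    executed = []
    for n in range(max_tests):
        if not executed:
            test = [rng.random() for _ in range(dim)]
        else:
            cands = [[rng.random() for _ in range(dim)]
                     for _ in range(candidates)]
            # Pick the candidate maximizing its min distance to executed
            # tests: candidates * len(executed) distance computations per
            # step, hence O(n^2) distance computations overall.
            test = max(cands, key=lambda c: min(
                sum((a - b) ** 2 for a, b in zip(c, e)) for e in executed))
        executed.append(test)
        if is_failure(test):
            return n + 1
    return None

# Hypothetical failure region: a small contiguous square in the unit
# square, matching ART's core assumption about failure distributions.
def fails(t):
    return 0.4 <= t[0] <= 0.45 and 0.4 <= t[1] <= 0.45

f_measure = fscs_art(fails)
```

Pure random testing replaces the candidate-selection step with a single draw and needs no distance computations at all, which is the asymmetry the report's efficiency comparison rests on.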