Combinatorial Interaction Testing (CIT) Benchmarks

URL

Change Log

When | What
December 14th, 2015 | Donated by Vivek Nair

Reference

Studies that use the data (in any form) are required to include the following reference:

@inproceedings{Jia:2015:LCI:2818754.2818821,
 author = {Jia, Yue and Cohen, Myra B. and Harman, Mark and Petke, Justyna},
 title = {Learning Combinatorial Interaction Test Generation Strategies Using Hyperheuristic Search},
 booktitle = {Proceedings of the 37th International Conference on Software Engineering - Volume 1},
 series = {ICSE '15},
 year = {2015},
 isbn = {978-1-4799-1934-5},
 location = {Florence, Italy},
 pages = {540--550},
 numpages = {11},
 url = {http://dl.acm.org/citation.cfm?id=2818754.2818821},
 acmid = {2818821},
 publisher = {IEEE Press},
 address = {Piscataway, NJ, USA},
}

About the Data

Overview of Data

There are five subject sets used in our experiments. The details are summarized below:

[Syn-2] contains 14 pairwise (2-way) synthetic models without constraints, shown in the leftmost column of Table I. These benchmark models have been used to compare both mathematical constructions and search-based techniques [2], [10], [11], [18], [32]. We take them from Table 7 of the paper by Garvin et al. [2].

[Syn-3] contains 15 3-way synthetic models without constraints, shown in the second column of Table I. These benchmark models have been used for mathematical constructions and search [10], [33], [34]. We take them from Table 7 of the paper by Garvin et al. [2].

[Syn-C2] contains 30 2-way synthetic models with constraints (see Table I, rightmost two columns). These models were designed to simulate configurations with constraints in real-world programs. They were generated by Cohen et al. [35] and adopted in follow-up research by Garvin et al. [2], [25].

[Real-1] contains 20 real-world CIT problems from a recent benchmark created by Segall et al. [21], shown in Table II. The problems were generated by or for IBM customers and cover a wide range of applications, including telecommunications, healthcare, storage and banking systems.

[Real-2] contains 6 real-world constrained subjects, shown in Table II, which have been widely studied in the literature [2], [25], [30], [35], [36]. The TCAS model, first presented by Kuhn et al. [36], is a traffic collision avoidance system from the 'Siemens' suite [37]. The rest of the models in this subject set were introduced by Cohen et al. [30], [35]: SPIN-S and SPIN-V are two components for model simulation and model verification, GCC is a well-known compiler system from the GNU Project, Apache is a web server application, and Bugzilla is a web-based bug tracking system.
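To make the terms "2-way" and "3-way" concrete, the following is a minimal sketch of how t-way coverage could be checked for an unconstrained model. It is not the tooling used in the paper, and it does not read the benchmark files. It assumes a model is given as a list with one entry per parameter holding that parameter's number of values; the function names `required_interactions` and `uncovered` are hypothetical, and constraints (as in Syn-C2 and the real-world sets) are not modelled.

```python
from itertools import combinations, product


def required_interactions(levels, t=2):
    """Yield every t-way interaction of an unconstrained model.

    `levels[i]` is the number of values of parameter i (assumed encoding).
    Each interaction is a tuple of (parameter index, value) pairs.
    """
    for params in combinations(range(len(levels)), t):
        for values in product(*(range(levels[p]) for p in params)):
            yield tuple(zip(params, values))


def uncovered(levels, suite, t=2):
    """Return the t-way interactions that no test case in `suite` covers."""
    covered = set()
    for row in suite:
        for params in combinations(range(len(levels)), t):
            covered.add(tuple((p, row[p]) for p in params))
    return [i for i in required_interactions(levels, t) if i not in covered]


# Illustrative model: 4 parameters with 3 values each.
levels = [3, 3, 3, 3]

# A 9-row suite built from a standard orthogonal-array construction mod 3.
suite = [(a, b, (a + b) % 3, (a + 2 * b) % 3)
         for a in range(3) for b in range(3)]

print(len(list(required_interactions(levels))))  # number of pairs to cover
print(uncovered(levels, suite))                  # empty list if all pairs are covered
```

Calling the same functions with `t=3` gives the corresponding check for 3-way models such as those in Syn-3.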

Paper Abstract

The surge of search based software engineering research has been hampered by the need to develop customized search algorithms for different classes of the same problem. For instance, two decades of bespoke Combinatorial Interaction Testing (CIT) algorithm development, our exemplar problem, has left software engineers with a bewildering choice of CIT techniques, each specialized for a particular task. This paper proposes the use of a single hyperheuristic algorithm that learns search strategies across a broad range of problem instances, providing a single generalist approach. We have developed a Hyperheuristic algorithm for CIT, and report experiments that show that our algorithm competes with known best solutions across constrained and unconstrained problems: For all 26 real-world subjects, it equals or outperforms the best result previously reported in the literature. We also present evidence that our algorithm’s strong generic performance results from its unsupervised learning. Hyperheuristic search is thus a promising way to relocate CIT design intelligence from human to machine.