InfoTestSS: Information theory and Test Suite Selection

Lead Research Organisation: University College London
Department Name: Computer Science


Software testing is an important part of the software development process but is typically manual, expensive, and error-prone. This has led to significant interest in automated test generation (and execution) algorithms, which have the potential to lead to cheaper, higher-quality software. Despite the interest in automating parts of testing, significant challenges remain, and automated testing is mentioned as an EPSRC priority within Software Engineering.

This project will build on initial work by the PIs that has demonstrated that an important aspect of testing can be represented in terms of Quantified Information Flow. Specifically, the PIs previously looked at Failed Error Propagation (FEP), which is sometimes called coincidental correctness. In FEP, a test execution goes through a faulty part of the software and this leads to what would be regarded as a corrupted program state (i.e. the fault has an effect), but ultimately the output is correct. Although studies have shown that FEP can significantly reduce test effectiveness, there is a lack of practical techniques that address FEP. The observation made by the PIs is that FEP corresponds to a failure of information to flow from the fault in the software to the output: information is lost when different values of the program state (correct and faulty) are mapped to the same output.
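The idea that FEP is a loss of information can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the functions `correct_state`, `faulty_state` and `downstream` are hypothetical, inputs are assumed to be uniformly distributed, and the quantity computed (entropy of the program states minus entropy of the outputs) is a simple stand-in for an information-loss measure, not necessarily the metric used by the PIs.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def correct_state(x):
    # Hypothetical intended intermediate computation.
    return x + 1


def faulty_state(x):
    # Hypothetical fault: the sign of the intermediate value is flipped.
    return -(x + 1)


def downstream(state):
    # Code executed after the fault; squaring maps s and -s to the same output.
    return state * state


inputs = list(range(8))  # assumed uniform input distribution

# FEP: the program state is corrupted, yet the final output is still correct.
fep_inputs = [
    x for x in inputs
    if faulty_state(x) != correct_state(x)
    and downstream(faulty_state(x)) == downstream(correct_state(x))
]

# Information lost by `downstream` over the states it may receive
# (correct and faulty states pooled together).
states = [correct_state(x) for x in inputs] + [faulty_state(x) for x in inputs]
loss_bits = entropy(states) - entropy([downstream(s) for s in states])

print(f"inputs exhibiting FEP: {fep_inputs}")
print(f"information lost downstream: {loss_bits:.2f} bits")
```

In this toy setting every input exhibits FEP, and the downstream code loses exactly one bit: the bit that distinguishes a correct state from its faulty counterpart. Replacing `downstream` with an injective function such as `state + 3` would give zero loss and no FEP.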

The PIs have shown how FEP can be represented in terms of an information theoretic notion: Quantified Information Flow (QIF). The results of experiments were highly promising, showing a rank correlation of over 0.95 between the frequency with which FEP was observed in software and a QIF-based metric. This remarkably strong result opens up the possibility of devising techniques that generate test cases that are less likely to suffer from FEP. In addition, we believe that it is possible to represent other important testing concepts using information theory, specifically: the 'feasibility' of a path (we do not want test automation to waste effort in trying to trigger infeasible paths), the diversity of a test suite (evidence suggests that diverse test suites are effective), and the effectiveness of probes/oracles added to the code.
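For readers unfamiliar with rank correlation, the following sketch shows how a Spearman rank correlation between a metric and an observed FEP frequency could be computed. It is illustrative only: the source does not say which rank correlation coefficient was used, and the two lists `metric` and `fep_rate` are made-up toy values, not data from the project.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy values, one pair per program under study.
metric = [0.2, 1.5, 0.9, 3.1, 2.4]         # assumed QIF-based metric
fep_rate = [0.01, 0.10, 0.12, 0.40, 0.25]  # assumed observed FEP frequency

rho = spearman(metric, fep_rate)
print(f"Spearman rank correlation: {rho:.2f}")  # prints 0.90 for these toy values
```

A value close to 1 means that ordering programs by the metric gives nearly the same ordering as sorting them by how often FEP occurs, which is what makes such a metric useful for guiding test generation.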

This project will develop new methods, based on information theory, for reasoning about the above factors (FEP, feasibility, diversity, and oracles). In doing so it will develop information theoretic measures that can help test automation to overcome the associated issues. It will also develop methods for estimating these measures, integrate these estimates into automated test generation, and evaluate the results on open source software and software provided by our industrial partners. The outcome will be a new theory for software testing, based on information theory, and a set of techniques that use this theory to make software testing more efficient and effective.

Planned Impact

Software increasingly underpins our contemporary world, and this trend will continue for the foreseeable future. Software relies on testing to ensure that it behaves as required. Testing is perhaps less science than art, and there are a number of known, persistent problems in software testing. The proposed research will address four classic problems in selecting a suitable set of test inputs, show how solutions to these can improve two state-of-the-art approaches to selection, and then produce a prototype tool that automates the solutions. Furthermore, the research plans to do this by modelling the four classic problems using information theory, producing an underpinning theory that has the potential to develop into a general theory of software testing.

Short-term impact
- Develop human capital by deepening and extending the understanding of software testing among the researchers and collaborators directly involved in the project, both from academia and industry.
- Within the academic software testing community, increase the understanding of software testing and the knowledge of novel techniques to solve problems within this domain; build a new research community in applications of information theory to software testing and to software engineering more generally; provide new lines of research and new research problems for both the QIF and the software testing communities.
- Improve the UK and European quality of life through cheaper, better quality software. This will, particularly and more immediately, benefit the sectors represented by our industrial partners, namely the banking and automotive sectors.
- Add to the number of options in choosing tools for automating testing.

Medium-term impact
- Reduce software cost and improve time to market for all forms of software development.
- Improve the sophistication of software testing and the level of skill and knowledge of its practitioners.
- Lay foundations for a coherent theory of software testing based on information theory; make connections to a more general theory of software engineering based on the theory of information.
- Expand the connections between software testing and formal methods via information theory.
- Create a new venue for discussion, publication and dissemination of ideas about these topics.

Long-term impact
- Contribute to providing laws, predictions, and an integrated theory for software testing in particular and software engineering in general.
- Improve society's confidence in software affecting many aspects of life.


Clark D (2019) Normalised Squeeziness and Failed Error Propagation in Information Processing Letters

Description Have developed an open source tool that enables programmers to test assertion oracles embedded in code to improve testing.
Exploitation Route Programmers in Java can download and use this tool to assist them in testing their programs.
Sectors Aerospace, Defence and Marine,Agriculture, Food and Drink,Chemicals,Communities and Social Services/Policy,Construction,Creative Economy,Digital/Communication/Information Technologies (including Software),Education,Electronics,Energy,Environment,Financial Services, and Management Consultancy,Healthcare,Leisure Activities, including Sports, Recreation and Tourism,Government, Democracy and Justice,Manufacturing, including Industrial Biotechnology,Culture, Heritage, Museums and Collections,Pharmace

Description Collaboration with Fondazione Bruno Kessler, Trento, Italy
Organisation Fondazione Bruno Kessler
Country Italy 
Sector Private 
PI Contribution I have co-authored a paper with Professor Paolo Tonella and we jointly supervise a PhD student. The student is supported by the grant funding from FBK.
Collaborator Contribution FBK have supplied the funding for a PhD student, Gunel Jahangirova, whom Professor Tonella and I jointly supervise. We have collaborated extensively, producing one paper so far.
Impact Gunel Jahangirova, David Clark, Mark Harman, Paolo Tonella: Test oracle assessment and improvement. ISSTA 2016: 247-258.
Start Year 2015

Description TAROT Summer School 2018
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Summer School on research in software testing held at UCL. Organised by the InfoTestSS team at UCL and Brunel. David Clark presented a talk on InfoTestSS research. The school took place over five days and involved many international experts.
Year(s) Of Engagement Activity 2018

Description Talk at Leicester Computer Science Department External Seminar series, 7 December
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Professional Practitioners
Results and Impact Talk on information theory and testing.
Year(s) Of Engagement Activity 2018

Description The 57th CREST Open Workshop - Information Theory and Software Testing
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact This was a workshop designed to introduce the Information Theory for Test Set Selection project to academics and industry. Almost 50 people attended with 40 people formally registered. The audience was drawn from different countries with the majority of the participants being from the UK.
Year(s) Of Engagement Activity 2018