GReaTest: Growing Readable Software Tests

Lead Research Organisation: University of Sheffield
Department Name: Computer Science

Abstract

Testing is a crucial part of any software development process. It is also very expensive: common estimates put the cost of software testing at around 50% of the average project budget. Recent studies suggest that 77% of the time software developers spend on testing is spent reading tests. Tests are read when they are generated; when they are updated, fixed, or refactored; when they serve as API usage examples and specification; and during debugging. Reading and understanding tests can be challenging, and evidence suggests that, despite the popularity of unit testing frameworks and test-driven development, the majority of software developers do not practise testing actively. Automatically generated tests tend to be particularly unreadable, severely inhibiting the widespread use of automated test generation in practice. The effects of insufficient testing can be dramatic, causing large economic damage and potentially harming people who rely on software in safety-critical applications.

Our proposed solution is to improve the effectiveness and efficiency of testing by improving the readability of tests. We will investigate which syntactic and semantic aspects make tests readable, so that readability can be modelled and measured. This, in turn, will allow us to provide techniques that guide manual or automatic improvement of the readability of software tests, made possible by a unique combination of machine learning, crowdsourcing, and search-based testing techniques. The GReaTest project will provide developers with tools that help them identify readability problems, automatically improve readability, and automatically generate readability-optimised test suites. The importance of readability and the usefulness of readability improvement will be evaluated in a range of empirical studies in conjunction with our industrial collaborators Microsoft, Google, and Barclays, investigating the relation of test readability to fault-finding effectiveness, developer productivity, and software quality.

Automated analysis and optimisation of test readability is novel: traditional analyses have focused only on easily measurable program aspects, such as code coverage. Improving the readability of software tests has a direct impact on industry, where testing is a major economic and technical factor: more readable tests will reduce the costs of testing and increase its effectiveness, thus improving software quality. Readability optimisation will be a key enabler for automated test generation in practice. Once the readability of software tests is understood, this will open the door to a new research direction: the analysis and improvement of other software artefacts based on human understanding and performance.

Planned Impact

The main beneficiaries of the project outcomes will be all stakeholders involved in IT projects:

-- Software developers and testers: Improved test readability will lead to higher programmer and tester productivity, as the time needed to understand tests and perform maintenance on them will be reduced. Furthermore, readability optimisation will help to overcome one of the main showstoppers preventing widespread adoption of automated test generation techniques. The ability to automatically generate readable tests will thus support software engineers in achieving sufficient degrees of testing, and will allow them to maintain more tests.

-- Organisations that develop IT systems: Software testing is one of the major cost factors in software engineering, commonly estimated at around 50% of the average budget. However, missing a software bug can have an even higher economic impact, as regularly demonstrated by bugs resulting in product recalls (e.g. Toyota), system downtimes (e.g. NatWest), or even accidents (e.g. Therac-25, Ariane 5). Improving test readability will reduce the costs of testing while at the same time improving its efficiency and increasing software quality. This will allow IT companies to deliver more value to their clients at lower costs.

-- Clients, users, and other stakeholders of IT projects will benefit from the improved software quality resulting from more efficient testing. This is particularly important as our society increasingly depends on a working information infrastructure for more and more aspects of civic, commercial, and social life, while software at the same time becomes ever more complex.
 
Description The project drives the development of the open source unit test generation tool "EvoSuite" (http://www.evosuite.org), which has users in academia and industry. In particular, EvoSuite has been used for experimentation by other researchers, and the published papers have led to follow-up work by other groups. The prototypes have also been tried out by users in industry, who provided useful feedback that informed the further course of the project.

The project has further resulted in the Code Defenders web-based game (http://www.code-defenders.org), which has been used as a "game with a purpose" resulting in strong software tests, and it has seen applications in an educational setting.
Exploitation Route The work on Code Defenders has triggered new collaborations and is being integrated into programming education in higher education and secondary schools.
Sectors Digital/Communication/Information Technologies (including Software), Education

 
Description The Code Defenders game has been used in education at several universities, the Halmstad Summer School on Testing (http://ceres.hh.se/mediawiki/index.php/HSST_2016), and the HEADSTART summer school for Y12 students at the University of Sheffield.
First Year Of Impact 2016
Sector Digital/Communication/Information Technologies (including Software), Education
Impact Types Societal

 
Description Collaboration with the University of Calgary 
Organisation University of Calgary
Country Canada 
Sector Academic/University 
PI Contribution We supported the University of Calgary in terms of data analysis, conducting experiments, and tool development.
Collaborator Contribution Prof. Hadi Hemmati's group at the University of Calgary, Canada, is applying the test generation tool developed in the GReaTest project in the context of several industrial research projects.
Impact The ICSE SEIP paper "An industrial evaluation of unit test generation: finding real faults in a financial application" received the IEEE Software Best Paper Award.
Start Year 2017
 
Description Research collaboration with the University of São Paulo 
Organisation Universidade de São Paulo
Country Brazil 
Sector Academic/University 
PI Contribution As part of this collaboration, a software framework for testing mobile apps is being developed. We have also co-authored a FAPESP grant application.
Collaborator Contribution The collaboration with the University of São Paulo has resulted in a FAPESP award to Prof. Marcelo Eler, who is spending 12 months at the University of Sheffield to collaborate on automated testing of mobile apps.
Impact First outcomes are currently under review at the International Symposium on the Foundations of Software Engineering.
Start Year 2016
 
Title Code Defenders 
Description Writing good software tests is difficult and not every developer's favourite occupation. Mutation testing aims to help by seeding artificial faults (mutants) that good tests should identify, and test generation tools help by providing automatically generated tests. However, mutation tools tend to produce huge numbers of mutants, many of which are trivial, redundant, or semantically equivalent to the original program; automated test generation tools tend to produce tests that achieve good code coverage, but are otherwise weak and have no clear purpose. Code Defenders is an approach based on gamification and crowdsourcing to produce better software tests and mutants: the web-based game lets teams of players compete over a program, where attackers try to create subtle mutants, which the defenders try to counter by writing strong tests. Experiments in controlled and crowdsourced scenarios reveal that writing tests as part of the game is more enjoyable, and that playing Code Defenders results in stronger test suites and mutants than those produced by automated tools. 
Type Of Technology Webtool/Application 
Year Produced 2016 
Impact Use in classes at the University of Sheffield and other institutions for educational purposes; paper at the International Conference on Software Engineering 2016. 
URL http://code-defenders.org
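The mutation-testing idea at the heart of Code Defenders can be illustrated with a minimal, hypothetical sketch (the function names and the seeded fault below are illustrative, not taken from the tool): a mutant is a program with a small seeded fault, and a test "kills" the mutant if it passes on the original program but fails on the mutant.

```python
def add(x, y):
    """Original program under test."""
    return x + y

def add_mutant(x, y):
    """A mutant: the '+' operator has been seeded with a fault ('-')."""
    return x - y

def kills(test, program):
    """A test kills a mutant if it fails when run against it."""
    try:
        test(program)
        return False
    except AssertionError:
        return True

def weak_test(f):
    assert f(2, 0) == 2   # holds for original AND mutant: mutant survives

def strong_test(f):
    assert f(2, 3) == 5   # fails on the mutant: mutant is killed

print(kills(weak_test, add_mutant))    # False
print(kills(strong_test, add_mutant))  # True
```

In game terms, attackers aim for subtle mutants that survive the existing tests (as the weak test above fails to notice the fault when y is 0), while defenders write tests strong enough to kill them.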
 
Description Invited tutorial at the 9th International Workshop on Search-Based Software Testing 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact An invited tutorial in which other researchers learned how to use the EvoSuite tool in their own research.
Year(s) Of Engagement Activity 2016
URL https://cse.sc.edu/~ggay/sbst2016/
 
Description Invited tutorial at the International Conference on Search-Based Software Engineering 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact 60 international researchers attended the SSBSE conference and the invited tutorial, in which participants learned how to use the EvoSuite test generation tool.
Year(s) Of Engagement Activity 2017
URL http://ssbse17.github.io/tutorials/
 
Description Keynote speaker at the First International Summer School on Search-Based Software Engineering 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact The talk sparked questions and discussion afterwards
Year(s) Of Engagement Activity 2016
URL https://sbse2016.uca.es/sbse/
 
Description Speaker at the 12th International Summer School on Software Engineering 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact A presentation of search-based test generation and the EvoSuite tool sparked discussions and triggered new collaborations.
Year(s) Of Engagement Activity 2016
URL http://www.sesa.unisa.it/seschool/previousEditions/2016/
 
Description Speaker at the ISSTA Summer School 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact A tutorial on the EvoSuite tool was given to an audience of researchers, co-located with the International Symposium on Software Testing and Analysis.
Year(s) Of Engagement Activity 2016
URL https://issta2016.cispa.saarland/summer-school-confirmed-speakers/