GReaTest: Growing Readable Software Tests
Lead Research Organisation:
University of Sheffield
Department Name: Computer Science
Abstract
Testing is a crucial part of any software development process. It is also very expensive: common estimates put the cost of software testing at around 50% of the average project budget. Recent studies suggest that 77% of the time software developers spend on testing goes into reading tests. Tests are read when they are generated; when they are updated, fixed, or refactored; when they serve as API usage examples and specifications; and during debugging. Reading and understanding tests can be challenging, and evidence suggests that, despite the popularity of unit testing frameworks and test-driven development, the majority of software developers do not practise testing actively. Automatically generated tests tend to be particularly unreadable, which severely inhibits the widespread use of automated test generation in practice. The effects of insufficient testing can be dramatic, causing large economic damage and potentially harming people who rely on software in safety-critical applications.
Our proposed solution to this problem is to improve the effectiveness and efficiency of testing by improving the readability of tests. We will investigate which syntactic and semantic aspects make tests readable, so that we can make readability measurable by modelling it. This, in turn, will allow us to provide techniques that guide the manual or automatic improvement of the readability of software tests, made possible by a unique combination of machine learning, crowdsourcing, and search-based testing techniques. The GReaTest project will provide developers with tools that help them identify readability problems, automatically improve readability, and automatically generate readability-optimised test suites. The importance of readability and the usefulness of readability improvement will be evaluated in a range of empirical studies with our industrial collaborators Microsoft, Google, and Barclays, investigating the relation of test readability to fault-finding effectiveness, developer productivity, and software quality.
Automated analysis and optimisation of test readability is novel: traditional analyses have focused only on easily measurable program aspects, such as code coverage. Improving the readability of software tests has a direct impact on industry, where testing is a major economic and technical factor: more readable tests will reduce the cost of testing and increase its effectiveness, thus improving software quality. Readability optimisation will be a key enabler for automated test generation in practice. Once the readability of software tests is understood, this opens the door to a new research direction: the analysis and improvement of other software artefacts based on human understanding and performance.
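To illustrate the readability problem the project targets, the following sketch contrasts a machine-generated-style unit test with a readable equivalent of the same scenario. The `Stack` class and all test names are invented purely for illustration; they do not come from the project's tools.

```python
import unittest


class Stack:
    """Minimal illustrative class under test."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def size(self):
        return len(self._items)


class GeneratedStyleTest(unittest.TestCase):
    # Typical of automatically generated tests: opaque variable names,
    # an uninformative method name, and no evident intent.
    def test0(self):
        var0 = Stack()
        var0.push(-1)
        var0.push(-1)
        int0 = var0.pop()
        self.assertEqual(-1, int0)
        self.assertEqual(1, var0.size())


class ReadableTest(unittest.TestCase):
    # The same behaviour, named and written so the scenario is clear at a glance.
    def test_pop_returns_last_pushed_item_and_shrinks_stack(self):
        stack = Stack()
        stack.push(-1)
        stack.push(-1)
        self.assertEqual(-1, stack.pop())
        self.assertEqual(1, stack.size())
```

Both tests exercise identical behaviour and achieve the same coverage; only the second communicates its purpose to a human reader, which is exactly the dimension GReaTest aims to measure and optimise.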
Planned Impact
The main beneficiaries of the project outcomes will be all stakeholders involved in IT projects:
-- Software developers and testers: Improved test readability will lead to higher programmer and tester productivity, as the time needed to understand tests and perform maintenance actions on them will be reduced. Furthermore, readability optimisation will help to overcome one of the main show-stoppers preventing widespread application of automated test generation techniques. Thus, the ability to generate readable tests automatically will support software engineers in achieving sufficient degrees of testing, and will allow them to maintain more tests.
-- Organisations that develop IT systems: Software testing is one of the major cost factors in software engineering, commonly estimated at around 50% of the average budget. However, missing a software bug can have an even higher economic impact, as regularly demonstrated by bugs resulting in product recalls (e.g. Toyota), system downtimes (e.g. NatWest), or even accidents (e.g. Therac-25, Ariane 5). Improving test readability will reduce the costs of testing, while at the same time improving its efficiency and increasing software quality. This will allow IT companies to deliver more value to their clients at lower costs.
-- Clients, users, and other stakeholders of IT projects will benefit from the improved software quality resulting from more efficient testing. This is particularly important as our society increasingly depends on a working information infrastructure for more and more aspects of civic, commercial, and social life, while software at the same time becomes ever more complex.
Publications
Arcuri A (2016) Search Based Software Engineering
Campos J (2017) Search Based Software Engineering
Campos J (2018) An empirical evaluation of evolutionary algorithms for unit test suite generation, in Information and Software Technology
Fraser G (2017) EvoSuite at the SBST 2017 Tool Competition
Fraser G (2016) EvoSuite at the SBST 2016 Tool Competition
Pearson S (2017) Evaluating and Improving Fault Localization
Description | The project drives the development of the open-source unit test generation tool "EvoSuite" (http://www.evosuite.org), which has users in academia and industry. In particular, EvoSuite has been used for experimentation by other researchers, and the published papers have led to follow-up work by other researchers. The prototypes have also been tested by users in industry, who provided useful feedback for the further course of the project. The project has further resulted in the web-based game Code Defenders (http://www.code-defenders.org), which has been used as a "game with a purpose" to produce strong software tests, and which has seen applications in an educational setting. |
Exploitation Route | The work on Code Defenders has triggered new collaborations and is being integrated into programming education in higher education and secondary schools. |
Sectors | Digital/Communication/Information Technologies (including Software), Education
Description | The Code Defenders game has been used in education at several universities, the Halmstad Summer School on Testing (http://ceres.hh.se/mediawiki/index.php/HSST_2016), and the HEADSTART summer school for Y12 students at the University of Sheffield. |
First Year Of Impact | 2016 |
Sector | Digital/Communication/Information Technologies (including Software),Education |
Impact Types | Societal |
Description | Collaboration with the University of Calgary |
Organisation | University of Calgary |
Country | Canada |
Sector | Academic/University |
PI Contribution | Prof. Hadi Hemmati's group at the University of Calgary, Canada, is applying the test generation tool developed in the GReaTest project in the context of several industrial research projects.
Collaborator Contribution | We supported the University of Calgary with data analysis, the conduct of experiments, and tool development.
Impact | The ICSE SEIP paper "An industrial evaluation of unit test generation: finding real faults in a financial application" received the IEEE Software Best Paper Award. |
Start Year | 2017 |
Description | Research collaboration with the University of Sao Paulo |
Organisation | Universidade de São Paulo |
Country | Brazil |
Sector | Academic/University |
PI Contribution | As part of this collaboration, a software framework for testing mobile apps is being developed. We have also co-authored a FAPESP grant application.
Collaborator Contribution | The collaboration with the University of Sao Paulo has resulted in a FAPESP award to Prof. Marcelo Eler, who is spending 12 months at the University of Sheffield to collaborate on automated testing of mobile apps. |
Impact | First outcomes are currently under review at the International Symposium on the Foundations of Software Engineering. |
Start Year | 2016 |
Title | Code Defenders |
Description | Writing good software tests is difficult and not every developer's favourite occupation. Mutation testing aims to help by seeding artificial faults (mutants) that good tests should identify, and test generation tools help by providing automatically generated tests. However, mutation tools tend to produce huge numbers of mutants, many of which are trivial, redundant, or semantically equivalent to the original program, while automated test generation tools tend to produce tests that achieve good code coverage but are otherwise weak and have no clear purpose. Code Defenders is a web-based game that uses gamification and crowdsourcing to produce better software tests and mutants: teams of players compete over a program, where attackers try to create subtle mutants, which the defenders try to counter by writing strong tests. Experiments in controlled and crowdsourced scenarios reveal that writing tests as part of the game is more enjoyable, and that playing Code Defenders results in stronger test suites and mutants than those produced by automated tools. |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | Use in classes at the University of Sheffield and other institutions for educational purposes; paper at the International Conference on Software Engineering 2016. |
URL | http://code-defenders.org |
Description | Invited tutorial at the 9th International Workshop on Search-Based Software Testing |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | An invited tutorial in which other researchers learned how to use the EvoSuite tool in their own research. |
Year(s) Of Engagement Activity | 2016 |
URL | https://cse.sc.edu/~ggay/sbst2016/ |
Description | Invited tutorial at the International Conference on Search-Based Software Engineering |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | 60 international researchers attended the SSBSE conference and the invited tutorial, in which participants learned how to use the EvoSuite test generation tool. |
Year(s) Of Engagement Activity | 2017 |
URL | http://ssbse17.github.io/tutorials/ |
Description | Keynote speaker at the First International Summer School on Search-Based Software Engineering |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | The talk sparked questions and discussion afterwards.
Year(s) Of Engagement Activity | 2016 |
URL | https://sbse2016.uca.es/sbse/ |
Description | Speaker at the 12th International Summer School on Software Engineering |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | A presentation of search-based test generation and the EvoSuite tool sparked discussions and triggered new collaborations. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.sesa.unisa.it/seschool/previousEditions/2016/ |
Description | Speaker at the ISSTA Summer School |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | A tutorial on the EvoSuite tool was given to an audience of researchers at the summer school co-located with the International Symposium on Software Testing and Analysis.
Year(s) Of Engagement Activity | 2016 |
URL | https://issta2016.cispa.saarland/summer-school-confirmed-speakers/ |