Testing, Verifying, and Generating Software Patches Using Dynamic Symbolic Execution

Lead Research Organisation: Imperial College London
Department Name: Computing

Abstract

A large fraction of the costs of developing and maintaining software is associated with detecting and fixing software errors. As a result, the last decade has seen a sustained research effort directed toward designing and developing techniques for automatically detecting software errors, with some of these techniques making their way into commercial and open-source tools. However, detecting an error is only the first step toward fixing it. In fact, many known errors remain unpatched due to the high cost required to diagnose and repair them, combined with the fear that patches are more likely to introduce failures compared to other types of code changes.

The goal of this research project is to address both of these problems, by devising novel techniques based on dynamic symbolic execution for:
(1) automatically testing and verifying the correctness of software patches, and
(2) (semi-)automatically generating candidate patches for software bugs.

The strength of dynamic symbolic execution lies in its ability to precisely model the behaviour of program paths using mathematical constraints. However, the cost associated with this level of precision is poor scalability. The number of paths in a program is usually exponential in the number of branches, which makes it difficult to scale the analysis to very large programs. However, by focusing the analysis on the incremental changes introduced by program patches, we hope to significantly reduce the cost of symbolic execution and significantly increase its applicability in practice. Furthermore, the ability to check software patches opens up the possibility of performing patch generation in an automatic or semi-automatic fashion. In particular, starting from the mathematical constraints gathered from a buggy execution path -- and with the potential addition of a manually-written patch template -- we plan to design techniques for generating a set of candidate patches resembling the ones that would be generated manually by developers.

Planned Impact

Key Non-Academic Beneficiaries: The primary non-academic beneficiaries of the proposed research are software companies and the open-source software community. It is well known that software bugs have a large negative impact on the global economy. For example, a report published by the US National Institute of Standards and Technology (NIST) in 2002 has found that software bugs cost annually the US economy an estimated $60 billion dollars or 0.6 percent of the gross domestic product [G. Tassey, The economic impacts of inadequate infrastructure for software testing, NIST 2002], and a similar impact is to be expected in other developed countries like the UK. A large fraction of this cost is associated with writing patches---either to fix existing bugs or to add new features---which is one of the critical software development and maintenance activities. This research project plans to investigate mechanisms for improving the quality of patches and for (partially) automating their development, which on the long term might bring significant economic benefits.

In addition to the commercial and open-source software development communities---who will directly benefit from improving the software patching process---the resulting increase in software quality will positively affect software users across a wide variety of industrial sectors. In particular, the NIST study cited above mentions the financial services and the automotive and aerospace industry as major sectors affected by software errors.

Mechanisms for Exploitation and Application: Our goal is to disseminate the results of this research from an early stage through a variety of mechanisms: (1) by applying it to open-source and commercial code to demonstrate the effectiveness of our approach; (2) by open-sourcing our initial prototype to be used directly by software developers on their code; and (3) by communicating the results of our research to the non-academic user community.

On the short-term, we will apply our technique to widely-used open-source software, and report any errors found to developers. Furthermore, we will send automatically generated patches to developers, use the feedback they provide response to these patches will be used to improve our techniques and associated prototype.

After the initial evaluation on open-source software, we plan to apply our approach to commercial code. We have already discussed our technique with Maxeler Technologies, who are interested in applying our initial prototype to their software (please see the attached letter of support). We also intend to seek further contacts with the software industry, in order to better assess the applicability of our technique, and in particular its integration into the development process.

One of the key ways in which we plan to disseminate our results to the user community is by releasing our initial prototype under an open-source license. This would allow developers from both the commercial and open-source communities to directly apply our technique to their code. Our previous experience releasing our tool KLEE (http://klee.llvm.org) as open-source software should prove beneficial in making sure that our prototype reaches a wide audience. (Since being open-sourced in June 2009, KLEE has been used and extended by a variety of users, with some of these extensions being contributed back to the main branch.)

Communicating the results of our research to a non-academic audience will play an important role in our dissemination strategy. In conjunction with open-sourcing our prototype system, we plan to set up a dedicated website for the project to introduce our research to a general audience. In a similar spirit, we plan to integrate our research into the PI's teaching activities, through group and individual projects offered to undergraduate and Master's students at Imperial College.

Publications

10 25 50
 
Description This grant enabled novel techniques for improving the quality of software updates. One of the distinguishing characteristics of software systems is that they evolve: new patches are committed to software repositories and new versions are released to users on a continuous basis. Unfortunately, many of these changes bring unexpected bugs that break the stability of the system or affect its security. Our techniques comprehensively check these changes, by trying to generate inputs that cover the lines of code added or changed by a patch, and then refining such inputs to expose the new behaviours introduced by the change. The grant has also supported the design of a portfolio-based constraint solving approach for improving some of the techniques mentioned above, and a large-scale empirical study for understanding how large software projects evolve.
Exploitation Route The techniques we designed could be used to improve software reliability, by automatically generating high-coverage test inputs and finding errors.
Sectors Digital/Communication/Information Technologies (including Software)

URL http://srg.doc.ic.ac.uk/projects
 
Description We have applied our tools to several popular open-source software systems, where we have detected several errors, which were reported to the developers of these systems. Our techniques have garnered considerable attention, facilitating new work on increasing software reliability, particularly in the area of patch testing. We qualified as finalists to the EIT ICT Labs Idea Challenge and presented our ideas to an audience of experts from industry, government and academia. We were invited to discuss our techniques in an IEEE Software Blog, which has attracted attention from the industry.
First Year Of Impact 2012
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic

 
Title Covrig data set 
Description Data associated with an empirical study examining how code and tests co-evolve in 6 popular open-source systems. We report the main characteristics of software patches, analyse the evolution of program and patch coverage, assess the impact of nondeterminism on the execution of the test suite, and investigate whether the coverage of code containing bugs and bug fixes is higher than average. 
Type Of Material Database/Collection of data 
Year Produced 2014 
Provided To Others? Yes  
Impact The associated paper was published in the International Symposium on Software Testing and Analysis (ISSTA 2014). The associated artifact has won the Best Artifact Award at the conference. 
URL http://srg.doc.ic.ac.uk/projects/covrig/
 
Title KATCH 
Description Combines dynamic symbolic execution with several novel heuristics based on program analysis to automatically and comprehensively test code patches. 
Type Of Technology Software 
Year Produced 2013 
Open Source License? Yes  
Impact Received a Distinguished Artifact award at the Joint Meeting of the European Software Engineering Conference and ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2013) 
URL http://srg.doc.ic.ac.uk/projects/katch/
 
Title KLEE-MultiSolver 
Description An extension to the KLEE symbolic execution engine that uses the metaSMT framework to add support for multiple SMT solvers (Boolector, STP and Z3). 
Type Of Technology Software 
Year Produced 2013 
Open Source License? Yes  
Impact Published in the proceedings of the International Conference on Computer Aided Verification (CAV 2013). Used by academic and industry teams as part of the KLEE symbolic execution framework. 
URL http://srg.doc.ic.ac.uk/projects/klee-multisolver/
 
Title Shadow 
Description A symbolic execution tool that is designed to generate test inputs that cover the new program behaviours introduced by a patch. The technique works by executing both the old and the new version in the same symbolic execution instance, with the old version shadowing the new one. During this combined shadow execution, whenever a branch point is reached where the old and the new versions diverge, we generate a test case exercising the divergence and comprehensively test the new behaviours of the new version. 
Type Of Technology Software 
Year Produced 2016 
Impact The associated research was published in the International Conference on Software Engineering (ICSE 2016) and ACM Transactions on Software Engineering and Methodology (TOSEM 2018). IEEE has invited us to write a blog post which discusses this work. The work has attracted considerable attention, being reimplemented by another research group for Java, used for hybrid differential testing, and discussed in the context of research on determining the correlation between fault revelation and various forms of coverage; combining bug detection with test suite augmentation; and genetic improvement. 
URL http://srg.doc.ic.ac.uk/projects/shadow/
 
Title ZESTI 
Description A lightweight symbolic execution tool for improving existing test suites by thoroughly checking for errors all sensitive operations executed by a test suite. 
Type Of Technology Software 
Year Produced 2012 
Open Source License? Yes  
Impact The associated paper was published in the proceedings of the International Conference on Software Engineering (ICSE 2012). Used in several research projects. 
URL http://srg.doc.ic.ac.uk/projects/zesti/
 
Description Co-organiser of Dagstuhl seminar on Symbolic Execution and Constraint Solving 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Co-organised a Dagstuhl seminar on Symbolic Execution and Constraint Solving, which brought together experts from both academia, research laboratories, and the industry.
Year(s) Of Engagement Activity 2014
URL https://www.dagstuhl.de/en/program/calendar/semhp/?semnr=14442
 
Description Co-organizer of Shonan Meeting on Fuzzing and Symbolic Execution: Reflections, Challenges, and Opportunities 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact Co-organized a Shonan meeting on fuzzing and symbolic execution, which brought together academic and industry participants from the fuzzing and symbolic execution communities, from across the globe.
Year(s) Of Engagement Activity 2019
URL https://shonan.nii.ac.jp/seminars/160/
 
Description IEEE Software Blog 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact IEEE Software blog post on our research on comprehensively testing software patches with symbolic execution.
Year(s) Of Engagement Activity 2016
URL http://blog.ieeesoftware.org/2016/09/comprehensively-testing-software.html
 
Description INVEST workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Undergraduate students
Results and Impact The Introduction to Verification and Testing (INVEST) workshop takes place in the Department of Computing, Imperial College London on an annual basis, jointly organised by Cadar, Donaldson and Gardner. The aim of the workshop is to introduce young researchers, and students potentially interested in research careers, to the fields of software verification and testing.
Year(s) Of Engagement Activity 2014,2015,2017
URL http://invest.doc.ic.ac.uk
 
Description Outreach to school groups 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact Outreach talk to schools on software bugs are possible mitigations.
Year(s) Of Engagement Activity 2019
URL https://www.eventbrite.co.uk/e/software-bugs-why-they-exist-and-how-to-fight-them-tickets-6164483435...