Geometric Abstractions for Scalable Program Analyzers

Lead Research Organisation: University of Leeds
Department Name: Sch of Computing

Abstract

It is widely acknowledged that relatively small defects in software can have a substantial cost both for producers and consumers. For example, system vulnerabilities are frequently introduced by programming mistakes such as allowing out of bounds accesses to buffers, overflows in operations on native integers and other errors related to memory management. Of course, there can be other causes, such as system design flaws, but finding and certifying the absence of the low-level bugs is an important prerequisite to building secure and reliable software. The approach we use to detect and locate programming errors or certify the absence of such bugs is that of static analysis; that is, the determination of correct though approximate information about the program's values at each program step. Static analysis has its roots in compiler optimization where the analysis time has to be kept very low while the properties of interest are fixed with respect to the compiler. More recently program analyzers have been developed for program verification; however these also consider a fixed set of possible run-time errors and aim for a scalability and performance that enables them to tackle very large programs.Static analysis uses abstract domains for representing information that needs to be collected. Thus these domains have to provide a convenient but approximate representation of the accumulated information during the abstract evaluation of a program. Observe that the abstract domain component of a static analyzer has to include, not only a computer representation of the logical properties of interest, but also the operations needed to extract this information from the program's components, primitives for propagating this information forward and/or backward within the program, and operators for accelerating the analysis process and ensuring loop iterations actually terminate.Since, many program properties of interest are intrinsically numeric, there has been a considerable amount of research on how this kind of information can be represented efficiently and precisely by means of geometric domains. The problem being to get the right efficiency/precision trade-off, which is difficult since this is clearly dependent on the application. Thus many geometric domains have been proposed and researched, the majority being defined by linear (i.e., planar) bounds such as polyhedra; octagons; boxes, also known as intervals; and grids, simple forms of which are also called lattices. Such a range is needed since domains such as polyhedra, although very precise, have high complexity and exponential space requirements (relative to the number of dimensions) while simpler domains such as octagons and grids are polynomial and the non-relational domain of boxes has linear complexity.Solving this scalability problem is the main motivation for this project; here we will research new techniques for building compound geometric domains that can be constructed from several atomic ones such as those discussed above. In order to allow for varying the efficiency/precision trade-off, not only will it be parametrized on the component domains but it will also have a highly adjustable strategy for varying the kind and amount of communication between them. Thus a successful project will provide bespoke domains that are tailored for the application, allowing for both the type of property being verified and the size and complexity of the software being analyzed.
 
Description We have discovered and developed new ways of analysing programs before they are run. This has informed and further developed the Parma Polyhedral Library (http://bugseng.com/products/ppl)
Exploitation Route Our research and software is open source and can be used in a variety of applications. E.g. see http://bugseng.com/products/ppl/applications which lists applications using the Parma Polyhedral Library developed with this and other funding. A spinout (http://bugseng.com/home) has been created to help exploit the findings. The main product of the spinout developed by BUGSENG is ECLAIR, a flexible and extensible framework for the analysis of C and C++ source code. Among its many applications we have the automatic detection of violations of coding rules, either defined by the user or included in various coding standards (like MISRA C/C++, High-Integrity C++, CERT C/C++, JSF C++, Netrino Embedded C).
Sectors Digital/Communication/Information Technologies (including Software)

URL http://bugseng.com/products/ppl
 
Description Our findings have been used in the Parma Polyhedral Library (http://bugseng.com/products/ppl) which is being exploited by the spinout BUGSENG (http://bugseng.com).
First Year Of Impact 2010
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic

 
Description Parma 
Organisation University of Parma
Country Italy 
Sector Academic/University 
PI Contribution Academic collaboration including jointly developed software, research and publications.
Collaborator Contribution Academic collaboration including jointly developed software, research and publications. The software company BUGSENG SRL resulting from the collaboration is a spin-off of University of Parma.
Impact See other sections.
Start Year 2006
 
Title Parma Polyhedral Library version 1.1 
Description The Parma Polyhedra Library (PPL) provides numerical abstractions especially targeted at applications in the field of analysis and verification of complex systems. 
Type Of Technology Software 
Year Produced 2013 
Open Source License? Yes  
Impact The PPL was originally released in 2001. 
URL http://bugseng.com/products/ppl
 
Company Name BUGSENG S.R.L. 
Description A spin-off of the University of Parma, BUGSENG is built on 15 years of research on techniques for automated software analysis and verification techniques. The main product developed by BUGSENG is ECLAIR, a flexible and extensible framework for the analysis of C and C++ source code. Among its many applications we have the automatic detection of violations of coding rules, either defined by the user or included in various coding standards (like MISRA C/C++, High-Integrity C++, CERT C/C++, JSF C++, Netrino Embedded C). 
Year Established 2010 
Impact CERN, the European Organization for Nuclear Research, one of the world's largest and most respected centres for scientific research uses ECLAIR software verification platform with a specific instantiation designed to automatically check the ROOT project coding conventions. Other customers include large multinational organizations.
Website http://bugseng.com/home