Verification of the optimizing phase of a compiler

Lead Research Organisation: University of Warwick
Department Name: Computer Science

Abstract

Computer programs are sets of instructions to make a computer perform a certain task. We write these programs in programming languages that are easy for humans to understand. Computers, however, are simple and want very simple instructions to run (for example, ones that involve just moving, adding or subtracting numbers). So to get programs to run, we need to convert our human readable programs into these simple instructions. To do this we use another computer program called a compiler.It is important that this conversion process does not go wrong - otherwise the computer will not do what we expect it to do. However, compilers are very complicated programs, sometimes having to deal with millions of instructions; it is very hard to know whether the translation is going wrong. Also, most compilers are clever and attempt to improve the program as it is translated. They try to make the program more efficient and that complicates things even further / making it even harder to know whether the translation is correct or not.Luckily, computer programs can be viewed as mathematical objects (like numbers, formulas and equations) and therefore we can prove things about them. This research aims to find ways to prove that compilers do not go wrong - they always do a correct translation. In particular, the research looks at how to prove this even when the compiler is trying to improve the program.

Publications

10 25 50
publication icon
Kalvala S (2009) Program transformations using temporal logic side conditions in ACM Transactions on Programming Languages and Systems

publication icon
Warburton R (2009) Compiler Construction

 
Description Computer programs are usually written in what are known as high-level programming languages, such as Java, C, or C++. For these to actually run on a computer, they need to first be translated into a language which can be understood by the hardware, in a process known as compilation.



Compilers are complex programs, which need to ensure not only a correct translation, but also the translation into fast and efficient code. The more one tries to design the compiler in such as a way that the generated code is fast and efficient, the more difficult it becomes to ensure that the translation will always be correct.



In this project we have addressed this issue of improving the code generated by a compiler while having some assurances that this process will not introduce new bugs. To do this we capture the program once it has already been partially translated by a compiler, into what is known as an intermediate representation. We then check very carefully whether the program satisfies some properties which will allow us to change it without changing what the program does. The way one can write down clearly what the program should do, as well as the conditions which need to be satisfied by transformations. is based on a logical, mathematical notation, specifically temporal logic. We have designed this notation to be both expressive as well as capable of allowing the transformations to be applied automatically.



We have shown that this approach actually works, by writing a tool which fits in within a compiler already in use. The SOOT Framework is a set of packages to manipulate Java programs. Our tool takes an intermediate representation of a program which is generated by SOOT, as well as a file containing the transformations we may want to apply on the program (written in the rigorous, logical language), and checks if the transformations can be applied safely, using a technology known as model checking. The model checking not only confirms whether a transformation can be applied, but it also returns information which can help in generating the new, better code. The tool then uses the output from the model checker and the original code to generate the corresponding optimized code.



The methodology we have developed is general: instead of restricting ourselves to some of the well known transformations (or optimizations), our goal was to develop a framework where software development teams could think of some transformations themselves, and then use our tools both to ensure the correctness of the transformations and to apply them to whichever programs they wished.



Our goal is to be able to semi-automatically prove that the transformations are indeed safe. While we have been able to do so for many of the transformations with which we started out this project, we have decided to also try to capture some very advanced optimizations, particularly used in transforming code to use modern, multi-core computers effectively. For this we have had to make the formal language more powerful and flexible. We are in the process of extending our language and proof tools to support these more advanced optimizations.



We have also been exploring how the methodology can be applied at another important step in program development, namely in removing bugs from the original programs. The difference here is that the methodology cannot just generate any new code as long as it is equivalent to the original one: the new code must look like the original one with the changes made easy to understand and easy to justify as fixing particular bugs. The language for specifying program properties and transformations will have to be slightly different, but we are developing it using our original results as inspiration.
Exploitation Route The main task is to integrate the methodology into existing compiler tools with greater user-base. At the moment the student who was trained in the project is working in industry and is very active in the Java Developers community, having even published a book on the subject. This will help to indirectly improve the software development technology across a large population.
Sectors Aerospace, Defence and Marine,Creative Economy,Digital/Communication/Information Technologies (including Software),Electronics

 
Description The methodology developed have taken the issue of formalizing program transformations forward and have resulted in further training of research students too.
First Year Of Impact 2011
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Economic