SEBASE: Software Engineering By Automated SEarch

Lead Research Organisation: University of York
Department Name: Computer Science


Current software engineering practice is a human-led search for solutions which meet needs and constraints under limited resources. Often there will be conflict, both between and within functional and non-functional criteria. Naturally, like other engineers, we search for a near optimal solution. As systems get bigger, more distributed, more dynamic and more critical, this labour-intensive search will hit fundamental limits. We will not be able to continue to develop, operate and maintain systems in the traditional way, without automating or partly automating the search for near optimal solutions. Automated search based solutions have a track record of success in other engineering disciplines, characterised by a large number of potential solutions, where there are many complex, competing and conflicting constraints and where construction of a perfect solution is either impossible or impractical. The SEMINAL network demonstrated that these techniques provide robust, cost-effective and high quality solutions for several problems in software engineering. Successes to date can be seen as strong pointers to search having great potential to serve as an overarching solution paradigm. The SEBASE project aims to provide a new approach to the way in which software engineering is understood and practised. It will move software engineering problems from human-based search to machine-based search. As a result, human effort will move up the abstraction chain, to focus on guiding the automated search, rather than performing it. This project will address key issues in software engineering, including scalability, robustness, reliability and stability. It will also study theoretical foundations of search algorithms and apply the insights gained to develop more effective and efficient search algorithms for large and complex software engineering problems. Such insights will have a major impact on the search algorithm community as well as the software engineering community.


10 25 50
Description The principal findings from York's work are given below. We have shown that metaheuristic search is an important and potentially very powerful technique for solving a variety of problems in software engineering:

a) It can be used to evolve programs with multiple success criteria, e.g. we can engineer tradeoffs between functional criteria and non-functional criteria. Low resource programs can be attained at the cost of functional 'correctness'. Search techniques can find improvements to programs, e.g. making changes that bring about better performance whilst preserving semantics.

b) It is possible to use evolutionary computation to find test sequences in concurrent models that find faults.

c) Search based approaches can be used to synthesise real time systems architectures that meet all timing requirements and which are also resilient to future change.

d) Search based approaches can be used to synthesise test data strategies meeting a variety of criteria.

e) Search based approaches can be used to evolve programs that define high performing RF pulse sequences (for NMR spectroscopy).

f) Search based approaches can be used to synthesise probabilistic algorithms (couched in the PRISM notation) that trade off multiple criteria.

g) We defined a new means of tuning optimisation algorithms with reference to extant experimental usage elsewhere, in particular using response surface methodologies.

h) The demonstration (winning best paper at GECCO 2011 SBSE) of how genetic programming and mutation testing can
combine to automatically synthesise useful (and indeed characteristic) invariants. This involves testing putative invariants by evaluating them on the original and also mutant programs. If a synthesised candidate invariant holds only on the original it is deemed to capture something very special and intrinsic to that program.

i) A demonstration that search can be used to falsify automatically generated candidate invariants that are not actually true. (Several techniques generate candidates that are consistent with all test traces seen. This does not mean that they are actually invariants; they need to be eliminated. )

j) A rigorous evaluation of the role played by crossover and mutation in genetic programming.
Exploitation Route There is a good deal to exploit. Academically we are already doing so via the DAASE Programme Grant (with UCL, Birmingham and Stirling).

The work can be applied to stress testing in general, with some application to security stress testing.

We believe our work lays the foundations for eventual applications in multicore testing (e.g. showing that search can severely stress underlying latencies of interconnect fabric use).

Further application to automated invariant synthesis would seem a likely promising candidate for further investigation. We are conscious that our mutation inspired notion of determining characteristicness is only one possible approach; others can surely be found.
Sectors Aerospace, Defence and Marine,Digital/Communication/Information Technologies (including Software),Security and Diplomacy

Description Our testing work has been applied in a CDE project funded by DSTL (2012). We are at early stages at the moment with further exploitation. Some of the work has already been proposed to DSTL (included in October 2014) with a view to testing multicore systems using search based approaches. Our work continued under the current DAASE Propgramme Grant.
First Year Of Impact 2012
Sector Aerospace, Defence and Marine,Digital/Communication/Information Technologies (including Software)
Impact Types Economic

Description Centre for Defence Enterprise
Amount £56,361 (GBP)
Funding ID CDE22999 
Organisation Defence Science & Technology Laboratory (DSTL) 
Department Centre for Defence Enterprise
Sector Public
Country United Kingdom
Start 01/2012 
End 05/2012
Description Programme Grant
Amount £6,834,903 (GBP)
Funding ID EP/J017515/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 06/2012 
End 05/2018
Description Collaboration with Chemistry Department on NMR and Search Based Approaches 
Organisation University of York
Department Department of Chemistry
Country United Kingdom 
Sector Academic/University 
PI Contribution Sebald and I supervised Bechman (RA).
Collaborator Contribution They brought NMR expertise and I brought search based expertise. Though we had collaborated previously this really was a major fusion of two disciplines facilitated by the SEBASE award.
Impact Interdisciplinary. Paper in Journal of Magnetic Resonance A Computer Science-Chemistry collaboration. Genetic algorithms are used to generate exceptionally high performing RF pulse sequences for use in Magic Angle Spinning (MAS) NMR spectroscopy. Breaking the universally adopted assumption of rotorsynchronicity (where the sequence length must map onto a multiple of the spinning period) allows sequences with properties exceeding theoretical bounds (based on approximations via perturbation theory, the most widely analytic paradigm) to be found. Opens up a wealth of further evocomp application. Proceeds via extensive simulation AND experimental validation using suitable model compounds. Also winner of $2000 and Bronze medal at the Human Competitive Awards at GECCO 2013.
Start Year 2011
Description DSTL Staff PhD: Investigating the Reliability of Criteria 
Organisation Defence Science & Technology Laboratory (DSTL)
Country United Kingdom 
Sector Public 
PI Contribution I was supervisor to a DSTL staff member who carried out PhD Research at York on Reliability of Testing Criteria.
Collaborator Contribution The DSTL staff member carried out the research with Clark as supervisor. The work used some search based technqiues to minimise test sets to meet particular test criteria.
Impact PhD Thesis. Hadley, Mark (2013). Empirical Evaluation of the Effectiveness and Reliability of Software Testing Adequacy Criteria and Reference Test Systems. PhD thesis, University of York. URL is given above. Published paper: Mark Hadley and John A. Clark. Good Days and Bad Days: Investigating Effectiveness and Reliability of "Optimum" Test Sets. Mutation 2013.
Start Year 2007