SSI: The UK Software Sustainability Institute

Lead Research Organisation: University of Edinburgh
Department Name: Edinburgh Parallel Computing Centre

Abstract

This proposal seeks support for the next five years for the UK Software Sustainability Institute (SSI), which will work in partnership with research communities to identify key software that needs to be sustained.
Researchers today from many different disciplines rely on software to carry out high quality research. This software must be able to serve the changing needs of the researcher to remain relevant. It must be able to change, adapt, and travel with the researcher on the journey of exploration, innovation and discovery which embodies leading research.
Software sustainability is the key to this journey; it is essential that software used in research is managed beyond the lifetime of its original funding cycle, and strengthened, adapted and customised for adoption by researchers within the founding community and outside, in other communities, to create an impact which can be transferred to future generations of researchers.
The sustainability of important research software requires partnership. Only through close collaboration between software developers and scientists can advances be made both in software development and domain sciences. This is not a radical view. One of the most successful aspects of Agile development methodologies is a focus on ongoing interactions between developers and customers: business people and developers must work together daily throughout the project. The benefits of Agile methods for scientific software development are greater than for business software, and our experience through our work as OMII-UK underlines this emphatically. Through programmes such as ENGAGE and eDIKT2 we have worked in partnership with researchers to improve their software. As each community's requirements are different, the system we have developed is well suited to communities and software of varying maturity: one size does not fit all.
We propose a Software Sustainability Institute for the UK which will establish a national focal point - a facility for research software users and developers - based around specialist software engineering skills for driving the continued improvement and impact of research software, using as its foundation the concept of Practical Peer Partnerships: bringing together key research groups within the UK with the skilled team members from our consortium to deliver a series of genuine partnerships focused on the improvement of vital research software through consultative advice, such as test design or help with development tools; collaborative partnerships, e.g. by helping with code refactoring; and long-term engagement, for cultivating the relationships which grow communities and harness the momentum of other activities. This will build on the partners' existing connections to many different research communities, and to infrastructure initiatives such as the NGS, EGI, NESSI and PRACE.
Our approach will ensure that software used in research is managed beyond the lifetime of its original funding cycle: it will be strengthened, adapted and customised for adoption by researchers, within the founding community and elsewhere, to create an impact which can be built on by future generations of researchers - ultimately to deliver new high quality research.

Planned Impact

Who will benefit from this work, and how?
- academic researchers will benefit from software - both their own and that developed by others - that can be relied on and can be used as the basis of their research.
- researchers and developers who develop or extend software will have a facility that will assist them with the maintenance, expansion, exploitation and community development of their codes for the benefit of themselves and for others in the UK, and to make a wider impact internationally. We will explore a range of exploitation and valorisation channels and in particular open source development. Open development offers more than free software. Open development offers new collaborative possibilities for software users as well as software developers.
- commercial and public sector researchers will have access to more robust software from the research sector, with the potential and incentive to contribute back. The SSI can achieve this by securing the software that underpins innovations and investigation and by maximising the ability of researchers to take up software developed by others for the benefit of the UK as a whole. We already have significant links to industry and would pursue the exploitation of software under the remit of the SSI in order to get additional value from the commercial and public sector.
- policy makers will benefit from the support of software which is used to define policy. In this proposal we show support from groups supporting decision makers in the areas of climate change, social mobility and changing populations, and transport / environmental pollution policy. Policy makers for research and innovation - national and international - will benefit from a facility that has expertise in the issues underpinning software sustainability, experience in the field and a direct channel to the research community of contributors and users.
- the wider public will benefit from the results of research which has a direct or indirect impact on their lives. In this proposal we show support from groups providing software that will lead to research in biofuels, climate models, cancer research, clinical trials, criminology and crop research, ultimately impacting this nation's health and wealth.
Opportunity to benefit
The SSI proposal is one of collaboration and partnership. Our proposed model of delivering software sustainability through practical peer partnerships between engineers and researchers is the best way to achieve value and impact. A range of schemes - consultancy, collaborative projects, support, networks, focus groups of user PALs, research nodes - built on the foundations of our experience and pre-existing collaborations are all there to ensure that researchers benefit. They will deliver high volumes of knowledge transfer and engagement to achieve wide-scale sustainability of research software. There is specific budget for the secondment of research staff in the field. We will leverage our connections with international organisations to amplify the impact of the software we help sustain. We will promote the importance of software sustainability using our strong presence in: technical standards bodies; scientific standards initiatives; major scientific networks; major international projects impacting UK communities in Europe and the USA; and major international initiatives. We will build on our experience and encourage and assist key scientific software groups to adopt better development methods, identify and eliminate duplicate activity, bring together islands of expertise to create critical mass in the community, foster the integration of similar software products and facilitate a fuller dialogue between developers and users.
In doing so, the SSI will foster global economic performance, prevent wasteful reinvention, improve returns on initial research investments and ultimately improve the competitiveness of the UK by making our researchers more innovative and productive.

 
Description The UK Software Sustainability Institute was set up to address concerns held in 2009 about the quality of research software. Since then, it has played a major role both in understanding the issues around research software and in addressing them. This has been done in conjunction with over 160 different collaborators to create a diverse set of resources, services, training and best practice to improve research software in the UK.
Exploitation Route Our training materials and guides are licensed under a Creative Commons licence so that others may put them to use freely, as they have through initiatives like Software Carpentry (of which we are the UK coordinators).

The guidance on policy, such as software management plans, software licensing, software publishing and software sustainability evaluations, has been taken forward to improve the guidelines issued by funders such as EPSRC, ESRC and Wellcome Trust.

Finally, we expect the research we have undertaken on understanding the scale of the use of research software in the UK in 2014 to be taken forward by other groups studying the effect of initiatives like training on the ability of the sector to develop stable, reliable software.
Sectors Aerospace, Defence and Marine; Agriculture, Food and Drink; Chemicals; Creative Economy; Digital/Communication/Information Technologies (including Software); Education; Electronics; Energy; Environment; Government, Democracy and Justice; Manufacturing, including Industrial Biotechnology; Culture, Heritage, Museums and Collections; Pharmaceuticals and Medical Biotechnology; Transport; Other

URL http://www.software.ac.uk/
 
Description The work carried out by the Software Sustainability Institute has had three principal pathways to impact beyond academia. 1) Our work on aspects of software sustainability, software preservation, software development policy and software development process has been used by non-academic organisations, including the International Atomic Energy Agency and Mozilla, to improve the effectiveness of the development and the efficiency of the maintenance of specialist software. Since 2021, this has also included collaboration on training with AstraZeneca. 2) The groups we have worked with directly to improve their development of specialist research software have in turn had societal and economic impact as a result of the new research that has been enabled, including the ability to study the effects of anti-viral drugs, biomass yield from UK woodlands, and the habits of UK city dwellers. 3) Our work to define, establish and support the role and career path of the Research Software Engineer has been picked up by industry, notably by Microsoft, and is being used to establish better career paths for those seeking to move between software development and software engineering positions in academia and industry. It has also led to the creation of the Society of Research Software Engineering, a new professional society. Additionally, our Fellowship Programme has had a significant effect on research culture. Work by Robin Wilson (2013 Fellow) on CITATION files, funded as part of this grant, has informed new functionality released in 2021 by GitHub to support software citation. Work by Stephen Eglen (2014 Fellow), supported by this grant, has led to the creation of CODECHECK, which was used to verify COVID modelling simulation code in 2021.
First Year Of Impact 2013
Sector Agriculture, Food and Drink; Chemicals; Digital/Communication/Information Technologies (including Software); Education; Energy; Environment; Healthcare; Government, Democracy and Justice; Culture, Heritage, Museums and Collections; Pharmaceuticals and Medical Biotechnology
Impact Types Cultural, Societal, Economic, Policy & public services

 
Description A Consultation on Proposals for Long-Term Capital Investment in Science & Research
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
 
Description House of Lords Select Committee on Science and Technology inquiry into Scientific Infrastructure
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
Impact Understanding the requirement for adequate effort and skill resourcing for capital investment.
URL http://www.parliament.uk/business/committees/committees-a-z/lords-select/science-and-technology-comm...
 
Description Independent review of the role of metrics in research assessment
Geographic Reach National 
Policy Influence Type Contribution to a national consultation/review
 
Description Amazon Web Services in Education Research Grant
Amount $10,000 (USD)
Organisation Amazon.com 
Sector Private
Country United States
Start 03/2012 
End 02/2014
 
Description Microsoft Azure Research Award
Amount $20,000 (USD)
Organisation Microsoft Research 
Sector Private
Country Global
Start 03/2015 
End 03/2016
 
Description Rapport: Robust Application Porting for HPC in the Cloud.
Amount £87,077 (GBP)
Funding ID EP/I034246/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 02/2011 
End 10/2011
 
Description SI2-CHE: Development and Deployment of Chemical Software for Advanced Potential Energy Surfaces
Amount £361,289 (GBP)
Funding ID EP/K040138/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 04/2013 
End 04/2016
 
Description Standard Research
Amount £80,263 (GBP)
Funding ID EP/N028902/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2016 
End 01/2019
 
Description The Software Sustainability Institute: Phase 2
Amount £3,511,602 (GBP)
Funding ID EP/N006410/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 06/2015 
End 05/2019
 
Description The UK Software Sustainability Institute: Phase 3
Amount £6,599,477 (GBP)
Funding ID EP/S021779/1 
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 12/2018 
End 11/2023
 
Title Supplemental material for: Morphological phylogenetics evaluated using novel evolutionary simulations 
Description Evolutionary inferences require reliable phylogenies. Morphological data has traditionally been analysed using maximum parsimony, but recent simulation studies have suggested that Bayesian analyses yield more accurate trees. This debate is ongoing, in part, because of ambiguity over modes of morphological evolution and a lack of appropriate models. Here we investigate phylogenetic methods using two novel simulation models - one in which morphological characters evolve stochastically along lineages and another in which individuals undergo selection. Both models generate character data and lineage splitting simultaneously: the resulting trees are an emergent property, rather than a fixed parameter. Standard consensus methods for Bayesian searches (Mki) yield fewer incorrect nodes and quartets than the standard consensus trees recovered using equal weighting and implied weighting parsimony searches. Distances between the pool of derived trees (most parsimonious or posterior distribution) and the true trees - measured using Robinson-Foulds (RF), subtree prune and regraft (SPR), and tree bisection reconnection (TBR) metrics - demonstrate that this is related to the search strategy and consensus method of each technique. The amount and structure of homoplasy in character data differs between models. Morphological coherence, which has previously not been considered in this context, proves to be a more important factor for phylogenetic accuracy than homoplasy. Selection-based models exhibit relatively lower homoplasy, lower morphological coherence, and higher inaccuracy in inferred trees. Selection is a dominant driver of morphological evolution, but we demonstrate that it has a confounding effect on numerous character properties which are fundamental to phylogenetic inference. 
We suggest that the current debate should move beyond considerations of parsimony versus Bayesian, towards identifying modes of morphological evolution and using these to build models for probabilistic search methods. 
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL http://datadryad.org/stash/dataset/doi:10.5061/dryad.4b8gtht8h
 
Title UK Research Software Survey 2014 
Description This spreadsheet contains the anonymised data collected as part of a survey of UK researchers in their use of research software. 
Type Of Material Database/Collection of data 
Year Produced 2014 
Provided To Others? Yes  
Impact Substantial public coverage of results of survey. Additional research looking at gender and diversity issues relating to data. 
URL https://zenodo.org/record/14809?ln=en#.VuAtV5OLTeQ
 
Title UK Research Software Survey 2014 
Description This spreadsheet contains the anonymised data collected as part of a survey of UK researchers in their use of research software. We asked people specifically about "research software" which we defined as: "Software that is used to generate, process or analyse results that you intend to appear in a publication (either in a journal, conference paper, monograph, book or thesis). Research software can be anything from a few lines of code written by yourself, to a professionally developed software package. Software that does not generate, process or analyse results - such as word processing software, or the use of a web search - does not count as 'research software' for the purposes of this survey." We contacted 1,000 randomly selected researchers at each of 15 Russell Group universities. From the 15,000 invitations to complete the survey, we received 417 responses - a rate of 3% which is fairly normal for a blind survey. We used Google Forms to collect responses. The responses have good representation from across the disciplines, seniorities and genders. This is a statistically significant number of responses that can be used to represent the views of people in research-intensive universities in the UK. An overview of the data is available on the worksheet "Summary data". Responses to questions are ordered by unique respondent ID. Please read the "README" worksheet for additional information about the collection and processing of this data. This survey data is licensed under a Creative Commons by Attribution licence. Copyright resides with The University of Edinburgh on behalf of the Software Sustainability Institute. 
Type Of Material Database/Collection of data 
Year Produced 2015 
Provided To Others? Yes  
 
Title hapbin: An efficient program for performing haplotype based scans for positive selection in large genomic datasets 
Description These files contain genome-wide integrated haplotype scores (iHS) for each of the 26 populations in the phase 3 release of the 1000 genomes project. iHS were calculated using the hapbin program that can be downloaded from https://github.com/evotools/hapbin. The 1000 genomes phased haplotypes were obtained from mathgen.stats.ox.ac.uk/impute and hapbin was run with default parameters. The iHS are provided in two formats: BED and bedGraph. For each SNP the unstandardised iHS was calculated as ln(iHH1/iHH0) and these values were normalised using hapbin's default parameters. If the normalised iHS was negative, the absolute value is reported and this is indicated by a 1 in the fourth column following the ":" of the BED format file. The bedGraph formatted data can be easily viewed along the genome at the UCSC genome browser by specifying the URL to the corresponding file at http://genome-euro.ucsc.edu/cgi-bin/hgCustom?clade=mammal&org=Human&db=hg37. 
Type Of Material Database/Collection of data 
Year Produced 2015 
Provided To Others? Yes  
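The scoring described for this dataset can be illustrated with a minimal sketch. It is not hapbin's code: the function names are hypothetical, the normalisation step is left out because the description only says that hapbin's defaults were used, and the use of 0 as the flag for non-negative scores is an assumption (the description only states that 1 marks a negative score).

```python
import math


def unstandardised_ihs(ihh1, ihh0):
    """Unstandardised iHS = ln(iHH1 / iHH0), as given in the description."""
    if ihh1 <= 0 or ihh0 <= 0:
        raise ValueError("iHH values must be positive")
    return math.log(ihh1 / ihh0)


def encode_sign(normalised_ihs):
    """Return (absolute score, flag) for an already normalised iHS.

    The flag is 1 when the normalised score was negative. Using 0 for
    non-negative scores is an assumption made for this sketch.
    """
    flag = 1 if normalised_ihs < 0 else 0
    return abs(normalised_ihs), flag
```

For example, encode_sign(-1.5) gives the absolute score 1.5 with flag 1, mirroring how negative scores are reported in the BED files.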
 
Description Data Carpentry 
Organisation Data Carpentry
Country United States 
Sector Charity/Non Profit 
PI Contribution Coordination of Data Carpentry training events in the UK. Training of Data Carpentry instructors. Contribution of training material.
Collaborator Contribution Production of training materials. Provision of central administrative infrastructure.
Impact Multi-disciplinary. Training of hundreds of researchers in basic data management and analysis skills.
Start Year 2015
 
Description Software Carpentry Foundation 
Organisation Software Carpentry Foundation
Country United States 
Sector Charity/Non Profit 
PI Contribution We are acting as the UK coordinators for Software Carpentry courses to teach researchers computing skills. Neil Chue Hong and Carole Goble were invited to join the board of the SCF.
Collaborator Contribution The SCF provides materials and organises instructor training. We have therefore benefitted from the resources developed by partners within the Software Carpentry Foundation.
Impact Over 30 workshops and 1000 learners trained in the UK across multiple disciplines.
Start Year 2012
 
Title Software Evaluation Service 
Description An online tool which gives a software author the opportunity to review the main issues that affect the sustainability of their software. At the end of the evaluation, a report is generated and emailed to them with tailored sustainability advice. 
Type Of Technology Webtool/Application 
Year Produced 2013 
Impact Over 100 researchers have conducted evaluations of their software using the tool. 
URL http://www.software.ac.uk/online-sustainability-evaluation
 
Title Software Management Plans 
Description An extension of the DMPOnline webtool to allow for the creation and management of software management plans. 
Type Of Technology Webtool/Application 
Year Produced 2016 
Impact This has enabled the widely referenced DMPOnline tool developed and hosted by the Digital Curation Centre to be applied to software, and forms the basis of upcoming guidance for research funder software calls. 
URL https://www.software.ac.uk/software-management-plans