SSI: The UK Software Sustainability Institute
Lead Research Organisation:
University of Edinburgh
Department Name: Edinburgh Parallel Computing Centre
Abstract
This proposal seeks five years of support for the UK Software Sustainability Institute (SSI), which will work in partnership with research communities to identify key software that needs to be sustained.
Researchers today from many different disciplines rely on software to carry out high quality research. This software must be able to serve the changing needs of the researcher to remain relevant. It must be able to change, adapt, and travel with the researcher on the journey of exploration, innovation and discovery which embodies leading research.
Software sustainability is the key to this journey; it is essential that software used in research is managed beyond the lifetime of its original funding cycle, and strengthened, adapted and customised for adoption by researchers within the founding community and outside, in other communities, to create an impact which can be transferred to future generations of researchers.
The sustainability of important research software requires partnership. Only through close collaboration between software developers and scientists can advances be made both in software development and domain sciences. This is not a radical view. One of the most successful aspects of Agile development methodologies is a focus on ongoing interactions between developers and customers: business people and developers must work together daily throughout the project. The benefits of Agile methods for scientific software development are greater than for business software, and our experiences through our work as OMII-UK underline this emphatically. Through programmes such as ENGAGE and eDIKT2 we have worked in partnership with researchers to improve their software. As each community's requirements are different, the system we have developed is well suited to communities and software of varying maturity: one size does not fit all.
We propose a Software Sustainability Institute for the UK which will establish a national focal point - a facility for research software users and developers - based around specialist software engineering skills for driving the continued improvement and impact of research software, using as its foundation the concept of Practical Peer Partnerships: bringing together key research groups within the UK with the skilled team members from our consortium to deliver a series of genuine partnerships focused on the improvement of vital research software through consultative advice, such as test design or help with development tools; collaborative partnerships, e.g. by helping with code refactoring; and long-term engagement, for cultivating the relationships which grow communities and harness the momentum of other activities. This will build on the partners' existing connections to many different research communities, and to infrastructure initiatives such as the NGS, EGI, NESSI and PRACE.
Our approach will ensure that software used in research is managed beyond the lifetime of its original funding cycle: it will be strengthened, adapted and customised for adoption by researchers, within the founding community and elsewhere, to create an impact which can be built on by future generations of researchers - ultimately to deliver new high quality research.
Planned Impact
Who will benefit from this work, and how?
- academic researchers will benefit from software - both their own and that developed by others - that can be relied on and can be used as the basis of their research.
- researchers and developers who develop or extend software will have a facility that will assist them with the maintenance, expansion, exploitation and community development of their codes for the benefit of themselves and for others in the UK, and to make a wider impact internationally. We will explore a range of exploitation and valorisation channels and in particular open source development. Open development offers more than free software: it offers new collaborative possibilities for software users as well as software developers.
- commercial and public sector researchers will have access to more robust software from the research sector, with the potential and incentive to contribute back. The SSI can achieve this by securing the software that underpins innovations and investigation and by maximising the ability of researchers to take up software developed by others for the benefit of the UK as a whole. We already have significant links to industry and would pursue the exploitation of software under the remit of the SSI in order to get additional value from the commercial and public sector.
- policy makers will benefit from the support of software which is used to define policy. In this proposal we show support from groups supporting decision makers in the areas of climate change, social mobility and changing populations, and transport / environmental pollution policy. Policy makers for research and innovation - national and international - will benefit from a facility that has expertise in the issues underpinning software sustainability, experience in the field, and a direct channel to the research community of contributors and users.
- the wider public will benefit from the results of research which has a direct or indirect impact on their lives. In this proposal we show support from groups providing software that will lead to research in biofuels, climate models, cancer research, clinical trials, criminology and crop research, ultimately impacting this nation's health and wealth.
Opportunity to benefit
The SSI proposal is one of collaboration and partnership. Our proposed model of delivering software sustainability through practical peer partnerships between engineers and researchers is the best way to achieve value and impact. A range of schemes - consultancy, collaborative projects, support, networks, focus groups of user PALs, research nodes - built on the foundations of our experience and pre-existing collaborations are all there to ensure that researchers benefit. They will deliver high volumes of knowledge transfer and engagement to achieve wide-scale sustainability of research software. There is specific budget for the secondment of research staff in the field. We will leverage our connections with international organisations to amplify the impact of the software we help sustain. We will promote the importance of software sustainability using our strong presence in: technical standards bodies; scientific standards initiatives; major scientific networks; major international projects impacting UK communities in Europe and the USA; and major international initiatives. We will build on our experience and encourage and assist key scientific software groups to adopt better development methods, identify and eliminate duplicate activity, bring together islands of expertise to create critical mass in the community, foster the integration of similar software products and facilitate a fuller dialogue between developers and users.
In doing so, the SSI will foster global economic performance, prevent wasteful reinvention, improve returns on initial research investments and ultimately improve the competitiveness of the UK by making our researchers more innovative and productive.
Publications
Aleksic J
(2015)
An Open Science Peer Review Oath
in F1000Research
Basham M
(2015)
Data Analysis WorkbeNch (DAWN).
in Journal of synchrotron radiation
Baxter R
(2011)
Tracking community intelligence with Trac.
in Philosophical transactions. Series A, Mathematical, physical, and engineering sciences
Bergel G
(2020)
Sustaining Digital Humanities in the UK
Budd A
(2015)
A quick guide for building a successful bioinformatics community.
in PLoS computational biology
Description | The UK Software Sustainability Institute was set up to address the concerns raised in 2009 about the quality of research software. Since then, it has played a major role both in understanding the issues around research software and in addressing them. This has been done in conjunction with over 160 different collaborators to create a diverse set of resources, services, training and best practice to improve research software in the UK. |
Exploitation Route | Our training materials and guides are licensed under a Creative Commons licence so that others may put them to use freely, as they have through initiatives like Software Carpentry (of which we are the UK coordinators). Our policy guidance, covering software management plans, software licensing, software publishing and software sustainability evaluations, has been taken forward to improve the guidelines issued by funders such as EPSRC, ESRC and Wellcome Trust. Finally, we expect the research we undertook in 2014 on the scale of research software use in the UK to be taken forward by other groups studying the effect of initiatives like training on the ability of the sector to develop stable, reliable software. |
Sectors | Aerospace, Defence and Marine; Agriculture, Food and Drink; Chemicals; Creative Economy; Digital/Communication/Information Technologies (including Software); Education; Electronics; Energy; Environment; Government, Democracy and Justice; Manufacturing, including Industrial Biotechnology; Culture, Heritage, Museums and Collections; Pharmaceuticals and Medical Biotechnology; Transport; Other |
URL | http://www.software.ac.uk/ |
Description | The work carried out by the Software Sustainability Institute has had three principal pathways to impact beyond academia. 1) Our work on aspects of software sustainability, software preservation, software development policy and software development process has been used by non-academic organisations, including the International Atomic Energy Agency and Mozilla, to improve the effectiveness of the development and efficiency of maintenance of specialist software. Since 2021, this has also included collaboration on training with AstraZeneca. 2) The groups we have worked with directly to improve their development of specialist research software have in turn had societal and economic impact as a result of the new research that has been enabled, including the ability to study the effects of anti-viral drugs, biomass yield from UK woodlands, and the habits of UK city dwellers. 3) Our work to define, establish and support the role and career path of the Research Software Engineer has been picked up by industry, notably by Microsoft, and is being used to establish better career paths for those seeking to move between software development and software engineering positions in academia and industry. It has also led to the creation of the Society of Research Software Engineering, a new professional society. Additionally, our Fellowship Programme has had a significant effect on research culture. Work by Robin Wilson (2013 Fellow) on CITATION files, funded as part of this grant, has informed new functionality released in 2021 by GitHub to support software citation. Work by Stephen Eglen (2014 Fellow), supported by this grant, has led to the creation of CODECHECK, which was used to verify COVID modelling simulation code in 2021. |
First Year Of Impact | 2013 |
Sector | Agriculture, Food and Drink; Chemicals; Digital/Communication/Information Technologies (including Software); Education; Energy; Environment; Healthcare; Government, Democracy and Justice; Culture, Heritage, Museums and Collections; Pharmaceuticals and Medical Biotechnology |
Impact Types | Cultural; Societal; Economic; Policy & public services |
Description | A Consultation on Proposals for Long-Term Capital Investment in Science & Research |
Geographic Reach | National |
Policy Influence Type | Contribution to a national consultation/review |
Description | House of Lords Select Committee on Science and Technology inquiry into Scientific Infrastructure |
Geographic Reach | National |
Policy Influence Type | Contribution to a national consultation/review |
Impact | Understanding of the requirement for adequate effort and skill resourcing for capital investment. |
URL | http://www.parliament.uk/business/committees/committees-a-z/lords-select/science-and-technology-comm... |
Description | Independent review of the role of metrics in research assessment |
Geographic Reach | National |
Policy Influence Type | Contribution to a national consultation/review |
Description | Amazon Web Services in Education Research Grant |
Amount | $10,000 (USD) |
Organisation | Amazon.com |
Sector | Private |
Country | United States |
Start | 03/2012 |
End | 02/2014 |
Description | Microsoft Azure Research Award |
Amount | $20,000 (USD) |
Organisation | Microsoft Research |
Sector | Private |
Country | Global |
Start | 03/2015 |
End | 03/2016 |
Description | Rapport: Robust Application Porting for HPC in the Cloud. |
Amount | £87,077 (GBP) |
Funding ID | EP/I034246/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 02/2011 |
End | 10/2011 |
Description | SI2-CHE: Development and Deployment of Chemical Software for Advanced Potential Energy Surfaces |
Amount | £361,289 (GBP) |
Funding ID | EP/K040138/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 03/2013 |
End | 04/2016 |
Description | Standard Research |
Amount | £80,263 (GBP) |
Funding ID | EP/N028902/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2016 |
End | 01/2019 |
Description | The Software Sustainability Institute: Phase 2 |
Amount | £3,511,602 (GBP) |
Funding ID | EP/N006410/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 05/2015 |
End | 05/2019 |
Description | The UK Software Sustainability Institute: Phase 3 |
Amount | £6,599,477 (GBP) |
Funding ID | EP/S021779/1 |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 12/2018 |
End | 11/2023 |
Title | Supplemental material for: Morphological phylogenetics evaluated using novel evolutionary simulations |
Description | Evolutionary inferences require reliable phylogenies. Morphological data has traditionally been analysed using maximum parsimony, but recent simulation studies have suggested that Bayesian analyses yield more accurate trees. This debate is ongoing, in part, because of ambiguity over modes of morphological evolution and a lack of appropriate models. Here we investigate phylogenetic methods using two novel simulation models - one in which morphological characters evolve stochastically along lineages and another in which individuals undergo selection. Both models generate character data and lineage splitting simultaneously: the resulting trees are an emergent property, rather than a fixed parameter. Standard consensus methods for Bayesian searches (Mki) yield fewer incorrect nodes and quartets than the standard consensus trees recovered using equal weighting and implied weighting parsimony searches. Distances between the pool of derived trees (most parsimonious or posterior distribution) and the true trees - measured using Robinson-Foulds (RF), subtree prune and regraft (SPR), and tree bisection reconnection (TBR) metrics - demonstrate that this is related to the search strategy and consensus method of each technique. The amount and structure of homoplasy in character data differs between models. Morphological coherence, which has previously not been considered in this context, proves to be a more important factor for phylogenetic accuracy than homoplasy. Selection-based models exhibit relatively lower homoplasy, lower morphological coherence, and higher inaccuracy in inferred trees. Selection is a dominant driver of morphological evolution, but we demonstrate that it has a confounding effect on numerous character properties which are fundamental to phylogenetic inference. 
We suggest that the current debate should move beyond considerations of parsimony versus Bayesian, towards identifying modes of morphological evolution and using these to build models for probabilistic search methods. |
Type Of Material | Database/Collection of data |
Year Produced | 2020 |
Provided To Others? | Yes |
URL | http://datadryad.org/stash/dataset/doi:10.5061/dryad.4b8gtht8h |
Title | UK Research Software Survey 2014 |
Description | This spreadsheet contains the anonymised data collected as part of a survey of UK researchers on their use of research software. |
Type Of Material | Database/Collection of data |
Year Produced | 2014 |
Provided To Others? | Yes |
Impact | Substantial public coverage of results of survey. Additional research looking at gender and diversity issues relating to data. |
URL | https://zenodo.org/record/14809?ln=en#.VuAtV5OLTeQ |
Title | UK Research Software Survey 2014 |
Description | This spreadsheet contains the anonymised data collected as part of a survey of UK researchers on their use of research software. We asked people specifically about "research software" which we defined as: "Software that is used to generate, process or analyse results that you intend to appear in a publication (either in a journal, conference paper, monograph, book or thesis). Research software can be anything from a few lines of code written by yourself, to a professionally developed software package. Software that does not generate, process or analyse results - such as word processing software, or the use of a web search - does not count as 'research software' for the purposes of this survey." We contacted 1,000 randomly selected researchers at each of 15 Russell Group universities. From the 15,000 invitations to complete the survey, we received 417 responses - a rate of 3%, which is fairly normal for a blind survey. We used Google Forms to collect responses. The responses have good representation from across the disciplines, seniorities and genders. This is a statistically significant number of responses that can be used to represent the views of people in research-intensive universities in the UK. An overview of the data is available on the worksheet "Summary data". Responses to questions are ordered by unique respondent ID. Please read the "README" worksheet for additional information about the collection and processing of this data. This survey data is licensed under a Creative Commons by Attribution licence. Copyright resides with The University of Edinburgh on behalf of the Software Sustainability Institute. |
Type Of Material | Database/Collection of data |
Year Produced | 2015 |
Provided To Others? | Yes |
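The response figures quoted in the survey description can be checked with a few lines of arithmetic. The sketch below is illustrative only: the margin-of-error calculation is not part of the original record, and it assumes a simple random sample, a worst-case proportion of 0.5 and a 95% confidence level (z = 1.96). The function names are hypothetical.

```python
import math

# Figures taken from the survey description above.
INVITATIONS = 15 * 1000   # 1,000 researchers at each of 15 universities
RESPONSES = 417


def response_rate(responses, invitations):
    """Fraction of invitations that produced a response."""
    return responses / invitations


def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion.

    Assumption (not from the source): simple random sampling,
    worst-case proportion p = 0.5, 95% confidence (z = 1.96).
    """
    return z * math.sqrt(p * (1 - p) / n)


rate = response_rate(RESPONSES, INVITATIONS)
moe = margin_of_error(RESPONSES)

print(f"Response rate: {rate:.1%}")       # prints "Response rate: 2.8%"
print(f"Margin of error: +/-{moe:.1%}")   # prints "Margin of error: +/-4.8%"
```

The 2.8% figure rounds to the 3% quoted in the description. The margin of error is a rough guide only, since respondents to a voluntary survey are self-selected.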
Title | hapbin: An efficient program for performing haplotype based scans for positive selection in large genomic datasets |
Description | These files contain genome-wide integrated haplotype scores (iHS) for each of the 26 populations in the phase 3 release of the 1000 genomes project. iHS were calculated using the hapbin program that can be downloaded from https://github.com/evotools/hapbin. The 1000 genomes phased haplotypes were obtained from mathgen.stats.ox.ac.uk/impute and hapbin was run with default parameters. The iHS are provided in two formats: BED and bedGraph. For each SNP the unstandardised iHS was calculated as ln(iHH1/iHH0) and these values were normalised using hapbin's default parameters. If the normalised iHS was negative, the absolute value is reported and this is indicated by a 1 in the fourth column following the ":" of the BED format file. The bedGraph formatted data can be easily viewed along the genome at the UCSC genome browser by specifying the URL to the corresponding file at http://genome-euro.ucsc.edu/cgi-bin/hgCustom?clade=mammal&org=Human&db=hg37. |
Type Of Material | Database/Collection of data |
Year Produced | 2015 |
Provided To Others? | Yes |
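The two calculations described in the hapbin dataset record, the unstandardised score ln(iHH1/iHH0) and the reporting of negative normalised scores as an absolute value plus a flag, can be sketched as follows. This is a minimal illustration, not hapbin's code: the function names are hypothetical, the normalisation step is omitted because the record only says hapbin's defaults were used, and the flag value 0 for non-negative scores is an assumption (the record only states that 1 marks a negative score).

```python
import math


def unstandardised_ihs(ihh1, ihh0):
    """Unstandardised iHS for one SNP: ln(iHH1 / iHH0).

    ihh1 and ihh0 are the integrated haplotype homozygosity values
    for the two alleles; both must be positive for the log to exist.
    """
    if ihh1 <= 0 or ihh0 <= 0:
        raise ValueError("iHH values must be positive")
    return math.log(ihh1 / ihh0)


def reported_score(normalised_ihs):
    """Return (absolute score, sign flag) as described for the BED files.

    The flag is 1 when the normalised iHS was negative. Using 0 for
    non-negative scores is an assumption made for this sketch.
    """
    sign_flag = 1 if normalised_ihs < 0 else 0
    return abs(normalised_ihs), sign_flag


# Example with made-up values: equal iHH for both alleles gives a score of 0.
print(unstandardised_ihs(1.0, 1.0))  # prints 0.0
print(reported_score(-1.5))          # prints (1.5, 1)
```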
Description | Data Carpentry |
Organisation | Data Carpentry |
Country | United States |
Sector | Charity/Non Profit |
PI Contribution | Coordination of Data Carpentry training events in the UK. Training of Data Carpentry instructors. Contribution of training material. |
Collaborator Contribution | Production of training materials. Provision of central administrative infrastructure. |
Impact | Multi-disciplinary. Training of hundreds of researchers in basic data management and analysis skills. |
Start Year | 2015 |
Description | Software Carpentry Foundation |
Organisation | Software Carpentry Foundation |
Country | United States |
Sector | Charity/Non Profit |
PI Contribution | We are acting as the UK coordinators for Software Carpentry courses to teach researchers computing skills. Neil Chue Hong and Carole Goble were invited to join the board of the SCF. |
Collaborator Contribution | The SCF provides materials and organises instructor training. We have therefore benefitted from the resources developed by partners within the Software Carpentry Foundation. |
Impact | Over 30 workshops and 1000 learners trained in the UK across multiple disciplines. |
Start Year | 2012 |
Title | Software Evaluation Service |
Description | An online tool which gives a software author the opportunity to review the main issues that affect the sustainability of their software. At the end of the evaluation, a report is generated and emailed to them with tailored sustainability advice. |
Type Of Technology | Webtool/Application |
Year Produced | 2013 |
Impact | Over 100 researchers have conducted evaluations of their software using the tool. |
URL | http://www.software.ac.uk/online-sustainability-evaluation |
Title | Software Management Plans |
Description | An extension of the DMPOnline webtool to allow for the creation and management of software management plans. |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | This has enabled the widely referenced DMPOnline tool developed and hosted by the Digital Curation Centre to be applied to software, and forms the basis of upcoming guidance for research funder software calls. |
URL | https://www.software.ac.uk/software-management-plans |