EPSRC CDT in Sustainable Approaches to Biomedical Science: Responsible and Reproducible Research - SABS:R^3

Lead Research Organisation: University of Oxford
Department Name: SABS IDC

Abstract

Building upon our existing flagship industry-linked EPSRC & MRC CDT in Systems Approaches to Biomedical Science (SABS), the new EPSRC CDT in Sustainable Approaches to Biomedical Science: Responsible and Reproducible Research - SABS:R^3 - will train a further five cohorts, each of 15 students, in cutting-edge systems approaches to biomedical research and, uniquely within the UK, in advanced practices in software engineering. Our renewed goal is to bring about a transformation of the research culture in computational biomedical science.

Computational methods are now at the heart of biomedical research. From the simulation of the behaviour of complex systems, through the design and automation of laboratory experiments, to the analysis of both small and large-scale data, well-engineered software has proved capable of transforming biomedical science. Biomedical science is therefore dependent as never before on research software.

Industries reliant on this continued innovation in biomedical science play a critical role in the UK economy. The biopharmaceutical and medical technology industrial sectors alone generate an annual turnover of over £63 billion and employ 233,000 scientists and staff. In his foreword to the 2017 Life Sciences Industrial Strategy, Sir John Bell noted that, "The global life sciences industry is expected to reach >$2 trillion in gross value by 2023... there are few, if any, sectors more important to support as part of the industrial strategy." The report identifies the need to provide training in skills in "informatics, computational, mathematical and statistics areas" as being of major concern for the life sciences industry.

Over the last 9 years, the existing SABS CDT has been working with its consortium of now 22 industrial and institutional partners to meet these training needs. Over this same period, continued advances in information technology have accelerated the shift in the biomedical research landscape in an increasingly quantitative and predictive direction. As a result, computational and hence software-driven approaches now underpin all aspects of the research pipeline. In spite of this central importance, the development of research software is typically a by-product of the research process, with the research publication being the primary output. Research software is typically not made available to the research community, or even to peer reviewers, and therefore cannot be verified. Vast amounts of research time are lost (usually by PhD students with no formal training in software development) in re-implementing already-existing solutions from the literature. Even if successful, the re-implemented software is again not released to the community, and the cycle repeats. No consideration is given to the huge benefits of model verification, re-use, extension, and maintainability, nor to the implications for the reproducibility of the published research. Progress in biomedical science is thus impeded, with knock-on effects on clinical translation and knowledge transfer into industry.

There is therefore an urgent need for a radically different approach. The SABS:R^3 CDT will build on the existing SABS Programme to equip a new generation of biomedical research scientists not only with the knowledge and methods necessary to take a quantitative and interdisciplinary approach, but also with advanced software engineering skills. By embedding this strong focus on sustainable and open computational methods, together with responsible and reproducible approaches, into all aspects of the new programme, our computationally-literate scientists will be equipped to act as ambassadors to bring about a transformation of biomedical research.

Planned Impact

The UK's world-leading position in biomedical research is critically dependent upon training scientists with the cutting-edge research skills and technological know-how needed to drive future scientific advances. Since 2009, the EPSRC and MRC CDT in Systems Approaches to Biomedical Science (SABS) has been working with its consortium of 22 industrial and institutional partners to meet this training need.

Over this period, our partners have identified a growing training need caused by the increasing reliance on computational approaches and research software. The new EPSRC CDT in Sustainable Approaches to Biomedical Science: Responsible and Reproducible Research - SABS:R^3 will address this need. By embedding a sustainable approach to software and computational model development into all aspects of the existing SABS training programme, we aim to foster a culture change in how the computational tools and research software that now underpin much of biomedical research are developed, and hence how quantitative and predictive translational biomedical research is undertaken.

As with all CDT Programmes, the future impact of SABS:R^3 will be through its alumni, and through the culture change that its training engenders. By these measures, our existing SABS CDT is already proving remarkably successful. Our alumni have gone on to a wide range of successful careers: 21 in academic research, 19 in industry (including 5 in SABS partner companies) and the other 10 working in organisations from the Office for National Statistics to the EPSRC. SABS' unique Open Innovation framework has facilitated new company connections and a high level of operational freedom, enabling 14 multi-company, pre-competitive, collaborative doctoral research projects between 11 companies, each focused on a SABS student.

The impact of sustainable and open computational approaches on biomedical research is clear from existing SABS student projects. Examples include SAbDab, which resulted from the first-ever co-sponsored doctorate in SABS, by UCB and Roche. It was released as open source software, is embedded in the pipelines of several pharmaceutical companies (including UCB, MedImmune, GSK, and Lonza) and has resulted in 13 papers. The SABS student who developed SAbDab was initially seconded to MedImmune, sponsored by EPSRC IAA funding; he went on to work at Roche, and is now at BenevolentAI. Similarly, PanDDA, multi-dataset X-ray crystallographic software to detect ligand-bound states in protein complexes, is distributed in CCP4 and is an integral part of Diamond Light Source's XChem Pipeline. The SABS student who developed PanDDA was awarded an EMBO Fellowship.

Future SABS:R^3 students will undertake research supported by both our industrial partners and academic supervisors. These supervisors have a strong track record of high impact research through the release of open source software, computational tools, and databases, and through commercialisation and licensing of their research. All of this research has been undertaken in collaboration with industrial partners, with many examples of these tools now in routine use within partner companies.

The newly focused SABS:R^3 will permit new industrial collaborations. Six new partners have joined the consortium to support this new bid, ranging from major multinationals (e.g. Unilever) to SMEs (e.g. Lhasa). SABS:R^3 will continue to make all of its research and teaching resources publicly available and will continue to help to create other centres with similar aims. To promote a wider cultural change, SABS:R^3 will also engage with the academic publishing industry (Elsevier, OUP, and Taylor & Francis). We will explore novel ways of disseminating the outputs of computational biomedical research, to engender trust in the released tools and software and to facilitate greater uptake and re-use.

Publications
