High throughput analysis of cell growth data from phenotype arrays

Lead Research Organisation: University of Nottingham
Department Name: Sch of Biosciences

Abstract

Fifty people died as a result of the recent E. coli outbreak in Germany. Four thousand people were infected. With a growing global human population, how do we ensure that we all have access to safe food? Fossil fuels will run out, and the recent Fukushima disaster highlighted the risks of nuclear energy. How do we provide sustainable sources of fuel to meet our energy and transport needs in the context of a population that is not just growing, but also developing?

These are major challenges, and a key strategy for overcoming them is the study of microbes. In the case of E. coli the disease is caused by harmful bacteria, and we need to understand how harmful bacteria survive in farms, soil, food production, storage and preparation facilities, as well as in animal and human hosts. In the case of fuels, microbes provide an opportunity for a new generation of biofuels. Biofuels are carbon neutral technologies, but conventional biofuels need similar materials or land that could otherwise be used for food. We are now seeking to develop biofuels from plant matter that cannot be used for food and is currently wasted. To do this, we need to find new strains of yeast that can convert this plant matter into fuel.

In recent years, new technologies have been developed that enable us to read the full genome sequence of a microbe in just a day. This is indeed remarkable, but the genome sequence is a set of instructions in a language that we can only begin to understand. What really matters is how a microbe behaves in different environments: on what foods does it thrive, on what foods does it starve? What potential toxins can it survive and what toxins kill it? These questions are essential for understanding how we can combat harmful food-borne bacteria, or develop new bioenergy producing agents. And if we can link these answers to the genome sequence, we have a powerful way of decoding the language of the genes.

This proposal is focussed on a technology, called Biolog Phenotype Microarrays, that precisely measures how well microbes thrive in thousands of conditions, including different food sources and potential toxins. The arrays generate time courses that record cell activity in each condition at regular time points, with several hundred measurements during the course of an experiment. Each time course encodes a wealth of information: how long does it take before the microbes start to become active? How quickly do they grow? Are they able to use more than one food source, and if so, is one better than the other? How much do they grow? Remarkably, there are no analysis methods available that allow users of Biolog arrays to obtain this information from the Biolog output: instead, users typically use a single datum, such as the end-point, or total growth, and discard most of the valuable information.

The aim of this proposal is to bridge this gap. To do so, we intend to build mathematical models that describe cell activity in Biolog arrays; these need to reflect the details of the technology, as well as the complexity of the conditions in which the cells are grown. We propose to develop automated ways of working out which model best fits any given set of data, and identify the key parameters describing microbial behaviour. Automation is essential, because a single experiment can generate 2000 microbial time courses. The methods have to be accessible to the wider scientific community, not just mathematicians, so we need to develop user-friendly interfaces to the methods we develop, and provide training for Biolog users in these methods.

Finally, in our established research programmes, we have generated vast quantities of Biolog data on survival of harmful E. coli strains, microbial soil contamination and the development of new yeast strains for producing biofuel from non-food plant material. We will directly address the food safety and bioenergy challenges by applying our methods to these data.

Technical Summary

Biolog phenotype microarrays are unique tools for high throughput analysis of phenotypic responses of organisms to diverse conditions. They are highly sensitive, rapidly responsive and non-destructive to the samples, enabling repeated measurements over the timescale of an experiment. They are increasingly widely used for analysis of new pathogens, evaluating new drugs, toxicology testing, functional genomics, optimizing growth and secondary metabolite production, cell and enzyme based assays.

The software provided with the system summarizes the time-course as a single datum. Thus, currently, most of the information generated during the experiment is not used. In fact, the time courses generated can be extremely varied: some are simple, and can be fitted by simple models (such as logistic growth); others are complex, with multi-stage lag phases, or exhibit diauxic switching. These time courses often contain important and detailed information, both qualitative (shape of curve) and quantitative (values of parameters) about the response of the organism of study to the environmental conditions. As a consequence, there is a big research gap: how best to effectively and robustly ascertain the key qualitative and quantitative phenotypic output from high throughput Biolog PM experiments.
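The contrast between a single datum and the full time course can be illustrated with a minimal sketch. It assumes a plain three-parameter logistic curve and a simple tangent-based lag estimate; the function names, parameter values and sampling interval are hypothetical and are not taken from the Biolog software or from the methods developed in this project.

```python
import math

def logistic(t, plateau, rate, midpoint):
    """Logistic growth: signal rises from near 0 towards `plateau`."""
    return plateau / (1.0 + math.exp(-rate * (t - midpoint)))

def summarise(times, signal):
    """Return simple descriptors of one time course (illustrative only)."""
    # Steepest slope between consecutive readings.
    slopes = [(signal[i + 1] - signal[i]) / (times[i + 1] - times[i])
              for i in range(len(times) - 1)]
    i_max = max(range(len(slopes)), key=slopes.__getitem__)
    max_rate = slopes[i_max]
    # Lag: time at which the tangent through the steepest interval
    # crosses the starting signal level.
    t_mid = 0.5 * (times[i_max] + times[i_max + 1])
    y_mid = 0.5 * (signal[i_max] + signal[i_max + 1])
    lag = t_mid - (y_mid - signal[0]) / max_rate
    return {"end_point": signal[-1], "max_rate": max_rate, "lag": lag}

# Two hypothetical wells read every 15 minutes for 48 hours.
times = [0.25 * i for i in range(193)]
early = summarise(times, [logistic(t, 200.0, 0.5, 10.0) for t in times])
late = summarise(times, [logistic(t, 200.0, 0.5, 30.0) for t in times])
print(early)
print(late)
```

In this sketch the two wells reach almost the same end-point, so an end-point summary cannot separate them, while the estimated lag differs by roughly the 20 hours that separate the two midpoints.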

We aim to create improved mathematical models combined with sophisticated inference and model choice techniques that will allow users to derive maximum information from the Biolog output. Moreover, we will produce user-friendly open-source software to allow laboratory users to carry out these analyses in high throughput. These aims will be achieved by building on our existing Systems Biology and Biostatistics programmes. We will apply the methods developed to data already obtained in our food safety and bioenergy research programmes, so this work will have specific impacts in these areas, as well as general benefit for all users of this technology.
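As a rough illustration of model choice, the sketch below scores two candidate curves against one time course using the Bayesian information criterion (BIC) under Gaussian errors. This is a simplified stand-in chosen for brevity, not the inference scheme developed in the project. The sum-of-two-logistics form for diauxic growth, all parameter values and the synthetic data are assumptions, and the parameters are taken as already fitted rather than estimated here.

```python
import math

def logistic(t, plateau, rate, midpoint):
    """Single growth phase."""
    return plateau / (1.0 + math.exp(-rate * (t - midpoint)))

def diauxic(t, p1, r1, m1, p2, r2, m2):
    """Two successive growth phases, modelled as a sum of two logistics."""
    return logistic(t, p1, r1, m1) + logistic(t, p2, r2, m2)

def bic(observed, predicted, n_params):
    """BIC for least-squares fits with Gaussian errors (lower is better)."""
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(rss / n) + n_params * math.log(n)

def choose_model(times, observed, candidates):
    """candidates maps a name to (function, fitted parameter tuple)."""
    scores = {}
    for name, (func, params) in candidates.items():
        predicted = [func(t, *params) for t in times]
        scores[name] = bic(observed, predicted, len(params))
    return min(scores, key=scores.get), scores

# Synthetic two-phase time course with a small deterministic perturbation.
times = [0.5 * i for i in range(97)]
observed = [diauxic(t, 100.0, 0.6, 8.0, 80.0, 0.5, 30.0) + 0.5 * math.sin(7 * i)
            for i, t in enumerate(times)]

# Hypothetical "fitted" parameters for each candidate model.
candidates = {
    "logistic": (logistic, (180.0, 0.2, 19.0)),
    "diauxic": (diauxic, (100.0, 0.6, 8.0, 80.0, 0.5, 30.0)),
}
best, scores = choose_model(times, observed, candidates)
print(best, scores)
```

The penalty term `n_params * log(n)` is what stops the more flexible model from winning automatically: the six-parameter curve is preferred only when its improvement in fit outweighs the extra parameters.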

Planned Impact

This work will have direct and/or indirect impacts on academic beneficiaries, industry, the general public, the public sector and schools.

Academic beneficiaries for this research will include microbiologists, systems biologists and cell biologists who are using phenotype microarrays as tools to understand how prokaryotic and eukaryotic cells respond to nutrition, environment and inhibitory compounds, who are characterizing novel pathogens, and characterizing strains that can be used to produce future fuels, chemicals and pharmaceuticals. This includes the major facilities at the AHVLA, the Sanger Institute and the BBSRC-funded TGAC. This research will allow them to extract more information from the data already generated, and further enhance multidisciplinary approaches to understanding the response and function of organisms. This impact will be direct.

Direct impact to industry will come to companies in the pharmaceutical and biotechnology sectors that use Biolog devices. Industrial applications include analysis of new pathogens, evaluating new drugs, toxicology testing, functional genomics, optimizing growth and secondary metabolite production, cell and enzyme based assays. Improved methods for analysis of these data will have impacts on companies' return on investment in the technology.

Indirect impact to industry will come through identification of yeast strains with potential commercial value for bioenergy production. When identified, these will be exploited by researchers in the UK (BBSRC Sustainable Bioenergy Centre) and our collaborators in the USA (Energy Biosciences Institute, Joint Bioenergy Institute) and Brazil (EMBRAPA). Furthermore it is anticipated that this data will also permit our LACE industrial partners (BP, British Sugar, DSM, SABMiller, Coors, Lallemand) to interrogate the data sets more effectively and select strains for deployment in commercial fermentation scenarios.

Indirect impacts will be enjoyed by the general public in terms of food safety and bioenergy. In health, the rapid phenotypic characterization of new pathogens and understanding their survival, persistence and resistance could improve our understanding, prevention and treatment of foodborne disease outbreaks. In bioenergy, this research can contribute towards environmental sustainability, and reduction in reliance on imported fuels, reduced use of food crops for fuel, and a reduction in emissions.

Indirect impact to the public sector will come as a result of our applications to food safety and bioenergy. With regard to food safety, increased understanding of the persistence of pathogenic E. coli strains in food production and the persistence and remediation of pathogens from soil will, in the long term, aid government intervention and decision making. Similarly, a second generation of bioenergy agents will create potential for increased bioenergy production, which can inform government decisions on overall energy production strategy.

Direct impacts on schools will come through a programme of direct engagement with Science Clubs and G&T schemes in local and regional schools. We will use food safety and bioenergy as ways of engaging young people with microbiology. We already have on-going contact with Wilesthorpe School (Long Eaton), St Matthews School (Duddeston), and Prince Albert J.I. School (Aston), the latter two being in areas of social and economic deprivation, as well as with the Association of Science Educators.

This research will also foster the cross-disciplinary training of skilled multidisciplinary individuals.

Publications

Gerstgrasser M (2016) A Bayesian approach to analyzing phenotype microarray data enables estimation of microbial growth parameters. in Journal of bioinformatics and computational biology

 
Description We have found the best models to extract relevant information from experiments using Biolog Phenotype Microarrays. We have found that by using GPU technologies we can do the analysis rapidly and effectively. We have developed software that allows for analysis and visualization, with an attractive GUI. This is being prepared for release. We have already run training courses and presented at a conference, both of which have provided very valuable feedback for our first software release.
Exploitation Route We will complete the software and ensure that it is readily available (through research publications and submission to suitable repositories). We are also completing the research article.
Sectors Agriculture, Food and Drink; Education; Environment; Healthcare; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology

 
Title Biolog software - first round 
Description First release of Biolog software - mainly developed prior to award of grant but has involvement of Mike Stout from the Biolog grant. This will be superseded by next version. 
Type Of Technology Software 
Year Produced 2015 
Open Source License? Yes  
Impact None yet 
URL https://github.com/dovstekellab/mcmc-pma
 
Description Biolog conference (Florence) 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presented HiPerFit software from Biolog grant at Biolog conference in Florence, September 2015. ~100 participants. Positive feedback on software as it is being developed.
Year(s) Of Engagement Activity 2015
 
Description Biolog workshops (Universities of Leicester and Surrey) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact We ran two workshops demonstrating new software developed from our Biolog grant, both during summer 2015. Approximately 10 people attended each workshop, including industry representatives (Biolog), postgraduate students, post-doctoral researchers and academic staff.
Year(s) Of Engagement Activity 2015
 
Description Meeting with Biolog/Technopath 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Industry/Business
Results and Impact Focussed meeting with CEO of Biolog Inc. and other staff from Biolog and their distributors Technopath in order to demonstrate the software we are developing from the grant.
Year(s) Of Engagement Activity 2015