SANDPIT: Evolution as an Information Dynamic System

Lead Research Organisation: Middlesex University
Department Name: School of Science and Technology

Abstract

What changes when an organism evolves and how? At the most basic biological level, the answer is 'the DNA sequence', and at the most functional level the answer is 'Darwinian fitness'. At a mathematical level, information theory deals with codes which map from alphabets (such as the DNA alphabet of 'A', 'C', 'G' and 'T'), sequences and their probability distributions to numerical functions (such as Darwinian fitness). So information theory is a natural framework in which to consider biological evolution. The relationship between DNA sequences and Darwinian fitness is rather poorly served by current theoretical models of 'fitness landscapes', which are simultaneously too complex (the number of possible genetic make-ups is too vast to be experimentally accessible in any real biological system) and too simple - fitness landscapes are static concepts, whereas in reality fitness is relative to an ever-changing environment of physical factors and other organisms, as popularised in the 'Red Queen' idea - organisms have to 'run' (evolve) to stay in the same place in terms of fitness. Therefore, the most appropriate branch of information theory to develop for understanding biological evolution is the dynamics of information about the environment and fitness. This project will develop cutting-edge information dynamic theory in a way that makes it applicable to biological evolution.

This project aims to make solid connections between state-of-the-art mathematics (information dynamics and geometry) and real biological evolution. However, that link is not straightforward, due to the complexity of real organisms and the difficulty of monitoring them, let alone controlling them, over evolutionary time-scales. This project will therefore use a series of experimentally evolving systems with different levels of experimental control and biological realism that relate to one another as well as to the theory. Thus, we shall consider the evolution of a 'simple' biochemical interaction between DNA and another molecule and different levels of complexity in evolving systems of computer 'organisms', as well as the evolution of complete biological organisms, microbes, in the laboratory. There will be close feedback between mathematics and experiment in each of these systems, not only to validate the theoretical advances, but to test the nature of, and differences in, the information dynamics and geometry in each of the evolving systems.

The primary level at which these mathematical developments will be tested in experimental systems is in terms of the dynamics of evolutionary operators, such as mutation, recombination and selection: the processes by which transitions from one generation to the next occur and information is passed on. In the computational and biochemical systems we have complete control of these operators, and in the biological evolution system we are able to manipulate mutation rates and monitor the effects by sequencing the organisms' DNA. It will therefore be possible to test how manipulation of these operators relates to changes in the fitness scores. At the mathematical level this will require development of novel 'transition kernels' corresponding to different forms of biologically realistic operators, considering various measures of information and the analysis of its dynamics. By constructive interaction between the mathematics and the different levels of abstraction of biological systems, we shall both develop an important field of mathematics and provide a novel, widely applicable and relevant framework for thinking about, testing and understanding biological evolution.

Planned Impact

Overview of impact: This project will impact the world in providing a fundamental formal mathematical understanding of biological evolution. This in itself has the potential for wide impact in understanding the processes of adaptation of organisms to changes in the environment, such as adaptation to climate change or pathogens' adaptation to changing control regimes. At a direct technological level, it will allow in vitro evolution approaches to be optimised a priori, giving much more efficient drug development. More generally the project will have an immediate impact on the design of adaptive algorithms for complex optimisation problems, such as timetabling, scheduling and combinatorial optimisation, among others; and the potential for significant impact in wider sections of society concerned with change in complex systems in general, as well as evolved biological systems (including ourselves) in particular.

Impact on Optimisation: In operations research, evolutionary algorithms have been one of the most important techniques to tackle some of the hardest optimisation problems, and the development of the theory of information dynamics in such systems will have a significant industrial impact. In addition, the work here will develop a new understanding of how nature solves optimisation and adaptation problems.

Impact on Medicine and biotechnology: Both medicine and biotechnology throw up specific optimisation problems, to which this project has particular relevance. Most importantly, the in vitro system used here is in fact a route to producing aptamers: nucleic acids that bind specifically to a particular molecule of another class (such as a protein, as used here). Aptamers have a variety of uses in biotechnology and a high potential as therapeutics. This impact will be realised specifically by developing and disseminating understanding of the in vitro aptamer evolution system. More generally, via the in vivo system, this project will derive a new understanding of how nature adjusts whole organisms genetically in evolution. Specifically this is a process whereby nature solves the problem of changing an aspect of the organism's phenotype without undue detrimental effects on overall organism function. This is precisely the challenge that medicine faces when it attempts to influence the human organism by specific drug intervention. Therefore, the understanding we derive of how evolution meets this challenge will be relevant to target selection in drug development.

Impact on and via the PhD student and PDRA: We anticipate that there will be particular societal impact both on and via the personnel employed via this grant (the PDRA and PhD student). This project embodies a rare close synergy of cutting-edge mathematics, computer science and biology. This will give these individuals unique expertise that will enhance their career prospects, since the sort of cross-disciplinary approaches in which they will become experienced are increasingly sought and required for addressing societal issues, notably at the policy level (as witnessed by the appointment of scientific advisers in a wide and increasing range of government departments). Whatever their subsequent career paths, this project will help equip them to communicate with a wider audience about the reality of the interface of mathematics and biology.

Impact on the wider public: This year's Darwin anniversary celebrations have highlighted the great public interest in the processes by which we, and the rest of biology, have evolved. This project aims to provide a new and fundamental understanding of the evolutionary processes by which organisms adapt, which connects to many specific examples of public interest, such as organisms adapting to climate change or pathogens adapting to changing control regimes. The possibility that all these processes can be understood in a new, fundamental and useful way in this project is something with potential to engage people.
 
Description 1) Derived mathematical expressions for the probability of transitions between spheres around an optimum in the Hamming space of sequences as a function of a single mutation rate parameter. This allows one to compute exactly the probability of beneficial mutation of DNA sequences under simple point mutation.
2) In contrast to classical models, our work showed that the probability of beneficial mutations can be increased by varying mutation rates as a function of fitness. This is because the geometry of sequence spaces, such as the Hamming space, is different from the geometry of Euclidean space used in traditional geometric analysis of adaptation (originally due to R. Fisher).
3) Formulated and solved several problems of optimal control of mutation rates based on fitness feedback from monotonic landscapes. In particular, these are problems of maximising expected fitness of future generations at different time horizons, maximising an increase of expected fitness, maximising cumulative expected fitness and maximising expected fitness subject to constraints on information divergence between populations at different generations. Evolutionary systems controlling mutation rates based on the solution of the latter problem evolve along an information geodesic (shortest 'information distance') on a statistical manifold of all possible populations, and have the property of minimal fitness variance within the populations.
4) Studied mathematical properties of optimal evolutionary trajectories under constraints on generalised information distances. It was proved, under broad assumptions on the information functional, that optimal trajectories are always confined to the interior of the statistical manifold (e.g. the set of all possible probability distributions over species). This implies that optimal transitions are always stochastic, or, in biological terms, have a non-zero probability of mutation.
5) Developed a 'meta-GA' platform for parameter optimisation in evolutionary algorithms for an abstract or biological landscape. The platform uses many-core GPGPU technology to speed up experiments from 1.4 years (maximum, per run) to 3 days.
6) Developed a theory of fitness-distance communication in weakly monotonic landscapes. Proved that all landscapes are weakly monotonic around a global optimum, and therefore an optimal control of mutation rate derived for a monotonic landscape should also be beneficial in some neighbourhood of a global optimum in all other landscapes. These predictions have been verified on 108 complete landscapes of transcription factor bindings.
7) Developed experimental techniques (based on fluctuation tests) for estimation of mutation rates in the bacterium Escherichia coli evolving under different experimental conditions.
8) Discovered that mutation rates in the bacterium Escherichia coli are significantly related to estimators of their fitness, in particular to population densities. Furthermore, we have been able to show that this variation of mutation rates is regulated via a quorum sensing system, which is a well characterised mechanism of communication of density information between bacteria. This suggests that improving adaptation by control of mutation rates is not just a theoretical idea, but is possibly an essential trait of biological organisms.
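Findings 1 and 2 above can be illustrated with a small sketch. It is not the project's own derivation: it assumes a particular point-mutation model in which every site mutates independently with probability mu to one of the other letters chosen uniformly, and all function names (transition_prob, beneficial_prob, best_mutation_rate) are hypothetical. Under that assumption, the probability of moving from Hamming distance n to distance m from the optimum is a sum of products of two binomial terms.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)


def transition_prob(m: int, n: int, length: int, alphabet: int, mu: float) -> float:
    """P(offspring at distance m | parent at distance n from the optimum).

    Assumed model: each site mutates independently with probability mu
    to one of the (alphabet - 1) other letters, chosen uniformly.
    """
    p_fix = mu / (alphabet - 1)  # a mismatched site mutates to the optimal letter
    p_break = mu                 # a matched site mutates away from the optimal letter
    total = 0.0
    for fixed in range(n + 1):
        broken = m - n + fixed   # chosen so that n - fixed + broken == m
        if 0 <= broken <= length - n:
            total += binom_pmf(fixed, n, p_fix) * binom_pmf(broken, length - n, p_break)
    return total


def beneficial_prob(n: int, length: int, alphabet: int, mu: float) -> float:
    """Probability that the offspring is strictly closer to the optimum."""
    return sum(transition_prob(m, n, length, alphabet, mu) for m in range(n))


def best_mutation_rate(n: int, length: int, alphabet: int, steps: int = 100) -> float:
    """Grid search for the mu in [0, 1] that maximises beneficial_prob."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda mu: beneficial_prob(n, length, alphabet, mu))


if __name__ == "__main__":
    # DNA alphabet (4 letters), sequences of length 10.
    for distance in (1, 5, 10):
        print(distance, best_mutation_rate(distance, 10, 4))
```

In this sketch the maximising rate depends on the current distance from the optimum: for a length-10 DNA sequence one mismatch away the grid maximum is at mu = 0.1, whereas for a fully mismatched sequence it is at mu = 1. This is the sense in which a mutation rate varied as a function of fitness can outperform any single fixed rate on a monotonic landscape.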
Exploitation Route Findings 1-6 can be used to improve the performance of population-based search algorithms, which are applied to solve many hard optimisation problems. Findings 7-8 can be used to develop novel techniques to control mutation in bacteria or other cells.
Sectors Digital/Communication/Information Technologies (including Software), Healthcare, Pharmaceuticals and Medical Biotechnology, Transport

URL http://www.eis.mdx.ac.uk/staffpages/rvb/eids/
 
Description Adaptive landscapes of antibiotic resistance: population size and 'survival-of-the-flattest'
Amount £32,352 (GBP)
Funding ID BB/M021106/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 08/2015 
End 07/2018
 
Description The theory and practice of evolvability: Effects and mechanisms of mutation rate plasticity
Amount £465,242 (GBP)
Funding ID BB/L009579/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 04/2014 
End 01/2017