An integrated approach to genome evolution using the Drosophila model

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Biological Sciences

Abstract

Developed over the past hundred years, 'population genetics' (the study of how genes and genetic material are inherited in populations) has provided a rigorous mathematical basis for evolutionary theory.

More recently, as genome-scale DNA sequencing has moved from being merely possible to routine, 'population genetics' has become 'population genomics', and our power to link theory with data has grown beyond the wildest dreams of early evolutionary biologists. We now, almost routinely, have access to hundreds or even thousands of genomes -- at least from a few key species such as humans, mice, and the classical laboratory fruit fly, Drosophila melanogaster. These data allow us deep insight into the mechanisms of genome evolution in these species, and start to allow us a general picture of the way genomes evolve.

From a genetic perspective, theory says that -- when single populations are considered -- four processes matter. These are mutation, which creates new variants in the population; recombination, which creates new combinations of variants; natural selection, which determines whether variants become more common or are lost from the population; and genetic drift, which describes how variants change in frequency when selection is weak or absent. By estimating the rate at which each of these processes occurs, how the rates vary within genomes and among species, and how they correlate with each other, we can ultimately hope to provide a complete genetic view of genome evolution.

However, while there are good estimates for some rates (such as recombination rate) from many species, and good estimates of all rates from a very few species, there is no group of related plant or animal species for which all rates have been estimated. In particular, there are very few direct experimental estimates of mutation rate from plants or animals, so we have little idea of how this varies among species or how it correlates with the other processes.

Our research aims to fill this gap in our knowledge by quantifying all four processes in a single set of 23 related Drosophila species. The most recent common ancestor of these flies lived more than 30 million years ago, the species come from all over the world, and they vary in their size, what they eat, how long they live, their optimum temperature, and even how susceptible they are to disease. What they have in common is that we can breed them in the lab, and that we already know enough about their biology to benefit from studying it further.

In this project we will directly estimate mutation and recombination rates and how they vary within and among species, and we will use population-genetic theory to indirectly estimate the strength of selection and the long-term rate of genetic drift. We will do this by sequencing 7 genomes (2 parents and 5 offspring) from each of 5 families in each of the 23 species of Drosophila. We will directly identify new mutations and new recombination events that are present in the offspring but absent from their parents to estimate the rates at which these processes occur. Then, using the parental genomes (and incorporating information from the offspring), we will use indirect (population-genetic) methods to infer the historic strength of selection and rates of genetic drift.
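
As a rough illustration of the direct estimate described above, the sketch below converts a count of new mutations into a per-site, per-generation rate. The function name, the mutation count, and the number of callable sites are hypothetical values chosen for illustration, not project data. The factor of 2 assumes diploid offspring, each carrying two newly inherited haploid genomes.

```python
def mutation_rate(n_de_novo: int, callable_sites: int, n_offspring: int) -> float:
    """Per-site, per-generation mutation rate from parent-offspring data.

    n_de_novo      -- mutations seen in offspring but absent from both parents
    callable_sites -- sites per haploid genome where a mutation could be detected
    n_offspring    -- number of sequenced offspring, pooled across families
    """
    if callable_sites <= 0 or n_offspring <= 0:
        raise ValueError("callable_sites and n_offspring must be positive")
    # Each diploid offspring carries two haploid genomes, one per parent.
    return n_de_novo / (2 * callable_sites * n_offspring)


# Hypothetical example: 5 families x 5 offspring = 25 offspring for one species.
rate = mutation_rate(n_de_novo=25, callable_sites=100_000_000, n_offspring=25)
print(f"{rate:.2e} mutations per site per generation")
```

With so few events expected per species, an estimate of this kind carries wide sampling uncertainty, which is one reason to pool offspring across several families.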

These data will allow us to describe genome evolution in this group of important and well-studied species more completely than ever before. By doing this, we will not only gain a better understanding of the evolutionary process, but also provide data to further test theory and drive its ongoing development. Our data will also be available to many other researchers for future work on other topics, beyond this proposal.

Technical Summary

Four population-genetic parameters are crucial determinants of genome evolution: mutation rate, recombination rate, the strength of selection, and the rate of genetic drift. In principle, almost every aspect of genome evolution within populations can be captured by modelling variation in, and interactions between, these parameters. However, while evolutionary theory provides a sophisticated framework to understand genome evolution, empirical data on these parameters and how they vary among multicellular eukaryotes are lacking.

In particular, we lack an integrated perspective of these parameters for any clade of related plants or animals. This means, for example, that whereas we know that genetic diversity varies by three orders of magnitude, we have little idea what proportion of this variation is due to demographic history or other factors affecting Ne (likely the majority), or what proportion is due to differences in mutation rate. Unless all parameters are estimated from the same set of (related) species, we have no possibility of testing whether they are correlated, as theory predicts. To ultimately link variation in the parameters of evolution to variation in other traits, we will require systematic and integrated estimates of all parameters in related species that differ in some, but not all, other traits.
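
To illustrate why diversity alone cannot separate these factors: under the standard neutral equilibrium model, expected nucleotide diversity at neutral sites is approximately 4 x Ne x mu, so diversity constrains only the product of the two parameters. The sketch below shows how an independent, direct estimate of the mutation rate allows Ne to be recovered from diversity. The function name and all input values are hypothetical, and the calculation assumes neutrality and constant population size.

```python
def effective_population_size(diversity: float, mutation_rate: float) -> float:
    """Estimate Ne from neutral diversity, assuming diversity = 4 * Ne * mu."""
    if diversity < 0 or mutation_rate <= 0:
        raise ValueError("diversity must be >= 0 and mutation_rate must be > 0")
    return diversity / (4 * mutation_rate)


# Hypothetical example: the same diversity implies a different Ne
# depending on the mutation rate that is assumed.
ne_if_low_mu = effective_population_size(diversity=0.01, mutation_rate=2.5e-9)
ne_if_high_mu = effective_population_size(diversity=0.01, mutation_rate=5e-9)
print(f"Ne if mu = 2.5e-9: {ne_if_low_mu:,.0f}")
print(f"Ne if mu = 5.0e-9: {ne_if_high_mu:,.0f}")
```

In this toy case a twofold difference in the assumed mutation rate changes the inferred Ne twofold, which is why the two need to be estimated in the same species.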

Here we will directly (experimentally) estimate mutation and recombination rates using parent-offspring sequencing of multiple outbred unrelated families in each of 23 species of Drosophila. Then, using population-genetic approaches, we will estimate effective population size (including recent demographic changes and variation among sites in the genome) and the distribution of fitness effects for these species. Finally, using phylogenetic mixed models, we will decompose variance in these parameters into within-species, among-species, and phylogenetic components. We will also test for correlations between these parameters, both within and among species.

Planned Impact

The nature of the proposed project is fundamental in character, so the opportunities for commercialization or policy implications are expected to be limited. Instead, we hope to use this proposal as a vehicle to contribute to the "DrosAfrica" project (http://drosafrica.org/home), which aims to help build long-term research and higher education capacity in Africa.

-- [ Research and education capacity building in Africa ] --

Research in Africa has historically been biased toward immediately applicable biomedical and public health sciences. However, as more Sub-Saharan African countries go on to develop competitive research and higher-education sectors, there will be an increasing need for researchers trained in 'blue sky' science, and for study systems that can be used for training and research.

The DrosAfrica project aims to help develop Drosophila (a ubiquitous and economical model species that originated in Sub-Saharan Africa) in this role, by offering training workshops to African scientists (http://drosafrica.org/home). DrosAfrica is a registered charity in England and Wales, and its objectives are (1) "To create a well-connected community of researchers working with Drosophila melanogaster in Africa, and to study the possibilities of establishing collaborations with such a community" and (2) "To create Drosophila melanogaster biomedical units in high-quality research facilities that will allow African researchers to run projects with a high impact in biomedical sciences". To this end, DrosAfrica ran four workshops between 2013 and 2017 (two in Uganda, one in Kenya, and one in Nigeria), and is running a fifth in 2019 in conjunction with "The First African Society of Drosophilists Scientific Meeting".

In collaboration with DrosAfrica, we propose a 5-day workshop (provisionally entitled "Mutations, Variation, and Genetic disease") to take place in 2021 or 2022. This will provide training in the practical use of Drosophila, along with the bioinformatic and statistical approaches needed to understand (and in the future teach) current research practices associated with biomedical population genetics. This will include an overview of mutation and recombination as molecular processes, the population genetics of diversity and deleterious alleles (heterozygosity, recessive lethals, inbreeding), and the fundamentals of genome-wide association studies. Our expectation is that such a workshop, by introducing a tractable practical model in Drosophila, will provide an improved opportunity for local academics to engage and train their students in this increasingly important research area.

-- [ Public understanding ] --

Both because of its role in disease, and because of the prominence of 'mutation' in popular culture (e.g. movies, science fiction, health scares), we believe this work provides an ideal opportunity to engage with the public on the topic of mutations, what they (really) are, and what their implications are for biology. We will do this at the Edinburgh International Science Festival, through the School of Biological Sciences 'drop-in' week activity (historically this has attracted several hundred visitors per day over the course of a week). This will focus both on the meaning of 'mutants' in biology, and on the utility of model organisms for research on mutation and diversity.
 
Description: The project has shown that:
(1) Mutation rates can differ between different populations of fruit flies (Drosophila melanogaster)
(2) Mutation rates are higher in male fruit flies than in female fruit flies
(3) Transposable element ('jumping gene') insertion rates are higher than simple base-change mutation rates in fruit flies (Drosophila melanogaster)
(4) Active transposable elements differ between different populations of fruit flies (Drosophila melanogaster)
Exploitation Route: The findings can be used to improve our understanding of how evolution happens, by improving models and parameter estimates for evolution in Drosophila
Sectors: Other
URL: https://www.biorxiv.org/content/10.1101/2022.09.12.507595v1
 
Description: Collaboration with Dr. Rashidatu Abdulazeez (DrosAfrica and Ahmadu Bello University)
Organisation: Ahmadu Bello University
Country: Nigeria
Sector: Academic/University
PI Contribution: Sharing of wild-collected Drosophila for research
Collaborator Contribution: Collection and identification of wild-caught Drosophila for research
Impact: "Variation in mutation, recombination, and transposition rates in Drosophila melanogaster and Drosophila simulans". Yiguan Wang, Paul McNeil, Rashidatu Abdulazeez, Marta Pascual, Susan E. Johnston, Peter D. Keightley, Darren J. Obbard. doi: https://doi.org/10.1101/2022.09.12.507595
Start Year: 2021
 
Description: Darwin Tree of Life Collaboration
Organisation: The Wellcome Trust Sanger Institute
Country: United Kingdom
Sector: Charity/Non Profit
PI Contribution: Collection of wild flies for sequencing
Collaborator Contribution: Sequencing and genome assembly of wild-caught flies
Impact: NA
Start Year: 2021
 
Description: Petrov Lab Drosophila Genomes
Organisation: Stanford University
Country: United States
Sector: Academic/University
PI Contribution: Collection of wild-caught Drosophila
Collaborator Contribution: Sequencing of wild-caught Drosophila
Impact: NA
Start Year: 2021
 
Description: University research 'Drop-in' at the Edinburgh Science Festival
Form Of Engagement Activity: Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach: Regional
Primary Audience: Public/other audiences
Results and Impact: Five-day contribution to the Edinburgh Science Festival. We shared details of our research (Drosophila as a model; evolution) in 1:1 discussion with the general public, and engaged children with science-based activities (observing flies and larvae with microscopes; matching flies with their food). Total footfall for the science drop-in was 1223 people over 5 days.
Year(s) Of Engagement Activity: 2022