Uncovering the Distribution of Fitness Effects for new mutations using Saccharomyces cerevisiae as a model system
Lead Research Organisation:
University of Edinburgh
Department Name: Inst of Evolutionary Biology
Abstract
Humans are full of polymorphisms: these are what make us individuals. Every individual has a different set, which is added to by random mutation each generation. Genetic diseases are often caused by single polymorphisms in genes. Some polymorphisms are positive, some are neutral and others have varying strengths of negative effects. Most have a weak negative effect on fitness. It is these polymorphisms that may make a major contribution to disease susceptibility.
It is important to understand how often a polymorphism falls into each of the categories described above. This is called the Distribution of Fitness Effects, or DFE. The difficulty in finding this information lies in the lack of data available for humans. Population surveys often detect only polymorphisms with very weak or neutral effects, and miss those with large negative effects or with smaller but still significant effects.
I plan to estimate the DFE for baker's yeast, for which there is a huge amount of relevant information. These findings can be applied to the more limited information we have about human polymorphisms and how these are linked to complex genetic diseases.
Technical Summary
It has been estimated that humans experience on average 1.8 new amino acid mutations per generation, resulting in a large reservoir of variation in protein sequences between individuals. How many of these mutations are neutral, advantageous or deleterious is as yet unknown. Furthermore, approximately 60,000 amino acid mutations have reached fixation between humans and our nearest surviving relative, the chimpanzee. How many of these have been fixed by natural selection, and how many simply by random drift, is also unknown at present.
The distribution of fitness effects of new mutations (DFE) is defined as the frequency of origination of new mutations in the population as a function of their selection coefficients (their effects on fitness relative to the fitness of non-mutant individuals). Understanding the shape of this distribution would allow the numbers and effects of mutations to be estimated accurately from limited data sets. There is evidence to suggest that while most polymorphisms have a small impact on fitness, in total they may make a large contribution to disease susceptibility. Determining the shape of the distribution is key to understanding many other evolutionary processes that have direct implications for human wellbeing. Examples include the biology of complex traits, genomic decay rates due to Hill-Robertson effects, the maintenance of genetic variation, and the evolution of antibiotic resistance. At present, studies of the DFE have been limited by a lack of data. With the abundance of newly sequenced genomes of many strains of Saccharomyces cerevisiae and its nearest relative, Saccharomyces paradoxus, these yeasts offer an exceptional opportunity to make progress in this area.
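To make the definition concrete, the following is a minimal sketch of what a DFE looks like when written down as a probability distribution over selection coefficients. It is an illustration only and not the method proposed here: the gamma shape, the parameter values, the effective population size and the category boundaries are all assumptions chosen for the example, and the function names are hypothetical.

```python
import random


def sample_dfe(n, shape, mean_s, seed=0):
    """Draw n deleterious selection coefficients from a gamma-shaped DFE.

    The gamma form is an assumption made for illustration; the true shape
    of the DFE is exactly what is unknown.
    """
    rng = random.Random(seed)
    scale = mean_s / shape  # mean of a gamma distribution = shape * scale
    # Deleterious mutations reduce fitness, so s is reported as negative.
    return [-rng.gammavariate(shape, scale) for _ in range(n)]


def classify(s_values, ne, bounds=(1.0, 10.0)):
    """Return the fraction of mutations in each bin of scaled strength Ne*|s|.

    The bin boundaries are illustrative assumptions, not fixed conventions.
    """
    low, high = bounds
    counts = {"nearly_neutral": 0, "weak": 0, "strong": 0}
    for s in s_values:
        strength = ne * abs(s)
        if strength < low:
            counts["nearly_neutral"] += 1
        elif strength < high:
            counts["weak"] += 1
        else:
            counts["strong"] += 1
    total = len(s_values)
    return {name: count / total for name, count in counts.items()}


# Hypothetical parameter values, chosen only to exercise the functions.
effects = sample_dfe(10000, shape=0.3, mean_s=0.01)
fractions = classify(effects, ne=10000)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.3f}")
```

Changing the assumed shape parameter shifts the proportion of mutations that fall into the nearly neutral, weakly selected and strongly selected bins, which is why the shape of the distribution matters for estimating numbers and effects of mutations from limited data.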
I propose to expand bioinformatic and population genetics methodology, previously applied to human and Drosophila populations, to infer the shape of the DFE in yeast using publicly available sequence data. As with previous sequence analyses, this will allow the shape of the distribution for polymorphisms undergoing weak selection to be determined. Combined with abundant data regarding strongly selected polymorphisms and deletions in differing environments for S. cerevisiae, a more accurate distribution can be calculated. Population genetic analyses of the yeast data will allow an unprecedented level of information to be included in the inferences about the DFE, creating a benchmark estimate of the DFE. This will provide a platform for comparison with results from other systems, and will contribute to the development of methods that can be applied to human sequence data.