Leveraging the genome sequences of two Arabidopsis relatives for evolutionary and ecological genomics

Lead Research Organisation: University of Edinburgh
Department Name: Sch of Biological Sciences


The question of how changes in DNA sequence result in novel adaptations and in the formation of new species is at the heart of evolutionary biology, and approaches developed for studying natural populations are also important in studying evolutionary changes during crop domestication. Comparing the genomic DNA of related species does not by itself identify which sequence differences were selected. Statistical population genetic approaches using sequence differences between and within species can pinpoint regions of the DNA that may underlie adaptive changes. However, these techniques are only effective if the genome sequences to be compared are neither too similar nor too dissimilar, and few pairs of genome sequences suitable for the analyses are yet available in the animal or plant kingdoms, or even in fungi. The impending completion of the genome sequences of the two Brassicaceae Arabidopsis lyrata and Capsella rubella, together with the available genome sequence of A. thaliana, offers opportunities to study such questions in plant species at the intermediate evolutionary distances that are ideal for computational studies of evolutionary processes; these three species are also suitable for functional studies. We will exploit this resource by studying sequence evolution on a genome-wide scale and by studying the molecular basis of evolution in two well-characterized and ecologically relevant traits, flowering time and self-incompatibility (SI). We will first generate sequence alignments of all three species and compile all sets of orthologous genes, i.e. genes descended from the same gene in the species' common ancestor. As the genetic maps of the three species are known to be very similar, large orthologous stretches of genome can be identified. The alignments will immediately allow us to detect the especially interesting category of genes that are present in some species but absent in others, allowing study of genome evolution.
We will next estimate rates of synonymous sequence changes (those not changing the amino-acid sequence of the encoded proteins) and non-synonymous changes between pairs of genes present in each species pair. Comparing these rates across all genes can answer several important questions, including whether rates of non-synonymous substitutions are similar between genes or regions, and if not, whether variation is systematic across large genomic regions. Candidates for having evolved adaptively and/or contributed to speciation will be genes with unusually high rates of non-synonymous substitutions, relative to polymorphism levels within populations (which we shall estimate from a large set of loci to serve as 'controls'). Other interesting candidate genes can be identified from unusually high or low differentiation between natural populations. The sequence analyses will provide a foundation for functional studies of two adaptive traits, flowering time and SI. Genes affecting flowering time will be identified with two complementary approaches. First, variation in flowering time in naturally occurring A. lyrata populations will be correlated with sequence changes in orthologues of known A. thaliana flowering time regulators. Second, we will identify A. lyrata genomic regions with large effects on flowering time by genetic mapping, and then study candidate genes in these regions by manipulating their activity. We will use the self-incompatible species A. lyrata to study the transition to self-compatibility (SC) in some natural populations, and will do similar studies in C. rubella (SC) and its self-incompatible sister species C. grandiflora, including establishing an immortalized mapping population from a cross between the two species to map genes associated with SC/SI and with other traits of evolutionary significance, such as flower size.
Together, our studies are expected to answer several interesting evolutionary and genome evolution questions, and should also advance breeding programmes in crops.
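The distinction between synonymous and non-synonymous changes used above can be illustrated with a minimal Python sketch. It is not the project's method: the function name count_codon_differences and the example sequences are hypothetical, the input is assumed to be two already-aligned, in-frame coding sequences, and only codons differing at a single position are classified. It returns raw counts of differences; per-site rate estimators such as Nei-Gojobori additionally normalize by the numbers of synonymous and non-synonymous sites and correct for multiple substitutions.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def count_codon_differences(seq_a, seq_b):
    """Count synonymous and non-synonymous codon differences.

    Hypothetical helper: assumes two aligned, in-frame coding sequences of
    equal length. Codons with gaps, ambiguous bases, or more than one
    differing position are counted as skipped rather than classified.
    Returns (synonymous, non_synonymous, skipped).
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3 != 0:
        raise ValueError("sequences must be aligned, in frame, and of equal length")
    syn = nonsyn = skipped = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        n_diff = sum(x != y for x, y in zip(codon_a, codon_b))
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE or n_diff > 1:
            skipped += 1
        elif CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            syn += 1       # same amino acid: synonymous difference
        else:
            nonsyn += 1    # different amino acid: non-synonymous difference
    return syn, nonsyn, skipped


# Toy example: CTT -> CTC keeps leucine; GAA -> GAT changes Glu to Asp.
print(count_codon_differences("ATGCTTGAAGGT", "ATGCTCGATGGT"))
```

Skipping codons with multiple differences keeps the sketch unambiguous, at the cost of undercounting changes in more diverged genes.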

Technical Summary

The molecular bases of adaptation and species formation are fundamental questions in evolutionary biology, with relevance also to crop biology. Specifically, tools and approaches developed for natural populations are valuable for studying crop domestication. The impending completion of the genome sequences of the two Brassicaceae Arabidopsis lyrata and Capsella rubella, together with the available sequence of A. thaliana, offers opportunities to study adaptation using molecular evolutionary approaches complemented by functional analyses. Our consortium will exploit this resource to study variation underlying differences, including differences in two ecologically important traits, flowering time and self-incompatibility (SI). We will generate genomic sequence alignments of the three species and compile sets of orthologous genes. Estimates of non-synonymous substitutions per site (Ka) across the orthologous gene set and across syntenic genome regions, together with information on within-species variation, will identify genomic regions with unusually high Ka relative to synonymous divergence (Ks), and rapidly evolving sequences that may have undergone directional selection. Genes lost or gained are also interesting, and will be further analyzed. Genes contributing to flowering time variation will be isolated by two complementary approaches. First, natural phenotypic variation amongst A. lyrata populations will be compared with allele frequencies at orthologues of known A. thaliana flowering time regulators. Second, we will map QTL in A. lyrata and Capsella populations and analyze candidate genes. We will also use self-compatible populations of the usually self-incompatible A. lyrata, and of C. rubella with its self-incompatible, but inter-fertile sister C. grandiflora, to identify and characterize genes associated with the transition from SI to self-compatibility. Together, these studies will answer important evolutionary questions that are of strategic relevance to crop breeding.
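Comparing non-synonymous divergence with within-species variation, as described above, is often framed as a McDonald-Kreitman-style 2x2 contingency test. The Python sketch below is one such framing and is not taken from the project itself: the function name mk_test is hypothetical, the counts in the example are illustrative rather than project data, and the two-sided Fisher exact p-value is computed directly from the hypergeometric distribution so that the block needs only the standard library.

```python
from math import comb


def mk_test(dn, ds, pn, ps):
    """McDonald-Kreitman-style test on a 2x2 table (hypothetical helper).

    dn, ds: non-synonymous and synonymous fixed differences between species.
    pn, ps: non-synonymous and synonymous polymorphisms within a species.
    Returns (neutrality_index, two_sided_fisher_p). The neutrality index is
    (pn/ps) / (dn/ds); values below 1 indicate an excess of non-synonymous
    divergence. It is None when a required count is zero.
    """
    row_div = dn + ds            # all fixed differences
    row_poly = pn + ps           # all polymorphisms
    col_nonsyn = dn + pn         # all non-synonymous changes
    denom = comb(row_div + row_poly, col_nonsyn)

    def prob(k):
        # Hypergeometric probability that k of the non-synonymous changes
        # fall in the divergence row, given fixed table margins.
        return comb(row_div, k) * comb(row_poly, col_nonsyn - k) / denom

    k_min = max(0, col_nonsyn - row_poly)
    k_max = min(row_div, col_nonsyn)
    p_obs = prob(dn)
    # Two-sided p-value: sum over all tables no more probable than observed.
    p_value = sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    ni = (pn / ps) / (dn / ds) if ps and dn and ds else None
    return ni, min(1.0, p_value)


# Illustrative counts only: 7 non-synonymous and 17 synonymous fixed
# differences; 2 non-synonymous and 42 synonymous polymorphisms.
ni, p = mk_test(dn=7, ds=17, pn=2, ps=42)
print(f"neutrality index = {ni:.3f}, Fisher exact p = {p:.4f}")
```

In a genome-wide scan such a test would be run per gene, so the resulting p-values would need correction for multiple testing before any gene is treated as a candidate.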


Description I explained all this last year. The grant ended some time ago and there have been no new findings.
Exploitation Route They were used for other projects
Sectors Other

Description Published papers, research talks (seminars and meetings)
First Year Of Impact 2009
Sector Education,Environment
Impact Types Cultural

Description Analysis of Efficacy of Natural Selection on Codon Usage Bias in Selfing Plant Species 
Organisation University of Toronto
Country Canada 
Sector Academic/University 
PI Contribution We shared data, and the analyses were done by my group
Collaborator Contribution Data were contributed for our analysis
Impact Publications (Qiu, S, Zeng, K, Slotte, T, Wright, S & Charlesworth, D 2011, 'Reduced Efficacy of Natural Selection on Codon Usage Bias in Selfing Arabidopsis and Capsella Species', Genome Biology and Evolution, vol 3, pp. 868-880, doi:10.1093/gbe/evr085), talks at scientific meetings
Start Year 2010
Description Patterns of Polymorphism and Demographic History in Natural Populations of a plant 
Organisation University of California
Country United States 
Sector Academic/University 
PI Contribution We provided data and I helped write the paper
Collaborator Contribution They also provided data and analyses and helped write the paper
Impact Paper: Ross-Ibarra, J, Wright, SI, Foxe, JP, Kawabe, A, DeRose-Wilson, L, Gos, G, Charlesworth, D & Gaut, BS 2008, 'Patterns of Polymorphism and Demographic History in Natural Populations of Arabidopsis lyrata', PLoS One, vol 3, no. 6, e2411, doi:10.1371/journal.pone.0002411
Start Year 2007