Efficient computational technologies to resolve the Timetree of Life: from ancient DNA to species-rich phylogenies

Lead Research Organisation: University of Bristol
Department Name: Earth Sciences

Technical Summary

Bayesian methods are the state of the art for inferring evolutionary timescales, but they are computationally expensive and limited in their ability to analyse large datasets. This is a problem because sequencing consortia such as the Earth BioGenome or the Darwin Tree of Life, as well as ancient DNA sequencing studies, have released thousands of new genomes that cannot be analysed with the latest Bayesian methods. Asymptotic studies show that uncertainty in time estimates can be greatly reduced with genome-scale alignments. Precise evolutionary timelines based on large phylogenomic datasets, and on ancient DNA when available, are therefore key to reconstructing detailed speciation histories that can be correlated with the climatic and geological history of our planet. Efficient MCMC approaches that can handle the large forthcoming genomic datasets, and that can account for the specific patterns of degradation in ancient DNA, are urgently needed. In this project we will (i) implement models of sequence errors to account for DNA degradation in ancient DNA within the multi-species coalescent model with introgression, (ii) implement cross-bracing node calibrations based on gene duplications and lateral gene transfer events for dating species divergences in deep phylogenies, (iii) implement efficient MCMC samplers (based on mirror and HMC moves) for the analysis of species-rich phylogenomic datasets, and (iv) use our new methods to resolve evolutionary timelines in three species-rich phylogenomic datasets (hominids, tetrapods and eukaryotes). The new methods will be implemented in our popular BPP and MCMCtree software packages, which will be used by a wide range of academic beneficiaries seeking to estimate precise evolutionary timelines from the latest genomic datasets.
