Efficient Bayesian phylogenomic dating with new models of trait evolution and rich diversities of living and fossil species

Lead Research Organisation: University of Bristol
Department Name: Earth Sciences

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Technical Summary

Timetrees provide much richer information about patterns of species diversification and their associations with past climate and major geological events. Bayesian relaxed-clock dating is the method of choice for deriving timetrees because it naturally integrates information from molecules and fossils. However, the method relies on stochastic MCMC sampling of the posterior distribution of times and rates and is computationally too expensive for large datasets. Recently, several large-scale genome sequencing projects have been announced, such as the 66,000 UK eukaryotic genomes project. This genomic revolution is now accompanied by a computed tomography (CT) revolution that is generating vast amounts of scan data for thousands of museum specimens. Thus, methods that can integrate the analysis of genomic data from high-throughput sequencing projects with trait data from CT scans are urgently needed. The main advantage of CT-scan data is that the rich diversities of fossil species in museum collections can now be integrated into the dating analysis, providing a more robust calibration of the molecular clock and increasing the amount of information in timetrees about past diversification events. The two main aims of this project are: (i) to improve the computational efficiency of MCMC sampling in timetree inference for large phylogenies and (ii) to develop new models of trait evolution for co-analysis of genomic and trait data. We will achieve (i) by developing new proposal algorithms to improve the mixing efficiency of the MCMC, and by improving the C code in the MCMCtree software through vectorisation and parallelisation. Our preliminary data indicate that we can reduce computing time 2- to 5-fold, bringing the analysis of thousands of species within reach. The new software and models will be tested in several high-profile real-data analyses.
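
To illustrate what "mixing efficiency" of an MCMC proposal means, the following is a minimal sketch and not the MCMCtree implementation. It assumes a toy one-dimensional target (a standard normal) and a symmetric uniform sliding-window proposal; the function names, the window widths and the use of lag-1 autocorrelation as a mixing measure are all choices made for this example. A window that is too narrow accepts almost every move but explores slowly, while one that is too wide rejects most moves; both give highly correlated samples.

```python
import math
import random


def log_target(x):
    """Log density of a standard normal, up to an additive constant (toy target)."""
    return -0.5 * x * x


def metropolis(n_steps, window, seed=1):
    """Random-walk Metropolis with a uniform sliding-window proposal.

    Returns (samples, acceptance_rate). Illustrative only.
    """
    rng = random.Random(seed)
    x = 0.0
    samples = []
    accepted = 0
    for _ in range(n_steps):
        # Symmetric proposal: x_new ~ Uniform(x - window/2, x + window/2)
        x_new = x + window * (rng.random() - 0.5)
        log_ratio = log_target(x_new) - log_target(x)
        # Accept with probability min(1, target(x_new) / target(x))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = x_new
            accepted += 1
        samples.append(x)
    return samples, accepted / n_steps


def lag1_autocorrelation(samples):
    """Lag-1 sample autocorrelation; values near 1 indicate poor mixing."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / n
    cov = sum(
        (samples[i] - mean) * (samples[i + 1] - mean) for i in range(n - 1)
    ) / n
    return cov / var


if __name__ == "__main__":
    # Compare a narrow, a moderate and a very wide proposal window.
    for window in (0.1, 2.5, 50.0):
        samples, rate = metropolis(20000, window)
        print(window, round(rate, 2), round(lag1_autocorrelation(samples), 2))
```

Running the script prints, for each window width, the acceptance rate and the lag-1 autocorrelation, so the trade-off between step size and mixing can be inspected directly.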

Planned Impact

We will implement the methods and algorithms to be developed in this project in the MCMCTREE program in the PAML software package, and distribute it at its web site, free of charge to academics. We aim to disseminate our new models and software as required, in accordance with the Data Driven Biology and System Approaches to the Biosciences BBSRC priority areas. In particular, our new software will allow the analysis of very large datasets from complex phylogenetic ensembles. We will champion the integration of rich fossil diversities with high-throughput sequencing data to infer evolutionary timelines in large phylogenies, thus providing the tools urgently required to analyse the rapidly growing amounts of sequence and phenotypic trait data now available.

We will attend national and international meetings to present our research results. Methodological advances will be disseminated in this way, as well as through teaching in the world-leading MSc Palaeobiology at Bristol, and the advanced workshop on Computational Molecular Evolution (funded by the Wellcome Trust and EMBO) that is organized and co-instructed by Yang. These courses will provide much-needed training to our academic beneficiaries on how to use our software and models. We will apply for funds from the Royal Society to run a 2-day Discussion Meeting in London (which is open to scientists and the general public) and an associated satellite workshop at the Royal Society's Chicheley Hall. The focus will be on integrating biological and geological timescales to elucidate the co-evolution of Earth and Life. The Chicheley Hall workshop will aim to train evolutionary biologists, bioinformaticians, palaeontologists and Earth System modellers to conduct molecular clock dating (and Earth System Modelling) using cutting-edge methods, showcasing the new models and new algorithms to be developed in this project. We will research and design a schools outreach module on the tree of life, evolutionary timescales and evolutionary rates, to be delivered through the GeoBus and Bristol Dinosaur Project STEM engagement projects, and will make the teaching materials freely available to science teachers. We will engage the broader public in our science and its deliverables through a science-art collaboration, achieved by hosting an artist in residence at Bristol University and a touring display of their work.

Publications

Title Data from: Evolution of fungal phenotypic disparity 
Description Organismal grade multicellularity has been achieved only in animals, plants, and fungi. All three kingdoms manifest phenotypically disparate body plans, but their evolution has only been considered in detail for animals. Here we seek to test the general relevance of hypotheses on the evolution of animal body plans by characterising the evolution of fungal phenotypic variety (disparity). The distribution of living fungal form is defined by four distinct morphotypes: flagellated, zygomycetous, sac-bearing, and club-bearing. The discontinuity between morphotypes is a consequence of the extinction of phylogenetic intermediates, indicating that a complete record of fungal disparity would present a much more homogeneous distribution of form. Fungal phenotypic variety gradually expands through time for the most part but sharply increases with the emergence of multicellular body plans. Simulations show these temporal trends to be decidedly non-random, and at least partially shaped by hierarchical contingency. Fungal phenotypic distance is decoupled from changes in gene number, genome size, and taxonomic diversity. Only differences in organismal complexity, the number of traits that constitute an organism, at the cellular and multicellular levels present a meaningful relationship with fungal disparity. Both animals and fungi exhibit a gradual increase in disparity through time, resulting in distributions of form made discontinuous by the extinction of phylogenetic intermediates. These congruences hint at a common mode of multicellular body plan evolution. 
Type Of Material Database/Collection of data 
Year Produced 2022 
Provided To Others? Yes  
URL http://datadryad.org/stash/dataset/doi:10.5061/dryad.wwpzgmsm9