Rooting the eukaryotic radiation with new models of gene and genome evolution

Lead Research Organisation: University of Bristol
Department Name: Biological Sciences


The origin of eukaryotes from their prokaryotic progenitors was one of the most formative transitions in the history of life, catalysing the blossoming of eukaryotic biodiversity into the astonishing range of forms we see today, from the largest organisms on our planet - blue whales, giant sequoias, fungal networks extending for miles underground - to microscopic plankton that jostle with bacteria in the world's oceans. Explaining the leap in cellular complexity during the prokaryote-to-eukaryote transition is one of the outstanding challenges in 21st-century biology.

The common structure of all eukaryotic cells testifies to their shared ancestry, but our understanding of the kind of cell that ancestral eukaryote was - where it lived, what it ate, the kinds of biochemical reactions it could perform - is in disarray. Whole-genome data have enabled us to resolve the more recent divergences in eukaryotic evolution, but we still have a very poor understanding of the deeper relationships between the main groups at the base of the evolutionary tree. In particular, the root of the tree - the starting point of the eukaryotic radiation - remains mired in controversy and debate.

The problem is that traditional rooting methods rely on the use of an outgroup: to find the root of the tree of mammals, for example, we might include birds in the analysis, and then use our a priori knowledge to place the root on the branch between the two groups. This approach breaks down when applied to the eukaryotic radiation: including our closest prokaryotic relatives greatly reduces the proportion of the eukaryotic genome that can be analysed, and the enormous evolutionary distance to the prokaryotic outgroup obscures the relationships among the different eukaryotic lineages. As a result, recent analyses of the eukaryotic root disagree strongly on its position, despite using similar datasets and analytical approaches.

In this project, we will tackle these difficulties head-on to definitively resolve the root of the eukaryotic tree by applying new outgroup-free rooting approaches, including some pioneered by members of the project team, to the most up-to-date, representative sampling of eukaryotic genomic diversity yet assembled. We will use the resulting phylogenomic framework to map the points in evolutionary history at which the unique cellular and genomic traits of modern eukaryotes first evolved, establishing a timescale for the evolution of key eukaryotic innovations. By mapping these traits onto the tree, we will reconstruct a detailed cellular and genomic model of the ancestral eukaryote - an organism which may have lived up to two billion years ago - in order to establish its lifestyle, ecology, and metabolism, and to test hypotheses of how that founding lineage gave rise to the staggering diversity of eukaryotic life we see today.

The work we are proposing is fundamental discovery science: the ultimate goal is to understand our own origins, to bring clarity to a poorly understood period in the history of life that is vitally important for making sense of the biodiversity we see around us today, and in doing so to establish a new state of the art for phylogenetic rooting with broad applicability to other major evolutionary transitions across the tree of life. But there is also real potential for broader socio-economic impact. Some of the groups that branch near the base of the eukaryotic tree are parasitic, and so establishing how these evolved from their free-living ancestors will provide new, much-needed insights into the adaptation of eukaryotic parasites such as Trypanosoma (sleeping sickness) and Giardia to their hosts. As part of the research programme, we will host summer internships for motivated students on biohacking (DIY computational biology), providing a taste of scientific discovery and teaching the crucial computational, statistical and scientific skills needed to identify and nurture the next generation of scientific leaders.

Planned Impact

The track record of the project team to date demonstrates our shared commitment to public outreach and testifies to our philosophy that academics have an important role to play in wider society. A recent highlight of PI Williams' public engagement programme was to co-organise a workshop on microbial evolution and antibiotic resistance at the 2013 British Science Festival in partnership with Corylus Learning, a leading educational company. We will maintain and develop our commitment to outreach and broad societal impact over the three-year period of this grant, and we envisage that the proposed research will directly benefit two main groups outside academia: members of the general public, and motivated secondary school and undergraduate students.

(i) Members of the general public: PI Williams' research to date on the evolution of eukaryotes and viruses has attracted broad public interest, and has been covered by a number of popular science outlets including National Geographic and Discover magazine. He already has extensive experience communicating the excitement of this work to non-specialists and the general public, from a workshop at the 2013 British Science Festival to participation in Google's Science Foo Camp 2014, which involved discussing and disseminating work on early evolution to journalists, artists and entrepreneurs. We will build on this experience with a series of public lectures on eukaryotic origins delivered as part of Bristol University's "Twilight Talks" programme, and the PDRA will deliver a Nature Live lecture on microbial eukaryotic diversity at the Natural History Museum. We will also coordinate with Bristol's Centre for Public Engagement to participate in major science festivals such as the Bristol Festival of Nature and the Cheltenham Science Festival, in order to communicate the joy of discovery science, and the particular appeal of origins research, to the broadest possible audience.

(ii) Motivated secondary school and undergraduate students: Biohacking, or do-it-yourself bioinformatics, is a hugely exciting but under-explored way to engage young people in science, teach essential computational, statistical and biological skills, and help identify and train the next generation of research leaders. I will host summer internships for four motivated students (two per year over the second and third years of the grant) to explore the potential of biohacking for elucidating the origin of key eukaryotic genes during the earliest period of their evolutionary history. Lab internships and work experience were an important part of my scientific development, and I will be delighted to provide the same opportunities to the next generation of young research leaders. These internships will be mutually beneficial, enabling the most talented students to develop their scientific skills and ideas while also providing them with a genuine opportunity to contribute to progress in this fundamental research area. The PDRA will directly supervise one of these students each year, which will also give the PDRA valuable supervisory experience as part of a broader training and career development programme.

We will maximise the reach and effectiveness of these impacts by engaging fully with NERC's new public engagement strategy - taking part in upcoming training opportunities when announced - and we will monitor the ongoing success of our impact programme throughout the funding period through benchmarking against a series of impact milestones, detailed in Pathways to Impact.


Title Phylogenetic relevance vector machine for comparative biology 
Description As part of analyses aimed at inferring ancestral states for eukaryotic cells, we have developed a new Bayesian method for phylogenetic regression that is inspired by techniques from machine learning. The tool (implemented as an R package, for which a publication is in preparation) uses automatic relevance determination (from sparse Bayesian learning) to weight the importance of a potentially large number of observed independent variables when modelling a dependent variable. We are using it to predict continuous features of early eukaryotes (such as optimal growth temperature and pH preferences), but the package could be applied much more broadly in comparative biology as an alternative (or complement) to traditional approaches such as phylogenetic generalised least squares. It may help to solve the problem of determining which variables, among a large set of possible candidates, are most important for predicting a particular output.
Type Of Material Improvements to research infrastructure 
Year Produced 2019 
Provided To Others? No  
Impact So far, we have used it to address research questions such as inference of the optimal growth temperature of the last universal common ancestor, and we are testing it by attempting to predict the (known) temperature and pH preferences of some modern taxa. 
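Illustrative sketch (not the project's R package): the automatic relevance determination idea described above can be shown with a generic sparse Bayesian linear regression. The code below is a minimal pure-Python version that assumes independent observations and therefore ignores the phylogenetic covariance that the actual method would need to model. The function names (invert_spd, ard_regression), the simulated data and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def invert_spd(a):
    """Invert a small symmetric positive-definite matrix (Gauss-Jordan, no pivoting)."""
    n = len(a)
    # Augment the matrix with the identity; the right half becomes the inverse.
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ard_regression(X, y, n_iter=50, alpha_cap=1e8):
    """Generic ARD linear regression (no phylogenetic correction; assumption).

    Each weight w_j has its own prior precision alpha_j. A large alpha_j
    shrinks w_j towards zero, marking predictor j as irrelevant.
    """
    n, p = len(X), len(X[0])
    xtx = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
           for i in range(p)]
    xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    alpha = [1.0] * p   # per-predictor prior precisions
    beta = 1.0          # noise precision
    mu = [0.0] * p
    for _ in range(n_iter):
        # Posterior covariance and mean of the weights given alpha and beta.
        A = [[beta * xtx[i][j] + (alpha[i] if i == j else 0.0)
              for j in range(p)] for i in range(p)]
        sigma = invert_spd(A)
        mu = [beta * sum(sigma[i][j] * xty[j] for j in range(p))
              for i in range(p)]
        # gamma_j measures how well the data determine weight j (0 to 1).
        gamma = [max(1.0 - alpha[i] * sigma[i][i], 1e-12) for i in range(p)]
        # Re-estimate the precisions; cap alpha for numerical stability.
        alpha = [min(gamma[i] / max(mu[i] ** 2, 1e-300), alpha_cap)
                 for i in range(p)]
        resid = sum((y[k] - sum(X[k][i] * mu[i] for i in range(p))) ** 2
                    for k in range(n))
        beta = (n - sum(gamma)) / resid
    return mu, alpha, beta


# Simulated data: only the first two of four predictors influence y.
rng = random.Random(0)
X = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(60)]
y = [2.0 * row[0] - 1.5 * row[1] + rng.gauss(0.0, 0.1) for row in X]

mu, alpha, beta = ard_regression(X, y)
for j in range(4):
    print(f"predictor {j}: weight={mu[j]:.3f} prior precision={alpha[j]:.3g}")
```

In this toy setting the two informative predictors should keep small prior precisions and weights close to their simulated values, while the two uninformative predictors should receive large precisions. This is the sense in which the approach ranks candidate variables by relevance.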
Description Bristol Dinosaur Project & Bristol Museum & Art Gallery - Workshop "Introduction to Palaeontology" to Primary School children. 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact "The Bristol Dinosaur Project is an outreach wing of the Earth Sciences Department of the University of Bristol, using palaeontology as an ambassador to communicate science to all ages. Taking students and staff from the university, active in their field of study, we communicate the latest developments and core science concepts through fun and engaging workshops and talks. We are working mostly with schools either through visits or in collaboration with the Bristol Museum & Art Gallery during discovery workshops."

I took part in two workshops with the BDP and the Bristol Museum in November 2018. The workshop is designed for primary school children, who discover palaeontology through four different activities led by BDP volunteers in the Natural History section of the Bristol Museum. The children have the opportunity to visit a cultural venue they might not otherwise be able to visit, engage with scientists and museum staff, and, of course, learn about biology and palaeontology. They are encouraged to ask questions and interact with the volunteers as much as they want. Most of them are very keen to participate, and the feedback from the teachers has been excellent, with requests for further bookings next year and promises to advertise the workshop to their colleagues. The BDP opens the doors of a national museum to them in a very privileged way, shows them that science and natural history are fascinating, and gives some of them a personal opportunity to interact with professional scientists, including a large number of women. The very diverse social and ethnic backgrounds of the children involved also make this a great opportunity for us to share our passion for science. I believe that the impact on the children, if not immediate, will be massive in the long term: the discovery that STEM subjects are interesting, museums are fun and scientists are open-minded will stick with them.
Year(s) Of Engagement Activity 2018
Description DigiLocal Coding Club - Introduction to coding & computational thinking in a community centre (Bedminster Library, Bristol, UK) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Public/other audiences
Results and Impact "DigiLocal® is supporting communities in running tech clubs for their young people.
These provide regular, positive activities with tech. It encourages problem solving and what's sometimes called 'computational thinking' in schools.
We work closely with community centres and schools for activities on a weekly basis. We mostly use Scratch and Python on Linux environments as these are easiest to pick up quickly, but both provide high levels of sophistication once you master the basics."
I took part in a DigiLocal coding club every Wednesday night between November 2018 and February 2019, working with the Bedminster community centre in Bristol, at the Bedminster Library. The activity consisted of guiding a group of about 10 to 15 children between 8 and 13 years old through the basics (and the less basic!) of coding, first with Scratch and then in Python. This activity is an amazing opportunity for children to discover the Linux environment and coding in a fun way. While the main goal is for them to experiment with "computational thinking", it is also a great time to engage with them on the importance of coding in my own professional practice and to talk to them about how STEM can be fun and useful. As the activity is a recurring one, I think that the willingness of the children to come back week after week, and for some of them to continue working on their projects at home, is a sign of success.
On another note, this experience was an opportunity for me to engage with children from different origins and social backgrounds, in a part of the city that the university does not reach easily. The group of children I was working with, although male-dominated, included several girls, a group that is hard to keep involved in STEM activities. I believe that my presence as a female volunteer had a positive impact on all the children, showing that coding is for everyone.
Year(s) Of Engagement Activity 2018