Architects of genomic change: the evolutionary dynamics of transposable elements

Lead Research Organisation: University of Exeter
Department Name: Biosciences

Abstract

Recently there have been great breakthroughs in computing and molecular biology. In combination, these have led to a vastly improved ability to generate and analyse large volumes of genetic data. Consequently, near-complete genome sequences are now available for a large variety of organisms. This genomic revolution has revealed many fascinating insights, but one of the most unexpected relates to the abundance of transposable elements (TEs) discovered within the genome.

TEs are short DNA sequences with the ability to move around in the genome via a process called transposition. Because of this property, TEs are sometimes referred to as jumping genes. Other names applied to TEs are selfish DNA, parasitic DNA, or even junk DNA, reflecting their perceived lack of contribution to host fitness. To become fixed in the host's evolutionary lineage, TEs must invade the host germline (i.e. reproductive cells). This has been occurring over great evolutionary periods, leading to the abundance of TE sequences observable in sequenced genomes, the majority of which exist as genomic fossils that have become inactivated due to an accumulation of mutations.

Recently, it has emerged that TE sequences have been repeatedly utilised by host genomes for their own purposes during evolution. Indeed, it appears that TEs have played a significant role in the evolution of host genomic complexity via various mechanisms, including direct acquisition of coding sequence, genomic rearrangement, and gene regulatory modification.

Despite the widespread abundance of TEs and their important evolutionary contributions across the diversity of life, many questions concerning TE biology remain unanswered. However, the wealth of recently sequenced genomes now provides an exciting opportunity to perform novel large-scale systematic analyses of TE evolution to elucidate poorly understood aspects of TE biology. In this proposal I will undertake such an analysis to address the following four important aims:

1 Evolution of the LTR retrotransposons. A particularly diverse and abundant group of TEs with significant impacts on the genomes of a great diversity of organisms are the Long Terminal Repeat (LTR) retrotransposons. Until recently, it was very difficult to estimate evolutionary relationships in this group for methodological reasons, constraining advances. However, I have developed a new method to overcome this problem, offering the possibility to estimate evolutionary history and address questions of key significance in the group, which also includes highly important vertebrate viruses such as HIV.

2 Persistence of TEs in the genome. A major question is how active selfish elements persist in host genomes, while having no direct selective benefit to the host. I will quantify patterns in the proliferation of TEs and their spread across host diversity to elucidate this long-standing problem.

3 Transposable elements and the evolution of host genomic complexity. I will explore the features that predispose TEs to being harnessed for host purposes, and examine how TEs interact to contribute to host genomic complexity.

4 Role of transposable elements in speciation. Hosts can evolve resistance mechanisms against TEs, but recently invading TEs are typically able to replicate more freely. Consequently, hybrids between two diverging lineages are predicted to suffer negative fitness consequences from increased TE activity caused by poor repression of TEs, which in turn reinforces reduced gene flow. I will test these ideas to explore the role of TEs as promoters of speciation.

Study of the LTR retrotransposons offers an opportunity to provide insights relevant to combating disease, since the group contains infectious viruses such as HIV. Meanwhile, given the widespread utilisation of TE sequences for diverse host purposes during evolution, greater knowledge of TE biology will provide insights of potential applied and medical benefit more widely.

Technical Summary

Genomic data have revealed the great extent to which transposable elements (TEs) have infiltrated eukaryotic genomes. Concurrently, a major shift in perspective has occurred, from a view of TEs as mere junk or parasitic DNA to recognition of the considerable roles they have played in the evolution of host genomic complexity. However, understanding of many fundamental aspects of TE biology remains relatively poor. In particular, systematic analyses based within a robust evolutionary framework are required to elucidate broad-scale TE evolutionary dynamics. I will capitalise on the recent accumulation of eukaryotic genomes, in combination with a novel phylogenetic approach I have developed, to examine the following four important areas of TE biology:

1 Phylogeny and evolution of LTR retrotransposons: I will address outstanding questions of key significance concerning evolution of long terminal repeat TEs that originate from the family Retroviridae, which includes highly important vertebrate viruses such as HIV, and the closely related family Metaviridae.

2 Dynamics of transposable element persistence: A major question in TE biology is how active selfish elements persist in host genomes, while having no direct selective benefit to the host. I will quantify patterns in TE proliferation and host usage to elucidate this long-standing problem.

3 Transposable elements and the evolution of host genomic complexity: I will explore the features that predispose TEs to being harnessed for host purposes, and examine how TEs interact to contribute to host genomic complexity.

4 Role of transposable elements in speciation: The role of TEs in speciation remains relatively untested. Current developments in genomics offer an opportunity to test the broad hypothesis that gene flow between host lineages is associated with a reduced capacity for TE repression, reinforcing host reproductive isolation and promoting speciation.

Planned Impact

The proposed project will increase understanding of the biology of transposable elements (TEs). This is a topic of considerable interest to scientists and the general public, and contains substantial promise for a wide range of applications. As a result, findings from the project will contribute towards the knowledge and understanding required to move towards a bio-based economy.

TEs make up a large proportion of the human genome, and are implicated in a wide range of diseases as well as in normal functioning. Consequently, the findings of this project hold great potential health relevance. In addition, further significance comes from the importance of retroviral disease, and the current vast global HIV epidemic. Research questions examining retroviral evolution, host usage, and the retroviral envelope gene may lead to new insights into retrovirus biology that could be used in new forms of treatment and drug development. Thus, research findings from this project have the capacity to enhance both health and quality of life.

Results from this project also have the potential to foster economic performance and contribute to the economic competitiveness of the UK. These prospective contributions come predominantly from two sectors: agriculture and biotechnology. TEs are implicated in a wide range of phenotypic traits, a considerable number of which probably influence production traits in domesticated animals. Identifying such inserts offers great scope for improving yields and adding to competitiveness in the farming sector. Meanwhile, a range of retroviruses exert harmful effects on agricultural livestock including poultry, cattle, and sheep. Conducting research that may lead to new methods to control these diseases carries benefits for animal welfare, improving yield and economic margins, and bolstering the resilience of the farming industry. Additionally, novel research on TEs may directly contribute to the development of new tools and methods in the industrial biosciences, which are currently a global growth area.

Furthermore, as set out in the pathways to impact, considerable efforts will be made to disseminate research findings among the public, thus fostering enthusiasm and understanding, and a general fluency in science and technology.

Publications

Blasco-Costa I (2021) Next-generation cophylogeny: unravelling eco-evolutionary processes in Trends in Ecology & Evolution

Castledine M (2022) Greater Phage Genotypic Diversity Constrains Arms-Race Coevolution. in Frontiers in cellular and infection microbiology

Galbraith JD (2023) The influence of transposable elements on animal colouration. in Trends in genetics : TIG

Hayward A (2017) Origin of the retroviruses: when, where, and how? in Current opinion in virology

Hayward A (2022) Transposable elements. in Current biology : CB

Lear L (2022) Bacterial colonisation dynamics of household plastics in a coastal environment. in The Science of the total environment

Mackintosh A (2022) The genome sequence of the scarce swallowtail, Iphiclides podalirius in G3 Genes|Genomes|Genetics

Mackintosh A (2019) The determinants of genetic diversity in butterflies in Nature Communications

Panini M (2021) Transposon-mediated insertional mutagenesis unmasks recessive insecticide resistance in the aphid Myzus persicae. in Proceedings of the National Academy of Sciences of the United States of America

 
Description Assessing the role of transposable elements in the evolution of host genomic complexity
Amount £0 (GBP)
Funding ID 2072124 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 10/2018 
End 09/2022
 
Description Evaluating the contribution of transposons to agricultural domestication
Amount £0 (GBP)
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 09/2020 
End 09/2024
 
Title The Earl Grey Transposable Element Annotation Pipeline 
Description A transposable element annotation pipeline for annotating repeats in genome data, designed to improve on current gold standard approaches, and be easily usable for non-specialists. 
Type Of Material Technology assay or reagent 
Year Produced 2022 
Provided To Others? Yes  
Impact My PhD student Toby Baril handles the GitHub page, and has told me the program has been downloaded thousands of times, and that he regularly receives emails from users. We have not received many citations yet, but we are planning to submit a manuscript on the method to a major journal in the next couple of months, and hopefully users will cite this. The method outperforms all other tools in comparative tests (even commercial tools). We are also planning to release the 'TE-strainer' tool shortly, which identifies repetitive host genes annotated as TEs in genome annotations. This fills a key methodological gap that currently leads to problems (host genes labelled as TEs) in user-uploaded data in online reference databases such as Dfam.
URL https://github.com/TobyBaril/EarlGrey
 
Title Dataset for ELE_EV_ELE13757 
Description Methods: A literature search was performed using Google Scholar on 19th March 2019, which identified 368 citations of the original paper for TreeMap (Page 1994), and 332 citations of the original paper for ParaFit (Legendre et al. 2002), resulting in a total of 700 articles that were screened to extract metrics for inclusion in our meta-analysis. Articles that did not contain cophylogenetic analyses were immediately excluded. Studies focussing at the population level were also excluded, as these do not represent true cophylogenetic analyses at the macroevolutionary level. Additionally, studies that included fewer than four taxa were excluded from consideration, as these do not provide sufficient power for inclusion in the meta-analysis. Studies that did not report the test statistic for congruence were also necessarily excluded. A short citation of each study was recorded under 'authors', and the year of publication was recorded in 'year'. Hosts and symbionts were classified broadly according to Linnean taxonomy for 'host_tax_broad' and 'symbiont_tax_broad' as either: invertebrate, vertebrate, plant or microbe (i.e. microscopic symbionts such as fungi, protozoa, bacteria, viruses). We adopted the mode of symbiosis and mode of transmission between host species specified by the authors in each individual study for 'symbiosis' and 'mode_of_transmission_broad'. In cases where either mode of symbiosis or mode of transmission were not directly specified by authors, we consulted the literature for clarification. In a small number of studies restricted to bacterial intracellular symbionts, the mutualism-parasitism distinction was not defined by the authors and either no further information was available, or a symbiont was cited in the literature as being either a mutualist or a parasite, depending on which study was considered.
The nature of the relationship between bacterial intracellular symbionts and their hosts is complex, and in some cases they may display both beneficial and detrimental effects simultaneously. In a few cases of conflict or where authors did not explicitly state mode of transmission for bacterial intracellular symbionts, we assumed a mode of transmission in line with the majority of available references. We only encountered one study where authors categorised the mode of symbiosis as commensalism. On the continuum of symbioses from pure parasitism (fitness losses for the host) to mutualism (fitness gains for the host), commensalism represents a single point where losses and gains for the host precisely equal zero. Consequently, commensalism is an unlikely and unstable state, easily tipped to one side or the other with any small change in external conditions. Thus, the lack of widely recognized groups of commensals is the likeliest explanation for the scarcity of studies on commensalism in our data (note that we did not include this category, commensalism, in our analyses). The total number of host tips that were linked to a symbiont taxon were summed to provide 'host_tips_linked', which in a very few cases was corrected to remove multiple sampling of the same host species, to provide 'host_tips_linked_corrected'. The total number of symbiont tips with a link to a host taxon were summed to provide 'symbiont_tips_linked', while the total number of individual links between hosts and symbionts was recorded as 'total_host_symbiont_links'. If all symbionts in a phylogeny were strict specialists, such that each one had a single link to a single host, 'total_host_symbiont_links' would simply equal 'symbiont_tips_linked'. However, because symbionts are often associated with more than one host, the value of 'total_host_symbiont_links' was often higher than the total number of symbionts included in a study. 
Thus, a measure of symbiont generalism was captured using 'host_range_link_ratio', defined as 'total_host_symbiont_links' divided by 'symbiont_tips_linked', providing the mean number of host-symbiont links observed per symbiont taxon, with the measure increasing with increasing generalism. An alternative estimate of symbiont host specificity was captured using 'host_range_taxonomic_breadth', which considers Linnean taxonomic rank, and was calculated by assigning an incremental score to successive host taxonomic ranks per symbiont in turn (i.e. single host species = 1, multiple host species in the same genus = 2, multiple host genera = 3, multiple host families = 4, multiple host orders = 5), summing the total score across all symbionts, and dividing by 'symbiont_tips_linked' (i.e. the total number of symbionts). Consequently, 'host_range_taxonomic_breadth' increases with symbiont generalism, such that symbiont phylogenies containing symbionts capable of infecting hosts from a wide range of taxonomic ranks are assigned a greater score. The number of phylogenetic permutations performed by authors during cophylogenetic analyses was recorded as 'no_randomizations', which poses a unique problem in our meta-analysis (discussed in the section 'Publication bias and sensitivity analysis'). The resultant p value from each study was recorded as 'p_value', whereby observed p values decrease with a decreasing likelihood of observing host-symbiont cophylogeny by chance alone (i.e., as calculated during permutation tests performed by authors during TreeMap or ParaFit analyses). File '2021-09-01-source-data-dat.txt' is in tab-delimited text format. File 'Supporting_Information.Rmd' is accompanying R code used for analysis of the source data.
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://figshare.com/articles/dataset/Dataset_for_ELE_EV_ELE13757/14393309
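
The two host-range measures defined in the dataset description above can be illustrated with a minimal Python sketch. The function names mirror the column names given in the description, and the rank scores are those listed there. The dictionary keys used to label taxonomic ranks and the example values are hypothetical, chosen only for illustration. This is not the R code distributed in 'Supporting_Information.Rmd'.

```python
# Scores per symbiont, as listed in the dataset description.
# The key names are hypothetical labels introduced for this sketch.
BREADTH_SCORES = {
    "single_species": 1,
    "multiple_species_same_genus": 2,
    "multiple_genera": 3,
    "multiple_families": 4,
    "multiple_orders": 5,
}


def host_range_link_ratio(total_host_symbiont_links, symbiont_tips_linked):
    """Mean number of host-symbiont links per symbiont taxon."""
    if symbiont_tips_linked <= 0:
        raise ValueError("symbiont_tips_linked must be positive")
    return total_host_symbiont_links / symbiont_tips_linked


def host_range_taxonomic_breadth(breadth_per_symbiont):
    """Sum of rank scores across symbionts, divided by the number of symbionts."""
    if not breadth_per_symbiont:
        raise ValueError("at least one symbiont is required")
    scores = [BREADTH_SCORES[label] for label in breadth_per_symbiont]
    return sum(scores) / len(scores)


# Hypothetical study: four symbionts with 1, 1, 2 and 4 host links.
links_per_symbiont = [1, 1, 2, 4]
ratio = host_range_link_ratio(sum(links_per_symbiont), len(links_per_symbiont))

# Hypothetical host ranges for the same four symbionts.
breadth = host_range_taxonomic_breadth([
    "single_species",
    "single_species",
    "multiple_species_same_genus",
    "multiple_orders",
])

print(ratio, breadth)
```

If every symbiont were a strict specialist with a single host, both measures would equal 1, and both increase as symbionts become more generalist.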
 
Title Dataset for ELE_EV_ELE13757 
Description Methods: A literature search was performed using Google Scholar on 19th March 2019, which identified 368 citations of the original paper for TreeMap (Page 1994), and 332 citations of the original paper for ParaFit (Legendre et al. 2002), resulting in a total of 700 articles that were screened to extract metrics for inclusion in our meta-analysis. Articles that did not contain cophylogenetic analyses were immediately excluded. Studies focussing at the population level were also excluded, as these do not represent true cophylogenetic analyses at the macroevolutionary level. Additionally, studies that included fewer than four taxa were excluded from consideration, as these do not provide sufficient power for inclusion in the meta-analysis. Studies that did not report the test statistic for congruence were also necessarily excluded. A short citation of each study was recorded under 'authors', and the year of publication was recorded in 'year'. Hosts and symbionts were classified broadly according to Linnean taxonomy for 'host_tax_broad' and 'symbiont_tax_broad' as either: invertebrate, vertebrate, plant or microbe (i.e. microscopic symbionts such as fungi, protozoa, bacteria, viruses). We adopted the mode of symbiosis and mode of transmission between host species specified by the authors in each individual study for 'symbiosis' and 'mode_of_transmission_broad'. In cases where either mode of symbiosis or mode of transmission were not directly specified by authors, we consulted the literature for clarification. In a small number of studies restricted to bacterial intracellular symbionts, the mutualism-parasitism distinction was not defined by the authors and either no further information was available, or a symbiont was cited in the literature as being either a mutualist or a parasite, depending on which study was considered.
The nature of the relationship between bacterial intracellular symbionts and their hosts is complex, and in some cases they may display both beneficial and detrimental effects simultaneously. In a few cases of conflict or where authors did not explicitly state mode of transmission for bacterial intracellular symbionts, we assumed a mode of transmission in line with the majority of available references. We only encountered one study where authors categorised the mode of symbiosis as commensalism. On the continuum of symbioses from pure parasitism (fitness losses for the host) to mutualism (fitness gains for the host), commensalism represents a single point where losses and gains for the host precisely equal zero. Consequently, commensalism is an unlikely and unstable state, easily tipped to one side or the other with any small change in external conditions. Thus, the lack of widely recognized groups of commensals is the likeliest explanation for the scarcity of studies on commensalism in our data (note that we did not include this category, commensalism, in our analyses). The total number of host tips that were linked to a symbiont taxon were summed to provide 'host_tips_linked', which in a very few cases was corrected to remove multiple sampling of the same host species, to provide 'host_tips_linked_corrected'. The total number of symbiont tips with a link to a host taxon were summed to provide 'symbiont_tips_linked', while the total number of individual links between hosts and symbionts was recorded as 'total_host_symbiont_links'. If all symbionts in a phylogeny were strict specialists, such that each one had a single link to a single host, 'total_host_symbiont_links' would simply equal 'symbiont_tips_linked'. However, because symbionts are often associated with more than one host, the value of 'total_host_symbiont_links' was often higher than the total number of symbionts included in a study. 
Thus, a measure of symbiont generalism was captured using 'host_range_link_ratio', defined as 'total_host_symbiont_links' divided by 'symbiont_tips_linked', providing the mean number of host-symbiont links observed per symbiont taxon, with the measure increasing with increasing generalism. An alternative estimate of symbiont host specificity was captured using 'host_range_taxonomic_breadth', which considers Linnean taxonomic rank, and was calculated by assigning an incremental score to successive host taxonomic ranks per symbiont in turn (i.e. single host species = 1, multiple host species in the same genus = 2, multiple host genera = 3, multiple host families = 4, multiple host orders = 5), summing the total score across all symbionts, and dividing by 'symbiont_tips_linked' (i.e. the total number of symbionts). Consequently, 'host_range_taxonomic_breadth' increases with symbiont generalism, such that symbiont phylogenies containing symbionts capable of infecting hosts from a wide range of taxonomic ranks are assigned a greater score. The number of phylogenetic permutations performed by authors during cophylogenetic analyses was recorded as 'no_randomizations', which poses a unique problem in our meta-analysis (discussed in the section 'Publication bias and sensitivity analysis'). The resultant p value from each study was recorded as 'p_value', whereby observed p values decrease with a decreasing likelihood of observing host-symbiont cophylogeny by chance alone (i.e., as calculated during permutation tests performed by authors during TreeMap or ParaFit analyses). File '2021-09-01-source-data-dat.txt' is in tab-delimited text format. File 'Supporting_Information.Rmd' is accompanying R code used for analysis of the source data.
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://figshare.com/articles/dataset/Dataset_for_ELE_EV_ELE13757/14393309/1
 
Title Supporting data for "A draft genome sequence of the elusive giant squid, Architeuthis dux" 
Description The giant squid (Architeuthis dux; Steenstrup, 1857) is an enigmatic giant mollusk with a circumglobal distribution in the deep ocean, except in the high Arctic and Antarctic waters. The elusiveness of the species makes it difficult to study. Thus, having a genome assembled for this deep-sea dwelling species will help to address several outstanding evolutionary questions. We present a draft genome assembly that includes 200 Gb of Illumina reads, 4 Gb of Moleculo synthetic long-reads and 108 Gb of Chicago libraries, with a final size matching the estimated genome size of 2.7 Gb, and a scaffold N50 of 4.8 Mb. We also present an alternative assembly including 27 Gb raw reads generated using the Pacific Biosciences platform. In addition, we sequenced the proteome of the same individual and RNA from three different tissue types from three other squid species (Onychoteuthis banksii, Dosidicus gigas, and Sthenoteuthis oualaniensis) to assist genome annotation. We annotated 33,406 protein coding genes supported by evidence and the genome completeness estimated by BUSCO reached 92%. Repetitive regions cover 49.17% of the genome. This annotated draft genome of A. dux provides a critical resource to investigate the unique traits of this species, including its gigantism and key adaptations to deep-sea environments.
Type Of Material Database/Collection of data 
Year Produced 2020 
Provided To Others? Yes  
URL http://gigadb.org/dataset/100676
 
Description Collaboration with 10X Genomics 
Organisation 10X Genomics, Inc
Country United States 
Sector Private 
PI Contribution We provided specimens
Collaborator Contribution They arranged for library preparations to be made ahead of the queue and for free, and subsequently assembled the resultant data.
Impact Four butterfly draft genome sequences
Start Year 2017
 
Description Collaboration with Charlie Cornwallis at the University of Lund 
Organisation Lund University
Country Sweden 
Sector Academic/University 
PI Contribution Annotation of transposable elements in genomes sampled from across algal diversity
Collaborator Contribution Sequencing a large number of algal genomes
Impact Comparative genomics paper planned
Start Year 2020
 
Description Collaboration with Dr Jon Mulley at Bangor University 
Organisation Bangor University
Country United Kingdom 
Sector Academic/University 
PI Contribution Collaborations on gerbil genome evolution (submitted), and snake genome evolution (in prep.)
Collaborator Contribution Comparative transposon analyses
Impact One preprint (submitted to a journal), one manuscript in prep.
Start Year 2022
 
Description Collaboration with Dr Pablo Orozco Ter Wengel at Cardiff University 
Organisation Cardiff University
Country United Kingdom 
Sector Academic/University 
PI Contribution I coordinated the BBSRC DTP application.
Collaborator Contribution Dr Ter Wengel is the second supervisor of my BBSRC DTP student Ryan Biscocho. He will assist in analyses of the role of transposons in livestock domestication.
Impact Planned: Biscocho ER, Baril T, Orozco-Terwengel P, Hui JHL, Ferrier DEK, Hayward A. (In preparation for Molecular Biology and Evolution) The influence of transposable elements on Hox gene evolution in molluscs
Start Year 2020
 
Description Collaboration with Professor Chris Bass at Exeter University 
Organisation University of Exeter
Country United Kingdom 
Sector Academic/University 
PI Contribution Professor Bass and I collaborate on questions relating to the evolution of insecticide resistance in insects, specifically in relation to aphids and the role that transposable elements play in this capacity.
Collaborator Contribution Planning and conducting the majority of the research
Impact Published: ----Dupeyron M, Singh KS, Bass C, Hayward A (2019) Evolution of Mutator transposable elements across eukaryotic diversity. Mobile DNA, 10, 12. ----Dupeyron M, Baril T, Bass C, Hayward A (2020) An evolutionary analysis of the Tc1-mariner superfamily reveals the unexplored diversity of pogo-like elements. Mobile DNA, 11, 21. ----Singh KS, Troczka BJ, Duarte A, Balabanidou V, Trissi N, Paladino LZC, Nguyen P, Zimmer CT, Papapostolou K, Randall E, Mallott V, Marec F, Mazzoni E, Williamson M, Hayward A, Nauen R, Vontas J, Bass C (2020) The genetic architecture of a host shift: an adaptive walk protected an aphid and its endosymbiont from plant chemical defences. Science Advances, 6, eaba1070. ----In review: ----Singh KS, Cordeiro EMG, Troczka BJ, Pym A, Mackisack J, Mathers TC, Duarte A, Legeai F, Robin S, Bielza P, Burrack HJ, Charaabi K, Denholm I, Figueroa CC, ffrench-Constant RH, Jander G, Margaritopoulos JT, Mazzoni E, Nauen R, Ren G, Stepanyan I, Umina PA, Voronova NV, Vontas J, Williamson M, Wilson ACC, Xi-Wu G, Youn Y-N, Zimmer CT, Simon J-C, Hayward A, Bass C (In resubmission at Communications Biology) Global patterns in genomic diversity reveal the molecular and ecological processes underpinning the evolution of insecticide resistance in the crop pest Myzus persicae. ----Panini M, Chiesa O, Troczka BJ, Mallott M, Manicardi GC, Cassanelli S, Cominelli F, Hayward A, Mazzoni E, Bass C (Submitted to PNAS) Silencing susceptibility: transposon-mediated insertional mutagenesis unmasks recessive insecticide resistance. ----In preparation: ----Troczka BJ, Hayward A, Bass C (In preparation for Pest Management Science) Molecular innovations underlying resistance to natural and synthetic xenobiotics in Myzus persicae. ----Planned: ----Baril T, Bass C, Hayward A (In preparation for Molecular Biology and Evolution) Population genomics of transposable elements for 100 aphid genomes. ----Baril T, Singh KS, Bass C, Hayward A. A comparative analysis of aphid transposable elements: the impact of transposons on the evolution of host processes under strong selective pressure.
Start Year 2017
 
Description Collaboration with Professor Juan Antonio Balbuena and Dr Isa Blasco at the University of Valencia and Muséum d'histoire naturelle Genève 
Organisation Natural History Museum of Geneva
Country Switzerland 
Sector Public 
PI Contribution I contributed to writing of the review manuscript
Collaborator Contribution My partners led on this review project, securing an invitation to submit to the prestigious journal Trends in Ecology and Evolution
Impact In preparation: Blasco-Costa I, Hayward A, Poulin R, Balbuena JA (In preparation for Trends in Ecology and Evolution) Next-generation cophylogeny: integrating eco-evolutionary interactions.
Start Year 2020
 
Description Collaboration with Professor Juan Antonio Balbuena and Dr Isa Blasco at the University of Valencia and Muséum d'histoire naturelle Genève 
Organisation University of Valencia
Country Spain 
Sector Academic/University 
PI Contribution I contributed to writing of the review manuscript
Collaborator Contribution My partners led on this review project, securing an invitation to submit to the prestigious journal Trends in Ecology and Evolution
Impact In preparation: Blasco-Costa I, Hayward A, Poulin R, Balbuena JA (In preparation for Trends in Ecology and Evolution) Next-generation cophylogeny: integrating eco-evolutionary interactions.
Start Year 2020
 
Description Collaboration with Professor Robert Poulin and Professor Shinichi Nakagawa at University of Otago and University of New South Wales 
Organisation University of New South Wales
Country Australia 
Sector Academic/University 
PI Contribution I collected the data, contributed to study design, and wrote the manuscript.
Collaborator Contribution Project design, meta-analysis
Impact Hayward A, Poulin R, Nakagawa S (In resubmission at Ecology Letters) A broadscale test of host-symbiont cophylogeny reveals the key drivers of phylogenetic congruence. This is the first ever quantitative evaluation of the extent to which symbionts codiverge with their hosts, which is a major mechanism underlying global biodiversity, with general connotations for host-pathogen evolution, such as host-shifts.
Start Year 2013
 
Description Collaboration with Professor Robert Poulin and Professor Shinichi Nakagawa at University of Otago and University of New South Wales 
Organisation University of Otago
Country New Zealand 
Sector Academic/University 
PI Contribution I collected the data, contributed to study design, and wrote the manuscript.
Collaborator Contribution Project design, meta-analysis
Impact Hayward A, Poulin R, Nakagawa S (In resubmission at Ecology Letters) A broadscale test of host-symbiont cophylogeny reveals the key drivers of phylogenetic congruence. This is the first ever quantitative evaluation of the extent to which symbionts codiverge with their hosts, which is a major mechanism underlying global biodiversity, with general connotations for host-pathogen evolution, such as host-shifts.
Start Year 2013
 
Description Collaboration with the Wellcome Sanger Institute via the Darwin Tree of Life Project 
Organisation The Wellcome Trust Sanger Institute
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution Provision of butterfly samples, advice for genomic sequencing, and analysis of transposons in butterfly genomes, as part of the wider Darwin Tree of Life initiative.
Collaborator Contribution The Wellcome Sanger Institute, via Professor Mark Blaxter, is sequencing British butterfly and moth genomes to very high quality. UPDATE 2021: I participated in the DToL Lep2020 group meeting, giving an oral presentation on methods for repeat annotation and analysis in Lepidoptera. I expect to provide the repeat analysis for the flagship Lepidoptera 100 genome study, which will lead results from the Darwin Tree of Life Initiative.
Impact Open-access butterfly genomes for the scientific community. A significant increase in transposon analyses for butterfly genomes. UPDATE 2021: A publication on the Lepidoptera 100 genomes project is planned, with likely companion papers exploring specific aspects, such as detailed repeat analyses.
Start Year 2019
 
Description Collaboration with Professor Jerome Hui at Chinese University of Hong Kong
Organisation Chinese University of Hong Kong
Country Hong Kong 
Sector Academic/University 
PI Contribution We provide guidance, project contributions, and transposon analysis to genome projects. UPDATE 2021: This collaboration remains active, and we are currently working together on genome projects involving multiple myriapod species, gastropods, and butterflies.
Collaborator Contribution Professor Hui's group are sequencing a large number of genomes to very high quality, which are ideal for transposon analyses.
Impact Published: ----Nong W, Law STS, Wong AYP, Baril T, Swale T, Chu LM, Hayward A, Lau DTW, Hui JHL (2020) A chromosomal-level reference genome of the incense tree Aquilaria sinensis. Molecular Ecology Resources, 20, 971-979. ----Li Y, Nong W, Baril T, Yip HY, Swale T, Hayward A, Ferrier DEK, Hui JHL (2020) Reconstruction of ancient homeobox gene linkages inferred from a new high-quality assembly of the Hong Kong oyster (Magallana hongkongensis) genome. BMC Genomics, 21, 713. ----Nong W, Qu Z, Li Y, Barton-Owen T, Wong AYP, Yip HY, Lee HT, Narayana S, Baril T, Swale T, Cao J, Chan TF, Kwan HS, Ming NS, Panagiotou G, Qian P, Qiu J, Yip KY, Ismail N, Pati S, John A, Tobe SS, Bendena WG, Cheung SG, Hayward A, Hui JHL (2021) Horseshoe crab genomes reveal evolutionary fates of genes and microRNAs after three rounds (3R) of whole genome duplication in invertebrates. ----Qu Z, Nong W, Yu Y, Baril T, Yip HY, Hayward A, Hui JHL (2020) Genome of the four-finger threadfin Eleutheronema tetradactylum (Perciforms: Polynemidae). BMC Genomics, 21, 726. ----Qu Z, Nong W, So HWL, Barton-Owen T, Li Y, Li C, Leung TCN, Baril T, Wong AYP, Swale T, Chan TF, Hayward A, Ngai SM, Hui JHL (2020) Millipede genomes reveal unique adaptations during myriapod evolution. PLoS Biology, 18 (9), e3000636. UPDATE 2021: At least three further publications are planned this year.
Start Year 2019
 
Description Primary collaboration with Dr Konrad Lohse at the University of Edinburgh 
Organisation University of Edinburgh
Department Institute of Evolutionary Biology
Country United Kingdom 
Sector Academic/University 
PI Contribution I sequenced 2 butterfly genomes using the PacBio Sequel platform. I also arranged for free library preparations and assembly for 4 butterfly species using 10X Genomics. We shared fieldwork efforts during the collection of butterfly specimens in Iberia.
Collaborator Contribution Konrad sequenced the 10X library preparations on the Illumina platform in Edinburgh, has generated transcriptomes for 66 of our target butterfly species, shared fieldwork efforts, and has a new postdoc who is assembling our PacBio data. UPDATE 2021: This collaboration remains active and we are planning several further papers during 2021.
Impact Four new butterfly genomes sequenced using 10X Genomics, two of which have also been sequenced using PacBio Sequel. 66 new butterfly transcriptomes. ----Published: ----Mackintosh A, Laetsch DR, Hayward A, Charlesworth B, Waterfall M, Vila R, Lohse K (2019) The determinants of genetic diversity in butterflies - Lewontin's paradox revisited. Nature Communications, 10, 3466. UPDATE 2021: ----In resubmission: ----Ebdon S, Laetsch D, Dapporto L, Hayward A, Ritchie MG, Dinca V, Vila R, Lohse K (In resubmission at Molecular Ecology) The Pleistocene species pump past its prime: evidence from European butterfly sister species. ----In preparation: ----Baril T, Laetsch DR, Mackintosh A, Vila R, Lohse K, Hayward A. (In preparation for Nature Ecology and Evolution) Host-TE interactions across 50 high-quality butterfly genomes. ----Genome release papers for 20 high-quality genomes generated in collaboration. ----Planned: ----Population genetic analyses for at least one system species pair, ----repeat/introgression analyses across phylogeny.
Start Year 2017
 
Title TobyBaril/EarlGrey: Earl Grey v1.2 
Description For those that cannot install RepeatModeler and RepeatMasker on their systems, we now provide a Docker container (with instructions) that will enable Earl Grey to run within a virtual environment. 
Type Of Technology Software 
Year Produced 2021 
Open Source License? Yes  
Impact There have been hundreds of downloads of the pipeline from its GitHub site. 
URL https://zenodo.org/record/5718734
 
Description Genomics outreach activity at the Royal Cornwall Show 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact I ran a practical activity where members of the public were offered the chance to extract DNA from Cornish strawberries. During the activity, the participants were told interesting facts about DNA and informed about the value of genomics research. Over 1,000 members of the public participated over 3 days. The participants were mainly school children aged 5 to 16 and their teachers/guardians, although other adult members of the public also participated. A team of six volunteers helped me with the demonstration.
Year(s) Of Engagement Activity 2017
 
Description International workshop - Butterflies as genomic models in ecology and evolution 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact I held a three-day international workshop on butterfly genomics at my campus. It was attended by top international researchers in the field, as well as postgraduate students and undergraduate students from my institution. Members of the NGO Butterfly Conservation also participated, members of the general public attended some sessions, and several representatives from major sequencing companies attended. The intended purpose was networking and for me to gain an introduction to the field of butterfly genomics. The workshop generated much discussion, and many new collaborations were discussed among the participants. It also led to the company 10X Genomics offering to provide library preparations and assemblies for 4 of my focal butterfly species for free.
Year(s) Of Engagement Activity 2017
URL http://www.exeter.ac.uk/news/events/details/index.php?event=7032
 
Description Invited presentation at Cornwall Wildlife Trust meeting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Supporters
Results and Impact Invited oral presentation on "Environmental DNA: a new technique to survey Cornish marine biodiversity?" at the annual Cornwall Wildlife Trust Marine Recorders Evening.
Year(s) Of Engagement Activity 2018
 
Description Poster presentation at marine conservation event 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact MSc student poster presentation: 'Environmental DNA (eDNA): a novel approach to survey elasmobranchs in Cornish seas'
Year(s) Of Engagement Activity 2018