The role of transposable elements in generating functional diversity

Lead Research Organisation: University of East Anglia
Department Name: Biological Sciences

Abstract

The DNA housed inside the cells of all organisms contains the genes required for building and maintaining living bodies. However, in most organisms, there are other components of the genome that are not specifically required for life and these elements may multiply within the genome in a manner similar to computer viruses multiplying on a computer hard-drive. These genetic elements are known as transposable elements and come in two forms: those which use a copy and paste mechanism (type I) and those that use a cut and paste mechanism (type II). Scientists have typically viewed the presence of these elements in genomes as parasitic and their effects on the host as either neutral or negative. However, it is increasingly recognised that these mobile elements can have a positive effect on the evolutionary potential of a species. For example, the resistance of the fruit fly Drosophila melanogaster to an organophosphate pesticide has increased as a result of the production of a novel truncated protein generated by the insertion of a transposable element in an existing longer gene. Other researchers have suggested that transposable elements may have more profound impacts on biodiversity as a whole and have suggested that the insertion of transposable elements may lead to increased evolutionary potential and may explain why some groups of organisms are particularly species rich. However, to date there has been little research that goes beyond identifying a circumstantial relationship between the abundance of transposable elements and the speciation rates of a group of organisms.

In this study, we will focus on a species-rich group of Neotropical catfishes that (i) have considerable differences in the transposable element (TE) content of their genomes, (ii) have lineages with different rates of speciation and (iii) show a sudden increase in speciation rate ~25 million years before present, which is broadly contemporaneous with the TE expansion.

We will firstly quantify the abundance of different transposable elements across the genomes of Corydoras catfishes. This will provide an accurate estimate of the TE density in different species, and by mapping these onto a phylogenetic framework, the timing of TE expansions in different species will be quantified. Subsequently, we will investigate whether TEs increase the mutation rates of promoter regions as a result of the intrinsic DNA repair mechanisms that heal TE insertion / excision sites. Next we will investigate whether catfish lineages with high speciation rates have more TEs inserted in genes and promoter regions than species with low diversification rates. Finally, we will investigate whether certain groups of genes (e.g. colour pattern genes) have been more greatly affected by TEs than other genes (e.g. housekeeping genes).
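As an illustration of the first step, TE density can be thought of as the fraction of a sequence covered by at least one TE annotation. The sketch below is a minimal example under stated assumptions, not the project's method: the function names `merge_intervals` and `te_density` are hypothetical, intervals are assumed to be half-open base-pair coordinates on a single scaffold, and overlapping annotations are merged so that no base is counted twice.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in base pairs (assumed convention)


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or adjacent intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous interval: extend it.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def te_density(intervals: Iterable[Interval], sequence_length: int) -> float:
    """Fraction of the sequence covered by at least one TE annotation."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / sequence_length
```

For a whole genome, the same calculation would be applied per scaffold and the covered bases summed before dividing by the total assembly length.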

The results generated in this proposal will allow a thorough investigation of the role of transposable elements in increasing the diversity of genes and their promoter regions and provide important evidence as to whether TEs underpin rapid speciation in some taxonomic groups.

Technical Summary

Transposable Elements (TEs) are ubiquitous repetitive DNA sequences that can move and multiply within genomes. They make up significant proportions of most eukaryotic genomes; for example, 85% of the maize genome and 45% of the human genome (Human Genome Consortium 2001) are TEs. Transposable Elements have historically been viewed as deleterious; however, this view is now changing, with a number of apparently beneficial impacts on host gene function and structure identified. More broadly, TEs may have macro-evolutionary impacts, with the 'epi-transposon' and 'TE-Thrust' hypotheses suggesting TE proliferation may increase speciation rates. However, a robust statistical relationship between TE abundance and speciation rate has yet to be identified.

The Corydoradinae are sexual Neotropical armoured catfishes comprising more than 200 extant species. A well-resolved mtDNA phylogeny identifying 9 major lineages was previously generated by Taylor's group (Alexandrou MA, et al. (2011) Nature 469:84-88) and provides a phylogenetic framework for macro-evolutionary investigation. Further work in the Taylor group has identified considerable variation in genome size, with C-values ranging from ~0.5 pg in the smallest genomes to ~4.5 pg per haploid cell in the largest (Marburger et al. under revision Proc Roy Soc B). Transposable Element abundance increases dramatically in the species with the largest genome sizes, with the TC1-superfamily increasing from <1% of the genome to ~70% of the genome across the Corydoras lineages (Marburger et al. under review Proc Roy Soc B).

In this proposal, we will test the hypothesis that TE insertion and excision increase the mutation rate of gene promoter regions as a result of imperfect DNA repair by the cell, and that these TE-generated mutations underpin increases in lineage diversification rate - with lineages with more TEs and higher mutation rates evolving more rapidly than lineages with fewer TEs in their genomes.

Planned Impact

The proposed work is expected to generate significant impacts - we describe below who will benefit and the mechanisms in place to show how that impact will be achieved.

1. DISSEMINATION OF FUNDAMENTAL SCIENCE ACROSS ACADEMIC AND PUBLIC DOMAINS:

This proposal will generate a wealth of data of great interest to molecular biologists, evolutionary biologists, developmental biologists, ichthyologists, aquarists and hobbyists interested in catfishes and the general public (including schools).

The data generated in the project will act as a catalyst for future investigation by the PI, Co-Is and PDRA as well as by the groups of researchers described above. These impacts will be delivered by the full research team through published papers, submission of sequences to repositories such as GenBank, press releases, science blogs and conference presentations. Public dissemination will be achieved by the whole team via the diverse outreach and engagement activities specified in the pathways to impact plan. Collectively we have a strong record in such activities.

We have identified a number of possibilities where the proposed research may generate broad interest.

i) TE insertions and mutation rates. The role of non-coding DNA in genomes is a topic generating much debate. While this proposal does not address the role per se, it does address the IMPACT of non-coding DNA on the genome, specifically whether TEs increase mutation rates and how this in turn influences macro-evolutionary patterns. Thus, linking the activity of autonomous genetic elements to organismal diversification rates would be of great interest to many molecular and evolutionary biologists, and also to the general public in terms of why some groups are species rich and others species poor.

ii) Comparative genomics. We will generate whole genomes for 2 species of Corydoras catfish and improve the assemblies of two others. This will be of great benefit for comparative evolutionary genomics as the species sequenced are part of an adaptive radiation, some species of which have undergone whole genome duplications. Thus workers in the fields of comparative genomics, transposable elements and whole genome duplication will benefit from the genomic resources generated. This research will also interest researchers in the field of salmonid aquaculture, as salmonids have also undergone a whole genome duplication.

iii) Developmental biologists - transposable elements (Sleeping Beauty) are being used to 'barcode' cells (Sun et al. Nature. 2014. 514:322-327). By inducing TE activity, every cell acquires a unique pattern of TE insertions, allowing cell lineages to be tracked during development. Thus, additional knowledge of the potential impacts of these 'barcoding' systems on the genome will be beneficial.

iv) Mimicry researchers will benefit from the genomic resources generated in this project. The Corydoras are one of the few vertebrate groups that have evolved Müllerian mimicry. Thus there is ample scope to investigate the role of TEs in increasing mutation rates of colour pattern genes in this group.

2. TRAINING OF SKILLED PEOPLE FOR NON-ACADEMIC PROFESSIONS

The named PDRA will benefit from many aspects of the project: developing computational and bioinformatics skills (Python/Perl/R languages), which will increase employability in both academic and non-academic environments; participating in impact and outreach activities, which will develop communication, presentation and digital media skills (websites and animations); and attending conferences, which will build networks and further increase employability.
 
Description The Corydoradinae catfishes have some of the most diverse genome sizes of any group of fish. Genome sizes range from ~0.8 Gb to 4.5 Gb.
In this project we sequenced the genomes of representative species from 4 of the 9 Corydoras mtDNA lineages that span the range of genome sizes.
We generated a chromosome-level genome for Corydoras fulleri and contig-level genome sequences for the 3 other species sequenced (Aspidoras sp., Corydoras elegans and Corydoras metae). These are the first genomes for the catfish family Callichthyidae. The chromosome-level genome of a basal diploid species will be pivotal in understanding genome evolution and whole genome duplication across the group.
Custom TE libraries were generated for each of the species, and a novel pipeline was developed to assist with this process and with annotating other non-model species for which TE libraries do not exist (Bell et al. 2022). This pipeline demonstrated that using species-specific libraries can identify ~40% more TEs than using libraries from more distantly related species. Moreover, more accurate estimates of the age and diversity of TE family expansions are obtained from species-specific libraries. This resource is available as a GitHub repository (https://github.com/ellenbell/FasTE), with additional scripts for TE parsing available here (https://github.com/clbutler/RM_TRIPS).
TE content was higher in species with bigger genome sizes, and this was primarily driven by the Tc1/mariner family of TEs.
TEs were not inserted into the promoter region of genes associated with pigmentation at a higher rate than expected by chance.
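The record does not state which statistical test was used for the pigmentation result above. One common way to ask whether a gene set carries more TE insertions than expected by chance is a one-sided hypergeometric test, sketched below as an illustration only. The function name `hypergeom_upper_tail` and its parameters are hypothetical, and the test assumes that every gene promoter is equally likely to receive an insertion.

```python
from math import comb


def hypergeom_upper_tail(total_genes: int, genes_in_set: int,
                         genes_with_te: int, overlap: int) -> float:
    """P(X >= overlap) when genes_with_te genes are drawn at random
    from total_genes, of which genes_in_set belong to the set of interest
    (e.g. pigmentation genes). Small values suggest enrichment."""
    max_overlap = min(genes_in_set, genes_with_te)
    denominator = comb(total_genes, genes_with_te)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(genes_in_set, k) * comb(total_genes - genes_in_set, genes_with_te - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator
```

A large tail probability, as would be expected for the finding reported here, indicates that the observed number of promoter insertions in the gene set is consistent with chance.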
Exploitation Route The Corydoras genomes will be used by the entire teleost fish and vertebrate genomics community.
Corydoras specific TE libraries will be used by the community.
Sectors Other

 
Title Data from: Transposable element annotation in non-model species - on the benefits of species specific repeat libraries using semi-automated EDTA and DeepTE de novo pipelines 
Description Transposable elements (TEs) are significant genomic components which can be detected either through sequence homology against existing databases or de novo, with the latter potentially reducing underestimates of TE abundance. Here, we describe the semi-automated generation of a de novo TE library which combines the newly described EDTA pipeline and DeepTE classifier in a non-model teleost (Corydoras sp. C115). We assess performance using both genomic and transcriptomic input by five metrics: (i) abundance, (ii) composition, (iii) fragmentation, (iv) age distributions and (v) capture of potential horizontally transferred TEs. We identified notable differences in these metrics between different TE libraries, and highlight how library choice can have a major impact on TE content estimates in non-model species. This repository incorporates six raw (unparsed) RepeatMasker (RM) output files for two genomes (Corydoras sp. c115 and Corydoras maculifer) and one transcriptome (C. maculifer), and two repeat libraries (one based on the RepBase Danio rerio library and one de novo library built on the C. sp. c115 genome). The RM output files correspond to one homology-based transposon search using the D. rerio library and one species-specific search using the de novo library. It also includes a script to accompany the horizontal transfer analysis and a transposable element renaming script. 
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL http://datadryad.org/stash/dataset/doi:10.5061/dryad.m0cfxpp3h
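As a rough illustration of how raw RepeatMasker output files such as those in this dataset can be summarised, the sketch below totals annotated base pairs per repeat class/family. It is not the dataset's own parsing script: the function name `bp_per_te_class` is hypothetical, and the code assumes the standard whitespace-separated RepeatMasker `.out` layout, in which query begin and end are the sixth and seventh columns and the repeat class/family is the eleventh.

```python
from collections import defaultdict
from typing import Dict, Iterable


def bp_per_te_class(lines: Iterable[str]) -> Dict[str, int]:
    """Sum annotated base pairs per repeat class/family from .out lines."""
    totals: Dict[str, int] = defaultdict(int)
    for line in lines:
        fields = line.split()
        # Data rows start with an integer Smith-Waterman score;
        # the header rows and blank lines do not, so they are skipped.
        if len(fields) < 14 or not fields[0].isdigit():
            continue
        begin, end = int(fields[5]), int(fields[6])  # 1-based, inclusive
        te_class = fields[10]  # e.g. "DNA/TcMar-Tc1"
        totals[te_class] += end - begin + 1
    return dict(totals)
```

Typical use would be `bp_per_te_class(open(path))` for each output file. Overlapping hits are counted twice in this simple version, so the totals are an upper bound on bases actually masked.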
 
Description Latitude Festival science tent 2019 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact A team presented a multi-activity science outreach event that included Corydoras catfish and discussion of mimicry, colour patterns and genome evolution.
Year(s) Of Engagement Activity 2019
 
Description Norwich Science Festival Talk 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact I gave a talk titled "Copycats: bloated genomes and colour pattern mimicry in Neotropical catfishes". Between 50 and 100 people attended the talk, which took place in the Norwich Forum as part of Oceans Day at the 2018 Norwich Science Festival.
Year(s) Of Engagement Activity 2018
URL http://norwichsciencefestival.co.uk/wp-content/uploads/sites/4/NSF18-interactive-Programme.pdf