A DNA resource for phylogenetics and taxonomy of the insects

Lead Research Organisation: Imperial College London
Department Name: Life Sciences - Biology

Abstract

While molecular systematics has been a steadily growing discipline for some 20 years, only recent advances in sequencing technology and bioinformatics have provided the prospect of integrating all or most species on Earth into a DNA-based system, for a general evolutionary synthesis of the living world using a uniform set of data. Yet many problems remain. The number of possible phylogenetic trees is vast, and the extraction and compilation of primary sequence information requires sophisticated bioinformatics tools. In this project we will tackle these issues in one major portion of the Tree-of-Life, the holometabolan (complete metamorphosis) insects. We have already developed a set of bioinformatic scripts for processing sequence data from public databases, and will apply them here to the analysis of the large amount of data available for insects. We will also provide scripts that enable this database to be updated regularly through an iterative process, creating an ever-expanding DNA taxonomy resource.
The current 'best' trees will be available on the project website for both inspection and download, and a collection of concatenated data matrices will also be available for download so that members of the research community may combine them with their own data for their own studies. The immediate results will be a searchable database containing all available Holometabola species and sequences, a pool of large-scale phylogenetic trees, one for each individual gene used, plus a tree representing the supermatrix of all the genes. In addition, the collection of trees can be used to investigate the partially unresolved relationships among the insect orders within the Holometabola and the basal relationships within the four main orders, which will provide a more complete understanding of the phylogeny of the group. These trees will then form the starting point for more directed, applied studies of specific groups of interest.
The study will assess the current status of insect molecular systematics by compiling all available sequence information for commonly used genes. Finally, an important function of the database is its use in identifying unknown query sequences. In conclusion, this project provides a unique opportunity to utilize the increasing amounts of taxonomic sequence data that are now ready for a general synthesis and broad-scale phylogenetic analysis. With comparatively simple means and in a short time period, we will be able to make great progress towards building the universal tree of insects.
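The "supermatrix" and "concatenated data matrices" mentioned in the abstract can be illustrated with a minimal sketch. It is not the project's own pipeline, which is not described here. It assumes each gene has already been aligned separately and pads taxa that lack a gene with a missing-data symbol. The function name `build_supermatrix`, the gene labels, and the toy sequences are all hypothetical.

```python
def build_supermatrix(gene_alignments, missing="?"):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments maps a gene name to a dict of {taxon: aligned sequence}.
    Returns (matrix, partitions), where matrix maps each taxon to its
    concatenated sequence and partitions lists (gene, start, end) using
    1-based, inclusive column positions.
    """
    # Every taxon that appears in at least one gene gets a row.
    taxa = sorted({taxon for aln in gene_alignments.values() for taxon in aln})
    rows = {taxon: [] for taxon in taxa}
    partitions = []
    start = 1
    for gene, aln in gene_alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"sequences for {gene} are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            # Taxa without this gene are padded with the missing-data symbol.
            rows[taxon].append(aln.get(taxon, missing * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in rows.items()}
    return matrix, partitions


# Toy example with made-up sequences for two hypothetical gene alignments.
toy = {
    "cox1": {"Apis": "ATGC", "Musca": "ATGA"},
    "18S": {"Apis": "GGT", "Tribolium": "GCT"},
}
matrix, partitions = build_supermatrix(toy)
for taxon, seq in matrix.items():
    print(taxon, seq)
```

Recording the column range of each gene alongside the matrix keeps the per-gene boundaries recoverable, so a matrix built this way could still be analysed gene by gene or combined with further data.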
 
Description This project was designed to extract all available taxonomic data from publicly available databases in order to construct phylogenetic trees. Bioinformatics pipelines have been developed and applied to various groups of insects for a new DNA-based taxonomy.
Exploitation Route The resulting trees are widely cited and used in the academic literature.
Sectors Agriculture, Food and Drink, Environment

 
Description The bioinformatics procedures are widely used and have led to high-profile publications.
First Year Of Impact 2007
Sector Agriculture, Food and Drink, Environment