Construction of a HAPPY map for the pea aphid Acyrthosiphon pisum

Lead Research Organisation: University of Cambridge
Department Name: Plant Sciences

Abstract

During the last decade our understanding of the functioning of many living organisms has been greatly facilitated by the availability of their genome sequences (all nucleotides in their DNA/hereditary material). Thus we now have genome sequences for certain mammals (including humans), birds, fish, insects, plants and many micro-organisms. For insects the first full genome sequence was for the fruit fly, a highly studied model species, and the sequence, in a very user-friendly web-based database, 'FlyBase', is available to the whole research community, allowing many diverse uses. Since then many other insect genomes have become available, allowing comparative studies of how insects have diversified and evolved. There is now a project underway to sequence the genome of a greenfly or aphid, and this will extend studies to an insect species with a very different life-history (no pupal stage) and with very interesting behaviours. In an ideal world, sequencing would produce the complete genetic 'text' of the organism, spanning each chromosome from end to end without gaps. In reality, this is never possible. Even the most comprehensive sequencing projects produce a sequence which is fragmentary, like a collection of unbound (and un-numbered) pages of text. Just as the pages of a book must be ordered and bound to be of greatest use, it is important to work out the correct order of these sequence fragments (known as 'contigs') in the genome. Once this is done, it becomes much easier to analyse the genome sequence, fill in any missing pieces of sequence, evaluate its content, and search for particular features. Working out the order of the sequence contigs requires a genome map. Such a map works rather like the index of a book: it tells one where certain key words (or certain fragments of the genetic sequence) lie within the book as a whole. There are many ways to make genome maps, but one of the most versatile and accurate approaches is known as HAPPY mapping.
This proposal seeks to build a HAPPY map and, in conjunction with other projects going on around the world, to eventually produce an 'AphidBase' of data for the research community. Although the genome sequencing currently underway is for one particular aphid species, it is anticipated that the information obtained will be very relevant to other aphids, including many which are important pests of agricultural crops, both causing direct feeding damage and transmitting plant virus diseases. Thus it will provide important information on how aphids develop resistance to insecticides, how new pesticides could be designed, and potential new control strategies that don't require chemical treatments.

Technical Summary

During the last decade our understanding of many living organisms has been greatly facilitated by genome sequencing projects. For insects the first full-length sequence was for Drosophila melanogaster, and since then the sequences of many other insect genomes have become available, allowing comparative studies of how insects have diversified and evolved. There is now a project underway to sequence the 525 Mb genome of the pea aphid, Acyrthosiphon pisum, and it is anticipated that this will contribute greatly to studies of bacterial endosymbiosis, insect-vectored virus transmission, phenotypic plasticity and adaptation to host plants. Whilst the shotgun sequence will be of immense utility to the world community of entomologists, its value can be multiplied several times over by mapping the genome to provide positional context for the contigs. This requires a physical 'map' of the genome. One suitable map would be a HAPPY map, produced by a simple method for ordering markers on chromosomes and determining the distance between them. HAPPY mapping is based on the analysis of approximately HAPloid DNA samples using the PolYmerase chain reaction. This proposal seeks to build such a HAPPY map, which will also enable the completeness and accuracy of the shotgun data to be assessed, and will provide the only viable route for local or genome-wide sequence finishing at a later date. It will also provide a resource for comparative mapping of related aphid species to investigate genomic synteny. It is anticipated that the information obtained from the A. pisum genome will be very relevant to other aphid species, including many which are important pests of agricultural crops and vectors of plant viruses. Thus it will provide important information on how aphids evolve resistance to insecticides, how to identify potential new pesticide targets, and which proteins involved in host recognition might be blocked to provide non-chemical alternative control strategies.
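
As a rough illustration of the principle behind HAPPY mapping, the sketch below assumes that each marker has been typed by PCR as present (1) or absent (0) in a panel of approximately haploid DNA aliquots, and that markers lying close together on a chromosome tend to be found in the same aliquots. The marker names, the typing data and the simple co-occurrence score are all hypothetical and chosen only for illustration; they are not the analysis used in this project.

```python
from itertools import combinations


def cosegregation(typing):
    """Score every marker pair by how often the two markers co-occur.

    typing: dict mapping a marker name to a list of 0/1 PCR results,
            one entry per DNA aliquot (hypothetical data layout).
    Returns a dict mapping (marker_a, marker_b) to the fraction of
    aliquots containing either marker that contain both.
    """
    scores = {}
    for a, b in combinations(sorted(typing), 2):
        pairs = list(zip(typing[a], typing[b]))
        both = sum(1 for x, y in pairs if x and y)
        either = sum(1 for x, y in pairs if x or y)
        scores[(a, b)] = both / either if either else 0.0
    return scores


# Hypothetical typing results for three markers across eight aliquots.
typing = {
    "m1": [1, 1, 0, 0, 1, 0, 1, 0],
    "m2": [1, 1, 0, 0, 1, 0, 0, 0],
    "m3": [0, 1, 1, 0, 0, 1, 0, 1],
}

scores = cosegregation(typing)
for pair, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(score, 2))
```

In this toy data set m1 and m2 co-occur far more often than either does with m3, which would suggest that m1 and m2 lie closer together. A real analysis would convert such co-segregation frequencies into statistically supported distances before ordering the markers.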

 
Description During the last decade our understanding of the functioning of many living organisms has been greatly facilitated by the availability of their genome sequences (all nucleotides in their DNA/hereditary material). Thus we now have genome sequences for certain mammals (including humans), birds, fish, insects, plants and many micro-organisms. For insects the first full genome sequence was for the fruit fly, a highly studied model species, and the sequence, in a very user-friendly web-based database, 'FlyBase', is available to the whole research community, allowing many diverse uses. Since then many other insect genomes have become available, allowing comparative studies of how insects have diversified and evolved. There is now a project underway to sequence the genome of a greenfly or aphid, and this will extend studies to an insect species with a very different life-history (no pupal stage) and with very interesting behaviours.

In an ideal world, sequencing would produce the complete genetic 'text' of the organism, spanning each chromosome from end to end without gaps. In reality, this is never possible. Even the most comprehensive sequencing projects produce a sequence which is fragmentary, like a collection of unbound (and un-numbered) pages of text. Just as the pages of a book must be ordered and bound to be of greatest use, it is important to work out the correct order of these sequence fragments (known as 'contigs') in the genome. Once this is done, it becomes much easier to analyse the genome sequence, fill in any missing pieces of sequence, evaluate its content, and search for particular features.

Working out the order of the sequence contigs requires a genome map. Such a map works rather like the index of a book: it tells one where certain key words (or certain fragments of the genetic sequence) lie within the book as a whole. There are many ways to make genome maps, and several have been applied to the aphid genome; one of these approaches is known as HAPPY mapping.

The aim of this proposal was to build a HAPPY map using an existing technology known as the polymerase chain reaction (PCR). During the course of this project it became apparent that this PCR technology had a high failure rate, which meant that we would not be able to complete the map using the traditional PCR method. As a result we investigated alternative typing methods and, based on advice from colleagues, devised a novel typing method based on Restriction-site Associated DNA (RAD) sequencing. This method uses a next-generation sequencing technology called Illumina sequencing to create sequence tags that can then be used to make maps. We hope that this new method will, in conjunction with other projects going on around the world, eventually produce an ordered aphid genome sequence for the research community. Although the genome sequencing currently underway is for one particular aphid species, it is anticipated that the information obtained will be very relevant to other aphids, including many which are important pests of agricultural crops, both causing direct feeding damage and transmitting plant virus diseases. Thus it will provide important information on how aphids develop resistance to insecticides, how new pesticides could be designed, and potential new control strategies that don't require chemical treatments.
Exploitation Route Three FileMaker Pro relational databases: 1. Aphid Scaffold Splitter Database (1.7GB); 2. Aphid Sequence Entry Database (113MB); 3. Aphid Marker Database (360MB); see pdf attachments for database field details and calculations (copies of all three databases available from J. Pachebat).
Expected end of 2010 from HGSC: Illumina sequencing results from the 8 Phi29-amplified, MseI-digested RAD Illumina sequencing libraries, to be analysed at the HGSC in collaboration with Dr Justin Pachebat and Dr Paul Dear, with the technique and results expected to be published in a refereed journal. RAD Illumina sequence reads will be entered into the NCBI Short Read Archive (SRA) (http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi). RAD sequence tag reads will be aligned against the current A. pisum genome assembly to form RAD stacks. Contig and scaffold ends containing RAD stacks will then be identified and characterised as "RAD stack markers", which will be used to test marker distribution and the efficiency of the RAD Illumina sequencing methods, and for subsequent linkage analysis. It is anticipated that both RAD stacks and the resulting RAD stack HAPPY markers will be made available on request from the authors and in supplementary information on publication.
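
The selection of "RAD stack markers" described above can be sketched roughly as follows. This is only an illustrative outline under stated assumptions: reads are taken to be already aligned and reduced to (scaffold, position) pairs, and the function name, the minimum stack depth and the size of the window defining a scaffold "end" are hypothetical values, not parameters from this project.

```python
from collections import Counter


def rad_stack_markers(alignments, scaffold_lengths, min_depth=2, end_window=1000):
    """Pick RAD stacks that lie near scaffold ends.

    alignments: list of (scaffold, position) tuples, one per aligned read.
    scaffold_lengths: dict mapping scaffold name to its length in bases.
    min_depth, end_window: hypothetical thresholds for illustration only.
    Returns a sorted list of (scaffold, position, depth) tuples.
    """
    # Reads aligning to the same scaffold position form one stack.
    stacks = Counter(alignments)
    markers = []
    for (scaffold, pos), depth in sorted(stacks.items()):
        if depth < min_depth:
            continue  # too few reads to count as a reliable stack
        length = scaffold_lengths[scaffold]
        near_start = pos < end_window
        near_end = pos >= length - end_window
        if near_start or near_end:
            markers.append((scaffold, pos, depth))
    return markers


# Hypothetical aligned reads and scaffold lengths.
alignments = [
    ("s1", 100), ("s1", 100),    # stack of depth 2 near the start of s1
    ("s1", 5000), ("s1", 5000),  # stack of depth 2 in the middle of s1
    ("s1", 9500),                # single read near the end of s1
    ("s2", 50),                  # single read near the start of s2
]
scaffold_lengths = {"s1": 10000, "s2": 3000}

markers = rad_stack_markers(alignments, scaffold_lengths)
print(markers)
```

With these toy inputs only the stack at position 100 on s1 is both deep enough and close enough to a scaffold end to be kept.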
Sectors Agriculture, Food and Drink

URL http://www.ncbi.nlm.nih.gov/Traces/sra/sra.cgi
 
Description ArtCell Exhibition 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Scientific art exhibit with microscopy images from Jim Haseloff and Fernan Federici, opened by the Mayor of Cambridge and open to the public.
Year(s) Of Engagement Activity 2011
 
Description BBC Radio 4 interview 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Interview about the potential impact of Synthetic Biology
Year(s) Of Engagement Activity 2010
 
Description Edinburgh Science Festival: Designer Life 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Panelist in a public debate on Synthetic Biology, broadcast as part of the "Material World" radio programme.
Year(s) Of Engagement Activity 2010
 
Description Royal Society: Future Technologies 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Public discussion at South Bank, London with Sir Tim Berners-Lee, Stephen Fry, Prof. Dame Wendy Hall and Bill Thompson
Year(s) Of Engagement Activity 2010
 
Description Wired magazine interview and pictorial articles 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Published interview about Synthetic Biology and article about engineering living systems: "Building new life forms at the iGEM Jamboree" and "At home with the DNA hackers".
Year(s) Of Engagement Activity 2009