Pig genome annotation and analysis

Lead Research Organisation: European Bioinformatics Institute
Department Name: Ensembl Group

Abstract

We propose to provide state-of-the-art analysis and annotation of the pig genome sequence being generated by the International Pig Genome Sequencing Project. We will make the annotated genome sequence accessible on the Web through the Ensembl site at http://www.ensembl.org. The pig genome is the entire DNA sequence of the pig, which defines all the biological molecules that make up a pig. Acquiring, managing and annotating the pig genome sequence will accelerate research in both pig biology and mammalian biology. Impact on pig biology: Because of the extensive selective breeding that has occurred during domestication, there are a considerable number of breed- or line-specific features, ranging from fat/muscle ratios and litter size to skin colour. These features can be mapped genetically to broad regions of the genome, but the final identification of the genes responsible and the causal genetic variation is very complex. The availability of a well-annotated pig genome sequence with links to other data sources, especially those on phenotypes such as growth, carcass composition or responses to infectious disease, would provide a dramatic boost to the identification of these causative genes.

Technical Summary

The genome represents a complete description of an organism. However, to understand the functioning of the genes and regulatory elements, and to design sensible molecular biological experiments to test hypotheses, the genome sequence must be related to the extant functional data for that organism. We propose to annotate and analyse the sequence being generated by the International Pig Genome Sequencing Project. We will use the well-established Ensembl system as the main tool for storage, management and dissemination of pig genome data. Pig genome sequencing is currently funded to 3-4x coverage from mapped clones, with two chromosomes at higher coverage. Experience from other low-coverage genomes, such as cow, rabbit and armadillo, is that this coverage will at a minimum provide an effective representation of exons, which can then be assembled into genes using a guide genome. By definition, this approach cannot resolve lineage-specific expansions in the pig genome. However, with this more clone-based strategy there will be new opportunities for combining assembly and annotation strategies to extract more information from a 3x assembly. We will integrate the pig genome sequence with diverse pre-existing data sets, including SNPs, ESTs and quantitative trait loci (QTL). We will integrate the sequence with maps (genetic, physical) and physical resources (clones, microarrays), providing a seamless route for interrogation and for the development of experimentation tools. Finally, computational approaches that integrate the above resources and exploit the comparative genomics potential of the mammalian clade will be used to analyse and present the genome in a user-friendly format. An annotated pig genome sequence will dramatically accelerate research on the pig as an important animal for agriculture and human biology. Our aim is to make the pig genome sequence maximally useful by delivering an annotated sequence of the highest quality in a user-friendly manner.
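
As a rough illustration of why 3-4x coverage is expected to capture most exons, the sketch below applies the standard Poisson (Lander-Waterman) approximation, under which the expected fraction of bases sequenced at least once at coverage c is 1 - e^(-c). This model is an assumption introduced here for illustration and is not taken from the project description; it treats reads as uniformly and independently placed, which a strategy based on mapped clones only approximates. The function name is hypothetical.

```python
import math


def expected_fraction_covered(coverage: float) -> float:
    """Expected fraction of bases hit by at least one read.

    Assumes reads land uniformly and independently (Poisson model),
    so the chance that a base is missed entirely is exp(-coverage).
    """
    if coverage < 0:
        raise ValueError("coverage must be non-negative")
    return 1.0 - math.exp(-coverage)


# Coverage depths quoted in the summary above.
for depth in (3.0, 4.0):
    fraction = expected_fraction_covered(depth)
    print(f"{depth:.0f}x coverage: about {fraction:.1%} of bases sequenced at least once")
```

Under these assumptions roughly 95% of bases are sampled at 3x and roughly 98% at 4x, which fits the expectation that most exons are represented even though the remaining gaps fragment the assembly.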

Publications

Flicek P (2010) Ensembl's 10th year. in Nucleic acids research

Flicek P (2008) Ensembl 2008. in Nucleic acids research

Hubbard TJ (2007) Ensembl 2007. in Nucleic acids research

Hubbard TJ (2009) Ensembl 2009. in Nucleic acids research

 
Description We have helped advance the information that can be retrieved for the pig genome, allowing people to see more information on pig genomes and, for example, helping researchers understand why some pigs have infections and others don't.
Exploitation Route This helps pig researchers (e.g. at the Roslin Institute or the Institute of Animal Health) develop methods cheaply and easily.
Sectors Agriculture, Food and Drink; Environment; Pharmaceuticals and Medical Biotechnology

URL http://www.ensembl.org/Sus_scrofa/Info/Index
 
Title Pig Ensembl database and releases 
Description The annotated reference genome sequences have been delivered through a series of Ensembl releases. The following updates for pig have occurred:
September 2009: Initial genebuild on assembly Sscrofa9.
Ensembl Release 67 (May 2012): The annotation of the pig genome assembly (Sscrofa10.2), on which the pig genome sequence paper was based, was migrated from Pre-Ensembl to the full Ensembl site.
Ensembl Release 69 (October 2012): An Ensembl-Havana gene set was added to the annotation. The VEGA manual annotation, which had been generated through a community effort, was added. For more details see: http://oct2012.archive.ensembl.org/Sus_scrofa/Info/Index
Ensembl Release 74 (December 2013): The secondary structure of non-coding RNAs is now shown on the gene summary page, using the R2R package. More details about new features for pig in release 74 can be found at: http://www.ensembl.org/Sus_scrofa/Info/WhatsNew?db=core
Type Of Material Database/Collection of data 
Year Produced 2009 
Provided To Others? Yes  
Impact The data are heavily accessed by researchers: there were 223,781 site visits in 2013 alone.
 
Description Training workshops for researchers to use the data generated 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Several training workshops with an emphasis on using the Ensembl Genome Browser for farmed animal genomes have been held:
13th May 2013 at the University of North Carolina, Chapel Hill, North Carolina, USA
6th June 2013 at The Genome Analysis Centre, Norwich
1st July 2013 at Maastricht University, Maastricht, The Netherlands
22nd and 23rd August 2013 at The Roslin Institute, University of Edinburgh.
29th September - 2nd October 2014: Animal Genome Informatics, EBI. Workshop on working with NGS data specific to farmed animals.
Demonstrations of the Ensembl Genome Browser were provided at both the Plant and Animal Genome Conference, San Diego, USA, and the Plant and Animal Genome Asia conference, Singapore:
Overduin B, 2013. Genome resources at EBI - Ensembl and Ensembl Genomes. In: Plant and Animal Genome XXI Conference, 12-16 January, 2013, Town & Country Convention Center San Diego, California. Abstract W343.
Overduin B, 2013. Genome Annotation Resources at the EBI - Ensembl and Ensembl Genomes. In: Plant and Animal Genome Asia 2013, 17-19 March, 2013, Grand Copthorne Waterfront Hotel, Singapore. Abstract W040.
Clarke L, Cunningham F, McLaren W, Ritchie G, Gil L, Thorman A, Hunt S, 2013. Using Ensembl to understand variation data. In: Plant and Animal Genome XXI Conference, 12-16 January, 2013, Town & Country Convention Center San Diego, California. Abstract W345.

Additionally, all of Ensembl's general, multi-species workshops include mention of the farm animal resources. In total, there were exactly 100 Ensembl workshops in 2013; details of the locations and dates of these training events can be provided on request.

Impacts include increased use of the data by individual users due to improved understanding of Ensembl by workshop participants (this is evidenced by feedback gathered from participants), and incremental improvements to Ensembl and Ensembl courses from participant feedback.
Year(s) Of Engagement Activity 2013, 2014