Ensembl in a new era - deep genome annotation of domesticated animal species and breeds
Lead Research Organisation:
EMBL - European Bioinformatics Institute
Department Name: Genome Assembly and Annotation
Technical Summary
The Ensembl genome browser is a widely used web-based interface that makes deeply annotated reference genomes for domesticated animals available in a unified way to researchers. An explosion in the number of genomes produced for domesticated animals is expected in the coming three years. In this proposal we describe how we will ensure that the Ensembl genome browser can keep pace to provide deep annotation of these genomes.
Populations of domesticated animals are diverse, including many different breeds and populations within each species. Advances in sequencing technologies mean that the recent rise in the number of assembled genomes for domesticated animal species is expected to continue and accelerate. However:
- Current Ensembl resources are primarily focused around individual reference genomes for a single or a small number of representatives per species.
- New ways of storing, comparing, annotating, visualising and making available the diversity of genomes for each domesticated animal species are urgently required.
- Support for efforts to annotate this wealth of genome sequence data in a timely manner is critical to realising the potential impact of these data.
The overarching aim of this proposal is to establish and maintain deeply annotated genomes for domesticated animal species in the Ensembl genome browser. To achieve this aim we will:
- Analyse and annotate domesticated animal genomes as they become available, including alternate assemblies, exploiting the growing volumes of functional data.
- Run comparative genomics analyses both between species and within species.
- Acquire data from re-sequencing projects to characterise genetic variation within species and annotate variants by genomic region.
To ensure that the research community can make the most efficient use of the resource we will provide training and ensure we regularly adjust our priorities based on user feedback.
Publications


Harrison PW (2024) Ensembl 2024. in Nucleic acids research
Martin FJ (2023) Ensembl 2023. in Nucleic acids research
Title | Addition of annotations of commercially important aquaculture species |
Description | We have released new assemblies for Atlantic Salmon (Salmo salar), Rainbow Trout (Oncorhynchus mykiss), European Seabass (Dicentrarchus labrax) and Carp (Cyprinus carpio carpio), which are amongst the most commercially important aquaculture species in Europe. |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | Expanded access for users to key aquaculture species. This is important for precision breeding in species of economic and environmental importance. |
Title | Ensembl Variant Effect Predictor (VEP) Farmed Animal Annotation Updates |
Description | Over the course of the grant, we have collaborated with the European Variation Archive (EVA) to synchronise our supported assemblies where possible and have improved our methods for identifying compatible variant data and importing it into Ensembl. In addition to updates to key species including cow, pig, horse, chicken and sheep, we have recently made variation data available for four new farmed/food species: domestic yak, greater amberjack, mallard and rainbow trout. As EVA currently supports one assembly per species, we remap variant data to secondary assemblies used in the community. We now annotate pig and chicken variants which lie in regulatory elements and display these data on variant-specific pages. We have updated Ensembl VEP to annotate user-input variants with respect to these regulatory elements. This option is currently available via the REST and command line interfaces and will be available via the web interface in the next Ensembl release for pig, chicken, turbot and European seabass. To aid interpretation of missense variants, we calculate SIFT scores for all transcripts in cow, pig, horse, chicken, goat, sheep, cat and dog. We now also display CADD scores for pig variants, and in the next Ensembl release these will also be available via Ensembl VEP. We display population allele frequencies for variants from public sources, and in the next release the Ensembl VEP web tool will report frequencies from sheep, goat, dog and chicken population studies. We have also continued to import the latest phenotype association data from OMIA and AnimalQTL for each Ensembl release, as well as citations mentioning RefSNP variant identifiers, as mined from the literature by EuropePMC. An example: http://www.ensembl.org/Gallus_gallus/Variation/Mappings?db=core;r=1:33025-34025;v=rs3387277637;vdb=variation;vf=16911667 |
Type Of Material | Database/Collection of data |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | New, improved assemblies have been generated for many key livestock species since large-scale variant calling efforts were completed, leaving gaps in variant coverage over newer regions. We are piloting variant calling against the latest pig genome, with the aim of identifying novel variants in these newly assembled regions. Once this is successful, we will extend the approach to other species, providing a more complete view of genomic variation. |
URL | http://www.ensembl.org/ |
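The REST route to Ensembl VEP mentioned in the description above can be sketched as follows. This is a minimal illustration and not the project's own code. It assumes the public REST server at https://rest.ensembl.org and its `vep/:species/id/:id` endpoint, and that the JSON response uses the keys `id`, `most_severe_consequence` and `regulatory_feature_consequences`. The function names are hypothetical, and any extra query parameters should be checked against the REST documentation before use.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Ensembl REST server and its GET vep/:species/id/:id endpoint.
REST_SERVER = "https://rest.ensembl.org"


def build_vep_id_url(species, variant_id, **params):
    """Return the URL for a VEP lookup by variant identifier.

    Extra keyword arguments become query parameters; their names are not
    validated here and should be checked against the REST documentation.
    """
    path = "/vep/{}/id/{}".format(
        urllib.parse.quote(species, safe=""),
        urllib.parse.quote(variant_id, safe=""),
    )
    query = urllib.parse.urlencode(sorted(params.items()))
    return REST_SERVER + path + ("?" + query if query else "")


def fetch_vep(url, timeout=30.0):
    """Fetch and decode the JSON response (requires network access)."""
    request = urllib.request.Request(url, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def summarise_regulatory(vep_results):
    """Return (variant id, most severe consequence, regulatory feature IDs) per result.

    Assumes each result is a dict in the VEP JSON layout; missing keys
    yield None or an empty list rather than raising.
    """
    summary = []
    for result in vep_results:
        features = [
            entry.get("regulatory_feature_id")
            for entry in result.get("regulatory_feature_consequences", [])
        ]
        summary.append((result.get("id"), result.get("most_severe_consequence"), features))
    return summary
```

For the chicken variant in the example link above, a call would look like summarise_regulatory(fetch_vep(build_vep_id_url("gallus_gallus", "rs3387277637"))). The result depends on the Ensembl release, so no output is shown here.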
Title | Improved Ensembl annotations and annotation of additional breeds |
Description | CpG islands added to chicken, pig and horse reference annotations in Ensembl. Re-annotation with new transcriptomic data for horse (EquCab3.0 - GCA_002863925.1). Annotation of breeds for chicken (2), sheep (18), pig (8), goat (2), buffalo (1) and warthog (1). |
Type Of Material | Database/Collection of data |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | Improved Ensembl annotation available to the research community to improve and accelerate an array of downstream scientific discoveries in these species |
URL | https://www.ensembl.org/ |
Title | New Ensembl reference assembly/annotations for chicken, cow and donkey |
Description | New Ensembl reference assembly/annotations for chicken (ARS-UI_Ramb_v2.0 - GCA_016772045.1), cow (ARS-UCD1.3 - GCA_002263795.3) and donkey (ASM1607732v2 - GCA_016077325.2), all made available through Ensembl. |
Type Of Material | Database/Collection of data |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | New reference annotations mark a step change for these communities enabling improved downstream analyses that require a genomic context. |
URL | https://www.ensembl.org/ |
Title | New and updated species and data types |
Description | We have expanded the reference assembly for pig to include new publicly released tissue and developmental time point specific transcriptomic datasets, and ATAC-Seq regulatory tracks were added for the first time. The chicken assembly GRCg6a was reannotated, and broiler and layer assemblies (GRCg7w and GRCg7b) were annotated. We have included allele frequency data from the European Variation Archive that is now displayed on variant pages for chicken (PRJEB44919), dog (PRJEB24066) and salmon (PRJEB34225). |
Type Of Material | Database/Collection of data |
Year Produced | 2022 |
Provided To Others? | Yes |
Impact | Expanded access for users to regulatory and variation data for farmed and companion animal breeds |
Description | Unveiling intriguing diversity of African pigs and wild suids through epigenetics |
Organisation | University of Evora |
Country | Portugal |
Sector | Academic/University |
PI Contribution | Advice on establishing reference genome sequences based on long-read sequencing data. |
Collaborator Contribution | Leadership of the project, acquisition of samples and data generation. |
Impact | No data as yet. Funding application submitted. |
Start Year | 2024 |
Description | AQUA-FAANG Final Conference Panel Discussion |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | On the final day there was a panel discussion focused on functional genomics and future perspectives for the aquaculture sector, in which Peter Harrison (EMBL-EBI), Garth Ilsley (EMBL-EBI), Gabriela Merino (EMBL-EBI) and Emily Clark (Roslin Institute) participated. The panel discussed the accessibility and usability of functional annotation information, its usefulness for genomic selection, and the route to application of the data, particularly in the context of genome editing. The panel discussion provided very useful feedback for development and priorities for the Ensembl Genome Browser. Audience members asked many questions of the panel and plans were made for future related activity. |
Year(s) Of Engagement Activity | 2023 |
URL | https://www.aqua-faang.eu/final-conference.html |
Description | AQUA-FAANG Final conference, AQUA-FAANG relevance to industry session, Peter Harrison talk on Ensembl gene annotation, regulation and variant effect prediction for aquaculture |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | At the Horizon 2020 AQUA-FAANG project final conference, Peter Harrison gave a talk on Ensembl gene annotation, regulation and variant effect prediction for aquaculture. The talk was part of the AQUA-FAANG relevance to industry session on the industry engagement day of the conference. The audience asked questions, in particular about Ensembl's Variant Effect Predictor tooling. |
Year(s) Of Engagement Activity | 2023 |
URL | https://www.aqua-faang.eu/final-conference.html |
Description | BovReg Final Conference - Peter Harrison gave a talk on EuroFAANG Data Infrastructure: standardizing and presenting BovReg and global FAANG data |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | At the BovReg Final Conference, Peter Harrison gave a talk on EuroFAANG Data Infrastructure: standardizing and presenting BovReg and global FAANG data. This included the Ensembl cattle annotation and the need to improve gene and regulatory annotation in the coming years. |
Year(s) Of Engagement Activity | 2024 |
URL | https://bovreg.eu/bovreg-final-conference/ |
Description | BovReg Final Conference - Future of Functional Annotation beyond BovReg panel discussion |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | During the final conference for the Horizon 2020 project BovReg, Peter Harrison contributed to panel discussions in the Future of Functional Annotation beyond BovReg. The discussion included the future annotation and regulatory builds for cattle, and how this award could support those efforts going forward to ensure this becomes available to the community through Ensembl. |
Year(s) Of Engagement Activity | 2024 |
URL | https://bovreg.eu/bovreg-final-conference/ |
Description | ISAG 2023 - FAANG workshop Panel discussion |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | At the International Society for Animal Genetics Conference in Cape Town, Garth Ilsley (EMBL-EBI) and Emily Clark (Roslin) were involved in an open discussion on the implementation of the next phase of FAANG. The discussion included efforts to make functional annotation more accessible, including to other spaces such as industry and animal breeders, and whether it was possible to outsource annotation efforts to the community. The discussion was very relevant to the development of priorities for the Ensembl Genome Browser for farmed and domestic animals, particularly that the community saw additional regulatory builds for more species as a priority. |
Year(s) Of Engagement Activity | 2023 |
URL | https://www.isag.us/Docs/Proceedings/ISAG_2023_Abstracts.pdf |
Description | PAG 31 - Workshop: Genome Annotation Resources at the EBI - Adam Frankish (on behalf of Jane Loveland) gave a talk on Vertebrate Genomes in Ensembl |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | At the PAG 31 conference, in the Workshop on Genome Annotation Resources at the EBI, Adam Frankish gave a talk on Vertebrate Genomes in Ensembl. This was given on behalf of Jane Loveland, who could not attend for personal reasons. The talk covered Ensembl (www.ensembl.org) infrastructure for accessing genomic information covering over 300 vertebrate species, including cattle, pig, sheep, horse and chicken. We generate automatic, evidence-based genome annotation from multiple lines of evidence. The audience asked questions and requested more information on how to access these resources. |
Year(s) Of Engagement Activity | 2024 |
URL | https://pag.confex.com/pag/31/meetingapp.cgi/Paper/52768 |
Description | PAG31 - Panel Discussion Implementing the Next Phase of FAANG |
Form Of Engagement Activity | A formal working group, expert panel or dialogue |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Other audiences |
Results and Impact | At the global FAANG workshop at the Plant and Animal Genomes conference (PAG31), Emily Clark (Roslin Institute) and Peter Harrison (EMBL-EBI) were involved in an open discussion on the implementation of the next phase of FAANG. The discussion included the task forces and efforts to make functional annotation more accessible, including to other spaces such as industry and animal breeders. The discussion was very relevant to the development of priorities for the Ensembl Genome Browser for farmed and domestic animals. |
Year(s) Of Engagement Activity | 2024 |
URL | https://plan.core-apps.com/pag_2024/abstract/a6603333-8741-4c5d-a45f-f5d49ee1d01c |