BBSRC-NSF/BIO: Next generation collaborative annotation of genomes and synteny

Lead Research Organisation: European Bioinformatics Institute
Department Name: Genome Assembly and Annotation

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Technical Summary

Development of the genome annotation tools Artemis and Apollo has run in parallel for almost 20 years. The Berkeley-based Apollo team and the Sanger-based Artemis team have in some cases produced clear alternative paradigms for viewing and annotating genome data, but a more predominant theme has been convergence in purpose and approach. Particular strengths of the Apollo system are its performance, scalability, and interoperability. We will build upon Apollo infrastructure to include components that have been essential to Artemis users and have been frequently requested by the Apollo community. These will include developing "snap-to-grid" functionality that auto-aligns exons to reading frames during interactive editing; support for small-scale users to load draft annotated genomes from a file, make changes and store them in the same file (a highly used feature of Artemis); and parallel viewing of the same sequence at two zoom levels. The Artemis Comparison Tool (ACT) is built upon Artemis software and enables comparative genomics data and synteny to be explored in the context of genome annotation. We will include ACT-influenced views in the new Apollo. Moreover, adaptors will be created that allow Apollo to present genome annotation and comparative views directly from Ensembl databases and APIs. This will enable multiple users to remotely perform fast synteny-guided annotation of multiple genomes, in a way that was previously possible only for small genomes using locally edited flat files.

The new Apollo will replace the existing Artemis and Apollo projects and lead to collaborative development and long term sustainability. This tool will support scalable short-term annotation projects as well as long-term curation, enabling non-experts to annotate and curate across the full range of sequenced genomes.

Planned Impact

The primary beneficiaries of this work will be genomic scientists in academic and industrial research and in education. In a research context, the tool will be used by professional annotators ("biocurators") during genome projects, to edit automated annotation of gene models, and to update and improve those annotations based on new evidence. Due to the increasing number of species that require curatorial attention, it is vitally important that expertise is shared as broadly as possible; combining the two major annotation tools into a single new generation tool will allow convergence in the way genomes are annotated and curated, helping to solidify and establish best practices for professional biocuration. The software will also encourage participation in genome annotation by researchers who are interested in a particular species (and may be domain experts in that species) but are not professional biocurators; for example, members of the research communities that work with that species and so are downstream users of the gene models being produced. Because the tool is based on a popular browser with a well-established "look and feel", most users should find creating or editing annotation intuitive and satisfying.

The current Artemis and Apollo tools are used extensively for teaching purposes, as well as research. Several projects in the US have incorporated Apollo into undergraduate teaching, and Artemis has been an integral part of bioinformatics training workshops for several thousand junior researchers around the world. Artemis has also been used in an engagement project involving more than 70 UK schools. For many students at all levels, experiencing genome annotation first hand is an eye-opening way to understand common concepts in genomics, genetics and molecular biology. In the proposed work a new generation annotation tool will be produced, merging existing annotation paradigms. This will enable a greater convergence in teaching approaches: it will no longer be necessary to train two independent communities. This will simplify the field from a student perspective and bring two communities of genome scientists together.

Publications

Harrison PW (2024) Ensembl 2024. in Nucleic acids research

Martin FJ (2023) Ensembl 2023. in Nucleic acids research

 
Description This award successfully advanced the development of an updated and intuitive Apollo interface through the integration of Artemis curation tools, a formal UX design process, and integration with key annotation workflows including Ensembl and GENCODE. The funded development phase has significantly advanced the features of Apollo, such as enabling data extraction from common file formats (GFF3 and FASTA) and allowing the Apollo Collaboration Server to display numerous views, including Synteny, Linear, Dotplot and the six-frame view from Artemis. Bringing these views together in a single friendly Apollo interface combines the annotation and curation strengths of the two tools: the views allow different interpretations and insights to be drawn from the data while allowing the features within them to be edited. The project has been developed to be easily deployable using the industry-recognised containerisation technology Docker. This has enabled deployments on a range of installations, including EMBL-EBI Cloud infrastructure. This ease of deployment is key to integrating Apollo into the annotation of key global resources, including those at EMBL such as Ensembl (https://www.ensembl.org/), GENCODE (https://www.gencodegenes.org/) and WormBase (https://parasite.wormbase.org/).

To ensure global reach, the development team has added the ability for people to log in using common internet accounts (such as Google or Microsoft). Once logged in, people can see where other users who are logged in at the same time are in the genomic space, providing an experience similar to reading a Google Doc with multiple others. To ensure that relationships between genomic entities in Apollo are "valid", the hierarchies that can be defined are now controlled by ontologies. For a user-chosen ontology, only options that fit within it are offered when a person tries to create a child feature of another feature. In addition, a "plugin infrastructure" has been put in place to allow organisations to create custom validations that enforce any specific local data rules.
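
The ontology-constrained hierarchy and the plugin validations described above can be pictured with the following minimal sketch. It does not reproduce Apollo's actual API or plugin interface: ALLOWED_CHILDREN is a hand-written, hypothetical table whose type names are only loosely modelled on Sequence Ontology terms, and child_options, register_validator and validate are illustrative names invented for this example.

```python
from typing import Callable, Dict, List, Set

# Hypothetical subset of "parent type -> permitted child types".
# A real deployment would derive this from the user-chosen ontology.
ALLOWED_CHILDREN: Dict[str, Set[str]] = {
    "gene": {"mRNA", "ncRNA"},
    "mRNA": {"exon", "CDS", "five_prime_UTR", "three_prime_UTR"},
}


def child_options(parent_type: str) -> List[str]:
    """Child feature types to offer when adding a child to `parent_type`."""
    return sorted(ALLOWED_CHILDREN.get(parent_type, set()))


# A validator takes a feature (here a plain dict) and returns error messages.
Validator = Callable[[dict], List[str]]
validators: List[Validator] = []


def register_validator(fn: Validator) -> Validator:
    """Register a custom, organisation-specific validation rule."""
    validators.append(fn)
    return fn


@register_validator
def gene_needs_name(feature: dict) -> List[str]:
    # Example local rule (an assumption): every gene must carry a name.
    if feature.get("type") == "gene" and not feature.get("name"):
        return ["gene features must have a name"]
    return []


def validate(feature: dict) -> List[str]:
    """Run every registered validator and collect all messages."""
    return [msg for check in validators for msg in check(feature)]
```

In this sketch the ontology only restricts which child types are offered, while the registered validators carry local rules that the ontology cannot express.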

In terms of the other major development objectives of the funded award, Apollo now supports synteny views, providing the views for comparative data previously available in ACT. The service has improved support for import and export of GFF3, an important file format standard. Significant progress has been made with the addition of Artemis-like features, particularly the six-frame view of reading frames. This work is led by Glasgow and is still ongoing as their grant started later. Of the planned work themes, the integration with annotation workflows has only been partially achieved because of the changing landscape of the APIs in Ensembl. These have been under highly active development for the latter part of the Apollo award, meaning that work against the current Perl and REST APIs would quickly become obsolete, as the APIs documented in the original proposal are now due to be retired. Integration has instead been achieved through an agreed workflow process that relies on FTP exchange, and Ensembl will update this process to the new GraphQL APIs if it becomes warranted in the future.
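
For readers unfamiliar with the six-frame view mentioned above, the sketch below shows the underlying idea: a DNA sequence can be read as codons in three offsets on the forward strand and three on the reverse strand. This is a minimal illustration only, not code from Apollo or Artemis; the function names and the frame labels ("+1" to "-3") are assumptions made for this example.

```python
from typing import Dict, List

# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def six_frames(seq: str) -> Dict[str, List[str]]:
    """Split `seq` into complete codons for each of the six reading frames."""
    frames: Dict[str, List[str]] = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            # Step through the strand three bases at a time, dropping any
            # incomplete codon at the end.
            codons = [s[i:i + 3] for i in range(offset, len(s) - 2, 3)]
            frames[f"{strand}{offset + 1}"] = codons
    return frames


frames = six_frames("ATGAAATAG")
```

A six-frame display then translates each codon list and draws the six tracks side by side, so that open reading frames on either strand are visible at once.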

In a major landmark for the Apollo resource, the beta release of Apollo was made in May 2023. An Apollo user summit was held shortly afterwards, in June, to engage with multiple user groups and establish requirements for future roadmaps in addition to the features outlined in the grant. This supplemented the already well-established User Experience design process for combining the best aspects of Apollo and Artemis into the new Apollo service. To set the scene for that process, there were several meetings with teams at EMBL-EBI to establish the current state of the art and feed key user experiences into the design sprints. We followed a formal UX design process: in-person and virtual design sprint meetings fed into the 2023 user summit, which provided direction on the required features, particularly from the larger annotation resources WormBase, VEuPathDB, GENCODE and Ensembl.

Although the award has now ended, Apollo continues to be developed through the closely established collaboration between the University of California, Berkeley, EMBL-EBI and Glasgow University. This work continues to be guided by the well-established User Experience design process, which involves close collaboration with the user community and annotation resources.
Exploitation Route The WebApollo interface, complete with Artemis curation tools, will support a wide variety of community annotation and curation projects. We expect that, with the refined and user-led design, we will see significant utilisation of the software from both academia and industry. It will also be the main curation software for Ensembl Havana and WormBase manual annotation efforts. The tool will be used across a wide range of species and industries, including healthcare settings for human and mouse, and agricultural species for annotations relating to food security and biodiversity.

The project has attracted interest from other genome annotation groups, including those involved in the METT and GENCODE. The METT has used dedicated Apollo instances administered by EMBL-EBI staff, allowing researchers to inspect hosted resources and create on-the-fly sessions with self-supplied data.

A significant development is that the GENCODE group will be adopting Apollo as its annotation tool. To facilitate this, effort has been put into integrating Apollo with the software used in Ensembl to assign stable identifiers to genomic features and to load the annotation data into Ensembl databases for dissemination through the Ensembl website to the research communities.

WormBase staff members have been involved in the stakeholder meetings and user summits. The Apollo software has been developed so that plugins can be written to extend it to meet the needs of those who use it, keeping the main tool applicable to a broad audience but customisable for those who require it.

There is also potential to host Apollo instances at EMBL-EBI to provide community annotation resources for various research communities.
Sectors Agriculture

Food and Drink

Healthcare

Pharmaceuticals and Medical Biotechnology

URL https://apollo.jbrowse.org/demo
 
Title Apollo 
Description Apollo is a web-based tool that enables collaborative on-line editing of genome annotations. The BBSRC/NSF award supports further development of the tool. 
Type Of Technology Webtool/Application 
Year Produced 2013 
Open Source License? Yes  
Impact None so far from the development supported by the award 
URL https://github.com/GMOD/Apollo
 
Title Apollo3 (WebApollo) 
Description Apollo3 is the latest iteration of the web-based tool that enables collaborative on-line editing of genome annotations. The BBSRC/NSF award supports further development of the tool. 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact None, not yet beta released 
URL https://github.com/GMOD/Apollo3
 
Description Apollo User summit 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact A three-day Apollo User Summit was held at the Wellcome Genome Campus, Hinxton, in June 2023. It provided an introduction to Apollo and talks from attending users on their needs and desires for Apollo to support their research and services. Working groups generated ideas for future features, with further sessions on idea refinement. The outcomes of the event fed directly into the development of the service to ensure it meets the needs of the community.
Year(s) Of Engagement Activity 2023