WormBase ParaSite

Lead Research Organisation: European Bioinformatics Institute
Department Name: Ensembl Genomes

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Technical Summary

WormBase ParaSite is a database that provides rapid access to new high-throughput genomic and related data from parasitic flatworms and roundworms (helminths). These data include genome sequence, gene expression data, and regulatory data. They are generally produced using massively parallel nucleotide sequencing strategies and need to be integrated and interpreted to inform parasitology. A major challenge is to provide structural and functional annotation of the genome assemblies, to update it automatically as new experimental evidence becomes available, and to maintain tracking between successive versions so that researchers can continue their work as the reference data sets improve.

ParaSite is mostly implemented through the re-use and, where necessary, the extension of technologies developed elsewhere. These include the MAKER pipeline (and other tools such as RepeatMasker and RFAM) for genome annotation, tools derived from LepBase for representation of genome quality, the Ensembl software stack for genome data management and preparation, and the BioMart data warehousing tool, which provides high-performance data discovery and retrieval for common use cases centred on genes. Both Ensembl and BioMart provide a web interface written in Perl and run under mod_perl embedded in an Apache web server, with MySQL (a common relational database management system) as the underlying data store. Increasingly, we are supporting the direct incorporation of data stored in binary, indexed file formats (e.g. BAM and CRAM for sequence alignments), which simplifies the database build process and improves performance. We are using the emerging track hub technology to organise these files so that users can locate and filter data of interest appropriately.
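As an illustration of how indexed alignment files can be arranged in a track hub, the sketch below renders trackDb stanzas using the standard `track`, `bigDataUrl`, `shortLabel`, `longLabel` and `type` settings of the track hub format. It is a minimal sketch, not the ParaSite build code: the `Track` record, the function names and any URLs passed in are hypothetical.

```python
from typing import Iterable, NamedTuple


class Track(NamedTuple):
    """One indexed alignment file exposed through a hub (hypothetical record)."""
    name: str         # unique track identifier, no spaces
    url: str          # location of the BAM file; its index is assumed to sit alongside it
    short_label: str  # brief label shown next to the track
    long_label: str   # fuller description of the track


def trackdb_stanza(track: Track, track_type: str = "bam") -> str:
    """Render a single trackDb stanza as 'setting value' lines."""
    lines = [
        f"track {track.name}",
        f"bigDataUrl {track.url}",
        f"shortLabel {track.short_label}",
        f"longLabel {track.long_label}",
        f"type {track_type}",
    ]
    return "\n".join(lines)


def render_trackdb(tracks: Iterable[Track]) -> str:
    """Join stanzas with blank lines, which is how trackDb files separate tracks."""
    return "\n\n".join(trackdb_stanza(t) for t in tracks) + "\n"
```

Calling `render_trackdb` on a list of `Track` records gives the text of a trackDb file. A complete hub would also need a hub description file and a genomes file, which are omitted here.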

Planned Impact

Across the globe, parasitic worms (helminths) cause a massive economic burden and are responsible for long-term, chronic diseases. Helminths are therefore studied with the aim of killing or controlling them. For pathogens with smaller genomes, particularly viruses, bacteria, and protozoa, access to genome data has transformed the way research is conducted, yielding major insights into the spread of infections and drug resistance and leading to the development of new drugs and vaccine candidates. A similar transformation is starting to take place in helminth research: rapid changes in sequencing technologies have driven down costs, and large-scale data on genomes and gene expression are becoming available. WormBase ParaSite was established in 2014 to enable the helminth research field to accelerate by exploiting the rapid growth in available data. By assisting helminth researchers, ParaSite will have an impact on governments, NGOs and companies with an interest in disease control.

Amongst the downstream beneficiaries of helminth research will be those suffering from infections: more than a billion people worldwide. Human infections, mainly amongst the poorest communities, can result in abdominal pain, anaemia, stunted growth and impaired mental development, malnutrition, fatigue, disfigurement, blindness, circulatory disorders, or liver and bladder pathologies. Some anthelmintic drugs do exist but, with an over-reliance on a small repertoire, the development and spread of drug resistance is an ever-present danger.

The global agriculture industry will also benefit from new helminth control measures. In the UK, potato farming is badly affected by potato cyst nematode, and livestock are affected by gastrointestinal nematodes and liver flukes.

WormBase ParaSite (WBPS) was launched to exploit the rapid increase in available helminth sequence data (genomes and gene expression data). Through the organisation, analysis and dissemination of these data, WormBase ParaSite aims to: (i) provide a clear, annotated representation of the functional regions of genome sequences; (ii) transfer knowledge from well-annotated to less well-annotated genomes; and (iii) allow comparisons between helminths so that differences between genomes can be correlated with the evolution of pathogenic traits. Automatic pipelines integrate new data to ensure that users can access an up-to-date interpretation of all available data, and the use of standard data query and retrieval interfaces reduces time that would otherwise be wasted in finding and re-formatting data to make it interoperable.

The new application will scale up WormBase ParaSite to ensure that the expected flood of new data (more numerous and more contiguous genome assemblies; new expression and variation data) can be processed and made useful to helminth researchers. Another objective is to ensure rapid releases so that these data are quickly disseminated to the community; a further objective is to provide training, in situ at prominent nodes of helminth research, to maximise the familiarity of researchers with the available data and tools.

A new portal within ParaSite will be aimed directly at researchers developing drug treatments. We will use sequence similarity to identify homologues of known drug targets from other species (as curated in the ChEMBL resource). We will provide filters that allow users to select parasite genes whose homologues have particular properties, such as inhibition by a drug that has reached clinical trials but has no known toxicology warnings, or aggregated scores that reflect the physico-chemical properties of a compound or drug. To predict new, exploitable target-compound combinations, users will be able to combine their results with relevant gene expression data (e.g. expression in a mammalian-infective stage) and with the absence of an orthologue in the parasite's host.
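The kind of filtering described above could look roughly like the following sketch. It assumes that each parasite gene has already been paired with its best-matching known drug target and annotated with the properties mentioned in the text. The `CandidateTarget` record, its field names and the `select_targets` function are hypothetical and do not reflect the portal's actual data model.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class CandidateTarget:
    """Hypothetical summary of one parasite gene and its homologous drug target."""
    gene_id: str
    max_clinical_phase: int             # highest trial phase reached by a drug acting on the homologue (0 = none)
    has_toxicology_warning: bool        # any known warning on that drug
    expressed_in_infective_stage: bool  # expressed in a mammalian-infective stage
    has_host_orthologue: bool           # an orthologue exists in the parasite's host


def select_targets(candidates: Iterable[CandidateTarget],
                   min_phase: int = 1) -> List[CandidateTarget]:
    """Keep genes that pass every filter named in the text above."""
    return [
        c for c in candidates
        if c.max_clinical_phase >= min_phase   # drug has reached clinical trials
        and not c.has_toxicology_warning       # no known toxicology warnings
        and c.expressed_in_infective_stage     # relevant expression evidence
        and not c.has_host_orthologue          # absent from the host
    ]
```

Each condition is a separate boolean test, so a user interface could expose them as independent filters and combine only those that are switched on.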

Publications

International Helminth Genomes Consortium (2019) Comparative genomics of the major parasitic worms. in Nature genetics

Howe KL (2021) Ensembl 2021. in Nucleic acids research

Cunningham F (2022) Ensembl 2022. in Nucleic acids research

Howe KL (2020) Ensembl Genomes 2020-enabling non-vertebrate genomic research. in Nucleic acids research

Gene Ontology Consortium (2021) The Gene Ontology resource: enriching a GOld mine. in Nucleic acids research

Bolt BJ (2018) Using WormBase ParaSite: An Integrated Platform for Exploring Helminth Genomic Data. in Methods in molecular biology (Clifton, N.J.)

 
Description We have developed and improved WormBase ParaSite (http://parasite.wormbase.org), a resource currently providing access to 210 genomes from 130 nematode and 39 flatworm species. Available data include genome assemblies, annotations, comparative genomics, and functional analysis. These are served through a range of query interfaces and tools, including genome browsers and a data mining platform. During the funded period we made 6 releases of the resource and added a number of new features to the platform, including a mechanism for capturing free-text comments on genes from the research community, and a sub-portal for exploring helminth gene expression data. We also provided training in the use of the resource (with interactive training workshops) at national and international conferences. The 2017 WormBase ParaSite article (PMID: 27899279) has 236 citations in PubMed.
Exploitation Route For those studying parasite-mediated pathologies, ParaSite provides an organised and efficient way to access the data, together with information about similarities and differences between genes and between species that could inform new strategies for control and treatment. A variety of interfaces (interactive and programmatic) are provided to facilitate data access.
Sectors Agriculture, Food and Drink; Chemicals; Pharmaceuticals and Medical Biotechnology

URL http://parasite.wormbase.org
 
Description The target audience of WBPS is primarily academia, where our resource has grown to be accessed by tens of thousands of users per year, with >200 citations of the 2017 paper recorded in PubMed. The most widely accessed species include medically and agronomically important species such as blood flukes, tapeworms and soil-transmitted helminths.
Sector Agriculture, Food and Drink; Healthcare
Impact Types Economic

 
Description WormBase: expanding the reference resource for helminth research
Amount £889,457 (GBP)
Funding ID MR/S000453/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 09/2018 
End 03/2022
 
Title WormBase ParaSite 
Description WormBase ParaSite is aimed at researchers engaged in parasitic worm genomics, encompassing flatworms as well as nematodes, and provides genome sequence, genome browsers, semi-automatic annotation and comparative genomics data for >160 species. Additional tools include a cross species data mining platform, protein and nucleotide sequence search, and a variant effect predictor to enable the analysis of different strain/isolate genomes in the context of the reference. 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
Impact The 2017 WormBase ParaSite article (PMID: 27899279) has 236 citations in PubMed. 
URL https://parasite.wormbase.org
 
Description Ensembl 
Organisation Ensembl
Country United Kingdom 
Sector Academic/University 
PI Contribution The Ensembl team at EMBL-EBI develop software and infrastructure for the storage and display of genomic data for selected species. WormBase ParaSite have deployed their software and infrastructure, with the specific goal of enabling genomics for the helminth research community.
Collaborator Contribution The Ensembl team at EMBL-EBI develop software and infrastructure for the storage and display of genomic data for selected species. WormBase ParaSite have deployed their software and infrastructure, with the specific goal of enabling genomics for the helminth research community.
Impact Continued WormBase ParaSite releases are reliant on Ensembl software.
Start Year 2014
 
Description WormBase consortium 
Organisation WormBase (Biology and Genome of C.Elegans)
Country United States 
Sector Charity/Non Profit 
PI Contribution The WormBase Consortium is led by Paul Sternberg of Caltech, Kevin Howe of the EBI, Matt Berriman of the Wellcome Sanger Institute, and Lincoln Stein of the Ontario Institute for Cancer Research. The consortium runs a model organism database containing data from research on C. elegans and other nematodes. WormBase ParaSite provides searching and data access capabilities that are not available through the WormBase website.
Collaborator Contribution WormBase curates reference genomes, which are then imported into WormBase ParaSite and provide important functional information for understanding the genomes of comparator species.
Impact Provision of annotated genomes for C. elegans and Brugia malayi
Start Year 2014
 
Description Presentation and training at annual "Molecular and Cellular Biology of Helminth Parasites" meeting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Poster presentation, extensive demonstration and feedback gathering with a cohort of ~100 helminth researchers.
Year(s) Of Engagement Activity 2018
 
Description Wellcome Advanced Course in Helminth Genomics - WormBase ParaSite component 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact In collaboration with Wellcome Genome Campus Connecting Science and the Wellcome Sanger Institute Parasite genomics group, we developed a week-long comprehensive training course on helminth genomics, covering topics such as genome assembly and annotation, and population genomics. A significant component of the course (1.5 days) was devoted to WormBase ParaSite. We delivered the course for the first time in September 2019 to a cohort of African helminth biologists in Ghana.
Year(s) Of Engagement Activity 2019
 
Description Workshop at the British Society of Parasitology Spring meetings 2018 and 2019 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact Practical workshop demonstrating common use-cases for WormBase ParaSite tools
Year(s) Of Engagement Activity 2018,2019