Establishment of International Plant and Insect Pathogen Sequence Database (IPIPSD) Using Existing Deep Sequencing Data
Lead Research Organisation:
University of Oxford
Department Name: Zoology
Abstract
This international collaboration project aims to exploit existing datasets using new bioinformatics tools for pathogen screening and identification. The datasets were newly generated by two leading international consortia in the 1000 Plant (1KP) transcriptome and 1000 Insect Transcriptome Evolution (1KITE) projects. Ongoing bioinformatics analyses are co-ordinated by the China National Genebank (CNGB). The pathogen screening tool was established in a previous NERC-funded project awarded to the PI under the Technology Proof of Concept programme. The new opportunity rests on the mutual benefits of conducting a pathogen screening study using the two large sequencing datasets and the new bioinformatics pipeline. The objective is to set up an International Plant and Insect Pathogen Sequence Database (IPIPSD) that can archive, sort and represent pathogen sequences from datasets generated by Next Generation Sequencing (NGS) technology. The 1KP and 1KITE projects were designed to make gene and gene function discoveries in non-model species living in a wide variety of environmental/ecological niches. The shotgun NGS strategy indiscriminately produced sequences derived from the plants/insects as well as from the pathogens that naturally infected them. We propose to search, harvest, annotate and sort pathogen sequences from the 1KP and 1KITE libraries. IPIPSD will be largely composed of non-model species and is thus distinct from other existing databases. The database will be published in peer-reviewed scientific journals and made freely accessible to the public via the World Wide Web. To publicise IPIPSD and to promote the pathogen screening technology, an International Workshop of Pathogen Screening is proposed at the 9th International Conference of Genomics in November 2014.
Once set up, IPIPSD will facilitate pathogen screening in environmental samples, providing a much-needed knowledge advance in support of environmental/ecological studies. Furthermore, IPIPSD will serve as a platform for large, long-term international collaborations, e.g., the Genome 10000 (Genome 10K) and 5000 Insect and Arthropod Genome Sequencing Initiative (i5K), which are also hosted at CNGB.
Planned Impact
The specific users are the academic community, including pathologists, virologists, microbiologists, plant scientists, entomologists, biologists, bioinformaticians, ecologists, and environmental scientists. This project will set up an International Plant and Insect Pathogen Sequence Database (IPIPSD) that facilitates pathogen screening in plants and insects. The database will be based on existing datasets produced by two current international projects: the 1000 Plant (1KP) transcriptome and the 1000 Insect Transcriptome Evolution (1KITE) projects. The vast majority of species used in these two projects are non-model species living in natural conditions. IPIPSD will therefore have particular potential for application in environmental and ecological research. In the long term, IPIPSD will help ecologists to assess plant and insect anti-pathogen immunity in natural conditions. For biologists, a "bigger" view of pathogen profiles in experimental systems offers the opportunity to extend hypothesis testing to natural conditions. For environmental scientists, environmental change affects emerging disease issues not only in humans but also in the plant and animal communities that support the global ecosystem. Bioinformatics is becoming increasingly important in all aspects of the biological sciences. We will collaborate with the strong bioinformatics team at the China National Genebank (CNGB), China, to set up IPIPSD and to make it freely available to research communities worldwide via the World Wide Web. IPIPSD will be published in peer-reviewed scientific journals. Before IPIPSD is submitted for publication, an International Workshop on Pathogen Screening will be organised at the 9th International Conference on Genomics to popularise IPIPSD and to obtain comments and feedback from experts as well as from wider users.
Non-academic users who will benefit from this project in the long term are agricultural and horticultural managers, farmers, and conservation managers. Policy makers and the general public will also benefit from the scientific problems/questions solved by using IPIPSD and the associated bioinformatics tools. We expect to discover and record previously unknown pathogen prevalence in the species used in the 1KP and 1KITE projects. We do not expect all of these pathogens to pose immediate threats to the environment and human health, but we hypothesize that the balance between infections and host immunity is important for maintaining environmental health and wealth. All data generated in this project will be made available to the NERC Environment Information Data Centre (EIDC) for release to the general public and for reuse.
We are aware that there may be commercial opportunities when IPIPSD is completed, particularly in the area of pathogen survey in commercial goods that include live/raw plant and animal materials. The investigators have worked together with the NERC technology transfer team and will work with the team again if opportunities develop.
The proposed IPIPSD offers an excellent opportunity to contribute UK scientific excellence to the international research community. It also enhances the capability to screen for pathogens in live/raw plant/food/insect exports/imports, potentially increasing the nation's health, wealth, and economic competitiveness.
People |
ORCID iD |
Hui Wang (Principal Investigator) |
Publications
Gao S (2014) Applications of RNA interference high-throughput screening technology in cancer biology and virology. in Protein & Cell
Luo J (2017) Micropathogen Community Analysis in Hyalomma rufipes via High-Throughput Sequencing of Small RNAs. in Frontiers in Cellular and Infection Microbiology
Ma J (2015) Mutational bias of Turnip Yellow Mosaic Virus in the context of host anti-viral gene silencing. in Virology
Malham SK (2014) The interaction of human microbial pathogens, particulate material and nutrients in estuarine environments and their impacts on recreational and shellfish waters. in Environmental Science: Processes & Impacts
Wang Y (2014) A survey of overlooked viral infections in biological experiment systems. in PLoS ONE
Zhao F (2017) Comparative transcriptome analysis of PBMC from HIV patients pre- and post-antiretroviral therapy. in Meta Gene
Zhou C (2018) Characterization of viral RNA splicing using whole-transcriptome datasets from host species. in Scientific Reports
Description | We developed a virus screening pipeline to search for virus signals in existing datasets generated by next generation sequencing technology. Using the pipeline, we detected many unexpected virus infections in experimental samples, and some of the findings have been published in peer-reviewed scientific journals. |
Exploitation Route | We have developed a web-based system which is hosted by the China National Gene Bank. The website is http://116.6.107.16/virus4onekp/ |
Sectors | Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software); Environment; Pharmaceuticals and Medical Biotechnology |
Description | Advisor for virus screening |
Geographic Reach | Local/Municipal/Regional |
Policy Influence Type | Influenced training of practitioners or researchers |
Title | Virus screening |
Description | A bioinformatics pipeline was developed for screening virus sequences in next generation sequencing datasets |
Type Of Material | Technology assay or reagent |
Year Produced | 2015 |
Provided To Others? | Yes |
Impact | This method reduced the computing resources required for screening viruses in deep sequencing datasets. |
Title | Virus database |
Description | The database provides information on viruses identified in the 1000 Plant (1KP) transcriptome project. |
Type Of Material | Database/Collection of data |
Provided To Others? | No |
Impact | The database will be included in future publications. |
Description | Virus screening in China National Gene Bank |
Organisation | Beijing Genomics Institute |
Country | China |
Sector | Academic/University |
PI Contribution | I designed and supervised the virus screening activities carried out by the collaborator, China National Gene Bank. |
Collaborator Contribution | The China National Gene Bank provided datasets and facilities. |
Impact | The establishment of the virus screening pipeline enabled the China National Gene Bank to screen for virus infections in its plant and insect datasets. |
Start Year | 2014 |
Description | Virus screening workshop (Shenzhen) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Five leading international researchers gave talks on virus screening and anti-viral strategies at the virus workshop of the 9th International Conference of Genomics, 09-12 Sept 2014, Shenzhen, China. The workshop attracted 40-70 academic and industrial researchers. |
Year(s) Of Engagement Activity | 2014 |