RNAcentral, the RNA sequence database

Lead Research Organisation: University of Manchester
Department Name: School of Biological Sciences

Technical Summary

Under this proposal, we will continue the development of RNAcentral, an international database of non-coding RNA (ncRNA) sequences, currently made up of sequence data contributed by 15 member databases. To make RNAcentral more comprehensive, we will import 21 additional ncRNA databases and carry out regular data releases. In addition to the core sequence data, our users care most about functional annotation of ncRNAs. We will therefore focus on incorporating additional types of annotation, such as high-quality secondary structures, inter-molecular interactions, Gene Ontology (GO) and Sequence Ontology (SO) terms, and textual annotation from Wikipedia. We will map RNAcentral sequences onto appropriate reference genomes, and provide new functionality such as exploring overlapping sequences in the same species. New visualisations will be developed to display these new data, taking advantage of modern web technologies. In order to increase the sustainability of RNA databases worldwide, we will develop prototype RNAcentral infrastructure elements that we will make available to RNAcentral database contributors. To this end, we will develop an improved sequence search facility in collaboration with the miRBase database, and make it available to miRBase through a RESTful API, so that they can search their own sequence data and display the results on their own website. This functionality will subsequently be made available to other RNAcentral databases. To disseminate information about RNAcentral, we will engage in outreach and training activities by hosting workshops, holding annual Scientific Advisory Board (SAB) meetings, and publishing biennial papers in the Nucleic Acids Research (NAR) Database Issue. RNAcentral, as a comprehensive repository of ncRNAs, will underpin a global effort to unravel the functions of ncRNAs.
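As an illustration only of what "exploring overlapping sequences in the same species" could involve once sequences have been mapped to a reference genome, the following minimal Python sketch groups mapped sequences by species and chromosome and reports pairs whose coordinates overlap. The record layout, field names and identifiers are hypothetical and are not taken from RNAcentral; coordinates are assumed to be 1-based and inclusive, and strand is ignored.

```python
from collections import defaultdict
from typing import NamedTuple


class Mapping(NamedTuple):
    """Hypothetical record: one sequence mapped to one genomic location."""
    seq_id: str      # illustrative identifier, not a real accession
    species: str
    chromosome: str
    start: int       # 1-based, inclusive (assumption)
    end: int         # 1-based, inclusive (assumption)


def overlapping_pairs(mappings):
    """Return sorted pairs of sequence ids whose mapped regions overlap
    on the same chromosome of the same species (strand is ignored)."""
    groups = defaultdict(list)
    for m in mappings:
        groups[(m.species, m.chromosome)].append(m)

    pairs = set()
    for group in groups.values():
        # Sweep left to right, keeping only regions that can still overlap.
        group.sort(key=lambda m: m.start)
        active = []
        for m in group:
            active = [a for a in active if a.end >= m.start]
            for a in active:
                if a.seq_id != m.seq_id:
                    pairs.add(tuple(sorted((a.seq_id, m.seq_id))))
            active.append(m)
    return sorted(pairs)


example = [
    Mapping("seqA", "human", "chr1", 100, 200),
    Mapping("seqB", "human", "chr1", 150, 250),
    Mapping("seqC", "human", "chr1", 300, 400),
    Mapping("seqD", "mouse", "chr1", 120, 180),
    Mapping("seqE", "human", "chr2", 100, 200),
]
result = overlapping_pairs(example)
```

In this example only seqA and seqB are reported, because the other regions lie on a different chromosome, in a different species, or do not intersect. A production system would more likely use an interval index or a genome-aware database query than this in-memory sweep.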

Planned Impact

Non-coding RNAs are found in every living organism, and advances in ncRNA research, reflected in and supported by RNAcentral, will contribute to new applications in biotechnology, therapeutics, agriculture, and ecology. RNAcentral, as a comprehensive database of ncRNA sequences, indirectly contributes to all BBSRC strategic objectives: food security, biofuels, industrial biotechnology and human health. RNAcentral will be used by bioinformaticians and wet-lab scientists in both academia and industry working on all aspects of ncRNA biology. As sequencing technologies become more advanced and new RNA structure-probing technologies emerge, there is a growing need to maintain a comprehensive and well-annotated collection of all ncRNAs.

By capturing and disseminating this valuable knowledge, we will be addressing the BBSRC's enabling theme of innovation, allowing industrial partners to make more rapid discoveries and inventions of benefit to society. RNAs hold great hope for ever-wider clinical and biotechnological applications. For example, microRNAs have been implicated as diagnostic signatures for cancer, snoRNAs in the major Prader-Willi phenotypes, bacterial small RNAs in pathogenicity, plant small RNAs in hybrid necrosis, and ribozymes in the cleavage of specific target RNAs. Improved annotation of, and access to, RNA data will aid the discovery and use of novel RNAs as diagnostic markers and drug targets. There is intense research in the field of RNA-based therapeutics, which hold some promise to improve health and welfare internationally. In the area of plant sciences, we expect our annotations to be of use in genome engineering to improve disease resistance and crop yields. In addition, the ability to make RNAs in very large quantities has raised the idea of using RNA directly as a weed and pest control measure through crop spraying.

A number of commercial organisations manufacture experimental resources, for example microarrays, based on up-to-date gene annotation. Some resources have also been made available for specific classes of non-coding RNA gene; for example, several companies make microRNA detection kits. The companies themselves will therefore benefit from improved annotation of non-coding RNAs, and these resources underpin experimental studies in commercial and academic organisations. Along with the more clinical aspects described above, RNAcentral helps to foster wealth creation through innovative application of RNA sequence information.

Non-coding RNAs such as ribosomal RNAs have long been used as a tag to identify species. Application of high-throughput sequencing has opened up opportunities to understand biodiversity on an unprecedented scale. A better understanding of biodiversity and how it is changing will enhance our ability to manage and conserve the world's great natural genetic resources.

Having all known non-coding RNA sequences in a single resource gives a much easier overview of the growth and impact of RNA data. For example, one can compare the number of RNA genes versus protein coding genes in a genome. This will allow policy makers and funders to better gauge the scale of support needed to maximise output compared to other priorities.

Publications

The RNAcentral Consortium (2017) RNAcentral: a comprehensive database of non-coding RNA sequences. Nucleic Acids Research.

The RNAcentral Consortium (2019) RNAcentral: a hub of information for non-coding RNA sequences. Nucleic Acids Research.

 
Description: The RNAcentral resource has had 17 public releases. 40 different databases contribute their RNA data. The number of users has been steadily increasing. RNAcentral has developed methods and interfaces to search the millions of sequences in the database. In this grant, we have developed a process by which other RNA databases and resources can use these RNAcentral searches in their own web pages. The miRBase database (funded under BB/M011275/1) has implemented this search interface.
Exploitation Route: RNAcentral is used by RNA researchers around the world. Its continued availability is vital to many different research programmes in academia and industry.
Sectors: Healthcare, Pharmaceuticals and Medical Biotechnology

URL: http://rnacentral.org/
 
Description: This grant funds the development of a sequence database, RNAcentral (http://rnacentral.org/). RNAcentral is used by researchers around the world as a primary sequence database of RNA gene sequences, and is the only resource of its kind. The uses are wide-ranging, but include commercial and pharmaceutical organisations with interests in non-coding RNA sequence and function. The papers describing RNAcentral have been cited over 320 times (Google Scholar).
Sector: Education, Pharmaceuticals and Medical Biotechnology
Impact Types: Economic

 
Title: RNAcentral
Description: RNAcentral aims to collect all known non-protein-coding RNA sequences in a single resource.
Type Of Material: Database/Collection of data
Year Produced: 2014
Provided To Others?: Yes
Impact: RNAcentral has brought together over 40 independent expert RNA databases. There have been 8 releases of the database.
URL: http://rnacentral.org/