RNA proposal title - The RNAcentral database of non-coding RNAs
Lead Research Organisation:
European Bioinformatics Institute
Department Name: Ensembl Genomes
Technical Summary
We will create a federated database and associated web portal, RNAcentral, to accession, store and represent non-coding RNA sequence data. A database repository (using the Oracle Relational Database Management System) will be constructed as an extension to the European Nucleotide Archive, and new tools will be developed to facilitate the submission of RNA sequences. In addition to direct submission, the repository will be populated through import pipelines, based on agreed standards for data representation, from expert databases that have agreed to support the project (initially gtRNAdb, HGNC, lncRNAdb, miRBase, Modomics, piRNAbank, Pombase, Refseq, Rfam, the Ribosomal Database Project, RNAdb, sRNAmap, SRPDB, tmRDB, the tmRNA website and VEGA). A web portal will be developed (using the Drupal open-source content management system) providing access to the submitted and imported sequences, and links out to the expert resources' own sites. A data warehouse (using a common biological data warehousing tool such as BioMart or InterMine) will also be developed, and bulk downloads of sequence sets will be provided. In the second period of the project, we will develop further pipelines to identify redundancy among submissions, assign submitted sequences to defined families, and (with the aid of prediction tools such as Rfam and RNAmmer, and in collaboration with genomic and model organism resources) systematically provide complete sets of non-coding RNA annotations across all complete genomes. The resources developed under this proposal will serve as the core infrastructural component of a wider international initiative to coordinate work on functional RNAs.
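As an illustration of the redundancy-identification step mentioned in the summary, the following minimal Python sketch groups submitted records whose sequences are identical after normalisation. It is not the project's actual pipeline: the normalisation rule (upper-casing and writing U as T), the use of a SHA-256 digest as the grouping key, the function names and the example accessions are all assumptions made for this sketch.

```python
import hashlib
from collections import defaultdict


def normalise(sequence: str) -> str:
    """Remove whitespace, upper-case, and write RNA 'U' as 'T' (assumed convention)."""
    return "".join(sequence.split()).upper().replace("U", "T")


def sequence_digest(sequence: str) -> str:
    """Return a SHA-256 hex digest of the normalised sequence (assumed grouping key)."""
    return hashlib.sha256(normalise(sequence).encode("ascii")).hexdigest()


def group_redundant(records):
    """Group (source, accession, sequence) records by identical normalised sequence.

    Returns a dict mapping each digest to the list of (source, accession)
    pairs that share that sequence.
    """
    groups = defaultdict(list)
    for source, accession, sequence in records:
        groups[sequence_digest(sequence)].append((source, accession))
    return dict(groups)


# Hypothetical records: the accessions are invented for illustration only.
EXAMPLE_RECORDS = [
    ("miRBase", "HYPO_0001", "ugagguaguagguuguauaguu"),
    ("Rfam", "HYPO_0002", "TGAGGTAGTAGGTTGTATAGTT"),
    ("lncRNAdb", "HYPO_0003", "ACGUACGU"),
]

EXAMPLE_GROUPS = group_redundant(EXAMPLE_RECORDS)
```

In this sketch the first two hypothetical records fall into the same group, because their sequences differ only in case and in the U/T alphabet. Exact matching of this kind would catch only identical sequences; near-identical submissions would need a similarity-based method.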
Planned Impact
RNAcentral will provide an underpinning resource contributing indirectly to all BBSRC strategic objectives: food security, biofuels, industrial biotechnology and human health. It will be used by members of diverse life science research communities, ranging from bioinformaticians to experimental biologists to academic clinicians. RNAcentral will have an important impact in applications such as biotechnology, therapeutics, agriculture and ecology. The need for RNAcentral has become critical through the huge growth in discovery of non-coding RNAs from next generation sequencing. By capturing and disseminating this valuable knowledge, we will be directly addressing the BBSRC's enabling themes: data-driven biology, systems approaches to biosciences and synthetic biology. A fundamental part of the latter two themes is a complete "parts list" for each genome, and RNAcentral will help move science towards that goal and allow researchers to find all RNA genes in an organism easily.
RNAs hold great hope for ever-wider clinical and biotechnological applications. For example, microRNAs have been implicated as diagnostic signatures for cancer, snoRNAs in the major Prader-Willi phenotypes, bacterial small RNAs in pathogenicity, plant small RNAs in hybrid necrosis, and ribozymes in the cleavage of specific target RNAs. Again, improved annotation of and access to RNA data will improve the discovery and utilization of novel RNA targets for diagnostics and drugs. There is intense research in the field of RNA-based therapeutics, which hold some promise to improve health and welfare internationally.
A number of commercial organisations manufacture experimental resources, for example microarrays, based on up-to-date gene annotation. Some resources have also been made available for specific classes of non-coding RNA gene; for example, several companies make microRNA detection kits. The companies themselves will therefore benefit from improved annotation of non-coding RNAs, and these resources underpin experimental studies in commercial and academic organisations. Along with the more clinical aspects described above, RNAcentral will help to foster wealth creation through innovative application of RNA sequence information.
Non-coding RNAs such as ribosomal RNAs have long been used as a tag to identify species. Application of high throughput sequencing has opened up opportunities to understand biodiversity on an unprecedented scale. Better understanding biodiversity and how it is changing will enhance our ability to manage and conserve the world's great natural genetic resources.
Having all known non-coding RNA sequences in a single resource will allow for a much easier overview of the growth and impact of RNA data. For example, one will be able to compare the number of RNA genes versus protein coding genes in a genome. This will allow policy makers and funders to better gauge the scale of support needed to maximise output compared to other priorities.
Publications
Silvester N
(2015)
Content discovery and retrieval services at the European Nucleotide Archive.
in Nucleic acids research
RNAcentral Consortium
(2015)
RNAcentral: an international database of ncRNA sequences.
in Nucleic acids research
Pakseresht N
(2014)
Assembly information services in the European Nucleotide Archive.
in Nucleic acids research
Silvester N
(2018)
The European Nucleotide Archive in 2017.
in Nucleic acids research
The RNAcentral Consortium
(2017)
RNAcentral: a comprehensive database of non-coding RNA sequences.
in Nucleic acids research
Description | A new resource, RNAcentral (http://rnacentral.org), has been developed for accessing information about non-coding RNAs (i.e. functional RNA molecules that do not encode proteins). Over the course of 6 releases, information was integrated from 23 collaborating databases, and over 12,000,000 RNA sequences were included in the resource. Tools have been developed for clustering similar RNAs, exploring taxonomic distributions, searching for data by sequence or description, and for visualising non-coding RNA genes on the genome. As of January 2019, the database holds 14,476,418 sequences. |
Exploitation Route | Non-coding RNAs (ncRNAs) have a vital role in biology, but hitherto there have been few resources dedicated to them and it has been hard for researchers to find, search and download all relevant data. With the increasing prevalence of whole-genome and transcriptome sequencing, however, the number of known ncRNAs has significantly increased. RNAcentral makes these data available to researchers through an integrated portal, making exploration of all relevant data possible for the first time. |
Sectors | Agriculture, Food and Drink, Healthcare, Pharmaceuticals and Medical Biotechnology |
URL | http://rnacentral.org |
Description | The bioinformatics company Era7 uses RNAcentral to build a reference sequence database for metagenomics analysis, which is available in open-source and paid-for versions. |
First Year Of Impact | 2017 |
Sector | Environment, Healthcare |
Impact Types | Economic |
Description | Bioinformatics and Biological Resources |
Amount | £843,027 (GBP) |
Funding ID | BB/N019199/1 |
Organisation | Biotechnology and Biological Sciences Research Council (BBSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2017 |
End | 12/2019 |
Title | RNAcentral |
Description | A new database providing access to data about non-coding RNAs. The website has been continuously improved during 2016 with new features including a redesigned homepage, more relevant search results, and a lightweight genome browser. Over 1,000,000 new sequences were added in 2016. |
Type Of Material | Database/Collection of data |
Year Produced | 2014 |
Provided To Others? | Yes |
Impact | RNAcentral currently integrates data from 26 RNA-centric resources, with 4 new databases added in 2017. Having established itself as the leading resource for RNA sequence data, RNAcentral now also provides additional analysis by annotating all sequences with Rfam families and using these annotations for quality control. RNAcentral is regularly updated with 8 releases made available since its launch in 2014. The annual RNAcentral consortium meeting brings together numerous individual databases and provides a forum for discussion of common interests and priorities. The RNAcentral website had approximately 20,000 users in 2017 (measured by unique IP address). |
URL | http://rnacentral.org |
Description | A presentation on "RNAcentral: The Non-coding RNA Sequence Database" to the UK RNA meeting. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | A presentation on "RNAcentral: The Non-coding RNA Sequence Database" was made to the UK RNA meeting. |
Year(s) Of Engagement Activity | 2016 |
URL | https://rnauk2016.wordpress.com |
Description | Presentation in EMBL-EBI workshop on "Databases for microRNA and lncRNA Biology" |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | RNAcentral was presented and demonstrated at the EMBL-EBI workshop on "Databases for miRNA and lncRNA Biology". |
Year(s) Of Engagement Activity | 2016 |
URL | http://blog.rnacentral.org/2016/09/upcoming-workshop-databases-for-mirna.html |
Description | Presentation on "Bioinformatic Resources for ncRNA Analysis: RNAcentral and Rfam" at ELIXIR meeting in Aarhus. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A presentation on "Bioinformatic Resources for ncRNA Analysis: RNAcentral and Rfam" was given at the ELIXIR meeting at the Technical University of Denmark, Aarhus. |
Year(s) Of Engagement Activity | 2016 |
Description | Presentation on "Bioinformatic Resources for ncRNA Analysis: RNAcentral and Rfam" at St. Petersburg State University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A presentation on "Bioinformatic Resources for ncRNA Analysis: RNAcentral and Rfam" was delivered at St. Petersburg State University, Russia. |
Year(s) Of Engagement Activity | 2016 |
URL | https://bioseminars.wordpress.com/2016/04/11/bioinformatucs-ncrna/ |
Description | Presentation on "RNAcentral and Rfam: databases for non-coding RNA sequences and RNA families" at the Adam Mickiewicz University. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A seminar was delivered on "RNAcentral and Rfam: databases for non-coding RNA sequences and RNA families" at the Adam Mickiewicz University in Poznan, Poland. |
Year(s) Of Engagement Activity | 2016 |
URL | http://know-rna.amu.edu.pl/lecture-dr-petrov/ |
Description | Presentation on "RNAcentral and Rfam: databases for non-coding RNA sequences and RNA families" at the University of Toronto. |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A seminar was given on "RNAcentral and Rfam: databases for non-coding RNA sequences and RNA families" at the University of Toronto. |
Year(s) Of Engagement Activity | 2016 |
URL | https://torbug.org/schedule |
Description | Presentation on "Rfam and RNAcentral: Tools for understanding the RNA universe" at the University of Cambridge |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A seminar was delivered on "Rfam and RNAcentral: Tools for understanding the RNA universe" at the University of Cambridge. |
Year(s) Of Engagement Activity | 2016 |
URL | http://talks.cam.ac.uk/talk/index/64142 |
Description | Talk and demonstration on "Introduction to RNAcentral, a non-coding RNA sequence database" at Mahidol University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A talk and demonstration were given on the subject, "Introduction to RNAcentral, a non-coding RNA sequence database" at Mahidol University, Bangkok, Thailand. |
Year(s) Of Engagement Activity | 2016 |
Description | Talk and interactive session on "Searching and accessing non-coding RNA sequences with RNAcentral and Rfam" at EMBL-EBI workshop on "Exploring Biological Sequence Data" |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A talk and interactive workshop on "Searching and accessing non-coding RNA sequences with RNAcentral and Rfam" were delivered at the EMBL-EBI workshop on "Exploring Biological Sequence Data". |
Year(s) Of Engagement Activity | 2016 |
URL | http://www.ebi.ac.uk/training/events/2016/exploring-biological-sequence-data |
Description | Talk on RNAcentral at the Benasque meeting on Computational Analysis of RNA Structure and Function |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A presentation on RNAcentral was given at the Benasque meeting on Computational Analysis of RNA Structure and Function. |
Year(s) Of Engagement Activity | 2015 |
URL | http://benasque.org/2015rna/cgi-bin/talks/allprint.pl |
Description | Talk/demonstration on RNAcentral at EBI training course on Online resources for Non-Coding RNA |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | A talk and demonstration on RNAcentral were given at the EBI training course on Online resources for Non-Coding RNA. |
Year(s) Of Engagement Activity | 2015 |
Description | Webinar on RNAcentral: an international database of ncRNA sequences |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | A webinar on "RNAcentral: an international database of ncRNA sequences" was delivered. The webinar remains available online. |
Year(s) Of Engagement Activity | 2015 |
URL | http://www.ebi.ac.uk/training/online/course/rnacentral-webinar |