The GARNet Transcriptomics and Bioinformatics service III

Lead Research Organisation: University of Nottingham
Department Name: Sch of Biosciences

Abstract

This grant application requests a continuation and enhancement of work extending back many years, in which NASC has acted as a public data warehouse and service for the model plant Arabidopsis. For several years, the UK Arabidopsis community has had access to a variety of technological and informatic (database and computational) services provided by NASC under the umbrella of a multi-site consortium called GARNet. These included very large numbers (thousands) of GeneChip results recording what all of the 22,000+ genes in a plant were doing at various times and under defined conditions. This GeneChip data from NASCarrays has been disseminated to the plant user community through an open website with no restrictions on use or location. The GARNet service also made it possible to maintain, develop and renovate the much older NASC database and user catalogue relating to seed and DNA stock data and the distribution of seed stocks. In addition, the programme has allowed NASC to co-operate with all of the relevant international bodies in order to build better, more useful and more open services. As part of this, NASC has adopted the common languages (also known as ontologies and controlled vocabularies) that have been internationally defined to make interchange between widely distributed databases possible and precise. NASC is applying for support to continue these integrated information services under the guidance of the democratically elected Arabidopsis community advisory body / GARNet. NASC also wishes to extend the service to include stronger and more user-friendly analysis tools for all users of GeneChip data on Arabidopsis, regardless of their location or local facilities. This last approach is in response to specific needs identified by users from the plant community.

Technical Summary

This grant application requests a continuation and enhancement of NASC's bioinformatics activities as a public data warehouse and service for Arabidopsis. The GARNet programme has provided transcriptomics, phenotype and various associated bioinformatics services for the Arabidopsis community via NASC. GeneChip data from NASCarrays has been disseminated to the wider user community through an open website with no restrictions. The GARNet service has also provided the entire NASC informatics infrastructure for maintenance and development of our European seed and DNA stock data and distribution catalogue. In addition, the programme has allowed us to adopt all relevant standards (MIAME, PO, PATO) and to co-operate with the appropriate bodies in transcriptomics, phenomics and distributed computing in order to build better and more useful services. We are asking for support to continue our integrated bioinformatics services under the guidance of the democratically elected Arabidopsis community advisory body / GARNet. We also wish to extend the service to include distributed access to GeneSpring workgroup software for GeneChip users, in order to meet some of the analysis needs identified by our users.

 
Description There are more than 22,000 Arabidopsis researchers in >9,000 laboratories worldwide. The Nottingham Arabidopsis Stock Centre (NASC) has a vital core role as infrastructure support for this highly distributed and prolific Arabidopsis community.

We provide materials, data and guidance worldwide (>100,000 seed tubes per annum, 2015-2019), and our existence helps tens of thousands of users to save time, money and effort through centralised services. Our outreach is extensive, regular and user-oriented, and we constantly strive to improve both our customer service and our value for the community.
Exploitation Route Our services are equally available to universities, institutes, companies and international users through simple, intuitive interfaces. Distribution abroad requires the same infrastructure as a purely UK resource but adds value by encouraging international donation of stocks and data, supplementing grant income and helping to consolidate the Arabidopsis and wider plant community. All European plant research groups requiring Arabidopsis stocks are obliged to use NASC, but thousands of non-European users also access our resource.
Sectors Agriculture, Food and Drink; Energy; Environment; Manufacturing, including Industrial Biotechnology; Other

URL http://arabidopsis.info
 
Description The increasing demands of a growing, prosperous world for improved agricultural products, including food, fibre and fuel, intensify the need for an extensive understanding of the basic biology and ecology of plants. Arabidopsis is the most widely used model system for studying plant biology and has delivered numerous breakthroughs in the understanding of plant and basic biological processes. The knowledge gained from studies in Arabidopsis advances our understanding of other plant species, particularly crop species, and thus translates into new or improved plant products and increased agricultural productivity. Arabidopsis has underpinned the genomic revolution in plant science and represents the template on which other plant and crop genomes are annotated and assessed. Arabidopsis data is key to modern crop science and, through that, to food security and quality of life. We are the European Arabidopsis Stock Centre and send out ~100,000+ tubes of seed worldwide each year (up to 140,000 in some years). Any one of those tubes can enable or inform a project that may result in any of these impacts. Together with the US stock centre, we support the world Arabidopsis community, and our impact is through them.
Sector Agriculture, Food and Drink; Energy; Environment; Manufacturing, including Industrial Biotechnology; Other
Impact Types Economic

 
Description BBSRC BBRF
Amount £1,092,256 (GBP)
Funding ID BB/P024068/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 11/2017 
End 10/2021
 
Title NASCarrays 
Description From 2002 to 2013 NASCarrays was the primary world database for Arabidopsis transcriptomics data. We generated the majority of the world's public Arabidopsis transcriptome data through our physical (early technical access) Affymetrix array service, and served this data without restriction to the world community. For the first few years we produced and released more transcriptomics data by volume than the Human or Mouse communities (although, because of the proliferation of participating sites, we were overtaken once those communities and a few others had caught up with our lead). All of our data was given away [as per our remit] to collaborating and competing projects such as the EBI (ArrayExpress), GEO in the US, BAR in Canada, and many other academic sites and commercial entities such as GenevestigATor. As a natural consequence of the rapid proliferation in sites and analysis tools (tool development beyond basic access and correlation was not in our remit) and the increased ease of access to transcriptomic techniques, many of these other sites were (and are) largely dependent on the volume and quality of our data. Following BBSRC committee funding advice, we ended the NASCarrays database in 2013 and gracefully retired the data to these sites (especially GEO as the perceived dominant site). To ensure good data practice, we made appropriate safety backups at iPlant in the US and also retain frozen-curation FTP access at NASC to all data. The Arabidopsis community still uses our data, re-badged at other sites with variable levels of attribution to NASCarrays.
Type Of Material Database/Collection of data 
Provided To Others? Yes  
Impact We were the first centre to provide sufficient transcriptomic data in the reliable and reproducible Affymetrix format for real bioinformatic correlation studies. We were also used as domain champions for standards and for the instantiation of activities such as MIAME and large database integration. This was both in collaboration with other providers/databases and directly with end users in biology and computational fields. Probably more importantly, because we made the data available to the wider community, many researchers picked up our data and used it independently (although that can be harder to track).
URL http://affymetrix.arabidopsis.info
 
Title arabidopsis.info 
Description The NASC germplasm database holds data on just under 1 million stocks that have been acquired since the centre began in 1990/91. We replaced the Arabidopsis Information Service in Germany (1964 - 1987) and acquired all 200+ stocks from them. In 1999 this was increased to 20,000, and in 2013 the collection passed the 800,000 mark. We hold genomic, genetic, phenotypic, passport, collection, image and other sundry data about these stocks and make this information freely available to researchers and collaborators worldwide. As part of the database we run a cost-recovery catalogue for ordering stocks, which includes user data and a fully developed e-commerce solution bespoke to NASC. We also integrate our data with external and internal databases using a variety of methods, from direct data exchange through dynamic URLs to fully fledged Web Services (SOAP and REST).
Type Of Material Database/Collection of data 
Provided To Others? Yes  
Impact This database underpins the distribution of Arabidopsis germplasm resources to the UK and European plant community. It also provides the same service to worldwide users in collaboration/complementarity with the US stock center ABRC. This has eased and accelerated the uptake of germplasm and its associated data, and has supported Arabidopsis and other plant research, helping to make Arabidopsis one of the most productive model species.
URL http://arabidopsis.info
 
Title atensembl 
Description AtEnsembl was the first Ensembl browser/database to be produced by non-EBI staff (or ex-staff). We took the data from our http://ukcrop.net database (AceDB format) and integrated it with new genomics/sequencing data. We added data from our germplasm activities and over time added NASCarrays and other available datasets. As a result, we were the first genome browser of any species in the world to integrate genome data with stock ordering and transcriptome data. We also included early SNP data from primitive arrayseq and were consequently the most richly populated genome browser of any species for many years. Some current browsers still do not have this richness of function, and almost all other species browsers have not reached the point that we were at when we officially closed the database. Increased computational skills in the community led to an inevitable international proliferation of browsers, several of them using our freely available data. Our efforts, and the efforts of our collaborators at TAIR/ABRC, to make access to our services as open and exploitable as possible helped to make the proliferation of browsers and analysis tools at multiple sites possible and attractive, and helped to develop the current community options. We officially closed the database when renewal of funding failed (lack of uniqueness). A snapshot is still available at the old URL as per 'good practice' but is not actively linked from our other resources, to ensure data quality/currency.
Type Of Material Database/Collection of data 
Provided To Others? Yes  
Impact When we were unique, we were a critical point in supporting the community with an integrated seed/genome/array data browser. The first generation of Gramene's Arabidopsis data and the EBI's own Arabidopsis plant-ensembl database were directly derived from and attributed to us. After the proliferation of competing browsers led to a perception by the funding committee that we were not unique and should concentrate on our core (unique in Europe) remit of germplasm distribution, we dropped our browser and integrated with the prevailing providers. Throughout this process and during our own development, we always made our data freely and openly available to others (collaborators or competitors) in the community and were therefore instrumental in catalysing a range of current bioinformatics browser systems for plants (e.g. AIP).
URL http://atensembl.arabidopsis.info