Arabidopsis Stock Centre Module.

Lead Research Organisation: University of Nottingham
Department Name: Sch of Biosciences

Abstract

NASC is a seed and data resource that emerged consensually out of the community 20 years ago and services a vigorous worldwide Arabidopsis community in collaboration with our sister centre ABRC in the USA.

Our resource is widely used and appreciated globally, as shown by our distribution statistics (between 50,000 and 100,000 stocks per year over the last 5 years, sent all over the world from each centre). Our users extend from crop scientists and biologists through to mathematicians and systems biologists. We are referenced in a very large number of publications as an underpinning resource for plant sciences, and have helped scientists to receive essential materials in a very cost-effective, straightforward, consistent and efficient manner for two decades.

As required by the BBSRC, we run the NASC stock centre as a partial cost recovery service that charges small fees for seed and clone stocks in order to subsidise our operations. To do this we have an e-commerce system linked up to our internal germplasm data module. This proposal is for the informatics and bioinformatics of the catalogue, germplasm curation and data distribution.

Primary objective: to continue and maintain informatic operations for our very busy (continually expanding and extending) germplasm distribution centre.

Secondary objective: to use current state-of-the-art integration technologies to stay relevant to changes and progress in the international Arabidopsis resource provision community.

Recent changes in the international Arabidopsis funding model have led to the imminent closure (this August) of the central Arabidopsis database resource TAIR in the USA. As a community we are being asked to share the responsibility for Arabidopsis data resources more generally across multiple sites of specialisation.

NASC is a critical partner in the future 'federated' model for Arabidopsis informatics.

Our germplasm resource has been separate from TAIR for ALL seed operations since 1991, so we are ready for this change. In addition, we were a founding member of earlier distributed resources within Europe (PLANet: framework V funding), and have prior experience and expertise in federation (internally and externally).

This proposal would ensure continuity and stability of physical and germplasm data resources for the European Arabidopsis community (and beyond). Our past experience, current capabilities and strong positive ties with the US stock centre (ABRC) and the US Arabidopsis Informatics Portal (AIP), will allow us to lead this area internationally.

Technical Summary

This proposal requests support for the survival and curational expansion of our current database and e-commerce site, with significant core changes and updates associated with increasing federation, fluid partnerships and local/remote virtualisation.

Primary objectives:
1. Continue and maintain informatic operations for our germplasm distribution centre including ongoing curation of stock additions (~20K per year).
2. Develop state of the art federated data integration based on existing infrastructure and experience.

Our current germplasm database and catalogue run on MySQL, Apache, Perl and Python on RedHat, with internal curation through the Servoy API (Java) and command-line scripts, plus Visual Basic pre-filtering of data. We have an AMIGO tree browser based on PO/PATO (EAV variant) for a subset of data, with some ecotypes served through GoogleMaps (KML and live). Financial cost-recovery operations are UoN bespoke (Generic Payment Pathway; WorldPay, scheduled to change provider in 2013) and connected to our internal e-commerce catalogue. Some data has already been migrated into iPlant (in development, not deployed) and we are working on local virtualisation to support migration out of existing hardware into UoN central repositories. Our legacy locus/genomics data associated with stocks are in MySQL, served via XML (by XSLT) and DHTML / JavaScript, plus legacy external Ensembl and internal GBrowse / CHADO with script-automated curation updates.

Current NASC Web Services (120+) are SOAP2 (and legacy BioMOBY) with initial RESTful conversions in progress.
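
As an illustration of what a SOAP-to-REST conversion can involve, one common pattern is to expose each read-only operation as a GET resource whose path is the operation name and whose arguments become query parameters. The sketch below is a minimal example of that pattern only; the base URL, the operation name `getStock` and the parameter `stock_id` are hypothetical and are not taken from the NASC services.

```python
from urllib.parse import urlencode


def soap_call_to_rest_url(base_url: str, operation: str, params: dict) -> str:
    """Map a SOAP-style operation call onto a REST-style GET URL.

    The operation name becomes a path segment and the call arguments
    become a sorted query string, so equal calls give equal URLs.
    """
    path = operation.strip("/")
    query = urlencode(sorted(params.items()))
    url = f"{base_url.rstrip('/')}/{path}"
    return f"{url}?{query}" if query else url


# Hypothetical operation and parameter names, for illustration only.
print(soap_call_to_rest_url("https://example.org/ws", "getStock", {"stock_id": "N1234"}))
```

Sorting the parameters keeps the generated URLs stable, which can help with caching and with comparing old and new service calls during a migration.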

TAIR closes this August. To be a core provider to the federated AIP that replaces TAIR and to maintain our services, we need to consolidate our data management approaches listed above; and capitalise on our early adoption of critical standards. We are already well placed in expertise and services, but the next few years will see significant change and we need to maintain our development impetus to stay functional.

Planned Impact

The increasing demands of a growing, prosperous world for improved agricultural products including food, fibre and fuel intensify the need for an extensive understanding of the basic biology and ecology of plants. Arabidopsis is the most widely used model system to study plant biology and has delivered numerous breakthroughs in understanding of plant and basic biological processes.

Knowledge gained from studies in Arabidopsis serves to advance our understanding of other plant species, particularly crop species, and thus translates into new or improved plant products and increased agricultural productivity. Arabidopsis has underpinned the genomic revolution in plant science and represents the template on which other plant and crop genomes are annotated and assessed. Arabidopsis data is key to modern crop science and through that to food security and quality of life.

Filing of patents is one measure of potential commercial activity and in 2010 there were 1,137 US utility patents referencing Arabidopsis compared to only 23 in 1994. A similarly dramatic 35-fold increase in European patent applications referencing Arabidopsis has occurred in the same timeframe.
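
As a quick check on the scale of the US figures quoted above, the fold increase can be computed directly from the two patent counts. The sketch below uses only the numbers stated in the text; the function name is illustrative.

```python
def fold_increase(later: int, earlier: int) -> float:
    """Return how many times larger `later` is than `earlier`."""
    if earlier <= 0:
        raise ValueError("earlier count must be positive")
    return later / earlier


# US utility patents referencing Arabidopsis: 1,137 in 2010 vs 23 in 1994.
us_fold = fold_increase(1137, 23)
print(f"US fold increase 1994-2010: {us_fold:.1f}")  # prints 49.4
```

On these figures the US increase is roughly 49-fold, the same order of magnitude as the 35-fold European increase.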

According to The Arabidopsis Information Resource (TAIR), as of May 10th 2011 there were 21,771 Arabidopsis researchers in 8,465 laboratories worldwide. The Nottingham Arabidopsis Stock Centre (NASC) and our sister centre ABRC in the US together have a vital core role as infrastructure support for this highly distributed and prolific Arabidopsis community.

Our services are equally available to universities, institutes, companies and international users through simple, intuitive interfaces. Distribution abroad requires the same infrastructure as a purely UK resource but adds value by encouraging international donation of stocks, supplementing grant income and helping to consolidate the Arabidopsis and wider plant community. All European plant research groups requiring Arabidopsis stocks are obliged to use NASC (all American users are obliged to use ABRC), but thousands of non-Europeans access our resource, particularly from Asia (notably China).

We provide materials, data and guidance worldwide and our existence helps tens of thousands of users to save time, money and effort through centralised services. Our outreach is extensive, regular and user-oriented and we constantly strive to improve both our customer service and our value to the community.

We have also been useful to BBSRC policy makers and marketing units through our inclusion in BBSRC publications: the BBSRC Data Sharing Policy documentation held NASC up as one of four examples of good practice; the 2009 BBSRC Bioscience Resources for Food Security pamphlet specifically flagged us as a key collection seed resource and our transcriptome analysis service as supporting the UK Food Security priority. These documents are utilised both by science policy makers and strategic users.

Publications

Castellanos-Uribe M (2020) Integrated BioBank of Luxembourg-University of Luxembourg: University Biobanking Certificate. in Biopreservation and biobanking

Eremina M (2016) Brassinosteroids participate in the control of basal and acquired freezing tolerance of plants. in Proceedings of the National Academy of Sciences of the United States of America

 
Description There are more than 22,000 Arabidopsis researchers in >9,000 laboratories worldwide. The Nottingham Arabidopsis Stock Centre (NASC) has a vital core role as infrastructure support for this highly distributed and prolific Arabidopsis community.

We provide materials, data and guidance worldwide (between 50,000 and >100,000 seed tubes per annum), and our existence helps tens of thousands of users to save time, money and effort through centralised services. Our outreach is extensive, regular and user-oriented and we constantly strive to improve both our customer service and our value for the community.
Exploitation Route Our services are equally available to universities, institutes, companies and international users through simple, intuitive interfaces. Distribution abroad requires the same infrastructure as a purely UK resource but adds value by encouraging international donation of stocks and data, supplementing grant income and helping to consolidate the Arabidopsis and wider plant community. All European plant research groups requiring Arabidopsis stocks are obliged to use NASC but thousands of non-Europeans access our resource.
Sectors Agriculture, Food and Drink; Energy; Environment; Manufacturing, including Industrial Biotechnology; Other

URL http://arabidopsis.info
 
Description The increasing demands of a growing, prosperous world for improved agricultural products including food, fibre and fuel intensify the need for an extensive understanding of the basic biology and ecology of plants. Arabidopsis is the most widely used model system to study plant biology and has delivered numerous breakthroughs in understanding of plant and basic biological processes. The knowledge gained from studies in Arabidopsis serves to advance our understanding of other plant species, particularly crop species, and thus translates into new or improved plant products and increased agricultural productivity. Arabidopsis has underpinned the genomic revolution in plant science and represents the template on which other plant and crop genomes are annotated and assessed. Arabidopsis data is key to modern crop science and through that to food security and quality of life. We are the European Arabidopsis Stock Centre and reliably send out >50,000 tubes of seed worldwide per annum. Any one of those tubes can enable or inform a project that may result in any of these impacts. Together with the US stock centre we support the world Arabidopsis community and our impact is through them.
Sector Agriculture, Food and Drink; Energy; Environment; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology; Other
Impact Types Economic

 
Description UKPGRG
Geographic Reach National 
Policy Influence Type Participation in a guidance/advisory committee
Impact The UK Plant Genetic Resources Group (UKPGRG) serves as the technical forum to discuss and implement the conservation and use of plant genetic resources in the UK. The broad membership includes curators of ex situ plant genetic resource centres, those involved in in situ conservation, and representatives from non-governmental organisations, the commercial plant breeding sector and universities. Botanic gardens, the Forestry Commission and statutory collections are also represented. The Group provides advice and technical support to Government Departments on technical and policy matters which relate to the UK or the UK's international role in the area of plant genetic resources.
URL http://ukpgrg.org
 
Description BBSRC BBRF
Amount £1,092,256 (GBP)
Funding ID BB/P024068/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 11/2017 
End 10/2021
 
Description The Nottingham Arabidopsis Stock Centre (arabidopsis.info)
Amount £1,485,289 (GBP)
Funding ID BB/V018337/1 
Organisation Biotechnology and Biological Sciences Research Council (BBSRC) 
Sector Public
Country United Kingdom
Start 11/2021 
End 10/2026
 
Title EURISCO 
Description EURISCO is a search catalogue providing information about ex situ plant collections maintained in Europe. It is based on a European network of ex situ National Inventories. Since 2014, EURISCO has been hosted at and maintained by the Leibniz Institute of Plant Genetics and Crop Plant Research (IPK, Gatersleben, Germany) on behalf of ECPGR. ECPGR and other Central Crop Databases have been established through the initiative of individual institutes and of ECPGR Working Groups. The databases hold passport data and, to varying degrees, characterization and primary evaluation data of the major collections of the respective crops in Europe.
Type Of Material Database/Collection of data 
Year Produced 2014 
Provided To Others? Yes  
Impact We have shown commitment to a European wide catalogue of germplasm resources. This was brokered via the UKPGRG group and assisted by DEFRA. Our inclusion of more than 600,000 stocks makes a very strong statement about the relative support of the UK government to germplasm resources compared to our European neighbours. 
URL http://www.ecpgr.cgiar.org/resources/latest-news/news-detail/accessions-from-the-nottingham-arabidop...
 
Title NASCarrays 
Description From 2002 to 2013 NASCarrays was the primary world database for arabidopsis transcriptomics data. We generated the majority of the world public arabidopsis transcriptome data through our physical (early technical access) Affymetrix array service, and served this data without restriction to the world community. For the first few years we produced and released more transcriptomics data by volume than the Human or Mouse community (although because of the proliferation of participating sites we were overtaken once those communities and a few others had caught up on our lead). All of our data was given away [as per our remit] to collaborating and competing projects such as the EBI (ArrayExpress), GEO in the US, BAR in Canada, and many other academic sites and commercial entities such as GenevestigATor. As a natural consequence of the rapid proliferation in sites and analysis tools (tool development beyond basic access and correlation was not in our remit) and the increased ease of access to transcriptomic techniques, many of these other sites were (and are) largely dependent on the volume and quality of our data. Following BBSRC committee funding advice, we ended the NASCarrays database in 2013 and gracefully retired the data to these sites (especially GEO as the perceived dominant site). To ensure good data practice, we made appropriate safety backups at iPlant in the US and also have frozen-curation FTP access at NASC to all data. The arabidopsis community still uses our data, re-badged at other sites with variable levels of attribution to NASCarrays.
Type Of Material Database/Collection of data 
Provided To Others? Yes  
Impact We were the first centre to provide sufficient transcriptomic data in the reliable and reproducible Affymetrix format for real bioinformatic correlation studies. We were also used as domain champions for standards and the instantiation of activities such as MIAME and large database integration activities. This was both in collaboration with other providers/databases and directly with end users in biology and computational fields. Probably more importantly, because we made the data available to the wider community, many researchers picked up our data and used it independently (although that can be harder to track).
URL http://affymetrix.arabidopsis.info
 
Title arabidopsis.info 
Description The NASC germplasm database holds data on just under 1 million stocks that have been acquired since the centre began in 1990/91. We replaced the Arabidopsis Information Service in Germany (1964 - 1987) and acquired all 200+ stocks from them. In 1999 this was increased to 20,000 and in 2013 passed the 800,000 mark. We hold genomic, genetic, phenotypic, passport, collection, image and other sundry data about these stocks and make this information freely available to researchers and collaborators worldwide. As part of the database we run a cost-recovery catalogue for ordering stocks which includes user data and a fully developed e-commerce solution bespoke to NASC. We also integrate our data with external and internal databases using a variety of methods, from direct data exchange through dynamic URLs to fully fledged Web Services (SOAP and REST).
Type Of Material Database/Collection of data 
Provided To Others? Yes  
Impact This database underpins the distribution of arabidopsis germplasm resources to the UK and European plant community. It also provides the same service to worldwide users in collaboration/complementarity with the US stock center ABRC. This has accelerated the ease of uptake of germplasm and data associated with these germplasm entities and supported arabidopsis and other plant research to make it one of the most productive model species. 
URL http://arabidopsis.info
 
Title atensembl 
Description AtEnsembl was the first Ensembl browser/database to be produced by non-EBI staff (or ex-staff). We took the data from our http://ukcrop.net database (AceDB format) and integrated it with new genomics/sequencing data. We added data from our Germplasm activities and over time added NASCarrays and other available datasets. To this end we were the first genome browser of any species in the world to integrate genome data with stock ordering and transcriptome data. We also included early SNP data from primitive arrayseq and were consequently the most richly populated genome browser of any species for many years. Some current browsers still do not have this richness of function and almost all other species browsers are not at the point that we were when we officially closed the database. Increased computational skills in the community had led to an inevitable international proliferation of browsers, several of them using our freely available data. Our efforts, and the efforts of our collaborators at TAIR/ABRC, to make access to our services as open and exploitable as possible helped to make the proliferation of browsers and analysis tools at multiple sites possible and attractive, and to develop the current community options. We officially closed the database at failed renewal of funding (lack of uniqueness). A snapshot is still available at the old URL as per 'good practice' but is not actively linked from our other resources to ensure data quality/currency.
Type Of Material Database/Collection of data 
Provided To Others? Yes  
Impact When we were unique, we were a critical point in supporting the community with an integrated seed/genome/array data browser. The first generation of Gramene's arabidopsis data and the EBI's own arabidopsis plant-ensembl database were directly derived from and attributed to us. After the proliferation of competing browsers led to a perception by the funding committee that we were not unique and should concentrate on our core (unique in Europe) remit of germplasm distribution, we dropped our browser and integrated with the prevailing providers. Throughout this process and during our own development, we always made our data freely and openly available to others (collaborators or competitors) in the community and therefore were instrumental in catalysing a range of current bioinformatics browser systems for plants (e.g. AIP).
URL http://atensembl.arabidopsis.info