13-NSFABI: Arabidopsis Information Portal

Lead Research Organisation: University of Cambridge
Department Name: Genetics

Abstract

For over a decade, The Arabidopsis Information Resource (TAIR) has served as a data and analysis resource for thousands of researchers working on Arabidopsis and other plant species.

Supported by a Research Coordination Network grant, meetings and workshops sponsored by the International Arabidopsis Informatics Consortium (IAIC) produced a concept for a new Arabidopsis Information Portal (AIP).

This new resource portal is envisaged as a mechanism to bring together the ever-increasing amounts of Arabidopsis data into a single, user-friendly location using the latest web technologies and web services.

It will adopt a modular, federated development model that keeps responsibility for generating and maintaining valuable data in the hands of the individual data providers, and spreads the burden of supporting such resources across a potentially wider range of countries and funding agencies.

The AIP will be developed by a team with deep experience in scientific infrastructure, data integration, and community engagement, and will take advantage of significant NSF investments in the plant biology research community.

In addition to providing the community with access to broader and richer data sets, including genomic, epigenomic, transcriptomic, proteomic, metabolomic and phenomic data, the AIP will cooperate with the iPlant project to provide access to a sophisticated suite of tools that can be used to analyze, visualize, and interpret the data.

AIP will continue the TAIR model of educational engagement with the plant science community. It will provide reviews and demonstrations of the developing AIP functionalities at appropriate national meetings and host workshops for potential developers of new tools and resources.

Not only will AIP modernize the bioinformatics capacity of the Arabidopsis community, it will also provide a foundation for multi-agency, multi-national collaboration in building and funding biological informatics capabilities.

Technical Summary

The goal of the AIP team is to develop a next-generation Arabidopsis cyberinfrastructure that facilitates data aggregation and integration, presents a flexible and task-oriented user interface, and is modular and extensible.

Wherever possible, the AIP team will aim to integrate separate data sets rather than simply aggregate them, thus enabling more sophisticated queries and analyses to be performed.

Apart from data directly curated by the AIP core, a guiding principle of the project is that provision of data collections, such as transcriptomics and proteomics, should be the responsibility of domain experts who will make their data available to AIP in metadata-rich, standardized formats.

AIP will collaborate with community data providers to facilitate integration of their data and analytical code into its framework.

AIP will develop a user-customizable web portal that will enable queries across integrated data sets, bioinformatic analyses, and visualization of results.

The default interface will replicate much of the functionality of TAIR, but will be configurable and extensible.
Though the AIP is designated a 'portal', it is in aggregate a complete Arabidopsis-oriented cyberinfrastructure comprising three major components: a Portal Layer to provide graphical access; a Data Layer, which will federate or integrate data repositories differing in type, content, and physical location; and a mediating Web Services Layer, based on InterMine and the iPlant Agave Web Services technologies, that will provide access to data sources and the national science infrastructure.

This modular development strategy enables components to be easily replaced as new technologies emerge.
Each component will communicate using standard protocols and will be developed using strategically aligned software practices.
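
As a rough illustration of the three-layer design described above, the following minimal Python sketch shows a services layer mediating between a portal view and independently maintained data sources. It is not the AIP implementation and does not use the InterMine or Agave APIs; the class names (DataSource, ServicesLayer), the function render_portal_view, and the in-memory example records are all hypothetical placeholders chosen for this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Record = Dict[str, str]


@dataclass
class DataSource:
    """Hypothetical provider-maintained repository in the Data Layer."""
    name: str
    fetch: Callable[[str], List[Record]]  # gene identifier -> matching records


@dataclass
class ServicesLayer:
    """Hypothetical mediating layer between the portal and the data sources."""
    sources: Dict[str, DataSource] = field(default_factory=dict)

    def register(self, source: DataSource) -> None:
        # Sources can be added or swapped without changing the portal code.
        self.sources[source.name] = source

    def query(self, gene_id: str) -> Dict[str, List[Record]]:
        # Ask every registered source; each provider keeps ownership of its data.
        return {name: src.fetch(gene_id) for name, src in self.sources.items()}


def render_portal_view(gene_id: str, results: Dict[str, List[Record]]) -> str:
    """Stand-in for the Portal Layer: summarise results as plain text."""
    lines = [f"Gene {gene_id}"]
    for source_name in sorted(results):
        lines.append(f"  {source_name}: {len(results[source_name])} record(s)")
    return "\n".join(lines)


# Placeholder data: the records below are invented for illustration only.
expression = DataSource(
    "expression",
    lambda g: [{"gene": g, "tissue": "leaf"}] if g == "AT1G01010" else [],
)
proteomics = DataSource("proteomics", lambda g: [])

services = ServicesLayer()
services.register(expression)
services.register(proteomics)

results = services.query("AT1G01010")
print(render_portal_view("AT1G01010", results))
```

Because the portal function only sees the dictionary returned by the services layer, a data source could in principle be replaced by a remote web service call without touching the portal code, which is the property the modular strategy aims for.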

Planned Impact

The proposed Arabidopsis-oriented cyberinfrastructure will serve the worldwide plant research community, taking on and expanding the role of TAIR.

This active community comprises research scientists, educators and students, with the TAIR web site currently receiving over 36,000 unique visitors and 1.8 million page views per month.

Arabidopsis research is important both to basic plant science and to the breeding of commercial crops. The direct beneficiaries of this project will therefore be found in both the academic and the commercial sectors, and the research facilitated by the project could have an impact on policy making and on the public in general.

Publications

Hanlon M (2015) Araport: an application platform for data discovery in Concurrency and Computation: Practice and Experience

Krishnakumar V (2015) Araport: the Arabidopsis information portal. in Nucleic acids research

Krishnakumar V (2017) ThaleMine: A Warehouse for Arabidopsis Data Integration and Discovery. in Plant & cell physiology

 
Description Arabidopsis is the key model organism for the worldwide plant research community. Although a database, TAIR, existed for Arabidopsis, it had limited search flexibility and extensibility. As part of this award we supported the use of InterMine, the large-scale data integration platform developed in the Micklem group, to generate the ThaleMine system as part of the new Arabidopsis data portal, Araport.
Exploitation Route Grant applications are under consideration to support further development of ThaleMine. ThaleMine is maintained by JCVI, is available as a resource to support basic research in the worldwide plant research community, and has around 50,000 users per year.
Sectors Agriculture, Food and Drink; Education; Energy; Environment; Manufacturing, including Industrial Biotechnology

URL https://bar.utoronto.ca/thalemine
 
Title ThaleMine 
Description Araport InterMine database 
Type Of Material Database/Collection of data 
Year Produced 2014 
Provided To Others? Yes  
Impact Open resource to support worldwide plant research community (academic and industrial) 
URL https://bar.utoronto.ca/thalemine
 
Description 13-NSFABI: Arabidopsis Information Portal 
Organisation J Craig Venter Institute
Country United States 
Sector Charity/Non Profit 
PI Contribution Supported the development and use of ThaleMine using the InterMine platform.
Collaborator Contribution Applied InterMine software in production environment; developed other parts of Araport portal.
Impact See publications etc attached to this grant. Multidisciplinary: biology, bioinformatics, software engineering.
Start Year 2014
 
Description 13-NSFABI: Arabidopsis Information Portal 
Organisation Phoenix Bioinformatics Corporation
Country United States 
Sector Private 
PI Contribution Supported the development and use of ThaleMine using the InterMine platform.
Collaborator Contribution Applied InterMine software in production environment; developed other parts of Araport portal.
Impact See publications etc attached to this grant. Multidisciplinary: biology, bioinformatics, software engineering.
Start Year 2014
 
Description 13-NSFABI: Arabidopsis Information Portal 
Organisation Texas A&M University
Country United States 
Sector Academic/University 
PI Contribution Supported the development and use of ThaleMine using the InterMine platform.
Collaborator Contribution Applied InterMine software in production environment; developed other parts of Araport portal.
Impact See publications etc attached to this grant. Multidisciplinary: biology, bioinformatics, software engineering.
Start Year 2014
 
Description First AIP community developer workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The workshop trained likely future developers of AIP software applications, and test applications were developed during the event.
Year(s) Of Engagement Activity 2014
 
Description GARNet2016: Innovation in the Plant Sciences 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Workshop for GARNet conference
Year(s) Of Engagement Activity 2016
 
Description Monogram 2017 Conference Bristol 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Invited platform presentation
Year(s) Of Engagement Activity 2017