13-NSFABI: Arabidopsis Information Portal
Lead Research Organisation: University of Cambridge
Department Name: Genetics
Abstract
For over a decade, The Arabidopsis Information Resource (TAIR) has served as a data and analysis resource for a community of thousands of researchers working on Arabidopsis and other plant species.
Supported by a Research Coordination Network grant, IAIC-sponsored meetings and workshops produced a concept for a new Arabidopsis Information Portal (AIP).
This new resource portal is envisaged as a mechanism to bring together the ever-increasing amounts of Arabidopsis data into a single, user-friendly location using the latest web technologies and web services.
It will adopt a modular, federated development model that keeps responsibility for generating and maintaining valuable data in the hands of the individual data providers and spreads the burden of supporting such resources across a potentially wider range of countries and funding agencies.
The AIP will be developed by a team with deep experience in scientific infrastructure, data integration, and community engagement, and will take advantage of significant NSF investments in the plant biology research community.
In addition to providing the community with access to broader and richer data sets, including genomic, epigenomic, transcriptomic, proteomic, metabolomic and phenomic data, the AIP will cooperate with the iPlant project to provide access to a sophisticated suite of tools that can be used to analyze, visualize, and interpret the data.
AIP will continue the TAIR model of educational engagement with the plant science community. It will provide reviews and demonstrations of the developing AIP functionalities at appropriate national meetings and host workshops for potential developers of new tools and resources.
Not only will AIP modernize the bioinformatics capacity of the Arabidopsis community, it will also provide a foundation for multi-agency, multi-national collaboration in building and funding biological informatics capabilities.
Technical Summary
The goal of the AIP team is to develop a next-generation Arabidopsis cyberinfrastructure that facilitates data aggregation and integration, presents a flexible and task-oriented user interface, and is modular and extensible.
Wherever possible the AIP team will aim for integration of separate data sets over simple aggregation, thus enabling more sophisticated queries and analyses to be performed.
Apart from data directly curated by the AIP core, a guiding principle of the project is that provision of data collections, such as transcriptomics and proteomics, should be the responsibility of domain experts who will make their data available to AIP in metadata-rich, standardized formats.
AIP will collaborate with community data providers to facilitate integration of their data and analytical code into its framework.
AIP will develop a user-customizable web portal that will enable queries across integrated data sets, bioinformatic analyses, and visualization of results.
The default interface will replicate much of the functionality of TAIR, but will be configurable and extensible.
Though the AIP is designated a 'portal', it is in aggregate a complete Arabidopsis-oriented cyberinfrastructure comprising three major components: a Portal Layer to provide graphical access; a Data Layer, which will federate or integrate data repositories differing in type, content, and physical location; and a mediating Web Services Layer, based on InterMine and the iPlant Agave Web Services technologies, that will provide access to data sources and the national science infrastructure.
This modular development strategy enables components to be easily replaced as new technologies emerge.
Each component will communicate using standard protocols and will be developed using strategically aligned software practices.
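The three-layer arrangement described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names (`InMemorySource`, `WebServicesLayer`, `Portal`), their methods, and the placeholder record are hypothetical and do not reflect the actual InterMine or Agave APIs. The sketch only shows how a mediating layer lets data sources be registered or swapped without changing the portal code.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Protocol


class DataSource(Protocol):
    """Data Layer contract (hypothetical): return records for a gene identifier."""

    name: str

    def fetch(self, gene_id: str) -> list[dict]: ...


@dataclass
class InMemorySource:
    """Stand-in for a provider-maintained repository; real sources would be remote."""

    name: str
    records: dict[str, list[dict]] = field(default_factory=dict)

    def fetch(self, gene_id: str) -> list[dict]:
        return self.records.get(gene_id, [])


class WebServicesLayer:
    """Mediating layer: federates one query across all registered sources."""

    def __init__(self) -> None:
        self._sources: dict[str, DataSource] = {}

    def register(self, source: DataSource) -> None:
        # Registering under an existing name replaces that source,
        # which is how a component would be swapped out.
        self._sources[source.name] = source

    def query(self, gene_id: str) -> dict[str, list[dict]]:
        return {name: src.fetch(gene_id) for name, src in self._sources.items()}


class Portal:
    """Portal Layer: presents results; it never talks to a source directly."""

    def __init__(self, services: WebServicesLayer) -> None:
        self._services = services

    def summary(self, gene_id: str) -> str:
        results = self._services.query(gene_id)
        return "\n".join(
            f"{name}: {len(recs)} record(s)" for name, recs in sorted(results.items())
        )


if __name__ == "__main__":
    services = WebServicesLayer()
    # The record below is a made-up placeholder, not real expression data.
    services.register(
        InMemorySource("transcriptomics", {"AT1G01010": [{"tissue": "root"}]})
    )
    services.register(InMemorySource("proteomics"))
    print(Portal(services).summary("AT1G01010"))
```

Because the portal depends only on the mediating layer, a data provider could in principle change its storage or location without any change to the portal, provided it still satisfies the same `fetch` contract.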
Planned Impact
The proposed Arabidopsis-oriented cyberinfrastructure will serve the worldwide plant research community, taking on and expanding the role of TAIR.
This active community comprises research scientists, educators and students, with the TAIR web site currently receiving over 36,000 unique visitors and 1.8 million page views per month.
Arabidopsis research is important both to basic plant science and to the breeding of commercial crops. The direct beneficiaries of this project will therefore be found in both the academic and commercial sectors, and the research facilitated by the project could have an impact on policy making and on the public in general.
People | Gos Micklem (Principal Investigator) |
Publications
Hanlon M (2015) Araport: an application platform for data discovery, in Concurrency and Computation: Practice and Experience
Krishnakumar V (2015) Araport: the Arabidopsis information portal, in Nucleic Acids Research
Krishnakumar V (2017) ThaleMine: A Warehouse for Arabidopsis Data Integration and Discovery, in Plant & Cell Physiology
Pasha A (2020) Araport Lives: An Updated Framework for Arabidopsis Bioinformatics, in The Plant Cell
Description | Arabidopsis is the key model organism for the worldwide plant research community. Although a database, TAIR, existed for Arabidopsis, it had limited search flexibility and extensibility. As part of this award we supported the use of InterMine, the large-scale data integration platform developed in the Micklem group, to generate the ThaleMine system as part of the new Arabidopsis data portal, Araport. |
Exploitation Route | Grant applications are under consideration to support further development of ThaleMine. ThaleMine is maintained by JCVI, is available as a resource to support basic research in the worldwide plant research community, and has around 50,000 users per year. |
Sectors | Agriculture, Food and Drink; Education; Energy; Environment; Manufacturing, including Industrial Biotechnology |
URL | https://bar.utoronto.ca/thalemine |
Title | ThaleMine |
Description | Araport InterMine database |
Type Of Material | Database/Collection of data |
Year Produced | 2014 |
Provided To Others? | Yes |
Impact | Open resource to support worldwide plant research community (academic and industrial) |
URL | https://bar.utoronto.ca/thalemine |
Description | 13-NSFABI: Arabidopsis Information Portal |
Organisation | J Craig Venter Institute |
Country | United States |
Sector | Charity/Non Profit |
PI Contribution | Supported the development and use of ThaleMine using the InterMine platform. |
Collaborator Contribution | Applied InterMine software in production environment; developed other parts of Araport portal. |
Impact | See publications etc attached to this grant. Multidisciplinary: biology, bioinformatics, software engineering. |
Start Year | 2014 |
Description | 13-NSFABI: Arabidopsis Information Portal |
Organisation | Phoenix Bioinformatics Corporation |
Country | United States |
Sector | Private |
PI Contribution | Supported the development and use of ThaleMine using the InterMine platform. |
Collaborator Contribution | Applied InterMine software in production environment; developed other parts of Araport portal. |
Impact | See publications etc attached to this grant. Multidisciplinary: biology, bioinformatics, software engineering. |
Start Year | 2014 |
Description | 13-NSFABI: Arabidopsis Information Portal |
Organisation | Texas A&M University |
Country | United States |
Sector | Academic/University |
PI Contribution | Supported the development and use of ThaleMine using the InterMine platform. |
Collaborator Contribution | Applied InterMine software in production environment; developed other parts of Araport portal. |
Impact | See publications etc attached to this grant. Multidisciplinary: biology, bioinformatics, software engineering. |
Start Year | 2014 |
Description | First AIP community developer workshop |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The workshop trained likely future developers of AIP software applications, who developed test applications during the event. |
Year(s) Of Engagement Activity | 2014 |
Description | GARNet2016: Innovation in the Plant Sciences |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Workshop for GARNet conference |
Year(s) Of Engagement Activity | 2016 |
Description | Monogram 2017 Conference Bristol |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Invited platform presentation |
Year(s) Of Engagement Activity | 2017 |