BBSRC-NSF/BIO PanOryza: Globally coordinated genomes, proteomes and pathways for rice

Lead Research Organisation: European Bioinformatics Institute
Department Name: Genome Assembly and Annotation

Technical Summary

This project will create the pan gene set for cultivated Asian rice (Oryza sativa) and its wild Oryza relatives, using newly developed software to produce consistent gene models across all varieties and species. At the outset, we will have access to 16 "platinum standard" reference sequence (PSRefSeq) genomes for Asian rice, which represent the genetic diversity of O. sativa, as well as 25 wild rice genomes. At present, gene models are annotated independently across different projects, leading to inconsistencies that cause confusion and challenges in fundamental and applied rice research. We will release consistent, aligned gene models for all Oryza genus reference genomes through the internationally recognised platforms Ensembl Plants and Gramene. We will develop capabilities within the Planteome knowledgebase for annotating rice genes with functional information through semi-automated literature extraction, and provide a community platform for gene symbol assignment and manual gene model revisions. The PSRefSeqs will be fully aligned at the chromosomal level to define synteny and to enable users to view syntenic relationships between PSRefSeqs and their gene model sets. Genetic variant data from >3000 re-sequenced rice accessions will be mapped onto the aligned PSRefSeqs, released via the European Variant Archive, and linked to trait data on the varieties in the SNP-Seek platform.

Consistently annotated coding gene products, i.e. proteomes, will be released via the world-leading protein knowledgebase UniProt for all 16 O. sativa PSRefSeqs, along with 25 proteomes for wild rice. UniProt will add functional annotations, coordinated with Planteome, and define the pan-proteome, i.e. the full set of proteins present in all varieties of O. sativa and at the genus level. Large-scale mass spectrometry data sets will be mined to provide protein-level evidence for coding genes, and to find and annotate sites of post-translational modification.

Planned Impact

The PanOryza proposal encompasses and brings together the most widely used international databases for rice research and development. The planned objectives will improve data sharing between these resources, and greatly improve the quality of the data on rice genes, variants, proteins and post-translational modifications.

Beyond academic beneficiaries, the following groups will directly benefit:

- Seed companies and breeders will benefit through improved linkage of trait data, e.g. held in SNP-Seek, to improved gene models across the pan gene set for allele mining.

There is potential for indirect benefits for breeders of other crops, via adoption of software and approaches developed in PanOryza, improving capabilities for understanding the pan gene set within other species.

Staff will benefit through exposure to an international network of bioinformaticians working in a key area of food security.

Publications

Contreras-Moreira B (2025) A pan-gene catalogue of Asian cultivated rice. in bioRxiv : the preprint server for biology

Dyer S (2025) Ensembl 2025. in Nucleic Acids Research

Harrison PW (2024) Ensembl 2024. in Nucleic Acids Research

 
Description We have imported the genome assemblies of 16 rice cultivars into Ensembl Plants with gene annotation provided by our collaborators at Gramene. The list of cultivars includes four pre-existing genomes plus twelve new platinum standard sequences. We have developed the pipelines to import the data from Ensembl Rapid Release and prepared the UniProt data model, back-end and front-end in Proteome pages to integrate these data. We have finalised version 1 of the pan-gene clusters and added these identifiers as gene synonyms in Ensembl.
Exploitation Route Beyond academic beneficiaries, seed companies and breeders will benefit through improved linkage of trait data, e.g. held in SNP-Seek, to improved gene models across the pan gene set for allele mining. There is also potential for indirect benefits for breeders of other crops, via adoption of software and approaches developed in PanOryza, improving capabilities for understanding the pan gene set within other species.
Sectors Agriculture, Food and Drink, Digital/Communication/Information Technologies (including Software)

 
Description The genomes made available via Ensembl Plants are downloaded by several breeding companies for use internally, and as part of community tools which include our data, e.g. FAIDARE.
First Year Of Impact 2023
Sector Agriculture, Food and Drink
 
Title Ensembl Beta - MAGIC 15 with pan-gene identifiers 
Description 13 of the MAGIC 16 rice genomes were added into Ensembl Beta with Pan-gene identifiers added as gene synonyms 
Type Of Material Database/Collection of data 
Year Produced 2025 
Provided To Others? Yes  
Impact None yet, the identifiers are provided to support the accompanying paper which is under review 
URL https://beta.ensembl.org
 
Title MAGIC 15 rice in Ensembl Plants 
Description The assemblies and annotations generated by the project partners were imported into Ensembl Plants where they are available for users to browse, and the outputs of comparative genomics analyses are provided across all rice cultivars plus wild relatives. 
Type Of Material Database/Collection of data 
Year Produced 2023 
Provided To Others? Yes  
Impact The comparative analyses have provided an important QC step in the generation of pan-gene clusters 
URL https://plants.ensembl.org/Oryza_sativa/Info/Cultivars?db=core
 
Title MAGIC 16 rice in Ensembl Rapid Release 
Description The assemblies and annotations generated by the project partners were imported into Ensembl Rapid Release where they are available for users to browse, perform sequence search and discover homologues. 
Type Of Material Database/Collection of data 
Year Produced 2023 
Provided To Others? Yes  
Impact Preparatory steps for these to be imported into Ensembl Plants 
URL https://rapid.ensembl.org/
 
Description Barley Genome Net 2025 - Dundee 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact The methods developed as part of PanOryza were presented to members of the barley research community
Year(s) Of Engagement Activity 2025
 
Description Monogram 2023 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact The rice Pan-genome integration into Ensembl Plants was presented at Monogram 2023 to raise awareness of new data and functionality among the small grains research community
Year(s) Of Engagement Activity 2023
URL https://research.reading.ac.uk/monogram-2023/
 
Description PAG rice 2024 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The PanOryza project was presented at the IRIC workshop "Rice Informatics for the Global Community" of the Plant and Animal Genome conference. There were useful interactions with the audience.
Year(s) Of Engagement Activity 2024