Control of polyA site choice by m6A RNA modification

Lead Research Organisation: University of Dundee
Department Name: School of Life Sciences

Abstract

Genes are found within DNA. When genes are switched on, their DNA code is copied into a related molecule called RNA. There are many ways that RNA can be further processed and, ultimately, this means that there are many different codes that can be made from the same gene. As a result, controlling the processing of RNA is fundamentally important to a cell.

One example of these processing changes relates to where an RNA ends. Often, different RNA copies of the same gene can stop at more than one place. As a result, a longer or shorter stretch of the gene's code is made, and so the instructions contained within each copy are different. When the end of an RNA is determined, a string of more than 50 "A" bases (which we call the polyA tail) is added to the end of each RNA copy to protect it through its lifetime.

The RNA code comprises 4 bases. In addition to A, there is G, C and U. However, sometimes these bases are altered with chemical modifications. The most common modification is the addition of a chemical methyl group to A. We have known this for some time, but only recently have we realised how important these RNA modifications can be.

The BBSRC previously gave us funding to map the modified As in the RNAs. We succeeded in this objective by pioneering a new technique, called nanopore direct RNA sequencing, to identify modified As. In the same study, we found that the main consequence of losing this chemical modification was that the length of RNA copies shifts. In other words, the modified A could instruct the cell where to stop the copies of RNA made at thousands of genes.

Our aim in this study is to work out how a modified A can tell the cell where to stop an RNA copy of a gene and, when it does so, what the consequence is. We have a good idea how to tackle this problem. The process of stopping an RNA copy and adding a polyA tail is controlled by a biological machine of more than 20 different proteins. These proteins are closely related even in very different species. However, plants have evolved a special feature in one of these proteins: part of the protein can specifically recognise modified As. Our preliminary data analysis suggests that this part of the protein binds to the modified As and, when it does so, prevents RNA copies from stopping nearby.

It therefore seems that plants have made special use of this RNA modification to control where RNA copies of genes stop. The only other species that have this same special feature in the corresponding protein appear to be a group of animal parasites called the Apicomplexa. Some members of this group are responsible for human diseases such as malaria and toxoplasmosis.

To tackle this problem, we have assembled a team with world-leading expertise in plant RNA biology and the analysis of large RNA sequencing datasets. As a result of the work of this team, we will learn which genes are sensitive to RNA length control by modified As. We will determine which genes are directly controlled by this modification and identify proteins and protein complexes involved in this control. We will test how these interactions affect where an RNA copy stops. We will make the relevant protein and RNA molecules and test directly how they bind to each other. We will then ask what the consequences are when RNA molecules are shortened because regulation by modified As cannot occur. We will determine if the shortened RNAs are degraded more quickly or are less able to be translated into protein.

We hope to explain why plants have evolved a special way to control where copies of thousands of their genes stop. By understanding this, we should be well placed to address the same question in the group of parasites that appear to control their genes in a similar way. Dundee is home to the largest University-based drug discovery unit in the world. Consequently, we hope to establish sufficient knowledge here to implement a related study into these pathogens in the near future.

Technical Summary

This project will reveal the mechanism and impact of a newly discovered feature of plant gene expression. We recently discovered that the major impact of m6A on RNA processing is at the level of poly(A) site selection: mutants defective in m6A show a global shortening of 3'UTRs. The poly(A) signal is recognised by the conserved protein CPSF30, which, uniquely in plants (and the Apicomplexa), can be expressed as an isoform with a YTH m6A reader domain. Together, these findings suggest that plants have adapted m6A to control 3'UTR length.

We will use nanopore direct RNA sequencing to map sites of 3' end formation dependent on the YTH domain of AtCPSF30. We will use iCLIP to map the RNA binding sites of AtCPSF30-YTH, linking them to sites of m6A modification and altered patterns of poly(A) site choice. We will use in vitro RNA binding experiments with nuclear extracts and purified recombinant proteins to determine the mechanism by which m6A and the YTH domain of AtCPSF30-YTH inhibit poly(A) site selection. We will use pulse labelling with nucleotide analogues and polysome profiling to measure the functional consequences of shortened 3'UTRs on mRNA stability and translatability. By analysing the corresponding RNA with nanopore cDNA PCR sequencing, we will be able to link outcomes to specific processing choices in full-length mRNAs for the first time.

We have world-leading expertise in studying RNA biology in plants (Gordon Simpson) and in developing pioneering analysis methods for RNA sequencing (Geoff Barton).

Our findings will explain the evolution and function of the YTH domain of CPSF30 in plants and reveal those genes and processes under its control. Since a similar regulatory module exists in the Apicomplexa (a group of disease-causing protozoan parasites), we will pioneer a combination of experimental approaches that could lead to the rational development of new drug targets for the treatment of disease.