Unravelling SWI/SNF ARID1A/B paralogs function at sequence resolution

Lead Research Organisation: Institute of Cancer Research

Department Name: Division of Cancer Biology

Abstract

Paralog proteins emerged through gene duplication events. They are very similar in sequence and structure and have related functions, but small, subtle differences in their sequences can lead to specific modulation of their roles. More than 60% of human proteins have paralogs, and paralogs are prevalent in chromatin protein complexes. Systematic approaches for sequence-to-function mapping are required to disentangle the specific biological roles and behaviours of paralog pairs, and the lack of scalable methods for this has limited our understanding of the molecular basis of diverging paralog functions. The overarching aim of this project is to understand the differential sequence-function relationships of the paralog protein pair ARID1A/1B. These proteins are subunits of DNA-binding multiprotein assemblies called SWI/SNF complexes, which are important for many essential functions, including cell proliferation, cell cycle control, response to DNA damage and organism development. Most proteins work by associating with other proteins, so characterising how proteins interact is important for fully understanding how they perform their roles. In this project we will identify the proteins that ARID1A and 1B interact with, map the domains or motifs that mediate the interactions, investigate the effect of mutations in their sequence on cell growth and, finally, integrate all the data to construct models that can help explain the mechanisms of ARID1A and 1B function.
To identify ARID1A- and 1B-associated proteins, we will purify ARID1A/B under conditions that maintain their native interactions, using antibodies that specifically recognise them, and use a technique called mass spectrometry to identify proteins that co-purify with them. We will also use mass spectrometry to identify post-translational modifications, small chemical "flags" that can regulate different aspects of protein function, such as protein activity, interactions or localisation. To identify binding domains we will use peptides, or short protein fragments, covering the entire length of ARID1A and 1B arrayed on a paper membrane. A cell extract is added, and proteins that can interact with the peptides remain bound to the membrane. Each peptide spot will then be analysed by mass spectrometry to identify the bound proteins. This strategy is optimal for detecting binding that depends on short motifs. To map binding domains that depend on the 3D structure of ARID1A/B we will fix the interactions inside the cell using a "molecular glue" that binds proteins that are very close to each other. We will use mass spectrometry to identify the regions of the proteins that were linked together. To identify amino acids in ARID1A/B that are important for cell growth when the other paralog is absent, we will mutate each amino acid sequentially to alanine and monitor cell proliferation. Finally, we will consolidate all the data to produce a graph that represents an integrated view of the knowledge acquired. This will be useful for generating hypotheses on how ARID1A/B differentially perform their specific roles.
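As an illustration of how peptides can be laid out to cover the full length of a protein, the minimal Python sketch below splits a sequence into overlapping windows. The function name tile_peptides, the window length, the step size and the toy sequence are assumptions made for illustration only; the project description does not specify peptide length or overlap.

```python
def tile_peptides(sequence, length=15, step=5):
    """Split a protein sequence into overlapping peptides covering its full length.

    Returns a list of (start_position, peptide) tuples with 1-based positions.
    The default length and step are illustrative assumptions, not project values.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if len(sequence) <= length:
        return [(1, sequence)]
    starts = list(range(0, len(sequence) - length + 1, step))
    # Add a final window if the regular steps leave C-terminal residues uncovered.
    if starts[-1] + length < len(sequence):
        starts.append(len(sequence) - length)
    return [(s + 1, sequence[s:s + length]) for s in starts]


# Toy 20-residue sequence (hypothetical, not ARID1A/B).
toy = "ACDEFGHIKLMNPQRSTVWY"
for position, peptide in tile_peptides(toy, length=8, step=5):
    print(position, peptide)
```

A smaller step gives more overlap between neighbouring spots, so each residue is sampled by more peptides at the cost of a larger array.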

Technical Summary

Paralog proteins emerge from gene duplication events. They exhibit very high sequence and structure similarity and have related functions, as reflected by their synthetic lethal genetic associations. However, small, subtle sequence determinants result in distinct regulatory and functional attributes. Over 60% of human proteins have paralogs, and paralogs are prevalent in chromatin complexes, but despite the significant implications of this redundancy, the molecular basis of diverging paralog function is underexplored, limiting our mechanistic understanding of these gene families. ARID1A and 1B are defining paralog subunits of the SWI/SNF chromatin remodelling complex, which is important for essential biological processes including cell proliferation, cell cycle control, DNA damage response and organism development, and knowledge of how sequence diversity impacts their function is limited. The aim of this project is to understand differential sequence determinants of ARID1 function. To do this, we will use affinity purification coupled to mass spectrometry to identify ARID1A/B-interacting proteins and post-translational modifications in the presence or absence of the other paralog. We will then map residues and motifs that mediate interactions at high sequence resolution using two complementary approaches, PRISMA (Protein Interaction Screen on peptide Matrices) and crosslinking-mass spectrometry. These studies will be complemented with alanine mutational scanning of ARID1A/1B to identify residues that are critical when cells are devoid of the alternative paralog. Finally, we will construct a protein knowledge graph that represents a model for understanding the mechanisms of divergent ARID1A and 1B function. Our work will yield a high-molecular-resolution functional footprint of ARID1A/B interactions with contact site information, provide a basis to explore gene regulatory and PTM-modulated ARID1A/B functions, and serve as a paradigm for the study of paralog proteins.
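The alanine mutational scanning step can be pictured as enumerating every single-residue substitution to alanine. The sketch below is a hypothetical illustration in Python: the function name alanine_scan, the choice to skip residues that are already alanine, and the toy sequence are assumptions, not details taken from the project.

```python
def alanine_scan(sequence, first_position=1):
    """Enumerate single alanine substitutions along a protein sequence.

    Returns a list of (label, mutant_sequence) tuples, where the label uses the
    conventional form <wild-type residue><position>A, e.g. "K2A".
    Residues that are already alanine are skipped (an assumption of this sketch).
    """
    variants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue  # substituting alanine with alanine changes nothing
        label = f"{residue}{index + first_position}A"
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        variants.append((label, mutant))
    return variants


# Toy four-residue sequence (hypothetical, not ARID1A/B).
for label, mutant in alanine_scan("MKAL"):
    print(label, mutant)
```

Each labelled variant would then be paired with a proliferation readout in the background lacking the other paralog; that readout is outside the scope of this sketch.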

Funded Value:

£892,103

Funded Period:

Apr 24 - Apr 27

Funder:

BBSRC

Project Status:

Active

Project Category:

Research Grant

Project Reference:

BB/Y004477/1

Principal Investigator:

Jyoti Choudhary

Research Subject:

Biomolecules & biochemistry (63%)

Cell biology (18%)

Genetics & development (18%)

Research Topic:

Epigenetics (18%)

Multiprotein complexes (36%)

Organelles & components (18%)

Structural biology (27%)

Organisations

People	ORCID iD
Jyoti Choudhary (Principal Investigator)	http://orcid.org/0000-0003-0881-5477
Mercedes Pardo Calvo (Researcher Co-Investigator)	http://orcid.org/0000-0002-3477-9695

Publications


Research Tools and Methods
Collaboration
Engagement Activities


Title	Multipep SPOT membrane synthesis
Description	Automated synthesis of peptides on cellulose membranes
Type Of Material	Improvements to research infrastructure
Year Produced	2024
Provided To Others?	Yes
Impact	Ability to synthesise peptide arrays for PRISMA studies in-house


Title	PhoXplex
Description	A method that combines phospho-enrichable cross-linking with isobaric labeling for quantitative proteome-wide mapping of protein interfaces
Type Of Material	Technology assay or reagent
Year Produced	2024
Provided To Others?	Yes
Impact	NA
URL	https://pubs.acs.org/doi/10.1021/acs.jproteome.4c00567


Description	PRISMA with BB/SG
Organisation	Institute of Cancer Research UK
Country	United Kingdom
Sector	Academic/University
PI Contribution	We contributed expertise on the PRISMA method and helped with the data analysis
Collaborator Contribution	NA
Impact	PhD thesis, defended December 2024
Start Year	2023


Description	ICR CPD Brainstorming event Jan 2025
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	Regional
Primary Audience	Other audiences
Results and Impact	The event included several talks on the ongoing work of the CPD. We presented our PRISMA research and discussed how it could be useful to the CPD. This sparked questions and discussion and led to further collaborations.
Year(s) Of Engagement Activity	2025


Description	ICR Cancer Biology Division Open Day
Form Of Engagement Activity	Participation in an open day or visit at my research institution
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Postgraduate students
Results and Impact	The aim of the Open Day was to disseminate the research activities of the Cancer Biology Division. The technology described in this grant and some preliminary results were presented as an oral talk and a poster, which sparked questions and discussion, and initiated collaborations with other research groups.
Year(s) Of Engagement Activity	2024


Description	ICR-MRC Doctoral Training Program Proteomics Module May 2024
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Postgraduate students
Results and Impact	This was a teaching module for ICR PhD students on Protein Interaction Proteomics, which consisted of interactive teaching on experimental proteomics methods for investigating protein interactions and hands-on learning of different bioinformatics tools for data analysis.
Year(s) Of Engagement Activity	2024
