Copy number variation and gene expression

Lead Research Organisation: University of Nottingham
Department Name: School of Life Sciences


In many animals and plants, the structure of the genome appears to be very wasteful, with large amounts of DNA apparently having no function, and many sequences being highly repeated. In addition to this complexity in genome structure, there is variation in the gene content of the genome. There have been rapid advances recently in our understanding of such variation in gene copy number, particularly in the study of the human genome. For most human genes, all of us have two copies - one from each parent. For some genes, however, different people can have different numbers of copies, and it is clear that the number of copies can matter. People who are Rhesus-negative have no copies at all of the corresponding gene, while Rhesus-positive individuals have one or two copies. These differences underlie the mother-child mismatch that can lead to a 'Rhesus baby'. Similarly, reduced numbers of the alpha-globin gene (usually four copies per person) are the basis of the globally common blood disease alpha-thalassaemia. Although natural selection generally ensures that variants do not get to be very common if they have negative effects on their carriers, it is clear that in humans and many other species, there are some important genes that show extensive variation, with many different gene copy numbers in the population. For example, the gene encoding the salivary amylase enzyme, responsible for digestion of starch in food, is present in variable numbers - some people have as few as two copies of this gene, while others can have as many as 12. It has been suggested that this variation leads to corresponding variation in the production of amylase in saliva, and so to variation in ability to digest starch. This is an attractive idea, but is it really true? More generally, does variation in gene number really lead to variation in gene function - and if so, how? In reality, the situation is complicated.
Having twice as many genes does not necessarily mean that twice as much of the relevant protein is made - there are examples in which extra copies of a gene may or may not be used, or be only partially active, depending on their precise position, or on the number of other copies in the same cell. This project examines directly whether copy number variation leads to changes in gene expression - and if so in what patterns. To do this we will measure protein levels in cells and secretions, and ask how they relate to the copy number of variable genes. Surprisingly, counting copies of DNA in a genome is technically difficult - indeed, sequencing DNA is much more straightforward than knowing exactly how many copies have been sequenced. Clearly, if copy number is not being measured accurately, deductions about the effect on gene expression will also be inaccurate. It is therefore a real challenge in this project to type copy number variation accurately, so that (for example) a test can clearly distinguish whether an individual has 6 or 7 copies of a variable gene. One particular advantage our group has in undertaking this work is experience in the accurate measurement of gene copy number. The project examines three examples of variable-number genes from the human genome that show wide variation in number: the alpha-defensins (involved, as their name implies, in defence against infection) vary commonly between 4 and 11 copies per person; the salivary amylase gene varies between 2 and 12 copies per person; and finally, the beta-defensin DEFB109 varies independently of the alpha-defensins (between 2 and 7 copies in most people) but also has some copies that are inactivated by an internal mutation - in this case it is likely that gene expression will relate not just to the total number of copies, but to the number of copies capable of making an active protein.

Technical Summary

Copy number variation is prevalent in the genomes of many species - but how does copy number variation affect gene function? This project addresses this question using analysis of (a) 'multiallelic' copy number variation with numerous allelic states, rather than simple presence/deletion variation, (b) gene expression in physiologically relevant tissues, rather than cell lines, and (c) protein-level expression, rather than mRNA levels. These approaches examine functional variation most relevant to its adaptive value, and in systems of interest for mechanisms of genome evolution. Human genes will be examined to take advantage of the detail with which both genomic variation and associated phenotypes can be described. We will specifically examine variation of the human alpha-defensin (DEFA1A3) genes (variable 4-11 copies per diploid cell), and protein expression in neutrophils; variation of the salivary amylase gene AMY1 (2-12 copies per diploid cell) and protein levels in saliva; and variation of the DEFB109 gene (which varies between 2 and 7 copies per diploid cell in total but for which a pseudogene variant is common) and protein levels secreted by corneal cells. A key component of the proposal is implementation of accurate measures of copy number variation. We will develop and apply new assays that achieve accuracy in discriminating copy numbers (for example, in the range 6-10 per diploid cell) not available from standard approaches. In particular, we will apply Paralogue Ratio Test assays (and multiplex extensions of that approach) to ensure that accurate data are generated for genomic copy number in the samples tested. Copy number measurement methods will also be calibrated and validated by generation of reference standards using high-throughput (SOLiD 4) sequence reads. This work will generate findings of general relevance to the study of variation in gene expression, and of particular implications for variation in response to diet and to infectious disease.
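
As an illustration only of how a ratio-based assay calibrated against reference standards could be turned into an integer copy number call, the sketch below fits a straight line to samples of known copy number and then inverts it for a test sample. It is not the project's analysis pipeline: the function names, the linear calibration model and all numerical values are assumptions made for this example.

```python
def fit_calibration(known_copy_numbers, observed_ratios):
    """Ordinary least-squares fit of ratio = slope * copy_number + intercept.

    Both arguments are hypothetical calibration data: reference samples whose
    copy number is already known, and the test:reference ratio measured for each.
    """
    n = len(known_copy_numbers)
    if n < 2 or n != len(observed_ratios):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(known_copy_numbers) / n
    mean_y = sum(observed_ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in known_copy_numbers)
    if sxx == 0:
        raise ValueError("calibration samples must differ in copy number")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(known_copy_numbers, observed_ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_copy_number(ratio, slope, intercept):
    """Invert the calibration line; return (unrounded estimate, integer call)."""
    raw = (ratio - intercept) / slope
    return raw, round(raw)


# Invented calibration standards (assumption): with a two-copy reference locus,
# an idealised ratio is copy_number / 2.
standards_cn = [4, 6, 8, 10]
standards_ratio = [2.0, 3.0, 4.0, 5.0]
slope, intercept = fit_calibration(standards_cn, standards_ratio)

# Invented measurement for an unknown sample.
raw_estimate, called_copies = estimate_copy_number(3.4, slope, intercept)
print(raw_estimate, called_copies)
```

In a sketch like this, the distance between the unrounded estimate and the nearest integer gives a rough indication of how confidently neighbouring copy numbers (for example 6 versus 7) can be separated; a real analysis would typically combine several independent measurements per sample.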

Planned Impact

Who will benefit from this research? The healthcare implications of the work are of potential benefit to a wide variety of patient groups, with particular relevance to obesity, age-related insulin resistance, infection (especially eye infections) and inflammatory disorders. The biomedical implications of this work respond to the BBSRC's Strategic Research Priority 3 ('Basic bioscience underpinning health') as described in the Strategic Plan 2010-2015, with specific relevance to dietary aspects of human health, and to age-related decline in immune function. The results of this research will benefit members of the academic research community in a wide range of disciplines in biological sciences, including researchers interested in understanding genetic factors in health and disease, gene expression, systems biology, functional genomics, population genetics, genome evolution, resistance to infectious disease and dietary adaptation. How will they benefit from this research? Variation in copy number of the salivary amylase genes may have direct healthcare implications for individual variation in digestion of high-starch diets, of relevance to development of obesity and insulin resistance. The alpha-defensins are important in antimicrobial resistance and inflammation; understanding the regulation of gene expression for the alpha-defensins may have implications for numerous infectious and inflammatory disorders. Antimicrobial peptides including DEFB109 are being evaluated as a topical therapy in eye infections, and understanding individual variation in DEFB109 function, and its relation to susceptibility to eye infection, may have a future role in identifying patient subgroups that would particularly benefit from supplementation.
The results of this project could also have healthcare impact by application to other disorders for which variation in gene copy number is an important predisposing factor, such as those involving the beta-defensins (implicated in Crohn's disease and psoriasis), the immunoglobulin Fc receptor genes FCGR3A/3B (responsible for predisposition to a range of autoimmune disorders) and the chemokine genes CCL3L1/CCL4L1 (implicated in rheumatoid arthritis and AIDS/HIV infection). The published results of this work will have fundamental implications for our understanding of the relationship between genome structure and function, by examining the relationship of protein-level expression with complex multiallelic variation in copy number. What will be done to ensure that they have the opportunity to benefit from this research? In addition to the traditional routes of publication, the outcomes of this project will be communicated to target audiences by dissemination using our own web pages, and the University of Nottingham's Communications unit. The potential benefits translating into clinical impact will be exploited primarily via the interaction with Ophthalmology colleagues in Nottingham, particularly for the DEFB109 work. If there are commercially exploitable findings, the necessary steps towards protecting intellectual property and licensing will be taken in collaboration with Nottingham's Research Information Services department.

Description

We have mainly investigated two examples of gene copy number variation in the human genome, the (digestive) salivary amylase gene, and the (antimicrobial) alpha-defensin genes. In the salivary amylase variation we have discovered unexpected new levels of variation, demonstrating that the salivary amylase gene variation is coupled to similar variation in the pancreatic amylase genes. We conclude that previous findings on the functional effects of the salivary amylase genes may have been misinterpreted - it is now possible that the effects observed were actually effects of the pancreatic amylase genes. These issues are of particular interest because of recent claims that salivary amylase gene copy number variation is a major predisposing factor to obesity.

At the alpha-defensin genes we have studied the variation and its relationship to expression. We have been able to define the global variation in detail, which is of particular application in the study of the kidney disease IgA nephropathy, in which alpha-defensin variation is implicated. We have also shown that (contrary to earlier findings) there is no significant association between gene copy number and the amounts of alpha-defensin in blood cells.

Exploitation Route

Our findings on the amylase variation will clarify the ongoing controversy about whether there really is a link between amylase variation and obesity. Our alpha-defensin discoveries will be key to understanding the role of this variation in IgA nephropathy.

Sectors

Agriculture, Food and Drink; Healthcare; Pharmaceuticals and Medical Biotechnology