Statistical Methods for Pharmacogenetics

Lead Research Organisation: University College London
Department Name: UCL Genetics Institute


Pharmacogenetics is the use of genetic information to improve the prescribing of drugs, either through rapid prediction of the correct dose for a patient according to his/her genetic type, or by avoiding the prescription of drugs to those with unsuitable genetic types. Until recently pharmacogenetics was focussed on a small number of genes known to be involved in drug transport or action, but increasingly genetic data from the whole genome is taken into account. Although so-called personalised medicine has been talked about for many years now, the achievements to date in this area remain limited to a few diseases and drugs. One bottleneck in the process is efficient statistical analysis of datasets that are often relatively small, but complex in terms of both the outcome measures made on patients and the diversity of patients and their treatments. The goal of this proposal is to develop and make available better statistical methods that can help make pharmacogenetics more productive.

Technical Summary

Pharmacogenetics (PGx) has the potential to generate improved drug therapies for patients, particularly through more accurate dose-finding and avoidance of adverse events, as well as economic benefits to the pharmaceutical industry (the possibility of rescuing drugs that are not currently marketable) and to healthcare providers such as the NHS (fewer prescriptions of ineffective or harmful drugs). Broadly speaking, PGx has until now failed to deliver a substantial return on investment, but the potential rewards are so great that continuing effort and investment are warranted. Because datasets are small and heterogeneous, appropriate statistical analyses are crucial to maximising the output of pharmacogenetics research, but to date there appears to have been relatively little investment, at least in the public sector, in the development of statistical methods tailored to the needs of PGx research. In this proposal we seek to make several contributions to statistical methodology to enhance the prospects of PGx, focussing on two broad strands: (1) the prediction of drug response from genome-wide genetic data, and (2) Bayesian modelling of population and other sources of heterogeneity in order to extract more information from complex data sets.


International League Against Epilepsy Consortium on Complex Epilepsies (2014) Genetic determinants of common epilepsies: a meta-analysis of genome-wide association studies. in The Lancet. Neurology

Speed D (2013) Response to Lee et al.: SNP-based heritability analysis with dense data. in American journal of human genetics

Speed D (2012) Understanding complex traits: from farmers to pharmas. in Genome medicine

Speed D (2015) Relatedness in the post-genomic era: is it still useful? in Nature reviews. Genetics

Speed D (2012) Improved heritability estimation from genome-wide SNPs. in American journal of human genetics

Description MRC Strategic Skills Career Development Award in Biostatistics
Amount £444,000 (GBP)
Funding ID MR/L012561/1 
Organisation Medical Research Council (MRC) 
Sector Public
Country United Kingdom
Start 09/2014 
End 09/2017
Title LDAK 
Description LDAK is software for investigating the aetiology of complex traits and for predicting them. Its primary aim is to obtain kinship coefficients corrected for linkage disequilibrium, which can otherwise bias estimates of heritability. It also includes our published prediction method, MultiBLUP, as well as features for generating risk scores and performing gene-based analyses.
Type Of Material Improvements to research infrastructure 
Year Produced 2012 
Provided To Others? Yes  
Impact The website is receiving a few hundred hits per month, with over 100 registered users since the latest update (March 2014), and the software is now beginning to be cited in other groups' publications.
Description International League Against Epilepsy (ILAE) Consortium on Complex Epilepsies 
Organisation International League Against Epilepsy (ILAE)
Country Global 
Sector Academic/University 
PI Contribution The ILAE is a multi-centre consortium, set up to combine data from epilepsy groups across four continents. We have been heavily involved in the analysis by writing the analysis protocols and performing these on behalf of the SANAD (Liverpool-based epilepsy drug trial) and Melbourne cohorts.
Collaborator Contribution By combining data across cohorts, the ILAE is able to analyse over 50,000 samples, making it by far the largest genetic study of epilepsy.
Impact So far, the ILAE has published the 2014 Lancet Neurology paper (see Publications) which reports results from the consortium's primary analysis. Three follow-up studies are nearing publication.
Start Year 2010
Title LDAK 
Description LDAK is an evolving software package (written in C) for analysing genetic association study data. It was originally designed for computing SNP weights, which enable more accurate estimates of heritability. LDAK now also contains MultiBLUP, our software for constructing linear prediction models, which outperforms all existing methods with which we have been able to compare it. LDAK can also compute genetic profile risk scores and perform gene-based analysis and meta-analysis.
Type Of Technology Software 
Year Produced 2012 
Open Source License? Yes  
Impact Methodological aspects of LDAK (the SNP weightings and MultiBLUP prediction) are described in the American Journal of Human Genetics and Genome Research publications attached to Grant G0300766, while LDAK was used for the Brain and Lancet Neurology papers also on the grant. 
Description DataDive - community data analysis workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Approximately 100 analysts (a combination of scientific researchers, charity officials, and enthusiastic volunteers) spent a weekend working on data problems identified by three local and national charities. The common theme was combining skills to analyse the charities' data, with the aim of providing guidelines for future decision making; for example, which signs can be used to identify individuals likely to become homeless, and which areas should receive more attention.

By the end of the weekend, a web package was developed to mine data for simple trends, on which the charities could then take action.
Year(s) Of Engagement Activity 2014