Polygenic Risk Prediction with Machine Learning

Lead Research Organisation: University of St Andrews
Department Name: Computer Science

Abstract

The Polygenic Risk Score (PRS) shows promise as a diagnostic tool, but has so far explained only a minority of the variance seen in disease phenotype, which prevents its adoption in a clinical setting. The typical approach to selecting SNPs (Single Nucleotide Polymorphisms) for PRS calculation implicitly assumes that phenotypic effects depend on a small number of influential SNPs whose effects combine additively, an assumption that some are now calling into question (Boyle, Li & Pritchard, 2017). This approach also discards a large amount of potentially informative genetic data.
Accounting for the minuscule, complex interactions of many SNPs requires some reworking of the protocol for genotype-phenotype association research. Machine learning techniques are well suited to complex analysis in data-rich fields, and have the added advantage of not requiring prior knowledge of the mechanism governing the system in order to make predictions. Machine learning therefore provides useful tools for improving the current GWAS (Genome-wide Association Study) and PRS framework, and some successes have already been seen in this area (see 'GraBLD', Paré, Mao & Deng, 2017). Machine learning methods have also had recent success in predicting the function of other complex, poorly understood systems in genomics (see for example 'Basset', Kelley, Snoek & Rinn, 2016), further showing their suitability for this area.
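
As a rough illustration of the additive assumption discussed above, the sketch below computes a score as a weighted sum of allele dosages after keeping only SNPs that pass a p-value threshold. It is not the project's method: the function names `additive_prs` and `select_snps`, the effect weights, the p-values and the dosages are all hypothetical, and the 5e-8 cut-off is assumed here as a conventional genome-wide significance threshold.

```python
# Hypothetical sketch of a threshold-based additive polygenic risk score.
# All names and numbers are illustrative assumptions, not project data.

def additive_prs(dosages, weights):
    """Weighted sum of allele dosages (0, 1 or 2 copies of the effect allele)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(w * d for w, d in zip(weights, dosages))


def select_snps(p_values, threshold):
    """Indices of SNPs whose association p-value is below the threshold."""
    return [i for i, p in enumerate(p_values) if p < threshold]


# Made-up values for four SNPs.
weights = [0.5, -0.25, 0.125, 0.0625]   # per-SNP effect sizes (assumed)
p_values = [1e-9, 3e-8, 0.02, 0.4]      # association p-values (assumed)
dosages = [2, 1, 0, 1]                  # one individual's genotype

# Keep only SNPs passing the assumed genome-wide significance threshold.
kept = select_snps(p_values, 5e-8)
score = additive_prs([dosages[i] for i in kept], [weights[i] for i in kept])
```

In this toy case two of the four SNPs are dropped before scoring, so the individual's dosage at the last SNP contributes nothing to `score`. The sum also contains no interaction terms between SNPs, which is the limitation the abstract proposes to address with machine learning.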

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/R513337/1                                   01/10/2018  30/09/2023
2268475            Studentship   EP/R513337/1  01/01/2019  31/12/2021  Chloe Hequet