Machine Learning Methods for Predicting Phospholipidosis

Lead Research Organisation: University of Cambridge
Department Name: Chemistry


Phospholipidosis is the accumulation of excessive quantities of fatty material (specifically phospholipids) within cells, which can occur in many different organs and cell types. Effects have been noted in the nervous system, lymphatic system, liver, kidneys, eyes and lungs. Phospholipidosis is of great concern to the pharmaceutical industry, especially in the context of the nervous system, where phospholipidosis in neurons can disrupt cell signalling.

Since the development of medicines is such an enormously expensive process, it is extremely important to be able to predict adverse effects from chemical structure in advance of synthesis. Ideally, predictions of toxicity should be made at a very early stage in the design of new medicines, minimising the expense and time wasted on medicines that turn out to be unsafe or ineffective.

In this project, we will produce predictive computer models of the phospholipidosis-inducing potential of substances that might be developed into medicines. These models will be substantially more sophisticated and accurate than those previously published in the scientific literature. The main method we will use is called Random Forest. The forest is a set of several hundred decision trees, each of which is essentially a flow diagram. We will train the trees to learn patterns linking the known properties of existing medicines and failed candidates to their tendencies to induce phospholipidosis. The way in which we generate the trees, however, involves computer-simulated dice-rolling, which ensures that they are all different, though based on the same underlying information. The decision trees then behave like jury members, voting on whether each new substance should be classed as safe or unsafe.

The work proposed here is a cost-effective project with a very high probability of successfully predicting phospholipidosis-inducing potential. It uses state-of-the-art computer-based chemistry and machine learning methods to address a major current problem in designing and developing medicines. More generally, this work is at the cutting edge of the developing field of computational toxicology. For social and political reasons, this is almost certain to become a very active area as concerns about the environmental and health effects of chemicals and medicines mount, at the same time as animal experiments are likely to be increasingly phased out.
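
The "dice-rolling" and "jury" ideas in the Random Forest description can be illustrated with a minimal sketch. It is not the project's software: the function names (train_stump, train_forest, predict), the two numeric descriptors and the eight toy compounds are all hypothetical, and each "tree" is reduced to a single yes/no question (a decision stump) to keep the example short. Real Random Forest implementations grow much deeper trees and re-draw a random subset of features at every split.

```python
import random
from collections import Counter


def train_stump(rows, labels, rng):
    """Fit a one-question 'tree' on one randomly chosen descriptor."""
    feature = rng.randrange(len(rows[0]))  # dice roll: which descriptor to ask about
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        if not left or not right:
            continue  # a split must send compounds down both branches
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:
        # No usable split in this sample: always answer with the majority class.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature, float("inf"), majority, majority)
    _, threshold, left_label, right_label = best
    return (feature, threshold, left_label, right_label)


def train_forest(rows, labels, n_trees=200, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample of the data."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        # Dice roll: draw n compounds with replacement, so every tree sees
        # a different version of the same underlying data.
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx], rng))
    return forest


def predict(forest, row):
    """Let every tree vote and return the majority verdict."""
    votes = Counter(
        left if row[feature] <= threshold else right
        for feature, threshold, left, right in forest
    )
    return votes.most_common(1)[0][0]


# Made-up training set: two hypothetical descriptors per compound.
rows = [(1.0, 2.0), (1.5, 1.0), (2.0, 2.5), (1.2, 3.0),
        (6.0, 8.0), (7.0, 9.0), (6.5, 7.5), (8.0, 8.5)]
labels = ["safe"] * 4 + ["unsafe"] * 4

forest = train_forest(rows, labels, n_trees=200, seed=0)
print(predict(forest, (0.5, 0.5)))
print(predict(forest, (9.0, 9.5)))
```

Fixing the seed makes the simulated dice rolls repeatable, so the same forest is rebuilt on every run. In this toy data the two classes are cleanly separated, which is why even one-question trees suffice; real descriptor data would need deeper trees and held-out validation.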


Description: Different phospholipidosis-inducing compounds are predicted to interact with different putative phospholipidosis-relevant targets. This strongly suggests that different compounds induce phospholipidosis via different targets, and therefore by different mechanisms.
Exploitation Route: Further experimental and computational research could follow up the mechanistic suggestions we have made.
Sectors: Chemicals; Healthcare; Manufacturing, including Industrial Biotechnology

Description: Relevance vector machine software was made publicly available for use by SMEs and larger companies in sectors such as biotech, chemicals and pharmaceuticals. Datasets relating to phospholipidosis that were developed in the course of the project have also been made available and distributed. Findings and methodologies from the project are being incorporated into university-level teaching material.
First Year Of Impact: 2011
Sector: Agriculture, Food and Drink; Chemicals; Education; Manufacturing, including Industrial Biotechnology; Pharmaceuticals and Medical Biotechnology
Impact Types: Economic