Solubility Prediction in Novel Chemical Space: Combining Statistics and DFT Calculations

Lead Research Organisation: University of Leeds
Department Name: School of Chemistry

Abstract

An improved protocol for in silico solubility prediction using statistical tools such as
single/multivariate linear regression (S/MLR) and principal component analysis (PCA) is proposed. The
descriptors required for the statistical analysis will be generated by DFT calculations, in contrast to the common
ClogP fragments approach, so that they encompass both global and local structural properties of the solute as
well as solute-solvent and solute-solute lattice interactions. PCA and chemical space
mapping with these descriptors will enable selective experimental solubility measurements to expand the
chemical space coverage of the method. A training set of data, including difficult-to-predict salts and ionic
compounds, will be used to develop the protocol, before validation and final evaluation with a test set of
compounds. Once these stages have been completed successfully, the project will switch focus to the expansion of reliable
solubility prediction into novel chemical space.
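
The statistical workflow named in the abstract (descriptor matrix, PCA for chemical space mapping, MLR fitted on a training set and evaluated on a test set) can be sketched as follows. This is only an illustrative sketch, not the project's protocol: the descriptor values are synthetic random numbers standing in for DFT-derived descriptors, the "solubility" values are generated from an assumed linear relationship, and every variable name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# ASSUMPTION: synthetic stand-ins for DFT-derived descriptors (rows are
# compounds, columns are descriptors) and for measured log-solubility.
n_compounds, n_descriptors = 40, 5
X = rng.normal(size=(n_compounds, n_descriptors))
assumed_coef = np.array([1.5, -0.8, 0.0, 0.3, 0.0])
log_s = X @ assumed_coef + 0.1 * rng.normal(size=n_compounds)

# Split into a training set (model development) and a test set (evaluation).
X_train, X_test = X[:30], X[30:]
y_train, y_test = log_s[:30], log_s[30:]

# Standardise descriptors using training-set statistics only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
Z_train = (X_train - mean) / std
Z_test = (X_test - mean) / std

# PCA via singular value decomposition of the centred training matrix.
# Rows of Vt are the principal axes; S**2 is proportional to the variance
# captured along each axis.
U, S, Vt = np.linalg.svd(Z_train, full_matrices=False)
explained = S**2 / np.sum(S**2)

# Project compounds onto the first two components to map chemical space.
scores_train = Z_train @ Vt[:2].T
scores_test = Z_test @ Vt[:2].T

# Multivariate linear regression by ordinary least squares, with an
# intercept column prepended to the standardised descriptors.
A_train = np.column_stack([np.ones(len(Z_train)), Z_train])
coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)

# Evaluate on the held-out test set.
A_test = np.column_stack([np.ones(len(Z_test)), Z_test])
pred = A_test @ coef
rmse = float(np.sqrt(np.mean((pred - y_test) ** 2)))
```

In a mapping of this kind, test compounds whose scores fall far from the training scores would be natural candidates for new experimental measurements, which is one way to read the abstract's "selective experimental solubility measurements".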

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/N509243/1                                   01/10/2015  31/12/2021
1650991            Studentship   EP/N509243/1  01/10/2015  23/12/2016  Ben Kyffin