Predicting Clinical Events Using NLP Analysis of Clinical Notes in Diabetes Patients

Lead Participant: Red Star Consulting Limited



The introduction of Electronic Health Records (EHRs) and the move away from paper notes have led to a proliferation of healthcare data, much of it held in free-text notes, which can now be shared across different healthcare settings and clinical specialties. However, the increase in the volume of data, alongside rising rates of chronic illness and co-morbidities, means that clinicians struggle to synthesise this information within the short appointment times allocated. As a result, the clinician may not have a full picture of the patient's overall health and may miss important symptoms.


The vision of this project is to build Machine Learning (ML) models which will (1) analyse all the clinical notes associated with a patient, (2) predict the risk of different clinical endpoints, such as heart attack or death, and (3) present this information to the clinician as a score or alert. Clinicians can use this to tailor the consultation, identify high-risk patients, and target specific clinical outcomes.
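As an illustration of step (3) only, the sketch below shows one way predicted risks could be turned into a score and an alert flag for the clinician. It is a minimal sketch under stated assumptions, not the project's design: the names RiskSummary and summarise_risks, the example risk values, and the 0.30 alert threshold are all hypothetical, and a real threshold would need to be agreed with clinicians.

```python
from dataclasses import dataclass

# Hypothetical cut-off for raising an alert; not taken from the project.
ALERT_THRESHOLD = 0.30


@dataclass
class RiskSummary:
    endpoint: str   # clinical endpoint, e.g. "heart attack"
    score: float    # predicted risk, rounded for display
    alert: bool     # True when the unrounded risk meets the threshold


def summarise_risks(predicted_risks, threshold=ALERT_THRESHOLD):
    """Convert per-endpoint risk estimates into display rows, highest risk first."""
    rows = [
        RiskSummary(endpoint, round(risk, 2), risk >= threshold)
        for endpoint, risk in predicted_risks.items()
    ]
    return sorted(rows, key=lambda row: row.score, reverse=True)


# Illustrative values only; a real system would take these from a trained model.
example_rows = summarise_risks({"death": 0.08, "heart attack": 0.42})
for row in example_rows:
    flag = "ALERT" if row.alert else "ok"
    print(f"{row.endpoint}: {row.score:.2f} [{flag}]")
```

Sorting by score puts the endpoint of greatest concern at the top, which suits a short consultation where only the first line may be read.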


This feasibility study will assess the technical feasibility of developing ML models and implementing them in a clinical setting. The collaborative partnership of clinical and technical expertise will also consider how to commercialise such technology and which business model is most appropriate.


SCI-Diabetes is a world-renowned EHR which holds comprehensive records for 99% of diabetes patients in Scotland. The feasibility study will focus on predicting different clinical endpoints for diabetes patients using this data.


Other than manually clicking into each note, a time-consuming process, there is no way for clinicians to review the entire history of a patient. Most other NLP approaches aim to extract structured information from free text, for example by identifying mentions of specific diseases, and convert it into clinical codes.

Instead of extracting information from free text, this proposal uses the text to directly predict different clinical endpoints. As well as analysing the entire patient history, the model will benefit from being able to aggregate different clinical judgements and even detect new patterns of disease progression.
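The idea of predicting an endpoint directly from text, rather than first extracting clinical codes, can be sketched as follows. This is a toy illustration and not the project's model: it assumes a bag-of-words representation pooled over every note in a patient's history and a small logistic regression trained by gradient descent, and the function names, the example notes and the labels are all invented for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def featurize(notes):
    """Pool every note in one patient's history into a single word count."""
    counts = Counter()
    for note in notes:
        counts.update(tokenize(note))
    return counts


def predict_risk(weights, bias, features):
    """Logistic model: map weighted word counts to a risk between 0 and 1."""
    z = bias + sum(weights.get(tok, 0.0) * n for tok, n in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(histories, labels, epochs=200, lr=0.1):
    """Fit word weights with stochastic gradient descent on the log loss."""
    weights, bias = {}, 0.0
    feats = [featurize(history) for history in histories]
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            err = predict_risk(weights, bias, f) - y
            bias -= lr * err
            for tok, n in f.items():
                weights[tok] = weights.get(tok, 0.0) - lr * err * n
    return weights, bias


# Invented notes and labels (1 = endpoint occurred); not real patient data.
histories = [
    ["chest pain on exertion", "poor glycaemic control, breathless"],
    ["routine review, feeling well", "foot check normal"],
    ["breathless at night, chest tightness"],
    ["diet advice given, feeling well"],
]
labels = [1, 0, 1, 0]

weights, bias = train(histories, labels)
new_history = ["chest pain and breathless"]
print(f"predicted risk: {predict_risk(weights, bias, featurize(new_history)):.2f}")
```

No disease codes are extracted at any point: the words in the notes are the model's only input, and the whole history contributes to a single prediction.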


The lead applicant, Red Star, will develop the ML models. The collaborators will be NHSGG&C (expertise on the data, the disease and the development of ML models), Tactuum (expertise in decision support tools in both the UK and USA) and Dr Ann Wales from DHI Scotland, who is also Director for the Scottish Government Knowledge and Decision Support Programme.

Lead Participant                        Project Cost    Grant Offer
Red Star Consulting Limited, Glasgow    £49,885         £34,919
NHS Greater Glasgow and Clyde           £6,884          £6,884
Digital Health and Care Institute
Tactuum Ltd, Glasgow                    £14,031         £9,822

