Predicting and stratifying drug resistant epilepsy for improved outcomes: A population-data approach enabled by natural language processing
Lead Research Organisation:
SWANSEA UNIVERSITY
Abstract
Context
Epilepsy affects at least 600,000 people in the UK and over 50 million worldwide. People living with epilepsy face many problems including seizures, increased risk of other health conditions and greater risk of death. Despite recent treatment advances, the condition is drug-resistant (‘refractory’) in 30% of patients, with seizures continuing despite treatment with at least two anti-seizure medications. We do not know why people develop refractory epilepsy, cannot accurately predict who will develop it, and cannot treat it effectively. Understanding the causes of and improving treatments for refractory epilepsy is a top-ten priority recently identified by a national Epilepsy Priority-Setting Partnership.
Health-data research involves analysing and linking existing information about people and their health to improve healthcare. The wealth of data available in electronic health records can be ‘mined’ to improve our understanding of diseases — what causes them, who is most at risk and how to identify and treat them. Artificial intelligence (AI) and machine learning provide very powerful technologies for handling such large volumes of data.
We want to realise the great potential of data in electronic health records for epilepsy research. However, most of the detailed, disease-specific information is unstructured ‘free text’, which limits its usefulness. Specifically, large routinely collected health datasets lack structured, detailed epilepsy information such as epilepsy type, cause and, importantly, seizure frequency. Natural language processing (NLP), a form of AI, can automatically read and process unstructured text in medical notes and letters, and could help us unlock this large and detailed clinical-information resource to address important knowledge gaps in refractory epilepsy.
We will securely apply state-of-the-art NLP to population-scale data from three large, diverse specialist epilepsy centres in England and Wales. In doing so, we will create one of the largest-ever detailed epilepsy-research cohorts (datasets representing a group of people over time). Our NLP tools will extract and structure anonymised, detailed epilepsy information, including seizure frequency, from thousands of free-text clinic letters. By taking a data-science and AI approach, we will address a further top-ten UK Epilepsy Research Priority while aligning with the Medical Research Council’s data-science vision.
Aim:
To improve prediction of refractory epilepsy and categorise patients according to the type and characteristics of their condition, enabling more personalised treatment.
Objectives:
Optimise NLP approaches for securely extracting detailed epilepsy information, at a population level, from epilepsy-clinic letters.
Use this information, with other population-level data, to gain a better understanding of refractory epilepsy and its characteristics.
Develop a model to predict individual risk of developing refractory epilepsy, enabling targeted treatment options at different disease stages.
Develop a clinical-user interface for our model to provide health professionals with patient-specific information in clinic, and assess patients’ and clinicians’ opinions about its use.
Potential applications and benefits
The ability to identify people at higher risk of refractory epilepsy, both at initial epilepsy diagnosis and throughout the disease course, and to categorise refractory-epilepsy patients according to the specific characteristics of their condition will enable better-targeted treatments to be administered earlier. This will benefit the 180,000 people living with refractory epilepsy in the UK. Better targeting of resources will also benefit the healthcare system, saving costs and improving care. Moreover, the new tools, methods and dataset produced will benefit researchers in epilepsy and other conditions.