Project Title: Digital Discourse as Clinical Data
Lead Research Organisation:
St George's, University of London
Department Name: Molecular & Clinical Sci Research Inst
Abstract
Background: Decline in short-term memory is the hallmark symptom of Alzheimer's disease (AD), yet early brain changes may affect spoken and written language years before the onset of other, more noticeable symptoms. Changes in semantic content and syntactic complexity have been found in the speech of individuals with Mild Cognitive Impairment (MCI) who later went on to develop AD (Ahmed, de Jager, Haigh & Garrard, 2013). President Ronald Reagan was found to have used fewer unique words, and more fillers and non-specific nouns, six years prior to his diagnosis of dementia (Berisha, Wang, LaCross & Liss, 2015). Investigating written language, researchers were able to predict who would develop AD decades later by analysing essays written by nuns at an average age of 22 (Snowdon, 1997). The vocabulary of the writer Iris Murdoch was found to decline in her final novel, which, unlike earlier works, was unpopular with critics and fans alike; she was diagnosed with dementia two years after its publication (Garrard, Maloney, Hodges, & Patterson, 2005). Traditionally, collecting and analysing samples of language has been subjective and time-consuming, but advances in computer science offer fast, objective methods to extract features from language and evaluate their clinical use. A test of linguistic impairment could be an inexpensive, unobtrusive alternative to current screening tools.
Aims: My PhD project will build on this work with the aim of answering the following questions: i) which linguistic features best predict later development of AD, indicating early pathological brain changes? ii) what methods are most effective for collecting speech samples? iii) are other factors, such as lifestyle, linked to changes in these markers? and iv) can Magnetic Resonance Imaging (MRI) of the brain reveal neuroanatomical correlates of language change?
Methods: To answer these questions, I am undertaking two studies. To investigate written language, I am building on a Cognitive Archaeology database, containing texts written over three decades by 80 individuals, half of whom later developed AD. To investigate spoken language, I have recruited 50 participants to the Characterising Cognitive Decline study (25 individuals with a recent diagnosis of probable mild AD or MCI, and 25 healthy controls). They complete a range of widely used and novel tests designed to collect a sample of speech, twice over one year, which I record and transcribe. Participants will also be asked to take part in a sub-study, writing a response to a written question once per month over the year, and to undergo brain scanning. To analyse the spoken and written language collected in these studies, I am using computational methods such as Natural Language Processing (NLP). For example, the Natural Language Toolkit (NLTK) is a Python library that allows features of language, such as syntactic complexity, to be extracted. Using machine learning, an algorithm will be developed that identifies which features of language are important for early detection of AD. Using statistics, lifestyle factors in relation to changes in language will be explored, and brain scans will also be analysed to investigate tissue microstructure.
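As an illustration of the kind of feature extraction described above, the following is a minimal sketch rather than the project's actual scripts. It uses only the Python standard library, with a simple regular-expression tokeniser standing in for NLTK, and computes three lexical measures of the sort mentioned in the Background: the proportion of unique words (type-token ratio), the rate of fillers, and the rate of non-specific nouns. The word lists, the function names and the sample sentence are all hypothetical and chosen purely for illustration.

```python
import re
from collections import Counter

# Hypothetical word lists for illustration only; a real study would
# need validated lists and a proper tokeniser (for example from NLTK).
FILLERS = {"um", "uh", "er", "erm"}
NONSPECIFIC_NOUNS = {"thing", "things", "something", "anything", "stuff"}


def tokenize(text):
    """Lowercase the text and split it into word tokens with a simple regex."""
    return re.findall(r"[a-z']+", text.lower())


def lexical_features(text):
    """Return simple lexical measures for one transcript or written sample."""
    tokens = tokenize(text)
    n = len(tokens)
    if n == 0:
        # Avoid division by zero for empty samples.
        return {
            "n_tokens": 0,
            "type_token_ratio": 0.0,
            "filler_rate": 0.0,
            "nonspecific_noun_rate": 0.0,
        }
    counts = Counter(tokens)
    return {
        "n_tokens": n,
        # Unique word forms divided by total words.
        "type_token_ratio": len(counts) / n,
        # Share of tokens that are fillers.
        "filler_rate": sum(counts[w] for w in FILLERS) / n,
        # Share of tokens that are non-specific nouns.
        "nonspecific_noun_rate": sum(counts[w] for w in NONSPECIFIC_NOUNS) / n,
    }


# Invented sample sentence, not taken from any participant data.
sample = "Um, I put the thing on the, uh, the thing over there."
features = lexical_features(sample)
print(features)
```

Feature dictionaries of this form, one per sample, could then be passed to a statistical or machine learning model. Note that the type-token ratio is sensitive to sample length, so samples of very different lengths are not directly comparable on this measure.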
Conclusion: It is currently predicted that by 2025 there will be 1 million people with dementia in the UK (Alzheimer's Society, 2014), with devastating effects on quality of life and pressure on services. Increasing our understanding of language changes in AD would have far-reaching consequences for early diagnosis, therapy and communication. An effective tool to detect subtle decline could pick up on cases currently missed, and improve selection of participants for clinical trials. Throughout this PhD I am greatly enhancing my skills in machine learning (a branch of artificial intelligence), statistics, data analytics and computation, specifically meeting the MRC priority of quantitative skills.
Publications
Clarke N
(2018)
P2-515: CHARACTERISING SPOKEN LANGUAGE DEFICITS IN MILD ALZHEIMER'S DISEASE AND MILD COGNITIVE IMPAIRMENT
in Alzheimer's & Dementia
Clarke N
(2018)
P1-535: PREDICTING DEMENTIA FROM WRITTEN TEXTS USING FEATURE EXTRACTION AND MACHINE LEARNING
in Alzheimer's & Dementia
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name |
---|---|---|---|---|---|
MR/N013638/1 | | | 30/09/2016 | 29/09/2025 | |
1782913 | Studentship | MR/N013638/1 | 30/09/2016 | 29/06/2020 | Natasha Clarke |
Title | Characterising Cognitive Decline Database |
Description | Collected baseline and twelve-month follow-up data for 50 participants taking part in the Characterising Cognitive Decline study. Data include neuropsychology, health information and transcripts of language tasks. |
Type Of Material | Database/Collection of data |
Year Produced | 2018 |
Provided To Others? | No |
Impact | Posters analysing a portion of the data presented at the Alzheimer's Association International Conference in 2018. |
Title | Cognitive Archaeology Database |
Description | Transcription of the Cognitive Archaeology database is nearing completion. The database contains texts written across the lifetime of people with and without dementia, as well as lifestyle information. |
Type Of Material | Database/Collection of data |
Year Produced | 2018 |
Provided To Others? | No |
Impact | Analysis of a portion of the dataset presented at the Alzheimer's Association International Conference in 2018. |
Title | NLP Scripts |
Description | New Natural Language Processing (NLP) scripts written in Python. |
Type Of Material | Data analysis technique |
Year Produced | 2018 |
Provided To Others? | No |
Impact | Techniques used to analyse data presented at the Alzheimer's Association International Conference (AAIC) 2018. |
Description | Article published on blog |
Form Of Engagement Activity | Engagement focused website, blog or social media channel |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | Winning article in a science communication competition, about the research project, published on the Medical Research Council blog. |
Year(s) Of Engagement Activity | 2018 |
URL | https://mrc.ukri.org/news/blog/artificial-intelligence-alzheimers-disease/?redirected-from-wordpress |
Description | Presentation to Dementia Professionals |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Professional Practitioners |
Results and Impact | Presented on the research area in general and on this specific research project to professionals involved in dementia. |
Year(s) Of Engagement Activity | 2018 |
Description | Presentation to Support Group |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Patients, carers and/or patient groups |
Results and Impact | Presented the background to the research and the plan for the study to patients and carers attending a support group. |
Year(s) Of Engagement Activity | 2017,2018 |