Project Title: Digital Discourse as Clinical Data

Lead Research Organisation: St George's University of London
Department Name: Inst of Cardiovascular & Cell Science

Abstract

Background: Decline in short-term memory is the hallmark symptom of Alzheimer's disease (AD), yet early brain changes may affect spoken and written language years before the onset of other, more noticeable symptoms. Changes in semantic content and syntactic complexity have been found in the speech of individuals with Mild Cognitive Impairment (MCI) who later went on to develop AD (Ahmed, de Jager, Haigh & Garrard, 2013). President Ronald Reagan was found to have used fewer unique words, and more fillers and non-specific nouns, six years prior to his diagnosis of dementia (Berisha, Wang, LaCross & Liss, 2015). Investigating written language, researchers were able to predict who would develop AD decades later by analysing essays written by nuns at an average age of 22 (Snowdon, 1997). The vocabulary of the writer Iris Murdoch was found to have declined in her final novel, which, unlike earlier works, was unpopular with critics and fans alike; she was diagnosed with dementia two years after its publication (Garrard, Maloney, Hodges, & Patterson, 2005). Traditionally, collecting and analysing samples of language has been subjective and time-consuming, but advances in computer science offer fast, objective methods to extract features from language and evaluate their clinical use. A test of linguistic impairment could be an inexpensive, unobtrusive alternative to current screening tools.

Aims: My PhD project will build on this work with the aim of answering the following questions: i) which linguistic features best predict later development of AD, indicating early pathological brain changes? ii) what methods are most effective for collecting speech samples? iii) are other factors, such as lifestyle, linked to changes in these markers? and iv) can Magnetic Resonance Imaging (MRI) of the brain reveal neuroanatomical correlates of language change?

Methods: To answer these questions, I am undertaking two studies. To investigate written language, I am building on a Cognitive Archaeology database, containing texts written over three decades by 80 individuals, half of whom later developed AD. To investigate spoken language, I have recruited 50 participants to the Characterising Cognitive Decline study (25 individuals with a recent diagnosis of probable mild AD or MCI, and 25 healthy controls). They complete a range of widely used and novel tests designed to collect a sample of speech, twice over one year, which I record and transcribe. Participants will also be asked to take part in a sub-study, writing a response to a written question once per month over the year, and to undergo brain scanning. To analyse the spoken and written language collected in these studies, I am using innovative computer science tools, such as Natural Language Processing (NLP). For example, the Natural Language Toolkit (NLTK) is a library for the Python programming language, which allows features of language, such as syntactic complexity, to be extracted. Using machine learning, an algorithm will be developed that identifies which features of language are important for early detection of AD. Using statistics, the relationship between lifestyle factors and changes in language will be explored, and brain scans will also be analysed to investigate tissue microstructure.
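
As a rough illustration of the kind of feature extraction described above, the following minimal sketch computes a few lexical measures from a transcript: the proportion of unique words, the rate of fillers and non-specific nouns, and mean sentence length as a crude proxy for syntactic complexity. It is not the project's actual code. It uses only the Python standard library as a stand-in for NLTK, and the function name, the word lists and the sample sentence are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical word lists for illustration only; not taken from the study.
FILLERS = {"um", "uh", "er", "well", "so"}
NON_SPECIFIC_NOUNS = {"thing", "things", "something", "anything", "stuff"}


def lexical_features(text):
    """Return simple lexical features for one transcript (illustrative only)."""
    # Naive sentence split on terminal punctuation; a real pipeline would
    # use a proper sentence tokenizer.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Lower-case word tokens (letters and apostrophes only).
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        raise ValueError("text contains no word tokens")
    counts = Counter(tokens)
    n = len(tokens)
    return {
        "n_tokens": n,
        # Unique words divided by total words.
        "type_token_ratio": len(counts) / n,
        "filler_rate": sum(counts[w] for w in FILLERS) / n,
        "non_specific_noun_rate": sum(counts[w] for w in NON_SPECIFIC_NOUNS) / n,
        # Crude proxy for syntactic complexity: words per sentence.
        "mean_sentence_length": n / len(sentences),
    }


# Made-up sample sentence, not participant data.
sample = "Um, I went to the thing. Well, the thing was good."
features = lexical_features(sample)
for name, value in features.items():
    print(name, round(value, 3))
```

Features of this kind, computed per participant and per time point, could then serve as inputs to a machine learning classifier. Which features and which classifier are actually used is not specified in this abstract.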

Conclusion: It is currently predicted that by 2025 there will be 1 million people with dementia in the UK (Alzheimer's Society, 2014), with devastating effects on quality of life and pressure on services. Increasing our understanding of language changes in AD would have far-reaching consequences for early diagnosis, therapy and communication. An effective tool to detect subtle decline could pick up on cases currently missed, and improve selection of participants for clinical trials. Throughout this PhD I am greatly enhancing my skills in machine learning (a branch of artificial intelligence), statistics, data analytics and computation, specifically meeting the MRC priority of quantitative skills.

Publications

 
Title Characterising Cognitive Decline Database 
Description Collected baseline and twelve-month follow-up data for 50 participants taking part in the Characterising Cognitive Decline study. Data includes neuropsychology, health information and transcripts of language tasks. 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? No  
Impact Posters analysing a portion of the data were presented at the Alzheimer's Association International Conference in 2018. 
 
Title Cognitive Archaeology Database 
Description Transcription of the Cognitive Archaeology database is nearing completion. The database contains texts written across the lifetime of people with and without dementia, as well as lifestyle information. 
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? No  
Impact Analysis of a portion of the dataset was presented at the Alzheimer's Association International Conference in 2018. 
 
Title NLP Scripts 
Description New Natural Language Processing (NLP) scripts written in Python. 
Type Of Material Data analysis technique 
Year Produced 2018 
Provided To Others? No  
Impact Analysis techniques used on data presented at the Alzheimer's Association International Conference (AAIC) 2018. 
 
Description Article published on blog 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact Winning article in a science communication competition, about the research project, published on the Medical Research Council blog.
Year(s) Of Engagement Activity 2018
URL https://mrc.ukri.org/news/blog/artificial-intelligence-alzheimers-disease/?redirected-from-wordpress
 
Description Presentation to Dementia Professionals 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact Gave a presentation on research in general and on this specific research project to professionals involved in dementia.
Year(s) Of Engagement Activity 2018
 
Description Presentation to Support Group 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Patients, carers and/or patient groups
Results and Impact Presented background to research and plan for study to patients and carers attending a support group.
Year(s) Of Engagement Activity 2017,2018