Map-based Visualisation and Statistical Inference with Dynamic Health Data

Lead Research Organisation: Lancaster University
Department Name: Medicine

Abstract

Whenever someone visits their GP or the hospital, their medical condition (including test results, vaccinations, etc.) is recorded in their personal NHS health file. This process generates a largely untapped wealth of data on the nation's health. A health atlas uses this medical information to generate maps showing which areas of a region (e.g. a town or country) are at low or high risk of a particular disease. When assessing the risk of disease it is common to account for factors such as poverty, age and coexisting diseases. Such health atlases already exist, but they are inflexible because they use historical data and take a long time to produce. Consequently they do not display the present health landscape.

The fundamental aim of the project is to develop methods and tools for creating 'real-time' health atlases; as an example it will use the NHS medical records of the population living in the city of Salford, UK (over 200,000 people). The reason for targeting Salford is that it has over twenty years of records stored in a central state-of-the-art computer system; everyone's NHS records are regularly collected and are therefore completely up to date (over 200 million events have been recorded). This opens the opportunity to create the first 'real-time' health atlas, which will display, partly through maps, the current health state of the Salford population via user-friendly interactive web applications. There will be two versions of the atlas: one for the general public, as described above, and another conveying more detailed, specialist information that will be available only to health care professionals and scientists. In both cases great care will be taken to ensure the information displayed is sufficiently straightforward to understand. Privacy will be strictly protected; individuals will not be identifiable on this atlas.

Within the lifetime of the project, the real-time health atlas will focus on kidney disease, which affects approximately 5% of the adult population in the UK, although it is far more prevalent in some groups, e.g. the elderly. Furthermore, people who suffer from other medical conditions, such as cardiovascular disease, high blood pressure and diabetes, have a higher risk of developing kidney disease. Certain ethnic groups are more likely to have these medical conditions and are consequently more likely to develop kidney disease; hence this disease is an important public health concern. Mapping this disease's distribution on a real-time health atlas is therefore a timely development.

The real-time health atlas is to be developed by a Research Fellow based in the 'Combining Health Information, Computation and Statistics' Group in the School of Medicine at Lancaster University. The Fellow will work with colleagues at the Farr Institute of Health Informatics Research, based within The University of Manchester, and also with the tertiary care group for kidney disease within the Salford Royal NHS Foundation Trust. These three institutions, each with strong research records in their own fields, will lend their expertise to advise on the statistical and software development behind the real-time health atlas. This atlas sits at the interface between statistics and information technology where the funder, The Medical Research Council, recognises there is a need to invest, partly due to a UK skills shortage.

This real-time health atlas will be designed so that in the future, beyond the lifetime of this project, it can be used both for other diseases and in different geographical regions. The atlas will be up to date and will interactively display disease-specific information on the internet. Everyone, from individuals to health care professionals to policy makers, will therefore have the opportunity to improve people's quality of life by targeting specific areas where there is a greater likelihood of certain diseases; in general this should reduce long-term health care costs.

Technical Summary

The primary aim of the research is to develop the statistical and informatics methodology required for real-time health atlases; as an exemplar it will use primary care electronic health records (SIR) from the population of the city of Salford and focus on chronic kidney disease (CKD). This will extend the current static health atlases to real-time web applications, which will be available in two forms: a 'public' web application aimed at the general public and a more sophisticated 'professional' web application made available to professionals (subject to data owner's permission). To achieve this, statistical methodology will be developed along with open-source software for: querying SIR (SQL); statistical modelling (R package); and interactively conveying the results via web applications (JavaScript) adhering to established human-computer interaction principles. The software will be flexible enough for the atlas to be used, beyond the lifetime of the project, for other diseases and in different locations. Furthermore, it may inform future research projects or health care authorities if unexpected (spatial and/or temporal) disease patterns emerge.

The secondary aim is to train the Fellow so that she can make a career transition from physics to biostatistics and informatics. Formal statistics training will be realised by her undertaking an MSc in statistics with a bias towards medical statistics, and training in informatics will be via directed reading and short courses at the Farr Institute. The MSc project will give the Fellow additional experience of applying advanced statistical methods in a clinical setting; the aim is to identify key risk factors underlying chronic kidney disease progression to end-stage (outcome). Existing statistical methodology will suffice for modelling single time-to-event outcomes, but joint modelling of multivariate time-to-event outcomes will require some novel methodological extension.

Planned Impact

For the benefit of academic researchers and health care professionals, all research will, wherever possible, be disseminated to the appropriate communities (nephrology, biostatistics, health informatics) through open-access publications and open-source software. The beneficiaries of each activity are as follows:

i) Research Activity A: real-time health atlas

The atlas will be realised by developing new methods and software, including two interactive web applications for conveying the health information to either the general public or professionals (e.g. clinicians, scientists). I will design the software system in such a way that it will be possible to extend the real-time health atlas to other diseases and geographical areas. Consequently, beyond the lifetime of this project, other researchers will be able to apply it to their diseases and geographical regions of interest, potentially benefitting individuals, health care professionals and policy makers, and creating long-term opportunities for improvements in quality of life and potential for reducing the cost of ill-health to economies. The 'professional' web application atlas could inform future research or health care interventions if unexpected temporal and/or spatial patterns are identified, and aid the development of hypotheses to understand variability in disease risks; either evidence or lack thereof could potentially convey important information to the public health and policy authorities.

The real-time health atlas is exemplified by chronic kidney disease (CKD) in Salford. It will potentially be of benefit to those with CKD, or those at risk of developing it. Raising awareness of CKD could encourage individuals to make lifestyle changes, especially if supported directly by their health care professionals and indirectly by policy makers (who could perhaps in the future implement screening programmes to proactively identify early-stage CKD). Money channelled into targeting people at risk of CKD could delay or prevent the onset of the later stages of CKD and hence other disease complications, improving people's quality of life and reducing the economic cost of this disease to the NHS and the country.

ii) Research Activity B: MSc project

The immediate beneficiaries of this research will be the research and clinical communities (e.g. through journal publication). However, once this research has identified the key risk factors which determine the rate of kidney disease progression, this knowledge should in the longer term be a step towards better health care strategies and treatments, benefitting patients and their families and perhaps reducing long-term economic costs to the country.

During my MSc project I will work in a clinical environment, applying advanced statistics to a specific kidney disease research question. This will benefit me by exposing me to a clinical setting and benefit the renal clinicians by exposing them to advanced statistics. Undertaking cross-disciplinary research to the point of producing research output (e.g. journal publications) will benefit both research communities by building strong collaborative links and an understanding of each other's skills, limitations and environments.

iii) Fellowship training and beyond

During this proposed fellowship I will undertake an MSc in statistics with a bias towards medical statistics and train alongside experts in health informatics and statistics. This will put me in a strong position at the end of the fellowship to take up an academic position at the interface of health informatics and biostatistics, where there is currently a skills shortage in the UK. I will then be in a position to lead my own research team at this interface and support the training of the next generation of researchers.
 
Description Salford Kidney Study 
Organisation Salford Royal NHS Foundation Trust
Department Renal Services
Country United Kingdom 
Sector Hospitals 
PI Contribution Extensive statistical modelling using this collaborator's longitudinal kidney disease database.
Collaborator Contribution Clinical expertise in kidney disease and an extensive database of over 3000 kidney disease patients in secondary care.
Impact Multi-disciplinary: statisticians and renal clinicians. Outcomes: Masters by Research thesis, "Risk factors for the rate of progression of chronic kidney disease in secondary care patients", available at https://doi.org/10.17635/lancaster/thesis/882; software, 'R markdown tutorial for formatting a masters or phd thesis', available from https://achale.gitlab.io/tutorialmarkdownthesis/
Start Year 2016
 
Description Small Animal Veterinary Surveillance Network 
Organisation University of Liverpool
Country United Kingdom 
Sector Academic/University 
PI Contribution This collaboration started before this award but is still ongoing as a direct result of the software development which took place as part of the proposed work for this award. I have worked with the Small Animal Veterinary Surveillance Network on producing a statistical model for disease surveillance, which resulted in one journal publication in 2019 (doi: 10.1038/s41598-019-53352-6). I am continuing to work with them on developing disease surveillance in near-real time; a working prototype is now up and running, based extensively on software developed during this award.
Collaborator Contribution The provision of data from their network of small animal veterinary surgeries, along with their considerable veterinary expertise.
Impact This is multi-disciplinary between vets and statisticians. Outcomes: journal publication during 2019 (doi: 10.1038/s41598-019-53352-6); interview: https://www.liverpool.ac.uk/media/livacuk/savsnet/Interview,with,Dr,Alison,Hale,FINAL.pdf
Start Year 2014
 
Title Map-based Visualisation with Dynamic Health 
Description The idea behind this software is to give healthcare professionals and policy makers the ability to view health data simultaneously in space (on a map) and time, which could, for example, make it easier to identify spatial and temporal trends. At the heart of this software, which I have written, is a JavaScript application that I designed to display health data on spatial maps (e.g. Google Maps) and time series figures. My JavaScript application is embedded into an HTML/CSS-based web page, which allows the health data to be viewed using a standard web browser. The software is flexible in the sense that it is not tied to a particular data source. It is customised by using a prescribed format of metadata to display a variety of maps (point, areal, raster) and a variety of time series configurations (e.g. predicted data with confidence intervals). Likewise the HTML/CSS web page can be customised as required. The software can be put on any web server and, when needed, embedded into an existing web site (e.g. demo at http://www.lancaster.ac.uk/staff/haleac/healthatlas/). Alternatively, if an end-user has some ESRI Shapefiles and appropriate health data, they can display their map and time series data by uploading them through my Shiny app (e.g. demo at http://fhm-chicas-apps.lancs.ac.uk/shiny/users/haleac/healthatlas/). A working prototype using Sri Lanka dengue disease data is accessible at https://www.lancaster.ac.uk/staff/haleac/srilanka/ The Dynamic Atlas app is open-source and can be downloaded from https://gitlab.com/achale/dynamicatlas; a working demo can be viewed at https://achale.gitlab.io/dynamicatlas/ It is supported by a tutorial, which can be accessed at https://achale.gitlab.io/dynamicatlastutorial/ and downloaded from https://gitlab.com/achale/dynamicatlastutorial/ Furthermore, this software is also incorporated within a Shiny app: http://fhm-chicas-apps.lancs.ac.uk/shiny/users/haleac/healthatlas/
Type Of Technology Webtool/Application 
Year Produced 2019 
Open Source License? Yes  
Impact No notable impacts have been realised to date. This software is now also incorporated in a prototype disease surveillance tool which I have developed for the Small Animal Veterinary Surveillance Network; this has probably not yet reached the stage of being a 'notable impact', but I expect it will in time. I am also working with my University of Manchester collaborators to incorporate this software into one of their existing systems and to write a journal paper.
URL http://www.lancaster.ac.uk/staff/haleac/
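
The Description above says the software is customised through a prescribed metadata format covering map types (point, areal, raster) and time series configurations such as predictions with confidence intervals. That format is not reproduced in this record, so the following is only a minimal sketch of what such a configuration check might look like. It is written in Python for illustration rather than in the application's own JavaScript, and every name in it (MAP_KINDS, TimeSeries, Layer, validate) and every number is hypothetical, not taken from the Dynamic Atlas software.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical layer kinds, mirroring the map types named in the Description.
MAP_KINDS = {"point", "areal", "raster"}


@dataclass
class TimeSeries:
    """One time series; lower/upper are optional interval bounds."""
    times: List[str]
    values: List[float]
    lower: Optional[List[float]] = None
    upper: Optional[List[float]] = None

    def validate(self) -> None:
        if len(self.times) != len(self.values):
            raise ValueError("times and values must have equal length")
        if (self.lower is None) != (self.upper is None):
            raise ValueError("give both interval bounds or neither")
        if self.lower is not None and self.upper is not None:
            if not (len(self.lower) == len(self.upper) == len(self.values)):
                raise ValueError("interval bounds must match values in length")
            for lo, v, hi in zip(self.lower, self.values, self.upper):
                if not lo <= v <= hi:
                    raise ValueError("each value must lie within its interval")


@dataclass
class Layer:
    """A map layer of a given kind with its associated time series."""
    name: str
    kind: str
    series: List[TimeSeries] = field(default_factory=list)

    def validate(self) -> None:
        if self.kind not in MAP_KINDS:
            raise ValueError(f"unknown map kind: {self.kind!r}")
        for s in self.series:
            s.validate()


# Made-up illustrative values: an areal layer with one predicted series
# and its interval bounds.
demo = Layer(
    name="demo-region",
    kind="areal",
    series=[TimeSeries(["t1", "t2"], [1.0, 1.2], [0.8, 0.9], [1.3, 1.6])],
)
demo.validate()  # raises ValueError if the configuration is inconsistent
```

Checking a configuration before it reaches the display layer is one way to keep a data-source-agnostic tool predictable; the actual software may handle this quite differently.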
 
Title Mapping UK COVID-19 predictions over space and time 
Description The Map-based Visualisation with Dynamic Health web app (initial version 2018), which was created during the period over which this award was held, has, as of the first quarter of 2021, been used and partly extended to display UK COVID-19 predictions. Note that this web app displays the predictions but does not compute them. The web app displays a variety of COVID-19 metrics including, but not limited to, predicted prevalence and reproduction number for each local authority district in the UK, and is updated on a daily basis. Helper functions for configuring the Dynamic Atlas web app were written during 2021; see https://gitlab.com/achale/dhaconfig
Type Of Technology Webtool/Application 
Year Produced 2021 
Open Source License? Yes  
Impact The software has been routinely used to display Lancaster's COVID-19 predictions at SPI-M meetings during 2021 and the first quarter of 2022.
URL https://chicas-covid19.gitlab.io/bayesstm/
 
Title R markdown tutorial for formatting a masters or phd thesis. 
Description R Markdown provides a way to include data analysis and modelling results directly within a thesis or report. Given some data, R-code is then used to produce results directly within the thesis document. There is no need to create tables by hand or generate and save figure files. Producing a coherent thesis can be a little tricky in R markdown so I wrote a tutorial to explain how to properly format the text, equations, tables, figures, images, page numbers, appendix, and so on. My tutorial is available from https://achale.gitlab.io/tutorialmarkdownthesis/ I also created a skeleton thesis template which can be used on its own or in conjunction with my tutorial. 
Type Of Technology Webtool/Application 
Year Produced 2019 
Open Source License? Yes  
Impact None as yet. 
URL https://achale.gitlab.io/tutorialmarkdownthesis/
 
Description Interview by Small Animal Veterinary Surveillance Network
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Interview conducted by the Small Animal Veterinary Surveillance Network, with the target audience being veterinary practices involved in this scheme. The interview is available at https://www.liverpool.ac.uk/media/livacuk/savsnet/Interview,with,Dr,Alison,Hale,FINAL.pdf
Year(s) Of Engagement Activity 2019
URL https://www.liverpool.ac.uk/savsnet/news/stories/title,1180294,en.html