Multi-Modal Reinforcement Learning Algorithms for Improving Context-Sensitive Closed-Loop Blood Glucose Control for Type 1 Diabetics

Lead Research Organisation: University of Bristol
Department Name: Electrical and Electronic Engineering

Abstract

Historically, type 1 diabetics have had to adhere to a strict regime of blood glucose monitoring and daily self-administered insulin injections to maintain their glucose levels within a healthy range. The advent of continuous glucose monitors (CGM) and insulin pumps has significantly improved the capabilities of type 1 diabetics in managing their condition. However, the majority of the burden for interpreting the data and selecting the insulin dosage still falls on the individual. Hybrid closed-loop insulin delivery systems are already available to meet this need, using control algorithms to automatically adjust basal insulin infusion rates based on blood glucose measurements. These systems have shown success in improving glucose levels and reducing glycaemic variability (Lal et al. 2019, McAuley et al. 2020), but are limited in their ability to handle instances in which blood glucose changes rapidly.
To address this, reinforcement learning (RL) algorithms have been utilised to retroactively respond to changes in blood glucose. These algorithms consider states consisting of glucose, insulin, and carbohydrate data and apply this information to select the optimum basal or bolus insulin dosage. RL algorithms have been tested in silico, yielding improvements in mean glucose levels when compared to commercial control algorithms (Yamagata et al. 2020, Fox et al. 2020). However, these approaches have relied on virtual patient cohorts for algorithmic training, which are limited in their application to real clinical populations. Furthermore, type 1 diabetics utilise a wealth of information beyond the described state spaces of current RL approaches, using knowledge relating to activity, stress, and illness to further inform their decision making.
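As an illustration of the kind of state described above, the following minimal sketch flattens a short history of glucose, insulin, and carbohydrate readings into a single state vector. The names `Reading` and `build_state`, the units, and the window length are assumptions made for this example and are not taken from the cited work.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Reading:
    """One time step of logged data (hypothetical layout and units)."""
    glucose_mg_dl: float
    insulin_units: float
    carbs_g: float


def build_state(history: Sequence[Reading], window: int = 4) -> List[float]:
    """Flatten the most recent `window` readings into one state vector.

    The result is ordered oldest to newest, three values per reading:
    glucose, insulin, carbohydrate.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if len(history) < window:
        raise ValueError("not enough history to build a state")
    state: List[float] = []
    for reading in history[-window:]:
        state.extend(
            [reading.glucose_mg_dl, reading.insulin_units, reading.carbs_g]
        )
    return state
```

Context-sensitive variants of this state could append further signals, such as heart rate or step count, to each reading in the same way.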
This project will build on state-of-the-art RL algorithms for glucose control and incorporate novel indicators of blood glucose from real patient data, in order to improve the suitability of these algorithms for clinical application. This approach will develop model-based RL (MBRL) algorithms for offline learning and then train them on available sources of CGM and insulin data. The preliminary stages of the project will consist of analysing these sources and assessing their limitations. This will focus on datasets containing passively collected variables which could be indicative of periods of stress, work, activity, non-engagement, or illness, such as those present in the OhioT1DM (Marling and Bunescu 2020) or D1NAMO datasets (Dubosson et al. 2020). If the available data are insufficient, this stage will conclude with a period of data collection involving adult participants, each using a CGM and insulin pump. The participants' blood glucose levels, insulin usage, and food intake will be continuously recorded over a 4-week period, with additional measurements of factors such as heart rate and step count being logged using commercial wearables. To utilise the small CGM datasets effectively, techniques for improving sample efficiency will also be explored. These will include augmenting the datasets using methods such as contrastive learning or data generation; modifying existing sample-efficient RL algorithms, such as shallow MBRL or Bayesian RL; and utilising transfer learning to train models on virtual patient cohorts and apply them to real patient data.
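To make the idea of a shallow learned model concrete, the sketch below fits a one-step linear glucose model to logged transitions by ordinary least squares, using only the standard library. It is a simplified stand-in rather than the project's method: the linear form, the `Transition` layout, and the function names are all assumptions made for illustration, and a real MBRL pipeline would also need to handle noise, delays, and uncertainty.

```python
from typing import List, Sequence, Tuple

# Hypothetical logged transition: (glucose, insulin, carbs, next_glucose).
Transition = Tuple[float, float, float, float]


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: the data do not identify the model")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        known = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - known) / m[row][row]
    return x


def fit_dynamics(data: Sequence[Transition]) -> List[float]:
    """Least-squares fit of next_glucose = w . [glucose, insulin, carbs, 1]."""
    feats = [[g, i, c, 1.0] for g, i, c, _ in data]
    targets = [t[3] for t in data]
    d = 4
    # Normal equations: (X^T X) w = X^T y.
    xtx = [[sum(f[r] * f[c] for f in feats) for c in range(d)] for r in range(d)]
    xty = [sum(f[r] * y for f, y in zip(feats, targets)) for r in range(d)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], glucose: float, insulin: float, carbs: float) -> float:
    """One-step glucose prediction from the fitted weights."""
    return (
        weights[0] * glucose
        + weights[1] * insulin
        + weights[2] * carbs
        + weights[3]
    )
```

A model of this kind can be rolled forward step by step to generate synthetic experience for a policy, which is one reason model-based methods are attractive when real patient data are scarce.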
The performance of the RL algorithm will be evaluated in silico using trajectory inspection (Ji et al. 2020), by comparing the projected blood glucose levels with those achieved by the patient in the dataset. Following successful implementation, the algorithm will be adapted to improve its clinical practicality. This could include exploring methods for reducing the risk and burden associated with the algorithmic training process, introducing options for customised control based on user suggestions, or modifying the algorithm to provide varying levels of control when applied to patients who are not engaged with the management of their diabetes.
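One simple way to summarise such a comparison is time in range, the fraction of readings falling within a target band. The sketch below assumes the commonly used 70 to 180 mg/dL band and hypothetical function names; it illustrates a comparison of projected and observed trajectories and is not the trajectory inspection procedure of the cited work.

```python
from typing import Sequence


def time_in_range(
    glucose_mg_dl: Sequence[float], low: float = 70.0, high: float = 180.0
) -> float:
    """Fraction of readings with low <= glucose <= high."""
    if not glucose_mg_dl:
        raise ValueError("empty trajectory")
    in_range = sum(1 for g in glucose_mg_dl if low <= g <= high)
    return in_range / len(glucose_mg_dl)


def compare_trajectories(
    projected: Sequence[float], observed: Sequence[float]
) -> float:
    """Projected minus observed time in range.

    A positive value means the projected trajectory spends a larger
    share of readings inside the target band than the patient's own data.
    """
    return time_in_range(projected) - time_in_range(observed)
```

A single summary figure hides when and how excursions occur, so it would complement rather than replace inspection of the trajectories themselves.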

Planned Impact

Impact on Health and Care
The CDT primarily addresses the most pressing needs of nations such as the UK - namely the growth of expenditure on long term health conditions. These conditions (e.g. diabetes, depression, arthritis) cost the NHS over £70Bn a year (~70% of its budget). As our populations continue to age these illnesses threaten the nation's health and its finances.

Digital technologies are transforming our world, from transport to relationships and from entertainment to finance, and there is consensus that digital solutions will have a huge role to play in health and care. Through the CDT's emphasis on multidisciplinarity, teamwork, design and responsible innovation, it will produce future leaders positioned to seize that opportunity.

Impact on the Economy
The UK has Europe's 2nd largest medical technology industry and a hugely strong track record in health, technology and societal research. It is very well-placed to develop digital health and care solutions that meet the needs of society through the creation of new businesses.

Achieving economic impact is more than a matter of technology. The CDT has therefore been designed to ensure that its graduates are team players with deep understanding of health and social care systems, good design and the social context within which a new technology is introduced.

Many multinationals have been keen to engage with the CDT (e.g. Microsoft, AstraZeneca, Lilly, Biogen, Arm, Huawei), and part of the Director's role will be to position the UK as a destination for inward investment in Digital Health. CDT partners collectively employ nearly 1,000,000 people worldwide and are easily in a position to create thousands of jobs in the UK.

The connection to CDT research will strongly benefit UK enterprises such as System C and Babylon, along with smaller companies such as Ayuda Heuristics and Evolyst.

Impact on the Public
When new technologies are proposed to collect and analyse highly personal health data, and are potentially involved in life or death decisions, it is vital that the public are given a voice. The team's experience is that listening to the public makes research better. Involving a full spectrum of the community in research also benefits those communities: it can be empowering, it can support the personal development of individuals within communities who may have little awareness of higher education, and it can catalyse community groups to come together around key health and care issues.

Policy Makers
From the team's conversations with the senior leadership of the NHS, local leaders of health and social care transformation (see letters from NHS and Bristol City Council) and national reports, it is very apparent that digital solutions are seen as vital to the delivery of health and care. The research of the CDT can inform policy makers about the likely impact of new technology on future services.

Partner organisation Care & Repair will disseminate research findings around independent living and have a track record of translating academic research into changes in practice and policy.

Carers UK represent the role of informal carers, such as family members, in health and social care. They have a strong voice in policy development in the UK and are well-placed to disseminate the CDT's research to policy makers.

STEM Education
It has been shown that outreach for school-age children around STEM topics can improve engagement in those topics at school. However, female entry into STEM at university level remains dramatically lower than male entry, with the reverse being true for health and life sciences. The CDT outreach leverages this fact by focusing STEM outreach activities on digital health and care, which can encourage young women into computer science and have an impact on the next generation of women in higher education.

For academic impact see "Academic Beneficiaries" section.

Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/S023704/1 | | | 01/04/2019 | 30/09/2027 |
2452234 | Studentship | EP/S023704/1 | 01/10/2020 | 20/09/2024 | Harry Emerson