Sample Size calculations for UPDATing clinical prediction models to Ensure their accuracy and fairness in practice (SS-UPDATE)

Lead Research Organisation: University of Birmingham
Department Name: Institute of Applied Health Research

Abstract

Healthcare research is in an exciting phase, with increasing access to information to link an individual's characteristics (such as age, family history or genetic information) with health outcomes (such as death, pain level, cancer). Researchers are using this information to help health professionals accurately predict an individual's future outcomes, to better personalise treatment, improve quality of life, and prolong life. For example, QRISK is used by doctors to calculate an individual's risk of heart disease within the next 10 years, and to guide who needs treatment to reduce their risk of heart disease occurring. Such prediction tools are known as 'clinical prediction models', and thousands are developed each year using statistical and artificial intelligence (AI) approaches.

Once a prediction model like QRISK has entered clinical practice, it is important that it is regularly updated (e.g. yearly), as otherwise its accuracy wanes over time. For example, due to changes in available treatments, the co-morbidities (multiple health conditions) of patients, and emerging global problems (e.g. pandemics), an outdated model may wrongly predict a low risk for a truly high-risk individual, or vice versa, and so model updating is needed to recalibrate predictions. Similarly, a model often needs updating when transporting it from the original setting (e.g. USA, secondary care) to a new one (e.g. UK, primary care), or when aiming to improve a model's accuracy (and thus fairness) in subgroups defined by sex and ethnicity.

The reliability, accuracy and fairness of an updated prediction model depends heavily on the representativeness and sample size of the dataset used to update the model. However, there is currently no clear guidance for how researchers should identify the (minimum) sample size required for model updating - for example, how many participants and outcome events are needed, relative to the number of model parameters being estimated (updated)? Sadly, many updating datasets are too small, and this leads to updated models with inaccurate and potentially harmful predictions. Therefore, identifying a suitable sample size is vital for researchers to consider at the outset of model updating studies.
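To make the scale of the question concrete, the sketch below applies two minimum sample size criteria from the existing model *development* literature (of the kind implemented in the pmsampsize software): one targets a desired uniform shrinkage of the estimated coefficients, the other targets precise estimation of the overall outcome proportion. The analogous criteria for model *updating* are what this project aims to derive; the function names and input values here are illustrative assumptions, not project outputs.

```python
import math

def min_n_shrinkage(n_params, r2_cs, shrinkage=0.9):
    """Minimum n so the expected uniform shrinkage of the estimated
    coefficients is at least `shrinkage`, given `n_params` parameters
    and an anticipated Cox-Snell R-squared of `r2_cs` (a criterion
    from the binary-outcome model development literature)."""
    return math.ceil(n_params / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage)))

def min_n_proportion(phi, margin=0.05, z=1.96):
    """Minimum n to estimate the overall outcome proportion `phi`
    to within +/- `margin` with approximately 95% confidence."""
    return math.ceil((z / margin) ** 2 * phi * (1 - phi))

# Illustrative inputs: 20 parameters to re-estimate, anticipated
# Cox-Snell R-squared of 0.2, outcome proportion of 0.1.
n1 = min_n_shrinkage(n_params=20, r2_cs=0.2, shrinkage=0.9)
n2 = min_n_proportion(phi=0.1, margin=0.05)
print(n1, n2, max(n1, n2))  # the largest criterion drives the required size
```

Even for this modest example the shrinkage criterion demands several hundred participants, illustrating why small updating datasets so often yield unreliable updated models.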

To address this, our project aims to provide guidance and methods for calculating the (minimum) sample size required to update a prediction model to ensure it is reliable, accurate and fair. We will achieve this using a series of work packages that: (i) review applied and methodology papers using (or proposing) a model updating method, to identify current approaches and shortcomings; (ii) develop sample size guidance and solutions (mathematical formulae) for a range of model updating methods for continuous, binary or time-to-event outcomes; and (iii) extend calculations to address model updates for subgroups (e.g. ethnic groups) to ensure models are generalisable and fair. All our work will be underpinned by real applications and disseminated through freely available computer software, web apps, dedicated workshops (with researchers and patient groups), training courses, social media and tutorial videos.

Our findings will provide quality standards for researchers to adhere to when updating models, and allow funders, health professionals and regulators to identify updated models that are reliable and fair for use in patient counselling and decision making. This aligns with "Good Machine Learning Practice for Medical Device Development: Guiding Principles", issued by the US Food and Drug Administration, Health Canada and the UK Medicines and Healthcare products Regulatory Agency in 2021 to promote safe, effective and ethical models.
