Sample Size calculations for UPDATing clinical prediction models to Ensure their accuracy and fairness in practice (SS-UPDATE)
Lead Research Organisation:
University of Birmingham
Department Name: Institute of Applied Health Research
Abstract
Healthcare research is in an exciting phase, with increasing access to information to link an individual's characteristics (such as age, family history or genetic information) with health outcomes (such as death, pain level, cancer). Researchers are using this information to help health professionals accurately predict an individual's future outcomes, to better personalise treatment, improve quality of life, and prolong life. For example, QRISK is used by doctors to calculate an individual's risk of heart disease within the next 10 years, and to guide who needs treatment to reduce their risk of heart disease occurring. Such prediction tools are known as 'clinical prediction models', and thousands are developed each year using statistical and artificial intelligence (AI) approaches.
Once a prediction model like QRISK has entered into clinical practice, it is important that it is regularly updated (e.g. yearly) as otherwise its accuracy wanes over time. For example, due to changes in treatments available, the co-morbidities (multiple health conditions) of patients, and emerging global problems (e.g. pandemics), an outdated model may wrongly predict a low risk for a truly high risk individual, or vice versa, and so model updating is needed to recalibrate predictions. Similarly, a model often needs updating when transporting it from the original setting (e.g. USA, secondary care) to a new one (e.g. UK, primary care), or when aiming to improve a model's accuracy (and thus fairness) in subgroups defined by sex and ethnicity.
The reliability, accuracy and fairness of an updated prediction model depends heavily on the representativeness and sample size of the dataset used to update the model. However, there is currently no clear guidance for how researchers should identify the (minimum) sample size required for model updating - for example, how many participants and outcome events are needed, relative to the number of model parameters being estimated (updated)? Sadly, many updating datasets are too small, and this leads to updated models with inaccurate and potentially harmful predictions. Therefore, identifying a suitable sample size is vital for researchers to consider at the outset of model updating studies.
To address this, our project aims to provide guidance and methods for calculating the (minimum) sample size required to update a prediction model to ensure it is reliable, accurate and fair. We will achieve this using a series of work packages that: (i) review applied and methodology papers using (or proposing) a model updating method, to identify current approaches and shortcomings; (ii) develop sample size guidance and solutions (mathematical formulae) for a range of model updating methods for continuous, binary or time-to-event outcomes; and (iii) extend calculations to address model updates for subgroups (e.g. ethnic groups) to ensure models are generalisable and fair. All our work will be underpinned by real applications and disseminated through freely-available computer software, web apps, dedicated workshops (with researchers and patient groups), training courses, social media and tutorial videos.
Our findings will provide quality standards for researchers to adhere to when updating models, and allow funders, health professionals and regulators to identify updated models that are reliable and fair for use in patient counselling and decision making. This aligns with "Good Machine Learning Practice for Medical Device Development: Guiding Principles" issued by US Food and Drug Administration, Health Canada and UK Medicines and Healthcare Products Regulatory Agency in 2021 to produce safe, effective and ethical models.
Publications
| Title | TRIPOD+AI |
| Description | TRIPOD+AI provides harmonised guidance for reporting prediction model studies, irrespective of whether regression modelling or machine learning methods have been used. The new checklist supersedes the TRIPOD 2015 checklist, which should no longer be used. I was part of the leadership group that developed TRIPOD+AI and our BMJ paper presents the expanded 27 item checklist with more detailed explanation of each reporting recommendation, and the TRIPOD+AI for Abstracts checklist. TRIPOD+AI aims to promote the complete, accurate, and transparent reporting of studies that develop a prediction model or evaluate its performance. Complete reporting will facilitate study appraisal, model evaluation, and model implementation. |
| Type Of Material | Improvements to research infrastructure |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | Recommended by EQUATOR and leading medical journals, TRIPOD+AI has already been cited 400 times since publication last year and will help improve the quality of reporting in prediction model research. |
| URL | https://www.bmj.com/content/385/bmj-2023-078378 |
| Description | NICE guidance document on clinical prediction models |
| Organisation | National Institute for Health and Care Excellence (NICE) |
| Department | NICE International |
| Country | United Kingdom |
| Sector | Public |
| PI Contribution | Working on a NICE guidance document for how to develop, validate and appraise clinical prediction models |
| Collaborator Contribution | Richard Riley has co-written the guidance |
| Impact | Guidance paper is forthcoming |
| Start Year | 2024 |
| Title | pmsampsize - software module in Python for sample size required for model development studies |
| Description | Calculate the sample size for developing a prediction model |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Impact | Researchers are using it to design their model development studies |
| Title | pmsampsize: Software in Stata for calculating the sample size required for development of prediction models |
| Description | Our package (led by Joie Ensor) for calculating the sample size needed to develop a prediction model is constantly updated as new guidance emerges from our grants and other work |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Impact | Software has been downloaded >40k times since launch in 2020, and is constantly updated during our grants |
| URL | https://ideas.repec.org/c/boc/bocode/s458569.html |
| Title | pmsampsize: Software in R for calculating the sample size required for development of prediction models |
| Description | Our package (led by Joie Ensor) for calculating the sample size needed to develop a prediction model is constantly updated as new guidance emerges from our grants and other work |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Impact | Researchers routinely use this package to inform the sample size for their model development studies - package downloaded >40k times |
| URL | https://cran.r-project.org/web/packages/pmsampsize/index.html |
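To illustrate the kind of calculation the pmsampsize packages perform, below is a minimal Python sketch of one of the published criteria (the shrinkage-based criterion from Riley et al.'s sample size guidance for model development). The function name and interface are our own for illustration; they are not the package's API, which offers further criteria and options.

```python
import math

def min_sample_size_shrinkage(n_params: int, r2_cs: float,
                              shrinkage: float = 0.9) -> int:
    """Minimum development sample size so that the expected uniform
    shrinkage of the model's coefficients is no worse than `shrinkage`,
    given the number of candidate predictor parameters (n_params) and the
    anticipated Cox-Snell R-squared of the model (r2_cs)."""
    n = n_params / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage))
    return math.ceil(n)

# e.g. 20 candidate parameters, anticipated Cox-Snell R^2 of 0.2,
# targeting an expected shrinkage factor of at least 0.9:
print(min_sample_size_shrinkage(20, 0.2))  # 796 participants
```

Note that this is only one of several criteria the guidance combines (others target precise estimation of the overall outcome risk and small optimism in apparent model fit); the recommended sample size is the largest across all criteria.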
| Title | pmstabilityss - Stata module for calculating the sample size required for individual-level stability in risk predictions |
| Description | Open source in public domain - calculates the sample size required for individual-level stability in risk predictions |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Impact | We anticipate researchers will use this to inform the design of their model development and updating studies |
| Title | pmstabilitytte - Stata module for calculating the sample size needed for precise predictions from a time-to-event prediction model |
| Description | In public domain - the package calculates the individual uncertainty anticipated when developing or updating a model with a particular sample size |
| Type Of Technology | Software |
| Year Produced | 2025 |
| Impact | Researchers will use this to inform the sample size needed for developing or updating their models |
| Title | pmvalsampsize - software module in Python for sample size required for model validation studies (2024) |
| Description | Calculates the sample size required for evaluating models and their performance, as a Python package - we are constantly updating and extending this |
| Type Of Technology | Software |
| Year Produced | 2024 |
| Impact | Researchers are using this to calculate the sample size for their model validation studies |
| Title | pmvalsampsize: Software in Stata for calculating the sample size required for validation of prediction models |
| Description | This calculates the sample size needed for studies evaluating a model, which is an important part of risk of bias assessments in reviews of models, and of planning studies to evaluate models |
| Type Of Technology | Software |
| Year Produced | 2023 |
| Impact | Researchers are using this to inform their sample size for model evaluation - we are constantly updating and extending this |
| URL | https://ideas.repec.org/c/boc/bocode/s459226.html |
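One of the criteria in the published validation sample-size guidance that pmvalsampsize implements targets a sufficiently precise estimate of the observed/expected (O/E) ratio, a measure of overall calibration. A minimal sketch is below, using the approximation var(ln(O/E)) ≈ (1 − φ)/(nφ) for outcome prevalence φ; the function name is ours for illustration and is not the package's API.

```python
import math

def min_n_for_oe_ratio(prevalence: float, ci_lower: float = 0.8,
                       ci_upper: float = 1.25) -> int:
    """Minimum validation sample size so the 95% confidence interval for
    the observed/expected (O/E) ratio has roughly the target width,
    assuming var(ln(O/E)) ~= (1 - prevalence) / (n * prevalence)."""
    # SE of ln(O/E) implied by the target CI (width = 2 * 1.96 * SE)
    target_se = (math.log(ci_upper) - math.log(ci_lower)) / (2 * 1.96)
    n = (1 - prevalence) / (prevalence * target_se ** 2)
    return math.ceil(n)

# Outcome prevalence of 10%, targeting a 95% CI for O/E of about 0.80 to 1.25:
print(min_n_for_oe_ratio(0.10))
```

As with model development, the full guidance combines several criteria (e.g. for the calibration slope and c-statistic as well as O/E) and takes the largest resulting sample size.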
| Description | 3-day training course: Statistical Methods for Risk Prediction & Prognostic Models (2024 - delivered twice) |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Other audiences |
| Results and Impact | Short course to disseminate research methods to international participants, about statistical methods for risk prediction modelling, including topics such as sample size, model development, critical appraisal, model evaluation, meta-analysis, etc |
| Year(s) Of Engagement Activity | 2024 |
| Description | Invited Oral Presentation: Prediction models for healthcare: an introduction (Tata Memorial Centre, Mumbai, India) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Seminar to discuss methodological issues in prediction model research, and our sample size guidance. Attendees gained new knowledge and insights to improve their studies going forwards |
| Year(s) Of Engagement Activity | 2024 |
| Description | Invited Oral Presentation: Clinical prediction models: a playground for healthcare research (Cardiff) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | Local |
| Primary Audience | Other audiences |
| Results and Impact | Seminar to discuss methodological issues in prediction model research, and our sample size guidance. Attendees gained new knowledge and insights to improve their studies going forwards |
| Year(s) Of Engagement Activity | 2024 |
| Description | Invited seminar: Size Matters: The importance of sample size on the quality and utility of AI-based prediction models for healthcare (University of Liverpool) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | Local |
| Primary Audience | Postgraduate students |
| Results and Impact | Seminar to discuss methodological issues in prediction model research, and our sample size guidance. Attendees gained new knowledge and insights to improve their understanding of research in prediction for their career |
| Year(s) Of Engagement Activity | 2024,2025 |
| Description | MEMTAB 2025 Conference (Hosting, Organising and Delivering) |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Other audiences |
| Results and Impact | MEMTAB is the leading international conference on methods for evaluating models, tests and biomarkers for healthcare. It allows debate and dissemination of best methods for developing, evaluating and identifying reliable models, tests and biomarkers for use in clinical practice. In 2025, we organised the 7th international conference, held at the University of Birmingham, under the theme "Methodology That Stands the Test". We pushed participants to deepen their understanding of what constitutes the research evidence needed for models, tests and biomarkers to be reliably endorsed, communicated and deployed in practice. We had 200 participants from around the world, including PhD students, clinical fellows, methodologists, GPs and healthcare professionals, regulators, and economists, with sessions on sample size, systematic reviews and PPIE involvement in methodology research for prediction models. Participants gained methodology knowledge that will change the way they do their research in practice, and how they evaluate and regulate models and tests. |
| Year(s) Of Engagement Activity | 2025 |
| URL | https://uobevents.eventsair.com/memtab-2025/ |
| Description | MEMTAB 2025 Short course delivery: An Introduction to Risk Prediction Models and Sample Size Calculations |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Other audiences |
| Results and Impact | We delivered a 1-day short course introducing the key phases of clinical prediction models, and the theory and software for sample size calculations for development, updating and evaluation, to 30 participants attending the MEMTAB conference. Participants learnt the tools and approaches to improve their research design and analyses moving forwards |
| Year(s) Of Engagement Activity | 2025 |
| Description | Oral presentation: Sample size calculations for accuracy-based measures (ISCB 2024, Greece) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Other audiences |
| Results and Impact | Session on prediction model research at the ISCB methodology conference |
| Year(s) Of Engagement Activity | 2024 |
| Description | Oral presentation: Sample size for targeting precise individual-level risk estimates for binary outcomes (Royal Statistical Society, Brighton) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Other audiences |
| Results and Impact | Presentation within a prediction model session at the Royal Statistical Society conference in Brighton, Sept 2024 |
| Year(s) Of Engagement Activity | 2024 |
| Description | PPIE group for prediction model methodology |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Patients, carers and/or patient groups |
| Results and Impact | We have facilitated a new PPIE group focused on supporting methodology research for clinical prediction models. The group has provided input to existing projects on uncertainty and sample size, and will contribute to new research discussions and outputs from our methodology work going forward. Led by Kym Snell and Paula Dhiman, the group has met online in the early evening, to have open discussions about expectations and roles, to learn about our work, and for both sides to identify how they can contribute effectively. |
| Year(s) Of Engagement Activity | 2024 |
| Description | Prognosis Research in Healthcare Summer School |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Our summer school was attended by 18 participants and we taught research methods for primary studies and systematic reviews of prognosis research, including prediction models. Participants gained methodology knowledge that will change the way they do their research in practice |
| Year(s) Of Engagement Activity | 2024 |
| Description | Young Statisticians Meeting - Workshop on Sample Size for Prediction Models (organisation and delivery) |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Other audiences |
| Results and Impact | 40 participants (all career-young statisticians and data scientists) came to learn about sample size calculations for risk prediction modelling, which will change how they do their research going forward |
| Year(s) Of Engagement Activity | 2024 |
| Description | Invited oral presentation (February 2025): Harnessing uncertainty in clinical prediction models using Stata |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Other audiences |
| Results and Impact | >350 participants worldwide attended Joie Ensor's invited talk on prediction model research, sample size and uncertainty, which led to dissemination of our new software modules |
| Year(s) Of Engagement Activity | 2025 |
