Integrating hospital outpatient letters into the healthcare data space
Lead Research Organisation:
University of Manchester
Department Name: Computer Science
Abstract
The importance of analysing health data collected as part of clinical care and stored in electronic health records is well-established. This has led to vital research about the occurrence and progression of disease, treatment effectiveness and safety, and health service delivery. The current Covid-19 pandemic has demonstrated the public health need to efficiently use data collected at the point of care to rapidly understand patterns, risk factors and outcomes of emerging diseases. Much of this work comes from primary care electronic health records, where general practitioners (GPs) enter and use structured, coded healthcare data. The picture in hospitals, however, is very different.
One in four people in the UK live with one or more long-term conditions (LTCs) like cardiovascular diseases, chronic respiratory diseases, type 2 diabetes, arthritis and cancer, which account for 70% of the NHS budget. Specialised opinion about management of LTCs is provided through hospital outpatient care. Data and insight from outpatient clinics, however, are almost entirely absent. There is, surprisingly, no national system for recording diagnoses in hospital outpatient clinics. Information about key clinical events is instead recorded in outpatient letters, which are primarily used to communicate with patients and GPs. The ways in which letters are written and their sensitive content mean that they are not available for larger-scale "secondary use", i.e. to support clinical practice, research or service improvement. For example, shielding for the current pandemic relied on hospital clinical teams going through patient letters manually to identify those who needed shielding based on free-text information about diagnoses and medications, with clear time constraints and risks of under- and over-shielding patients.
Natural language processing (NLP) and text mining develop computer algorithms to automatically extract relevant information from free-text documents. This project will establish a partnership between academia, secondary care and industry to develop a standards-based information management framework to safely unlock information stored in outpatient letters, link it with other health data and demonstrate its impact and benefits through two case studies. We will develop new methods to extract key clinical events from letters and represent their details (e.g. medication used, duration of symptoms) in a computerised form so that they can be easily accessed. In doing so, we will use NHS-adopted standards so that the outpatient letters can be linked to other hospital databases and do not live in their own silo. The protection of sensitive data that may appear in outpatient data is a prime concern, so we will develop clear rules on who can access such data and how, in particular considering that third parties (e.g. industry) may need to access that data for developing their tools. These rules will be developed in close collaboration between patient representatives, clinicians and specialists to ensure safeguards, public trust and transparency of decision making.
We will demonstrate the potential impact of the proposed methods through two case studies with our clinical and business partners. Our first case study will demonstrate how the proposed models can assist in timely, efficient, dynamic and transparent identification of patients for shielding in a pandemic, or for vaccination prioritisation. In the second case study, we will illustrate how the same information can be used to address important gaps in our knowledge about health and care, including, for example, disease prevalence and drug utilisation patterns. All outputs will be developed in a way that can be scaled beyond the single clinical site and single speciality.
Publications
Alfattni G (2021) Attention-based bidirectional long short-term memory networks for extracting temporal relationships from clinical discharge summaries. in Journal of biomedical informatics
Alhassan A (2025) Discontinuous named entities in clinical text: A systematic literature review. in Journal of biomedical informatics
Ford E (2025) What is the patient re-identification risk from using de-identified clinical free text data for health research? in AI and Ethics
Han L (2024) Neural machine translation of clinical text: an empirical investigation into multilingual pre-trained language models and transfer-learning. in Frontiers in digital health
| Description | So far, this grant has helped establish a strong national PPIE network that focuses on healthcare text analytics. A series of PPIE events has been organised to raise awareness of the opportunities and risks of healthcare text analytics. To complement the above, a national free-text governance network is being established to help and advise on using healthcare free-text data to support research both in clinical NLP and in epidemiology. The grant also helped establish priorities for healthcare data donations. |
| First Year Of Impact | 2023 |
| Sector | Digital/Communication/Information Technologies (including Software),Education,Healthcare,Pharmaceuticals and Medical Biotechnology |
| Impact Types | Societal,Policy & public services |
| Description | Advisory committee for AI in healthcare |
| Geographic Reach | National |
| Policy Influence Type | Participation in a guidance/advisory committee |
| Impact | This is a new committee established at the Trust. |
| Description | Configurable federated de-identification of clinical free-text data to unlock the research potential of unstructured patient data to improve health and treatment outcomes |
| Amount | £13,000 (GBP) |
| Organisation | University of Manchester |
| Sector | Academic/University |
| Country | United Kingdom |
| Start | 04/2022 |
| End | 09/2022 |
| Description | This patient does not exist! Computing Plausibility and Privacy Preserving Synthetic Datasets to Support Clinical AI and Informatic (NWCybercom Project, funded by Research England's Connecting Capability Fund) |
| Amount | £50,000 (GBP) |
| Organisation | United Kingdom Research and Innovation |
| Sector | Public |
| Country | United Kingdom |
| Start | 03/2025 |
| End | 09/2025 |
| Title | drugprepr: Prepare Electronic Prescription Record Data to Estimate Drug Exposure |
| Description | Prepare prescription data (such as from the Clinical Practice Research Datalink) into an analysis-ready format, with start and stop dates for each patient's prescriptions. |
| Type Of Material | Improvements to research infrastructure |
| Year Produced | 2021 |
| Provided To Others? | Yes |
| Impact | Used to prepare drug exposure data in the Centre for Epidemiology. |
| URL | https://cran.r-project.org/web/packages/drugprepr/index.html |
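
The kind of preparation described above, turning raw prescription rows into start and stop dates, can be illustrated with a minimal Python sketch. This is not the drugprepr API (drugprepr is an R package); the `Prescription` fields and the rule "stop date = start date plus the days covered by the quantity at a steady daily dose" are assumptions made only for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from math import ceil


@dataclass
class Prescription:
    # All field names are hypothetical, chosen for this sketch only.
    patient_id: str
    start: date
    quantity: float    # units dispensed
    daily_dose: float  # units assumed to be taken per day


def stop_date(rx: Prescription) -> date:
    """Estimate the last day covered by a prescription.

    Assumes the whole quantity is taken at a constant daily dose,
    starting on the prescription date.
    """
    days_covered = ceil(rx.quantity / rx.daily_dose)
    return rx.start + timedelta(days=days_covered - 1)


rx = Prescription("p1", date(2021, 1, 1), quantity=56, daily_dose=2)
print(rx.start, stop_date(rx))  # 56 units at 2 per day cover 28 days
```

Real prescription data usually also needs decisions about missing quantities, implausible doses and overlapping prescriptions, which this sketch deliberately leaves out.
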
| Description | Co-Production Collective |
| Organisation | University College London |
| Country | United Kingdom |
| Sector | Academic/University |
| PI Contribution | Providing expertise in clinical NLP, and research questions for co-production. |
| Collaborator Contribution | Providing expertise and co-organising PPIE meetings for co-production. |
| Impact | A series of 6 workshops and focus groups organised. A publication in preparation. |
| Start Year | 2022 |
| Description | Partnership with Akrivia Health |
| Organisation | Akrivia Health |
| Country | United Kingdom |
| Sector | Hospitals |
| PI Contribution | Discussions between R&D teams at UoM and Akrivia Health, in the area of generative AI. UoM contributes expertise in NLP. |
| Collaborator Contribution | Discussions between R&D teams at UoM and Akrivia Health, in the area of generative AI. Akrivia contributes expertise in clinical NLP and data access |
| Impact | Discussions |
| Start Year | 2023 |
| Description | Partnership with IQVIA/Linguamatics |
| Organisation | IQVIA |
| Department | IQVIA, UK |
| Country | United Kingdom |
| Sector | Private |
| PI Contribution | Identifying computational standards for training and deployment of clinical NLP software. This has followed from the Healtex industrial forum discussions |
| Collaborator Contribution | A case study for development and deployment of NLP tools, in collaboration with clinical partners. |
| Impact | Joint funding. Publications to follow-up. |
| Start Year | 2020 |
| Title | MASK - de-identification of clinical narrative |
| Description | Medical health records and clinical summaries contain a vast amount of important information in textual form that can help advance research on treatments, drugs and public health. However, the majority of this information is not shared because it contains private information about patients, their families, or the medical staff treating them. Regulations such as HIPAA in the US, PHIPA in Canada and GDPR regulate the protection, processing and distribution of this information. If this information is de-identified and personal information is replaced or redacted, it could be distributed to the research community. In this paper, we present MASK, a software package that is designed to perform the de-identification task. The software is able to perform named entity recognition using some of the state-of-the-art techniques and then mask or redact recognized entities. The user is able to select the named entity recognition algorithm (with pre-trained models, including BERT, GloVe and ELMo embeddings) and the masking algorithm (e.g. shift dates, replace names/locations, totally redact entity). |
| Type Of Technology | Software |
| Year Produced | 2023 |
| Open Source License? | Yes |
| Impact | Used as part of HIPS and Jigsaw projects. |
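
The two-stage design described for MASK (recognise entities, then apply a masking strategy chosen per entity type) can be sketched in a few lines of Python. This is not MASK's code or interface: the regular expressions stand in for a trained named entity recognition model, and the function names, labels and strategy names are hypothetical.

```python
import re
from datetime import datetime, timedelta


def find_entities(text):
    """Toy stand-in for a NER model: returns sorted (start, end, label) spans."""
    patterns = {
        "DATE": r"\b\d{2}/\d{2}/\d{4}\b",
        "NAME": r"\b(?:Mr|Mrs|Ms|Dr)\.? [A-Z][a-z]+\b",
    }
    spans = []
    for label, pattern in patterns.items():
        for match in re.finditer(pattern, text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def shift_date(value, days):
    """Shift a DD/MM/YYYY date by a fixed number of days."""
    shifted = datetime.strptime(value, "%d/%m/%Y") + timedelta(days=days)
    return shifted.strftime("%d/%m/%Y")


def mask(text, strategy, date_shift_days=30):
    """Apply a per-label strategy: 'shift' (dates), 'replace', else redact."""
    out, cursor = [], 0
    for start, end, label in find_entities(text):
        out.append(text[cursor:start])
        original = text[start:end]
        action = strategy.get(label, "redact")
        if action == "shift" and label == "DATE":
            out.append(shift_date(original, date_shift_days))
        elif action == "replace":
            out.append(f"[{label}]")
        else:
            out.append("X" * len(original))  # full redaction
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


letter = "Dr Smith reviewed Mrs Jones on 01/03/2023."
masked = mask(letter, {"DATE": "shift", "NAME": "replace"})
print(masked)
```

Keeping recognition and masking separate is what lets the user swap either part independently, for example shifting dates for longitudinal studies while fully redacting names.
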
| Description | (Un)locking clinical free-text documentation |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presentation at the 'Clinical free-text workshop' in Edinburgh in March 2023. Discussion on specification for free-text de-identification in healthcare. |
| Year(s) Of Engagement Activity | 2023 |
| Description | 19th Workshop on Multiword Expressions (MWE 2023) |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Lifeng Han organised the 19th Workshop on Multiword Expressions (MWE 2023) and chaired the special track on MWEs in clinical NLP. He was also the programme co-chair. |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://multiword.org/mwe2023/ |
| Description | A Transformer-based Machine Learning Framework using Conditional Random Fields as Decoder for Clinical Text Mining (HealTAC 2022) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presentation by Lifeng Han at HealTAC 2022: the 5th Healthcare Text Analytics Conference · Jun 15, 2022 (London hub) Abstract: "Clinical Natural Language Processing (NLP) methods are increasingly used in different healthcare applications, including identification of drug exposure, disease severity progression, relation extraction, etc. However, the majority of published models use either statistical modelling or neural network based models such as LSTMs. To take advantage of the strength from both paradigms and further improve the model performances in this domain, we explore a new clinical NLP framework that uses state of the art Transformer neural models as encoder in combination with Conditional Random Fields (CRFs) as decoder. To overcome the data scarce issue where the manually annotated clinical data is hard to acquire, we propose to use graph-based label propagation method to extend labelled dataset for model learning. We will test the framework for the drug information extraction task from n2c2 challenges. Git: https://github.com/poethan/TransformerCRF" |
| Year(s) Of Engagement Activity | 2023 |
| Description | BeeManc at the PLABA Track of TAC 2023: Investigating LLMs and Controllable Attributes for Improving Biomedical Text Readability (PLABA 2023) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presented at "Plain Language Adaptation of Biomedical Abstracts" (PLABA 2023) @ Fifteenth Text Analysis Conference (TAC 2023); hybrid - presented online by Lifeng Han + Zihao Li. Workshop: November 13, 2023 |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://tac.nist.gov/2023/ |
| Description | BeeManc at the PLABA Track of TAC-2024 |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | BeeManc at the PLABA Track of TAC-2024: RoBERTa for task 1 and LLaMA3.1 and GPT-4o for task 2 |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://github.com/HECTA-UoM/PLABA2024 |
| Description | CANTONMT: Cantonese to English NMT Platform with Fine-Tuned Models using Real and Synthetic Back-Translation Data |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Talk at EAMT 2024, Sheffield, UK |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://aclanthology.org/2024.eamt-1.49/ |
| Description | Clinical NLP workshop |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | Panel discussion with clinical NLP colleagues from Oxford and Sheffield on pre-trained clinical language models and their fusion with ontologies and knowledge graphs. Talks by Aline Villavicencio and Hang Dong (29/30 November 2022). |
| Year(s) Of Engagement Activity | 2022 |
| Description | Clinical Text Data bank |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Policymakers/politicians |
| Results and Impact | Talk by G. Nenadic at the 'Clinical free-text workshop' in Edinburgh in March 2023. Presented the findings/ideas on clinical free-text data donation. |
| Year(s) Of Engagement Activity | 2023 |
| Description | Diagnosis Certainty and Progression: A Natural Language Processing Approach to Enable Characterisation of the Evolution of Diagnoses in Clinical Notes (HealTAC 2022) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presentation by Lifeng Han at HealTAC 2022: the 5th Healthcare Text Analytics Conference · Jun 15, 2022 (mode: London hub in person, hybrid) Abstract: "The accurate identification of diagnoses in free clinical narratives is decisive for characterizing the patients in a medical cohort. Therefore, the knowledge extraction and information retrieval tasks must be addressed carefully. Clinical notes might present multiple qualifiers that could change the meaning of a statement: negation, speculation, temporal information, family history and so on. It is not unusual for caregivers to preserve uncertainty using broad and ambiguous terms when they have not full evidence of the disease status of a patient." |
| Year(s) Of Engagement Activity | 2023 |
| Description | Exploring foundation models |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | Participation in an event organised by the Alan Turing Institute: "Exploring foundation models" 22.02.2023 |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://www.turing.ac.uk/events/exploring-foundation-models |
| Description | Exploring the Value of Pre-trained Language Models for Clinical Named Entity Recognition (IEEE Big Data 2023) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presentation by Lifeng Han at 2023 IEEE International Conference on Big Data (BigData). December 15th - 18th, 2023, Sorrento, Italy |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://bigdataieee.org/BigData2023/ |
| Description | Extraction of Medication and Temporal Relation from Clinical Text using Neural Language Models (IEEE Big Data 2023) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presentation by Lifeng Han at 2023 IEEE International Conference on Big Data (BigData). December 15th - 18th, 2023, Sorrento, Italy |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://bigdataieee.org/BigData2023/ |
| Description | Future Clinical NLP work |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | A discussion presentation by Goran Nenadic at the 'Clinical free-text workshop' in Edinburgh in March 2023. The role, challenges and opportunities of generative AI in healthcare were discussed. |
| Year(s) Of Engagement Activity | 2023 |
| Description | HealTAC 2022 conference |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | HealTAC 2022 was the fifth UK healthcare text analytics conference organised by Healtex. It was again a huge success: over 100 attendees gathered this time for a 3-day online event. It brought the academic, clinical, industrial and patient communities together to discuss the current state of the art in processing healthcare free text and share experience, results and challenges. The conference featured two keynotes from leading experts in healthcare text analytics: Dr Ozlem Uzuner (George Mason University): "Building semantic representations of clinical notes: opportunities, challenges, and progress in natural language processing on electronic health records" and Prof James Teo (King's College Hospital): "Embedding text analytics into real-world clinical systems". There were also several research paper presentations, 20 posters, two panels ('How does PPIE add value in text analytics research?' and 'Text mining in veterinary medicine'), and an industry forum ('How can NLP enable personalised medicine?') with several demo sessions for various software solutions from industry and the NHS. Two tutorials ('Patient and Public Involvement and Engagement (PPIE): Hands on Guidance for Clinical Text Analytics' and 'De-identification of clinical and medical texts using MASK and MedCAT') were organised. We also had a PhD and Early career forum where five early career researchers presented their projects and received feedback from an expert panel and the audience. HealTAC is now an annual community event. |
| Year(s) Of Engagement Activity | 2022 |
| URL | https://healtac2022.github.io/ |
| Description | HealTAC conference poster |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | The accurate identification of diagnoses in free clinical narratives is decisive for characterizing the patients in a medical cohort. Therefore, the knowledge extraction and information retrieval tasks must be addressed carefully. Clinical notes might present multiple qualifiers that could change the meaning of a statement: negation, speculation, temporal information, family history and so on. It is not unusual for caregivers to preserve uncertainty using broad and ambiguous terms when they do not have full evidence of the disease status of a patient. |
| Year(s) Of Engagement Activity | 2022 |
| URL | https://www.researchgate.net/publication/364051372_Diagnosis_Certainty_and_Progression_A_Natural_Lan... |
| Description | Healthcare NLP in industry |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | Regional |
| Primary Audience | Professional Practitioners |
| Results and Impact | Discussion with NLP companies on how to engage with academia and the NHS. DeepCognito and RecourseAI gave talks. 6 December 2022. |
| Year(s) Of Engagement Activity | 2022 |
| Description | Healthcare free-text Data & Access |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | A panel at the 'Clinical free-text workshop' in Edinburgh in March 2023. Panelists: Luke Daines (chair), Arlene Casey, Bea Alex, Goran Nenadic |
| Year(s) Of Engagement Activity | 2023 |
| Description | How can we conceptualise and measure re-identification risk from de-identified clinical free text data? |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presentation and discussion on "How can we conceptualise and measure re-identification risk from de-identified clinical free text data?" at HealTAC 2024 |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://healtac2024.github.io/programme/ |
| Description | Investigating Massive Multilingual Pre-Trained Machine Translation Models for Clinical Domain via Transfer Learning (Clinical NLP 23) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Online co-presented by L Han at the 5th Clinical Natural Language Processing Workshop; ACL 2023. 14 Jul 2023, Toronto, Canada. |
| Year(s) Of Engagement Activity | 2023 |
| Description | Keynote: MWEs in Clinical NLP and Healthcare Text Analysis |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A keynote by Goran Nenadic and Asma Ben Abacha (Microsoft) at the 19th Workshop on Multiword Expressions (MWE 2023). |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://multiword.org/mwe2023/ |
| Description | M3: Extracting medication and related attributes from outpatient letters (HealTAC 2023) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Poster presented by A Hussain, H Alrdahi, L Han at HealTAC 2023: Healthcare Text Analytics Conference 2023, Manchester, June 14-16 |
| Year(s) Of Engagement Activity | 2023 |
| Description | MTUncertainty: Assessing the Need for Post-editing of Machine Translation Outputs by Fine-tuning OpenAI LLMs |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Talk at EAMT 2024, Sheffield, UK |
| Year(s) Of Engagement Activity | 2024 |
| URL | https://aclanthology.org/2024.eamt-1.29/ |
| Description | MedTem2.0: Prompt-based Temporal Classification of Treatment Events from Discharge Summaries (ACL2023:SRW) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Presentation/talk at ACL2023:SRW · Jun 15, 2023 online Abstract: Discharge summaries are comprehensive medical records that encompass vital information about a patient's hospital stay. A crucial aspect of discharge summaries is the temporal information of treatments administered throughout the patient's illness. With an extensive volume of clinical documents, manually extracting and compiling a patient's medication list can be laborious, time-consuming, and susceptible to errors. The objective of this paper is to build upon the recent development on clinical NLP by temporally classifying treatments in clinical texts, specifically determining whether a treatment was administered between the time of admission and discharge from the hospital. State-of-the-art NLP methods including prompt-based learning on Generative Pre-trained Transformers (GPTs) models and fine-tuning on pre-trained language models (PLMs) such as BERT were used to classify temporal relations between treatments and hospitalisation periods in discharge summaries. Fine-tuning with the BERT model achieved an F1 score of 92.45% and a balanced accuracy of 77.56%, while prompt learning using the T5 model and mixed templates resulted in an F1 score of 90.89% and a balanced accuracy of 72.07%. Our codes and data are available at |
| Year(s) Of Engagement Activity | 2023 |
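
The gap between the F1 scores and the balanced accuracies reported in the abstract above is typical of imbalanced classification tasks, where a model can do well on the majority class while missing much of the minority class. The Python sketch below shows how the two metrics are computed in the binary case; the confusion-matrix counts are invented for illustration and are not taken from the paper, and the paper's exact averaging scheme is not stated here.

```python
def f1_and_balanced_accuracy(tp, fp, fn, tn):
    """Binary F1 and balanced accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # sensitivity (true positive rate)
    specificity = tn / (tn + fp)     # true negative rate
    f1 = 2 * precision * recall / (precision + recall)
    balanced_accuracy = (recall + specificity) / 2
    return f1, balanced_accuracy


# Hypothetical counts: many positives, few negatives, half the negatives missed.
f1, balanced = f1_and_balanced_accuracy(tp=90, fp=10, fn=5, tn=10)
print(f"F1 = {f1:.4f}, balanced accuracy = {balanced:.4f}")
```

Because F1 ignores true negatives while balanced accuracy averages the per-class recall, reporting both gives a fuller picture of performance on the rarer class.
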
| Description | Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date overview (LREC tutorial) |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Starting from the 1950s, Machine Translation (MT) has been tackled with different scientific solutions, ranging from rule-based methods, example-based and statistical models (SMT) to hybrid models and, in very recent years, neural models (NMT). While NMT has achieved a huge quality improvement in comparison to conventional methodologies, by taking advantage of the huge amount of parallel corpora available from the internet and of recently developed computational power at an acceptable cost, it struggles to achieve real human parity in many domains and most language pairs, if not all of them. Along the long road of MT research and development, quality evaluation metrics have played very important roles in MT advancement and evolution. In this tutorial, we overview the traditional human judgement criteria, automatic evaluation metrics, unsupervised quality estimation models, as well as the meta-evaluation of the evaluation methods. Among these, we will also cover very recent work in the MT evaluation (MTE) field that takes advantage of large pre-trained language models to customise automatic metrics towards the exact language pairs and domains deployed. In addition, we introduce statistical confidence estimation of the sample size needed for human evaluation in simulated real-world practice. |
| Year(s) Of Engagement Activity | 2022 |
| Description | NLP for Mental Health |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | A meeting to discuss how clinical NLP applications in Mental Health could be shared, co-designed and co-developed. Participants from King's College, Cambridge, Manchester and Oxford. |
| Year(s) Of Engagement Activity | 2022 |
| Description | Opportunities and risks of generative AI in healthcare |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | Invited talk by Goran Nenadic at 'Symposium on AI in healthcare' in Brighton, in September 2023. |
| Year(s) Of Engagement Activity | 2023 |
| Description | Opportunities and risks of generative AI in medicines information management |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | A keynote by Goran Nenadic at the UK Medicines Information conference in 2023 (November) |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://www.ukmi.nhs.uk/Conference?ContentID=533b571d-5494-4750-97f9-07d760997bd6 |
| Description | PPIE Introductory Workshop |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Patients, carers and/or patient groups |
| Results and Impact | An introductory PPIE session with the project's PPIE advisory group, to define and discuss terms of reference, research questions, etc. 15 November 2022. |
| Year(s) Of Engagement Activity | 2022 |
| Description | PPIE Workshop 1 |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Patients, carers and/or patient groups |
| Results and Impact | First in a series of PPIE workshops discussing outpatient letters, their role and challenges. 30 November 2022. |
| Year(s) Of Engagement Activity | 2022 |
| Description | PPIE advisory group |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Public/other audiences |
| Results and Impact | A strong PPIE group was established, led by our PPIE co-investigators. |
| Year(s) Of Engagement Activity | 2022,2023,2024 |
| Description | PPIE workshops (6) |
| Form Of Engagement Activity | Participation in an activity, workshop or similar |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Public/other audiences |
| Results and Impact | A series of 5 workshops has been organised with national patient and public representatives to discuss clinical information extraction from outpatient letters. An additional event has been organised with a local patient population (Salford) to discuss using their outpatient letters for research. |
| Year(s) Of Engagement Activity | 2022,2023 |
| Description | Panel discussion: Multiword Expressions in Knowledge-intensive Domains: Clinical Text as a Case Study |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Panel discussion: Multiword Expressions in Knowledge-intensive Domains: Clinical Text as a Case Study. Panelists: Asma Ben Abacha, Goran Nenadic, Stefan Schulz and Kirk Roberts. |
| Year(s) Of Engagement Activity | 2023 |
| URL | https://multiword.org/mwe2023/ |
| Description | Panel: 'Sharing models, creating toolkits' |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | A discussion panel at 'Clinical free-text workshop' in Edinburgh, in March 2023. Participants: Bea Alex, Goran Nenadic, Patrick, Huayu Zhang, Arlene Casey |
| Year(s) Of Engagement Activity | 2023 |
| Description | Panel: Annotation guidelines: from clinical needs to textual annotations |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | The panel discussed the practice of producing effective clinical annotation guidelines that both capture clinical intention and provide usable recipes for textual annotation. What are the common principles and steps in developing clinical annotation guidelines? What lessons have we learnt so far? How can we make clinical annotation guidelines FAIR (Findable, Accessible, Interoperable, and Reusable)? Panel: Rob Stewart (KCL, chair), Ben Fell (Akrivia Health), Warren Del-Pinto (University of Manchester), Eulàlia Farré-Maduell (Barcelona Supercomputing Center), Imane Guellil (University of Edinburgh). |
| Year(s) Of Engagement Activity | 2023 |
| URL | http://healtex.org/healtac-2023/programme/ |
| Description | Topic Modelling of News Articles on Covid19: Investigation using Statistical and Neural Methods (HealTAC 2023) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Poster presented by Lifeng Han at HealTAC 2023 (Healthcare Text Analytics Conference 2023), Manchester, 14-16 June 2023. |
| Year(s) Of Engagement Activity | 2023 |
| Description | Topic Modelling of Swedish Newspaper Articles about Coronavirus: a Case Study using Latent Dirichlet Allocation Method (2023 IEEE ICHI) |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | International |
| Primary Audience | Professional Practitioners |
| Results and Impact | Online presentation by B Griciute and L Han at the 2023 IEEE 11th International Conference on Healthcare Informatics (ICHI), Houston, Texas, USA, 26-29 June 2023. |
| Year(s) Of Engagement Activity | 2023 |
| Description | VetText working group |
| Form Of Engagement Activity | A formal working group, expert panel or dialogue |
| Part Of Official Scheme? | No |
| Geographic Reach | National |
| Primary Audience | Professional Practitioners |
| Results and Impact | VetText working group meeting to discuss the opportunities and challenges of veterinary and clinical NLP. Participants from Manchester and Liverpool. 28 November 2022. |
| Year(s) Of Engagement Activity | 2022 |
