Supporting Evidence-based Public Health Interventions using Text Mining

Lead Research Organisation: University of Manchester

Department Name: Computer Science

Abstract

Evidence-based public health (EBPH) reviews play a central role in public health policy, practice and guidance. Their development currently involves first searching, then screening and synthesizing evidence from the vast amount of literature. Unlike systematic reviews, EBPH reviews require dynamic and multidimensional views of relevant information from the literature, without relying on a priori research questions.
As a result, EBPH reviewing is a time-consuming and resource-intensive process that can take more than a year to complete. Because crucial information can be difficult to locate, and indeed to understand given the complex nature of EBPH problems, the multiple causes and interrelations between interventions, diseases, populations and outcomes can remain hidden.

This project will address these limitations by exploring new research methods that combine text mining and machine learning to produce novel search while screening tools for public health reviews. Text mining methods will automatically discover knowledge from unstructured data, and machine learning will support the prioritisation and ranking of the extracted information into meaningful topics. The combination of text mining and machine learning methods will reduce the burden of producing public health reviews, which will be completed more quickly, thus meeting policy and practice timescales and increasing their cost efficiency. These methods will also allow more timely and reliable reviews, thus improving decision making across the health sector.

The project will be informed throughout by close interaction with the Centre of Public Health at NICE, who will also carry out qualitative and quantitative evaluation based on the implementation of a novel search while screening pilot system. Evaluation will be carried out on reviews related to non-communicable diseases, in particular the prevention of hazardous and harmful drinking and excessive alcohol consumption. Moreover, given the national and international importance of EBPH reviewing, the project has developed a multistrand pathways to impact document to engage with a variety of key EBPH stakeholders both in the UK and internationally.

Technical Summary

This project will address current limitations in Evidence-based public health (EBPH) interventions by exploring new research methods, beyond the PICO framework, which combine text mining and machine learning to produce novel search while screening tools for public health. The PICO framework is conventionally used to structure pre-defined research questions matching clinical needs, helping to identify the Population, the Intervention, the Comparator and the Outcome. However, it is not well suited to the needs of EBPH reviews such as those conducted by the National Institute for Health and Care Excellence (NICE). PH questions are always complex, involving behaviour, culture and organizations, and often need to be described using abstract, fuzzy terminology in ways that make defining all the parameters in PICO a priori extremely problematic.
We will investigate novel approaches to EBPH reviewing based on unsupervised text mining methods for the discovery of direct and indirect associations, to support the dynamic and multi-dimensional notion of relevance required for public health reviews. In particular, the project will build on distributional semantics methods to improve term and document similarity measures by including contextual information in a novel way. Novel descriptive clustering algorithms will be developed that will use these measures to group documents, to analyse their topics to yield meaningful cluster labels and to simultaneously yield high quality document and label clusters. The project will also produce new ranking algorithms to order and visualise meaningful associations in an interactive manner, suitable for EBPH reviewing.
A pilot Web-based system providing a quick, interactive way of screening while searching, and visualisation, based on terms, their associations, descriptive clusters, labels and their ranking will form the basis for quantitative and qualitative evaluation for public health reviews related to alcohol misuse and consumption.

Planned Impact

Economic and societal impact.
Text mining and machine learning based searching will contribute towards transforming evidence-based public health and influencing development of guidelines at a national and international level via NICE. Specifically, the new research will minimise the impact of publication bias in reviews and will extract more accurate and pertinent information from the literature, thus cutting both costs and time. Public health guidelines and best practice methods rely on EBPH: by producing more reliable and timely reviews, we will improve decision making in the health sector. The outcomes of the project will impact on the way policy makers, reviewers and researchers access information and discover knowledge. Policy impacts will be explored for national strategies in the area of obesity by Prof. Kelly, who is director of the Centre for Public Health Excellence at NICE, via public engagements.
The global economic impact of preventable ill health (WHO) will continue to increase at an alarming rate and we must work to improve awareness of diseases at different levels: societal, financial, clinical, psychological, etc. Thus, methods that provide cost-effective approaches to understanding interconnections between topics and better coverage of EBPH contribute towards mitigating the cost of public health.
Our proposed methods will also help improve the UK's competitive position in a digital market through better language technology products and services. Our project brings together a mixture of unsupervised techniques, which are re-usable and re-targetable in supporting and enabling language technology based access (via semantic search). Thus, our advanced search and screening techniques will be applicable in almost any other domain, such as energy, security, national libraries, institutional repositories, etc.
Other economic impacts will arise from exploitation of our software products in a range of health management and monitoring products.

Academic impact
Unsupervised ways of combining text mining and machine learning will impact on the digital economy in many domains. Semantic search via knowledge technology is a sine qua non to address the UK's competitive advantage across many sectors (biology, energy, security, etc.). The advanced methods developed here will be generalisable to literature reviewing in general and could thus lead to significant annual gains in academics' productivity - in excess of the £157m p.a. in additional productivity that basic text mining is estimated to yield for literature reviewing by UK academics, according to JISC.
We will provide training in our novel search while screening methods to evidence-based public health (EBPH) reviewers: for EBPH project partners and at summer schools and other events.
The proposed project also addresses important, longstanding but little understood problems associated with systematic review methods. Current information retrieval methods for systematic reviews are grounded on a narrow and restrictive theoretical basis. The result is that a single 'one-size-fits-all' method is the only available option for every type of information retrieval challenge in the increasingly complex portfolio of evidence synthesis types and methods. The proposed project has the potential to transform the theoretical basis of systematic review search methods, not only for public health problems, but for all types of systematic review and evidence synthesis.

Funded Value:

£655,668

Funded Period:

Mar 14 - Sep 17

Funder:

MRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

MR/L01078X/1

Principal Investigator:

Sophia Ananiadou

Research Subject:

Social Policy (24%)

Tools, technologies & methods (72%)

Research Topic:

Bioinformatics (48%)

Social Policy (24%)

Tools for the biosciences (24%)

Organisations

People	ORCID iD
Sophia Ananiadou (Principal Investigator)	http://orcid.org/0000-0002-4097-9191
John Goulermas (Co-Investigator)	http://orcid.org/0000-0003-0381-124X
John McNaught (Co-Investigator)
Makoto Miwa (Researcher)

Publications

Kontonatsios G (2017) A semi-supervised approach using label propagation to support citation screening. in Journal of biomedical informatics

Kontonatsios, G. (2014) Combining String and Context Similarity for Bilingual Term Alignment from Comparable Corpora.

Korkontzelos I (2016) Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts. in Journal of biomedical informatics

Korkontzelos, I. (2016) Ensemble Classification of Grants using LDA-based features

Mihaila, C. (2015) Healthcare Data Analytics

Miwa M (2014) Reducing systematic review workload through certainty-based screening. in Journal of biomedical informatics

Miwa M (2015) Adaptable, high recall, event extraction system with minimal configuration. in BMC bioinformatics

Mo Y (2015) Supporting systematic reviews using LDA-based document representations. in Systematic reviews

Mu T (2014) Descriptive document clustering via discriminant learning in a co-embedded space of multilevel similarities in Journal of the Association for Information Science and Technology

Policy Influence
Further Funding
Research Databases and Models
Research Tools and Methods
Collaboration
Intellectual Property
Software and Technical Products
Spin Outs
Engagement Activities


Description	Copyright and Licensing in relation to Text and Data Mining
Geographic Reach	Multiple continents/international
Policy Influence Type	Contribution to a national consultation/review
Impact	The National Centre for Text Mining played a leading role in advising on policy and development of UK legislation regarding a copyright exception in relation to text mining. Contributions included talks at events at the Houses of Parliament, the European Parliament, London School of Economics, and participation in consultations by the IPO and the EC (on the wider issue of copyright and licensing issues in the EU). Advice was also given on numerous occasions by request of the IPO during development of the legislation which came into force on 1st June 2014. It is somewhat too early to ascertain impact; however, this has already led to major initiatives such as Europe PubMed Central being able to lawfully text mine full papers, as well as increased levels of text mining within such bodies as the British Library and also within institutional repositories. It has also led to increased scope and expected impact of research projects, as these can tackle for the first time large scale text mining of full text articles which are lawfully subscribed to, in addition to open access material.
URL	http://www.jisc.ac.uk/sites/default/files/value-text-mining.pdf


Description	Text mining supporting systematic reviews (NICE, Cochrane)
Geographic Reach	National
Policy Influence Type	Influenced training of practitioners or researchers
Impact	Text mining results improving the efficiency of conducting systematic reviews with the EPPI centre and NICE


Description	SLiM: Pilot study of the utility of text mining and machine learning tools to accelerate systematic review and meta-analysis of findings of in vivo research
Amount	£351,857 (GBP)
Funding ID	MR/N015665/1
Organisation	Medical Research Council (MRC)
Sector	Public
Country	United Kingdom
Start	03/2016
End	03/2018


Description	Supporting the spread of effective integration models for older people living in care homes: A mixed method approach
Amount	£350,000 (GBP)
Funding ID	Research for Social Care (RfSC), Research for Patient Benefit (RfPB) Programme: NIHR201872
Organisation	University of Manchester
Sector	Academic/University
Country	United Kingdom
Start	03/2021
End	09/2023


Title	Descriptive clustering for citation screening
Description	Descriptive document clustering aims to automatically discover groups of semantically related documents and to assign a meaningful label to characterise the content of each cluster. In this paper, we present a descriptive clustering approach that employs a distributed representation model, namely the paragraph vector model, to capture semantic similarities between documents and phrases. The proposed method uses the learned joint representation of phrases and documents (i.e., a co-embedding) to automatically select a descriptive phrase that best represents a document cluster. We evaluate our method by comparing its performance to an existing state-of-the-art descriptive clustering method that also uses co-embedding but relies on a bag-of-words representation. Results obtained on two benchmark datasets demonstrate that, compared to the existing approach, the paragraph vector-based method obtains superior performance in both identifying clusters and assigning appropriate descriptive labels to them.
Type Of Material	Improvements to research infrastructure
Year Produced	2016
Provided To Others?	Yes
Impact	Included in the RobotAnalyst
URL	http://www.nactem.ac.uk/robotanalyst/


Title	Descriptive clustering for systematic reviews
Description	Descriptive document clustering aims at discovering clusters of semantically interrelated documents together with meaningful labels to summarize the content of each document cluster. In this work, we propose a novel descriptive clustering framework, referred to as CEDL. It relies on the formulation and generation of 2 types of heterogeneous objects, which correspond to documents and candidate phrases, using multilevel similarity information. CEDL is composed of 5 main processing stages. First, it simultaneously maps the documents and candidate phrases into a common co-embedded space that preserves higher-order, neighbor-based proximities between the combined sets of documents and phrases. Then, it discovers an approximate cluster structure of documents in the common space. The third stage extracts promising topic phrases by constructing a discriminant model where documents along with their cluster memberships are used as training instances. Subsequently, the final cluster labels are selected from the topic phrases using a ranking scheme using multiple scores based on the extracted co-embedding information and the discriminant output. The final stage polishes the initial clusters to reduce noise and accommodate the multitopic nature of documents. The effectiveness and competitiveness of CEDL is demonstrated qualitatively and quantitatively with experiments using document databases from different application fields.
Type Of Material	Improvements to research infrastructure
Year Produced	2015
Provided To Others?	Yes
Impact	Several teams are using our software CEDL, e.g. NLM
URL	http://www.nactem.ac.uk/Mining4EBPH/


Title	The RobotAnalyst
Description	RobotAnalyst was developed as part of the Supporting Evidence-based Public Health Interventions using Text Mining project to support searching and screening in systematic reviews. RobotAnalyst builds upon state of the art text mining technologies, including topic modelling and feedback-based text classification models, to minimise the human workload involved in the study identification phase. RobotAnalyst offers the following services to the users: Create a new collection: allows upload of a set of citations, encoded in standardised RIS format, into the system. Update a collection: allows update of an existing citation list (already uploaded to the system) with additional citations retrieved either from your local disk (also a RIS-formatted file) or from PubMed. Faceted Search: a search engine that enables users to search for relevant studies by applying various filters (e.g., keywords, authors, year, type of publication, name of journal, etc). Topic-based search: RobotAnalyst automatically induces clusters of thematically related citations. Additionally, the system generates a network graph in which each node represents a topic (described by a set of keywords) while an edge determines the semantic similarity between two topics. The topic-based search engine allows users to re-order the list of studies in terms of their relevance to a specified topic (i.e., the most relevant studies are placed towards the top of the list). Similarity-based search: Users can retrieve citations related to a given study. The system uses words from titles and abstracts to compute pairwise similarities between citations. Semi-automatic citation screening: To directly reduce the time and cost needed to complete the screening phase of a systematic review, RobotAnalyst implements a feedback-based (i.e., active learning) text classification model that aims to automatically exclude irrelevant studies while keeping all eligible studies in the final review. 
The active learner is iteratively trained on an increasing number of validated labelled citations. At each learning cycle, the model selects a small sample of automatically labelled citations and interactively requests feedback from the analyst (i.e., a systematic reviewer corrects erroneous predictions made by the model). The manually corrected sample of citations is then used to re-train (update) the model. We have conducted experiments which demonstrate that active learning classification approaches can substantially decrease the screening burden without reducing the sensitivity of the review. Save a screened dataset: The analyst can terminate the screening process once all eligible studies have been identified. RobotAnalyst will then produce two files (RIS format), one for studies to be included and one for studies to be excluded.
Type Of Material	Improvements to research infrastructure
Year Produced	2016
Provided To Others?	Yes
Impact	A number of teams are currently using the RobotAnalyst: a) NICE; b) Cochrane Switzerland; c) Public Health England; d) University of Liverpool; e) Skåne university hospital
URL	http://www.nactem.ac.uk/robotanalyst/
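
The feedback-based screening cycle described in this record can be illustrated with a minimal sketch. It is not RobotAnalyst's implementation: the tiny Naive Bayes scorer, the function names (`train`, `relevance_score`, `screen`), the toy citations and the simulated analyst are all assumptions made for illustration. It shows only the loop itself: train on the validated labels, pick a small batch, ask the analyst to confirm or correct the predictions, then re-train.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labelled):
    """Fit a tiny multinomial Naive Bayes model on (text, is_eligible) pairs."""
    counts = {True: Counter(), False: Counter()}
    docs = {True: 0, False: 0}
    for text, label in labelled:
        counts[label].update(tokenize(text))
        docs[label] += 1
    vocab = set(counts[True]) | set(counts[False])
    return counts, docs, vocab


def relevance_score(model, text):
    """Smoothed log-odds that a citation is eligible (positive means include)."""
    counts, docs, vocab = model
    score = math.log((docs[True] + 1) / (docs[False] + 1))
    for label, sign in ((True, 1.0), (False, -1.0)):
        total = sum(counts[label].values()) + len(vocab) + 1
        for tok in tokenize(text):
            score += sign * math.log((counts[label][tok] + 1) / total)
    return score


def screen(citations, seed, ask_analyst, batch_size=2, cycles=3):
    """Run the feedback loop; return validated labels and the unscreened pool."""
    labelled = list(seed)
    seen = {text for text, _ in labelled}
    pool = [c for c in citations if c not in seen]
    for _ in range(cycles):
        if not pool:
            break
        model = train(labelled)  # re-train on all validated labels so far
        # Present the citations the model currently ranks as most relevant.
        pool.sort(key=lambda c: relevance_score(model, c), reverse=True)
        batch, pool = pool[:batch_size], pool[batch_size:]
        for citation in batch:
            predicted = relevance_score(model, citation) > 0
            # The analyst confirms or corrects the automatic label.
            labelled.append((citation, ask_analyst(citation, predicted)))
    return labelled, pool


# Hypothetical toy data; the "analyst" is simulated by a keyword rule.
citations = [
    "brief intervention for harmful alcohol drinking",
    "protein folding simulation study",
    "alcohol pricing policy and hospital admissions",
    "gene expression in yeast",
    "school programme to prevent alcohol misuse",
    "galaxy formation models",
    "alcohol screening in primary care",
    "bridge load testing methods",
]
seed = [(citations[0], True), (citations[1], False)]


def simulated_analyst(citation, predicted):
    return "alcohol" in citation


labelled, remaining = screen(citations, seed, simulated_analyst, cycles=10)
```

Ranking the pool by predicted relevance is one possible selection strategy; uncertainty-based selection (choosing citations whose score is closest to zero) is a common alternative.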


Title	Topic detection using paragraph vectors to support active learning in systematic reviews
Description	Systematic reviews require expert reviewers to manually screen thousands of citations in order to identify all relevant articles to the review. Active learning text classification is a supervised machine learning approach that has been shown to significantly reduce the manual annotation workload by semi-automating the citation screening process of systematic reviews. In this paper, we present a new topic detection method that induces an informative representation of studies, to improve the performance of the underlying active learner. Our proposed topic detection method uses a neural network-based vector space model to capture semantic similarities between documents. We firstly represent documents within the vector space, and cluster the documents into a predefined number of clusters. The centroids of the clusters are treated as latent topics. We then represent each document as a mixture of latent topics. For evaluation purposes, we employ the active learning strategy using both our novel topic detection method and a baseline topic model (i.e., Latent Dirichlet Allocation). Results obtained demonstrate that our method is able to achieve a high sensitivity of eligible studies and a significantly reduced manual annotation cost when compared to the baseline method. This observation is consistent across two clinical and three public health reviews.
Type Of Material	Improvements to research infrastructure
Year Produced	2016
Provided To Others?	Yes
Impact	Method is used by the systematic review teams in Public Health England (NICE)
URL	https://nactem.ac.uk/pvtopic/
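
The steps in this description (embed documents, cluster them into a predefined number of clusters, treat the centroids as latent topics, represent each document as a mixture of topics) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the published method: the 2-D vectors are hypothetical stand-ins for paragraph vectors, the clustering is plain k-means with a deterministic initialisation, and the mixture weights are computed as a softmax over negative Euclidean distances, a choice the description does not specify.

```python
import math


def kmeans(vectors, k, iterations=20):
    """Plain k-means; initialised with the first k vectors for determinism."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            nearest = min(range(k), key=lambda j: math.dist(v, centroids[j]))
            clusters[nearest].append(v)
        for j, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def topic_mixture(vector, centroids):
    """Assumed weighting: softmax over negative distance to each centroid."""
    weights = [math.exp(-math.dist(vector, c)) for c in centroids]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical document vectors standing in for learned paragraph vectors.
doc_vectors = [
    (0.0, 0.1), (5.0, 5.1), (0.2, 0.0),
    (5.2, 4.9), (0.1, 0.2), (4.9, 5.0),
]
topics = kmeans(doc_vectors, k=2)            # centroids act as latent topics
features = [topic_mixture(v, topics) for v in doc_vectors]
```

Each row of `features` sums to one and could then serve as the document representation passed to an active learner.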


Title	Data from: A cross-lingual similarity measure for detecting biomedical term translations
Description
Type Of Material	Database/Collection of data
Year Produced	2015
Provided To Others?	Yes


Description	Evaluation of the RobotAnalyst
Organisation	The Cochrane Collaboration
Department	Cochrane Switzerland
Country	Switzerland
Sector	Charity/Non Profit
PI Contribution	Development of the RobotAnalyst, an automated system for screening citations for systematic reviews
Collaborator Contribution	Reviewers from the Cochrane Switzerland group, at the Institute of Social and Preventive Medicine (IUMSP), Lausanne University Hospital, evaluated our system to inform patient safety and quality of hospital care. Results from both sites highlighted the ability of RobotAnalyst to prioritise relevant references early in the screening process.
Impact	This is a collaboration between clinicians and systematic review analysts with text miners. The collaboration resulted in a joint paper, currently under review (submitted to Research Methods)
Start Year	2017


Title	TerMine
Description	Automatically recognises technical terms from text
IP Reference
Protection	Copyrighted (e.g. software)
Year Protection Granted
Licensed	Yes
Impact	Licensed to Elsevier and other companies


Title	Thalia and RobotAnalyst
Description	An automated method to search and screen the literature
IP Reference
Protection	Trade Mark
Year Protection Granted	2017
Licensed	Yes
Impact	Visibility of the tools by the University of Oxford, Infectious Diseases Data Observatory (IDDO)


Title	RobotAnalyst
Description	RobotAnalyst was developed as part of the Supporting Evidence-based Public Health Interventions using Text Mining project to support searching and screening in systematic reviews. RobotAnalyst builds upon state of the art text mining technologies, including topic modelling and feedback-based text classification models, to minimise the human workload involved in the study identification phase.
Type Of Technology	Webtool/Application
Year Produced	2016
Impact	A number of teams from Public Health, hospitals, Cochrane are using the RobotAnalyst to Create a new collection; Update a collection; Faceted Search: a search engine that enables users to search for relevant studies by applying various filters (e.g., keywords, authors, year, type of publication, name of journal, etc). Topic-based search: RobotAnalyst automatically induces clusters of thematically related citations. Additionally, the system generates a network graph in which each node represents a topic (described by a set of keywords) while an edge determines the semantic similarity between two topics. The topic-based search engine allows users to re-order the list of studies in terms of their relevance to a specified topic (i.e., the most relevant studies are placed towards the top of the list). Similarity-based search: Users can retrieve citations related to a given study. The system uses words from titles and abstracts to compute pairwise similarities between citations. Semi-automatic citation screening: To directly reduce the time and cost needed to complete the screening phase of a systematic review, RobotAnalyst implements a feedback-based (i.e., active learning) text classification model that aims to automatically exclude irrelevant studies while keeping all eligible studies in the final review. The active learner is iteratively trained on an increasing number of validated labelled citations. At each learning cycle, the model selects a small sample of automatically labelled citations and interactively requests feedback from the analyst (i.e., a systematic reviewer corrects erroneous predictions made by the model). The manually corrected sample of citations is then used to re-train (update) the model. We have conducted experiments which demonstrate that active learning classification approaches can substantially decrease the screening burden without reducing the sensitivity of the review. 
Save a screened dataset: The analyst can terminate the screening process once all eligible studies have been identified. RobotAnalyst will then produce two files (RIS format), one for studies to be included and one for studies to be excluded.
URL	http://www.nactem.ac.uk/robotanalyst/
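
The similarity-based search mentioned above, which uses words from titles and abstracts to compute pairwise similarities between citations, can be illustrated with a bag-of-words cosine similarity. This is a minimal sketch and an assumption about one reasonable way to do it; the record does not state which similarity measure or weighting RobotAnalyst uses, and the function names and toy citations are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(title, abstract):
    """Count the words in a citation's title and abstract."""
    return Counter((title + " " + abstract).lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def most_similar(query_id, citations, top_n=2):
    """Return the ids of the citations most similar to the query citation."""
    vectors = {cid: bag_of_words(*fields) for cid, fields in citations.items()}
    query = vectors[query_id]
    scores = [
        (cosine(query, vec), cid) for cid, vec in vectors.items() if cid != query_id
    ]
    return [cid for _, cid in sorted(scores, reverse=True)[:top_n]]


# Hypothetical citations: id -> (title, abstract).
toy_citations = {
    "c1": ("Alcohol screening in primary care",
           "brief interventions reduce harmful drinking"),
    "c2": ("Brief interventions for harmful drinking",
           "alcohol screening trial"),
    "c3": ("Protein folding dynamics",
           "molecular simulation study"),
}
related = most_similar("c1", toy_citations, top_n=1)
```

Raw counts are used here for brevity; a production system would more plausibly apply stop-word removal and a weighting scheme such as TF-IDF.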


Title	Thalia: a faceted semantic search system
Description	The main purpose of Thalia is to enable semantic search in the context of biomedical literature by leveraging previous named entity (NE) annotation efforts. The key strategy to achieve semantic behaviour is to normalise NEs, i.e., to link entities to concepts in an openly available ontology, which effectively allows a concept to be mapped to its multiple word forms. Thalia covers the entire PubMed, which at the point of this challenge contains about 27 million references. Thalia includes annotations of several types (Chemicals, Diseases, Drugs, Genes, Metabolites, Proteins, Species and Anatomic entities).
Type Of Technology	Webtool/Application
Year Produced	2017
Open Source License?	Yes
Impact	The semantic search system has been used to support a precision medicine challenge, retrieving documents containing potential treatments and clinical trials for specific patient characteristics.
URL	http://nactem-copious.man.ac.uk/Thalia/thalia.html


Company Name	Eurimatics
Description	Eurimatics develops text mining and machine learning methods that aim to improve the efficiency of evidence-based public health reviews.
Year Established	2018
Impact	The RobotAnalyst system is currently being expanded to make it available to several users such as NICE, Cochrane, etc.


Description	8th Culture and International Mental Health Conference
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	8th Culture and International Mental Health Conference, part of the South Asia Self Harm Initiative Global Challenge Research Fund: South Asia Self Harm research capability building Initiative (GCRF-SASHI)
Year(s) Of Engagement Activity	2018
URL	http://sashi.bangor.ac.uk/staff.php.en


Description	Cochrane Colloquium Workshop
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Workshop on how text mining methods support the development of sensitive search strategies in public health reviews. Objectives: To cover text mining methods to support study identification in public health reviews. Specifically, we aim to: 1. discuss limitations of conventional keyword-based search engines (e.g. PubMed) that are ill-suited to the development of sensitive search strategies; 2. provide an overview of text mining methods for generating semantic metadata over large scale textual collections; and 3. demonstrate the use of semantically enriched search engines that enable interactive, exploratory searching of relevant evidence. Description: The unstructured and ambiguous nature of natural language in public health literature poses a barrier to the accessibility and discovery of information. We will discuss challenges to information discovery that are inherent in keyword-based search engines. We will then demonstrate potential solutions offered by semantic search systems, enhanced by text mining methods. Participants will receive an introduction to various semantic search features (e.g. faceted search, automatic query expansion, queries as natural language questions) and will be asked to construct complex queries using on-line semantic search systems. This will provide an appreciation of how text mining can support the development of sensitive search strategies. The intended outcome of this workshop is to highlight benefits and limitations of these emerging technologies.
Year(s) Of Engagement Activity	2016
URL	http://2016.colloquium.cochrane.org/workshops/text-mining-methods-support-development-sensitive-sear...


Description	Conference of European Statistics Stakeholders
Form Of Engagement Activity	A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme?	Yes
Geographic Reach	International
Primary Audience	Policymakers/politicians
Results and Impact	The aim of the conference is to enhance the dialogue between European methodologists, producers, and users of European Statistics, identifying the requirements of the users (ESAC), the best practices of the production (EUROSTAT, NSIs), the innovative ways of visualising and communicating statistics, and the new methodological ideas for collecting and analysing data (FENStatS). Specific topics of high interest include the development of the European Statistical System towards 2020 and beyond. The conference also aims to investigate and present themes of research in official statistics within the scientific community, to explore enabling instruments such as the Horizon 2020 Research Framework Programme, and to compare and share best practices of production; it is also a good opportunity to meet national and European users of statistics. The Conference is an operative tool to facilitate the evolution of statistics towards the 2020 modernisation targets. Funding opportunities (event will take place 24/11/2014)
Year(s) Of Engagement Activity	2014
URL	http://cdss.sta.uniroma1.it/index.php/dssconference/cess2014/


Description	Exhibitor in GIN Conference Manchester
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Policymakers/politicians
Results and Impact	The G-I-N conference gathers together people who work with guidelines in health and care. We exhibited RobotAnalyst and other text mining tools and services related to evidence-based medicine.
Year(s) Of Engagement Activity	2018
URL	http://www.nactem.ac.uk/newsitem.php?item=365


Description	Global Evidence Summit Cape Town South Africa
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	We organised two workshops on RobotAnalyst at the Global Evidence Summit, Cape Town, South Africa, Sept. 13-16, 2017. The first workshop, "Screening evidence for systematic reviews using a text mining system: the RobotAnalyst", was organised in collaboration with the Public Health and Social Care Centre at the National Institute for Health and Care Excellence (NICE) and Cochrane Switzerland. The second workshop, "RobotAnalyst: an online system to support citation screening in evidence reviewing", was also in collaboration with NICE. The workshops are part of the Mining for Public Health project and cover evaluation and demonstration of the RobotAnalyst system developed by NaCTeM, which uses text mining and machine learning to transform the way in which evidence-based public health (EBPH) literature screening is conducted.
Year(s) Of Engagement Activity	2017
URL	https://www.globalevidencesummit.org/workshops/screening-evidence-systematic-reviews-using-text-mini...


Description	ICHI 2016 - IEEE International Conference on Healthcare Informatics, Chicago, Illinois
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	This session has two main goals. First, it is designed to inform new investigators of the importance of automating evidence synthesis, and emphasize the potential for new research in this area. Second, it provides an opportunity for five of the leading laboratories across the US and UK to come together, to review the current state of the art, and discuss in detail the nuts-and-bolts of different technical approaches to overcoming the key challenges of evidence synthesis. Dr. Kitsiou will give an overview of the different types of literature reviews and evidence synthesis approaches in health informatics, with emphasis on systematic reviews and meta-analyses, for those not in the field: discussing how they are generated, and the bottlenecks in generating them efficiently and with adequate quality. Dr. Cohen will give an overview of the efforts by laboratories worldwide to improve and automate the process of writing and updating systematic reviews in evidence-based medicine. Dr. Smalheiser will present his research in identifying relevant clinical trials to examine, whereas Drs. Jonnalagadda, Wallace, and Ananiadou will discuss and compare their approaches to extracting data from clinical trial articles. Finally, there will be guided general discussion to consider the scope, limitations and potential for text mining techniques to automate, streamline and re-engineer the largely manual process of writing systematic reviews.
Year(s) Of Engagement Activity	2016
URL	http://ieee-ichi.org/panel.html


Description	Invited Speaker, Heidelberg, Scientific Computing for the Improved Diagnosis and Therapy of Sepsis (SCIDATOS)
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	This kick-off workshop gathered an international audience of experts working on the diagnosis of sepsis. I was the only text mining expert at the workshop.
Year(s) Of Engagement Activity	2016
URL	http://www.uni-heidelberg.de/einrichtungen/iwh/bock2016.html


Description	Invited speaker Cochrane Japan
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	A presentation of RobotAnalyst to Cochrane Japan, and a concrete collaboration plan with Cochrane Japan to support research in diabetes.
Year(s) Of Engagement Activity	2017


Description	Keynote, International Symposium on Information Management and Big Data (SIMBig 2019), Lima, Peru, 21st - 23rd August 2019. Prof. Ananiadou's talk was entitled "Text Mining for Biomedical Applications".
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Postgraduate students
Results and Impact	Dissemination of text mining research supporting systematic reviews to a wider audience
Year(s) Of Engagement Activity	2019
URL	https://simbig.org/SIMBig2019/index.html


Description	Keynote Open Science Paris (December 2018)
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Public/other audiences
Results and Impact	I was invited to give a keynote talk at the Journées pour la science ouverte (Days for open science) in Paris, France. These days, organised by the Comité pour la science ouverte, were arranged following the announcement of the French national plan for open science on July 4, 2018 by the Minister of Higher Education, Research and Innovation, and were an opportunity to mobilise the scientific community around open science and its applications.
Year(s) Of Engagement Activity	2018
URL	https://bib.cnrs.fr/journees-pour-la-science-ouverte-open-science-days-paris-december-4th-to-6th-201...


Description	Keynote Speaker University of York
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	Regional
Primary Audience	Postgraduate students
Results and Impact	This event celebrated the launch of the University's new high performance compute (HPC) cluster, 'Viking', which promises to empower researchers at York in achieving new heights of research excellence. My talk discussed how text mining needs HPC clusters.
Year(s) Of Engagement Activity	2019
URL	https://www.york.ac.uk/it-services/research-computing/vikingclusterlaunchevent/


Description	Keynote speaker
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	6th International Conference on Computer and Information Science and Technology (CIST'21), July 29 - 31, 2021. The goal of this computer and information science conference is to gather scholars from all over the world to present advances in the relevant fields and to foster an environment conducive to exchanging ideas and information. The conference also provides an ideal environment to develop new collaborations and meet experts on the fundamentals, applications, and products of these fields.
Year(s) Of Engagement Activity	2021
URL	https://cistseries.com/


Description	Keynote speaker
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Keynote speech about the use of AI in biomedicine; increased interest in the field.
Year(s) Of Engagement Activity	2021
URL	http://www.binfo.ncku.edu.tw/APBC2021/keynote.html


Description	Keynote talk at ISMB/ECCB session on Text Mining for Biology and Healthcare
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	ISMB/ECCB 2019 is the largest and most high profile annual meeting of scientists working in computational biology and provides an intense multidisciplinary forum for disseminating the latest developments in computational tools for data driven biological research.
Year(s) Of Engagement Activity	2019
URL	https://www.iscb.org/ismbeccb2019-program/special-sessions#sst01


Description	Salford Research Week
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	Yes
Geographic Reach	Regional
Primary Audience	Health professionals
Results and Impact	My talk, entitled "Unlocking the power of clinical records using Text Mining", took place as part of a special "tomorrow's world" showcase session, which aimed to highlight some of the new technologies that could be used in hospitals in future. Joint research proposal with NWeHealth and Salford Royal NHS Foundation Trust.
Year(s) Of Engagement Activity	2014


Description	Speaker Elsevier forum
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Industry/Business
Results and Impact	Engagement with Elsevier about text mining for enriching their content and discussions on collaboration
Year(s) Of Engagement Activity	2019


Description	Speaker Google London
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Industry/Business
Results and Impact	Collaboration with Google on biomedical text mining. Discussions are ongoing.
Year(s) Of Engagement Activity	2019


Description	TREC Precision Medicine / Clinical Decision track
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	We presented our work to support precision medicine based on the use of a biomedical semantic search engine called Thalia (Text mining for Highlighting, Aggregating and Linking Information in Articles), which has been developed at NaCTeM. The main purpose of Thalia is to enable semantic search in the context of biomedical literature by leveraging previous named entity (NE) annotation efforts, and to apply it to different use cases.
Year(s) Of Engagement Activity	2017
URL	http://www.trec-cds.org/2017.html


Description	Text Mining Workshop for Precision Medicine
Form Of Engagement Activity	Participation in an activity, workshop or similar
Part Of Official Scheme?	No
Geographic Reach	Regional
Primary Audience	Professional Practitioners
Results and Impact	The text mining workshop was aimed at clinical and industry colleagues who would like to attend a showcase of a number of innovative text mining approaches for biomarker discovery. There was also potential to discuss collaborative grant applications where this enabling technology will be an asset.
Year(s) Of Engagement Activity	2019
URL	http://www.nactem.ac.uk/newsitem.php?item=388


Description	Using text mining to facilitate study identification in public health systematic reviews
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Presentation at the Guidelines International Network conference about text mining
Year(s) Of Engagement Activity	2016
URL	http://www.g-i-n.net/conference/13th-conference


Description	Workshop, Japanese Society of Nephrology
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	Prof. Ananiadou presented RobotAnalyst, a text mining system supporting search and screening in evidence reviewing, to the Japanese Society of Nephrology at its special-theme workshop on AI and ICT.
Year(s) Of Engagement Activity	2017