Social listening: Applying natural language processing methods to social media data to yield actionable analytics for health and care services

Lead Research Organisation: University of Manchester
Department Name: School of Health Sciences


It is vital to understand public opinion and preferences towards the use of information technology and health data as part of designing future models of healthcare and research. Social media platforms (e.g. Twitter), blogs and online discussion forums provide a rich resource of naturally occurring conversations for examining public attitudes and preferences towards (a) information and digital technologies as part of the delivery of healthcare and (b) the secondary use of health data for purposes beyond direct healthcare, such as research.

Public comments and conversations can be analysed manually using established qualitative research techniques. Whilst such methods provide depth and rigour, they are typically labour intensive and cannot be applied rapidly or at scale without significant effort and resource. I propose to investigate promising new techniques from the field of natural language processing (NLP) to rapidly and automatically analyse textual data about public attitudes and preferences towards health and care from publicly available social media data. I will compare the performance of NLP methods against established qualitative approaches and assess how the two approaches can complement each other to gather insights into public opinion for the purposes of ongoing monitoring, research, evaluation and informing public policy.

I will test advanced methods of data visualisation to report my findings. Leveraging my networks, I will explore how to translate my work into wider applications, within Health Data Research UK, healthcare services and internationally. Throughout the project I will adhere to ethical guidelines for using social media data and will involve citizens from relevant communities (online and offline) in shaping the design and delivery of the research.

Technical Summary

Future healthcare delivery and research rely on public acceptance of information technology and health data uses as part of service delivery. Social media platforms (e.g. Twitter), blogs and online discussion forums provide an increasingly ubiquitous, yet under-exploited, source of unstructured textual data for examining public attitudes and preferences relevant to health, information technology, data uses and healthcare delivery. Yet analysing such data manually is labour intensive and cannot be done rapidly or at scale.
This project will investigate the accuracy of newer natural language processing (NLP) techniques for the rapid, automated extraction of public attitudes and preferences from large-scale, social media data. Exemplar datasets relevant to public attitudes towards the commercial use of health data and wearable technologies will be extracted from social media data, cleaned and prepared for analysis. NLP techniques (e.g. sentiment analysis, text mining, machine learning and/or rule-based methods) and qualitative analysis (e.g. framework analysis, discursive psychology) will then be: (a) applied to unstructured textual data in parallel; (b) benchmarked against each other; and (c) tested within an integrated mixed methods approach.
Online, open-source decision support tools will be developed to guide the selection and application of NLP techniques (alone and/or as part of a mixed methods approach) to unstructured social media data, addressing distinct purposes, such as longitudinal monitoring of public opinion and informing policy development. Advanced data visualisations will be developed, evaluated and optimised for data exploration, presentation and informing decision making. These visualisations will cover sentiment analysis, clustering, frequency analysis and high-dimensional representations of large text datasets. Findings, tools and methods will be disseminated widely, with the aim of enabling public opinion data to inform future policy development.
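As a minimal illustrative sketch of the benchmarking step described above, the Python below labels a handful of posts with a toy rule-based sentiment method and measures its agreement with manual qualitative codes using Cohen's kappa. The word lists, example posts, manual codes and function names are all hypothetical and invented for illustration; they are not the project's data or methods, and a real study would use validated lexicons or trained models on much larger samples.

```python
from collections import Counter

# Hypothetical toy lexicon (an assumption for illustration only).
POSITIVE = {"helpful", "love", "useful", "trust"}
NEGATIVE = {"worried", "creepy", "hate", "risk"}


def rule_based_sentiment(text):
    """Label a post by counting positive and negative lexicon words."""
    tokens = [t.strip(".,!?;:'\"").lower() for t in text.split()]
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two sets of labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both sets use a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented example posts and invented manual (qualitative) codes.
posts = [
    "I love my fitness tracker, really useful",
    "Worried about companies selling my health data",
    "My GP surgery now has an app",
    "Creepy that my watch knows my heart rate",
]
manual_codes = ["positive", "negative", "neutral", "neutral"]

automated_codes = [rule_based_sentiment(p) for p in posts]
kappa = cohens_kappa(automated_codes, manual_codes)
print(automated_codes)
print(f"Cohen's kappa: {kappa:.2f}")
```

Kappa is used here rather than raw percentage agreement because it discounts agreement expected by chance, which matters when one label (for example, neutral) dominates the data.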


Description: EPSRC Healtex Feasibility Funding
Amount: £29,583 (GBP)
Funding ID: EP/N027280/1
Organisation: Engineering and Physical Sciences Research Council (EPSRC)
Sector: Academic/University
Country: United Kingdom
Start: 05/2018
End: 10/2018
Description: Spotlight on... blog
Form Of Engagement Activity: Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach: National
Primary Audience: Professional Practitioners
Results and Impact: Produced a short blog introducing the topic of my fellowship project, versions of which appeared as a news item on our department and Faculty webpages. It was also widely retweeted on Twitter. These news pieces are widely read by those who subscribe to departmental updates and Twitter followers, including fellow academics, NHS/healthcare practitioners, interested patients and the public. This piece helped to publicise my new award and has led to further speaking invitations and opportunities to supervise students.
Year(s) Of Engagement Activity: 2018
Description: Yellow card project public involvement group
Form Of Engagement Activity: Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach: Local
Primary Audience: Patients, carers and/or patient groups
Results and Impact: A group of 5 people who use social media to discuss long-term conditions (mainly arthritis) have been participating in a public involvement group to shape the design of research to test the acceptability of using natural language processing techniques to automatically detect adverse drug reactions reported on social media and link them with the MHRA's 'Yellow Card' reporting system. This group has reviewed the interview schedules and helped with recruitment for a qualitative study involving 6 focus groups. This study is one of the test cases I am using as part of my overall project and is being done in collaboration with several other researchers from UoM. The plan is to write this up to inform the development of a future grant application as a separate study, on which I would be a Co-Investigator.
Year(s) Of Engagement Activity: 2018, 2019