Collective attention in online social networks

Lead Research Organisation: University of Exeter
Department Name: Engineering Computer Science and Maths

Abstract

Mass attention in online social media can be powerful. In politics, it can influence election campaigns or foster extremist ideologies; in business, it can boost product sales through viral advertising or damage brands by spreading bad news stories; in social discourse, it can highlight issues and change public perceptions. Yet the mechanisms by which online attention is gathered around a particular event or topic are not understood.

The complexity of networked online communication demands advanced mathematical and computational analyses. This project will build on recent methodological advances by the supervisors that allow measurement and forecasting of collective attention events in online social media. It will develop new mechanistic models of collective attention in social networks, to provide a theoretical basis for understanding online communication.
In recent work sponsored by a leading commercial data science company, we have developed a network-based methodology for analysing collective attention events in social media. By representing the interactions of millions of social media users with digital content as dynamic bipartite networks, we have shown that collective focus on a given topic can be accurately measured. Furthermore, machine learning methods applied to network time series have shown that future collective attention events can be predicted with some accuracy.

This PhD project will develop mechanistic models of the interaction of social media users with online content. Drawing on our existing methods, large archived datasets and ongoing research with our commercial sponsor, the student will first refine the machine learning techniques used to derive the predictive signal from network time series. They will then draw on recent advances in the literature to develop process-based models of the "social physics" of large numbers of networked social media users interacting with content. Finally, they will apply these models to improve predictions of imminent collective attention events, combining network analysis with natural language processing and machine learning.

Publications

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/N509656/1                                   01/10/2016  30/09/2021
1917433            Studentship   EP/N509656/1  01/10/2017  31/03/2021  Tristan Cann
 
Title Data for Ideological biases in social sharing of online information about climate change 
Description This repository contains an anonymised dataset to support the paper "Ideological biases in social sharing of online information about climate change" by Tristan J.B. Cann, Iain S. Weaver and Hywel T.P. Williams, submitted for publication in PLOS ONE. The files contain the following:
tweet_ids - A list of all tweet IDs used in the study.
coded_urls - A list of the (up to) five most common URLs from each of the 75 most common domains. Where these were not social media sites and content was available, they were graded for political and climate bias by the human coders.
domain_bias_grades - A list of domains and the final bias scores assigned to them following the standardisation process we applied to the scores received from our coders. The first line of this file is a header labelling the four columns as political bias, climate change bias, political standard deviation and climate change deviation.
The networks folder contains subfolders for each of the seven weeks studied. Three files are provided for each week:
week_x_bipartite_edges - A list of source, target pairs that define edges in the bipartite user-URL network. Source and target give the user and URL node IDs respectively. Pairs are not guaranteed to be unique, and duplicates should increment the edge weight.
week_x_url_labels - A list of expanded URLs given in the order corresponding to the edge list described above.
week_x_user_labels - A list of anonymised user IDs given in the order corresponding to the edge list for this week. These anonymised numeric user identifiers are consistent across weeks for cross-referencing.
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
Impact N/A 
URL https://figshare.com/articles/dataset/Data_for_Ideological_biases_in_social_sharing_of_online_inform...
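
The dataset description states that pairs in week_x_bipartite_edges may repeat and that duplicates should increment the edge weight. A minimal Python sketch of that aggregation is given below. It is an illustration only, not the authors' code: the comma delimiter, the integer node IDs, and the function names parse_edge_lines, weighted_edges and url_share_counts are all assumptions and may need adjusting to the actual file layout.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, Tuple

Edge = Tuple[int, int]  # (user node ID, URL node ID)


def parse_edge_lines(lines: Iterable[str], delimiter: str = ",") -> Iterator[Edge]:
    """Yield (source, target) pairs from text lines.

    Assumption: one pair per line, comma-separated, with integer node IDs.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        source, target = line.split(delimiter)[:2]
        yield int(source), int(target)


def weighted_edges(pairs: Iterable[Edge]) -> Counter:
    """Collapse repeated pairs so each duplicate adds 1 to the edge weight."""
    return Counter(pairs)


def url_share_counts(weights: Dict[Edge, int]) -> Counter:
    """Sum edge weights per URL node: total shares each URL received."""
    totals: Counter = Counter()
    for (_user, url), weight in weights.items():
        totals[url] += weight
    return totals


# Toy in-memory example (made-up IDs, not taken from the dataset).
sample_lines = ["0,0", "0,0", "1,0", "1,2", ""]
weights = weighted_edges(parse_edge_lines(sample_lines))
```

With a real file, the lines of an opened week_x_bipartite_edges file could be passed to parse_edge_lines in place of sample_lines. Node IDs could then be mapped to expanded URLs and anonymised user IDs via the corresponding label files, assuming the IDs index into those lists.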