Multi-scale Dynamical Community Detection for the Digital Economy: from analyzing to influencing policy through Open Government data
Lead Research Organisation: Imperial College London
Department Name: Institute for Mathematical Sciences
Abstract
The digital age has brought with it an unprecedented gathering of detailed, real-time data from our daily lives, from mobile phone usage to specialized hospital sensors. The availability of such real-world data from a wealth of physical and digital infrastructures, coupled with increased computational power, offers a unique opportunity to interrogate social behaviour from the level of the individual to the emergence of group dynamics and traits at different levels. Recently, governmental initiatives (specifically in the US and the UK) have been designed to make such datasets available to the wider public. These initiatives offer the possibility to examine quantitatively the influence and effectiveness of policies on different aspects of social dynamics, as well as providing a route for the exercise of citizen participation and feedback. This could lead to improved quality of life in healthcare, traffic and security, or to the design of policies for public spending and usage of resources from the individual level to the collective of groups. These tantalising possibilities have led in the last year to a series of manifestos and even to the declaration of the need for a new field, Computational Social Science. Although those contributions have arisen from different disciplines, they share the belief that the current lack of mathematical tools for the analysis of such datasets is the fundamental obstacle to translating the promise of integrated multi-modal, dynamic datasets into real interpretative results. In particular, there is a need to go beyond purely (static) statistical methods and to overcome the lack of mathematical, and eventually computational, methodologies that can formalise, interrogate and analyse the data such that hypotheses can be tested and conclusions can be drawn in a rigorous data-driven manner.
This proposal, however, goes beyond issues of accessibility and presentation of data and focuses on the development of mathematical tools for the analysis of data in two steps: (1) finding a faithful representation of the data in terms of multi-label, possibly dynamic, networks, and (2) the generation of simplified, intelligible reductions of such networks in terms of a multi-level dynamical hierarchy of communities that can uncover patterns of interaction in the data. The aim of this proposal is to develop robust methodologies for the analysis of networks derived from large, complex social datasets currently made available to the public through the Open Government initiative. Our mathematical tools will address the creation of representative networks from the data and the multi-scale and multi-label analysis of such networks, leading to reduced descriptions in terms of dynamical community structures derived from the data without any a priori specification. The datasets chosen will be of current social interest and will also exemplify three fundamental characteristics of social datasets that are linked to specific mathematical challenges for their analysis: (i) the multi-scale nature of social networks; (ii) the multi-label characterisation of social datasets; and (iii) the importance of dynamics and flows in social descriptions. The mathematical tools will be specifically applied to the following three areas of high interest for the Digital Economy: neighbourhood statistics data, the redistricting problem and the recently released budget expenditure data.
Planned Impact
Multiscale, Dynamical Community Detection for the Digital Economy affords three broad categories of opportunities for significant impact: (i) Academic Impact: Spurring Interdisciplinary Research; (ii) Government Transparency: Reducing Fraud, Waste and Abuse; (iii) Institutional Innovation: Improving Quality of Life in the UK and worldwide.
The work outlined in this proposal has the potential to create dramatic new collaboration between the mathematical and social sciences and to lead to an outpouring of research as the methodology is applied to new and diverse social problems. Existing sociological network theory is built on a foundation of one-time snapshot data. Although we now have the technology to collect minute-by-minute accounts of life across whole nations, computational social science lacks a mathematical framework capable of extracting meaningful analysis from vast and seemingly unrelated data. Our multiscale and multilabel theory and algorithm will provide exactly this missing link, enabling new ways of conceiving human behaviour.
The analysis of the Open Government data can help to identify patterns of waste, fraud and abuse, enabling public officials to cut spending and save taxpayer money informed by empirical evidence. It will further enable citizen participation, as everybody can then engage with this process.
In the digital age, technology has enabled the unprecedented collection of real-time data from our daily lives. The ability to generate and then navigate and visualize complex data sets has the potential to inform policymaking that, in turn, could lead to improved quality of life. Our methodology goes beyond this to analysis which can be applied to stimulate data-driven policy innovations. We have identified two domains for initial focus in addition to the analysis of government spending.
First, we will seek to apply our method to the challenge of voter redistricting in an effort to deepen citizens' right to participate in the democratic process. Second, we will look at new storehouses of neighbourhood data (quality-of-life information collected across local communities in the UK) in an effort to compare and visualize the outcome of citizen services at the local level.
Publications
Lambiotte R (2011) Flow graphs: interweaving dynamics and structure. in Physical review. E, Statistical, nonlinear, and soft matter physics
Schaub MT (2012) The Ising decoder: reading out the activity of large neural ensembles. in Journal of computational neuroscience
Beguerisse-Díaz M (2012) Squeeze-and-breathe evolutionary Monte Carlo optimization with local search acceleration and its application to parameter fitting. in Journal of the Royal Society, Interface
Wu J (2012) Robustness of random graphs based on graph spectra. in Chaos (Woodbury, N.Y.)
Schaub MT (2012) Encoding dynamics for multiscale community detection: Markov time sweeping for the map equation. in Physical review. E, Statistical, nonlinear, and soft matter physics
Stumpf MP (2012) Mathematics. Critical truths about power laws. in Science (New York, N.Y.)
Description | We have developed a series of powerful multiresolution methods that can extract insight from complex big data. In particular, they can identify the different communities of people or topics involved, as well as the roles these nodes (or people) play, in areas as diverse as social media, neuroscience data and healthcare. Furthermore, we have developed a quantitative method for understanding cities. Our methods are an example of unsupervised learning and can provide good features for supervised machine learning. |
Exploitation Route | Our findings are used by industry, government and social policy organisations to better understand their data and hence make more informed decisions. |
Sectors | Agriculture, Food and Drink; Communities and Social Services/Policy; Digital/Communication/Information Technologies (including Software); Education; Financial Services, and Management Consultancy; Healthcare; Government, Democracy and Justice; Culture, Heritage, Museums and Collections; Pharmaceuticals and Medical Biotechnology |
Description | Our methods are now being used to inform policy, transport and healthcare issues, as well as in branding. Additionally, they are now being used in education, particularly online learning. |
Sector | Education; Healthcare; Pharmaceuticals and Medical Biotechnology |
Impact Types | Cultural; Societal; Economic; Policy & public services |
Description | EPSRC Centres for Mathematics in Healthcare |
Amount | £2,520,000 (GBP) |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2016 |
End | 12/2019 |
Description | EPSRC Knowledge Transfer Secondment |
Amount | £68,189 (GBP) |
Organisation | Imperial College London |
Sector | Academic/University |
Country | United Kingdom |
Start | 09/2012 |
End | 09/2013 |
Description | EPSRC Pathways to Impact |
Amount | £69,800 (GBP) |
Organisation | Imperial College London |
Sector | Academic/University |
Country | United Kingdom |
Start | 01/2016 |
End | 12/2017 |
Description | James S McDonnell Foundation Fellowship on Complex Systems |
Amount | $200,000 (USD) |
Organisation | James S. McDonnell Foundation |
Sector | Charity/Non Profit |
Country | United States |
Start | 01/2012 |
End | 04/2015 |
Description | Science Museum Lates |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Public/other audiences |
Results and Impact | Our work on cities was presented under the auspices of the Science Museum Lates and thereby reached a general audience. |
Year(s) Of Engagement Activity | 2015 |