Using Web Data and Network Science to Detect Spatial Relationships

Lead Research Organisation: University of Bristol
Department Name: Mathematics

Abstract

This PhD project aims to use recently developed graph science methodologies and techniques to advance the modelling and interpretation of spatial relationships. Viewing graphs as data, and data as graphs, allows different statistical questions to be answered, for example questions about dependencies. The project will draw on dynamic network embedding methods, a research area that has seen substantial recent growth.

This project will use a largely unexplored source of data, the web, to identify spatial relationships and to model and predict dependencies between cities and regions. By spatial relationships we mean how two places are connected, e.g. through transport links or trade links. The project aims to capture such relationships in new ways, at various spatial and temporal scales, while contextualising them to answer interesting social science questions.

There is an opportunity here to apply new graphical methods to non-trivial real-life data, and to raise new questions within graph methodology that the project can explore and endeavour to answer. The project uses new data-driven ideas to answer existing questions, drawing on rich and vast web data that is ever growing and readily available. One example of the data we will use is the JISC UK Web Domain Dataset, a novel cache of geolocated, archived websites. It presents an interesting and useful application to which new mathematical theory can be applied. The project will apply new data science approaches to big data sources. This is a multidisciplinary project that will tackle existing problems within Geography, while also providing an opportunity for new ideas developing within graph theory to be applied to real data; the project thus falls within the EPSRC Statistics and applied probability research area.

The findings can be interpreted to better understand how relationships and dependencies between places behave, enabling useful inferences and predictions. The project will also demonstrate the usefulness of the methodologies, and will highlight and further investigate interesting insights that the analysis returns.

There is a clear gap in the use of such web data to create meaningful geographic knowledge about urban and regional interdependencies, knowledge that could help in designing better regional policies, for example. The vision of this PhD project is to apply novel graph theory to release the untapped potential of web data, capturing spatial relationships that we would otherwise be unable to understand. The project will improve our understanding of the strengths of these new methods, and will highlight opportunities to extend and develop them. It has the potential to give geographers a new perspective on how they interpret and investigate spatial relationships. By giving web data a temporal and spatial aspect, the project can also open new avenues for how such data is used.

Planned Impact

The COMPASS Centre for Doctoral Training will have the following impact.

Doctoral Students Impact.

I1. Recruit and train over 55 students and provide them with a broad and comprehensive education in contemporary Computational Statistics & Data Science, leading to the award of a PhD. The training environment will be built around a set of multilevel cohorts: a variety of group sizes, activities within and across year cohorts, and activities within and across disciplinary boundaries with internal and external partners, where statistics and computation are the common focus while remaining sensitive to disciplinary needs. Our novel doctoral training environment will have a powerful impact on students, opening their eyes not only to a range of modern technical benefits and opportunities, but also to the power of team-working with people from a range of backgrounds to solve the most important problems of the day. They will learn to apply their skills to achieve impact by working collaboratively with internal and external partners, for example via our Rapid Response Teams, Policy Workshops & Statistical Clinics.

I2. As well as advanced training in computational statistics and data science, our students will be impacted by exposure to, and training in, important cognate topics such as ethics, responsible innovation, equality, diversity and inclusion, policy, effective communication and dissemination, enterprise, impact and consultancy skills. It is vital for our students to understand that their training will enable them to have a powerful impact on the wider world: for example, the AI algorithms they develop should not be discriminatory, statistical methodologies should be reproducible, and statistical results should be communicated accurately and comprehensibly to the general public and policymakers.

I3. The students will gain experience via collaborations with academic partners within the University in cognate disciplines, and a wide range of external industrial & government partners. The students will be impacted by the structured training programmes of the UK Academy of Postgraduate Training in Statistics, the Bristol Doctoral College, the Jean Golding Institute, the Alan Turing Institute and the Heilbronn Institute for Mathematical Sciences, which will be integrated into our programme.

I4. Having received excellent training, the students will go on to have a powerful impact on the world in their future careers, spreading excellence.

Impact on our Partners & ourselves.

I5. Direct impacts will be achieved by students engaging with, and working on projects with, our academic partners, with discipline-specific problems arising in engineering, education, medicine, economics, earth sciences, life sciences and geographical sciences, and our external partners Adarga, the Atomic Weapons Establishment, CheckRisk, EDF, GCHQ, GSK, the Office for National Statistics, Sciex, Shell UK, Trainline and the UK Space Agency. The students will demonstrate a wide range of innovation with these partners, will attract engagement from new partners, and often provide attractive future employment matches for students and partners alike.

Wider Societal Impact

I6. COMPASS will greatly benefit the UK by providing over 55 highly trained PhD graduates in an area that is well known to be suffering from extreme shortages in the national people pipeline. COMPASS CDT graduates will be equipped for jobs in sectors of high economic value and national priority, including data science, analytics, pharmaceuticals, security, energy, communications, government, and indeed all research labs that deal with data. Through their training, they will enable these organisations to make well-informed and statistically principled decisions that will allow them to maximise their international competitiveness and contribution to societal well-being. COMPASS will also impact positively on the wider student community, both now and sustainably into the future.

Studentship Projects

Project Reference | Relationship | Related To   | Start      | End        | Student Name
EP/S023569/1      | -            | -            | 01/04/2019 | 30/09/2027 | -
2592871           | Studentship  | EP/S023569/1 | 01/10/2021 | 19/09/2025 | Emerald Dilworth