Efficient Spectral Algorithms for Massive and Dynamic Graphs

Lead Research Organisation: University of Edinburgh

Department Name: Sch of Informatics

Abstract

Spectral graph theory investigates the algebraic properties of matrices associated with graphs. Over the past 20 years, studies in spectral graph theory have successfully overcome fundamental bottlenecks faced by combinatorial algorithms, and have become a major research focus in theoretical computer science, machine learning, and network analysis. In particular, some recent breakthrough results show that spectral techniques can be applied to solve central optimisation and learning problems in nearly-linear time. Designing such highly efficient algorithms is crucial to cope with the emergence of massive graphs and real-time data sets coming from technological, social and biological networks.

In this fellowship I propose three research directions to advance the studies of spectral algorithms and their applications in data science: (1) I propose to advance our understanding of fundamental spectral techniques by studying the spectral properties of different matrices representing directed graphs, and exploring new connections between graphs and other mathematical objects, e.g., manifolds studied in geometry; (2) I propose to investigate new spectral algorithms for two fundamental graph problems in different settings, and further improve the state-of-the-art of the two problems with respect to the algorithms' runtime and performance; (3) spectral algorithms vastly outperform combinatorial algorithms with respect to their runtime in the worst case, however the design of most spectral algorithms usually involve many procedures, which bring the issue of numerical stability for efficient implementations. To address this I propose to develop an open-source algorithmic library for spectral algorithms so that by the end of the fellowship data scientists would be able to use the state-of-the-art algorithms for spectral sparsification and graph clustering in a black box manner.

A successful completion of the fellowship will make a significant step towards understanding the power and limits of the algebraic techniques in designing fast graph algorithms in various settings, and the performance of nearly-linear time spectral algorithms in real world data sets.

Planned Impact

The overall fellowship is to investigate foundational, algorithmic, and applied aspects of spectral methods. By the nature of the proposed work and the tight relationships among the four work packages in the proposal, the fellowship will build connections between a number of active research areas in computer science and mathematics, and the fellowship's outcome would ideally spread to all these fields.

Because of the cross-disciplinary nature of the fellowship, academic researchers in the disciplines directly involved, including algorithms, combinatorial optimisation, machine learning, and numerical linear algebra, will appreciate the technical advances of the fellowship that addresses not only the theoretical foundations but also practical applications of the spectral algorithms for the two fundamental graph problems. The fellowship will further investigate the algorithmic Kadison-Singer problem, and the relationships between graphs and manifolds, whose research outcome will have a significant impact in geometry.

During the fellowship there will be a series of pathways to impact events, including two workshops in algorithmic spectral graph theory, master course on Spectral Algorithms and Their Applications in Data Science to be held at the Alan Turing Institute, as well as the involvement of the Turing Challenge Fortnights. These events provide an ideal platform to not only report the fellowship's progress with the other experts in the area and exchange ideas with researchers in related fields, but also introduce this frontier research field to other early-career researchers and more postgraduate students in the UK.

Finally, the fellowship will produce the first open-source algorithmic library of the state-of-the-art algorithms for spectral sparsification and graph clustering. The algorithmic library's performance with respect to its runtime and numerical stability on real-world data sets will be of great value to researchers in algorithms, machine learning, and numerical linear algebra. In the long term, a large number of end users of spectral algorithms will be also greatly benefited from the algorithmic library.

Funded Value:

£1,205,706

Funded Period:

Jan 20 - Dec 24

Funder:

EPSRC

Project Status:

Active

Project Category:

Fellowship

Project Reference:

EP/T00729X/1

Principal Investigator:

He Sun

Research Subject:

Info. & commun. Technol. (100%)

Research Topic:

Fundamentals of Computing (100%)

Organisations

People	ORCID iD
He Sun (Principal Investigator / Fellow)

Publications

Author Name

Title Publication Date Published

10 25 50

Bernstein A (2022) Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary

Bogdan-Adrian Manghiuc (2020) Augmenting the Algebraic Connectivity of Graphs

Bogdan-Adrian Manghiuc (2021) Hierarchical clustering: O(1)-approximation for well-clustered graphs

Macgregor P (2022) A Tighter Analysis of Spectral Clustering, and Beyond

Mihai Cucuringu (2020) Hermitian matrices for clustering directed graphs: insights and applications

Peter Macgregor (2021) Finding bipartite components in hypergraphs

Peter Macgregor (2021) Local algorithms for finding densely connected clusters

Steinar Laenen (2020) Higher-order spectral clustering of directed graphs

Key Findings
Impact Summary
Collaboration
Software and Technical Products


Description	- We have studied directed graph clustering and its applications for learning higher-order structures of clusters. In particular, we demonstrate that these algorithms can be used to uncover many significant patterns of clusters, and the use of our designed algorithms in analysing international trade and migration datasets. We further show that these significant patterns can be uncovered with local algorithms; the runtime of local algorithms is usually independent of the size of the underlying graph, and it's very desired for analysing massive datasets. - Spectral clustering is one of most popular algorithms for modern graph clustering tasks, and consist of the two steps: embed the vertices of some graph into k-dimensional Euclidean space using k eigenvectors of some matrix of G, and applies k-means to partition the vertex set of an input graph into k clusters. We prove that, by applying fewer than k eigenvectors to construct the embedding, spectral clustering is able to produce better output for many practical instances; this result is the first of its kind in spectral clustering, and demonstrates that one can improve the quality of clustering outcome through some simple enumeration of eigenvectors.
Exploitation Route	- The others interested in analysing higher-order information of graphs could use our designed algorithms directly. - Our newly developed algorithm for hierarchical clustering can be used by others to better analyse the hierarchical structure of the input set occurring in practice. - Our newly developed algorithm for finding dense components in hypergraphs could potentially be used in other research areas of computer science, including network analysis, and natural language processing.
Sectors	Digital/Communication/Information Technologies (including Software)


Description	Some of our recent works appearing at NeurIPS'20 and AISTATS'20 shows that our designed algorithms can be used to analyse international trade and various migration datasets. In addition, our work appearing at ICML'21 shows that some structures between the clusters can be uncovered with local algorithms, and the performance of our designed algorithms is showcased through the US migration dataset and the Interstate Disputes Dataset. We expect that this line of research would receive more recognition outside computer science in the near future.
First Year Of Impact	2021
Sector	Other
Impact Types	Economic,Policy & public services


Description	Dynamic Graph Sparsification
Organisation	ETH Zurich
Country	Switzerland
Sector	Academic/University
PI Contribution	I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution	Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact	Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year	2020


Description	Dynamic Graph Sparsification
Organisation	Rutgers University
Country	United States
Sector	Academic/University
PI Contribution	I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution	Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact	Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year	2020


Description	Dynamic Graph Sparsification
Organisation	Stanford University
Country	United States
Sector	Academic/University
PI Contribution	I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution	Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact	Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year	2020


Description	Dynamic Graph Sparsification
Organisation	University of California, Berkeley
Country	United States
Sector	Academic/University
PI Contribution	I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution	Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact	Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year	2020


Description	Dynamic Graph Sparsification
Organisation	University of Copenhagen
Country	Denmark
Sector	Academic/University
PI Contribution	I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution	Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact	Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year	2020


Description	Dynamic Graph Sparsification
Organisation	University of Michigan
Country	United States
Sector	Academic/University
PI Contribution	I initiate the project during a research visit, and contribute some of the key ideas for the collaboration.
Collaborator Contribution	Designing efficient dynamic graph algorithms against an adaptive adversary is an important goal in the field of dynamic graph algorithms and has witnessed many exciting recent developments. Our work presents the first non-trivial efficient adaptive algorithms for maintaining many sparsifiers against an adaptive adversary.
Impact	Fully-Dynamic Graph Sparsifiers Against an Adaptive Adversary. A. Bernstein, J. van den Brand, M. Gutenberg, D. Nanongkai, T. Saranurak, A. Sidford, and H. Sun. Proceedings of the 49th International Colloquium on Automata, Languages, and Programming (ICALP 2022).
Start Year	2020


Title	STAG: Spectral Toolkit of Algorithms for Graphs
Description	STAG is an open-source library of efficient spectral algorithms for graphs. It is the first such algorithmic library mainly written in C++, with a python wrapper around the underlying C++ library for python users. The source code of STAG can be found from GitHub page.
Type Of Technology	Webtool/Application
Year Produced	2023
Open Source License?	Yes
Impact	We have so far implemented the local graph clustering component. Comparing with existing open-source implementation for local graph clustering, STAG connects to a Neo4j database running in the AuraDB cloud service, or a database running locally. With this feature, one can run local graph clustering on the entire Wikipedia graph of more than 20GB with a single PC. It is the first open-source implementation of local graph clustering that connects a graph database.
URL	https://staglibrary.io

Abstract

Planned Impact

Organisations

People

ORCID iD

Publications